# 1.0 Introduction to Functions

## Section Objectives

1) Create a function with a single return path.

2) Create a function with multiple return paths.

3) Create a function with multiple arguments.

4) Create a function with an optional argument.

5) Call a function inside of another function.

What are Functions?

Functions make code reusable: a piece of logic is written once and then called wherever it is needed in the project. By wrapping code in a general-purpose function, we do not have to rewrite the same logic over and over again throughout the code.
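As a minimal sketch of this idea, the snippet below defines one small function and reuses it for several inputs. The name `minutes_to_hours` is a hypothetical helper chosen for illustration and is not part of the tutorial's own code; the durations are the ones visible in the dataset preview further down.

```python
# Hypothetical helper for illustration only
def minutes_to_hours(minutes):
    """Convert a duration in minutes to hours, rounded to two decimals."""
    return round(minutes / 60, 2)

# The conversion logic is written once and reused for every value
for duration in (178, 169, 148):
    print(duration, "min =", minutes_to_hours(duration), "h")
```

If the conversion rule ever changes, only the function body has to be edited, not every place that uses it.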

More on functions can be found in the official Python documentation:

https://docs.python.org/3/tutorial/controlflow.html#defining-functions

Download the dataset for this tutorial:
https://www.dropbox.com/s/go6wsi99kqzsbmv/movie_metadata.csv?dl=0

In [1]:
# Explore the dataset
import pandas as pd

# Open the file and read its whole contents into a string variable
# to get a general idea of the dataset; the with block closes the file for us
with open('C:\\Users\\Maged Helmy\\Dropbox\\GitHub\\movie_metadata.csv', 'r', encoding="utf8") as file:
    movies = file.read()

# Split the data into rows on the newline character
split_movies = movies.split("\n")

# Loop through the rows and split each one into fields on commas
movie_data = []
for row in split_movies:
    movie_data.append(row.split(","))

# Preview the header row and the first four movies
pd.DataFrame(movie_data[0:5])

|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| 0 | movie_title | director_name | color | duration | actor_1_name | language | country | title_year |
| 1 | Avatar | James Cameron | Color | 178 | CCH Pounder | English | USA | 2009 |
| 2 | Pirates of the Caribbean: At World's End | Gore Verbinski | Color | 169 | Johnny Depp | English | USA | 2007 |
| 3 | Spectre | Sam Mendes | Color | 148 | Christoph Waltz | English | UK | 2015 |
| 4 | The Dark Knight Rises | Christopher Nolan | Color | 164 | Tom Hardy | English | USA | 2012 |
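Splitting each line on commas works for the rows previewed above, but it would cut a field in two if a value itself contained a comma inside quotes. The sketch below compares the manual split with Python's standard `csv` module on a tiny in-memory sample. The sample text is an illustrative assumption and is not taken from movie_metadata.csv.

```python
import csv
import io

# Illustrative sample text, NOT read from movie_metadata.csv
sample = 'movie_title,director_name\n"Monsters, Inc.",Pete Docter\n'

# Manual approach used in the cell above: split on newlines, then on commas
naive_rows = [line.split(",") for line in sample.split("\n")]

# csv.reader understands quoted fields and ignores the trailing newline
csv_rows = list(csv.reader(io.StringIO(sample)))

print(naive_rows[1])  # the quoted title is cut into two fields
print(csv_rows[1])    # the quoted title stays in one field
```

Note that the manual split also leaves an empty last row when the text ends with a newline, which is worth keeping in mind before indexing into every row.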


## 1.1  Create a function with a single return path

In [None]:
# Let's write a function that returns the names of the movies in the dataset

def first_elt(aList):
    # Collect the first element (the movie title) of every row
    elements = []
    for row in aList:
        elements.append(row[0])

    # Single return path: every call ends here
    return elements

movie_names = first_elt(movie_data)
print(movie_names[0:5])

## 1.2 Create a function with multiple return paths.

In [None]:
# For example, multiple return paths occur when we use an if statement
# Let's write a function that checks whether a particular movie was made in the USA

def is_usa(aList):
    for row in aList:
        if row[0] == 'The Dark Knight Rises':
            if row[6] == 'USA':
                print('The Dark Knight Rises is made in the USA')
                return True
            else:
                return False
    # If the title is never found, the loop ends and the function returns None

dark_knight_usa = is_usa(movie_data)
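The function above also has a third, implicit return path: when the title never appears in the list, the loop finishes and Python returns `None`. The sketch below makes that path explicit. The function name `spectre_made_in_usa` and the two small lists are assumptions for illustration; the rows are copied from the dataset preview shown earlier.

```python
header = ['movie_title', 'director_name', 'color', 'duration',
          'actor_1_name', 'language', 'country', 'title_year']
avatar = ['Avatar', 'James Cameron', 'Color', '178',
          'CCH Pounder', 'English', 'USA', '2009']
spectre = ['Spectre', 'Sam Mendes', 'Color', '148',
           'Christoph Waltz', 'English', 'UK', '2015']

# Hypothetical variant of is_usa with all three return paths written out
def spectre_made_in_usa(rows):
    for row in rows:
        if row[0] == 'Spectre':
            if row[6] == 'USA':
                return True   # path 1: found, made in the USA
            return False      # path 2: found, made elsewhere
    return None               # path 3: title not in the list

print(spectre_made_in_usa([header, avatar, spectre]))  # False
print(spectre_made_in_usa([header, avatar]))           # None
```

Returning `None` explicitly lets the caller distinguish "not made in the USA" from "movie not found".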

## 1.3 Create a function with multiple arguments.

In [None]:
# The function above is neither conventional nor reusable: the title and column are hard-coded!
# Let's add another layer of abstraction

# Let's write a function that takes a row, a column number and a value to check for,
# then use it to check whether a certain movie is in color
wonder_woman = ['Wonder Woman', 'Patty Jenkins', 'Color', 141, 'Gal Gadot', 'English', 'USA', 2017]

def index_equals_str(list_name, column_number, checkValue):
    # The comparison itself already evaluates to True or False
    return list_name[column_number] == checkValue

check_color = index_equals_str(wonder_woman, 2, 'Color')
print(check_color)
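With several arguments, a call can pass them by position (as above) or by name. The sketch below repeats `index_equals_str` so that it runs on its own and calls it both ways; the variable names `by_position` and `by_keyword` are assumptions for illustration.

```python
wonder_woman = ['Wonder Woman', 'Patty Jenkins', 'Color', 141,
                'Gal Gadot', 'English', 'USA', 2017]

# Repeated from the cell above so this snippet is self-contained
def index_equals_str(list_name, column_number, checkValue):
    return list_name[column_number] == checkValue

# Positional arguments must follow the order in the function definition
by_position = index_equals_str(wonder_woman, 6, 'USA')

# Keyword arguments can be given in any order because each one is named
by_keyword = index_equals_str(checkValue='USA', column_number=6, list_name=wonder_woman)

print(by_position, by_keyword)  # True True
```

Keyword arguments tend to make calls with several parameters easier to read, since the purpose of each value is visible at the call site.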

## 1.4 Create a function with optional argument.

In [4]:
# Count the rows whose value in a given column equals checkValue,
# for example the number of movies made in the USA

def feature_counter(list_name, column_number, checkValue, header_row=False):
    count = 0

    # header_row is optional: pass True to skip the first (header) row
    if header_row:
        list_name = list_name[1:]

    for row in list_name:
        if row[column_number] == checkValue:
            count += 1

    return count


feature_counter(movie_data, 6, 'USA', True)

3732
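To see what the optional argument does, the sketch below calls `feature_counter` with and without it on a small list that has no header row. The function is repeated so that the snippet runs on its own, and `rows_without_header` is an assumed sample built from rows in the dataset preview.

```python
# Repeated from the cell above so this snippet is self-contained
def feature_counter(list_name, column_number, checkValue, header_row=False):
    count = 0
    if header_row:
        list_name = list_name[1:]  # drop the first row
    for row in list_name:
        if row[column_number] == checkValue:
            count += 1
    return count

# Sample data WITHOUT a header row (rows copied from the preview above)
rows_without_header = [
    ['Avatar', 'James Cameron', 'Color', '178', 'CCH Pounder', 'English', 'USA', '2009'],
    ["Pirates of the Caribbean: At World's End", 'Gore Verbinski', 'Color', '169',
     'Johnny Depp', 'English', 'USA', '2007'],
    ['Spectre', 'Sam Mendes', 'Color', '148', 'Christoph Waltz', 'English', 'UK', '2015'],
]

# Omitting header_row uses the default (False): every row is counted
print(feature_counter(rows_without_header, 6, 'USA'))                   # 2

# Passing header_row=True skips the first row, so Avatar is not counted
print(feature_counter(rows_without_header, 6, 'USA', header_row=True))  # 1
```

The default value makes the common case short to write, while callers whose data starts with a header can opt in explicitly.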

## 1.5 Call a function inside another function.


In [5]:
def summary_statistics(dataSet):
    '''Return counts of selected properties of the dataset'''
    # Reuse feature_counter for each property instead of repeating the loop
    num_japan_films = feature_counter(dataSet, 6, 'Japan', True)
    num_color_films = feature_counter(dataSet, 2, 'Color', True)
    num_english_films = feature_counter(dataSet, 5, 'English', True)

    summary_dict = {'Japan Films': num_japan_films, 'Coloured Movies': num_color_films, 'English Movies': num_english_films}

    return summary_dict


summary_statistics(movie_data)

{'Japan Films': 22, 'Coloured Movies': 4714, 'English Movies': 4611}

# 2.0 Introduction to Classes

## Section Objectives


Download the dataset: https://www.dropbox.com/s/lyr08qvue5uq5wh/nfl.csv?dl=0

2.1) Create a class with custom methods, such as printing a given number of rows, returning the data of a specific column, and counting the unique values in a specific column.

Other important sources:

https://docs.python.org/3/tutorial/classes.html
https://docs.python.org/3/reference/datamodel.html#basic-customization

In [25]:
# Explore the dataset
import csv
import pandas as pd

# Open the file and parse it with csv.reader,
# which returns each row as a list of strings
with open('C:\\Users\\Maged Helmy\\Dropbox\\GitHub\\nfl.csv', 'r', encoding="utf8", newline='') as file:
    csvreader = csv.reader(file)
    nfl_data = list(csvreader)

pd.DataFrame(nfl_data)

|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | year | week | winner | loser |
| 1 | 2009 | 1 | Pittsburgh Steelers | Tennessee Titans |
| 2 | 2009 | 1 | Minnesota Vikings | Cleveland Browns |
| 3 | 2009 | 1 | New York Giants | Washington Redskins |
| 4 | 2009 | 1 | San Francisco 49ers | Arizona Cardinals |
| 5 | 2009 | 1 | Seattle Seahawks | St. Louis Rams |
| 6 | 2009 | 1 | Philadelphia Eagles | Carolina Panthers |
| 7 | 2009 | 1 | New York Jets | Houston Texans |
| 8 | 2009 | 1 | Atlanta Falcons | Miami Dolphins |
| 9 | 2009 | 1 | Baltimore Ravens | Kansas City Chiefs |


## 2.1 Creating a Class in Python with Custom Methods

In [27]:
# A method is a function that belongs to a class;
# __init__() is the method that runs when a new instance is created.
# We will access our data using classes

class Dataset:
    def __init__(self, data):  # runs once when an instance of the class is created
        self.header = data[0]  # store the header row as an attribute (accessed with dot notation)
        self.data = data[1:]  # store the remaining rows, so the header is split off only once

    def print_data(self, num_rows):  # custom method that prints the first num_rows rows
        print(self.data[:num_rows])

    def column(self, label):  # return the values of the column with the given header label
        index = 0
        if label not in self.header:
            return None
        for idx, value in enumerate(self.header):
            if value == label:
                index = idx
        column = []
        for row in self.data:
            column.append(row[index])

        return column

    def count_unique(self, label):  # count the number of unique values in a column
        values = self.column(label)
        if values is None:  # unknown label: set(None) would raise a TypeError
            return None
        unique_values = set(values)
        count = len(unique_values)

        return count

    def __str__(self):  # used by print() and str(): show the first 10 rows
        return str(self.data[:10])

# Make an instance of the class from the dataset

nfl_dataset = Dataset(nfl_data)

# Try the print_data method

nfl_dataset.print_data(5)

# Access the header and data attributes

header_row = nfl_dataset.header
data_row = nfl_dataset.data

# Get specific columns
year_column = nfl_dataset.column('year')
player_column = nfl_dataset.column('player')  # None: the header has no 'player' column

# Count the unique values in a specific column
nfl_dataset.count_unique('year')

[['2009', '1', 'Pittsburgh Steelers', 'Tennessee Titans'], ['2009', '1', 'Minnesota Vikings', 'Cleveland Browns'], ['2009', '1', 'New York Giants', 'Washington Redskins'], ['2009', '1', 'San Francisco 49ers', 'Arizona Cardinals'], ['2009', '1', 'Seattle Seahawks', 'St. Louis Rams']]


5
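The class above defines `__str__`, but the cell never calls it. Python invokes `__str__` whenever an object is passed to `print()` or `str()`. The sketch below shows this with a reduced class; `MiniDataset` is a hypothetical stand-in that keeps only `__init__` and `__str__`, and the sample rows are copied from the preview of the NFL data.

```python
# Hypothetical reduced version of the Dataset class, for illustration only
class MiniDataset:
    def __init__(self, data):
        self.header = data[0]
        self.data = data[1:]

    def __str__(self):
        # Text shown by print() and str(): the first 10 data rows
        return str(self.data[:10])

sample = [
    ['year', 'week', 'winner', 'loser'],
    ['2009', '1', 'Pittsburgh Steelers', 'Tennessee Titans'],
    ['2009', '1', 'Minnesota Vikings', 'Cleveland Browns'],
]

mini = MiniDataset(sample)
print(mini)        # calls MiniDataset.__str__ behind the scenes
text = str(mini)   # same text, returned as a string
```

Without `__str__`, printing the object would only show its class name and memory address.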

# 3.0
