# Functions: Intermediate

## Default Arguments

- Write a function named open_dataset() that takes the name of a CSV file as input and returns the file's contents as a list of lists.

    - Give the parameter the default argument 'AppleStore.csv'. Name the parameter as you wish.
    - Inside the function definition:
        - Open the file using the open() function.
        - Read in the opened file using the reader() function, which we import from the csv module.
        - Use list() to convert the output returned by reader() to a list of lists.
    
- Use the open_dataset() function to open the AppleStore.csv file.
    - Call the function without any arguments to take advantage of the default argument.
    - Assign the data set to a variable named apps_data.
    
- Inspect the apps_data data set after you run the code to confirm that everything went as expected. 
    - You can try to print the first few rows or just use the variable inspector of the code editor.

In [7]:
def open_dataset(file_name='AppleStore.csv'):
    opened_file = open(file_name, encoding="utf8")
    from csv import reader
    read_file = reader(opened_file)
    data = list(read_file)
    
    return data

apps_data = open_dataset()

In [8]:
apps_data = [row[1:] for row in apps_data]
print(apps_data[0])

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']
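
The default argument only applies when the caller leaves it out. As a minimal sketch, assuming a hypothetical two-row file `demo.csv` written to a temporary directory (it is not part of the AppleStore data), the same function can be pointed at another file either positionally or by keyword. The `with` block is a substitution for the bare `open()` call above, used here so the file is closed once it has been read:

```python
import csv
import os
import tempfile

def open_dataset(file_name='AppleStore.csv'):
    # 'with' closes the file automatically once reading is done.
    with open(file_name, encoding='utf8', newline='') as opened_file:
        return list(csv.reader(opened_file))

# Hypothetical demo file, created only for this example.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.csv')
    with open(demo_path, 'w', encoding='utf8', newline='') as demo_file:
        csv.writer(demo_file).writerows([['id', 'name'], ['1', 'PAC-MAN']])

    demo_data = open_dataset(demo_path)            # positional override
    same_data = open_dataset(file_name=demo_path)  # keyword override

print(demo_data)  # [['id', 'name'], ['1', 'PAC-MAN']]
```

Calling `open_dataset()` with no arguments would still look for `AppleStore.csv` in the working directory.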


## The Official Python Documentation

Use this [link](https://docs.python.org/3/) to access the official Python documentation.

In [1]:
open?

In [2]:
import sys
sys.executable

'C:\\Users\\USER\\Anaconda3\\python.exe'

In [1]:
from platform import python_version
print(python_version())

3.7.2


In [2]:
round?

In [5]:
one_decimal = round(3.43, 1)
print(one_decimal)
two_decimals = round(0.23321, 2)
print(two_decimals)
five_decimals = round(921.2225227, 5)
print(five_decimals)

3.4
0.23
921.22252
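
Two details from the documentation of `round()` are easy to miss, so here is a short sketch of them. It assumes nothing beyond the built-in function itself: the second argument is optional, and ties are rounded to the nearest even integer rather than always up.

```python
# With no second argument, round() returns an integer.
whole = round(3.43)
print(whole, type(whole))   # 3 <class 'int'>

# Ties go to the nearest even integer, not always up.
print(round(0.5), round(1.5), round(2.5))   # 0 2 2

# Outside IPython, help(round) shows the same text as `round?`.
```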


## Using Multiple Return Statements

In [9]:
apps_data[0]

['id',
 'track_name',
 'size_bytes',
 'currency',
 'price',
 'rating_count_tot',
 'rating_count_ver',
 'user_rating',
 'user_rating_ver',
 'ver',
 'cont_rating',
 'prime_genre',
 'sup_devices.num',
 'ipadSc_urls.num',
 'lang.num',
 'vpp_lic']

In [10]:
def open_dataset(file_name='AppleStore.csv', header=True):
    opened_file = open(file_name, encoding="utf8")
    from csv import reader
    read_file = reader(opened_file)
    data = list(read_file)
    
    if header:
        return data[1:]
    else:
        return data
    
apps_data = open_dataset()
apps_data = [row[1:] for row in apps_data]
print(apps_data[0])

['281656475', 'PAC-MAN Premium', '100788224', 'USD', '3.99', '21292', '26', '4', '4.5', '6.3.5', '4+', 'Games', '38', '5', '10', '1']
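
Only one of the two `return` statements runs on any given call. The sketch below isolates that branching with a hypothetical helper, `strip_header()`, that works on an in-memory list instead of a file. The helper and the sample rows are made up for illustration.

```python
def strip_header(rows, header=True):
    # Hypothetical helper: same branching as open_dataset(), minus the file I/O.
    if header:
        return rows[1:]   # the function stops here when header is truthy
    return rows           # reached only when header is falsy

rows = [['id', 'price'], ['1', '3.99'], ['2', '0.0']]

without_header = strip_header(rows)
with_header = strip_header(rows, header=False)

print(without_header)  # [['1', '3.99'], ['2', '0.0']]
print(with_header)     # [['id', 'price'], ['1', '3.99'], ['2', '0.0']]
```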


In [13]:
def sum_and_difference(a, b):
    a_sum = a + b
    difference = a - b
    return a_sum, difference

sum_diff = sum_and_difference(10, 5)
print(sum_diff)
type(sum_diff)

(15, 5)


tuple

- One thing you might have found a bit odd is the structure of the output `(15, 5)`. `(15, 5)` is a tuple, a data type that is very similar to a list (recall that examples of data types include integers, strings, lists, dictionaries, etc.).
- Just like a list, a tuple is usually used for storing multiple values. Creating a tuple is similar to creating a list, except that we use parentheses instead of brackets.
- Just like lists, tuples support positive and negative indexing.
- The main difference between tuples and lists boils down to whether we can modify the existing values or not. We can modify the existing values of a list, but not those of a tuple: assigning to the first value of a list works, while the same assignment on a tuple raises a `TypeError`.
- Tuples are called **immutable** data types because we can't change their state after they've been created. Conversely, lists are **mutable** data types because their state can be changed after they've been created. The only way we could modify tuples, and immutable data types in general, is by recreating them. The table below lists the mutable and immutable data types we've learned so far.

| Mutable | Immutable |
| --- | --- |
| Lists | Tuples |
| Dictionaries | Integers |
| | Floats |
| | Strings |
| | Booleans |

In [26]:
def open_dataset(file_name='AppleStore.csv', header=True):
    opened_file = open(file_name, encoding="utf8")
    from csv import reader
    read_file = reader(opened_file)
    data = list(read_file)
    
    if header:
        return data[1:], data[0]
    else:
        return data


In [29]:
all_data = open_dataset()
header = all_data[1]
header = header[1:]
print(header)
apps_data = all_data[0]
# print(apps_data)

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']


## More About Tuples

When we create a **`tuple`**, surrounding the values with parentheses is optional. It's enough to write the individual values and separate each with a comma. Below, we see two ways of creating a tuple (in the second, we're not using parentheses):

In [1]:
a_tuple = (1, 'a')
print(a_tuple)
print(type(a_tuple))

a_tuple = 1, 'a'
print(a_tuple)
print(type(a_tuple))

(1, 'a')
<class 'tuple'>
(1, 'a')
<class 'tuple'>


With this in mind, remember the syntax we used in the **`return`** statement to return multiple values:

In [3]:
def sum_and_difference(a, b):
    a_sum = a + b
    difference = a - b
    return a_sum, difference

sum_diff = sum_and_difference(15, 5)
print(sum_diff)
print(type(sum_diff))

(20, 10)
<class 'tuple'>


When we write `return a_sum, difference`, Python interprets `a_sum, difference` as a tuple. This is why multiple values are returned as **`tuples`**. If we want to return a **`list`** instead of a tuple, we need to use brackets: `return [a_sum, difference]`.

In [4]:
def sum_and_difference(a, b):
    a_sum = a + b
    difference = a - b
    return [a_sum, difference]

sum_diff = sum_and_difference(15, 5)
print(sum_diff)
print(type(sum_diff))

[20, 10]
<class 'list'>


When we work with tuples, we can assign their individual elements to separate variables in a single line of code.

In [6]:
a_tuple = 1, 2
first_element = a_tuple[0]
second_element = a_tuple[1]

print(first_element)
print(second_element)

# Alternatively we can:

a_tuple = 1, 2
first_element, second_element = a_tuple

print(first_element)
print(second_element)

1
2
1
2


We can do the same with lists — we can assign individual list elements to separate variables in a single line of code:

In [7]:
a_list = [3, 4]
first_element = a_list[0]
second_element = a_list[1]

print(first_element)
print(second_element)

# Alternatively we can:

a_list = [3, 4]
first_element, second_element = a_list

print(first_element)
print(second_element)

3
4
3
4


We can use this variable assignment technique with functions that **`return`** multiple values.

In [8]:
def sum_and_difference(a, b):
    a_sum = a + b
    difference = a - b
    return a_sum, difference

a_sum, a_diff = sum_and_difference(15, 5)

print(a_sum)
print(a_diff)

20
10
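
One caveat with this technique is that the number of variables on the left must match the number of returned values. As a small sketch (the extra variable name is made up for illustration), a mismatch raises a `ValueError`:

```python
def sum_and_difference(a, b):
    a_sum = a + b
    difference = a - b
    return a_sum, difference

# Two values returned, two variables: this works.
a_sum, a_diff = sum_and_difference(15, 5)
print(a_sum, a_diff)   # 20 10

# Two values returned, three variables: Python raises a ValueError.
try:
    a_sum, a_diff, extra = sum_and_difference(15, 5)
except ValueError as error:
    print('ValueError:', error)
```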


- Use the open_dataset() function to open the `AppleStore.csv` file, which has a header row.
    - Do the variable assignment step in a single line of code.
        - Assign the header to a variable named header.
        - Assign the rest of the data set to a variable named apps_data.

In [10]:
def open_dataset(file_name='AppleStore.csv', header=True):        
    opened_file = open(file_name, encoding="utf8")
    from csv import reader
    read_file = reader(opened_file)
    data = list(read_file)
    
    if header:
        return data[1:], data[0]
    else:
        return data

apps_data, header = open_dataset()

print(type(apps_data))
print(type(header))

<class 'list'>
<class 'list'>


## Code Running Quirks

So far, we've been using parameters and `return` statements for all of our functions. Note, however, that parameters and `return` statements are optional:

In [11]:
def print_constant():
    x = 3.14
    print(x)

print_constant()

3.14


Functions without a `return` statement don't return an explicit value. Strictly speaking, they `return` **`None`**, which represents the absence of a value. The `None` value is an instance of the **`NoneType`** data type (just like 5.321 is an instance of the float data type).

In [12]:
def print_constant():
    x = 3.14
    print(x)

j = print_constant()
print(j)
print(type(j))

3.14
None
<class 'NoneType'>
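
Printing a value and returning it are easy to confuse. The sketch below contrasts the function above with a hypothetical variant, `get_constant()`, that returns the value instead of printing it. It also uses `is None`, the usual way to test for `None`.

```python
def print_constant():
    x = 3.14
    print(x)

def get_constant():
    # Hypothetical variant that returns the value instead of printing it.
    return 3.14

printed = print_constant()   # prints 3.14, returns None
returned = get_constant()    # prints nothing, returns 3.14

print(printed is None)   # True
print(returned * 2)      # 6.28
```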


In the function above, notice also that we assigned **`3.14`** to a variable named **`x`**. Although we clearly defined **`x`**, it turns out that we can't access **`x`** outside the function definition — Python raises a **`NameError`** and says that **`x`** is not defined.



In [13]:
def print_constant():
    x = 3.14
    print(x)
    
x

NameError: name 'x' is not defined

To debug the code above, let's start by mentioning that **_Python doesn't run the code we write inside a function's definition until we call that function_**. In the code example above, x = 3.14 is never run.

This behavior applies to every function we create. For instance, below we're trying to divide an empty list by a string, but no error is raised, which shows that the code inside the function's definition is not executed:

In [15]:
def divide():
    [] / 'abc'
    
print('Code finished running, but no error was returned')

Code finished running, but no error was returned


As we've already mentioned, Python executes the code inside a function's definition only when the function is called. Above, we'd get an error only if we called `divide()`, because only then would Python run `[] / 'abc'`.

In [16]:
divide()

TypeError: unsupported operand type(s) for /: 'list' and 'str'

If Python runs the code inside a function definition only when the function is called, it seems that in order to debug the code above we first need to call **`print_constant()`** so that **`x = 3.14`** gets executed.

- Rewrite the print_constant() function above.
- Call the print_constant() function to make sure x = 3.14 gets executed.
- Print the variable x using the print() function.
    - What do you notice about the output?
    - This may be totally unexpected, and we'll explain why this happens in the next screen.

In [19]:
def print_constant():
    x = 3.14
    print(x)
    
print_constant()
print(x)

3.14


NameError: name 'x' is not defined

## Scopes - Global and Local

You might have found the error we got in the previous exercise completely unexpected. After all, we called the print_constant() function, which means that x = 3.14 must have been executed. So why did we still get an error telling us that x is undefined?

When print_constant() is called, x = 3.14 is indeed executed, but the quirk is that Python only saves the x variable temporarily. Python saves x into a kind of temporary memory, which is erased as soon as print_constant() finishes running.

This explains why x is still undefined even after print_constant() is called — the temporary memory associated with print_constant() is immediately erased after the function finishes running, being freed up for later use.

This kind of temporary memory storage doesn't apply to code that runs outside function definitions. If we define x = 3.14 in our main program (outside function definitions), we can use x later on without having to worry that it was erased from memory.



In [20]:
x = 3.14

print('random code')
print('more random code')
print(x)

random code
more random code
3.14


The temporary memory associated with a function is isolated from the memory associated with the main program. The consequence of this is that we can initialize a variable `x = 10` in the main program, and then execute `x = 3.14` in the body of a function without overwriting the x variable of the main program.

<img src="memory_isolation_image.PNG">
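
The isolation described above can be verified with a few lines. This is a minimal sketch that reuses `print_constant()` from earlier and the value `x = 10` from the paragraph above:

```python
x = 10   # global x, defined in the main program

def print_constant():
    x = 3.14   # creates a separate, local x; the global x is untouched
    print(x)

print_constant()   # prints 3.14
print(x)           # prints 10
```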

In [22]:
e = 'mathematical constant'
a_sum = 1000
length = 50

def exponential(x):
    e = 2.72
    print(e)
    return e**x

result = exponential(5)
print(result)

def divide():
    print(a_sum)
    print(length)
    return a_sum / length

result_2 = divide()
print(result_2)

2.72
148.88279736320004
1000
50
20.0
