# <br><br><span style="color:purple">Python Fundamentals Bootcamp - Wednesday</span>

#### <br>*This is a Jupyter Notebook. To run a gray code cell, click in the cell and either click on the "play" arrow, or type shift+enter (or shift+return on a Mac).*

#### <br>*Importing modules*

A quick bonus lesson about importing modules. Later in this notebook we are going to be using the `mean()` function again, from the `statistics` module. We learned yesterday that we can import the package like this:

In [None]:
import statistics

In [None]:
statistics.mean([6, 4, 2, 7, 6])

This can sometimes make our function names long, like `statistics.variance()`.

We can also import modules with a shortened nickname so that we don't have to type out the full module name every time we use a function:

In [None]:
import statistics as st

In [None]:
st.mean([6, 4, 2, 7, 6])

OR, if we know that we are only going to use one or two functions from a module, we can import only those functions. When we do this, we do not have to include the module name when calling the function:

In [None]:
from statistics import mean

In [None]:
mean([6, 4, 2, 7, 6])

In [None]:
from statistics import mean, mode

In [None]:
mean([6, 4, 2, 7, 6])

In [None]:
mode([6, 4, 2, 7, 6])
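If you are wondering whether these import styles give you different functions, they do not. Here is a small sketch that checks this; the variable name `scores` is just made up for the example:

```python
import statistics
import statistics as st
from statistics import mean

# All three names refer to the very same function object.
print(statistics.mean is st.mean)
print(st.mean is mean)

scores = [6, 4, 2, 7, 6]
print(statistics.mean(scores), st.mean(scores), mean(scores))
```

Which style you pick is mostly a matter of readability: the full module name makes it obvious where a function comes from, while the shorter forms save typing.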

#### <br>By popular demand: a very quick lesson on installing modules

You can install and update Python packages on your computer using the command line, but you can also do it from inside a Jupyter Notebook. We don't have time to cover the command line in this workshop, but I will teach you a shortcut.<br><br>If you type `!` directly before a command in a Jupyter Notebook, it tells the notebook to run that command in your command line language instead of Python. We will practice by installing Pandas, which we will be using Friday, and then making sure it is upgraded to the latest version. (The `statistics` module is part of Python's standard library, so it comes with Python and does not need to be installed or upgraded with `pip`.)

In [None]:
!pip install pandas

In [None]:
!pip install pandas --upgrade

<br>*Huge caveat:* If you are working in Jupyter Lab, you will sometimes need to restart Jupyter Lab before the packages will be available to import. There are more complicated workarounds for this, but I think everyone is ok to have to restart once in a while. YOU **DO NOT** NEED TO RESTART RIGHT NOW.

# <br><br>BACK TO THE SLIDES

## <br><br>Dictionaries

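**The key must be a hashable (unchangeable) object, such as a string, a number, or a tuple. In this notebook our keys are strings. The value can be any object.**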

In [None]:
grade_dict = {"Charlie": [90, 96, 89, 79], 
             "Tony": [99, 98, 96, 93], 
             "Suman": [85, 88, 83, 87],
             "Yuvie": [66, 76, 80, 62],
             "May": [97, 94, 89, 91]}

print(grade_dict)
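Strings are the most common keys, but Python accepts other unchangeable objects as keys too. The sketch below uses made-up data (`room_by_number` and `score_by_test` are hypothetical names, not part of our class roster). It also uses a try/except statement, which we cover later in this notebook, to show what happens with a list as a key:

```python
# Hypothetical example data, separate from grade_dict.
room_by_number = {101: "Biology lab", 102: "Chemistry lab"}   # integer keys
score_by_test = {("Charlie", "Test 1"): 90}                    # a tuple as a key

print(room_by_number[101])
print(score_by_test[("Charlie", "Test 1")])

# A list can be changed after it is created, so Python refuses to use it as a key.
list_key_failed = False
try:
    bad_dict = {["Charlie", "Test 1"]: 90}
except TypeError as error:
    list_key_failed = True
    print("Cannot use a list as a key:", error)
```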

#### <br>Indexing a dictionary

In [None]:
grade_dict["Tony"]

In [None]:
grade_dict["Tony"][-1]

#### <br>Adding an entry to a dictionary

You don't have to use a function to add to a dictionary:

In [None]:
grade_dict["Ben"] = [82, 88, 90]

In [None]:
print(grade_dict)
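One thing to watch for: the same syntax that adds a new entry will replace the value if the key already exists. A minimal sketch, using a small made-up dictionary called `demo_grades` so that we do not change `grade_dict`:

```python
demo_grades = {"Ben": [82, 88, 90]}

# "May" is not a key yet, so this adds a new entry.
demo_grades["May"] = [97, 94]

# "Ben" is already a key, so this replaces Ben's old list.
demo_grades["Ben"] = [100]

print(demo_grades)
print(len(demo_grades))
```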

### <br>Looping through a dictionary

In [None]:
for entry in grade_dict:
    print(entry)

If you have Python 3.7 or newer, this will print out the keys in the order in which they were added to the dictionary. In older versions of Python 3 the loop still works, but the order of the keys is not guaranteed.
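If you are curious, you can check the ordering on your own installation. This sketch assumes Python 3.7 or newer, where dictionaries are guaranteed to keep the order in which keys were added; `demo_dict` is a made-up name:

```python
import sys

# Shows the major and minor version of the Python you are running.
print(sys.version_info[:2])

demo_dict = {}
demo_dict["zebra"] = 1
demo_dict["apple"] = 2
demo_dict["mango"] = 3

# On Python 3.7+, the keys come back in the order they were added, not alphabetically.
print(list(demo_dict))
```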

<br>We can be more explicit to tell the computer that we only want to loop through the keys:

In [None]:
for key in grade_dict.keys():
    print(key)

Or we can loop through the values:

In [None]:
for value in grade_dict.values():
    print(value)

<br>Remember that we can give our temporary variable any name we want in our for loop. This is commonly used:

In [None]:
for k in grade_dict.keys():
    print(k)

In [None]:
for v in grade_dict.values():
    print(v)

<br>But it's also good to use more appropriate variable names:

In [None]:
for student in grade_dict.keys():
    print(student)

In [None]:
for grade_list in grade_dict.values():
    print(grade_list)

<br>We can also loop through both the keys and values:

In [None]:
for k, v in grade_dict.items():
    print(k)
    print(v)

In [None]:
for student, grade_list in grade_dict.items():
    print(student)
    print(grade_list)

<br><br>Since our values are list objects, we can also loop through the lists:

In [None]:
for student, grade_list in grade_dict.items():
    print(student)
    for grade in grade_list:
        print(grade)

That code is called a **nested loop** - a loop inside a loop!
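Nested loops become useful once the inner loop does some work for each item. As a rough sketch, here is one way to count how many grades each student has at or above a cutoff. The cutoff of 90 and the names `demo_grades` and `high_counts` are assumptions made up for this example:

```python
demo_grades = {"Charlie": [90, 96, 89, 79], "Tony": [99, 98, 96, 93]}
cutoff = 90   # hypothetical threshold chosen for this example

high_counts = {}
for student, grade_list in demo_grades.items():
    count = 0
    for grade in grade_list:        # the inner loop runs once per grade
        if grade >= cutoff:
            count = count + 1
    high_counts[student] = count    # store the count once the inner loop is done

print(high_counts)
```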

### <br>Adding key:value pairs to an empty dictionary

Here we will create a new dictionary from the data in `grade_dict`. The keys will be the students' names and the values will be their final scores for the class. Each final score will be calculated as the mean of all the scores in that student's grade list.

<br>First, we create an empty dictionary:

In [None]:
final_dict = {}

<br>Next, we loop through the old dictionary, calculate each person's final grade, and add them to the new dictionary:

In [None]:
for student, grade_list in grade_dict.items():
    final_score = statistics.mean(grade_list)
    final_dict[student] = final_score

In [None]:
print(final_dict)

#### <br><br>Working with messy data

<br>If you remember, one of our students, "Ben", only had 3 grades entered, while everyone else had 4. That's something we might want to know when we're calculating final grades. Let's add an if/else statement to our code:

In [None]:
final_dict = {}
for student, grade_list in grade_dict.items():
    if len(grade_list) >= 4:
        final_score = statistics.mean(grade_list)
        final_dict[student] = final_score
    else:
        print(student + " is missing grades.")
print(final_dict)

<br>This code is ok, but it contains that number `4` for the length of the list. Let's say you teach the same class next year and you want to reuse the code, only next year you give 5 tests instead of 4. You would have to remember to find and change that `4` everywhere it appears.

<br>When there are details in the code specific to your data, we say they are **hard coded**.
<br>As a beginner, it is likely that you will do a lot of hard coding to solve your problems, but if you ever want to reuse your scripts or share them with someone else, you should try to avoid hard coding.
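One common way to reduce hard coding, separate from the approach we take in the rest of this section, is to store the detail in a clearly named variable at the top of your code. The sketch below assumes a made-up class for next year (`next_year_grades`, `expected_tests`, and the student names are all hypothetical):

```python
import statistics

# Hypothetical data for next year's class, which has 5 tests.
next_year_grades = {"Ana": [88, 92, 79, 85, 90],
                    "Raj": [75, 80, 91]}

expected_tests = 5   # the only line to change when the number of tests changes

next_year_final = {}
for student, grade_list in next_year_grades.items():
    if len(grade_list) >= expected_tests:
        next_year_final[student] = statistics.mean(grade_list)
    else:
        print(student + " is missing grades.")

print(next_year_final)
```

The number still lives in the code, but it appears once, in an obvious place, with a name that explains what it means.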

<br>First, let's change our grade dictionary to reflect Ben's missing grade. The grade dictionary looks like this:

In [None]:
grade_dict

<br>Ben's value is:

In [None]:
grade_dict["Ben"]

<br> We can reflect Ben's missing grade by adding another data point to Ben's list:

In [None]:
grade_dict["Ben"].append("Missed")
print(grade_dict)

<br>Now we will remove the hard coding and instead handle the missing data through a try/except statement. First, let's run the previous code we wrote, but with our altered grade_dict:

In [None]:
final_dict = {}
for student, grade_list in grade_dict.items():
    if len(grade_list) >= 4:
        final_score = statistics.mean(grade_list)
        final_dict[student] = final_score
    else:
        print(student + " is missing grades.")
print(final_dict)

<br>The error gets thrown because we added a string, `"Missed"`, to Ben's grade list. Python cannot calculate the mean of a list that includes a string, so `statistics.mean()` raises a `TypeError`. <br><br>Instead of specifying `4` as the number of grades required, we can use a try/except statement that catches the `TypeError` we just saw:

In [None]:
final_dict = {}
for student, grade_list in grade_dict.items():
    try:
        final_grade = statistics.mean(grade_list)
        final_dict[student] = final_grade
    except TypeError:
        print(student + " has missing grades.")

print(final_dict)
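Printing a message is fine while you are watching the output, but you might also want to keep track of which students were skipped. Here is one possible sketch that collects their names in a list. The names `demo_grades`, `demo_final`, and `needs_follow_up` are made up for this example:

```python
import statistics

demo_grades = {"Tony": [99, 98, 96, 93],
               "Ben": [82, 88, 90, "Missed"]}

demo_final = {}
needs_follow_up = []   # hypothetical list of students whose mean could not be calculated

for student, grade_list in demo_grades.items():
    try:
        demo_final[student] = statistics.mean(grade_list)
    except TypeError:
        # statistics.mean() could not handle the string "Missed"
        needs_follow_up.append(student)

print(demo_final)
print(needs_follow_up)
```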

# <br><br>BACK TO THE SLIDES

## <br><br>Files

#### <br>First, where are the files we are working with today?

#### <br>*If you are using Jupyter Lab:*

The files should be in your working directory - where you are right now - "wednesday". You should see them in the folder on the left side of your screen (if the folder isn't visible, click on the folder icon on the top left).

#### <br>*If you are using Google Colab:*

You will need to run the line of code directly below this to upload the files from GitHub. *Do not run the next line if you are not using Google Colab.*

In [None]:
!wget https://raw.githubusercontent.com/aGitHasNoName/pythonBootcampWednesday/master/alice.txt
!wget https://raw.githubusercontent.com/aGitHasNoName/pythonBootcampWednesday/master/dogs.txt

### <br><br><br>Reading files

<br>We can first store the names of the files we will be working with as strings:

In [None]:
alice_filename = "alice.txt"
dog_filename = "dogs.txt"

<br>We will use a with/as statement to open the file. Let's try opening the file "alice.txt" and printing it to see what it looks like. We will use the read mode:

In [None]:
with open(alice_filename, "r") as f:
    print(f)

<br>Printing the file object only shows a description of the object, not the text inside the file. We can use a file method, `read()`, to read the contents of the file into a string:

In [None]:
with open(alice_filename, "r") as f:
    alice_text = f.read()

We have now exited the with/as statement, so the file is closed. `alice_text` is stored in memory, but `f` is closed and cannot be accessed again without reopening the file.

In [None]:
type(alice_text)

In [None]:
f.read()

In [None]:
print(alice_text)
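If you want to confirm that a file is closed without triggering an error, file objects have a `closed` attribute. A small sketch, using a throwaway file so that it does not depend on `alice.txt` (the file name `demo.txt` and the variable names are made up):

```python
import os
import tempfile

# Create a small throwaway file in a temporary folder.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, "w") as demo_file:
    demo_file.write("Down the rabbit hole\n")

with open(demo_path, "r") as demo_file:
    demo_text = demo_file.read()

# Leaving the with/as statement closed the file, but the string is still available.
print(demo_file.closed)
print(demo_text)

# Trying to read from the closed file raises a ValueError.
try:
    demo_file.read()
except ValueError as error:
    print("Reading failed:", error)
```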

<br>Notice that `alice_text` is now stored as one long string. Sometimes you will want that. Other times it will be convenient to instead store your text as a list of individual lines instead of one big string.

To store the text as a list of strings, use the file method `readlines()` instead of `read()`:

In [None]:
with open(alice_filename, "r") as f:
    alice_list = f.readlines()

In [None]:
type(alice_list)

In [None]:
len(alice_list)

In [None]:
for line in alice_list:
    print(line)

<br>**Question:** The `len()` function told us that the list was 7 lines long, but when we print it, it looks like there are only 4 lines. What do you think is causing that? What code could you run to test your theory?

<br>We can now do anything with this list that we could do with any other list:

In [None]:
for line in alice_list:
    if "Alice" in line:
        print(line)

<br><br>As a reminder, the `f` variable I've been using in the with/as statement is a temporary variable that can be anything, just like when writing a for loop. `f` is just a commonly used shorthand in with/as statements. 

In [None]:
with open(alice_filename, "r") as FN_2187:
    alice_list = FN_2187.readlines()
len(alice_list)

### <br>Exercise

We saved another filename as `dog_filename`. Write a with/as statement to open the file in read mode. Inside the with/as statement, read the file into a list of lines called `dog_list`. Then, outside the with/as statement, print the list.

### <br><br><br>Writing files

*Remember that when you open a file in write mode, it will first create a new empty file. If you already have a file with the same name, it will empty that file.*
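Here is a small sketch of that behavior, using a throwaway file in a temporary folder so that none of our real files are touched (the file name `notes.txt` is made up):

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(demo_path, "w") as f:
    f.write("first version\n")

# Opening the same file in write mode again empties it before writing.
with open(demo_path, "w") as f:
    f.write("second version\n")

with open(demo_path, "r") as f:
    contents = f.read()

print(contents)
```

Only the second line survives, so double check your filename before opening anything in write mode.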

Let's work with our `alice_list`:

In [None]:
for line in alice_list:
    print(line)

<br>Let's open a new file and write the Alice text without those extra empty new lines.

First, we'll save the name we want for our new file as a string:

In [None]:
new_alice = "alice_clean.txt"

Now we will open this new file in write mode using a with/as statement. Inside that statement, we will write each line of the `alice_list` as long as the line contains more than just the new line character:

In [None]:
with open(new_alice, "w") as f:
    for line in alice_list:
        if line != "\n":
            f.write(line)

<br>To check the file, we can open it in read mode. We will just print the file inside the with/as statement without even saving it as a string or list:

In [None]:
with open(new_alice, "r") as f:
    print(f.read())
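The check `line != "\n"` only skips lines that are exactly one new line character. If your file might also contain lines with only spaces or tabs, one alternative is to test `line.strip()` instead. This is just a sketch with made-up sample lines, and it assumes you want to drop those whitespace-only lines too:

```python
demo_lines = ["Alice was beginning to get very tired\n",
              "\n",
              "   \n",          # only spaces before the new line character
              "of sitting by her sister on the bank\n"]

kept_lines = []
for line in demo_lines:
    # line.strip() removes surrounding whitespace; an empty string counts as False
    if line.strip():
        kept_lines.append(line)

print(kept_lines)
```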

<br><br>You will get more practice with files in the homework.

# <br><br>BACK TO THE SLIDES

## <br><br>Writing functions

First we'll write a function that just does something whenever it's called. It takes no arguments.

In [None]:
def hello():
    # Prints Hello!
    print("Hello!")

<br> Let's call the `hello()` function:

In [None]:
hello()

<br>We can add an argument. The name you give an argument in the first line of the function definition must exactly match the name you use inside the body of the function, just like the temporary variables we saw in for loops and with/as statements:

In [None]:
def hello_you(name):
    #Prints Hello You! replacing You with whatever string you give it.
    print("Hello " + name + "!")

Now we can pass it any string as an argument:

In [None]:
hello_you("Eeyore")

<br><br>Next we'll write a function that creates a new object.

Let's write our own function to find the area of a rectangle.

The arguments our function will need are length and width. 

In [None]:
def area(length, width):
    #This function takes a length and width of a rectangle and returns the area.
    answer = length * width

In [None]:
area(10, 12)

In [None]:
print(answer)

<br>So we created `answer` inside our function definition, but it doesn't exist outside that definition. We need to include a **return** statement if we want our function to return the value of an object created inside the function.

In [None]:
def area(length, width):
    #This function takes a length and width of a rectangle and returns the area.
    answer = length * width
    return answer

In [None]:
area(10, 12)
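It can help to see the two versions side by side. A function without a return statement still runs, but it hands back the special value `None`. In this sketch the function names `area_no_return` and `area_with_return` are made up so that they do not replace our `area()` function:

```python
def area_no_return(length, width):
    # Calculates the area but never hands it back to the caller.
    answer = length * width

def area_with_return(length, width):
    # Calculates the area and returns it.
    answer = length * width
    return answer

missing = area_no_return(10, 12)
found = area_with_return(10, 12)

print(missing)   # None
print(found)     # 120
```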

<br>We can also assign the output of a function to a variable. Let's say my kitchen is 10 feet long and 12 feet wide:

In [None]:
kitchen_area = area(10, 12)

In [None]:
print(kitchen_area)

<br>We can also pass variables to the function:

In [None]:
kitchen_l = 10
kitchen_w = 12

In [None]:
kitchen = area(kitchen_l, kitchen_w)
print(kitchen)

# <br><br>BACK TO THE SLIDES