# <br><br><span style="color:purple">Python Fundamentals Bootcamp - Wednesday</span>

## <br><br><span style="color:teal">More About Modules

### <br>Importing modules

A quick bonus lesson about importing modules. Later in this notebook we are going to use the `mean()` function from the `statistics` module again. We learned yesterday that we can import the module like this:

In [1]:
import statistics

In [2]:
statistics.mean([6, 4, 2, 7, 6])

5

This can sometimes make our function names long, like `statistics.variance()`.

We can also import modules with a shortened nickname so that we don't have to type out the full module name every time we use a function:

In [3]:
import statistics as st

In [4]:
st.mean([6, 4, 2, 7, 6])

5

OR, if we know that we are only going to use one or two functions from a module, we can import only those functions. *When we do this, we do not have to include the module name when calling the function*:

In [5]:
from statistics import mean

In [6]:
mean([6, 4, 2, 7, 6])

5

In [7]:
from statistics import mean, mode

In [8]:
mean([6, 4, 2, 7, 6])

5

In [9]:
mode([6, 4, 2, 7, 6])

6
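
<br>To see the three import styles side by side, here is a small sketch using the same list as above. All three names point at the same `mean()` function, so they should give the same answer. The variable names are only for illustration.

```python
import statistics
import statistics as st
from statistics import mean

numbers = [6, 4, 2, 7, 6]

# Full module name, nickname, and direct import all call the same function
full_result = statistics.mean(numbers)
nickname_result = st.mean(numbers)
direct_result = mean(numbers)

print(full_result, nickname_result, direct_result)
```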

### <br>A very quick lesson on installing modules

You can install and update Python modules onto your computer using the command line, but you can also do it from inside a Jupyter Notebook. We don't have time to cover the command line in this workshop, but I will teach you a shortcut.<br><br>If you use `!` directly before a command in a Jupyter Notebook, it tells the notebook to run that command in your command line instead of as Python. We will practice by installing Pandas, which is a very commonly used package for working with dataframes, and then making sure it is upgraded to the latest version. (The `statistics` module is part of Python's standard library, so it comes with Python and is not installed or upgraded with pip.)

In [10]:
!pip install pandas



In [11]:
!pip install pandas --upgrade



<br>*Huge caveat:* If you are working in Jupyter Lab, you will sometimes need to restart Jupyter Lab before the packages will be available to import. There are more complicated workarounds for this, but I think everyone is ok to have to restart once in a while. YOU **DO NOT** NEED TO RESTART RIGHT NOW.
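
<br>If you are unsure whether a module is available to import, one possible check is sketched below. It uses `importlib.util.find_spec()` from the standard library; the helper name `can_import` is made up for this example.

```python
import importlib.util

def can_import(module_name):
    # Returns True if Python can find a module with this name
    return importlib.util.find_spec(module_name) is not None

# statistics ships with Python, so this should be True
print(can_import("statistics"))

# This depends on whether pandas is installed on your computer
print(can_import("pandas"))
```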

### <br><br>Today's Objects
- Dictionaries
- Files

### Today's Functions
- *Writing your own functions*

### Today's Concepts
- looping through dictionaries

## <br><br><span style="color:teal">Dictionaries

During yesterday's lecture, we practiced looping through fictional characters from Avengers, Star Wars, and Moana. But how did the computer know who was in which movie, or who could fly?

A dictionary is **a collection of *key: value* pairs.** 
- Dictionaries are surrounded by curly brackets {}
- **key: value pairs** inside the dictionary are separated by commas
- In each **key: value pair**, the key and value are separated by a colon :
- The key is usually a string, but it can be any immutable object, such as a number or a tuple
- The value can be any object

<br>Here's a dictionary of heights, in inches. The keys are people's names and the values are integers:

In [12]:
inches_dict = {"Jo": 60, "Rae": 68, "Tom": 65}

`dict` *is a common abbreviation for a dictionary in Python.*

<br>This dictionary has info about someone's mom. The keys are trait categories and the values are a mix of integers and strings:

In [13]:
mom_dict = {"height": 65, "eyes": "hazel", 
            "hair": "gray", "age": 70}

<br>This dictionary contains results of an experiment. The keys are the names of the test runs and the values are lists of floats:

In [14]:
results = {"test1": [3.4, 0.2, 1.4, 2.2, 8.0], 
           "test2": [0.9, 3.4, 2.5, 4.7, 2.6], 
           "test3": [4.9, 2.4, 0.4, 8.4, 2.5]}

*If a dictionary is long, you can write it on multiple lines, just like a list.*

### <br><br>Indexing a dictionary

In [15]:
grade_dict = {"Charlie": [90, 96, 89, 79], 
              "Tony": [99, 98, 96, 93], 
              "Suman": [85, 88, 83, 87],
              "Yuvie": [66, 76, 80, 62],
              "May": [97, 94, 89, 91]}

print(grade_dict)

{'Charlie': [90, 96, 89, 79], 'Tony': [99, 98, 96, 93], 'Suman': [85, 88, 83, 87], 'Yuvie': [66, 76, 80, 62], 'May': [97, 94, 89, 91]}


<br>**Unlike lists, dictionaries are indexed by the name of the key. They cannot be indexed by position in the dictionary.** Since Python 3.7, dictionaries keep their entries in the order they were added, but the purpose of a dictionary isn't to look entries up by position - would you ever need to know what the 110th word in the Oxford English Dictionary is?

In [16]:
grade_dict["Tony"]

[99, 98, 96, 93]

In [17]:
grade_dict[3]

KeyError: 3
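
<br>If you are not sure whether a key exists, you can check before indexing. This is a minimal sketch using a shortened copy of the dictionary (`sample_grades` is an illustrative name, not part of the lesson data). It shows the `in` operator and the dictionary method `get()`.

```python
sample_grades = {"Charlie": [90, 96, 89, 79],
                 "Tony": [99, 98, 96, 93]}

# `in` checks the keys of a dictionary
has_tony = "Tony" in sample_grades
has_three = 3 in sample_grades
print(has_tony, has_three)

# get() returns None instead of raising a KeyError when the key is missing
missing = sample_grades.get(3)
print(missing)
```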

<br>To index something inside a value, first index the key, then the position in the value. In our `grade_dict` example, the values are lists, so to index Tony's last grade:

In [18]:
grade_dict["Tony"][-1]

93

### <br><span style="color:red">Exercise: Creating and indexing dictionaries

Create a dictionary called `favorites`. The keys should be "color", "food", and "song". The values should be your favorite color, food, and song. 

In [19]:
favorites = {"color": "magenta", "food": "cheese", "song": "You're Welcome"}

Write code to index your favorite song:

In [20]:
favorites["song"]

"You're Welcome"

Write code to index the third letter of your favorite food.

In [21]:
favorites["food"][2]

'e'

### <br><br>Adding an entry to a dictionary

You don't have to use a function to add to a dictionary. Just **index** a new key and **assign** it a value.

Let's look at the `grade_dict` as it is now, and then add a new student.

In [22]:
print(grade_dict)

{'Charlie': [90, 96, 89, 79], 'Tony': [99, 98, 96, 93], 'Suman': [85, 88, 83, 87], 'Yuvie': [66, 76, 80, 62], 'May': [97, 94, 89, 91]}


In [23]:
grade_dict["Ben"] = [60, 57, 63]

In [24]:
print(grade_dict)

{'Charlie': [90, 96, 89, 79], 'Tony': [99, 98, 96, 93], 'Suman': [85, 88, 83, 87], 'Yuvie': [66, 76, 80, 62], 'May': [97, 94, 89, 91], 'Ben': [60, 57, 63]}


<br>If the item already exists in the dictionary, you will overwrite it:

In [25]:
grade_dict["Ben"] = [82, 88, 90]
print(grade_dict)

{'Charlie': [90, 96, 89, 79], 'Tony': [99, 98, 96, 93], 'Suman': [85, 88, 83, 87], 'Yuvie': [66, 76, 80, 62], 'May': [97, 94, 89, 91], 'Ben': [82, 88, 90]}


### <br><span style="color:red">Exercise: Adding to dictionaries

Add a new key:value pair to your `favorites` dictionary. The key could be tv_show, movie, book, or anything else you'd like to add. 

In [26]:
favorites["movie"] = "Dune"

In [27]:
print(favorites)

{'color': 'magenta', 'food': 'cheese', 'song': "You're Welcome", 'movie': 'Dune'}


### <br><br>Looping through a dictionary

In [28]:
for entry in grade_dict:
    print(entry)

Charlie
Tony
Suman
Yuvie
May
Ben


If you have Python 3.7 or later, it will print the keys in the order they were added to the dictionary. In older versions of Python 3, the loop still works, but the keys may come out in a different order.

<br>We can, and should, be more explicit to tell the computer that we want to loop through only the keys by adding the `keys()` method to the end of our dictionary:

In [29]:
for key in grade_dict.keys():
    print(key)

Charlie
Tony
Suman
Yuvie
May
Ben


Or we can loop through the values:

In [30]:
for value in grade_dict.values():
    print(value)

[90, 96, 89, 79]
[99, 98, 96, 93]
[85, 88, 83, 87]
[66, 76, 80, 62]
[97, 94, 89, 91]
[82, 88, 90]


<br>Remember that we can give our temporary variable any name we want in our for loop. This is commonly used:

In [31]:
for k in grade_dict.keys():
    print(k)

Charlie
Tony
Suman
Yuvie
May
Ben


In [32]:
for v in grade_dict.values():
    print(v)

[90, 96, 89, 79]
[99, 98, 96, 93]
[85, 88, 83, 87]
[66, 76, 80, 62]
[97, 94, 89, 91]
[82, 88, 90]


<br>But it's also good to use more appropriate variable names:

In [33]:
for student in grade_dict.keys():
    print(student)

Charlie
Tony
Suman
Yuvie
May
Ben


In [34]:
for grade_list in grade_dict.values():
    print(grade_list)

[90, 96, 89, 79]
[99, 98, 96, 93]
[85, 88, 83, 87]
[66, 76, 80, 62]
[97, 94, 89, 91]
[82, 88, 90]


<br>We can also loop through both the keys and values using the `items()` method. We include two temporary variables in our `for` loop statement instead of one:

In [35]:
for k, v in grade_dict.items():
    print(k)
    print(v)

Charlie
[90, 96, 89, 79]
Tony
[99, 98, 96, 93]
Suman
[85, 88, 83, 87]
Yuvie
[66, 76, 80, 62]
May
[97, 94, 89, 91]
Ben
[82, 88, 90]


In [36]:
for student, grade_list in grade_dict.items():
    print(student)
    print(grade_list)

Charlie
[90, 96, 89, 79]
Tony
[99, 98, 96, 93]
Suman
[85, 88, 83, 87]
Yuvie
[66, 76, 80, 62]
May
[97, 94, 89, 91]
Ben
[82, 88, 90]


<br><br>Since our values are list objects, we can also use a **nested loop** to loop through both the dictionary and the lists:

In [37]:
for student, grade_list in grade_dict.items():
    print(student)
    for grade in grade_list:
        print(grade)

Charlie
90
96
89
79
Tony
99
98
96
93
Suman
85
88
83
87
Yuvie
66
76
80
62
May
97
94
89
91
Ben
82
88
90


That code is called a **nested loop** - a loop inside a loop!

### <br><span style="color:red">Exercise: Looping through a dictionary

Run the cell below to store the `nicknames` dictionary. The keys are the full names, and the values are the nicknames.

In [38]:
nicknames = {"Charles": "Charlie", 
             "Anthony": "Tony", 
             "Suman": "Suman", 
             "Yuval": "Yuvie", 
             "May-Lin": "May", 
             "Benjamin": "Ben"}

Write a nested loop to print out each letter in each person's nickname:

In [39]:
for n in nicknames.values():
    for letter in n:
        print(letter)

C
h
a
r
l
i
e
T
o
n
y
S
u
m
a
n
Y
u
v
i
e
M
a
y
B
e
n


In [52]:
for k, v in list(nicknames.items())[:4]:
    print(k)
    print(v)

Charles
Charlie
Anthony
Tony
Suman
Suman
Yuval
Yuvie


In [53]:
nicknames.keys

<function dict.keys>
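
<br>The two cells above are worth a note. A dictionary cannot be sliced directly, so `list(nicknames.items())[:4]` first converts the pairs to a list and then takes the first four. And `nicknames.keys` without parentheses refers to the method itself instead of calling it. Here is a small sketch with a shortened copy of the dictionary (`short_names` is just an illustrative name):

```python
short_names = {"Charles": "Charlie", "Anthony": "Tony", "Suman": "Suman"}

# Without parentheses we get the method itself, not the keys
method_itself = short_names.keys
print(callable(method_itself))

# With parentheses we call the method and get the keys
key_list = list(short_names.keys())
print(key_list)

# Convert the pairs to a list so that we can slice the first two
first_two = list(short_names.items())[:2]
print(first_two)
```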

### <br><br>Adding key:value pairs to an empty dictionary

Yesterday we learned how to loop through a list and add items to a new empty list. We can also do that with dictionaries.
<br><br>Here we will create a new dictionary from the data in the `grade_dict`. The keys will be the students' names and the values will be their final score for the class. The final score will be calculated as the mean of all the scores in their grade list.

View the `grade_dict`:

In [54]:
print(grade_dict)

{'Charlie': [90, 96, 89, 79], 'Tony': [99, 98, 96, 93], 'Suman': [85, 88, 83, 87], 'Yuvie': [66, 76, 80, 62], 'May': [97, 94, 89, 91], 'Ben': [82, 88, 90]}


In [55]:
import statistics

<br>First, we create an empty dictionary:

In [56]:
final_dict = {}

<br>Next, we loop through the old dictionary, calculate each person's final grade, and add them to the new dictionary:

In [57]:
final_dict = {}
for student, grade_list in grade_dict.items():
    final_score = round(float(statistics.mean(grade_list)), 2)
    final_dict[student] = final_score

In [58]:
print(final_dict)

{'Charlie': 88.5, 'Tony': 96.5, 'Suman': 85.75, 'Yuvie': 71.0, 'May': 92.75, 'Ben': 86.67}


### <br><br>Working with messy data: a messy example

<br>If you remember, one of our students, "Ben", only had 3 grades entered, while everyone else had 4. That's something we might want to know when we're calculating final grades. Let's add an if/else statement to our code:

In [59]:
final_dict = {}
for student, grade_list in grade_dict.items():
    if len(grade_list) >= 4:
        final_score = statistics.mean(grade_list)
        final_dict[student] = final_score
    else:
        print(student + " is missing grades.")
print(final_dict)

Ben is missing grades.
{'Charlie': 88.5, 'Tony': 96.5, 'Suman': 85.75, 'Yuvie': 71, 'May': 92.75}


<br>This code is ok, but it contains that number `4` for the length of the list. Let's say you teach the same class next year and you want to reuse the code, only next year you give 5 tests instead of 4. 

<br>When there are details in the code specific to your data, we say they are **hard coded**.
<br><br>As a beginner, you will do a lot of hard coding to solve your problems, but if you ever want to reuse your scripts or share them with someone else, you will need to try to not hard code.
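
<br>One possible way to avoid the hard-coded `4` is to work out the expected number of grades from the data itself. This sketch assumes that at least one student has a complete set of grades, and it uses a shortened copy of the data (`sample_grades`, `sample_finals`, and `expected_count` are illustrative names).

```python
import statistics

sample_grades = {"Charlie": [90, 96, 89, 79],
                 "May": [97, 94, 89, 91],
                 "Ben": [82, 88, 90]}

# The longest grade list tells us how many grades a student should have
expected_count = max(len(grade_list) for grade_list in sample_grades.values())

sample_finals = {}
for student, grade_list in sample_grades.items():
    if len(grade_list) >= expected_count:
        sample_finals[student] = statistics.mean(grade_list)
    else:
        print(student + " is missing grades.")

print(sample_finals)
```

If next year's class has 5 tests, this code would not need to change.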

<br>First, let's change our grade dictionary to reflect Ben's missing grade. The grade dictionary looks like this:

In [60]:
grade_dict

{'Charlie': [90, 96, 89, 79],
 'Tony': [99, 98, 96, 93],
 'Suman': [85, 88, 83, 87],
 'Yuvie': [66, 76, 80, 62],
 'May': [97, 94, 89, 91],
 'Ben': [82, 88, 90]}

<br>Ben's value is:

In [61]:
grade_dict["Ben"]

[82, 88, 90]

<br> We can reflect Ben's missing grade by adding another data point to Ben's list. Ben's value in the dictionary is a list, so we can index the list and then append to it.

In [62]:
grade_dict["Ben"].append("Missed")
print(grade_dict)

{'Charlie': [90, 96, 89, 79], 'Tony': [99, 98, 96, 93], 'Suman': [85, 88, 83, 87], 'Yuvie': [66, 76, 80, 62], 'May': [97, 94, 89, 91], 'Ben': [82, 88, 90, 'Missed']}


<br>Now we will remove the hard coding and instead handle the missing data with a try/except statement. First, let's run the previous code with our altered `grade_dict` to see the error that we want to catch:

In [63]:
final_dict = {}
for student, grade_list in grade_dict.items():
    if len(grade_list) >= 4:
        final_score = statistics.mean(grade_list)
        final_dict[student] = final_score
    else:
        print(student + " is missing grades.")
print(final_dict)

TypeError: can't convert type 'str' to numerator/denominator

<br>The error gets thrown because we added a string, `Missed`, to the `grade_list`. Python cannot calculate the mean of a list that includes a string. <br><br>Instead of specifying "4" as the number of grades required, we can use a try/except statement that references the error we just saw:

In [64]:
final_dict = {}
for student, grade_list in grade_dict.items():
    try:
        final_grade = statistics.mean(grade_list)
        final_dict[student] = final_grade
    except TypeError:
        print(student + " has missing grades.")

print(final_dict)

Ben has missing grades.
{'Charlie': 88.5, 'Tony': 96.5, 'Suman': 85.75, 'Yuvie': 71, 'May': 92.75}
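
<br>Printing a message is one option. A small variation, sketched here with a shortened copy of the data, is to collect the names in a list so that you can follow up later. The names `sample_grades`, `sample_finals`, and `needs_followup` are made up for this example.

```python
import statistics

sample_grades = {"Tony": [99, 98, 96, 93],
                 "Ben": [82, 88, 90, "Missed"]}

sample_finals = {}
needs_followup = []
for student, grade_list in sample_grades.items():
    try:
        sample_finals[student] = statistics.mean(grade_list)
    except TypeError:
        # mean() could not handle the string in this list
        needs_followup.append(student)

print(sample_finals)
print(needs_followup)
```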


## <br><br><br><br><span style="color:teal">Files

#### <br>First, where are the files we are working with today?

#### <br>*If you are using Jupyter Lab:*

The files should be in your working directory - where you are right now - "wednesday". You should see them in the filetree on the left side of your screen (if the files aren't visible, click on the folder icon on the top left).

#### <br>*If you are using Google Colab:*

You will need to run the line of code directly below this to upload the files from GitHub. *Do not run the next line if you are not using Google Colab.*

In [None]:
!wget https://raw.githubusercontent.com/aGitHasNoName/pythonBootcamp_3Day/main/alice.txt
!wget https://raw.githubusercontent.com/aGitHasNoName/pythonBootcamp_3Day/main/dogs.txt

### <br><br><br>Reading files

<br>We can first store the names of the files we will be working with as strings:

In [65]:
alice_filename = "alice.txt"
dog_filename = "dogs.txt"

<br><br>Python has a basic way to open files, but I'm going to teach you the better way, which is the way most Python coders open files. You may someday encounter a situation where you need to open a file the old way, so I'll show you the syntax briefly.

`f = open(filename, "r")`
<br>*`#do something with the file`*
<br>`f.close()`

This leaves the file open until you explicitly close it, and it is easy to forget to close it, especially if an error interrupts your code before it reaches `f.close()`.
<br><br>An open file holds on to a resource from your operating system, and text you have written may not be fully saved to disk until the file is closed.

### <br>with/as statement: The better way to open files 

Here is the syntax to read a file. (This code isn't ready to run, it's just to look at to see the syntax.)

In [None]:
with open(filename, "r") as f:
    #save file as some other object
    #or save part of a file

The first line is the **with/as** statement. The `f` is a temporary variable that will store the file object. Just like in a for loop, you can use anything for the temporary variable, but `f` is commonly used.
<br><br>**Inside** the with/as statement, you want to read the file's contents into a different object type, such as a string or a list, that you can keep using after the file is closed.
<br><br>The file will automatically close when we exit the with/as statement (exit the indentation).
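
<br>To see the automatic closing in action, here is a sketch that creates a throwaway file in a temporary folder and checks the file object's `closed` attribute inside and outside the with/as statement. The file name and variable names are made up for this example.

```python
import os
import tempfile

# Create a small throwaway file to practice on
practice_path = os.path.join(tempfile.mkdtemp(), "practice.txt")
with open(practice_path, "w") as f:
    f.write("hello\n")

with open(practice_path, "r") as f:
    inside = f.closed      # the file is still open inside the block
    text = f.read()

outside = f.closed         # the file closed when the block ended
print(inside, outside)
print(text)
```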

<br>**The open function**
<br>We will give the `open()` function two arguments: the filename and the mode.

Mode options:
- "r"  read
- <span style="color:red">"w"  write (wipes the file clean if it already exists)
- "a"  append (add to the end of whatever is already in the file)
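
<br>The difference between write mode and append mode is easiest to see on a throwaway file. This is a minimal sketch; the file is created in a temporary folder, and the file name and variable names are made up for this example.

```python
import os
import tempfile

practice_path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# "w" creates the file and writes to it
with open(practice_path, "w") as f:
    f.write("first\n")

# "a" adds to the end of the existing contents
with open(practice_path, "a") as f:
    f.write("second\n")

with open(practice_path, "r") as f:
    after_append = f.read()

# "w" again wipes the file before writing
with open(practice_path, "w") as f:
    f.write("third\n")

with open(practice_path, "r") as f:
    after_write = f.read()

print(repr(after_append))
print(repr(after_write))
```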


<br>**Filenames**
<br>If you are accessing a file in your current working directory, you can just include the filename, but if the file is in a different directory, you must include either the relative or absolute path.

<br><br>Let's try opening the file "alice.txt" and printing it to see what it looks like. We will use the read mode:

In [66]:
with open(alice_filename, "r") as f:
    print(f)

<_io.TextIOWrapper name='alice.txt' mode='r' encoding='UTF-8'>


<br>Printing the file object shows a description of the object, not the text inside the file, so we need to read the contents into another object before exiting the with/as statement.

#### <br><br>Storing a file as a string

We can use a file object method, `read()`, to read the contents of the file into a string:

In [67]:
with open(alice_filename, "r") as f:
    alice_text = f.read()

We have now exited the with/as statement, so the file is closed. `alice_text` is stored in memory, but `f` is closed and cannot be accessed again without reopening the file.

In [68]:
type(alice_text)

str

In [69]:
f.read()

ValueError: I/O operation on closed file.

In [70]:
print(alice_text)

Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, "and what is the use of a book," thought Alice, "without pictures or conversations?"

So she was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid) whether the pleasure of making a daisy-chain would be worth the trouble of getting up and picking the daisies, when suddenly a White Rabbit with pink eyes ran close by her.

There was nothing so very remarkable in that; nor did Alice think it so very much out of the way to hear the Rabbit say to itself, "Oh dear! Oh dear! I shall be too late!" (when she thought it over afterwards, it occurred to her that she ought to have wondered at this, but at the time it all seemed quite natural); but when the Rabbit actually took a watch out of its waistcoat-pocket, and looked at it, 

<br><br>Notice that `alice_text` is now stored as one long string. Sometimes you will want that. Other times it will be convenient to instead store your text as a list of individual lines instead of one big string.

#### <br><br>Storing a file as a list of strings (lines)

To store the text as a list of strings, use the file method `readlines()`. This will break the whole text up by any new line characters.

In [71]:
with open(alice_filename, "r") as f:
    alice_list = f.readlines()

In [72]:
type(alice_list)

list

In [73]:
len(alice_list)

7

In [74]:
for line in alice_list:
    print(line)

Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, "and what is the use of a book," thought Alice, "without pictures or conversations?"



So she was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid) whether the pleasure of making a daisy-chain would be worth the trouble of getting up and picking the daisies, when suddenly a White Rabbit with pink eyes ran close by her.



There was nothing so very remarkable in that; nor did Alice think it so very much out of the way to hear the Rabbit say to itself, "Oh dear! Oh dear! I shall be too late!" (when she thought it over afterwards, it occurred to her that she ought to have wondered at this, but at the time it all seemed quite natural); but when the Rabbit actually took a watch out of its waistcoat-pocket, and looked at 

<br>**Question:** The `len()` function told us that the list was 7 lines long, but when we print it it looks like there are only 4 lines. What do you think is causing that? What code could you run to test your theory?

In [75]:
alice_list

['Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, "and what is the use of a book," thought Alice, "without pictures or conversations?"\n',
 '\n',
 'So she was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid) whether the pleasure of making a daisy-chain would be worth the trouble of getting up and picking the daisies, when suddenly a White Rabbit with pink eyes ran close by her.\n',
 '\n',
 'There was nothing so very remarkable in that; nor did Alice think it so very much out of the way to hear the Rabbit say to itself, "Oh dear! Oh dear! I shall be too late!" (when she thought it over afterwards, it occurred to her that she ought to have wondered at this, but at the time it all seemed quite natural); but when the Rabbit actually took a watch out of its waistcoat-
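
<br>The list view shows that each line ends with a `\n` character, and some entries are nothing but `\n`. `print()` also adds its own new line after each item, which is why extra blank lines show up when we print in a loop. Here is a sketch with a made-up three-item list (`sample_lines` is an illustrative stand-in for `alice_list`):

```python
sample_lines = ["First paragraph.\n", "\n", "Second paragraph.\n"]

# repr() shows the hidden new line characters
for line in sample_lines:
    print(repr(line))

# strip() removes whitespace, including "\n", from both ends of a string
stripped = []
for line in sample_lines:
    stripped.append(line.strip())
print(stripped)

# Count the entries that are only a new line character
blank_count = sample_lines.count("\n")
print(blank_count)
```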

<br><br>We can now do anything with this list that we could do with any other list:

In [76]:
for line in alice_list:
    if "Alice" in line:
        print(line)

Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, "and what is the use of a book," thought Alice, "without pictures or conversations?"

There was nothing so very remarkable in that; nor did Alice think it so very much out of the way to hear the Rabbit say to itself, "Oh dear! Oh dear! I shall be too late!" (when she thought it over afterwards, it occurred to her that she ought to have wondered at this, but at the time it all seemed quite natural); but when the Rabbit actually took a watch out of its waistcoat-pocket, and looked at it, and then hurried on, Alice started to her feet, for it flashed across her mind that she had never before seen a rabbit with either a waistcoat-pocket, or a watch to take out of it, and burning with curiosity, she ran across the field after it, and was just in time to see it pop down a larg

<br><br>As a reminder, the `f` variable I've been using in the with/as statement is a temporary variable that can be anything, just like when writing a for loop. `f` is just a commonly used shorthand in with/as statements. 

In [77]:
with open(alice_filename, "r") as FN_2187:
    alice_list = FN_2187.readlines()
len(alice_list)

7

### <br><span style="color:red">Exercise: Reading a file

We saved another filename as `dog_filename`. Write a with/as statement to open the file in read mode. Inside the with/as statement, save the file as a list of lines called `dog_list`. Then, outside the with/as statement, print the list.

In [79]:
with open(dog_filename, "r") as f:
    dog_list = f.readlines()
print(dog_list)

['affenpinscher\n', 'Afghan hound\n', 'Airedale terrier\n', 'Akita\n', 'Alaskan Malamute\n', 'American Staffordshire terrier\n', 'American water spaniel\n', 'Australian cattle dog\n', 'Australian shepherd\n', 'Australian terrier\n', 'basenji\n', 'basset hound\n', 'beagle\n', 'bearded collie\n', 'Bedlington terrier\n', 'Bernese mountain dog\n', 'bichon frise\n', 'black and tan coonhound\n', 'bloodhound\n', 'border collie\n', 'border terrier\n', 'borzoi\n', 'Boston terrier\n', 'bouvier des Flandres\n', 'boxer\n', 'briard\n', 'Brittany\n', 'Brussels griffon\n', 'bull terrier\n', 'bulldog\n', 'bullmastiff\n', 'cairn terrier\n', 'Canaan dog\n', 'Chesapeake Bay retriever\n', 'Chihuahua\n', 'Chinese crested\n', 'Chinese shar-pei\n', 'chow chow\n', 'Clumber spaniel\n', 'cocker spaniel\n', 'collie\n', 'curly-coated retriever\n', 'dachshund\n', 'Dalmatian\n', 'Doberman pinscher\n', 'English cocker spaniel\n', 'English setter\n', 'English springer spaniel\n', 'English toy spaniel\n', 'Eskimo dog\

### <br><br><br>Writing files

*Remember that when you open a file in write mode, it will first create a new empty file. If you already have a file with the same name, it will empty that file.*

Let's work with our `alice_list`:

In [80]:
for line in alice_list:
    print(line)

Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, "and what is the use of a book," thought Alice, "without pictures or conversations?"



So she was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid) whether the pleasure of making a daisy-chain would be worth the trouble of getting up and picking the daisies, when suddenly a White Rabbit with pink eyes ran close by her.



There was nothing so very remarkable in that; nor did Alice think it so very much out of the way to hear the Rabbit say to itself, "Oh dear! Oh dear! I shall be too late!" (when she thought it over afterwards, it occurred to her that she ought to have wondered at this, but at the time it all seemed quite natural); but when the Rabbit actually took a watch out of its waistcoat-pocket, and looked at 

<br>Let's open a new file and write the Alice text without those extra empty new lines.

First, we'll save the name we want for our new file as a string:

In [81]:
new_alice = "alice_clean.txt"

Now we will open this new file in **write mode** using a with/as statement. Inside that statement, we will write each line of the `alice_list` as long as the line contains more than just the new line character. To write, we use the file method `write()`.

In [82]:
with open(new_alice, "w") as f:
    for line in alice_list:
        if line != "\n":
            f.write(line)

<br>To check the file, we can open it in read mode. We will just print the file inside the with/as statement without even saving it as a string or list:

In [83]:
with open(new_alice, "r") as f:
    print(f.read())

Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, "and what is the use of a book," thought Alice, "without pictures or conversations?"
So she was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid) whether the pleasure of making a daisy-chain would be worth the trouble of getting up and picking the daisies, when suddenly a White Rabbit with pink eyes ran close by her.
There was nothing so very remarkable in that; nor did Alice think it so very much out of the way to hear the Rabbit say to itself, "Oh dear! Oh dear! I shall be too late!" (when she thought it over afterwards, it occurred to her that she ought to have wondered at this, but at the time it all seemed quite natural); but when the Rabbit actually took a watch out of its waistcoat-pocket, and looked at it, an

<br><br>You will get more practice with files this afternoon.

## <br><br><br><br><span style="color:teal">Writing Functions

<br>You already know how to **call** a function.

In [84]:
len("How long is this string?")

24

In [85]:
round(6479.382029, -2)

6500.0

<br><br>You can also write your own custom functions. Why would you want to do that?
- If you find yourself using the same code repeatedly, so that you don't have to write it over and over
- If you want to break your code up into chunks to make it much more readable

<br>To create our own function, we create a **function definition**.

The function definition starts with a **def statement**.
<br><br>The next line (inside the indentation) should be a short **comment** that says what your function does. This is just good practice. Comments start with a `#` and are ignored by Python. 
<br><br>Next (still inside the def statement), you write the code for what the function does.

Here is the syntax. (This code won't work, it's just to look at.)

In [None]:
def function_name(arguments, if_needed):
    #a useful comment
    do something or create a new object

#### <br><br>Writing a function with no arguments

First we'll write a function that just does something whenever it's called. It takes no arguments.

In [86]:
def hello():
    # Prints Hello!
    print("Hello!")

<br> Let's call the `hello()` function:

In [87]:
hello()

Hello!


#### <br><br>Writing a function with one argument

We can add an argument. The name you give an argument in the def statement must exactly match the name you use for it inside the function, just like the temporary variables we saw in for loops and with/as statements:

In [88]:
def hello_you(name):
    #Prints Hello You! replacing You with whatever string you give it.
    print("Hello " + name + "!")

Now we can pass it any string as an argument:

In [89]:
hello_you("Eeyore")

Hello Eeyore!


#### <br><br>Writing a function that returns an object

Let's write our own function to find the area of a rectangle.

The arguments our function will need are length and width. 

In [90]:
def area(length, width):
    #This function takes a length and width of a rectangle and returns the area.
    answer = length * width

In [91]:
area(10, 12)

In [92]:
print(answer)

NameError: name 'answer' is not defined

<br><br>So we created `answer` inside our function definition, but it doesn't exist outside that definition. We need to include a **return statement** if we want our function to return the value of an object created inside the function.

In [93]:
def area(length, width):
    #This function takes a length and width of a rectangle and returns the area.
    answer = length * width
    return answer

In [94]:
area(10, 12)

120
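
<br>A function with no return statement still gives something back: the special value `None`. Here is a small sketch comparing the two versions (the names `area_no_return` and `area_with_return` are made up for this comparison):

```python
def area_no_return(length, width):
    # Calculates the area but never returns it
    answer = length * width

def area_with_return(length, width):
    # Calculates the area and returns it
    answer = length * width
    return answer

first_try = area_no_return(10, 12)
second_try = area_with_return(10, 12)
print(first_try)
print(second_try)
```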

<br>Like any function, we can assign the output of a custom function to a variable. Let's say my kitchen is 10 feet long and 12 feet wide:

In [95]:
kitchen_area = area(10, 12)

In [96]:
print(kitchen_area)

120


<br>Also like other functions, we can pass variables to the function as our arguments:

In [97]:
kitchen_l = 10
kitchen_w = 12

In [98]:
kitchen = area(kitchen_l, kitchen_w)
print(kitchen)

120


### <br><span style="color:red">Exercise: Writing a function

Define a function called `initials`. It should take two strings as arguments - `first` and `last`. The function should return the first letters of each argument, combined into one string.
<br><br>For example, if I called `initials("Colby", "Wood")` it should return `'CW'`.

In [108]:
def initials(a_list):
    #Returns the initials created from first and last names
    final_initials = ""
    for word in a_list:
        init = word[0]
        final_initials += init
    return final_initials

Test the function with your name:

In [109]:
initials(["Colby", "Witherup", "Wood"])

'CWW'
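
<br>The solution above takes a list of names, which lets it handle any number of names. A version that follows the exercise wording exactly, with two string arguments, might look like this (it is named `initials_two` here only to keep it separate from the function above):

```python
def initials_two(first, last):
    # Returns the first letters of first and last, combined into one string
    return first[0] + last[0]

result = initials_two("Colby", "Wood")
print(result)
```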

*Did you remember to include a comment in your function?*

### <br><br>More function practice

Let's try to create a new title function that won't capitalize letters after an apostrophe. We'll walk through it together.

This is how the string method `title()` works:

In [110]:
"I'll be there".title()

"I'Ll Be There"

In [117]:
def new_title(a_string):
    #Makes a string title case, but correctly
    list_of_words = a_string.split(" ")
    new_list = []
    for word in list_of_words:
        new_list.append(word.capitalize())
    new_string = " ".join(new_list)
    return new_string
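
<br>To walk through what the function body does, here is a sketch that runs the same three steps outside of a function and prints the result of each one:

```python
a_string = "I'll be there"

# Step 1: split the string into a list of words at each space
list_of_words = a_string.split(" ")
print(list_of_words)

# Step 2: capitalize() upper-cases only the first character of each word
# and lower-cases the rest
new_list = []
for word in list_of_words:
    new_list.append(word.capitalize())
print(new_list)

# Step 3: join the words back together with spaces
new_string = " ".join(new_list)
print(new_string)
```

Because `capitalize()` only looks at the first character of each word, the letter after the apostrophe is left lowercase.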

In [118]:
new_title("I'll be there")

"I'll Be There"

# <br><br><span style="color:rebeccapurple">LUNCH BREAK - meet back at 1 pm Central

## <br><br><span style="color:teal">More fun with dictionaries

### <br><br><span style="color:red">Dictionaries Exercise 1

Here's an example dictionary. Run the cell below:

In [120]:
hero_dict = {"Captain Marvel": "Avengers", "Finn": "Star Wars", 
             "Maui": "Moana", "Captain America": "Avengers", 
             "Princess Leia": "Star Wars"}

<br>Using `hero_dict`, write a for loop/if statement to print the names of all of the characters in Star Wars:

In [121]:
for k, v in hero_dict.items():
    if v == "Star Wars":
        print(k)

Finn
Princess Leia


In [122]:
for k in hero_dict.keys():
    if hero_dict[k] == "Star Wars":
        print(k)

Finn
Princess Leia


In [123]:
for k, v in hero_dict.items():
    if "Star Wars" in v:
        print(k)

Finn
Princess Leia
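All three loops above print the names. If you would rather keep the names to use later, one option is to append them to a list instead. This is a sketch, and the variable name `star_wars_characters` is just my choice:

```python
hero_dict = {"Captain Marvel": "Avengers", "Finn": "Star Wars",
             "Maui": "Moana", "Captain America": "Avengers",
             "Princess Leia": "Star Wars"}

star_wars_characters = []
for k, v in hero_dict.items():
    if v == "Star Wars":
        star_wars_characters.append(k)  # store the name instead of printing it

print(star_wars_characters)
```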


<br>**Create your own dictionary.** Choose at least 5 characters from books/movies/tv/comics and create a dictionary. Each character should have a value, which can be anything you'd like about the character. Type and store the dictionary below:

In [124]:
character_dict = {"Bob": "chef", 
                  "Linda": "singer", 
                  "Tina": "Jimmy Jr.", 
                  "Gene": "keyboard", 
                  "Louise": "bunny ears"}

<br>Which character is your favorite? Write code to index the value of your favorite character:

In [125]:
character_dict["Tina"]

'Jimmy Jr.'

### <br><br>List of dictionaries

Sometimes your data is best represented as a list of dictionaries, for example when every item is a record with the same set of keys. You can index individual data points in the list or in its dictionaries, and you can loop through both levels.

In [126]:
gradebook = [{"name": "Zygon", "HW1": 10, "HW2": 10, "HW3": 10}, 
             {"name": "Vogon", "HW1": 10, "HW2": 10, "HW3": 10}, 
             {"name": "Cylon", "HW1": 10, "HW2": 10, "HW3": 10}, 
             {"name": "Mudokon", "HW1": 7, "HW2": 8, "HW3": 6}]

<br>To return an individual dictionary, you use list indexing because each dictionary is an item in the list:

In [127]:
gradebook[2]

{'name': 'Cylon', 'HW1': 10, 'HW2': 10, 'HW3': 10}

<br>To return a value in one of the dictionaries, you first index the list to get the dictionary, and then index that dictionary with the key of interest:

In [128]:
gradebook[2]["HW1"]

10

<br>Looping through the list:

It's often useful to first just print each item in a loop, to confirm that you know what you're looking at. I do this all the time when I code:

In [129]:
for dictionary in gradebook:
    print(dictionary)

{'name': 'Zygon', 'HW1': 10, 'HW2': 10, 'HW3': 10}
{'name': 'Vogon', 'HW1': 10, 'HW2': 10, 'HW3': 10}
{'name': 'Cylon', 'HW1': 10, 'HW2': 10, 'HW3': 10}
{'name': 'Mudokon', 'HW1': 7, 'HW2': 8, 'HW3': 6}


Then you can do more and slowly build up your loop:

In [130]:
for dictionary in gradebook:
    name = dictionary["name"]
    print(name)

Zygon
Vogon
Cylon
Mudokon


In [131]:
for dictionary in gradebook:
    name = dictionary["name"]
    HW_total = dictionary["HW1"] + dictionary["HW2"] + dictionary["HW3"]
    print(HW_total)

30
30
30
21


In [132]:
for dictionary in gradebook:
    name = dictionary["name"]
    HW_total = dictionary["HW1"] + dictionary["HW2"] + dictionary["HW3"]
    print(name + " scored " + str(HW_total) + " points on Homework")

Zygon scored 30 points on Homework
Vogon scored 30 points on Homework
Cylon scored 30 points on Homework
Mudokon scored 21 points on Homework


<br>That code worked well, but it only adds up HW1, HW2, and HW3, so it would miss any homework assignments added later. Instead, you can loop through the list and then loop through the dictionary. I've included comments in the code to explain what I'm doing:

In [133]:
for dictionary in gradebook: #loop through the list
    name = dictionary["name"] #get student's name
    HW_total = 0 
    for k, v in dictionary.items(): #loop through the key:value pairs
        if k != "name": #get every key:value pair except name
            HW_total = HW_total + v #add the value to our HW total
    print(name + " scored " + str(HW_total) + " points on Homework")

Zygon scored 30 points on Homework
Vogon scored 30 points on Homework
Cylon scored 30 points on Homework
Mudokon scored 21 points on Homework


In [134]:
for dictionary in gradebook: #loop through the list
    name = dictionary["name"] #get student's name
    HW_total = 0 
    for v in dictionary.values(): #loop through the values only
        if type(v) != str: #skip the name, which is the only string value
            HW_total = HW_total + v #add the value to our HW total
    print(name + " scored " + str(HW_total) + " points on Homework")

Zygon scored 30 points on Homework
Vogon scored 30 points on Homework
Cylon scored 30 points on Homework
Mudokon scored 21 points on Homework
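Another option, sketched here, is to collect the scores in a list and let the built-in `sum()` function do the adding. The `totals` dictionary is something I added so the results are stored as well as printed:

```python
gradebook = [{"name": "Zygon", "HW1": 10, "HW2": 10, "HW3": 10},
             {"name": "Vogon", "HW1": 10, "HW2": 10, "HW3": 10},
             {"name": "Cylon", "HW1": 10, "HW2": 10, "HW3": 10},
             {"name": "Mudokon", "HW1": 7, "HW2": 8, "HW3": 6}]

totals = {}
for dictionary in gradebook:  # loop through the list
    scores = []
    for k, v in dictionary.items():  # loop through the key:value pairs
        if k != "name":  # keep every value except the name
            scores.append(v)
    totals[dictionary["name"]] = sum(scores)  # sum() adds up a list of numbers
    print(dictionary["name"] + " scored " + str(sum(scores)) + " points on Homework")
```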


### <br><br>Dictionary of dictionaries

You can also format your data as a dictionary of dictionaries.

### <br><br><span style="color:red">Dictionaries Exercise 2

In [135]:
grade_dict = {"Zygon": {"HW1": 3, "HW2": 2, "HW3": 4}, 
              "Vogon": {"HW1": 10, "HW2": 10, "HW3": 10}, 
              "Cylon": {"HW1": 10, "HW2": 10, "HW3": 10}, 
              "Mudokon": {"HW1": 7, "HW2": 8, "HW3": 6}}

Use dictionary indexing to write code to return all of Cylon's grades:

In [136]:
grade_dict["Cylon"]

{'HW1': 10, 'HW2': 10, 'HW3': 10}

Use dictionary indexing to write code to return Vogon's score on HW2:

In [137]:
grade_dict["Vogon"]["HW2"]

10
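Just like with a list of dictionaries, you can loop through both levels of a dictionary of dictionaries. Here is a sketch that totals each student's homework. The names `hw_totals`, `grades`, and `score` are my own choices:

```python
grade_dict = {"Zygon": {"HW1": 3, "HW2": 2, "HW3": 4},
              "Vogon": {"HW1": 10, "HW2": 10, "HW3": 10},
              "Cylon": {"HW1": 10, "HW2": 10, "HW3": 10},
              "Mudokon": {"HW1": 7, "HW2": 8, "HW3": 6}}

hw_totals = {}
for name, grades in grade_dict.items():  # outer loop: one student at a time
    total = 0
    for score in grades.values():  # inner loop: that student's scores
        total = total + score
    hw_totals[name] = total

print(hw_totals)
```

Because the student's name is the outer key here, the inner dictionaries only hold scores, so there is no need to skip a `"name"` entry.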

## <br><br><span style="color:teal">More fun with files

#### *If you are using Google Colab ONLY, run this line:*

In [None]:
!wget https://raw.githubusercontent.com/aGitHasNoName/pythonBootcamp_3Day/main/dogs.txt
!wget https://raw.githubusercontent.com/aGitHasNoName/pythonBootcamp_3Day/main/gradebook.csv

<br><br>**Everyone:** Store the filenames that we will be working with today:

In [138]:
dog_file = "dogs.txt"
gradebook_file = "gradebook.csv"

### <br><br>Turning a file into a clean list of lines

Let's read in the dog file and see what it looks like:

In [139]:
with open(dog_file, "r") as f:
    print(f.read())

affenpinscher
Afghan hound
Airedale terrier
Akita
Alaskan Malamute
American Staffordshire terrier
American water spaniel
Australian cattle dog
Australian shepherd
Australian terrier
basenji
basset hound
beagle
bearded collie
Bedlington terrier
Bernese mountain dog
bichon frise
black and tan coonhound
bloodhound
border collie
border terrier
borzoi
Boston terrier
bouvier des Flandres
boxer
briard
Brittany
Brussels griffon
bull terrier
bulldog
bullmastiff
cairn terrier
Canaan dog
Chesapeake Bay retriever
Chihuahua
Chinese crested
Chinese shar-pei
chow chow
Clumber spaniel
cocker spaniel
collie
curly-coated retriever
dachshund
Dalmatian
Doberman pinscher
English cocker spaniel
English setter
English springer spaniel
English toy spaniel
Eskimo dog
Finnish spitz
flat-coated retriever
fox terrier
foxhound
French bulldog
German shepherd
German shorthaired pointer
German wirehaired pointer
golden retriever
Gordon setter
Great Dane
greyhound
Irish setter
Irish water spaniel
Irish wolfhound
Jack 

<br>In the lecture, we learned that we can save this text as a list:

In [140]:
with open(dog_file, "r") as f:
    dog_list = f.readlines()

In [141]:
print(dog_list)

['affenpinscher\n', 'Afghan hound\n', 'Airedale terrier\n', 'Akita\n', 'Alaskan Malamute\n', 'American Staffordshire terrier\n', 'American water spaniel\n', 'Australian cattle dog\n', 'Australian shepherd\n', 'Australian terrier\n', 'basenji\n', 'basset hound\n', 'beagle\n', 'bearded collie\n', 'Bedlington terrier\n', 'Bernese mountain dog\n', 'bichon frise\n', 'black and tan coonhound\n', 'bloodhound\n', 'border collie\n', 'border terrier\n', 'borzoi\n', 'Boston terrier\n', 'bouvier des Flandres\n', 'boxer\n', 'briard\n', 'Brittany\n', 'Brussels griffon\n', 'bull terrier\n', 'bulldog\n', 'bullmastiff\n', 'cairn terrier\n', 'Canaan dog\n', 'Chesapeake Bay retriever\n', 'Chihuahua\n', 'Chinese crested\n', 'Chinese shar-pei\n', 'chow chow\n', 'Clumber spaniel\n', 'cocker spaniel\n', 'collie\n', 'curly-coated retriever\n', 'dachshund\n', 'Dalmatian\n', 'Doberman pinscher\n', 'English cocker spaniel\n', 'English setter\n', 'English springer spaniel\n', 'English toy spaniel\n', 'Eskimo dog\

<br>Each item in the list is a string. Most strings end in a new line character, which we would like to remove.

We can combine three things here: what we learned today about opening files, what we learned yesterday about making new lists in a for loop, and what we learned Monday about string functions.

First, make an empty list:

In [None]:
dog_list = []

Now, inside the with/as statement, you can loop through the lines in the file and append them to the empty list. But you also need to use a string function to remove the new line characters:

In [142]:
dog_list = []
with open(dog_file, "r") as f:
    for line in f.readlines():
        dog_list.append(line.rstrip("\n"))
print(dog_list)

['affenpinscher', 'Afghan hound', 'Airedale terrier', 'Akita', 'Alaskan Malamute', 'American Staffordshire terrier', 'American water spaniel', 'Australian cattle dog', 'Australian shepherd', 'Australian terrier', 'basenji', 'basset hound', 'beagle', 'bearded collie', 'Bedlington terrier', 'Bernese mountain dog', 'bichon frise', 'black and tan coonhound', 'bloodhound', 'border collie', 'border terrier', 'borzoi', 'Boston terrier', 'bouvier des Flandres', 'boxer', 'briard', 'Brittany', 'Brussels griffon', 'bull terrier', 'bulldog', 'bullmastiff', 'cairn terrier', 'Canaan dog', 'Chesapeake Bay retriever', 'Chihuahua', 'Chinese crested', 'Chinese shar-pei', 'chow chow', 'Clumber spaniel', 'cocker spaniel', 'collie', 'curly-coated retriever', 'dachshund', 'Dalmatian', 'Doberman pinscher', 'English cocker spaniel', 'English setter', 'English springer spaniel', 'English toy spaniel', 'Eskimo dog', 'Finnish spitz', 'flat-coated retriever', 'fox terrier', 'foxhound', 'French bulldog', 'German s

<br>A clean list of dogs!
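For a small file, a shorter route that should give the same result is to read the whole file as one string and call the string method `splitlines()`. This sketch writes a tiny stand-in file first so that it runs on its own. The file name `sample_dogs.txt` and its three lines are assumptions for the example, not the real dogs.txt:

```python
import os
import tempfile

# Write a tiny stand-in for dogs.txt in a temporary folder
sample_path = os.path.join(tempfile.mkdtemp(), "sample_dogs.txt")
with open(sample_path, "w") as f:
    f.write("affenpinscher\nAfghan hound\nAiredale terrier\n")

with open(sample_path, "r") as f:
    dog_list = f.read().splitlines()  # split on line breaks and drop them

print(dog_list)
```

The for loop version is still worth knowing, because it gives you a place to filter or change each line as you go.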

### <br><br><span style="color:red">Files Exercise 1

Make a clean list of dogs from `dog_file` that only includes dogs with the word "terrier" in their names. I've pasted the code we wrote to make a list of all dogs. You need to add an if statement inside the for loop to only append the terriers. *Bonus: while you're at it, make the dog names all lowercase.*

In [145]:
dog_list = []
with open(dog_file, "r") as f:
    for line in f.readlines():
        if "terrier" in line:
            dog_list.append(line.rstrip("\n").lower())
print(dog_list)

['airedale terrier', 'american staffordshire terrier', 'australian terrier', 'bedlington terrier', 'border terrier', 'boston terrier', 'bull terrier', 'cairn terrier', 'fox terrier', 'jack russell terrier', 'kerry blue terrier', 'lakeland terrier', 'manchester terrier', 'norwich terrier', 'scottish terrier', 'sealyham terrier', 'silky terrier', 'skye terrier', 'staffordshire bull terrier', 'soft-coated wheaten terrier', 'tibetan terrier', 'welsh terrier', 'west highland white terrier', 'yorkshire terrier']


### <br><br>Turning a file into a dictionary

Let's open the gradebook file and see what it looks like:

In [144]:
with open(gradebook_file, "r") as f:
    print(f.read())

name,hw1,hw2,hw3,exam1,exam2
Mary,10,7,9,91,89
Flo,6,6,7,79,82
Lia,8,9,10,92,95
Tim,7,6,7,93,87
Terry,8,10,10,93,90


<br>**Our end goal is to have a dictionary with the student's name as the key and a list of their grades as the value.**

<br>Ok, first let's store it as a list, but we want to leave out the first line of headers. `f.readlines()` returns the lines of the file as a list. We can slice a list, so let's take all the lines except the first one:

In [146]:
with open(gradebook_file, "r") as f:
    gradebook = f.readlines()[1:]

In [147]:
for line in gradebook:
    print(line)

Mary,10,7,9,91,89

Flo,6,6,7,79,82

Lia,8,9,10,92,95

Tim,7,6,7,93,87

Terry,8,10,10,93,90


We can see that there are new line characters at the end of each line (because it is printing extra empty lines between the lines). Let's make a note of that.

<br>We can apply what we know about lists and strings to make a list of what we need to code:
- make an empty dictionary
- loop through the gradebook list
- remove the new line characters from the end
- split the line on the commas
- separate the first item to be the key
- store the rest of the items as a list
- assign the key:value pairs to our dictionary

In [152]:
grade_dict = {} #make an empty dictionary
for line in gradebook: #loop through the gradebook list
    line2 = line.rstrip("\n") #remove the new line characters from the end
    line_list = line2.split(",") #split the line on the commas
    name = line_list[0] #separate the first item to be the key
    grades = line_list[1:] #store the rest of the items as a list
    grade_dict[name] = grades #assign the key:value pairs to our dictionary

In [153]:
print(grade_dict)

{'Mary': ['10', '7', '9', '91', '89'], 'Flo': ['6', '6', '7', '79', '82'], 'Lia': ['8', '9', '10', '92', '95'], 'Tim': ['7', '6', '7', '93', '87'], 'Terry': ['8', '10', '10', '93', '90']}


<br>**Question:** We wrote out all the steps and wrote the code with one line per step. This code could be condensed into fewer lines or left how it is - as explicit as possible. **Can you think of ways that the code could be condensed to fewer lines?**

In [151]:
grade_dict = {} #make an empty dictionary
for line in gradebook: #loop through the gradebook list
    line_list = line.rstrip("\n").split(",") #remove the new line character, then split on the commas
    grade_dict[line_list[0]] = line_list[1:] #first item is the key, the rest is the value
print(grade_dict)

{'Mary': ['10', '7', '9', '91', '89'], 'Flo': ['6', '6', '7', '79', '82'], 'Lia': ['8', '9', '10', '92', '95'], 'Tim': ['7', '6', '7', '93', '87'], 'Terry': ['8', '10', '10', '93', '90']}
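Notice in the output above that every grade is a string such as `'10'`, because everything read from a text file comes in as text. If you want to do math with the grades, one option is to convert each one with `int()` before storing it. This sketch uses two lines copied from the file output so that it runs on its own:

```python
# Two lines as they would appear in the gradebook list
gradebook = ["Mary,10,7,9,91,89\n", "Flo,6,6,7,79,82\n"]

grade_dict = {}
for line in gradebook:
    line_list = line.rstrip("\n").split(",")
    grades = []
    for grade in line_list[1:]:
        grades.append(int(grade))  # convert the string "10" to the number 10
    grade_dict[line_list[0]] = grades

print(grade_dict)

mary_hw_total = sum(grade_dict["Mary"][:3])  # the first three grades are homework
print(mary_hw_total)
```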


### <br><br>Writing files

### <br><br><span style="color:red">Files Exercise 2

We just created a dictionary, `grade_dict`. The first three items in each value are homework grades. **Write a new file** that includes one complete sentence for each student in the dictionary. The sentence should include the student's name and their three homework grades. A sample sentence is "Mary's homework grades are 10, 7, and 9."

<br>You will need to:
- store a variable that contains the filename of the new file you want to create
- write a with/as statement to open the new file in write mode
- loop through both the keys and values in the key:value pairs in the `grade_dict` dictionary
- store a string that composes a sentence that uses the key (student's name) and indexes each of the first three items in the value
- add a new line character to the end of the string so that each student will have their own line in the file
- use `f.write()` to write the string to the `f` file object.

In [154]:
homework_file = "homework.txt"

In [155]:
with open(homework_file, "w") as f:
    for k, v in grade_dict.items():
        sentence = k + "'s homework grades are " + v[0] + ", " + v[1] + ", and " + v[2] + ".\n"
        f.write(sentence)
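
A good habit after writing a file is to read it back and check that it looks the way you expect. Here is a self-contained sketch. It uses only two students and writes to a temporary folder, both of which are assumptions made so the example does not touch your real files:

```python
import os
import tempfile

grade_dict = {"Mary": ["10", "7", "9", "91", "89"],
              "Flo": ["6", "6", "7", "79", "82"]}

# Temporary location, so nothing in your working folder is overwritten
homework_file = os.path.join(tempfile.mkdtemp(), "homework.txt")

with open(homework_file, "w") as f:
    for k, v in grade_dict.items():
        sentence = k + "'s homework grades are " + v[0] + ", " + v[1] + ", and " + v[2] + ".\n"
        f.write(sentence)

# Open the same file again in read mode to check the result
with open(homework_file, "r") as f:
    contents = f.read()

print(contents)
```

Keep in mind that opening an existing file in `"w"` mode replaces whatever was in it.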
        
        
        

### <br><br>Reading files line by line

Sometimes you might be working with a very large file, with millions of lines, and you don't want to read it all into memory as a string or list.

There is a file method, `readline()`, that reads in only one line at a time. **I don't expect you to practice this method here, but I will give an example, so that you know it exists if you ever need to look it up.**

Let's imagine that there are millions of types of dogs (if only!) and our dogs.txt file is millions of lines long. We can use `readline()` to go through it and only store the dogs that we need for this notebook or script. It doesn't work the same way as `readlines()`: `readlines()` returns a list of every line in the file, while `readline()` returns only the next line as a string, and each call moves on to the following line. When there are no lines left, it returns an empty string. One way to keep calling it until the file runs out is a while loop, which is something we aren't learning this week:

In [156]:
hounds = []
with open(dog_file, "r") as f:
    line = f.readline()
    while line:
        if "hound" in line:
            hounds.append(line.rstrip("\n").lower())
        line = f.readline()

In [157]:
print(hounds)

['afghan hound', 'basset hound', 'black and tan coonhound', 'bloodhound', 'foxhound', 'greyhound', 'irish wolfhound', 'norwegian elkhound', 'otterhound', 'scottish deerhound']
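If you would rather stick with a for loop, you can also loop over the file object itself, which hands you one line at a time without building the full list first. This sketch uses a small stand-in file so that it runs on its own. The file name and its four lines are assumptions for the example:

```python
import os
import tempfile

# Write a tiny stand-in for dogs.txt in a temporary folder
sample_path = os.path.join(tempfile.mkdtemp(), "sample_dogs.txt")
with open(sample_path, "w") as f:
    f.write("Afghan hound\nAkita\nbasset hound\nbeagle\n")

hounds = []
with open(sample_path, "r") as f:
    for line in f:  # the file object gives us one line per loop
        if "hound" in line:
            hounds.append(line.rstrip("\n").lower())

print(hounds)
```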


## <br><br><span style="color:teal">More fun with functions

Let's write a simple function to convert a volume in teaspoons to a volume in cups. There are 48 teaspoons in 1 cup.

In [158]:
def tspToCup(tsp):
    #converts a number from tsps to cups
    return tsp / 48

In [159]:
tspToCup(8)

0.16666666666666666

Let's improve it by rounding the answer:

In [160]:
def tspToCup(tsp):
    #converts a number from tsps to cups
    cup = round(tsp / 48, 2)
    return cup

In [161]:
tspToCup(8)

0.17

### <br><br><span style="color:red">Functions Exercise 1

Write a function to convert miles per hour to kilometers per hour. 1 mph is equal to 1.60934 kph. Round the answer to the nearest whole number.

In [162]:
def mphTOkph(mph):
    #converts a speed from mph to kph, rounded to the nearest whole number
    return round(mph * 1.60934)

Test your function with 60 mph.

In [163]:
mphTOkph(60)

97

### <br><br><span style="color:red">Functions Exercise 2

Here is a dictionary containing the conversion factors from 1 pound to a variety of other units of weight or mass:

In [164]:
pound_dict = {"ounce": 16, "gram": 453.592, "kilogram": 0.453592, 
              "ton": 0.0005, "stone": 0.0714286}

<br>Write a function called `poundTo` that takes two arguments, a weight in pounds and the unit of measure that you want to convert it to, as included in the dictionary above. The function should return the converted weight. For example, someone might call `poundTo(150, "stone")` and the function should return `10.71`. The answer should be rounded to 2 places after the decimal.

In [165]:
def poundTo(pound, unit):
    #convert pound to other unit
    return round(pound * pound_dict[unit], 2)

Test your function:

In [166]:
poundTo(150, "stone")

10.71
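One thing to think about: what happens if someone asks for a unit that isn't in the dictionary? As written, the function stops with a `KeyError`. A possible variation, sketched here, checks for the unit first and gives a clearer message. The `raise` statement is not something we cover this week, and the error wording is my own:

```python
pound_dict = {"ounce": 16, "gram": 453.592, "kilogram": 0.453592,
              "ton": 0.0005, "stone": 0.0714286}

def poundTo(pound, unit):
    # convert pound to other unit, with a clearer error for unknown units
    if unit not in pound_dict:
        raise ValueError("Unknown unit: " + unit)
    return round(pound * pound_dict[unit], 2)

print(poundTo(150, "stone"))
print(poundTo(2, "ounce"))
```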

## <br><br><br>Wednesday Quiz

On your own, complete the Wednesday Quiz. The Jupyter Notebook file is called "wednesdayQuiz.ipynb" and is in the same folder as this notebook. If you are using Colab, go to <https://colab.research.google.com/github/aGitHasNoName/pythonBootcamp_3Day/blob/main/wednesdayQuiz.ipynb>.

<br>If you have questions about the quiz material, send me an email.
<br><br>As a reminder, the quiz is self-graded - you do not need to turn anything in! The answer key is called "wednesdayQuiz-answers.ipynb". For Colab users, the answer key is at <https://colab.research.google.com/github/aGitHasNoName/pythonBootcamp_3Day/blob/main/wednesdayQuiz-answers.ipynb>.

**The last question of the quiz is a real thinking question that will take some time. There are many ways to do it - the answer key contains several different answers.**