# Lecture 7 - Dictionaries and Files
___
___

## Purpose

- Review lists and tuples
- Create dictionaries
- Access individual items in dictionaries
- Learn about and use dictionary methods
- Read files
- Access items in files
- Write files


## Some Creative Commons Reference Sources for This Material

- *Think Python 2nd Edition*, Allen Downey, chapters 11 and 14
- *The Coder's Apprentice*, Pieter Spronck, chapters 13, 16, and 26
- *A Practical Introduction to Python Programming*, Brian Heinold, chapters 11 and 12
- *Algorithmic Problem Solving with Python*, John Schneider, Shira Broschat, and Jess Dahmen, chapters 10 and 14

## Review of Lists and Tuples

- Lists and tuples can be used to hold a number of objects or values
- For example, **`my_list = [1, 2, 3, 4, "Python", 10.3, 56.78, math.pi, (3, 4, 5)]`**
- Access the items in lists (and tuples) using indexing and slicing
  - Use **`my_list[4]`** to access the string **`"Python"`**
  - Use **`my_list[:4]`** to access **`[1, 2, 3, 4]`**
- Tuples work the same as lists except they use parentheses
- Remember that lists are mutable (can be altered by changing, adding, or removing objects)
- Tuples are not mutable
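
The points above can be sketched in a few lines. The names `sample_list` and `sample_tuple` are illustrative assumptions, not part of the lecture's exercises.

```python
import math

# A list can mix object types, including a nested tuple
sample_list = [1, 2, 3, 4, "Python", 10.3, 56.78, math.pi, (3, 4, 5)]

fifth_item = sample_list[4]      # indexing starts at 0, so this is "Python"
first_four = sample_list[:4]     # slicing returns a new list with items 0 through 3

# Lists are mutable: items can be changed, added, or removed
sample_list[0] = "one"
sample_list.append("new item")

# Tuples are immutable: trying to change an item raises a TypeError
sample_tuple = (1, 2, 3)
try:
    sample_tuple[0] = 10
except TypeError as error:
    tuple_error = str(error)

print(fifth_item)
print(first_four)
print(sample_list[0], sample_list[-1])
print(tuple_error)
```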

___

**Practice it**

Create a list with 4 strings named **`string_list`** and print it. Also, create a tuple with 4 numeric values named **`num_tuple`** and print it.

- Lists and tuples hold (or contain) objects so are sometimes referred to as containers
- Also call them collections since they can hold a collection of objects
- Another useful data type that is also a container or collection is the **dictionary**

## Introduction to Dictionaries

- Dictionaries (like lists) can be used to contain or collect objects
- They are mutable
- They can contain a variety of object types
  - Integers
  - Floats
  - Strings
  - Lists
  - Tuples
  - Other dictionaries
- The primary difference between dictionaries and lists is how objects are stored and accessed
- In lists objects are accessed by their position, i.e. **`my_list[2]`**
- Dictionaries do not rely upon position to access objects
- Items in dictionaries are stored using key-value pairs and the values are accessed via the keys
- Dictionary keys...
  - Must be immutable objects
    - Strings
    - Integers
    - Floats
    - Tuples
  - Keys must be unique within a dictionary because values are linked to keys 
  - Cannot have more than one of any specific key
- Values can be nearly any object type, such as...
  - Integers
  - Floats
  - Strings
  - Lists
  - Tuples
  - Dictionaries

## Creating Dictionaries

- Dictionaries are enclosed with curly braces **`{ }`**
- An empty dictionary can be created two ways
  - Empty curly braces assigned to a variable name, i.e. **`MECH_classes = {}`**
  - Using the function **`dict()`**, i.e. **`my_dict = dict()`**
- Can add key-value pairs via indexing the dictionary name with a key and assigning it a value
  - **`MECH_classes['MECH 111'] = 'MET Seminar'`** creates the key-value pair **`'MECH 111':'MET Seminar'`** in **`MECH_classes`**
- Can create a dictionary with values in it using curly braces and key-value pairs
  - **`MECH_classes = {'MECH 111':'MET Seminar'}`**
- Can create a dictionary with key-value pairs and the **`dict()`** function
  - One way to do this is to use a list of lists (or tuples) that each contain key-value pairs
  - **`MECH_classes = dict([["MECH 111", "MET Seminar"], ["MECH 122", "Computer Apps 1"]])`**
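
As a rough illustration of the three creation styles described above, the following sketch builds the same dictionary three ways. The course code and title are made-up values chosen so the practice exercises below remain yours to work out.

```python
# Hypothetical course data used only for illustration
courses = {}                                  # start with an empty dictionary
courses["PHYS 211"] = "Introductory Physics"  # add a key-value pair by assignment

# Same content created directly with curly braces
courses_direct = {"PHYS 211": "Introductory Physics"}

# Same content created with dict() from a list of [key, value] pairs
courses_from_pairs = dict([["PHYS 211", "Introductory Physics"]])

# All three approaches produce equal dictionaries
all_equal = courses == courses_direct == courses_from_pairs
print(all_equal)
print(courses)
```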

___
**Practice it**

Create the empty dictionary **`MECH_classes`** and then add the key-value pair **`'MECH 111':'MET Seminar'`** to it. Print the dictionary when done.

___
**Practice it**

Create **`MECH_classes`** dictionary again using the direct method with a key-value pair inside of curly braces and then print it.

___
**Practice it**

Use the **`dict()`** function to create and then print the same dictionary with the classes MECH 111 and MECH 122.

___
**Practice it**

You can use one list or tuple that contains your desired keys and another that contains the matching values and zip them together inside a call to **`dict()`**. Use **`string_list`** and **`num_tuple`** to create and print the dictionary named **`my_dict`**.

___
**Practice it**

If you have simple string keys (no spaces), you can create a dictionary using keyword arguments like the following code cell.

In [0]:
MECH_faculty = dict(Brady='JOH 422',
        Hollenbeck='JOH 410',
        Stein='JOH 407',
        Wiltshire='JOH 409')
print(MECH_faculty)

___
**Practice it**

Not all dictionaries need to use strings as keys. The following dictionary definition is a valid one even though the keys are not all the same type. Execute it to be sure.

In [0]:
import math
my_dict = {1: 'one',
           2: 'two',
           math.e: 'e',
           3: 'three',
           3.1416: 'pi',
           4: 'four',
           5: 'five',
           6: 'scared',
           7: 'ate nine'}
print(my_dict)

## Why Dictionaries?

- "Can't we just use lists instead of dictionaries?"
- "Yes" most of the time
- Lists can be used in many places where dictionaries are used
- Dictionaries are sometimes easier or more effective
- An example
  - Access material properties of specific common engineering materials
  - Lots of indexing and searching if using lists
  - Could use the material name as the key to return all of the property values
  - Could even implement nested dictionaries so each property can be accessed by its name
- Another example
  - Take a user input string and turn it into Morse Code
  - Create a dictionary that contains all of the letters of the alphabet, numbers, and other symbols
  - Loop through each character of the string and use it as a key to look up its Morse code translation

___
**Practice it**

Execute the code cell below to create a small material property dictionary. This dictionary uses lists for the material properties (just the USCS and SI yield strengths in ksi and MPa at this time). You would need to know which list index is associated with each property to use the values in this dictionary.

In [0]:
material = {"Steel ASTM-A36":[36, 250],
           "Aluminum 2014-T6":[60, 410],
           "Bronze cold-rolled":[75, 772],
           "Nickel Alloy":[60, 414]}

___
**Practice it**

Execute the code cell below to define the function **`string_to_morse(string)`**. In the second cell call the function with the string of your choice (no exclamation points).

In [0]:
def string_to_morse(string):
    to_morse = {'A':'.-', 'B':'-...',
            'C':'-.-.', 'D':'-..', 'E':'.',
            'F':'..-.', 'G':'--.', 'H':'....',
            'I':'..', 'J':'.---', 'K':'-.-',
            'L':'.-..', 'M':'--', 'N':'-.',
            'O':'---', 'P':'.--.', 'Q':'--.-',
            'R':'.-.', 'S':'...', 'T':'-',
            'U':'..-', 'V':'...-', 'W':'.--',
            'X':'-..-', 'Y':'-.--', 'Z':'--..',
            '1':'.----', '2':'..---', '3':'...--',
            '4':'....-', '5':'.....', '6':'-....',
            '7':'--...', '8':'---..', '9':'----.',
            '0':'-----', ',':'--..--', '.':'.-.-.-',
            '?':'..--..', '/':'-..-.', '-':'-....-',
            '(':'-.--.', ')':'-.--.-'}
    test_string = string.upper()
    morse_string = ""
    for character in test_string:
        morse_string += to_morse.get(character, " ") + " "
    return morse_string

## Accessing Dictionary Values

- Access dictionary values using keys
- Use **`MECH_faculty['Brady']`** to access Mr. Brady's office location from a previous dictionary
- Trying to use a numeric index to find the same thing causes an error
- Prior to *Python* 3.7 the objects collected in dictionaries were stored in no particular order
- As of 3.7, objects are stored in the order they were added, but numeric indexing is still not allowed
- Looking up a value by its key is generally faster than searching a list for that value
- Use multiple sets of square brackets with keys or indexes to access values that are inside lists, tuples, or other dictionaries within a dictionary
- For example, **`material["Steel ASTM-A36"][1]`** will return the second property (SI yield strength) for ASTM-A36 steel
- Trying to access a key that does not exist in a dictionary will result in an error
- The **`MECH_classes`** dictionary does not yet include the key **`'MECH 499'`**
- Trying to access **`'MECH 499'`** using **`MECH_classes['MECH 499']`** results in a **`KeyError`**
- There is a way to attempt to retrieve a value for a key that does not exist without generating an error (later)
- Test to see if a key is present in a dictionary using the `in` operator, i.e. **`"MECH 499" in MECH_classes`**
- Determine how many key-value pairs exist in a dictionary using the **`len()`** function
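
The access rules above can be tried on a small example. The `offices` and `strengths` dictionaries below are hypothetical and their values are made up.

```python
# Hypothetical office directory used only for illustration
offices = {"Ada": "Room 101", "Grace": "Room 102", "Alan": "Room 103"}

grace_office = offices["Grace"]   # access a value by its key
print(grace_office)

# A number in brackets is treated as a key, so this fails because 0 is not a key
try:
    offices[0]
except KeyError as error:
    print("KeyError:", error)

# Check for a key with the in operator before using it
has_linus = "Linus" in offices
print(has_linus)

# len() counts the key-value pairs
pair_count = len(offices)
print(pair_count)

# Chain brackets to reach an item stored inside a value
strengths = {"Sample material": [36, 250]}   # made-up values
second_property = strengths["Sample material"][1]
print(second_property)
```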

___
**Practice it**

Access Mr. Stein's office location from **`MECH_faculty`** and print it. In a separate code cell try to access Mr. Brady by using **`MECH_faculty[0]`**.

___
**Practice it**

Access the properties for "Nickel Alloy" from the **`material`** dictionary.

___
**Practice it**

Access the USCS yield strength for Aluminum 2014-T6 from the **`material`** dictionary.

___
**Practice it**

Try accessing "MECH 499" from the **`MECH_classes`** dictionary.

___
**Practice it**

Test if **`"Stainless Steel"`** is a valid key in the **`material`** dictionary.

___
**Practice it**

Determine how many key-value pairs there are in the dictionary **`material`**.

## Adding or Replacing Dictionary Values

- Add or replace key-value pairs using the same method as when creating a dictionary
  - For example, **`MECH_classes['MECH 211'] = 'Fluid Mechanics'`** adds MECH 211 to the dictionary
- Since there can only be one instance of each key
  - **`MECH_classes['MECH 122'] = 'Computer Apps for Technology 1'`** will replace the current value for **`'MECH 122'`**
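
A short sketch of the same assignment syntax doing both jobs, using a hypothetical `parts` dictionary:

```python
# Hypothetical dictionary used only for illustration
parts = {"bolt": 10, "nut": 25}

parts["washer"] = 40    # "washer" is not a key yet, so a new pair is added
parts["bolt"] = 12      # "bolt" already exists, so its value is replaced

print(parts)
```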

___
**Practice it**

Add MECH 211 to **`MECH_classes`** and change the class name of MECH 122. Print the dictionary after making these changes.

## Nested Dictionaries

- Nesting dictionaries for our material properties dictionary would make selecting a property more explicit
- For each material name, use a dictionary as a value that has the keys **`'yield strength'`** and **`'ultimate strength'`**
- The values for each of these could in turn be dictionaries with keys of **`'uscs'`** and **`'si'`**
- The following expression would return the SI yield strength for ASTM-A36 steel from such a dictionary
  - **`material_2["Steel ASTM-A36"]["yield strength"]["si"]`**

___
**Practice it**

Execute the following code cell to create the dictionary **`material_2`**. Print the dictionary and then access the USCS ultimate strength for nickel alloy.

In [0]:
material_2 = {"Steel ASTM-A36":{"yield strength":{"uscs":36, "si":250},
                                "ultimate strength":{"uscs":60, "si":400}},
           "Aluminum 2014-T6":{"yield strength":{"uscs":60, "si":410},
                               "ultimate strength":{"uscs":70, "si":480}},
           "Bronze cold-rolled":{"yield strength":{"uscs":75, "si":772},
                                 "ultimate strength":{"uscs":100, "si":515}},
           "Nickel Alloy":{"yield strength":{"uscs":60, "si":414},
                           "ultimate strength":{"uscs":80, "si":552}}}

## Dictionary Methods

- Use **`print([n for n in list(dir(dict)) if "__" not in n])`** to see available dictionary methods
- Most will act "in-place" instead of creating copies of the dictionary
- **`my_dict.clear()`** removes all key-value pairs from **`my_dict`**, leaving it empty
- **`new_dict = my_dict.copy()`** creates an unlinked copy of **`my_dict`** named **`new_dict`**
- **`my_dict = dict.fromkeys([1, 2, 3, 4], 10)`** creates a dictionary named **`my_dict`**
  - With the keys 1, 2, 3, and 4 as given in the iterable (list or tuple)
  - All values will be assigned 10
  - If no value is given, all of the keys will be assigned the value **`None`**
- **`my_dict.get("a")`** attempts to get the value for **`"a"`** in **`my_dict`**
  - Returns the value if **`"a"`** is present
  - Returns **`None`** if it is not present
  - Can add a second argument to change the value returned if the requested key is not present
    - For example, **`my_dict.get("a", "Not present")`**
  - This is the method mentioned earlier that keeps you from getting an error when requesting a value that is not present
- **`my_dict.keys()`** returns all keys in **`my_dict`**
  - Convert the returned values to a list using **`list(my_dict.keys())`**
- **`my_dict.values()`** returns all of the values in **`my_dict`**
  - Convert the returned values to a list using the **`list()`** function
- **`my_dict.items()`** returns all of the key-value pairs from **`my_dict`** as tuples
  - Using **`list(my_dict.items())`** returns a list of tuples where each tuple contains the key-value pairs
- **`my_dict.pop("a")`** returns the value associated with the key **`"a"`** and removes **`"a"`** and its value
- **`my_dict.popitem()`** removes the last key-value pair that was added and returns it as a tuple
  - The last-added pair is only guaranteed in *Python* 3.7 and later
  - Earlier versions of *Python* removed an arbitrary pair from the dictionary
- **`my_dict.setdefault("a")`** returns the value for the key **`"a"`**, adding the key first if needed
  - If **`"a"`** doesn't already exist, it is added with the value **`None`**
  - If key **`"a"`** does exist, its current value is returned
  - A second argument can be added to the method, i.e. **`my_dict.setdefault("a", "a value")`**
    - This adds **`"a"`** as a key with the value **`"a value"`** if the key does not already exist
    - If `"a"` already exists, the current value remains the same and is returned
- **`my_dict.update(other_dict)`** updates the values in **`my_dict`** using the key-value pairs from **`other_dict`**
  - Key-value pairs from **`other_dict`** that do not exist in **`my_dict`** will be added to **`my_dict`**
  - If a key from **`other_dict`** already exists in **`my_dict`**, its value in **`my_dict`** will be replaced by the one from **`other_dict`**
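
Several of the methods listed above are easier to follow when run in sequence. The sketch below uses a hypothetical `stock` dictionary; the comments describe what each call does at that point.

```python
# Hypothetical dictionary used only for illustration
stock = {"bolt": 10, "nut": 25}

missing = stock.get("washer")             # key is absent, so None is returned (no error)
fallback = stock.get("washer", 0)         # 0 is returned, but no key is added

stock.setdefault("washer", 40)            # key is absent, so it is added with the value 40
unchanged = stock.setdefault("bolt", 99)  # key exists, so its current value is returned

stock.update({"nut": 30, "screw": 5})     # replaces the value for "nut" and adds "screw"

removed_value = stock.pop("bolt")         # returns the value and removes the pair
last_pair = stock.popitem()               # removes the last-added pair, returns it as a tuple

blank = dict.fromkeys(["a", "b"])         # no value given, so every key gets None

print(missing, fallback, unchanged)
print(removed_value, last_pair)
print(stock)
print(blank)
```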

In [0]:
print([n for n in list(dir(dict)) if "__" not in n])

___
**Practice it**

Use the appropriate dictionary methods to perform the following tasks using the **`MECH_classes`** dictionary

- Show all of the key-value pairs as tuples
- Show just the keys
- Show just the values
- Add "MECH 212", "MECH 222", and "MECH 223" with values of **`None`** for each (this will require using two methods in a single expression)

Print **`MECH_classes`** when you are done

In [0]:
# all key-value pairs as tuples


In [0]:
# just the keys


In [0]:
# just the values


In [0]:
# add MECH 212, MECH 222, and MECH 223 with values of None


In [0]:
# print MECH_classes


___
**Practice it**

A common use for a dictionary is to count the number of occurrences of characters in a string or numbers in a list. The following function definition will count the number of each different character in the string and assign the counts to a dictionary. See how it works by counting the characters in the string "My name is Inigo Montoya, you killed my father, prepare to die!"

In [0]:
def count_chars(string):
    string = string.lower()
    count_dict = {}
    for char in string:
        if char in count_dict:
            count_dict[char] += 1
        else:
            count_dict[char] = 1
    return count_dict

A better version of the **`count_chars(string)`** function would use **`count_dict[char] = count_dict.get(char, 0) + 1`** instead of the **`if-else`** statement block. The **`.get()`** method itself never adds a key. If **`char`** is not already a key, the default value of **`0`** is returned, 1 is added to it, and the assignment creates the key **`char`** with the value 1. If a key for **`char`** already exists, its current value is retrieved, incremented by one, and assigned back to **`char`** in the dictionary.
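
The same counting pattern works for numbers in a list. The function name `count_values` and the `readings` data below are hypothetical, chosen only to show the pattern on something other than characters.

```python
def count_values(values):
    """Count how many times each item appears in a list (hypothetical helper)."""
    counts = {}
    for value in values:
        # .get() returns 0 when value is not yet a key; the assignment stores the new count
        counts[value] = counts.get(value, 0) + 1
    return counts

readings = [3, 1, 3, 2, 1, 3]   # made-up data
reading_counts = count_values(readings)
print(reading_counts)
```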

___
**Practice it**

Edit the function definition in the code cell below to use the expression described above. Test the function with the string "the quick brown fox jumps over the lazy dog."

In [0]:
# Make this function better by using the .get() method instead of if-else
def count_chars(string):
    string = string.lower()
    count_dict = {}
    for char in string:
        if char in count_dict:
            count_dict[char] += 1
        else:
            count_dict[char] = 1
    return count_dict

The string used above has at least one of each letter from the English alphabet. If you assign the result of the function call to a variable, like **`string_count`**, and then use **`sorted(string_count)`**, it is fairly easy to verify this. Calling **`sorted()`** on a dictionary returns a sorted list containing only the keys.
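
As a small illustration of how **`sorted()`** treats a dictionary, the sketch below uses a hypothetical `letter_counts` dictionary with made-up counts.

```python
letter_counts = {"c": 2, "a": 5, "b": 1}      # hypothetical counts

sorted_keys = sorted(letter_counts)           # a list of the keys in sorted order
sorted_pairs = sorted(letter_counts.items())  # (key, value) tuples sorted by key

print(sorted_keys)
print(sorted_pairs)
```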

___
**Practice it**

Perform the calling and sorting as described above.

## Sets

- A **set** is a collection that is related to, and sometimes mistaken for, a dictionary
- Sets do not contain key-value pairs like dictionaries
- They do use curly braces like dictionaries
- Sets can be created from the following using the **`set()`** function
  - Strings
  - Lists
  - Tuples
  - Dictionary keys
  - Dictionary values using **`.values()`**
- Sets are unordered collections of unique values
- If **`"a"`** is added to a set that already has an **`"a"`** nothing will be changed
- Good if you want to know which unique characters appear in a string (a set does not count occurrences)
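
A brief sketch of building sets from the sources listed above. The strings, numbers, and the `inventory` dictionary are made up for illustration.

```python
letters = set("banana")               # from a string: each character is kept once
numbers = set([1, 2, 2, 3, 3, 3])     # from a list: duplicates are dropped

letters.add("a")                      # "a" is already present, so nothing changes

inventory = {"bolt": 10, "nut": 10}   # hypothetical dictionary
key_set = set(inventory)              # same result as set(inventory.keys())
value_set = set(inventory.values())   # the repeated value 10 appears only once

print(sorted(letters))                # sorted() returns a list in a predictable order
print(sorted(numbers))
print(sorted(key_set), value_set)
```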

___
**Practice it**

Convert the string "the quick brown fox jumps over the lazy dog" to a set in the first code cell and assign it to the variable name **`string_set`**. In the next cell print both the original set and a sorted version of the set.

In [0]:
"the quick brown fox jumps over the lazy dog"

## Reading and Writing from/to Files

- Items in lists, tuples, and dictionaries only remain available until a script finishes executing
- Sometimes you want to save values for later or use saved information
- This might include...
  - Test results saved from an instrument or machine
  - Tables of material properties
  - Steam tables
  - A dictionary of books, articles, or standards
  - Any text file
- *Python* has built-in functions and methods for reading and writing information from/to files
- We will look at using text files that can be opened with any standard text editor

### Opening a File for Reading or Writing

- Assume that the file is located in the same directory as our script or function
- The standard procedure to open a file is to use the **`open()`** function and assign it to a variable
- For example, **`my_file = open("data file.txt")`**
  - This will open the file "data file.txt" for reading
  - Assign it to **`my_file`**
  - Can add a mode as a second argument to the function to explicitly state how the file will be used
    - The default is for reading **`"r"`**
    - Can also use...
      - **`"w"`** for writing
      - **`"a"`** for appending
      - **`"r+"`** for reading and writing
  - Could have been written as **`my_file = open("data file.txt", "r")`** to state the file is to be read
- After a file is opened for reading or writing, it needs to be closed, i.e. **`my_file.close()`**
- Another method that can be used is **`with open("data file.txt", "r") as my_file:`**
  - It does exactly the same thing as the previous method
  - Any commands done with the file need to be indented under the **`with`** expression
  - The file is automatically closed as soon as the code block below the **`with`** line is finished
  - Using **`with`** to open a file means that the **`.close()`** method does not need to be used
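
The two ways of opening a file can be compared side by side. This sketch creates its own small file in a temporary folder so it does not depend on any existing file; the file name `example.txt` is an assumption made for the example.

```python
import os
import tempfile

# Create a small file in a temporary folder so the example is self-contained
folder = tempfile.mkdtemp()
path = os.path.join(folder, "example.txt")   # hypothetical file name

# Explicit open and close
out_file = open(path, "w")
out_file.write("first line\n")
out_file.close()

# The with form closes the file automatically when the indented block ends
with open(path, "r") as in_file:
    contents = in_file.read()

print(contents)
print(in_file.closed)   # the file object reports that it has been closed
```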

### Reading a File's Contents

- Just looking at two methods for reading the contents of a file that has been opened
  - **`.read()`**
  - **`.readlines()`**
  - These two methods should meet all of our needs
- **`.read()`** will read an entire text file as a single long string
  - Everywhere the file had a new line, the string will have the newline character **`\n`** added
  - Assign the read to a variable name to work with the string
  - Example below
- Using `.readlines()` is similar except that each line of the text file will become an item in a list
- Once the contents of the file are assigned to a variable using either method, anything can be done with the information
- It is best to perform such operations outside of the **`with`** code block
- Keep in mind that if the file was only read, anything done within a script or function does not change the original file


```python
with open("data file.txt") as my_file:
    file_string = my_file.read()
```

___
**Practice it**

Type and execute the above two lines in the following code cell. Then in second code cell display **`file_string`**.

___
**Practice it**

Copy your code from above into the first code cell below and edit it so that it uses **`.readlines()`** instead of **`.read()`** and assign the results to **`file_list`** instead of **`file_string`**. Use the second code cell to display the results.

### Writing to a File

- Suppose a script performs calculations and you want to save the results instead of printing them
- Open a file and write to it
- Explicitly set the mode to either **`"w"`** or **`"a"`**
  - If **`"w"`** and the file name used in **`open()`** does not already exist, it will be created
  - If the file already exists and the mode is `"w"`, its content will be deleted without warning
  - If the **`"a"`** mode is used and the file exists, the existing file will be appended
  - Using **`"a"`** when the file does not yet exist is just like using **`"w"`**
- Once a file is open use either the **`.write()`** or **`.writelines()`** methods
- It is important that everything written is text
- *Python* will not write integers or floats or anything else unless it is converted to a string
- The **`.write()`** method requires a string as an argument and writes the string to the open file
- The code below opens `"new text file.txt"` and writes two lines to it
  - The **`\n`** at the end of each string is necessary for the strings to be on separate lines

```python
with open("new text file.txt", "w") as new_file:
    new_file.write("Hey, new file.\n")
    new_file.write(f"{21 * 2}\n")
```
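
The difference between the **`"w"`** and **`"a"`** modes, and the need to convert numbers to text, can be sketched as follows. The file name `results.txt` and the calculation are hypothetical, and the file is created in a temporary folder so nothing existing is overwritten.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "results.txt")  # hypothetical file

with open(path, "w") as results_file:    # "w" creates the file (or empties an existing one)
    results_file.write("Results\n")

area = 3.5 * 2                           # a number must be converted to a string first
with open(path, "a") as results_file:    # "a" adds to the end of the existing file
    results_file.write(str(area) + "\n")

with open(path) as results_file:         # read the file back to check what was saved
    saved_text = results_file.read()

print(saved_text)
```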

- The **`.writelines()`** method requires a list of strings as an argument
- All of the strings in the list will be written one after the other to the file
- If a string does not end in `\n`, then the next string will start on the same line
- The following code shows how a list of strings can be used to create a new text file

```python
string_list = ["My name is Brian.\n", "I work at Ferris."]
with open("brian file.txt", "w") as new_file:
    new_file.writelines(string_list)
```

___
**Practice it**

Copy the code provided above and paste it into the first code cell below. Change the code so that the file uses your first name (not mine) and the first string in the list uses your first name (again, not mine). Execute the code cell to create the file. In the next empty code write the code necessary to read your new file and put its contents into a list and print the list.

### Reading CSV Files

- CSV (comma separated values) files are special forms of text files
- All modern spreadsheet applications can be used to create CSV files
- The standard **`open()`** function with **`.read()`** or **`.readlines()`** and other coding can be used

___
**Practice it**

Make a text file named **`my_csv_file.csv`** using a plain text editor that looks like the following (make sure you do not include any spaces after commas).

```text
"x","y"
0,0
1,2
3,6
4,8
5,10
```

Then execute the following code cell to read it and place each line into a list.

In [0]:
with open("my_csv_file.csv", "r") as csv_file:
    csv_list = csv_file.readlines()
print(csv_list)

The next code cell iterates through the list of strings that the **`.readlines()`** method created and strips all newline characters. It also creates a list called **`headings`** containing the two strings from the first line. It also splits all lines after the first at the comma, converts the strings to floats, and places the x and y values into separate lists. This is not difficult code, but not all that simple either.

In [0]:
x = []
y = []
for n, string in enumerate(csv_list):
    new_string = string.strip("\n")
    if n == 0:
        new_string = new_string.strip('"')
        headings = new_string.split('","')
    else:
        xy_pair = [float(value) for value in new_string.split(",")]
        x.append(xy_pair[0])
        y.append(xy_pair[1])

print(headings)
print(x)
print(y)

- The **`csv`** module was designed to handle this specific task
- Use **`with`** to open a CSV file and assign it a name
- Then use **`csv.reader()`** with the file object as an argument plus **`quoting=csv.QUOTE_NONNUMERIC`** as a keyword argument
  - The **`quoting`** keyword tells the reader how to handle quoted items
  - In this case, it converts non-quoted items into floats
- The new CSV object can be converted to a list so the required values can be extracted
- In the code below two list comprehensions are used to extract the x and y values and place them in their own lists

In [0]:
import csv
with open("my_csv_file.csv") as csvfile:
    csv_reader = csv.reader(csvfile, quoting=csv.QUOTE_NONNUMERIC)
    csv_list = list(csv_reader)
    headings = csv_list[0]
    x = [n[0] for n in csv_list[1:]]
    y = [n[1] for n in csv_list[1:]]

print(headings)
print(x)
print(y)

### Using *Pandas* to Read a CSV File

- People who use *Python* to work with large data sets typically use the [*Pandas*](https://pandas.pydata.org/) library
- This library of modules makes working with CSVs of any size very easy
- The primary data structure of *Pandas* is the DataFrame
- The following code block will import *Pandas* and use **`.read_csv()`** to create a DataFrame named **`csv_df`** from **`"my_csv_file.csv"`**
- Once **`csv_df`** is created its headings can be converted to a list using **`list(csv_df)`**
- DataFrames use the heading names from the CSV as indexes for the DataFrame (like dictionary keys)
- In this case the headings are **`"x"`** and **`"y"`**
- **`csv_df["x"]`** will return all of the values from the `"x"` column as a *Pandas* Series
- Use the **`list()`** function to convert the Series to a list, i.e. **`x = list(csv_df["x"])`**
___
**Practice it**

Execute the following code cell to import *Pandas* and create a DataFrame from our CSV file. In separate code cells create lists that contain the CSV headings, the $x$ values, and the $y$ values. Print all three lists.

In [0]:
import pandas as pd
csv_df = pd.read_csv("my_csv_file.csv")

In [0]:
# list of the CSV headings


In [0]:
# list of the x values


In [0]:
# list of the y values


In [0]:
print(headings)
print(x)
print(y)

The *Pandas* approach definitely took fewer lines of code, but required more computing overhead due to importing a fairly sizable library. However, with that overhead comes a lot of power to work with datasets large and small.

___
**Wrap it up**

Click on the **Save** button and then the **Close and halt** button when you are done before closing your tab.