# ICT 781 - Day 6: Review and File Input/Output

This set of notes only introduces one new concept: file input and output. This also marks roughly the halfway point of the course. Therefore, we'll review some concepts before we move on with the new material.

# Core Python Concepts

These skills can be considered the 'glue' that holds Python programs together. Mastering these skills doesn't come quickly, but will enable you to create extremely useful and powerful Python programs. These skills can be summarized as follows:

* Understanding basic Python data types and arithmetic.
* Conditional control.
* Iteration through `for` loops and list comprehensions.
* Using dictionaries to store data.
* Writing functions in Python.
* Proper commenting and documentation.

# Summary: Python Types and Control Statements 

* Arithmetic operators in Python are shown in the table below:

|Key|Mathematical Operation in Python|
|---|:--|
|`*`|Multiplication|
|`/`|Division|
|`+`|Addition|
|`-`|Subtraction|
|`**`|Exponent|
|`%`|Modulo|
|`//`|Floor division (divide, then round down to the nearest whole number)|

* The `math` module may be imported for more complicated mathematical functions such as trigonometry.
* Basic data types in Python are integers, strings, floats, and booleans.
* A **statement** is an instruction that the interpreter can execute, such as an assignment or an `if` block.
* An **expression** is a piece of code that evaluates to a value.
* Some of Python's built-in functions are:

|Python Function|Purpose|
|---|---|
|`print()`|Print out the argument|
|`input()`|Prompt the user for input in the console and return it as a string, <br>which can be printed or assigned to a variable|
|`int()`|Convert the argument to integer|
|`float()`|Convert the argument to float|
|`str()`|Convert the argument to string|
|`round()`|Round the argument to the nearest integer, or to a given number of decimal places|

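* Comments may be written in Python using the `#` symbol. Multi-line text enclosed in `""" """` or `''' '''` is technically a string literal, but it is commonly used as a multi-line comment or, at the top of a function, as a docstring.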
* The relational operators in Python are:

|Operator|Name|Way it works|
|---|---|---|
|`>`|Greater than|Returns `True` if first operand is greater than the second operand|
|`<`|Less than|Returns `True` if first operand is less than the second operand|
|`>=`|Greater than or equal to|Returns `True` if first operand is greater than or equal to the second operand|
|`<=`|Less than or equal to|Returns `True` if first operand is less than or equal to the second operand|
|`==`|Equals|Returns `True` if both operands are equal|
|`!=`|Not equal|Returns `True` if operands are not equal|

* The logical operators in Python are `and`, `or`, and `not`. The identity operators `is` and `is not` check whether two names refer to the same object.
* We control the flow of our programs with `if`, `elif` and `else`.
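
The operators summarized above can be checked directly in the interpreter. The following is a minimal sketch; the variable names are our own and are only meant for illustration.

```python
# Arithmetic: true division always returns a float,
# floor division rounds down to the nearest whole number.
true_div = 7 / 2        # 3.5
floor_div = 7 // 2      # 3
neg_floor = -7 // 2     # -4 (rounds down, not toward zero)
remainder = 7 % 2       # 1
power = 2 ** 10         # 1024

# Logical operators combine boolean values.
in_range = (0 < floor_div) and (floor_div < 10)   # True
either = (remainder == 0) or (power > 1000)       # True
negated = not in_range                            # False

# Identity operators compare objects, not values.
a = [1, 2, 3]
b = [1, 2, 3]
same_value = a == b     # True: the contents are equal
same_object = a is b    # False: these are two separate list objects

print(true_div, floor_div, neg_floor, remainder, power)
print(in_range, either, negated, same_value, same_object)
```

Note the difference between `==` and `is` in the last two lines: two lists can hold equal values without being the same object.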

### Exercise 1

For this exercise, we will be coding the creation of the [Collatz sequence](https://en.wikipedia.org/wiki/Collatz_conjecture). To create the sequence, we begin with a positive integer $a_0$. Each subsequent term is computed from the previous term $a_n$ by
$$
    a_{n+1} := \begin{cases} \frac{a_n}{2}, & \text{ if } a_n \bmod 2 = 0 \\ 3a_n+1, &\text{ if } a_n \bmod 2 = 1 \end{cases}
$$

The sequence ends when it reaches 1. Write code in Python to find the Collatz sequence for an input integer. Make sure that the input integer is positive (and greater than 1).

In [19]:
# Your Collatz sequence code here.

def collatz(n):
    """Prints the Collatz sequence for a positive integer n.
    
    Examples:
    >>> collatz(5)
    5
    16
    8
    4
    2
    1
    
    >>> collatz(9)
    9
    28
    14
    7
    22
    11
    34
    17
    52
    26
    13
    40
    20
    10
    5
    16
    8
    4
    2
    1
    """
    
    # Reject anything that is not a positive integer.
    if type(n) is not int:
        raise TypeError('n must be an integer')
    if n < 1:
        raise ValueError('n must be a positive integer')
    
    current = n
    
    print(current)
    
    while current > 1:
        if current % 2 == 0:
            current //= 2
        elif current % 2 == 1:
            current = 3*current + 1
        print(current)

### Exercise 2

Without using built-in functions, write a function that finds the minimum and maximum of a given list of numbers.

In [31]:
def find_extremes(num_list):
    """ Return the min and max of the passed list
        
    Examples:
    >>> find_extremes([3,0,10,5,6,2])
    (0, 10)
    """
    # Track the smallest and largest values seen so far in a single pass.
    # This leaves the caller's list unchanged.
    smallest = largest = num_list[0]
    for item in num_list[1:]:
        if item < smallest:
            smallest = item
        elif item > largest:
            largest = item
    return smallest, largest

find_extremes(['hi','hello','whats up','wut up','hey','saladations','yo'])

('hello', 'yo')

In [28]:
import doctest

doctest.testmod(verbose = True)

Trying:
    collatz(5)
Expecting:
    5
    16
    8
    4
    2
    1
ok
Trying:
    collatz(9)
Expecting:
    9
    28
    14
    7
    22
    11
    34
    17
    52
    26
    13
    40
    20
    10
    5
    16
    8
    4
    2
    1
ok
Trying:
    find_extremes([3,0,10,5,6,2])
Expecting:
    (0, 10)
ok
1 items had no tests:
    __main__
2 items passed all tests:
   2 tests in __main__.collatz
   1 tests in __main__.find_extremes
3 tests in 3 items.
3 passed and 0 failed.
Test passed.


TestResults(failed=0, attempted=3)

### Exercise 3

Suppose that you love building with blocks, but you only like laying out your buildings with square bases. Rectangles just don't cut it. Write a function that takes in a positive integer and checks if it is a square. For example, 5 is not a square, but 9 is.

In [41]:
def check_square(n):
    """ Take in a positive integer and check if it is square. 
    
        Examples:
        >>> check_square(21)
        False
        
        >>> check_square(25)
        True
    """
    
    return not n**0.5 % 1

In [47]:
def check_square1(n):
    """ Take in a positive integer and check if it is square. 
    
        (Doing this more explicitly than above)
    
        Examples:
        >>> check_square1(21)
        False
        
        >>> check_square1(25)
        True
    """
    
    if n**0.5 % 1 != 0:
        return False
    elif n**0.5 % 1 == 0:
        return True

**Note:** The square root of a number $\sqrt{n}$ can also be written as $n^{0.5}$.
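
The floating-point check above works for small inputs, but floats cannot represent very large integers exactly, so `n**0.5 % 1` can report `0.0` for a number that is not a square. As a hedged alternative, the sketch below uses `math.isqrt()` (available in Python 3.8 and later), which works entirely with integers. The function name `check_square_exact` is our own.

```python
import math

def check_square_exact(n):
    """Return True if the integer n is a perfect square."""
    if n < 0:
        return False
    root = math.isqrt(n)          # Largest integer whose square is <= n.
    return root * root == n

print(check_square_exact(21))     # False
print(check_square_exact(25))     # True

# 10**40 is a perfect square, so 10**40 + 1 is not.
big = 10**40 + 1
float_says_square = (big**0.5 % 1 == 0)   # The float check is fooled here.
print(float_says_square, check_square_exact(big))
```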

In [48]:
doctest.testmod(verbose = True)

Trying:
    check_square(21)
Expecting:
    False
ok
Trying:
    check_square(25)
Expecting:
    True
ok
Trying:
    check_square1(21)
Expecting:
    False
ok
Trying:
    check_square1(25)
Expecting:
    True
ok
Trying:
    collatz(5)
Expecting:
    5
    16
    8
    4
    2
    1
ok
Trying:
    collatz(9)
Expecting:
    9
    28
    14
    7
    22
    11
    34
    17
    52
    26
    13
    40
    20
    10
    5
    16
    8
    4
    2
    1
ok
Trying:
    find_extremes([3,0,10,5,6,2])
Expecting:
    (0, 10)
ok
1 items had no tests:
    __main__
4 items passed all tests:
   2 tests in __main__.check_square
   2 tests in __main__.check_square1
   2 tests in __main__.collatz
   1 tests in __main__.find_extremes
7 tests in 5 items.
7 passed and 0 failed.
Test passed.


TestResults(failed=0, attempted=7)

In [35]:
25**0.5 % 1

0.0

In [36]:
9**0.5 % 1 

0.0

In [40]:
7**0.5 % 1

0.6457513110645907

In [32]:
9**0.5

3.0

# Summary: Iterating and Functions in Python 

* `while` loops are used when we need to iterate an unknown number of times.
* `for` loops are used when we need to iterate a known number of times.
* Lists, tuples, and dictionaries are containers for any Python data types.
* Lists and dictionaries are **mutable**; tuples are **immutable**.
* Some list methods are:

|Method|Description|
|------|-----------|
|`.append(<item>)`|Appends the item to the end of the list|
|`.remove(<item>)`|Removes the first list element equal to `<item>`|
|`.pop(<index>)`|Removes and returns the list element at `<index>` (the last element if no index is given)|

* Some useful built-in functions for lists are:

|Function|Description|
|--------|-----------|
|`len(<list>)`|Gives the length (number of elements) of a list|
|`sorted(<list>)`|Returns a new list with the elements sorted in ascending order; use the `reverse = True` argument for descending order|
|`sum(<list>)`|Adds up all list elements|

* Some useful dictionary methods are:

|Method|Use|
|---|---|
|`.keys()`|Access all keys in the dictionary|
|`.values()`|Access all values in the dictionary|
|`.items()`|Access all key/value pairs in the dictionary as tuples|
|`.get(<key>)`|Access the value of the key in the argument; returns `None` instead of raising an error if the key is missing|

* List comprehensions are used to perform calculations on an entire list, or to create new lists.
* Dictionary comprehensions are used to create new dictionaries.
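* The `zip()` function pairs up the elements of two or more lists as tuples; wrap the result in `list()` to get a list of tuples.
* The `enumerate()` function pairs each element of a list with its index; wrap the result in `list()` to get a numbered list of tuples.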
* Functions are declared in Python using the `def <func_name>(<arguments>):` syntax.
* Functions can return any Python data type.
* Use functions for computing repetitive tasks.
* `docstrings` are indispensable for describing functions, providing examples, and offering help to the developer/user.
* Avoid re-inventing the wheel by using Python's built-in functions or packages listed on PyPI.
* Catch exceptions (errors) by using `try`/`except` blocks.
* Recursive functions are functions that call themselves.
* Define arbitrary inputs for functions using `*args` and `**kwargs`.
* A **module** is a file containing Python variables and functions. Modules are brought into other Python scripts with `import`.
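
Several of the points above can be tried out in a few lines. This is only a sketch: the fish names come from later in these notes, but the lengths are made-up values for illustration.

```python
names = ['pike', 'perch', 'burbot']
lengths_cm = [70, 25, 40]   # Hypothetical values, for illustration only.

# zip() pairs up elements; wrap it in list() or dict() to see the result.
pairs = list(zip(names, lengths_cm))
fish_lengths = dict(zip(names, lengths_cm))

# enumerate() yields (index, item) pairs.
numbered = list(enumerate(names))

# sorted() returns a new list, in ascending order by default.
ascending = sorted(lengths_cm)
descending = sorted(lengths_cm, reverse=True)

# .get() returns None (or a chosen default) when the key is missing.
missing = fish_lengths.get('walleye')
fallback = fish_lengths.get('walleye', 0)

# A dictionary comprehension builds a new dictionary.
long_fish = {name: cm for name, cm in fish_lengths.items() if cm >= 40}

print(pairs)
print(numbered)
print(ascending, descending)
print(missing, fallback)
print(long_fish)
```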

### Exercise 4

A polygon is a 2-dimensional geometric shape formed by connecting vertices (points) with edges. Polygons can be represented by their vertices. For example, the square with area of 1 unit can be represented by the set of vertices $\{(0,0), (0,1), (1,0), (1,1) \}$. The distance between any two vertices $v_1 = (x_1, y_1)$ and $v_2 = (x_2, y_2)$ is given by
$$
    d(v_1, v_2) = \sqrt{(x_1-x_2)^2 + (y_1-y_2)^2}.
$$

Write a program that takes in a list of tuples of vertices and returns the largest and the smallest distance between any two points. 

**Optional:** Plot the polygon.

In [124]:
from itertools import combinations

def minmax_distance(vertices):
    """ Take in a list of vertices (list of tuples) 
        and return the largest and smallest pairwise
        distances.
        
        Examples:
        >>> minmax_distance([(0, 1), (0, 0), (0, 0)])
        (1.0, 0.0)
    
    """
    
    def distance(point1, point2):
        """ Find the distance between 2 points,
            each point is given by a tuple.
        """

        return ((point1[0] - point2[0])**2 + (point1[1] - point2[1])**2)**0.5
    
    # How to store the distances?
    # combinations() yields each unordered pair of vertices exactly once,
    # so we only store d(i, j) for i < j and skip d(i, i) and d(j, i).
    distances = [distance(p, q) for p, q in combinations(vertices, 2)]
    
    return max(distances), min(distances)

### Exercise 5

Create general input checking functions for the following tasks:
<ul>
    <li> Check that the input is a positive integer. </li>
    <li> Check that the input is a list. </li>
    <li> Check that the input is between some specified bounds (the bounds should also be inputs to the checking function). </li>
    <li> Check that the input is a string. </li>
</ul>

You can choose to allow user input to the checking functions, or you can create the checking functions as if they will be used inside other functions.

Put all of the checking functions into a module called `checking.py`.

### Exercise 6

This problem is taken from [Project Euler, problem 6](https://projecteuler.net/problem=6), which compares the sum of the squares of the first $n$ natural numbers
$$
    1^2 + 2^2 + 3^2 + \cdots + n^2 = S
$$
with the square of the sum of the same numbers
$$
    (1 + 2 + 3 + \cdots + n)^2 = T.
$$
The difference between the sum of the squares and the square of the sum is then $T-S$.

Write a function to compute the above difference.
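
One possible approach is sketched below. The first function computes $S$ and $T$ directly from their definitions. The second uses the standard closed forms $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ and $1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$ as a cross-check. Both function names are our own.

```python
def square_difference(n):
    """Return T - S for the first n natural numbers, by direct summation."""
    sum_of_squares = sum(i**2 for i in range(1, n + 1))   # S
    square_of_sum = sum(range(1, n + 1))**2               # T
    return square_of_sum - sum_of_squares

def square_difference_closed(n):
    """Return T - S using the closed-form expressions for both sums."""
    sum_of_squares = n * (n + 1) * (2*n + 1) // 6         # S
    square_of_sum = (n * (n + 1) // 2)**2                 # T
    return square_of_sum - sum_of_squares

# For n = 10: S = 385 and T = 3025, so T - S = 2640.
print(square_difference(10))
print(square_difference_closed(10))
```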

### Exercise 7

Rewrite the following `for` loops as list comprehensions. **Hint:** You might need three list comprehensions for the last loop.

In [49]:
# Loop 1
trees = ['elm','oak','cedar','pine','spruce','butternut','ash','aspen','basswood']
caps_trees = []

for tree in trees:
    caps_trees.append(tree.upper())
    
caps_trees1 = [tree.upper() for tree in trees]

print(caps_trees)
print(caps_trees1)

['ELM', 'OAK', 'CEDAR', 'PINE', 'SPRUCE', 'BUTTERNUT', 'ASH', 'ASPEN', 'BASSWOOD']
['ELM', 'OAK', 'CEDAR', 'PINE', 'SPRUCE', 'BUTTERNUT', 'ASH', 'ASPEN', 'BASSWOOD']


In [50]:
# Loop 2
fraction_squares = []
N = 30

for i in range(1, N):
    fraction_squares.insert(i, 1/i**2)
    # or
    # fraction_squares.append(1/i**2)
    
fraction_squares1 = [1/i**2 for i in range(1,N)]

print(fraction_squares)
print(fraction_squares1)

[1.0, 0.25, 0.1111111111111111, 0.0625, 0.04, 0.027777777777777776, 0.02040816326530612, 0.015625, 0.012345679012345678, 0.01, 0.008264462809917356, 0.006944444444444444, 0.005917159763313609, 0.00510204081632653, 0.0044444444444444444, 0.00390625, 0.0034602076124567475, 0.0030864197530864196, 0.002770083102493075, 0.0025, 0.0022675736961451248, 0.002066115702479339, 0.001890359168241966, 0.001736111111111111, 0.0016, 0.0014792899408284023, 0.0013717421124828531, 0.0012755102040816326, 0.0011890606420927466]
[1.0, 0.25, 0.1111111111111111, 0.0625, 0.04, 0.027777777777777776, 0.02040816326530612, 0.015625, 0.012345679012345678, 0.01, 0.008264462809917356, 0.006944444444444444, 0.005917159763313609, 0.00510204081632653, 0.0044444444444444444, 0.00390625, 0.0034602076124567475, 0.0030864197530864196, 0.002770083102493075, 0.0025, 0.0022675736961451248, 0.002066115702479339, 0.001890359168241966, 0.001736111111111111, 0.0016, 0.0014792899408284023, 0.0013717421124828531, 0.0012755102040816

In [72]:
# Loop 3
hours = [45, 40, 42, 41, 37, 39, 40, 43, 46, 80, 100, 121, 91, 81]
regular = 0           # Between 0 - 40 hours
overtime = 0          # Between 40 - 80 hours
double_overtime = 0   # Over 80 hours

for time in hours:
    if time <= 40:
        regular += time
    elif 40 < time <= 80:
        overtime += time - 40
        regular += 40
    else:
        double_overtime += time - 80
        overtime += 40
        regular += 40

    print(regular, overtime, double_overtime)
    
regular1 = [time if time <= 40 else 40 for time in hours ]
overtime1 = [time - 40 for time in hours if 40 < time <= 80]

overtime2 =  [ 0 if time <= 40 else time - 40 if time <= 80 else 40 for time in hours ] 

print(regular1)
print(overtime1)
print(overtime2)

40 5 0
80 5 0
120 7 0
160 8 0
197 8 0
236 8 0
276 8 0
316 11 0
356 17 0
396 57 0
436 97 20
476 137 61
516 177 72
556 217 73
[40, 40, 40, 40, 37, 39, 40, 40, 40, 40, 40, 40, 40, 40]
[5, 2, 1, 3, 6, 40]
[5, 0, 2, 1, 0, 0, 0, 3, 6, 40, 40, 40, 40, 40]
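
The comprehensions above produce per-shift values rather than the running totals of the original loop, and the double overtime list is still missing. As a sketch of one way to finish the job, we can build three lists and add each one up with `sum()`. This version leans on the built-in `min()` and `max()` functions instead of nested conditionals, which is our choice rather than a requirement of the exercise.

```python
hours = [45, 40, 42, 41, 37, 39, 40, 43, 46, 80, 100, 121, 91, 81]

# At most 40 regular hours per entry.
regular_hours = [min(time, 40) for time in hours]

# Hours beyond 40, capped at 40 overtime hours per entry.
overtime_hours = [min(max(time - 40, 0), 40) for time in hours]

# Anything beyond 80 hours.
double_overtime_hours = [max(time - 80, 0) for time in hours]

totals = (sum(regular_hours), sum(overtime_hours), sum(double_overtime_hours))
print(totals)   # (556, 217, 73), matching the last line printed by the loop
```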


# File Input/Output (I/O)

Python can read data from and write data to external files. Plain-text `.txt` files are handled by the built-in `open()` function, and `.csv` files by the `csv` module in the standard library. For other file types, such as `.xls` or `.xlsx`, we most often use `pandas`.

## Reading and Writing `.txt` Files

### *Example 1: Reading a `.txt` file line by line*

This example introduces the `with` statement and the `open()` function. The `open()` function has a descriptive name: it opens a file for reading or writing and returns a file object. While the exact mechanics of the `with` statement are technical, we can think of it as binding that file object to a temporary name (here, `file`) for the duration of the indented block. In the context of file I/O, combining `with` and `open()` ensures that the file is opened at the start of the `with` block and closed automatically when the block ends, even if an error occurs inside it.
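
To make the automatic closing concrete, the sketch below compares a `with` block to roughly equivalent code written by hand with `try`/`finally`. It uses a scratch file in a temporary directory (the name `demo.txt` is made up) so that it does not depend on `books.txt`.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# The file is closed as soon as the with block ends.
with open(path, 'w') as file:
    file.write('first line\n')

closed_after_with = file.closed        # True

# Roughly the same behaviour, written out by hand.
file = open(path, 'r')
try:
    contents = file.read()
finally:
    file.close()                       # Runs even if read() raises an error.

closed_after_finally = file.closed     # True

print(closed_after_with, closed_after_finally)
print(contents)
```

The `with` version is shorter and harder to get wrong, which is why we use it throughout these notes.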

In [127]:
# Open the file.
with open('books.txt', 'r') as file:     # The 'r' option means 'read-only'.
    data1 = list(file)                   # Read in the file and store as list.

print(data1)

['Title\tAuthor\tGenre\tPages\tPublisher\n', 'Fundamentals of Wavelets\t"Goswami, Jaideva"\tsignal_processing\t228\tWiley\n', 'Data Smart\t"Foreman, John"\tdata_science\t235\tWiley\n', 'God Created the Integers\t"Hawking, Stephen"\tmathematics\t197\tPenguin\n', 'Superfreakonomics\t"Dubner, Stephen"\teconomics\t179\tHarperCollins\n', 'Orientalism\t"Said, Edward"\thistory\t197\tPenguin\n', '"Nature of Statistical Learning Theory, The"\t"Vapnik, Vladimir"\tdata_science\t230\tSpringer\n', 'Integration of the Indian States\t"Menon, V P"\thistory\t217\tOrient Blackswan\n', '"Drunkard\'s Walk, The"\t"Mlodinow, Leonard"\tscience\t197\tPenguin\n', 'Image Processing & Mathematical Morphology\t"Shih, Frank"\tsignal_processing\t241\tCRC\n', 'How to Think Like Sherlock Holmes\t"Konnikova, Maria"\tpsychology\t240\tPenguin\n', 'Data Scientists at Work\tSebastian Gutierrez\tdata_science\t230\tApress\n', 'Slaughterhouse Five\t"Vonnegut, Kurt"\tfiction\t198\tRandom House\n', 'Birth of a Theorem\t"Villan

In [128]:
""" The original file had lots of '\t' characters in it to specify
    tab-delimited data. We'll split the list based on these characters,
    since we don't need them anymore.
"""

data1 = [row.split('\t') for row in data1]
print(data1)

[['Title', 'Author', 'Genre', 'Pages', 'Publisher\n'], ['Fundamentals of Wavelets', '"Goswami, Jaideva"', 'signal_processing', '228', 'Wiley\n'], ['Data Smart', '"Foreman, John"', 'data_science', '235', 'Wiley\n'], ['God Created the Integers', '"Hawking, Stephen"', 'mathematics', '197', 'Penguin\n'], ['Superfreakonomics', '"Dubner, Stephen"', 'economics', '179', 'HarperCollins\n'], ['Orientalism', '"Said, Edward"', 'history', '197', 'Penguin\n'], ['"Nature of Statistical Learning Theory, The"', '"Vapnik, Vladimir"', 'data_science', '230', 'Springer\n'], ['Integration of the Indian States', '"Menon, V P"', 'history', '217', 'Orient Blackswan\n'], ['"Drunkard\'s Walk, The"', '"Mlodinow, Leonard"', 'science', '197', 'Penguin\n'], ['Image Processing & Mathematical Morphology', '"Shih, Frank"', 'signal_processing', '241', 'CRC\n'], ['How to Think Like Sherlock Holmes', '"Konnikova, Maria"', 'psychology', '240', 'Penguin\n'], ['Data Scientists at Work', 'Sebastian Gutierrez', 'data_science',

Note that, for now, I left the `\n` newline characters alone. If I want to write this data to a new file, each row already ends with its own line break, so I won't need to add one.

There are three ways of reading in data from `.txt` files in Python, besides the method of simply assigning the contents of the data file to a list. This list is adapted from *Murach's Python Programming*.

1. `read()` - Read the entire file and save it as one long string.
2. `readlines()` - Read the file line-by-line and save each line as a string in a list.
3. `readline()` - Read one line of a file, saving it as a string.

The next table shows common values for the optional `mode` argument of the `open()` function.

|Option|Description|
|---|---|
|`r`|Open the file in read-only mode (the default)|
|`w`|Open the file for writing, creating it if needed and erasing any existing contents|
|`a`|Append data to the end of the file (create the file if it doesn't exist)|
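
The difference between the `w` and `a` modes is easiest to see by writing to the same file several times. This is a minimal sketch using a made-up scratch file, `modes.txt`, in a temporary directory.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

with open(path, 'w') as file:      # 'w' creates the file.
    file.write('one\n')

with open(path, 'a') as file:      # 'a' adds to the end of the file.
    file.write('two\n')

with open(path, 'r') as file:
    after_append = file.readlines()

with open(path, 'w') as file:      # 'w' again: the old contents are erased.
    file.write('three\n')

with open(path, 'r') as file:
    after_overwrite = file.read()

print(after_append)       # ['one\n', 'two\n']
print(after_overwrite)    # three
```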

### *Example 2: Reading in a file as a complete string, reading line-by-line, and reading a single line*

In [129]:
# Read a .txt file as a string.
with open('books.txt', 'r') as file:
    data_string = file.read()

# Print the first 50 characters of the string contained in data_string.
print(data_string[:50] + '\n')

# Read each line of a .txt file into a list.
with open('books.txt', 'r') as file:
    data_lines = file.readlines()     # readlines() already returns a list.
    
print(data_lines[:10])
print()

# Read a single line of a .txt file.
with open('books.txt', 'r') as file:
    data_line = file.readline()
    
print(data_line)

Title	Author	Genre	Pages	Publisher
Fundamentals of

['Title\tAuthor\tGenre\tPages\tPublisher\n', 'Fundamentals of Wavelets\t"Goswami, Jaideva"\tsignal_processing\t228\tWiley\n', 'Data Smart\t"Foreman, John"\tdata_science\t235\tWiley\n', 'God Created the Integers\t"Hawking, Stephen"\tmathematics\t197\tPenguin\n', 'Superfreakonomics\t"Dubner, Stephen"\teconomics\t179\tHarperCollins\n', 'Orientalism\t"Said, Edward"\thistory\t197\tPenguin\n', '"Nature of Statistical Learning Theory, The"\t"Vapnik, Vladimir"\tdata_science\t230\tSpringer\n', 'Integration of the Indian States\t"Menon, V P"\thistory\t217\tOrient Blackswan\n', '"Drunkard\'s Walk, The"\t"Mlodinow, Leonard"\tscience\t197\tPenguin\n', 'Image Processing & Mathematical Morphology\t"Shih, Frank"\tsignal_processing\t241\tCRC\n']

Title	Author	Genre	Pages	Publisher



**Note:** Reading in a tab-delimited text file has resulted in a somewhat nasty looking list. We can get around this by reading the file with the `csv` module and specifying the tab character as the delimiter.
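
As a sketch of that idea, the code below writes the first two lines of the book data shown above to a scratch file (the name `books_sample.txt` is made up) and reads them back with `csv.reader()` and the `delimiter` argument. The `csv` module itself is introduced properly later in these notes.

```python
import csv
import os
import tempfile

# Two tab-delimited lines copied from the output above.
sample_lines = [
    'Title\tAuthor\tGenre\tPages\tPublisher\n',
    'Fundamentals of Wavelets\t"Goswami, Jaideva"\tsignal_processing\t228\tWiley\n',
]

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'books_sample.txt')
with open(path, 'w', newline='') as file:
    file.writelines(sample_lines)

# Tell the reader that fields are separated by tabs rather than commas.
with open(path, newline='') as tabfile:
    tab_reader = csv.reader(tabfile, delimiter='\t')
    book_rows = [row for row in tab_reader]

for row in book_rows:
    print(row)
```

Each row comes back as a clean list of strings: the tabs, the trailing `\n` characters, and the quotation marks around the author names are all handled by the reader.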

### *Example 3: Writing data to a new file*

In this example, we'll put the first 5 rows of `data1` into a new text file.

In [130]:
with open('books2.txt', 'w') as file:     # The 'w' option opens the file for writing, replacing any existing contents.
    for i in range(5):
        file.write('\t'.join(data1[i]))

### *Example 4: Writing a dictionary to a new file*

In this example, we'll see that dictionaries can be written to `.txt` files.

In [131]:
fish = {'pike': 'ES1',
        'perch': 'ES2',
        'rainbow trout': 'ES1'}

with open('fish.txt','w') as file:
    file.write(str(fish))     # Convert the dictionary to a string to be written to text.
    

burbot = {'burbot': 'ES4'}

# Reopen the file and append a new line to it.
with open('fish.txt','a') as file:     # The 'a' option means 'append'.
    file.write('\n' + str(burbot))
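
Because the dictionaries were written with `str()`, each line of the file is just text. One hedged way to turn those lines back into dictionaries is `ast.literal_eval()` from the standard library, which safely evaluates a string containing a Python literal. The sketch below repeats the writing step in a temporary directory so that it runs on its own; the name `fish_from_file` is ours.

```python
import ast
import os
import tempfile

fish = {'pike': 'ES1',
        'perch': 'ES2',
        'rainbow trout': 'ES1'}
burbot = {'burbot': 'ES4'}

# Scratch copy of fish.txt, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'fish.txt')

with open(path, 'w') as file:
    file.write(str(fish))
with open(path, 'a') as file:
    file.write('\n' + str(burbot))

# Read the file back, one dictionary per line, and merge the results.
fish_from_file = {}
with open(path, 'r') as file:
    for line in file:
        fish_from_file.update(ast.literal_eval(line.strip()))

print(fish_from_file)
```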

## Reading and Writing `.csv` Files

The acronym 'CSV' stands for Comma-Separated Values. The `.csv` file format is used for files containing data of the form:
```
heading1,heading2,heading3,...
col11,col21,col31,...
col12,col22,col32,...
```

These files can be read as spreadsheets, for example, in Excel. Python reads and writes these files in a very similar way to `.txt` files, but using the `csv` module.

In [132]:
import csv

csv.__version__

'1.0'

There are many features in the `csv` module. For now, we'll focus on the `reader` and `writer` objects.

A `reader` object reads in a `.csv` file. Each line of the read file can be printed or assigned to a new Python data type.

### *Example 5: Reading a `.csv` file*

In this example, we read in the `books.csv` file and save it as a `list`. The `newline = ''` option tells `open()` not to translate line endings, so that the `csv` module can handle them itself. The `csv` documentation recommends this whenever a file is opened for use with the module.

In [133]:
with open('books.csv', newline = '') as csvfile:
    bookreader = csv.reader(csvfile)     # Create the 'reader' object.
    
    # Save each row in the 'reader' object as a list within our list of books.
    books = [row for row in bookreader]
    
# Print the first 5 rows of our new data list.
for i in range(5):
    print(books[i])

['Title', 'Author', 'Genre', 'Pages', 'Publisher']
['Fundamentals of Wavelets', 'Goswami, Jaideva', 'signal_processing', '228', 'Wiley']
['Data Smart', 'Foreman, John', 'data_science', '235', 'Wiley']
['God Created the Integers', 'Hawking, Stephen', 'mathematics', '197', 'Penguin']
['Superfreakonomics', 'Dubner, Stephen', 'economics', '179', 'HarperCollins']


In [134]:
help(open)

Help on built-in function open in module io:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
    Open file and return a stream.  Raise OSError upon failure.
    
    file is either a text or byte string giving the name (and the path
    if the file isn't in the current working directory) of the file to
    be opened or an integer file descriptor of the file to be
    wrapped. (If a file descriptor is given, it is closed when the
    returned I/O object is closed, unless closefd is set to False.)
    
    mode is an optional string that specifies the mode in which the file
    is opened. It defaults to 'r' which means open for reading in text
    mode.  Other common values are 'w' for writing (truncating the file if
    it already exists), 'x' for creating and writing to a new file, and
    'a' for appending (which on some Unix systems, means that all writes
    append to the end of the file regardless of the current seek position

### *Example 6: Writing a `.csv` file*

In this example, we'll create a new `.csv` file from a list of lists. Our data consists of simulated pH measurements of reservoir water. Each row holds the 24 hourly measurements for one day, and there are 18 rows representing 18 days.

In [135]:
import random

# Scaling the output of random.random() to be between 5 and 8.
rainwater_ph = [[5 + random.random()*(8 - 5) for _ in range(24)] for _ in range(18)]

with open('ph.csv', 'w', newline = '') as writefile:
    writer = csv.writer(writefile)     # Create the 'writer' object.
    
    # Write all data in 'rainwater_ph' to the csv file.
    writer.writerows(rainwater_ph)
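
One thing to keep in mind when reading numeric data back in: a `reader` object returns every value as a string, so the values must be converted before doing arithmetic. The sketch below writes a similar table to a scratch file (the names `ph_values` and `ph_from_file` and the seed value are ours) and converts it back to floats.

```python
import csv
import os
import random
import tempfile

random.seed(781)   # Arbitrary seed so the sketch is repeatable.

# 18 rows (days) of 24 simulated hourly pH values between 5 and 8.
ph_values = [[5 + random.random()*(8 - 5) for _ in range(24)] for _ in range(18)]

# Scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'ph_demo.csv')

with open(path, 'w', newline='') as writefile:
    csv.writer(writefile).writerows(ph_values)

with open(path, newline='') as readfile:
    raw_rows = [row for row in csv.reader(readfile)]

# Every value comes back as a string, so convert each one to a float.
ph_from_file = [[float(value) for value in row] for row in raw_rows]

print(type(raw_rows[0][0]), type(ph_from_file[0][0]))
print(len(ph_from_file), len(ph_from_file[0]))
```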

### *Example 7: Writing a dictionary to a `.csv` file*

For this last example, we'll create a new `.csv` file by writing each row as a Python dictionary. This uses a special `writer` object: the `DictWriter` object. We will use the dictionary keys as the column headings by specifying the `fieldnames` keyword argument when we create the `DictWriter` object.

The data for this example is 100 simulated patients with high blood pressure. Each row's dictionary consists of a patient identifier number, then systolic and diastolic measurements.

In [None]:
import random

N = 100

header = ['patient_id','systolic','diastolic']

# Create the simulated data.
patient_id = ['0011' + str(i) for i in range(N)]
systolic = [random.randint(140,160) for _ in range(N)]
diastolic = [random.randint(90,110) for _ in range(N)]

patients = list(zip(patient_id, systolic, diastolic))

with open('blood_pressure.csv', 'w', newline = '') as writefile:
    writer = csv.DictWriter(writefile, fieldnames = header)  # fieldnames are the column names in the csv file
    
    # Create the header.
    writer.writeheader()
    
    # Fill in the rows by header value.
    for row in patients:
        writer.writerow({key: value for key,value in zip(header, row)})

The basic objects for reading `.csv` files in Python are the `csv.reader()` and `csv.DictReader()` objects. The basic objects for writing to `.csv` in Python are the `csv.writer()` and `csv.DictWriter()` objects. The methods for these objects are outlined as follows:

|Object|Method|Description|
|---|---|---|
|`csv.reader()`|`reader.__next__()`|Returns the next row in the file as a list of strings (usually called as `next(reader)`)|
|&nbsp;|`reader.line_num`|The number of lines read from the file so far|
|`csv.DictReader()`|`dictreader.__next__()`|Returns the next row in the file as a dictionary keyed by column heading|
|&nbsp;|`dictreader.fieldnames`|The column headings, taken from the first row of the file unless given explicitly|
|`csv.writer()`|`writer.writerow(<row>)`|Writes `<row>` to the file|
|&nbsp;|`writer.writerows(<iterable>)`|Writes each row in `<iterable>` (e.g. a list of lists or tuples) to the file|
|`csv.DictWriter()`|`dictwriter.writeheader()`|Writes a row containing the column headings given by the `fieldnames` argument|
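
Since the examples above only show `DictWriter`, here is a short sketch of the reading side with `DictReader`. It writes two made-up patient rows to a scratch file and reads them back, showing `fieldnames`, `next()`, and `line_num` in action.

```python
import csv
import os
import tempfile

header = ['patient_id', 'systolic', 'diastolic']

# Two hypothetical patients, for illustration only.
rows = [{'patient_id': '00110', 'systolic': 150, 'diastolic': 95},
        {'patient_id': '00111', 'systolic': 142, 'diastolic': 101}]

# Scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'bp_demo.csv')

with open(path, 'w', newline='') as writefile:
    writer = csv.DictWriter(writefile, fieldnames=header)
    writer.writeheader()
    writer.writerows(rows)

with open(path, newline='') as readfile:
    reader = csv.DictReader(readfile)
    columns = reader.fieldnames        # Reads the header row.
    first = next(reader)               # First data row, as a dictionary.
    lines_so_far = reader.line_num     # Header line plus one data line.
    remaining = list(reader)           # Whatever rows are left.

print(columns)
print(first)
print(lines_so_far, len(remaining))
```

As with `csv.reader()`, the values come back as strings, so `first['systolic']` is `'150'` rather than `150`.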

# Summary: File I/O

* Python can read and write data to `.txt` and `.csv` files with no external packages.
* The `with` statement automatically closes a file opened for reading or writing at the end of the `with` block.
* The `open()` function opens a file for reading or writing.
* Files are read into Python from `.txt` files as strings.
* The basic objects for reading `.csv` files are `csv.reader()` and `csv.DictReader()`.
* The basic objects for writing to `.csv` files are `csv.writer()` and `csv.DictWriter()`.

### Exercise 8

Go to [Project Gutenberg](https://www.gutenberg.org/) and download a public domain book in the `.txt` format. Read your book into Python and save it as a list of individual strings. 

### Exercise 9

For this exercise, write a program that creates username/password combinations for 20 users. The passwords should be randomized strings made from upper/lower case characters and numbers with at least 8 characters in them. Write your username/password combinations to a `.csv` file.

Once the `.csv` file is created, read in the username/password data as a dictionary.

In [None]:
import string
import random
import csv

N = 20

letters = string.ascii_letters
numbers = string.digits

characters = letters + numbers



### Exercise 10

Write a program that creates simulated data for a hypothetical study. The study is to be on correlations between income and house price for 300 home owners. Your data headings will be home owner identifier numbers, income (random integers between \\$30,000 and \\$150,000 incrementing by 1000), and house price (random integers between \\$200,000 and \\$1,000,000). Save your data as a dictionary and then write it to a `.csv` file called `house_prices.csv`.