## APS106 Lecture Notes
# Reading from and Writing to Files

### Lecture Structure
1. [Write to your first file!](#section1)
2. [Writing a dictionary to file](#section2)
3. [Reading Files](#section3)
4. [The with Statement](#section4)
5. [CSV Files](#section5)


 <a id='section1'></a>
## 1. Working with Files

If you are going to write programs that do something with data, you do not want to hard-code all that data or have the user enter it again and again. You need to be able to write to and read from files.

When a program is running, its data is in RAM - fast, volatile memory. Volatile means that the data disappears as soon as the program ends. Files are a way to organize data on slower, persistent media (e.g. a disk, USB, etc). This data will stay there when the program is done.

Working with files is a lot like working with a notebook. 

- A file has to be opened. 
- When you are done, it has to be closed. 
- While the file is open, it can either be read from or written to. 
- Like a bookmark, the file keeps track of where you are reading from or writing to. 
- You can read the whole file in its natural order or you can skip around. 

### Opening and Closing a File

Python has a built-in function, `open`, that takes the filename and the mode of access ("w" = write, "r" = read, "a" = append).

In [None]:
myfile = open("test.txt", "w")
#type(myfile)
#dir(myfile)

This command will open `test.txt` in the folder where the program is being executed. If `test.txt` does not exist, it will be created. If it does exist, its contents will be **overwritten!**

`myfile` is an object that keeps track of information about the file (e.g., where you are in it). If you want to write to (or read from) the file, you need to do so via the file object.

In [None]:
myfile.write("CATS!")

This command writes a string to `myfile`. It is like `print`, but it does not add a newline at the end. So:

In [None]:
myfile.write("\n")
myfile.write("I <3 my second sentence\n")  #need to add \n newline character, unlike print()

myfile.close()

In [None]:
myfile = open('test.txt','w')  # what happens to the file if you switch the mode between 'a' and 'w'?
myfile.write('hola')
myfile.close()
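
The difference between the "w" and "a" modes can be sketched as follows. This is only an illustration: `mode_demo.txt` is a hypothetical file name chosen so the demo does not touch `test.txt`, and it reads the file back with "r" mode, which is covered in detail later in these notes.

```python
# 'mode_demo.txt' is a hypothetical file name used only for this sketch
demo = open('mode_demo.txt', 'w')   # 'w' creates the file or erases old contents
demo.write('first\n')
demo.close()

demo = open('mode_demo.txt', 'a')   # 'a' keeps old contents and adds to the end
demo.write('second\n')
demo.close()

demo = open('mode_demo.txt', 'r')
after_append = demo.read()
demo.close()

demo = open('mode_demo.txt', 'w')   # 'w' again: the previous contents are erased
demo.write('third\n')
demo.close()

demo = open('mode_demo.txt', 'r')
after_write = demo.read()
demo.close()

print(repr(after_append))
print(repr(after_write))
```

After the append, the file holds both lines. After reopening with "w", only the last line written remains.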

# Breakout Session
## Find which folder (or directory) your files are being saved in
## Play around with writing to a file!
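
One possible way to find the folder is to ask the standard `os` module for the current working directory, which is where relative file names such as `test.txt` are resolved. This is a small sketch, and the variable names `cwd` and `full_path` are chosen just for illustration.

```python
import os

cwd = os.getcwd()                       # folder where relative file names are resolved
full_path = os.path.abspath('test.txt') # full path that open("test.txt", ...) would use

print(cwd)
print(full_path)
```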



 <a id='section2'></a>
## 2. Writing a dictionary to file



In [None]:
#GENERAL FORM

# create a file
write_file = open("grades.txt", "w")

# write a string to file
write_file.write('something you want to write')

# close the file    
write_file.close()

In [None]:
students = {'Kendrick': 'A+', 'Dre': 'C-', 'Snoop': 'B'} 

# create a file
myfile = open("grades.txt", "w")

# store dictionary items to the file
for student in students:
    myfile.write(student + ',' + students[student] + '\n')

# close the file    
myfile.close()

Each `write` statement writes its string wherever the previous one left off. When we are done, the file needs to be closed. This tells the file object that we are done and it should clean things up.

Now we can go to the folder where the Jupyter notebook is and observe that there is a file there called `grades.txt` containing the lines that we wrote out.
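
As a variation on the loop above, the dictionary can also be walked with `.items()`, which gives the key and the value together. The sketch below assumes a hypothetical file name, `grades_items_demo.txt`, so that it does not overwrite `grades.txt`, and it reads the file back to show that each `write` continued where the previous one stopped.

```python
students = {'Kendrick': 'A+', 'Dre': 'C-', 'Snoop': 'B'}

# 'grades_items_demo.txt' is a hypothetical file name used only for this sketch
myfile = open('grades_items_demo.txt', 'w')
for name, grade in students.items():
    # each write continues right after the previous one
    myfile.write(name + ',' + grade + '\n')
myfile.close()

myfile = open('grades_items_demo.txt', 'r')
written = myfile.read()
myfile.close()

print(written)
```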

<a id='section3'></a>
## 3. Reading Files

Now that the file exists on our disk, we can open it, this time for reading, and read all the lines in the file. This time, the mode argument is "r" for reading.

There are four common ways to read a file. 

### read

In [None]:
#declaring filename to change easily if you want to try with different files
filename = 'grades.txt'

In [None]:
# Approach: read
# When to use it: When you want to read the whole file at once and use it as a single string.
# Example code

myfile = open(filename, 'r')

contents = myfile.read() # contents is a string that contains the entire contents of the file

myfile.close()

print(contents)  #what if we don't use print and just output the variable?

### readline

In [None]:
# Approach: readline
# When to use it: When you want to process the file line-by-line
# Example code
myfile = open(filename, 'r')
line = myfile.readline()
contents = ''

''' 
line = myfile.readline() #this is showing how one might use readline one call at a time, as opposed to in a loop
print(line)
line = myfile.readline()
print(line)
'''

while line:
    contents += line 
    line = myfile.readline() # each time through the loop line contains one line of the file
myfile.close()

print(contents)
# by the end of this contents contains the entire contents of the file

### for line in file

In [None]:
# Approach: for line in file
# When to use it: When you want to process the file line-by-line
# Example code
myfile = open(filename, 'r')
contents = ''
for line in myfile: # each time through the loop line contains one line of the file
    contents += line
    print(line)  #why is there a gap between rows?
myfile.close()

print('Final contents:', contents, sep='\n')
# by the end of this contents contains the entire contents of the file

### readlines

In [None]:
# Approach: readlines
# When to use it: When you want to process the file line-by-line with an index
# Example code
myfile = open(filename, 'r')
lines = myfile.readlines() # lines is a list of strings. Each entry in lines is a line of the file
myfile.close()
print(lines)

Now let's go through the options in more depth, one at a time.

### The read approach

Read the whole file into a string. **Beware: If the file is huge, this can create problems!**

In [None]:
flanders_file = open('flanders.txt','w')
flanders_file.write('''
In Flanders Fields

In Flanders fields the poppies blow 
Between the crosses, row on row,
That mark our place; and in the sky
The larks, still bravely singing, fly
Scarce heard amid the guns below.
We are the Dead. Short days ago
We lived, felt dawn, saw sunset glow, 
Loved and were loved, and now we lie
In Flanders fields.
Take up our quarrel with the foe:
To you from failing hands we throw
The torch; be yours to hold it high.
If ye break faith with us who die
We shall not sleep, though poppies grow 
In Flanders fields.''')
flanders_file.close()

In [None]:
flanders_file = open("flanders.txt", 'r')

contents = flanders_file.read()

print('Type of contents:', type(contents))
print(contents)
flanders_file.close()

#contents #what does this show us about the file?

Q: If `contents` is a string, why does it print out across multiple lines?
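
A hint for the question above: compare what `print` shows with what `repr` shows. The sketch below uses a short stand-in string (`text`, a made-up variable for this example) rather than the whole poem.

```python
text = 'row on row,\nThat mark our place'  # short stand-in for the poem

print(text)          # the newline character shows up as a line break
shown = repr(text)   # repr spells the newline character out as \n
print(shown)
```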

### The readline approach

Read the file line-by-line into a string. This is a safer thing to do as the whole file never gets put in memory at once. Note that the file must be kept open if you still want to read the next line - unlike above where you can close the file immediately after `read()`.

In [None]:
flanders_file = open("flanders.txt", 'r')
print(flanders_file)

'''
line = flanders_file.readline()
print(line)
line = flanders_file.readline()
print(line)
line = flanders_file.readline()
print(line)
line = flanders_file.readline()
print(line)
line = flanders_file.readline()
print(line)

'''

line = flanders_file.readline()  # read the first line so the loop has something to test
while line != "":
    print(line, end='')
    line = flanders_file.readline()


flanders_file.close()
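
The loop condition `line != ""` works because `readline` returns an empty string only at the end of the file. A blank line inside the file still comes back as `"\n"`. Here is a minimal sketch of that behaviour, assuming a hypothetical file name `readline_demo.txt`.

```python
# 'readline_demo.txt' is a hypothetical file name used only for this sketch
demo = open('readline_demo.txt', 'w')
demo.write('first\n\nthird\n')   # the second line is blank
demo.close()

demo = open('readline_demo.txt', 'r')
results = []
for i in range(4):               # ask for one more line than the file has
    results.append(demo.readline())
demo.close()

print(results)
```

The second entry is `'\n'` (a blank line), while the last entry is `''` (the end of the file).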


### The for line in file approach

Like the `readline` approach, this approach also reads in the file line-by-line. It just uses a `for` loop directly over the file object.

In [None]:
flanders_file = open("flanders.txt", 'r')
for line in flanders_file:
    print(line, end="")
    
flanders_file.close()
print(type(line))
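
Each line that the loop hands back still ends with its newline character. That is why `print(line)` leaves a gap between rows and `print(line, end="")` does not. The sketch below uses a made-up `line` value to show one way of removing the trailing newline.

```python
line = 'In Flanders fields.\n'   # what one pass of the loop might hold
stripped = line.rstrip('\n')     # remove only the trailing newline

print(repr(line))
print(repr(stripped))
```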

### The readlines approach

The `readlines` approach reads the whole file in (like `read`) but rather than putting the file in one big string, it creates a list where each line of the file is an entry in the list.

In [None]:
flanders_file = open("flanders.txt", 'r')
flanders_list = flanders_file.readlines()
flanders_file.close()

print(type(flanders_list))
#print(len(flanders_list))
#print(type(flanders_list[0]))
print(flanders_list)

for line in flanders_list:
    print(line, end="")
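
Because `readlines` gives back a list, any line can be reached by its index. This is a small sketch of that idea, assuming a hypothetical file `lines_demo.txt` with three short lines.

```python
# 'lines_demo.txt' is a hypothetical file name used only for this sketch
demo = open('lines_demo.txt', 'w')
demo.write('alpha\nbeta\ngamma\n')
demo.close()

demo = open('lines_demo.txt', 'r')
lines = demo.readlines()
demo.close()

last_line = lines[-1]            # a list lets us jump straight to any line
for i in range(len(lines)):
    print(i, lines[i], end='')   # each entry keeps its newline
```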


In [None]:
filename = 'grades.txt'

students = {}
myfile = open(filename, "r")  # open the file once, for reading

# read each line of the file
for line in myfile:
    # find the comma that separates the name from the grade
    ind1 = line.find(',')
    name = line[:ind1]
    # strip() removes the trailing newline character
    grade = line[ind1+1:].strip()
    students[name] = grade

myfile.close()

print(students)
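
As an alternative to finding indices and slicing, a line can be split on the comma with the string methods `strip` and `split`. The sketch below works on a single made-up line instead of the file, and it assumes every line has exactly one comma.

```python
line = 'Kendrick,A+\n'           # one line as it would come from grades.txt

parts = line.strip().split(',')  # strip the newline, then split on the comma
name = parts[0]
grade = parts[1]

print(name, grade)
```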

<a id='section4'></a>
## 4. The with Statement

Notice that whenever we open a file, we need to be careful to close it again. Python provides a nice way to open and then automatically close a file using a `with` block.

```
with open(«filename», «mode») as «variable»:
      «body»
```

The file is opened at the beginning and **automatically closed** at the end of the body. 
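
One way to check the automatic closing is the `closed` attribute of the file object. This is a minimal sketch, assuming a hypothetical file name `with_demo.txt`.

```python
# 'with_demo.txt' is a hypothetical file name used only for this sketch
with open('with_demo.txt', 'w') as demo:
    demo.write('hello\n')
    inside = demo.closed     # False: the file is still open inside the body

outside = demo.closed        # True: the file was closed when the body ended

print(inside, outside)
```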


In [None]:
def f(file_object):
    print(file_object.read())

with open('test.txt', 'r') as file:  #test.txt is from beginning of notebook
    #f(file)
    print(file.read())
    
print("The next line")

In [None]:
with open("flanders.txt", 'r') as flanders_file:
    
    #content = flanders_file.read()
    #print('Content', content)
    
    for line in flanders_file:
        print('Line: ', line, end="")



 <a id='section5'></a>
## 5. CSV Files

The CSV format (comma separated values) is very commonly used to represent the data in a spreadsheet. 

For example a spreadsheet such as:

Name|Test1|Test2|Final
----|-----|-----|-----
Kendrick|100|50|29
Dre|76|32|33
Snoop|25|75|95

is represented as a file like this:

```
Name,Test1,Test2,Final
Kendrick,100,50,29
Dre,76,32,33
Snoop,25,75,95
```

We can, of course, access these files using the techniques above.

In [None]:
csv_file = open('grades.csv','w')
csv_file.write('''Name,Test1,Test2,Final
Kendrick,100,50,29
Dre,76,32,33
Snoop,25,75,95
''')
csv_file.close()

In [None]:
with open('grades.csv','w') as csv_file:
    csv_file.write('Name,Test1,Test2,Final\nKendrick,100,50,29\nDre,76,32,33\nSnoop,25,75,95')


In [None]:
with open('grades.csv', 'r') as file:
    for line in file:
        print(line, end='')

Notice that you have the information about each row and also the commas. If you are going to process this data, you are going to need to **parse** it. That means, for example, discarding the commas (as they just separate the data and are not otherwise meaningful) and extracting the integers from the strings.
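
Parsing one row by hand might look like the sketch below. It works on a single made-up line, and it assumes the first field is the name and every other field is a whole number.

```python
line = 'Kendrick,100,50,29\n'        # one data row from the CSV file

fields = line.strip().split(',')     # discard the newline and the commas
name = fields[0]

scores = []
for field in fields[1:]:
    scores.append(int(field))        # convert each string to an integer

print(name, scores, sum(scores))
```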

One of the great things about Python is the existence of many modules that give us the ability to easily do many things, like reading and writing CSV files. 

Reading of CSV files can be done using the CSV reader. You can construct a reader object using `csv.reader()` which takes the file object as input. The reader object can be used to iterate through the contents of the CSV file, similarly to how a file object was used to iterate through the contents in a text file.

The difference between the two is that the file method `read()` returns the entire contents of the file as one long string, whereas `csv.reader()` returns an object that can be iterated through row by row. Each row it produces is a list of strings.

Example: Read each row of a CSV file

In [None]:
import csv

with open('grades.csv', 'r') as file:
    print(file)
    grades_reader = csv.reader(file) # create csv.reader object with an open file
    print(grades_reader) #what does grades_reader look like? just some object, but it is iterable!
    row_num = 1
    for row in grades_reader:           # the csv.reader is an iterable!
        print(row)
        #print('Row #', row_num, ':', row)
        row_num += 1
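
Since every value in a row is a string, numbers still need converting before use. The sketch below collects the final exam marks into a dictionary, skipping the header row with `next`. It assumes a hypothetical file `grades_demo.csv` with the same column layout as above. Passing `newline=''` to `open` follows the recommendation in the `csv` module documentation.

```python
import csv

# 'grades_demo.csv' is a hypothetical file name used only for this sketch
with open('grades_demo.csv', 'w', newline='') as f:
    f.write('Name,Test1,Test2,Final\nKendrick,100,50,29\nDre,76,32,33\n')

finals = {}
with open('grades_demo.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)              # the first row holds the column names
    for row in reader:
        finals[row[0]] = int(row[3])   # name -> final exam mark as an integer

print(header)
print(finals)
```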

If we didn’t have a CSV file created, we could create one by:
- creating a CSV writer object
- using the `writerow()` method to populate it with data

Example: In the previous grade example there were a few marking errors on the final exam. The cell below writes the corrected grades to a new CSV file.


In [None]:
import csv

rows = [['Name', 'Test1', 'Test2', 'Final'],
        ['Kendrick', '100', '50', '69'],
        ['Dre', '76', '32', '53'],
        ['Snoop', '25', '75', '100']]

with open('grades_new.csv', 'w', newline='') as csvfile:  # newline='' is recommended by the csv docs
    print(csvfile)
    grades_writer = csv.writer(csvfile)
    print(grades_writer)

    for row in rows:
        grades_writer.writerow(row)
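
The writer object also has a `writerows()` method that writes a whole list of rows in one call. The sketch below uses it and then reads the file back to check the result. It assumes a hypothetical file name `grades_demo_out.csv` and a shortened version of the table.

```python
import csv

rows = [['Name', 'Final'],
        ['Kendrick', '69'],
        ['Dre', '53']]

# 'grades_demo_out.csv' is a hypothetical file name used only for this sketch
with open('grades_demo_out.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)       # write every row in one call

with open('grades_demo_out.csv', 'r', newline='') as f:
    read_back = list(csv.reader(f))     # collect all rows into a list

print(read_back)
```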
        

In [None]:
#WHAT TO REVIEW?


We have just scratched the surface of what the csv module can do. See the online documentation for much more detail.

<div class="alert alert-block alert-info">
<big><b>This Lecture: File Input and Output</b></big>
<ul>  
    <li>Files need to be opened and closed</li>
    <li>How to write to a file</li>
    <li>There are at least four ways to read a text file</li>
    <li>Use the csv module to process (read and write) CSV files</li>
    </ul>
</div>