# File Handling

To make your programs more interesting and more functional, you'll want to be able to save data. That is customarily done with a database, but how to design and interact with databases is beyond the scope of our course.

So if we can't use a database, how can we save data? We'll write it to files. 

## The File Object and Writing/Reading

### Opening a File

To work with a file, you first have to open it. This tells Python the name of the file you're interested in. Once you have opened a file, to get data from it, you perform a _read_ operation, and to put data into the file, you perform a _write_ operation.

Opening a file is done with ... wait for it ... the `open()` function. We provide two arguments to `open()`:

* a string that names the file we want to open (for example, `'data.txt'`)
* a string that indicates what we want to do with the file; possible values for this string are:
  * `'r'` meaning we want to read from the file
  * `'w'` meaning we want to write to the file
  * `'a'` meaning we want to append to the file

For example, try the code below:

In [None]:
f = open('data.txt', 'w')

When you open a file for writing (the second argument to `open()` is `'w'`), Python creates the file if it does not already exist. If the file already exists, _opening it for writing erases what is already in the file_, even before you write anything, and whatever you write becomes the new contents. Said more colloquially, opening with `'w'` "blows away" the existing contents of the file.

What if you just want to add to the end of the file without blowing away what is already in the file? That's what append mode (`'a'`) does -- it adds everything you write to the end of the file without altering what is already there. If the file does not exist yet, append mode creates it.
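Here is a small sketch of the difference between `'w'` and `'a'`. The file name `append_demo.txt` is made up for this example, and the sketch uses the file object's `read()` method, which returns the whole file as one string, to check the results.

```python
# 'append_demo.txt' is a made-up file name used only for this sketch.
f = open('append_demo.txt', 'w')    # 'w' creates the file (or empties it)
f.write('first\n')
f.close()

f = open('append_demo.txt', 'a')    # 'a' keeps what is there and adds to the end
f.write('second\n')
f.close()

f = open('append_demo.txt', 'r')
after_append = f.read()             # read() returns the whole file as one string
f.close()
print(after_append)                 # both lines are there

f = open('append_demo.txt', 'w')    # opening with 'w' again empties the file
f.write('third\n')
f.close()

f = open('append_demo.txt', 'r')
after_rewrite = f.read()
f.close()
print(after_rewrite)                # only 'third' is left
```

After the append, the file holds both lines. After the second `'w'`, only the last line written remains.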

What about the variable `f`? The `open()` function returned us some sort of Python object, and we assigned that to a variable we called `f`. So what did `open()` return?

When you open a file, the `open()` function returns a _file object_. Think of a file object as a remote control that you can use to manipulate the file. You need the file object so you can do things to the file like read from it or write to it.
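If you're curious, you can poke at the file object a little. The sketch below uses a made-up file name, `object_demo.txt`, and looks at a few attributes that file objects carry: `name`, `mode`, and `closed`.

```python
# 'object_demo.txt' is a made-up file name used only for this sketch.
f = open('object_demo.txt', 'w')

print(type(f))     # the kind of object that open() gave us
print(f.name)      # the file name we passed to open()
print(f.mode)      # the mode we passed to open()

was_closed_before = f.closed   # False: the file is still open
f.close()
was_closed_after = f.closed    # True: we closed it

print(was_closed_before, was_closed_after)
```

The point is only that `f` is an ordinary Python object with its own attributes and methods, just like a string or a list.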

## Writing Text to a File

Let's write something to our file. Be sure you have run the `f = open(...)` code cell above, then run the following code:

In [None]:
f.write('This is a line of text.\n')
f.write('This is another line of text.\n')
f.close()

In Jupyter, you should see a new file named _data.txt_ in your current folder. You can double-click it in Jupyter to open it.

Let's break down what we just did:

* Our file object is in variable `f`. When we say `f.write()`, we're asking the file object to write to the file. What will it write? The string that we pass as an argument to `write()`.
* So what does `f.close()` do? This asks the file object to close the file, which means we're done writing to the file. Closing also makes sure that everything you wrote has actually been saved to the file. Always `close()` a file object when you are finished with it, but only then: you do not close the file after each write, just once after all the writing you intend to do. You cannot write to a file after you `close()` it; to do so, you'd have to `open()` it again.
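To see what "you cannot write to a file after you close it" looks like in practice, here is a small sketch. The file name `close_demo.txt` is made up for the example. Writing to a closed file object raises a `ValueError`, which the sketch catches so that the cell keeps running.

```python
# 'close_demo.txt' is a made-up file name used only for this sketch.
f = open('close_demo.txt', 'w')
f.write('first line\n')
f.close()

error_message = None
try:
    f.write('second line\n')     # the file is already closed
except ValueError as err:
    error_message = str(err)
    print('Could not write:', error_message)

# To write again, open the file again (here in append mode).
f = open('close_demo.txt', 'a')
f.write('second line\n')
f.close()
```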

Notice that the string we wrote to the file ended with a newline character, denoted by `\n`. A file is just a stream of characters; if you want to break your text into lines, you need a newline character to separate each line.
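What happens if you leave the newline out? The sketch below, using a made-up file name `newline_demo.txt`, writes two strings with no `\n` and then reads the file back with `read()`, which returns the whole file as one string.

```python
# 'newline_demo.txt' is a made-up file name used only for this sketch.
f = open('newline_demo.txt', 'w')
f.write('one')      # no \n at the end
f.write('two')      # picks up right where the last write stopped
f.close()

f = open('newline_demo.txt', 'r')
contents = f.read()
f.close()

print(contents)     # onetwo
```

The two writes run together on one line, because `write()` puts exactly the characters you give it into the file and nothing more.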

## Reading From a File

Python has multiple ways to read from a file. The easiest is to loop through the file one line at a time. For example, try this code:

In [None]:
f = open('data.txt', 'r')

for line in f:
    print(line)
    
f.close()

In [None]:
f = open('data.txt', 'r')

for line in f:   # line will be a string
    for char in line:
        print(char)
    print('*** end of line ***')
    
f.close()

In [None]:
otherfile = open('somedata.txt', 'w')
otherfile.write('65')
otherfile.close()

Although the file object, `f`, is not a Python list, you _can_ use it in a `for` loop. Each time through the loop, you get the next line from the file as a string. (Python uses the file object to keep track of where it is in the file.)

If you ran the first loop above (the one that calls `print(line)`), you might have noticed that each line of output had an extra blank line after it. This is because, each time through the loop, the variable `line` contains one line of text from the file, and _every line of text ends with a newline character_ (`'\n'`). The one possible exception is the last line, if the file doesn't end with a newline.

For example, the first time through the loop, here is the value of `line`:

`This is a line of text.\n`

The newline character that marks the end of the line is considered part of the line by Python. When we print this variable, it contains a newline (`'\n'`), and then the `print()` function adds its own newline, so we get two newlines in a row. That second newline is what shows up as a blank line after each line of text.
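One way to see the hidden newline for yourself is to print `repr(line)`, which shows the string the way you would type it in code. The sketch below uses a made-up file name, `repr_demo.txt`, and also shows `print(line, end='')`, which tells `print()` not to add a newline of its own.

```python
# 'repr_demo.txt' is a made-up file name used only for this sketch.
f = open('repr_demo.txt', 'w')
f.write('This is a line of text.\n')
f.write('This is another line of text.\n')
f.close()

f = open('repr_demo.txt', 'r')
for line in f:
    print(repr(line))      # shows the \n at the end of the string
    print(line, end='')    # end='' stops print() from adding its own newline
f.close()
```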

Each time the `for` loop gives us a line of text from the file, if we don't want that newline character at the end of the string, can we get rid of it? Yes, easily. Here's how:

In [None]:
f = open('data.txt', 'r')

for line in f:
    line = line.rstrip('\n')
    print(line)
    
f.close()

The `line` variable is a string, so we can use the string method `rstrip` to strip any newlines from the end of the string. Note the argument to `rstrip` -- it is a string. It tells `rstrip` to look for any of the characters in this string at the end of the string it is operating on (`line` in this case). If it finds any of them, it should remove them.

A string cannot be modified, so `rstrip` returns a new string, which is why we assign the result back to `line`. The 'r' in `rstrip` stands for 'right,' meaning to strip the given characters from the 'right' side of the string.
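You can try `rstrip` on its own, without a file, to see how it behaves. The strings below are made up for this sketch.

```python
line = 'This is a line of text.\n'
stripped = line.rstrip('\n')

print(repr(line))        # the original string still has its newline
print(repr(stripped))    # the new string does not

# rstrip removes every matching character at the right end, not just one.
many = 'hello\n\n\n'.rstrip('\n')
print(repr(many))        # 'hello'

# The argument is a set of characters to remove, in any order.
punct = 'hello!?!'.rstrip('!?')
print(repr(punct))       # 'hello'

# Characters on the left side are left alone.
left = '\nhello\n'.rstrip('\n')
print(repr(left))        # '\nhello'
```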

Try the code below:

In [None]:
f = open('data.txt', 'r')

for line in f:
    line = line.rstrip('\n')
    print(line)

for line in f:
    line = line.rstrip('\n')
    print(line)

f.close()

The first `for` loop prints every line in the file. The second `for` loop tries to do the same thing, but nothing comes out. Why?

As you loop through the file object, `f`, Python is keeping track of where it is in the file. The first `for` loop brings Python all the way to the end of the file. When the second `for` loop runs, Python is already at the end of the file. There are no lines left, so the loop ends immediately.
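So how do you read the file a second time? One simple approach is to close the file and open it again, which starts you back at the beginning. The sketch below uses a made-up file name, `reread_demo.txt`, and counts the lines seen by each loop.

```python
# 'reread_demo.txt' is a made-up file name used only for this sketch.
f = open('reread_demo.txt', 'w')
f.write('line one\n')
f.write('line two\n')
f.close()

f = open('reread_demo.txt', 'r')

first_pass = 0
for line in f:
    first_pass += 1      # reads both lines

second_pass = 0
for line in f:
    second_pass += 1     # never runs: we are already at the end

f.close()

f = open('reread_demo.txt', 'r')   # reopening starts at the beginning again

third_pass = 0
for line in f:
    third_pass += 1      # reads both lines again

f.close()

print(first_pass, second_pass, third_pass)   # 2 0 2
```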