# Files

Reading from and writing to files is undeniably useful, especially when performing any kind of data analysis.

This notebook covers what you need to know to read from and write to files.

## File Handling

As in any other context, to read from or write to a file you must first open it, and when you are done, you should close it.

When you open a file, you get a **file handle**, which is a reference to the file that you can use to read input from it or write output to it.

### `open()` and `close()`

The `open()` function opens a file and returns a handle for that file. The handle can then be used for reading or writing (depending on the mode, see below) from/to that file.

Once finished with a file, it should be closed.

```python
file_handle = open("path/to/file/filename")
# ...
file_handle.close()
```
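
A common way to make sure a file is always closed, even if an error occurs partway through, is Python's `with` statement. The sketch below assumes a throwaway file in a temporary directory; the name `example.txt` is purely illustrative and is not one of this notebook's files.

```python
import os
import tempfile

# Create a throwaway location so the example is self-contained.
# (The file name "example.txt" is illustrative only.)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.txt")

# The file is closed automatically when the `with` block ends.
with open(path, "w") as file_handle:
    file_handle.write("hello\n")

with open(path) as file_handle:
    contents = file_handle.read()

print(repr(contents))      # 'hello\n'
print(file_handle.closed)  # True
```

The rest of this notebook calls `close()` explicitly so that each step stays visible; either style works.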

### File Mode

The `open()` function can also take in an optional second parameter, the mode to open it in.

Modes:

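- `"r"`: **read**, fails if the file doesn't exist
- `"a"`: **append**, creates the file if it doesn't exist
- `"w"`: **write**, creates the file if it doesn't exist and overwrites it if it does
- `"x"`: **create**, fails if the file already exists
- `"t"`: **text file**
- `"b"`: **binary file**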

The `"t"` and `"b"` modes are combined with one of the others, for example `"rb"`. By default, the mode is set to `"rt"`.

**Examples:**

In [13]:
f = open("files/hello.txt", "rt") # equivalent to default mode
f.close()

f = open("files/file.txt", "w") # write to text file, text is default. This mode
f.close()                        # will overwrite any contents of the file.

f = open("files/hello.txt", "a") # append to text file. This mode will write to the
f.close()                        # end of the file, NOT overwriting any contents

f = open("files/hello.bin", "rb") # read from binary file.
f.close()                        
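
The examples above only open and close files. As a rough sketch of how the mode changes behavior, the block below works in a temporary directory (the file name `demo.txt` is an assumption made for illustration) and shows that `"x"` refuses to replace an existing file, while `"rt"` and `"rb"` return different types.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched.
# (The file name "demo.txt" is illustrative only.)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.txt")

# "x" creates a new file and raises an error if it already exists.
f = open(path, "x")
f.write("abc")
f.close()

try:
    open(path, "x")
    x_failed = False
except FileExistsError:
    x_failed = True

# Text mode returns a str; binary mode returns bytes.
f = open(path, "rt")
as_text = f.read()
f.close()

f = open(path, "rb")
as_bytes = f.read()
f.close()

print(x_failed)        # True
print(type(as_text))   # <class 'str'>
print(type(as_bytes))  # <class 'bytes'>
```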

## Reading from files, three ways

### `read()`

`read()` returns the entire contents of the file as a single string. If the file is large, this can be slow and memory intensive.

In [None]:
f = open("files/hello.txt") # opens in "rt" mode
print(f.read())
f.close()

### `readlines()`

`readlines()` returns a list of the lines in the file. Again, if the file is large, this can be slow and memory intensive.

In [None]:
f = open("files/hello.txt") # opens in "rt" mode
print(f.readlines())
f.close()

### Iterating over the file

Both of the above are easy, but they aren't efficient for large files.

A more efficient way is to iterate over the file, reading in one line at a time.

A basic `for` loop can iterate over a file handle, yielding one line per iteration.

In [None]:
f = open("files/hello.txt") # opens in "rt" mode
for line in f:
    print(line)
f.close()

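You'll notice that when you read from a file, the newline character at the end of each line is kept. Because `print()` adds a newline of its own, the output above is double-spaced.

You can remove the trailing newline using `rstrip()`, which strips all trailing whitespace from a string.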

In [None]:
f = open("files/hello.txt") # opens in "rt" mode
for line in f:
    print(line.rstrip())
f.close()

## Writing to files

Writing to a file is trivially easy. The `write()` method of a file handle takes a string and writes it to the file.

In [None]:
f = open("files/output.txt", "w") # opens in "w" (write) mode
f.write("Replace this string with your own content!")
f.write("Does write add new line characters to the end of each line?")
f.close()

f = open("files/output.txt") # opens in "rt" mode
print(f.readlines())
f.close()

The most important thing to know about writing to files is that in **write** `"w"` mode, the contents of the file will be overwritten. If you want to add to the end of a file, the **append** `"a"` mode should be used.
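
The difference between the two modes can be sketched as follows. The example assumes a scratch file in a temporary directory (the name `notes.txt` is hypothetical) so that none of the notebook's own files are modified.

```python
import os
import tempfile

# Scratch location; "notes.txt" is an illustrative name only.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "notes.txt")

f = open(path, "w")  # creates the file
f.write("first\n")
f.close()

f = open(path, "a")  # append: existing contents are kept
f.write("second\n")
f.close()

f = open(path)
after_append = f.read()
f.close()

f = open(path, "w")  # write: existing contents are discarded
f.write("third\n")
f.close()

f = open(path)
after_write = f.read()
f.close()

print(repr(after_append))  # 'first\nsecond\n'
print(repr(after_write))   # 'third\n'
```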

### Append to `hello.txt`

Append a fourth line to `hello.txt`.

### Read in your updated `hello.txt`

Read in and print every line of your updated `hello.txt`.

### Read in a protein

In the `files` directory is a file `AT1G58602.pdb`. This is a text file containing the predicted protein structure for a "probable disease resistant protein" from the plant "ARABIDOPSIS THALIANA". This structure was predicted by AlphaFold.

Read in and print out every line of this file.

### View `AT1G58602.png`

In the `figures` directory is an image of `AT1G58602`. Of course, you can easily open this file on your computer, but we can also have Python do it for us!

We haven't covered how to open and view images, but one of the most used skills of every programmer is searching for and learning how to do something new. Learn how to open and view `AT1G58602.png` using Python, then do so.