# Prerequisites:
- Variables
- Iterables(?)
- Loops

# Learning Outcomes:
- Open files using Python's built-in functions and extract their contents to variables
- Use the CSV module to read data from CSV files

# **Reading Files**

A common use of Python in chemistry is analysing large amounts of data.
This data is often gathered during an experiment and stored across several files, and Python has built-in functions for reading (and writing) files.
In this section, we will explore how to read different types of files, including text files and CSV files, using Python's built-in capabilities.

Let's start by opening a simple text file and reading its contents:

In [None]:
file = open('molecule.txt', 'r')
contents = file.read()
file.close()
print(contents)

After running the cell above, you should see the contents of the `molecule.txt` file in the cell output. 
If you don't see the output, make sure that the file is in the same directory as this notebook. 
You can also verify the output by checking the file's contents in a text editor.

The first line of the code cell above opens the file `molecule.txt` using the `open()` function and assigns the resulting file *object* to a variable we have called `file`.
The `open()` function takes at least one argument: either the file name (if the file is in the current working directory) or the full path to the file.
It can also take a second argument to specify the mode in which the file is opened (e.g., `'r'` for reading, `'w'` for writing, etc.).
If you don't specify a mode, the file is opened in read mode by default.
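
As a small sketch of the mode argument, the cell below first creates a file in write mode and then opens it twice: once with an explicit `'r'` and once with no mode at all. The file name `mode_demo.txt` and its contents are made up for illustration, and the `tempfile` and `os` modules are used only so that the example runs anywhere without touching your own files.

```python
import os
import tempfile

# Work in a temporary folder so the sketch does not touch your own files.
# (The file name and contents are made up for illustration.)
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'mode_demo.txt')

# 'w' opens the file for writing, creating it if it does not exist
file = open(path, 'w')
file.write('H2O\n')
file.close()

# 'r' opens the file for reading
file = open(path, 'r')
explicit_mode = file.mode
file.close()

# Leaving the mode out gives the same result: read mode is the default
file = open(path)
default_mode = file.mode
contents = file.read()
file.close()

print(explicit_mode, default_mode)
print(contents)
```

Both opened files should report the same mode, which is why the second argument is often left out when a file is only being read.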

The second line of the code cell reads the entire contents of the file using the `read()` method of the file object and stores it in a variable called `contents`. 

The third line closes the file using the `close()` method, which is considered good practice.
A file that is left open can lead to various issues (e.g., file access errors).

Finally, the last line prints the contents of the `contents` variable.

## Reading Files with `with`
We can also use the `with` statement to open files, which will automatically close the file for us when we are done with it.
This is a more "Pythonic" way to handle files and is generally recommended.

Let's take a look at the same example using the `with` statement:

In [None]:
with open('molecule.txt', 'r') as file:
    contents = file.read()

print(contents)

As before, we open the `molecule.txt` file and read its contents.
The difference is that we use the `with` statement to open the file, which automatically closes it when we are done with it (i.e., when we exit the `with` block).
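
One way to see the automatic closing for yourself is to check the `closed` attribute of the file object inside and outside the `with` block. This is a minimal sketch: the file name `with_demo.txt` and its contents are made up, and the file is created in a temporary folder only so that the example runs anywhere.

```python
import os
import tempfile

# Create a small throwaway file so the example runs anywhere
# (the name and contents are made up for illustration)
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')
with open(path, 'w') as file:
    file.write('CH4\n')

with open(path, 'r') as file:
    contents = file.read()
    closed_inside = file.closed   # the file is still open inside the block

closed_outside = file.closed      # the file is closed once the block ends

print(closed_inside)
print(closed_outside)
```

The first value printed should be `False` and the second `True`, even though `close()` was never called explicitly.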

We now have a way to read files in Python, and use their contents as *variables* in our code.

## Reading CSV Files
CSV (Comma Separated Values) files are a common format for storing tabular data, such as data from experiments or simulations.
Each line in a CSV file represents a row of data, and each value in the row is separated by a comma (you can easily verify this by opening up a CSV file in a text editor).
Python has a built-in module called `csv` that makes it easy to read (and write) CSV files.
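
To make the format concrete, here is a minimal sketch that treats a single line of CSV text as an ordinary string and splits it at the commas. The line itself is made up for illustration and is not taken from any particular file.

```python
# One made-up line of CSV text, as it would appear in a text editor
line = 'Hydrogen,H,1'

# Splitting at each comma gives the individual values as strings
values = line.split(',')

print(values)
print(len(values))
```

Splitting by hand works for simple lines like this one, but it breaks down when a value itself contains a comma, which is one reason to prefer the `csv` module.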

Let's take a look at how to read a CSV file using the `csv` module:

In [None]:
import csv

with open('elements.csv') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        print(row)

Here, we first import the built-in `csv` module to allow us to easily parse CSV files.

Next, we open the `elements.csv` file using the `with` statement as we have seen before.
Note that we have not specified a mode here, so the file is opened in read mode by default.

The `csv.reader()` function takes the file object as an argument and returns a CSV reader object that can be used to *iterate* over the rows in the CSV file.

Finally, we use a `for` loop to iterate over the rows in the CSV file and print the contents of each row.
The `csv_reader` object gives us each row as a list of strings, making it easy to work with the data.
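
Because every value comes back as a string, numbers need converting before they can be used in calculations. The sketch below illustrates this with a tiny stand-in file called `elements_demo.csv`. The file name and its three lines are assumptions made for illustration, and the real `elements.csv` may be laid out differently.

```python
import csv
import os
import tempfile

# Write a tiny made-up CSV file so the example runs anywhere.
# The column layout (name, symbol, atomic number) is an assumption.
path = os.path.join(tempfile.mkdtemp(), 'elements_demo.csv')
with open(path, 'w') as file:
    file.write('Name,Symbol,Atomic Number\n')
    file.write('Hydrogen,H,1\n')
    file.write('Helium,He,2\n')

with open(path) as file:
    csv_reader = csv.reader(file)
    rows = list(csv_reader)   # collect every row into one list

header = rows[0]
first_element = rows[1]
print(header)
print(first_element)

# Every value is read as a string, so convert before doing arithmetic
atomic_number = int(first_element[2])
print(atomic_number + 1)
```

Converting the reader to a list is convenient for small files. For very large files, looping over the reader row by row (as in the cell above) avoids holding everything in memory at once.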

## Exercises

### Manipulate data
Use f-strings to print the contents of the `elements.csv` file in a more readable format.
Don't forget about the header row!

Example answer (skipping the header entirely):
```python
import csv

with open('elements.csv') as csvfile:
    csv_reader = csv.reader(csvfile)
    next(csv_reader)  # Skip the header row
    for row in csv_reader:
        print(f"Name: {row[0]}, Symbol: {row[1]}, Atomic Number: {row[2]}")
```

### Using the file path
Try to open a file that is not in the same directory as this notebook and print its contents.

TODO: Example answer
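
One possible answer, as a sketch only: the folder `data` and the file `molecule_demo.txt` below are hypothetical and are created by the cell itself so that it runs anywhere. In practice you would skip the setup and point `filepath` at a file that already exists on your computer. `os.path.join()` is used here to build the path, so the correct separator is used on any operating system.

```python
import os
import tempfile

# Set up a made-up folder and file so the example runs anywhere.
# In practice, you would point at a file that already exists.
folder = os.path.join(tempfile.mkdtemp(), 'data')
os.mkdir(folder)
filepath = os.path.join(folder, 'molecule_demo.txt')
with open(filepath, 'w') as file:
    file.write('C6H6\n')

# Pass the full path, rather than just the file name, to open()
with open(filepath) as file:
    contents = file.read()

print(filepath)
print(contents)
```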

### Loop through multiple files
TODO: Task involving looping through multiple files with a predictable filename (e.g. `001.csv`) and reading their contents.

TODO: Example answer
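
Until the task above is written, here is a rough sketch of the general idea. It assumes three files named `001.csv`, `002.csv` and `003.csv`, each with a header row; the column names and values are made up, and the files are created by the cell itself in a temporary folder so that it runs anywhere.

```python
import csv
import os
import tempfile

folder = tempfile.mkdtemp()

# Create three made-up files named 001.csv, 002.csv and 003.csv
for number in range(1, 4):
    filename = os.path.join(folder, f'{number:03d}.csv')
    with open(filename, 'w') as file:
        file.write('Time,Absorbance\n')
        file.write(f'0,{number / 10}\n')

# Loop over the same predictable names and read each file
all_rows = []
for number in range(1, 4):
    filename = os.path.join(folder, f'{number:03d}.csv')
    with open(filename) as file:
        csv_reader = csv.reader(file)
        next(csv_reader)  # skip the header row
        for row in csv_reader:
            all_rows.append(row)

print(all_rows)
```

The format specifier `:03d` pads the number with leading zeros to three digits, which is what turns `1` into `001`.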

## Debugging
The code below contains a bug and will not run.
See if you can fix it by reading the error message and using the information it provides.

In [None]:
with open('molecule.csv', 'r') as file:
    text = file.read()

print(text)

## TODO
- Discuss carriage returns and other special characters?
- Explain the distinction between text and binary files?