# Input and output (I/O)

Now we are equipped with all the tools to work with data. 
We can store it in variables, make calculations and perform logical operations. 
There is just one thing missing... the data! 
Rather than typing it manually into a notebook cell, we will need to collect the data from an external source. 

There are many data sources out there: we could connect to an API to request some data, query a large database, or simply load data from a local file. 
Whatever its origin, the data from these sources ends up stored in Python data structures such as lists, dictionaries and tuples.

## reading files

You will first open a local file. 
Opening a file with Python is very straightforward. 
You will use the `open()` function to get the data into the Python session. 
The `open()` function takes the path to the file as well as the access mode (the `mode` argument) in which we want to open it. 
In this case we use `r` for the mode as we want to read the data. 
Look up the Python documentation for the other access modes.
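
As a rough illustration of how the most common modes differ, the sketch below writes to, appends to, and then reads back a small file. The file name `modes_demo.txt` and the temporary directory are assumptions made for this example only, and the file is closed by hand with `.close()` after each step.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file in a temporary directory (not part of the course data).
demo_path = Path(tempfile.mkdtemp()) / 'modes_demo.txt'

f = open(demo_path, 'w')   # 'w' creates the file, or empties it if it already exists
f.write('first line\n')
f.close()

f = open(demo_path, 'a')   # 'a' appends to the end of the existing contents
f.write('second line\n')
f.close()

f = open(demo_path, 'r')   # 'r' opens the file for reading (this is also the default)
contents = f.read()
f.close()

contents
```

Opening the same file in `w` mode again would discard both lines, which is why `a` is the safer choice when existing contents should be kept.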

It is good practice to use the `with` statement when opening files, as this automatically closes the file as soon as the indented block ends. 
Otherwise you would have to remember this and do it manually. 
The `with` statement is used together with the complementary keyword `as`. 
This stores the opened file object in a variable. 
In this case we use the letter `f`. 
Now we enter an indented block of code where we perform the operation on the opened file. 
In this case we use the `.read()` method.

In [None]:
with open('data/poem.txt', 'r') as f:
    read_data = f.read()
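
To check that `with` really closes the file, you can inspect the `closed` attribute of the file object inside and outside the block. This is a minimal sketch: the temporary `poem.txt` and its two lines are hypothetical stand-ins so that the example runs without the course data.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for data/poem.txt so the example runs anywhere.
poem_path = Path(tempfile.mkdtemp()) / 'poem.txt'
poem_path.write_text('Roses are red\nViolets are blue\n')

with open(poem_path, 'r') as f:
    inside = f.closed      # False: the file is still open inside the block
    text = f.read()

outside = f.closed         # True: leaving the block closed the file
print(inside, outside)
```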

Now we can access the text that was stored in the `read_data` variable.

In [None]:
read_data

You will notice that `read_data` is a single string. The whole file was read as one chunk of text, and the notebook displays the newlines as literal `\n` characters rather than as line breaks. Let's use the `.readlines()` method instead.

In [None]:
with open('data/poem.txt', 'r') as f:
    read_data = f.readlines()
read_data
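
The two methods can be compared side by side on a small file. In this sketch the two-line poem and its temporary location are assumptions for illustration; it also shows one common way to drop the trailing `\n` that `.readlines()` keeps on each line.

```python
import tempfile
from pathlib import Path

# Hypothetical two-line file standing in for data/poem.txt.
poem_path = Path(tempfile.mkdtemp()) / 'poem.txt'
poem_path.write_text('Roses are red\nViolets are blue\n')

with open(poem_path, 'r') as f:
    as_string = f.read()          # one str holding the whole file

with open(poem_path, 'r') as f:
    as_lines = f.readlines()      # list of str, one per line, each keeping its '\n'

# Remove the trailing newline from every line.
clean_lines = [line.rstrip('\n') for line in as_lines]

print(type(as_string).__name__, type(as_lines).__name__)
print(clean_lines)
```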

This time you have split the input into list elements per line in the file. You could now perform an iteration on this list. For example, you could count the number of occurrences of the word `and`.

In [None]:
count_of_and = 0
for line in read_data:
    if 'and' in line:
        count_of_and += 1

count_of_and

The result is 11. Notice, however, that this code has a drawback: it counts the lines that contain `and`, not the number of occurrences! It also matches `and` inside longer words such as `hand`.
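
One possible way around this drawback is to count matches per line, either with `str.count()` for raw substrings or with a regular expression for whole words. The sample lines below are made up for this sketch and are not taken from the poem.

```python
import re

# Hypothetical sample lines standing in for the contents of read_data.
sample_lines = [
    'Sand and sea and sky\n',
    'And the band played on\n',
    'Nothing to count here\n',
]

# Lines containing the substring 'and' (what the loop above measures).
lines_with_and = sum(1 for line in sample_lines if 'and' in line)

# Every occurrence of the substring, including inside 'Sand' and 'band'.
substring_count = sum(line.count('and') for line in sample_lines)

# Whole-word matches only, ignoring case: \b marks a word boundary.
word_pattern = re.compile(r'\band\b', re.IGNORECASE)
word_count = sum(len(word_pattern.findall(line)) for line in sample_lines)

print(lines_with_and, substring_count, word_count)   # prints: 2 4 3
```

Which of the three numbers is the right one depends on the question being asked, so it is worth deciding up front whether case and word boundaries matter.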

## writing files

Let us now write data to a file. As before, we first need to `open()` a file. This time, however, we open it in the `w` access mode, for write. Again we use `with` to `open()` the file `as` `f`. In the indented block of code, we can then use the `.write()` method on the file object `f` to write the data into the file.

For example, we might want to write only the lines that start with an `F`.

In [None]:
with open('data/short_poem.txt', 'w') as f:
    for line in read_data:
        if line[0] == 'F':
            f.write(line)

Now go and check your folder to see whether you wrote the correct output to the file.
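
You can also check the result from Python by opening the new file in read mode. The sketch below is self-contained, so the three lines and the temporary output path are hypothetical; `startswith()` is used here as an alternative to `line[0] == 'F'`.

```python
import tempfile
from pathlib import Path

# Hypothetical lines and output location, used so the example is self-contained.
lines = ['Four seasons\n', 'Summer first\n', 'Fall follows\n']
out_path = Path(tempfile.mkdtemp()) / 'short_poem.txt'

with open(out_path, 'w') as f:
    for line in lines:
        if line.startswith('F'):   # like line[0] == 'F', but also safe for empty strings
            f.write(line)

# Re-open the file in read mode to confirm what was written.
with open(out_path, 'r') as f:
    written = f.readlines()

written
```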

## CSV

Files can contain text, but also data. For example, you can export a spreadsheet to a CSV file. In this case, each row is presented as a line in the file, with each column separated by a comma (CSV stands for Comma Separated Values).

You can load CSV files into your Python program. To do that, you first need to import a module from the standard library. You will learn more about importing later.

In [None]:
from csv import reader

With the reader imported, you can load the data simply:

In [None]:
with open('data/tips.csv', 'r', newline='') as input_file:  # newline='' is recommended by the csv module
    tips_reader = reader(input_file)
    meals = list(tips_reader)
meals[:5]

This method for loading the data gives raw results: a list of lists. The first line contains the labels (the column headers of the spreadsheet). Ideally, you would like to use the labels when loading the data.
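
With the plain `reader`, the header row has to be separated from the data by hand, and every field arrives as a string. The sketch below shows one way to do this. The column names and values are hypothetical and may differ from the real `data/tips.csv`, and `StringIO` is used in place of a file so that the example runs on its own.

```python
from csv import reader
from io import StringIO

# Hypothetical CSV content; the real columns of data/tips.csv may differ.
csv_text = 'total_bill,tip,day\n16.99,1.01,Sun\n10.34,1.66,Sun\n21.01,3.50,Sat\n'

rows = list(reader(StringIO(csv_text)))
header, records = rows[0], rows[1:]   # first row holds the column labels

# Every field is read as a string, so numbers must be converted explicitly.
tip_index = header.index('tip')
tips = [float(record[tip_index]) for record in records]

print(header)
print(sum(tips) / len(tips))
```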

The reader `DictReader` provides that functionality:

In [None]:
from csv import DictReader
with open('data/tips.csv', 'r', newline='') as input_file:  # newline='' is recommended by the csv module
    tips_reader = DictReader(input_file)
    meals = list(tips_reader)
meals[:3]

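Note that each row is returned as a dictionary that maps the column labels to the values of that row. In Python 3.6 and 3.7 these rows are `OrderedDict` objects, a special type of dictionary that preserves the order of its keys and can be imported from `collections`. Since Python 3.8 they are regular `dict` objects, which also preserve insertion order.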
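
Because each row is a dictionary, values can be looked up by column label instead of by position. The following sketch sums the tips per day. The columns and values are again hypothetical, and `StringIO` stands in for the real file.

```python
from csv import DictReader
from io import StringIO

# Hypothetical CSV content; the real columns of data/tips.csv may differ.
csv_text = 'total_bill,tip,day\n16.99,1.01,Sun\n10.34,1.66,Sun\n21.01,3.50,Sat\n'

meals_demo = list(DictReader(StringIO(csv_text)))

# Each row maps a column label to its (string) value.
first_day = meals_demo[0]['day']

# Total tip per day; values are converted from str to float first.
tips_per_day = {}
for meal in meals_demo:
    day = meal['day']
    tips_per_day[day] = tips_per_day.get(day, 0.0) + float(meal['tip'])

print(first_day)
print(tips_per_day)
```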