# Standard Library - File Input/Output
Let's take a look at basic file input and output. Python is a good choice for processing structured text files, including space-delimited, tab-delimited, and comma-delimited (CSV) files. CSV is what we focus on here.

Most of the time when we deal with CSV data, we'd rather use the much more capable `pandas` package, which will be taught at a later time. Consider these instructions to be the 'generic' way to process any text file in the absence of a domain-specific tool.

## Reading a file
We'll start by opening a CSV file of Pokemon facts (you can follow along if you'd like by downloading this file from the repo). The second argument to `open`, `'r'`, requests read-only mode. An open file object acts like an **iterator**, so we can loop over it and work with one line at a time. For now, let's just print each line. We also `close` the file when we're done.

In [1]:
file = open('pokemon.csv', 'r') # open a file in read mode
for line in file:
    print(line.strip()) # strip the trailing newline before we print
file.close()

pokedex_number,name,type1,type2,hp,attack,defense,sp_attack,sp_defense,speed,base_total
1,Bulbasaur,grass,poison,45,49,49,65,65,45,318
2,Ivysaur,grass,poison,60,62,63,80,80,60,405
3,Venusaur,grass,poison,80,100,123,122,120,80,625
4,Charmander,fire,,39,52,43,60,50,65,309
5,Charmeleon,fire,,58,64,58,80,65,80,405
6,Charizard,fire,flying,78,104,78,159,115,100,634
7,Squirtle,water,,44,48,65,50,64,43,314
8,Wartortle,water,,59,63,80,65,80,58,405
9,Blastoise,water,,79,103,120,135,115,78,630
10,Caterpie,bug,,45,30,35,20,20,45,195
11,Metapod,bug,,50,20,55,25,25,30,205
12,Butterfree,bug,flying,60,45,50,90,80,70,395
13,Weedle,bug,poison,40,35,30,20,20,50,195
14,Kakuna,bug,poison,45,25,50,25,25,35,205
15,Beedrill,bug,poison,65,150,40,15,80,145,495
16,Pidgey,normal,flying,40,45,40,35,35,56,251
17,Pidgeotto,normal,flying,63,60,55,50,50,71,349
18,Pidgeot,normal,flying,83,80,80,135,80,121,579
19,Rattata,normal,dark,30,56,35,25,35,72,253
20,Raticate,normal,dark,75,71,70,40,80,77,413
21,Spearow,normal,fl
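
Because the file object is an iterator, it can only be walked through once unless we rewind it. The sketch below illustrates this with a small stand-in file rather than `pokemon.csv`; the `sample.csv` name and its shortened columns are assumptions made for the example.

```python
import os
import tempfile

# Write a tiny stand-in file so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")  # hypothetical file name
out = open(path, "w")
out.write("pokedex_number,name\n1,Bulbasaur\n2,Ivysaur\n")
out.close()

file = open(path, "r")
first_pass = [line.strip() for line in file]   # consumes every line
second_pass = [line.strip() for line in file]  # iterator is exhausted: empty list
file.seek(0)                                   # rewind to the start of the file
third_pass = [line.strip() for line in file]   # all lines are available again
file.close()

print(len(first_pass), len(second_pass), len(third_pass))  # 3 0 3
```

If a second loop over a file seems to do nothing, an exhausted iterator is a likely cause.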

Let's do some basic processing on each line. Since the fields on each line are separated by commas, we call the `split` string method with a comma as the delimiter. This returns a list of the tokens in the line. We know from looking at the header row that the first token is the number, the second token is the name, and so on, so we can select a specific field by using its index (counting from zero).
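
To make the behaviour of `split` concrete, here is a small sketch using two rows from the file as plain strings. The variable names are only for illustration. It shows two details that are easy to miss: the last token keeps the line's trailing newline unless we strip it first, and a missing second type shows up as an empty string.

```python
# Two rows from pokemon.csv, with the trailing newline that iterating
# over a file leaves on each line.
line_two_types = "1,Bulbasaur,grass,poison,45,49,49,65,65,45,318\n"
line_one_type = "4,Charmander,fire,,39,52,43,60,50,65,309\n"

tokens_raw = line_two_types.split(",")
print(tokens_raw[1], tokens_raw[2])   # Bulbasaur grass
print(repr(tokens_raw[-1]))           # '318\n'  (newline stays on the last token)

tokens_clean = line_one_type.strip().split(",")
print(repr(tokens_clean[3]))          # ''  (missing type2 becomes an empty string)
print(repr(tokens_clean[-1]))         # '309'  (strip() removed the newline first)
```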

It's sometimes desirable to skip the header row. We do this by calling `next`, which advances the iterator by one line.
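
As a rough illustration of what `next` does, the sketch below uses `io.StringIO` as an in-memory stand-in for an open file (an assumption made to keep the example self-contained). It also shows the optional default argument, which avoids a `StopIteration` error when the file turns out to be empty.

```python
import io

# io.StringIO stands in for an open file here; it is iterated the same way.
file = io.StringIO("name,type1\nBulbasaur,grass\nCharmander,fire\n")
header = next(file)                          # consumes and returns the first line
remaining = [line.strip() for line in file]  # the loop picks up at line two
print(header.strip())                        # name,type1
print(remaining)                             # ['Bulbasaur,grass', 'Charmander,fire']

# On an empty file, next() raises StopIteration unless a default is given.
empty = io.StringIO("")
first_line = next(empty, None)
print(first_line)                            # None
```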

We also introduce the syntax for a **context manager**, which is a handy way to make sure that the file closes itself once we leave the `with` block.

In [2]:
with open('pokemon.csv', 'r') as file: # open file in a context manager
    next(file) # skip the header row
    for line in file:
        tokens = line.split(',')
        print(f'My name is {tokens[1]} and my type is {tokens[2]}')

My name is Bulbasaur and my type is grass
My name is Ivysaur and my type is grass
My name is Venusaur and my type is grass
My name is Charmander and my type is fire
My name is Charmeleon and my type is fire
My name is Charizard and my type is fire
My name is Squirtle and my type is water
My name is Wartortle and my type is water
My name is Blastoise and my type is water
My name is Caterpie and my type is bug
My name is Metapod and my type is bug
My name is Butterfree and my type is bug
My name is Weedle and my type is bug
My name is Kakuna and my type is bug
My name is Beedrill and my type is bug
My name is Pidgey and my type is normal
My name is Pidgeotto and my type is normal
My name is Pidgeot and my type is normal
My name is Rattata and my type is normal
My name is Raticate and my type is normal
My name is Spearow and my type is normal
My name is Fearow and my type is normal
My name is Ekans and my type is poison
My name is Arbok and my type is poison
My name is Pikachu and my type
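
To see what the context manager buys us, it can help to compare it with the manual version. The sketch below is one way to write the equivalent `try`/`finally` code; the sample file and its name are made up so the example runs without `pokemon.csv`.

```python
import os
import tempfile

# Hypothetical sample file so the sketch does not depend on pokemon.csv.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w") as f:
    f.write("name,type1\nBulbasaur,grass\n")

# Manual version: the finally clause closes the file even if reading fails.
file = open(path, "r")
try:
    manual_lines = file.readlines()
finally:
    file.close()

# Context-manager version: the same guarantee with less code.
with open(path, "r") as file:
    managed_lines = file.readlines()

print(file.closed)                    # True
print(manual_lines == managed_lines)  # True
```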

## Writing to a file

Now let's look at writing some information to a file. Let's say we want to produce a short list of only the strongest Pokemon, which we define here as those with a base stat total above 500. We'll use the context manager to open two files at once and make sure that they're both closed. Each output line holds a name and a base total separated by a space. After running this snippet, check your working directory and you should see that the file `pokemon_out.csv` has been created. Check it in a text editor to make sure it has the contents you expect.

In [3]:
with open('pokemon.csv', 'r') as infile, open('pokemon_out.csv', 'w') as outfile:
    next(infile) # skip the header row

    for line in infile:
        tokens = line.strip().split(',') # strip the newline so the last token is clean
        if int(tokens[10]) > 500:
            outfile.write(f'{tokens[1]} {tokens[10]}\n') # write() does not add a newline for us

```
$ head pokemon_out.csv
Venusaur 625
Charizard 634
Blastoise 630
Pidgeot 579
Nidoqueen 505
Nidoking 505
Ninetales 505
Arcanine 555
Poliwrath 510
Alakazam 600
```
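
Two details of writing are worth a quick experiment: `write` never adds a newline on its own, so every record needs one (either carried over from the input line or added explicitly), and mode `'w'` starts the file from scratch while `'a'` appends. The sketch below uses a throwaway file whose name is invented for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo_out.txt")  # hypothetical output file

with open(path, "w") as outfile:
    outfile.write("Venusaur 625")      # no newline is added for us
    outfile.write("Charizard 634\n")   # so these two end up on one line

with open(path, "a") as outfile:       # 'a' appends instead of truncating
    outfile.write("Blastoise 630\n")

with open(path, "r") as infile:
    contents = infile.read()
print(repr(contents))  # 'Venusaur 625Charizard 634\nBlastoise 630\n'
```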

## There's got to be a better way! (`csv` library)
Keeping track of token indices is a bit cumbersome, isn't it? It would be easy to forget which index you're looking for and start writing the wrong data. We can take advantage of the fact that a CSV file is structured and tells us the field names in its header row. By using the `csv` module from the standard library, we can turn each row into a dict keyed by the field names, which is much easier to work with. Note that every value in the dict is still a string.

In [4]:
import csv
with open('pokemon.csv', 'r', newline='') as csvfile: # newline='' is what the csv docs recommend
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row)

{'pokedex_number': '1', 'name': 'Bulbasaur', 'type1': 'grass', 'type2': 'poison', 'hp': '45', 'attack': '49', 'defense': '49', 'sp_attack': '65', 'sp_defense': '65', 'speed': '45', 'base_total': '318'}
{'pokedex_number': '2', 'name': 'Ivysaur', 'type1': 'grass', 'type2': 'poison', 'hp': '60', 'attack': '62', 'defense': '63', 'sp_attack': '80', 'sp_defense': '80', 'speed': '60', 'base_total': '405'}
{'pokedex_number': '3', 'name': 'Venusaur', 'type1': 'grass', 'type2': 'poison', 'hp': '80', 'attack': '100', 'defense': '123', 'sp_attack': '122', 'sp_defense': '120', 'speed': '80', 'base_total': '625'}
{'pokedex_number': '4', 'name': 'Charmander', 'type1': 'fire', 'type2': '', 'hp': '39', 'attack': '52', 'defense': '43', 'sp_attack': '60', 'sp_defense': '50', 'speed': '65', 'base_total': '309'}
{'pokedex_number': '5', 'name': 'Charmeleon', 'type1': 'fire', 'type2': '', 'hp': '58', 'attack': '64', 'defense': '58', 'sp_attack': '80', 'sp_defense': '65', 'speed': '80', 'base_total': '405'}
{
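
Since every value comes back as a string, numeric fields need converting before we compare or add them. The following sketch shows this with a cut-down, in-memory version of the data (fewer columns and only three rows, chosen for illustration); `reader.fieldnames` is the list of column names taken from the header row.

```python
import csv
import io

# A cut-down stand-in for pokemon.csv, held in memory to stay self-contained.
sample = io.StringIO(
    "pokedex_number,name,type1,type2,base_total\n"
    "1,Bulbasaur,grass,poison,318\n"
    "4,Charmander,fire,,309\n"
    "6,Charizard,fire,flying,634\n"
)

reader = csv.DictReader(sample)
fieldnames = reader.fieldnames            # column names read from the header row
rows = list(reader)
totals = [int(row["base_total"]) for row in rows]  # convert before doing arithmetic

print(fieldnames)
print(max(totals))              # 634
print(repr(rows[1]["type2"]))   # ''  (a missing value is an empty string)
```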

Let's repeat our previous exercises with this improved CSV functionality.

In [5]:
with open('pokemon.csv', 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(f'My name is {row["name"]} and my type is {row["type1"]}')

My name is Bulbasaur and my type is grass
My name is Ivysaur and my type is grass
My name is Venusaur and my type is grass
My name is Charmander and my type is fire
My name is Charmeleon and my type is fire
My name is Charizard and my type is fire
My name is Squirtle and my type is water
My name is Wartortle and my type is water
My name is Blastoise and my type is water
My name is Caterpie and my type is bug
My name is Metapod and my type is bug
My name is Butterfree and my type is bug
My name is Weedle and my type is bug
My name is Kakuna and my type is bug
My name is Beedrill and my type is bug
My name is Pidgey and my type is normal
My name is Pidgeotto and my type is normal
My name is Pidgeot and my type is normal
My name is Rattata and my type is normal
My name is Raticate and my type is normal
My name is Spearow and my type is normal
My name is Fearow and my type is normal
My name is Ekans and my type is poison
My name is Arbok and my type is poison
My name is Pikachu and my type
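
Named fields also make it easy to handle the optional second type. As a small sketch (with two in-memory rows standing in for the real file), we can mention `type2` only when it is present:

```python
import csv
import io

# Two rows standing in for pokemon.csv, reduced to the columns we need.
sample = io.StringIO(
    "name,type1,type2\n"
    "Bulbasaur,grass,poison\n"
    "Charmander,fire,\n"
)

descriptions = []
for row in csv.DictReader(sample):
    types = row["type1"]
    if row["type2"]:  # an empty string is falsy, so missing types are skipped
        types += f' and {row["type2"]}'
    descriptions.append(f'My name is {row["name"]} and my type is {types}')

print("\n".join(descriptions))
```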

Using `csv` makes it much easier to write structured output. We can specify the names of the fields we want to keep, and the library will handle formatting for us. `DictWriter` requires us to specify the desired field names. Passing `extrasaction='ignore'` tells it to drop any other keys in the row; without it, a row containing extra keys raises a `ValueError`.

In [6]:
with open('pokemon.csv', 'r', newline='') as infile, open('pokemon_out.csv', 'w', newline='') as outfile: # newline='' avoids extra blank lines on some platforms
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=['pokedex_number', 'name', 'base_total'], extrasaction='ignore')
    writer.writeheader()
    for row in reader:
        if int(row['base_total']) > 500:
            writer.writerow(row)

```
$ head pokemon_out.csv
pokedex_number,name,base_total
3,Venusaur,625
6,Charizard,634
9,Blastoise,630
18,Pidgeot,579
31,Nidoqueen,505
34,Nidoking,505
38,Ninetales,505
59,Arcanine,555
62,Poliwrath,510
```
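
To see why `extrasaction='ignore'` matters, here is a short sketch that writes one row to an in-memory buffer, first with the default setting and then with `'ignore'`. The row is a shortened version of Venusaur's entry, and `io.StringIO` is used in place of a real output file.

```python
import csv
import io

row = {"pokedex_number": "3", "name": "Venusaur", "type1": "grass", "base_total": "625"}
fields = ["pokedex_number", "name", "base_total"]

# Default behaviour (extrasaction='raise'): the unexpected 'type1' key is an error.
strict = csv.DictWriter(io.StringIO(), fieldnames=fields)
try:
    strict.writerow(row)
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# With extrasaction='ignore', keys not listed in fieldnames are silently dropped.
buffer = io.StringIO()
lenient = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
lenient.writeheader()
lenient.writerow(row)
written = buffer.getvalue().splitlines()
print(written)  # ['pokedex_number,name,base_total', '3,Venusaur,625']
```

The default is the safer choice when every key is supposed to be written, because a misspelled field name then fails loudly instead of quietly disappearing.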