# Python - Files

## So you say you have some data...

We are going to go over the basics of reading a text-based file. I will assume that the file contains a number n on the first line, followed by n pairs of numbers, each pair on its own line.

Example

```
4
1 2.0
2 4.0
3 8.0
4 16.0
```

### Step 1 - create a reference to the file
To open the file for reading, you need to know the name of the file and specify "r" (for read) as the mode of use for the file.  If you try to open a file that does not exist, or you give an incorrect path, Python raises a FileNotFoundError.
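
As a rough sketch of what a failed open looks like, the snippet below tries to open a file that should not exist and catches the resulting error. The file name and the `opened` flag are made up for this illustration.

```python
# Hypothetical file name, chosen because it should not exist
missing_name = "no_such_file_for_this_example.txt"

try:
    missing_file = open(missing_name, "r")
except FileNotFoundError as error:
    # open() raises FileNotFoundError when the path does not exist
    opened = False
    print("Could not open:", error.filename)
else:
    # Only reached if the file really was there
    opened = True
    missing_file.close()
```

Catching the error is optional. Without the try block the program simply stops with a traceback, which is often fine while you are experimenting.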

In [43]:
# The name of the file is data.txt and it should be a plain text file
my_data_file = open("data.txt", "r")
print(my_data_file)

<_io.TextIOWrapper name='data.txt' mode='r' encoding='UTF-8'>


### Step 2 - read data from the file line by line
readline() gets a whole line of data as a string, including the newline at the end.
You can then use split() to break the line into individual strings, all stored in a list.  (By default split() breaks on any run of whitespace, but you can change that by passing a separator.)

Once you have a string for each piece of data, you can use int() and float() to convert to a numerical type.  
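
Here is a small sketch of those two steps on a single line, without touching a file. The string stands in for what readline() might return, and the comma-separated variant is only there to show a non-default separator.

```python
# Stand-in for what readline() would return for the third pair
line = "3 8.0\n"

# No argument: split on any run of whitespace (the newline is dropped too)
parts = line.split()
print(parts)

# Convert each string to the numerical type we want
first = int(parts[0])
second = float(parts[1])
print(first, second)

# Passing a separator changes what split breaks on
csv_parts = "3,8.0".split(",")
print(csv_parts)
```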

In [44]:
my_data_file = open("data.txt", "r")

# The result is going to be a list of pairs 
# Start it empty
result = []

token = my_data_file.readline()
#Convert the string to an integer value
count = int(token)
print(count)

for i in range(0,count) :
    # read each line
    line = my_data_file.readline()
    
    # split on space into a list of strings
    string_data = line.split()
    
    # get each part of the pair and add to the list
    first = int(string_data[0])
    second = float(string_data[1])
    result.append( (first,second) )
    
print(result)
    

4
[(1, 2.0), (2, 4.0), (3, 8.0), (4, 16.0)]


### Step 3 - close the file
You should always close your files after you are done with them.  Calling close() guarantees that any buffers are flushed and the file is available for use by others.

In [39]:
my_data_file.close()

## Writing to a file
It's the same three-step process.  
1. Open the file for writing ("w") or appending ("a").
2. Use the write() method to write to the file.
3. Close the file.  I cannot stress enough how important it is that you close the file. If you don't, Martians will invade and steal all of our martinis. 

Write requires that you give it a string.  If you want to write a line, you need to add a newline.
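
Since write() only accepts strings, numbers have to be converted first. The sketch below is one way to do that for a list of pairs. The `pairs` list and the scratch file name are assumptions for this example, and it writes to a temporary directory so it does not overwrite data.txt.

```python
import os
import tempfile

# Hypothetical data to write, in the same format as the example file
pairs = [(1, 2.0), (2, 4.0), (3, 8.0), (4, 16.0)]

# Hypothetical scratch file in a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "pairs_example.txt")

file_out = open(out_path, "w")
# The count goes first, converted to a string, with its own newline
file_out.write(str(len(pairs)) + "\n")
for first, second in pairs:
    # An f-string converts both numbers and adds the newline
    file_out.write(f"{first} {second}\n")
file_out.close()

# Read the file back to check what was written
check = open(out_path, "r")
contents = check.read()
check.close()
print(contents)
```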

In [40]:
file_out = open("data.txt", "w")

# Write the first value on its own line
file_out.write("4\n")

# Write the pair 
file_out.write("1")
file_out.write(" 2.0\n")

# Write the next pair
file_out.write("2 4.0\n")

# Write two lines worth
file_out.write("3 8.0\n4 16.0\n")

# Close it up
file_out.close()


## Reading until you hit the end of the file.

Let's say that we have a list of words. There can be multiple words on a line and they are all separated by whitespace (spaces or tabs).  

Example

```
hay hey
go              chess   checkers
the previous line used tabs

that was an empty line
a e i o u y

10 20 30
still strings
```

In [41]:
my_data_file = open("words.txt", "r")

# The result is going to be a list of words 
# Start it empty
result = []

line = my_data_file.readline()
# We are clever here... readline() returns an empty string only at the
# end of the file, and an empty string is treated as false. A blank line
# in the file still comes back as "\n", so it is treated as true.
while line :
    # split on white space into a list of strings
    string_data = line.split()
    
    # add each one to the result
    for word in string_data :
        result.append(word)
    # get the next line
    line = my_data_file.readline()

# close the file now that every line has been read
my_data_file.close()

print(result)

['hay', 'hey', 'go', 'chess', 'checkers', 'the', 'previous', 'line', 'used', 'tabs', 'that', 'was', 'an', 'empty', 'line', 'a', 'e', 'i', 'o', 'u', 'y', '10', '20', '30', 'still', 'strings']
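
To see why the while loop does not stop at the blank lines, here is a minimal sketch that uses io.StringIO as a stand-in for a real file. The sample text is made up for this illustration.

```python
import io

# io.StringIO behaves like an open text file; the content is made up
fake_file = io.StringIO("hay hey\n\nlast line")

lines = []
line = fake_file.readline()
while line:
    # A blank line arrives as "\n", which is non-empty and so true
    lines.append(line)
    line = fake_file.readline()

print(lines)

# At the end of the file readline() returns the empty string
at_end = fake_file.readline()
print(repr(at_end))
```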


## An alternative
We can use a for loop to get every line in the file as well.  It isn't as obvious that it is processing the file line by line, but it reads more cleanly than the while loop.

In [45]:
my_data_file = open("words.txt", "r")

# The result is going to be a list of words 
# Start it empty
result = []

# Iterate using a for loop
for line in my_data_file :
    # add each word to the result
    for word in line.split() :
        result.append(word)
my_data_file.close()
    
print(result)

['hay', 'hey', 'go', 'chess', 'checkers', 'the', 'previous', 'line', 'used', 'tabs', 'that', 'was', 'an', 'empty', 'line', 'a', 'e', 'i', 'o', 'u', 'y', '10', '20', '30', 'still', 'strings']


### Using with
If you are going to read or write your file all at once, there is a convenient way to do so:
```python
    with open(file_name) as file_reference :
        #code to process the file
```
The big advantage is that once you leave the with block, the file is guaranteed to be closed.  This will happen even if an error occurs.
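
The sketch below checks that claim, assuming a scratch file in a temporary directory (the path and its contents are made up). The second with block fails partway through on purpose, and the file is still closed afterwards.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "with_example.txt")

with open(path, "w") as file_out:
    file_out.write("not a number\n")

try:
    with open(path, "r") as file_in:
        # int() raises ValueError here, so the block exits early
        value = int(file_in.readline())
except ValueError:
    print("conversion failed")

# Both files were closed on the way out of their with blocks
print(file_out.closed, file_in.closed)
```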

In [2]:
# The result is going to be a list of words 
# Start it empty
result = []

with open("words.txt", "r") as my_data_file :
    for line in my_data_file :
        # add each word to the result
        for word in line.split() :
            result.append(word)
    
print(result)

['hay', 'hey', 'go', 'chess', 'checkers', 'the', 'previous', 'line', 'used', 'tabs', 'that', 'was', 'an', 'empty', 'line', 'a', 'e', 'i', 'o', 'u', 'y', '10', '20', '30', 'still', 'strings']
