# 07 Python and Text Files
File(s) needed: numbers.txt, integers.txt, fastfood.txt


We can use different types of files to read and/or write data.

Files allow for two kinds of data access.
- Direct access, where you can go directly to the piece of data you want, as in a database file where we use a primary key to get the data of interest.
- Sequential access, where you have to read through the file line by line until you get to the data you want.

We will start with text files, which are sequential. Later in the semester we will work with some of the tools available in Python to use other types of files.

There are three basic steps to using a file in a program.
- Open the file
- Process the file
- Close the file

They may seem obvious, but we often forget about them, especially closing the file.
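
As a quick illustration of the three steps, here is a minimal sketch. The file name `steps_demo.txt` is made up for this example. Every file object has a `closed` attribute, which is a handy way to check whether you remembered the last step.

```python
# 'steps_demo.txt' is a hypothetical file name used only for this sketch.
demo_file = open('steps_demo.txt', 'w')    # 1. open the file
demo_file.write('hello\n')                 # 2. process the file (here: write one line)
print(demo_file.closed)                    # still open at this point
demo_file.close()                          # 3. close the file
print(demo_file.closed)                    # now it is closed
```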

We open files by creating a **file object**, giving it the name of the file of interest, and using its built-in **methods** to access the contents.

The general way we open a file is:

```
file_object_name = open(filename, mode)
```

The mode can be any one of the following:
- r – read only. No changes are allowed.
- w – open the file for writing. The contents of an existing file are erased. If the file does not exist it is created.
- a – open for writing but the data is appended to the existing data. If the file does not exist it is created.
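
To see the difference between the modes, the sketch below writes to the same file three times. The file name `modes_demo.txt` is an assumption for this example, and `read()` (which returns the whole file as one string) is used only to check the results.

```python
# 'modes_demo.txt' is a hypothetical file name used only for this sketch.
f = open('modes_demo.txt', 'w')     # creates the file (or erases an existing one)
f.write('first\n')
f.close()

f = open('modes_demo.txt', 'a')     # keeps the existing data and adds to the end
f.write('second\n')
f.close()

f = open('modes_demo.txt', 'r')     # read only
after_append = f.read()             # read() returns the whole file as one string
f.close()

f = open('modes_demo.txt', 'w')     # opening with 'w' again erases everything
f.write('third\n')
f.close()

f = open('modes_demo.txt', 'r')
after_rewrite = f.read()
f.close()

print(repr(after_append))           # both earlier lines are there
print(repr(after_rewrite))          # only the last line is left
```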

This code assumes the file is located in the same directory as the code. The `filename` can also include a path. On Windows, where paths contain backslashes, put the letter `r` in front of the opening quote. This r tells Python the string is a raw string, so any backslashes in the filename are treated as backslashes and not as the start of an escape character. It is different from the 'r' used for read-only mode.

Examples:
- open or create a file for the storage of tax calculation results
```
output_file = open('tax_results.txt', 'w')
```

- open a file in a different directory for read only access
```
input_file = open(r'C:\Users\TSwift\temp\lyrics.txt', 'r')
```
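
The effect of the raw-string `r` described above can be seen by comparing two versions of the same string. This is a small sketch only; the path is made up and no file is opened.

```python
# The path below is hypothetical; nothing is opened in this sketch.
plain = 'C:\temp\notes.txt'      # \t becomes a tab and \n becomes a newline
raw = r'C:\temp\notes.txt'       # backslashes are kept exactly as typed

print(len(plain))                # shorter, because each escape code is one character
print(len(raw))
print(raw)
```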

## Writing to a file
This is pretty straightforward.
- Open the file.
- Use the `write` method to write the data to the file.
- Close the file using the `close` method.

Let's try it.

In [None]:
# Example: write to a file
output_file = open('grail.txt', 'w')
output_file.write('Sir Robin\n')             # why the \n?
output_file.write('6527.189\n')

# add the code to write a few more lines of string data to the file




To write a numeric value to the file it has to be converted to a string first using the `str` function. Don't forget about the newline code.

In [None]:
sell_price = 8.24
output_file.write(str(sell_price) + '\n')

In [None]:
# Run this cell when we're done with the file
output_file.close()

## Appending data to a file
Use the same basic process as writing. The only things that change are that we use the 'a' mode when we open the file and the data is tacked onto the end of what is already there.

In [None]:
# Example: appending to existing file
# Make sure grail.txt is closed after writing to it. Then open it in Jupyter to see what is currently in it.
out_file = open('grail.txt', 'a')

# add some new data to be appended
out_file.write('King Arthur\n')
out_file.write('Sir Gawain\n')
out_file.write(str(sell_price * 10) + '\n')

# add some more lines to append some data.


out_file.close()
# look at what is in the file now.

## Reading data from a file
We use the `readline` method to read the data one line at a time.

In [None]:
# Example: reading the first three lines of grail.txt
in_file = open('grail.txt', 'r')
line1 = in_file.readline()
line2 = in_file.readline()
line3 = in_file.readline()
in_file.close()
print(line1)
print(line2)
print(line3)


Why all the blank lines? Each line read from the file still ends with its newline code, and `print` adds a newline of its own, so every value is followed by an empty line. The '\n' in the text file separates items in the input file, but we usually don't need it once the data has been read. We can use a Python string method called **rstrip** (which is short for 'right strip') to get rid of it.

Add the following code to the previous example before the print statements and run the cell again:
```
line1 = line1.rstrip('\n')
line2 = line2.rstrip('\n')
line3 = line3.rstrip('\n')
```
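
Here is a small stand-alone sketch of what `rstrip` does. It uses a string typed in by hand in place of a line read from a file, and `repr()` is used so the newline code is visible when printed.

```python
line = 'Sir Robin\n'            # what readline() returns for a typical line
cleaned = line.rstrip('\n')     # remove the newline from the right-hand end

print(repr(line))               # repr() shows the \n instead of acting on it
print(repr(cleaned))
```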


### Reading numeric data
Remember how the `input()` function only returned strings so we had to use conversion functions to get numbers from numeric strings? The same applies for `readline`. Use `int()` or `float()` to convert the numeric data into the type you need. One nice thing is that we don't have to worry about the newline ('\n') codes in this situation, because `int()` and `float()` ignore any whitespace at the beginning or end of the string.
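
A quick sketch of that behavior, using strings typed in by hand to stand in for lines returned by `readline`:

```python
text = '6527.189\n'      # a numeric line as readline() would return it
value = float(text)      # float() ignores the trailing newline
count = int('42\n')      # int() does the same

print(value, count)
```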

In [None]:
# Example: read the first three values from numbers.txt
infile = open('numbers.txt', 'r')
num1 = float(infile.readline())
num2 = float(infile.readline())
num3 = float(infile.readline())
infile.close()                      # close the file once the data has been read
total = num1 + num2 + num3
print(num1, num2, num3)
print(total)


## Loops and file access
You may have already realized that loops are very helpful when reading or writing data to files. The files we have been working with hold just a few lines of data. What if there were thousands of data points we needed to read or write?

### Output to a file with a loop
The general procedure is to open the file, loop through the data and write each piece to the file, close the file. It's a pretty straightforward process so we'll come back to that in a later example.
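
In the meantime, here is a minimal sketch of that procedure. The file name `loop_demo.txt` and the values written are made up for this example, and `read()` is used at the end only to check what was written.

```python
# 'loop_demo.txt' is a hypothetical file name used only for this sketch.
out_file = open('loop_demo.txt', 'w')

for number in range(1, 6):
    out_file.write(str(number) + '\n')    # one value per line

out_file.close()

# read the file back to check what was written
in_file = open('loop_demo.txt', 'r')
contents = in_file.read()
in_file.close()
print(contents)
```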

### Reading to the end of a file with a loop
We have talked about two different looping techniques, the `WHILE` and `FOR` loops. Both can be used when reading data. The `FOR` loop is easier to use, but the structure used with the `WHILE` loop is used in other programming languages so you should be familiar with it in case you see it in the future.

#### WHILE loop
The idea here is to loop through the file until the `readline()` method returns an empty string. It requires what we call a priming read. A **priming read** is a reading of the first line of data before we start the loop. If there is nothing in the data file, the loop will never run. 

The next example shows how this works.

In [None]:
# Example: reading a file with the WHILE loop

# open file for reading


# read the first line from the file – the priming read


# Continue reading from the file until an empty string is returned.





#close the file
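
For comparison, here is one possible sketch of the priming-read pattern. It creates its own small file, `while_demo.txt` (a made-up name), so it does not depend on any of the course data files.

```python
# Create a small file so the sketch can run on its own (hypothetical name).
out_file = open('while_demo.txt', 'w')
out_file.write('10\n')
out_file.write('20\n')
out_file.write('30\n')
out_file.close()

in_file = open('while_demo.txt', 'r')
total = 0

line = in_file.readline()          # the priming read
while line != '':                  # an empty string means the end of the file
    total += int(line)
    line = in_file.readline()      # read the next line before looping again

in_file.close()
print('The total is:', total)
```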



#### Using the FOR loop
The `FOR` loop is easier to use because of three characteristics:
1. It doesn't require a priming read.
2. It automatically reads the next line each time the loop repeats.
3. It automatically stops at the end of the file. 

The next example uses the same file as the previous example to show the difference.

In [4]:
# Example: reading a file with the FOR loop

# open file for reading
infile = open('numbers.txt', 'r')

# start the running total at zero
total = 0

# Continue processing the file until the end of the file is reached
for number in infile:
    total += float(number)

print("The sum is:", total)    # display the total

# close the file
infile.close()



## Programming Exercise
Use the file **integers.txt** as input for a program to read all of the values stored in the file and report the count, the total, and the average (to 3 decimal places) of all the numbers. There is no user input for this program. Make sure the data file and the code are in the same folder. Your output should look something like this:
```
There are 68 values in the file.
The total of all values is 1296, for an average of 19.059 per value.
```

In [2]:
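infile = open('integers.txt', 'r')
total = 0
count = 0

for line in infile:
    num = int(line)
    total += num
    count += 1        # count the values instead of hard-coding how many there are

infile.close()

print('There are', count, 'values in the file.')
print('The total of all values is ' + str(total) + ', for an average of ' + format(total / count, '.3f') + ' per value.')

There are 68 values in the file.
The total of all values is 1296, for an average of 19.059 per value.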


## Processing data records in text files
Up to this point we have been practicing with single data points, each on its own line in the text file. Most data you will need to work with consists of multiple **fields** that together make **records**. We can still work with this data in text files, as long as we know what the data looks like. We can have each field of a record appear on its own line in the file, so that a fixed number of lines (fields) makes up a record. If we have _x_ fields in a record, then every _x_ lines in the text file starts a new record.

Let's look at an example. Open the file **fastfood.txt** in Jupyter. You can see it looks like there is a cyclical pattern there. This is what each field is:
```
item name
selling price
cost
category
```

We can use that information (and a WHILE loop) to read the data one field at a time inside one record at a time.

In [7]:
# Example: reading data records

# open file for reading
infile = open('fastfood.txt','r')

# read the name of the first item
line = infile.readline()
item = line.rstrip('\n')    # strip off the newline from the line read

# Continue processing the file until
# an empty string is returned.

while line != '':
    line = infile.readline()
    sell_price = float(line)
    
    line = infile.readline()
    item_cost = float(line)
    
    line = infile.readline()
    category = line.rstrip('\n')
    
    print(item, sell_price, item_cost, category)

    line = infile.readline()
    item = line.rstrip('\n')
    
       
#close the file
infile.close()


taco 1.49 0.53 food
soda-medium 2.19 0.22 drink
soda-large 3.29 0.31 drink
soda-small 1.79 0.19 drink
chili 3.89 1.41 food
hot dog 2.29 1.02 food
french fries 2.49 0.68 side order


### Modifying records
Text files are sequential files, so we have to read and write the entire file when we want to make any changes. To do this we use a technique that is similar to swapping values between two variables. Think about that for a second. How do you swap values between two variables? How do two people switch chairs?
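
One common way to swap two variables uses a third, temporary variable, much like one person standing up so the other can change chairs. The sketch below shows the idea; the variable names and values are made up.

```python
a = 'coffee'
b = 'tea'

temp = a      # park the first value in a temporary spot
a = b         # move the second value into the first variable
b = temp      # move the parked value into the second variable

print(a, b)
```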

We will be using another Python module here, `os`. This module provides us with tools for interacting with the computer's operating system.
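
The two `os` functions we need are `os.remove`, which deletes a file, and `os.rename`, which changes a file's name. Here is a minimal sketch using two throwaway files. The file names are made up for this example, and `os.path.exists` is used only to check the result.

```python
import os

# Both file names are hypothetical and exist only for this sketch.
old_file = open('os_demo_old.txt', 'w')
old_file.write('old data\n')
old_file.close()

new_file = open('os_demo_new.txt', 'w')
new_file.write('new data\n')
new_file.close()

os.remove('os_demo_old.txt')                       # delete the original file
os.rename('os_demo_new.txt', 'os_demo_old.txt')    # give the new file the old name

check = open('os_demo_old.txt', 'r')
contents = check.read()
check.close()

print(contents)
print(os.path.exists('os_demo_new.txt'))           # the new name is gone after the rename
```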

In [5]:
# Example: modifying specific records in the fastfood.txt file
import os           # needed for the remove and rename file functions

# create a flag variable for the search process
found = False

# Search criteria for the desired record
search = input('Enter the name of an item: ')
new_price = float(input('Enter the new price: '))

# open original file for reading
food_file = open('fastfood.txt', 'r')

# open a temporary file for writing the changes
temp_file = open('temp.txt', 'w')

# Find the item - read the name of the first item (priming read)
item = food_file.readline().rstrip('\n')

# Process the file until an empty string is returned
while item != '':
    
    # read the rest of the record
    sell_price = float(food_file.readline())
    cost = float(food_file.readline())
    category = food_file.readline().rstrip('\n')
    
    
    
    # Either write the record to the temp file or
    # the new data if this is the desired record.
    if item == search:
        temp_file.write(item + '\n')
        temp_file.write(str(new_price) + '\n')
        temp_file.write(str(cost) + '\n')
        temp_file.write(category + '\n')
        
        found = True
        
    else:
        temp_file.write(item + '\n')
        temp_file.write(str(sell_price) + '\n')
        temp_file.write(str(cost) + '\n')
        temp_file.write(category + '\n')
    
    # read the name of the next item
    item = food_file.readline().rstrip('\n')

# close both files
food_file.close()
temp_file.close()

# Delete the original file
os.remove('fastfood.txt')

# Change the name of the temporary file to the original file name
os.rename('temp.txt', 'fastfood.txt')

# Tell the user if we found the item
if found:
    print('The file was updated!')
else: 
    print('The item could not be found, please try again.')



Enter the name of an item: hot dog
Enter the new price: 7.88

### Deleting records
We have to follow the same procedure as in the modifying records example EXCEPT we don't write the "deleted" record to the new data file.

In [3]:
# Example: deleting a specific record from the fastfood.txt file
import os           # needed for the remove and rename file functions

# create a flag variable for the search process
found = False

# Search criteria for the desired record
search = input('Enter the name of an item: ')

# open original file for reading
food_file = open('fastfood.txt', 'r')

# open a temporary file for writing the changes
temp_file = open('temp.txt', 'w')

# Find the item - read the name of the first item (priming read)
item = food_file.readline().rstrip('\n')

# Process the file until an empty string is returned
while item != '':
    
    # read the rest of the record
    sell_price = float(food_file.readline())
    cost = float(food_file.readline())
    category = food_file.readline().rstrip('\n')
    
    
    
    # Write the record to the temp file unless it is the
    # desired record. That record is skipped, which deletes it.
    if item == search:
        found = True
        
    else:
        temp_file.write(item + '\n')
        temp_file.write(str(sell_price) + '\n')
        temp_file.write(str(cost) + '\n')
        temp_file.write(category + '\n')
    
    # read the name of the next item
    item = food_file.readline().rstrip('\n')

#close both files
food_file.close()
temp_file.close()

# Delete the original file
os.remove('fastfood.txt')

# Change the name of the temporary file to the original file name
os.rename('temp.txt', 'fastfood.txt')

# Tell the user if we found the item
if found:
    print('The record was deleted!')
else: 
    print('The item could not be found, please try again.')

Enter the name of an item: hot dog
The record was deleted!


## Programming Exercises
Using the **fastfood.txt** file as data,
- write a program that allows the user to add records.
- write a program that allows the user to delete records.

##### Add a degree of difficulty?
Add a confirmation step to the delete program, like
```
Are you sure you want to delete this record? YES or NO
```

Allow the user to say "no", in which case the record is not deleted.