This lesson is about files, and using the data within them.

Saving data in files lets us use it more than once. Let's say you are a data scientist, and you need to get some data into a format that you can use for comparisons in a future study. You'd want some way to save your data in a standard format that you and anyone else can understand.

In [166]:
#The os module tells what files are in a directory
#Let's use it first to view the directory.

In [167]:
import os
os.getcwd() #returns the current working directory as a string

'/Users/thomassullivan/projects/GitHub/codigo_curriculum/Intermediate Python/files_practice'

Let's start off with files in the same directory. First, let's see how many files are already here. 

In [168]:
file_path = os.getcwd() #let's create a path as a string
#now let's use the os.listdir() method to find the files in the directory.
os.listdir(file_path)

['expanded_colors.txt',
 'colors.txt',
 'sorted_colors.txt',
 'expanded_colors2.txt',
 'files.ipynb',
 '.ipynb_checkpoints']

The listing shows six entries. `files.ipynb` is this notebook itself, and `.ipynb_checkpoints` is a folder that Jupyter creates. The other `.txt` files are ones this lesson writes further on. The file we want to load is `colors.txt`, which is just a list of colors.

In [169]:
colorFile = open(file_path + '/colors.txt')
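
Joining the directory and the file name with `+ '/'` works on macOS and Linux. As a hedged alternative, `os.path.join()` inserts the right separator for whatever operating system the code runs on. The sketch below only builds the path and does not open anything; the name `color_path` is illustrative and not part of the lesson.

```python
import os

# Build a path to colors.txt inside the current working directory.
# os.path.join chooses the correct separator for the operating system.
file_path = os.getcwd()
color_path = os.path.join(file_path, 'colors.txt')

print(color_path)
```

The result can be passed to `open()` in place of `file_path + '/colors.txt'`.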

Let's read this file with the read method, called on the file itself.

In [170]:
colorFile.read()

'Red\nOrange\nBlue\nYellow\nGreen\nPurple'

What's wrong? We get the whole file back as one long string, with the colors separated only by `\n` newline characters.

Let's first close the file so that other programs can access it.

In [171]:
colorFile.close()

Now, let's open it again and use the `.readlines()` method, which returns a list with each line of the file as a separate string.

In [172]:
colorFile = open(file_path+ '/colors.txt')

In [173]:
colorFile.readlines()

['Red\n', 'Orange\n', 'Blue\n', 'Yellow\n', 'Green\n', 'Purple']

Notice that we still have the `\n` characters at the end of each line. Still, we can more easily work with these values when we have a list of strings, rather than a single string. 

Let's now open this file and add its contents to a list.

In [174]:
results = open(file_path + '/colors.txt')

In [175]:
results = list(results)

In [176]:
results

['Red\n', 'Orange\n', 'Blue\n', 'Yellow\n', 'Green\n', 'Purple']

Let's play with the strings to get rid of the newline characters.

In [177]:
result_string = ''.join(results)

In [178]:
result_string #now we have a single string

'Red\nOrange\nBlue\nYellow\nGreen\nPurple'

In [179]:
results = result_string.split('\n')

In [180]:
results

['Red', 'Orange', 'Blue', 'Yellow', 'Green', 'Purple']
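
The join-then-split approach works here because the last line of the file has no trailing `\n`. Two other ways to drop the newlines are sketched below, assuming the same list of lines that `.readlines()` returned earlier. The names `raw_lines`, `cleaned`, and `also_cleaned` are illustrative.

```python
raw_lines = ['Red\n', 'Orange\n', 'Blue\n', 'Yellow\n', 'Green\n', 'Purple']

# Option 1: remove the trailing newline from each line.
cleaned = [line.rstrip('\n') for line in raw_lines]

# Option 2: join into one string and let splitlines() break it apart.
also_cleaned = ''.join(raw_lines).splitlines()

print(cleaned)
print(also_cleaned)

# When the text ends with a newline, split('\n') leaves an empty string
# at the end of the list, while splitlines() does not.
ends_with_newline = 'Red\nBlue\n'
print(ends_with_newline.split('\n'))
print(ends_with_newline.splitlines())
```

If a file does end with a newline, `.splitlines()` or `.rstrip('\n')` avoids the stray empty string that `.split('\n')` would add to the list.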

Let's add some more colors to this list

In [181]:
results.append('Green')

In [182]:
results.append('Brown')

In [183]:
results.append('Pink')

In [184]:
results

['Red',
 'Orange',
 'Blue',
 'Yellow',
 'Green',
 'Purple',
 'Green',
 'Brown',
 'Pink']

We create a new file by calling the `open` function in write mode, `'w'`. If a file with that name already exists, write mode erases its contents, so be careful before using it.
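
To see the erasing behavior of write mode safely, here is a small sketch that works in a temporary folder instead of the lesson's directory. The file name `demo.txt` and the variable names are assumptions made for this illustration.

```python
import os
import shutil
import tempfile

# Work in a temporary directory so no real file is touched.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo.txt')

first = open(demo_path, 'w')
first.write('Red\nBlue\n')
first.close()

# Opening the file in 'w' mode again empties it immediately,
# even though nothing new is written.
second = open(demo_path, 'w')
second.close()

check = open(demo_path, 'r')
contents_after = check.read()
check.close()

print(repr(contents_after))

shutil.rmtree(tmp_dir)  # remove the temporary directory
```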

In [146]:
expanded_colors = open('expanded_colors.txt', 'w') #write mode
for line in results:
    expanded_colors.write(line + '\n')

In [147]:
#The colors from our list are now in the file, which is still open

In [148]:
#Let's write two more lines individually

In [190]:
expanded_colors.write('Red')

3

In [191]:
expanded_colors.write('Green')

5

In [192]:
expanded_colors.close()

In [193]:
#Now let's check to make sure that we wrote to the file properly

In [194]:
expanded_colors = open('expanded_colors.txt', 'r')

In [154]:
for line in expanded_colors:
    print(line)
expanded_colors.close()

Red

Orange

Blue

Yellow

Green

Purple

Green

Brown

Pink

RedGreen


See the problem? `Red` and `Green` ended up together on one line. The `write()` method never adds a line ending for us, so every line we write needs the newline escape character `\n` at its end.
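
The sketch below repeats the problem on a small scale in a temporary file, and also shows one alternative: `print()` with its `file` argument, which adds the newline by itself. The names `demo_path` and `demo_file` are illustrative and not part of the lesson.

```python
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo_colors.txt')

demo_file = open(demo_path, 'w')
demo_file.write('Red')          # no newline: the next write continues this line
demo_file.write('Green\n')      # the explicit newline ends the line
print('Blue', file=demo_file)   # print() adds the '\n' on its own
demo_file.close()

demo_file = open(demo_path, 'r')
written = demo_file.read()
demo_file.close()

print(repr(written))

shutil.rmtree(tmp_dir)  # remove the temporary directory
```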

In [209]:
expanded_colors = open('expanded_colors.txt', 'w') #this overwrites the existing file
for line in results:
    expanded_colors.write(line + '\n')
print(results)

['Red', 'Orange', 'Blue', 'Yellow', 'Green', 'Purple', 'Green', 'Brown', 'Pink']


In [210]:
#We can write additional lines to the file before we close it
expanded_colors.write('Indigo\n')

7

In [211]:
expanded_colors.write('Gold\n')

5

In [212]:
expanded_colors.close()

In [213]:
#Now let's make sure we have added the lines to the file correctly

In [214]:
expanded_colors = open('expanded_colors.txt', 'r') #open the file in read mode

In [215]:
for line in expanded_colors:
    print(line)
expanded_colors.close() #close the file when we are done reading

Red

Orange

Blue

Yellow

Green

Purple

Green

Brown

Pink

Indigo

Gold



Notice how each color is on its own line now.

Let's say we want to add text to a file, but keep the existing text as it is. We can do that by opening the file in append mode. 

In [216]:
expanded_colors = open('expanded_colors.txt', 'a')

In [217]:
expanded_colors.write('Orange\n')

7

In [218]:
expanded_colors.write('Blue\n')

5

In [219]:
#Let's close the file
expanded_colors.close()

Now, we open it again in read mode.

In [220]:
expanded_colors = open('expanded_colors.txt', 'r')

In [221]:
for line in expanded_colors:
    print(line)
expanded_colors.close()

Red

Orange

Blue

Yellow

Green

Purple

Green

Brown

Pink

Indigo

Gold

Orange

Blue



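Now we have each line of the file printed, but with a blank line after each one. Each line in the file already ends with `\n`, and `print()` adds a newline of its own. What if we want these lines to appear without double spacing? Let's do this with the `.strip()` method, which removes whitespace, including the newline, from both ends of a string.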

In [222]:
expanded_colors = open('expanded_colors.txt', 'r')
for line in expanded_colors:
    print(line.strip())
expanded_colors.close()

Red
Orange
Blue
Yellow
Green
Purple
Green
Brown
Pink
Indigo
Gold
Orange
Blue
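
Besides `.strip()`, another option is to tell `print()` not to add its own newline, using `end=''`. A small sketch, using an in-memory list named `lines` (an assumption for this illustration) in place of the file:

```python
lines = ['Red\n', 'Orange\n', 'Blue\n']

for line in lines:
    # Remove the stored newline; print() supplies one.
    print(line.rstrip('\n'))

for line in lines:
    # Keep the stored newline; stop print() from adding another.
    print(line, end='')
```

Both loops print the colors single spaced. Note that `.strip()` also removes spaces and tabs from both ends of the line, while `.rstrip('\n')` removes only trailing newlines.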


That's better: the lines are no longer double spaced. So let's take this opportunity to write all of the lines from our list `results` into a new file, `expanded_colors2.txt`.

In [223]:
#We already know we can do this by opening the file in write mode
results = ['Red', 'Orange', 'Blue', 'Yellow', 'Green', 'Purple', 'Green', 'Brown', 'Pink']
expanded_colors = open('expanded_colors2.txt', 'w')
for item in results:
    expanded_colors.write(item + '\n')
expanded_colors.close()

In [224]:
#Now let's open the file again and print the lines. Repeat this as many times as you need
#to feel comfortable

expanded_colors = open('expanded_colors2.txt', 'r')
for line in expanded_colors:
    print(line.strip())
expanded_colors.close()

Red
Orange
Blue
Yellow
Green
Purple
Green
Brown
Pink


In [225]:
#NOTE: You must reopen the file before looping over its lines again.
#Once a loop reaches the end of the file, there is nothing left to read.
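
The reason is that an open file keeps track of its position, and a loop over the file leaves that position at the end. The sketch below shows this with a temporary file. It also uses the file method `.seek(0)`, which this lesson does not otherwise cover, to move back to the start without reopening. All names here are illustrative.

```python
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo_colors.txt')

demo_file = open(demo_path, 'w')
demo_file.write('Red\nBlue\n')
demo_file.close()

demo_file = open(demo_path, 'r')
first_pass = [line.strip() for line in demo_file]
second_pass = [line.strip() for line in demo_file]  # already at the end: empty
demo_file.seek(0)                                   # move back to the start
third_pass = [line.strip() for line in demo_file]
demo_file.close()

print(first_pass)
print(second_pass)
print(third_pass)

shutil.rmtree(tmp_dir)  # remove the temporary directory
```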

Now that we've learned the hard way to open files, it's time to learn the easy way. The `with` statement manages access to the file automatically and closes it as soon as the indented block is finished. Let's open the file `colors.txt` and print out the lines.

In [226]:
with open('colors.txt', 'r') as in_file:
    for line in in_file:
        print(line.strip())

Red
Orange
Blue
Yellow
Green
Purple
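
To confirm that the file really is closed when the `with` block ends, we can check the file object's `closed` attribute. This is a minimal sketch using a temporary file; the names `demo_path`, `inside`, and `outside` are assumptions for the illustration.

```python
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo_colors.txt')

with open(demo_path, 'w') as out_file:
    out_file.write('Red\nBlue\n')

with open(demo_path, 'r') as in_file:
    inside = in_file.closed   # False: the file is open inside the block
    colors = [line.strip() for line in in_file]

outside = in_file.closed      # True: the file was closed when the block ended

print(inside, outside, colors)

shutil.rmtree(tmp_dir)  # remove the temporary directory
```

No call to `.close()` is needed, and the file is closed even if an error occurs inside the block.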


Now let's take the lines from the file, add them to a list, append something to that list, and then write them to a new file with the `with` statement.

In [227]:
new_colors = []
with open('colors.txt', 'r') as in_file: #use 'r' here; 'w' would erase colors.txt
    for line in in_file:
        new_colors.append(line.strip())

In [228]:
#print the new list just to be safe
print(new_colors)

['Red', 'Orange', 'Blue', 'Yellow', 'Green', 'Purple']


In [229]:
new_colors.append('Indigo')
print(new_colors)

['Red', 'Orange', 'Blue', 'Yellow', 'Green', 'Purple', 'Indigo']


In [230]:
#let's sort the list of colors alphabetically
new_colors.sort()
print(new_colors)

['Blue', 'Green', 'Indigo', 'Orange', 'Purple', 'Red', 'Yellow']


In [231]:
#now let's use the with statement with the sorted list
with open('sorted_colors.txt', 'w') as out_file:
    for line in new_colors:
        out_file.write(line + '\n')

In [232]:
#let's check the file we just wrote
with open('sorted_colors.txt', 'r') as in_file:
    for line in in_file: #loop over the file itself, not the list
        print(line.strip())

Blue
Green
Indigo
Orange
Purple
Red
Yellow


Before we conclude, let's address a problem. Whenever we open a file in write mode, we erase the file's contents. Let's look at how to check whether a file already exists before writing to it.
We can list the contents of the directory with `os.listdir()`.

In [236]:
os.listdir()

['expanded_colors.txt',
 'colors.txt',
 'sorted_colors.txt',
 'expanded_colors2.txt',
 'files.ipynb',
 '.ipynb_checkpoints']

We use the `os.path.isfile()` function to see if a file exists.

In [237]:
import os.path
if os.path.isfile('sorted_colors.txt'):
    print('file exists')

file exists
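
Putting the check to work, here is one possible sketch of a function that writes to a file only when no file with that name exists yet. The helper name `write_if_new` is hypothetical, not something provided by the `os` module, and the demonstration runs in a temporary folder.

```python
import os
import os.path
import shutil
import tempfile


def write_if_new(path, lines):
    """Write lines to path only if no file exists there.

    Hypothetical helper for illustration. Returns True if the file
    was written and False if an existing file was left alone.
    """
    if os.path.isfile(path):
        return False
    with open(path, 'w') as out_file:
        for line in lines:
            out_file.write(line + '\n')
    return True


tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo_colors.txt')

first_try = write_if_new(demo_path, ['Red', 'Blue'])
second_try = write_if_new(demo_path, ['Green'])  # refused: the file exists now

with open(demo_path, 'r') as in_file:
    kept = in_file.read()

print(first_try, second_try, repr(kept))

shutil.rmtree(tmp_dir)  # remove the temporary directory
```

Python also has an exclusive creation mode, `open(path, 'x')`, which raises `FileExistsError` if the file is already there; that is another way to get the same protection.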


Exercise: create a function to take a filename from the user and see if it exists.