# Basic File I/O

In the previous lecture on Pandas, we created a file, Example.csv, by calling df.to_csv('Example.csv'). Let's open it as a simple text file.

In [2]:
fp = open("Example.csv")
# do things with 'fp'
fp.close()

But we might not remember to close the file. Or, if we get an error, we might never reach the line that closes it. To avoid this (and potentially corrupted files), we can use a WITH-AS block. It is indented, like FOR and WHILE loops and function bodies, and the file is closed automatically when the block ends, even if an error occurs inside it. The AS gives us a name to use for the object that open() returns.

(This is another way of saying that the open file returned by open() is assigned to fp.)

Compare the open()/close() version above with the WITH-AS version below:

In [None]:
with open('Example.csv') as fp:
    # do things with 'fp'
    pass  # placeholder: an indented block needs at least one statement
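To see what the WITH-AS block buys us, here is a minimal sketch. It works on a throwaway file in a temporary directory (the name demo.txt is made up for this illustration) and checks the file object's closed attribute, which reports whether the file has been closed.

```python
import os
import tempfile

# Create a small throwaway file to experiment with ('demo.txt' is a made-up name).
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as fp:
    fp.write("hello\n")

# Inside the block the file is open; once the block ends it is closed for us.
with open(path) as fp:
    inside = fp.closed
after = fp.closed
print(inside, after)  # False True

# The file is also closed if an error interrupts the block.
try:
    with open(path) as fp:
        raise ValueError("something went wrong")
except ValueError:
    pass
closed_after_error = fp.closed
print(closed_after_error)  # True
```

Note that the name fp still exists after the block ends; it just refers to a file that is now closed.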

In [10]:
with open('Example.csv') as fp:
    # do things with 'fp'
    data = fp.read()
    print(data[0:2000])

,JURISDICTION NAME,COUNT PARTICIPANTS,COUNT FEMALE,PERCENT FEMALE,COUNT MALE,PERCENT MALE,COUNT GENDER UNKNOWN,PERCENT GENDER UNKNOWN,COUNT GENDER TOTAL,PERCENT GENDER TOTAL,COUNT PACIFIC ISLANDER,PERCENT PACIFIC ISLANDER,COUNT HISPANIC LATINO,PERCENT HISPANIC LATINO,COUNT AMERICAN INDIAN,PERCENT AMERICAN INDIAN,COUNT ASIAN NON HISPANIC,PERCENT ASIAN NON HISPANIC,COUNT WHITE NON HISPANIC,PERCENT WHITE NON HISPANIC,COUNT BLACK NON HISPANIC,PERCENT BLACK NON HISPANIC,COUNT OTHER ETHNICITY,PERCENT OTHER ETHNICITY,COUNT ETHNICITY UNKNOWN,PERCENT ETHNICITY UNKNOWN,COUNT ETHNICITY TOTAL,PERCENT ETHNICITY TOTAL,COUNT PERMANENT RESIDENT ALIEN,PERCENT PERMANENT RESIDENT ALIEN,COUNT US CITIZEN,PERCENT US CITIZEN,COUNT OTHER CITIZEN STATUS,PERCENT OTHER CITIZEN STATUS,COUNT CITIZEN STATUS UNKNOWN,PERCENT CITIZEN STATUS UNKNOWN,COUNT CITIZEN STATUS TOTAL,PERCENT CITIZEN STATUS TOTAL,COUNT RECEIVES PUBLIC ASSISTANCE,PERCENT RECEIVES PUBLIC ASSISTANCE,COUNT NRECEIVES PUBLIC ASSISTANCE,PERCENT NRECEI

In [11]:
with open('Example.csv') as file:
    for line in file:
        print("The first line of the file only:\n")
        print(line)
        break

The first line of the file only:

,JURISDICTION NAME,COUNT PARTICIPANTS,COUNT FEMALE,PERCENT FEMALE,COUNT MALE,PERCENT MALE,COUNT GENDER UNKNOWN,PERCENT GENDER UNKNOWN,COUNT GENDER TOTAL,PERCENT GENDER TOTAL,COUNT PACIFIC ISLANDER,PERCENT PACIFIC ISLANDER,COUNT HISPANIC LATINO,PERCENT HISPANIC LATINO,COUNT AMERICAN INDIAN,PERCENT AMERICAN INDIAN,COUNT ASIAN NON HISPANIC,PERCENT ASIAN NON HISPANIC,COUNT WHITE NON HISPANIC,PERCENT WHITE NON HISPANIC,COUNT BLACK NON HISPANIC,PERCENT BLACK NON HISPANIC,COUNT OTHER ETHNICITY,PERCENT OTHER ETHNICITY,COUNT ETHNICITY UNKNOWN,PERCENT ETHNICITY UNKNOWN,COUNT ETHNICITY TOTAL,PERCENT ETHNICITY TOTAL,COUNT PERMANENT RESIDENT ALIEN,PERCENT PERMANENT RESIDENT ALIEN,COUNT US CITIZEN,PERCENT US CITIZEN,COUNT OTHER CITIZEN STATUS,PERCENT OTHER CITIZEN STATUS,COUNT CITIZEN STATUS UNKNOWN,PERCENT CITIZEN STATUS UNKNOWN,COUNT CITIZEN STATUS TOTAL,PERCENT CITIZEN STATUS TOTAL,COUNT RECEIVES PUBLIC ASSISTANCE,PERCENT RECEIVES PUBLIC ASSISTANCE,COUNT NRECEIVE

Let's combine a few of the things we've learned so far, and add line numbers to our printout:

In [13]:
with open('Example.csv') as file:
    for i,line in enumerate(file):
        # print only the first 20 characters of each line for this example
        print("{}: \t {}".format(i, line[:20]))
        # stop after line number 20 (21 lines, counting the header) to keep it short
        if i >= 20:
            break

0: 	 ,JURISDICTION NAME,C
1: 	 0,10001,44,22,0.5,22
2: 	 1,10002,35,19,0.54,1
3: 	 2,10003,1,1,1.0,0,0.
4: 	 3,10004,0,0,0.0,0,0.
5: 	 4,10005,2,2,1.0,0,0.
6: 	 5,10006,6,2,0.33,4,0
7: 	 6,10007,1,0,0.0,1,1.
8: 	 7,10009,2,0,0.0,2,1.
9: 	 8,10010,0,0,0.0,0,0.
10: 	 9,10011,3,2,0.67,1,0
11: 	 10,10012,0,0,0.0,0,0
12: 	 11,10013,8,1,0.13,7,
13: 	 12,10014,0,0,0.0,0,0
14: 	 13,10016,17,12,0.71,
15: 	 14,10017,0,0,0.0,0,0
16: 	 15,10018,3,2,0.67,1,
17: 	 16,10019,0,0,0.0,0,0
18: 	 17,10020,0,0,0.0,0,0
19: 	 18,10021,0,0,0.0,0,0
20: 	 19,10022,1,1,1.0,0,0
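As a possible variation on the loop above, enumerate() accepts a start argument if we would rather count from 1, and itertools.islice can cap the number of lines instead of an explicit break. The sketch below uses a small in-memory stand-in for an open file (io.StringIO with made-up sample rows) so that it runs on its own.

```python
import io
from itertools import islice

# A small in-memory stand-in for an open file (made-up sample data).
file = io.StringIO("header\nrow a\nrow b\nrow c\nrow d\n")

numbered = []
# islice stops after 3 lines; start=1 makes the numbering begin at 1.
for i, line in enumerate(islice(file, 3), start=1):
    numbered.append("{}: {}".format(i, line.rstrip("\n")))

print(numbered)  # ['1: header', '2: row a', '3: row b']
```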


We can also write to a file using the .write() method:

In [14]:
with open('test.txt', 'w') as fp:
    fp.write('Testing!') 

Or we can use the print() function and redirect its output to the file by passing fp as print()'s file argument:

In [15]:
with open('test.txt', 'w') as fp:
    print('Testing!',file=fp)
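The two approaches are not quite identical: .write() writes exactly the string it is given and returns the number of characters written, while print() adds a newline at the end by default. A minimal sketch, using a throwaway file in a temporary directory (the file name is made up for this illustration):

```python
import os
import tempfile

# 'write_vs_print.txt' is a made-up name for a throwaway file.
path = os.path.join(tempfile.mkdtemp(), "write_vs_print.txt")

with open(path, "w") as fp:
    count = fp.write("Testing!")   # writes the 8 characters, no newline added
    print("Testing!", file=fp)     # print() appends '\n' by default

with open(path) as fp:
    contents = fp.read()

print(count)           # 8
print(repr(contents))  # 'Testing!Testing!\n'
```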

In the examples above, we passed 'w' to the open() function after the filename. This creates the file test.txt and lets us write to it. But if test.txt already exists, its previous contents are erased as soon as the file is opened.

If we want to append to the end of the file instead, we pass in 'a' for append.

In [16]:
with open('test.txt', 'a') as fp:
    print('Appending!',file=fp)

And now let's take a look at our work:

In [20]:
with open('test.txt', 'r') as fp:
    alltext = fp.read()
    print(alltext)
    print(type(alltext))

Testing!
Appending!

<class 'str'>
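The difference between 'w' and 'a' can be checked with a short experiment. This is only a sketch on a throwaway file (the name modes.txt is made up for this illustration), so that test.txt is left alone.

```python
import os
import tempfile

# 'modes.txt' is a made-up name for a throwaway file.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "w") as fp:
    print("first", file=fp)

# Opening with 'w' again discards what was written before.
with open(path, "w") as fp:
    print("second", file=fp)

# Opening with 'a' keeps the existing contents and adds to the end.
with open(path, "a") as fp:
    print("third", file=fp)

with open(path) as fp:
    result = fp.read()

print(repr(result))  # 'second\nthird\n'
```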


There are also methods to read one line at a time, .readline(), or to read all the lines at once into a list, .readlines(). This last one might be helpful for an upcoming homework assignment, so we'll show it here:

In [22]:
with open('test.txt', 'r') as fp:
    alltext = fp.readlines()
    print(alltext)
    print(type(alltext))

['Testing!\n', 'Appending!\n']
<class 'list'>
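For comparison, here is a sketch of .readline(), which returns one line per call and an empty string once the end of the file is reached. It uses io.StringIO as an in-memory stand-in for test.txt (an assumption made so the example runs on its own), and it also shows one common way to strip the trailing newlines that .readlines() keeps.

```python
import io

# In-memory stand-in with the same contents as test.txt above.
fp = io.StringIO("Testing!\nAppending!\n")

first = fp.readline()    # 'Testing!\n'
second = fp.readline()   # 'Appending!\n'
third = fp.readline()    # '' signals the end of the file

# Go back to the start and read every line, dropping the trailing newlines.
fp.seek(0)
cleaned = [line.rstrip("\n") for line in fp.readlines()]

print(repr(first), repr(second), repr(third))
print(cleaned)  # ['Testing!', 'Appending!']
```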


One final note: you'll see the variable name 'fp' used in a lot of examples online, on StackOverflow, and in textbooks. This is short for "file pointer," as fp "points" to the file that we opened with open(). In Python, fp is really a file object rather than a pointer. You don't need to remember this, but the name is a neat throwback to C/C++ and other languages that use pointers everywhere.