# Files Part II

In Part I, we learned that a file handle serves as a translator between Python and the actual file. The handle can also be thought of as a **sequence**, where each **element** in the sequence is a **line** of text. 

We know that `for` loops are used to read each element in a sequence. So, we can use a `for` loop to read each line in a file. 

First, let's create a small sample text file to work with.

In [1]:
file1 = open("myfile.txt", "w") # open myfile.txt for writing (creates the file if needed)
L = ["This is the first line \n", "This is the second line \n", "Final line! \n"] # the lines of text to write
file1.writelines(L) # write the lines to the file
file1.close() # close the file
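
As a side note, the same file can be written with a `with` block, which closes the file for us. The snippet below is only a sketch; the filename `myfile_with.txt` is a hypothetical name chosen so that it does not overwrite `myfile.txt`.

```python
# The same three lines of text used in the cell above.
lines = ["This is the first line \n", "This is the second line \n", "Final line! \n"]

# "myfile_with.txt" is a made-up filename for this illustration.
with open("myfile_with.txt", "w") as fout:
    # writelines() does not add newlines; each string already ends in \n
    fout.writelines(lines)

# Once the with block ends, the file has been closed automatically.
print(fout.closed)  # True
```

With this pattern there is no `close()` call to forget.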

Now, we can read the file line-by-line using a `for` loop. 

In [1]:
fhandle = open('myfile.txt')
for line in fhandle:
    print(line)

This is the first line 

This is the second line 

Final line! 



What if we wanted to **count** the number of lines in a file? We can combine our knowledge of file handling and `for` loops to write a program that counts lines in a text file:

In [3]:
fhand = open('myfile.txt')
count = 0
for line in fhand:
    count = count + 1
print('Line Count:', count)


Line Count: 3
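
The counting loop above can also be written more compactly. The sketch below is one possible alternative, not a replacement for the loop: it recreates the sample file so that it runs on its own, then adds up a `1` for every line the file handle yields.

```python
# Recreate the sample file so this sketch is self-contained.
with open("myfile.txt", "w") as fout:
    fout.writelines(["This is the first line \n",
                     "This is the second line \n",
                     "Final line! \n"])

# The with block closes the file when the count is done.
with open("myfile.txt") as fhand:
    count = sum(1 for line in fhand)  # one 1 per line in the file

print('Line Count:', count)  # Line Count: 3
```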


Instead of printing or counting the lines individually, we may want to store the whole file in a single string variable. We can read the whole file at once using the `read()` method of the file handle.

In [5]:
fhand = open('myfile.txt')
inp = fhand.read()
print(len(inp))
print(inp[:9]) # print the first 9 characters of the string

62
This is t
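
To see what `read()` actually returns, it can help to poke at the string a little. The sketch below recreates the sample file, reads it twice from the same handle, and splits the text into lines with `splitlines()`. The variable names `text`, `leftover`, and `lines` are just illustrative.

```python
# Recreate the sample file so this sketch is self-contained.
with open("myfile.txt", "w") as fout:
    fout.writelines(["This is the first line \n",
                     "This is the second line \n",
                     "Final line! \n"])

with open("myfile.txt") as fhand:
    text = fhand.read()      # the whole file as one string, newlines included
    leftover = fhand.read()  # a second read starts at the end of the file

lines = text.splitlines()    # split on newlines and drop the \n characters

print(len(text))       # 62
print(len(lines))      # 3
print(repr(leftover))  # ''
```

The empty second read shows that a file handle remembers its position: once the whole file has been read, there is nothing left to read until the file is reopened.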


Now, suppose you wanted to **search** through a file to look for something in particular. We can put an `if` statement inside a `for` loop to print only lines that meet some criteria. 

In [6]:
fhand = open('myfile.txt')
for line in fhand:
    if line.startswith('This'): # find lines that start with "This"
        print(line)

This is the first line 

This is the second line 



Notice that the output of the previous cell has extra blank lines between the lines of text. Each line read from the file already ends with a `\n` character, and `print()` adds a newline of its own. To get rid of the extra blank lines, we strip the `\n` from each line before printing. The `rstrip()` method strips whitespace from the right-hand side of a string; `\n` counts as whitespace, so `rstrip()` removes it.

In [19]:
fhand = open('myfile.txt')
for line in fhand:
    line = line.rstrip()
    if line.startswith('This'):
        print(line)

This is the first line
This is the second line
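
To make the effect of `rstrip()` visible, we can print the `repr()` of a line, which shows whitespace characters explicitly. This is a small sketch using one line from our sample file; note that `rstrip()` with no argument removes the trailing space as well as the `\n`, while `rstrip("\n")` removes only the newline.

```python
line = "This is the first line \n"

print(repr(line))               # 'This is the first line \n'
print(repr(line.rstrip()))      # 'This is the first line'
print(repr(line.rstrip("\n")))  # 'This is the first line '
```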


Instead of using `if` to find lines that contain our keyword, we can use `if not` and `continue` to **skip** lines that we are not interested in.  

In [7]:
fhand = open('myfile.txt')
for line in fhand:
    line = line.rstrip()
    if not line.startswith('This'):
        continue
    print(line)

This is the first line
This is the second line


We can use `in` to find lines that contain our search text, regardless of whether it occurs at the beginning, middle, or end of the string. In this example, we find all of the lines that contain the word "second".

In [8]:
fhand = open('myfile.txt')
for line in fhand:
    line = line.rstrip()
    if 'second' not in line: # skip lines that do not contain "second"
        continue
    print(line)

This is the second line
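
A quick way to get a feel for `in` is to test it on a single string. The sketch below checks for text at the beginning, middle, and end of one line; it also shows that the match is case-sensitive.

```python
line = "This is the second line"

print('This' in line)    # True  (match at the beginning)
print('second' in line)  # True  (match in the middle)
print('line' in line)    # True  (match at the end)
print('Second' in line)  # False (capital S does not match)
```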


Often the **filename** itself is a **variable**, for example when we loop through a list of different filenames or when we ask the user to input a filename.

In the example below, we ask the user to input a filename. If you supply a real filename, such as "myfile.txt", then the code will work. If you specify a filename that does not exist, then you will get an error. Try it both ways!

In [10]:
fname = input('Enter the file name: ')
fhand = open(fname)
count = 0
for line in fhand:
    if line.startswith('This'):
        count = count + 1
print('There were', count, 'lines that started with "This"')

Enter the file name: myfile.txt
There were 2 lines that started with "This"
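
The other case mentioned above, looping through several filenames, might look like the sketch below. The filenames `notes_a.txt` and `notes_b.txt` and their contents are made up for this illustration; the code creates them first so that it runs on its own.

```python
# Hypothetical sample files and their contents.
samples = {
    "notes_a.txt": ["This is file A \n", "Another line \n"],
    "notes_b.txt": ["This is file B \n", "This too \n", "Done \n"],
}

# Create the sample files.
for fname, lines in samples.items():
    with open(fname, "w") as fout:
        fout.writelines(lines)

# Here the filename is a variable that changes on each pass through the loop.
counts = {}
for fname in samples:
    with open(fname) as fhand:
        counts[fname] = sum(1 for line in fhand if line.startswith('This'))
    print(fname, counts[fname])
```

This prints `notes_a.txt 1` followed by `notes_b.txt 2`.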


We are talking about **filenames** because they are notorious for causing errors. One main reason is that files often get moved around or deleted. If the program expects a file to be present but it is missing, Python raises an error!

Fortunately, we can use the `try`/`except` structure to catch the error when the file is missing (or the name is misspelled) and print a friendly message instead of a traceback. The code below handles both a working and a non-working filename.

In [11]:
fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except OSError: # raised when the file is missing or cannot be opened
    print('File cannot be opened:', fname)
    quit()
    
count = 0
for line in fhand:
    if line.startswith('This'):
        count = count + 1
print('There were', count, 'lines that started with "This"')

Enter the file name: BestFileEver.txt
File cannot be opened: BestFileEver.txt
There were 0 lines that started with "This"
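
One way to make sure the counting code only runs when the file was actually opened is to wrap the logic in a function and `return` early on failure. The sketch below is one possible arrangement, not the only one. The function name `count_this_lines` and the filename `no_such_file_12345.txt` are hypothetical, and the filenames are hard-coded instead of coming from `input()` so that the example runs without typing anything.

```python
def count_this_lines(fname):
    """Return the number of lines starting with 'This',
    or None if the file cannot be opened."""
    try:
        fhand = open(fname)
    except OSError:  # FileNotFoundError is a kind of OSError
        print('File cannot be opened:', fname)
        return None  # leave the function; the counting code is skipped
    with fhand:      # close the file when counting is done
        return sum(1 for line in fhand if line.startswith('This'))

# Recreate the sample file so this sketch is self-contained.
with open("myfile.txt", "w") as fout:
    fout.writelines(["This is the first line \n",
                     "This is the second line \n",
                     "Final line! \n"])

print(count_this_lines("myfile.txt"))              # 2
print(count_this_lines("no_such_file_12345.txt"))  # message, then None
```

Returning `None` lets the caller tell "the file could not be opened" apart from "the file had zero matching lines".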
