# L11 - Reading from Files
Parsing files usually involves string manipulation, so let's first go over some commonly used string methods.
# strip(), lstrip(), rstrip()
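Recall that strip() removes certain characters (whitespace by default) from both ends of a string. If you only want to remove things from the left or right side, you can use lstrip() or rstrip(), respectively. Note that the argument is treated as a set of characters to remove, not as a whole word: strip('ooga') removes any 'o', 'g', or 'a' from the ends until it hits a character that is not in that set.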

In [None]:
s = 'ooga booga ooga booga ooga booga ooga'
print(s.strip('ooga'))
print(s.strip('ooga').strip())
print(s.lstrip('ooga'))
print(s.rstrip('ooga'))
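
To check for yourself how strip() treats its argument, here is a small sketch using the same string. The variable names (stripped, same, word) are made up for illustration.

```python
s = 'ooga booga ooga booga ooga booga ooga'

# strip('ooga') removes any of the characters 'o', 'g', 'a' from both ends
# and stops at the first character that is not in that set (here, a space).
stripped = s.strip('ooga')
print(repr(stripped))

# The order of the characters in the argument does not matter.
same = s.strip('ago')
print(stripped == same)

# 'b' is not in the set, so nothing is removed from the left, but
# everything after the 'b' is removed from the right.
word = 'booga'.strip('ooga')
print(word)
```

Using repr() in the first print makes the leftover spaces at each end visible.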

# split()
split() breaks a string apart at the chosen delimiter (whitespace by default) and returns the pieces as a list of strings.

In [None]:
at_space = s.split()
at_ooga = s.split('ooga')
at_booga = s.split('booga')
print(at_space)
print(at_ooga)
print(at_booga)
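
One subtlety that comes up a lot when parsing files: calling split() with no argument is not the same as calling split(' '). The sketch below uses a made-up string (messy) with uneven spacing to show the difference.

```python
messy = '  90   85\t77\n'

# With no argument, split() treats any run of whitespace (spaces, tabs,
# newlines) as one delimiter and ignores whitespace at the ends.
by_whitespace = messy.split()
print(by_whitespace)

# With an explicit ' ', every single space is a delimiter, so runs of
# spaces produce empty strings, and tabs and newlines are left alone.
by_single_space = messy.split(' ')
print(by_single_space)
```

For lines read from a file, the no-argument form is usually the more forgiving choice.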

# find() and in
The find() method returns the index of the first occurrence of the input string, or -1 if the input string is not in the string the method is called on. An optional second argument tells find() where to start searching. The keyword 'in' can be used to check whether a substring is in a string (or whether a value is in a list, tuple, or other container).

In [None]:
print(s.find('o'))
print(s.find('o', 2))
print(s.find('c'))
print('o' in s)
print('c' in s)
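
The start argument of find() makes it possible to locate every occurrence of a substring, not just the first. Here is a minimal sketch; find_all is a hypothetical helper written for this example, not a built-in, and it assumes the target is not empty.

```python
def find_all(text, target):
    """Return the index of every non-overlapping occurrence of target in text."""
    if target == '':
        raise ValueError('target must not be empty')
    positions = []
    start = text.find(target)
    while start != -1:
        positions.append(start)
        # Resume the search just past the match we already recorded.
        start = text.find(target, start + len(target))
    return positions

s = 'ooga booga ooga booga ooga booga ooga'
print(find_all(s, 'booga'))
print(find_all(s, 'c'))
```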

# Reading files
Unsurprisingly, we can open files with the built-in function open().

In [None]:
f = open('scores.txt')

or

In [None]:
f = open('scores.txt', 'r')

What's the difference? None. The 'r' lets Python know that you are opening this file to read from it. Since this is the default mode for open(), you do not need to include the 'r' if you are reading from a file.

The variable f now points to our text file, so we can use that to access its contents. One method is readline(). Try running the code below a couple times and notice what happens.

In [None]:
line = f.readline()
print(line)

readline() will read one line of the text file at a time. If you want all the contents spit out into one string, then you can use the read() method.

In [None]:
f = open('scores.txt')
s = f.read()
print(s)

The choice is yours whether you'd prefer to parse one line at a time or parse through the whole file all at once. Note that when you've reached the end of the file and try to use either readline() or read() you will get an empty string in return. 
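
The empty string at the end of the file can be used as a stopping condition. The sketch below is one way to do that, under a few assumptions: it builds its own throwaway file (the name scores_demo.txt and its contents are made up) so it can run on its own, and it uses the 'w' mode that is covered later in this lesson to create that file.

```python
import os
import tempfile

# Create a small file to read from.
demo_path = os.path.join(tempfile.mkdtemp(), 'scores_demo.txt')
f_demo = open(demo_path, 'w')
f_demo.write('90\n85\n')
f_demo.close()

f_demo = open(demo_path)
lines = []
line = f_demo.readline()
while line != '':            # an empty string means end of file
    lines.append(line)
    line = f_demo.readline()
leftover = f_demo.read()     # already at the end, so this is '' as well
f_demo.close()

print(lines)
print(repr(leftover))
```

A blank line in the middle of a file is read as '\n', not '', so it does not stop the loop early.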

Of course, you're not going to want to call readline() by hand to get through a massive file. Instead, you can loop over the file directly with a for loop, which gives you one line at a time.

In [None]:
f = open('scores.txt')
for line in f:
    print(line)

Notice there are now blank lines between the numbers. This is because each line in the text file has a '\n' at the end, where the text moves to the next line. print() also adds a '\n' to the end of its output by default, so each score is followed by '\n\n'.

If you would like, you can condense this even further by opening the file directly in the for statement, which gets rid of the first line.

In [None]:
for line in open('scores.txt'):
    print(line.strip())

Now, with the strip() method, we got rid of those blank lines.

If you want to read a file from the beginning again, one simple way is to close it and open it back up.

In [None]:
f = open('scores.txt')
print(f.readline())
f.close()
f = open('scores.txt')
print(f.readline())

The following code will compute the average of the scores. Just input the name of the file.

In [None]:
filename = input("File name ---> ").strip() # remove any extra spaces the user may include

total = 0
num = 0
for line in open(filename):
    total += int(line)
    num += 1

print("Average score: {:.1f}".format(total/num))
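
The loop above assumes every line holds exactly one integer and that the file is not empty. Here is a sketch of a slightly more defensive version. The function name average_score, the demo file, and the choice to return None for a file with no scores are all assumptions made for this example.

```python
import os
import tempfile

def average_score(path):
    """Return the mean of the integer scores in path, or None if there are none."""
    total = 0
    num = 0
    f = open(path)
    for line in f:
        line = line.strip()
        if line == '':
            continue          # skip blank lines instead of crashing in int()
        total += int(line)
        num += 1
    f.close()
    if num == 0:
        return None           # avoid dividing by zero on an empty file
    return total / num

# Build a throwaway file to try it on (name and contents are made up).
demo_path = os.path.join(tempfile.mkdtemp(), 'scores_demo.txt')
f_demo = open(demo_path, 'w')
f_demo.write('90\n\n85\n77\n')
f_demo.close()

average = average_score(demo_path)
print("Average score: {:.1f}".format(average))
```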

# Writing to files
Sometimes you may want to store the data you create in a file. You can do this with the same open() function.

In [None]:
f_out = open('out.txt', 'w')

This time we do need to include the second argument. The 'w' lets the open function know that you want to write to a file. A word of caution: 'w' will erase the contents of the output file if that file already exists. If you want to add to the end of an existing file, you can use 'a', the append mode.

In [None]:
f_out = open('out.txt', 'a')

To actually write to the output file, use the write() method, which takes a string as its argument.

In [None]:
f_out.write("Hello World!\n")

Note that unlike print(), write() does not add characters like '\n' to the end of the string by default. The other big difference is that it cannot handle non-string values. If you want to write an int or float value to a file, you must convert it to a string first with the str() function.
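
As a small sketch of both points, the code below writes a list of numbers to a file, one per line. The file name out_demo.txt and the scores are made up for illustration, and the file is placed in a temporary folder so nothing of yours gets overwritten.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), 'out_demo.txt')
scores = [90, 85.5, 77]

f_out = open(demo_path, 'w')
for score in scores:
    # write() only accepts strings, so convert each number and add
    # the newline ourselves.
    f_out.write(str(score) + '\n')
f_out.close()

# Read the file back to see exactly what was written.
f_in = open(demo_path)
written = f_in.read()
f_in.close()
print(repr(written))
```

Passing the number directly, as in f_out.write(90), raises a TypeError.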

What you write may be held in a buffer before it reaches the disk, so to make sure everything you've written is actually saved, you must close the file.

In [None]:
f_out.close()

Remembering to close your file is annoying, so Python has a structure that will automatically do it for you.

In [None]:
with open('out.txt', 'w') as f_out:
    f_out.write("You smell")

After you leave the scope of the with block, the file will automatically be closed for you.
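
The same structure works for reading. Here is a short sketch, assuming a made-up demo file in a temporary folder, that also checks the closed attribute of the file objects once each block has ended.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), 'scores_demo.txt')

with open(demo_path, 'w') as f_out:
    f_out.write('90\n85\n')

scores = []
with open(demo_path) as f_in:
    for line in f_in:
        scores.append(int(line))

print(scores)
# Both files were closed automatically when their with blocks ended.
print(f_out.closed, f_in.closed)
```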

# Parsing formatted files
Often, files you want to parse will have a standard form to make parsing through them easier. We'll look at an example of parsing a grocery list text file that has 'Grocery List' on line one as a title, followed by one item and its quantity per line, separated by a comma.

In [None]:
f = open('grocery.txt')
title = f.readline().strip()
items = []
quantities = []
for line in f:
    parts = line.split(',')
    items.append(parts[0].strip())
    quantities.append(int(parts[1].strip()))

print(title)
for item, quantity in zip(items, quantities):
    print("Need {} in quantity {}".format(item, quantity))
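
Since grocery.txt is not included here, the sketch below builds a small stand-in file so it can run on its own. The file contents, the helper name parse_grocery, and the decision to skip blank lines are assumptions for this example. It also uses a with block so the file is closed automatically.

```python
import os
import tempfile

def parse_grocery(path):
    """Return (title, entries), where entries is a list of (item, quantity) tuples."""
    with open(path) as f:
        title = f.readline().strip()
        entries = []
        for line in f:
            if line.strip() == '':
                continue                      # ignore blank lines
            parts = line.split(',')
            entries.append((parts[0].strip(), int(parts[1].strip())))
    return title, entries

# Stand-in for grocery.txt (contents are made up).
demo_path = os.path.join(tempfile.mkdtemp(), 'grocery_demo.txt')
with open(demo_path, 'w') as f_out:
    f_out.write('Grocery List\napples, 3\nbread, 1\n\n')

title, entries = parse_grocery(demo_path)
print(title)
for item, quantity in entries:
    print("Need {} in quantity {}".format(item, quantity))
```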

Knowing how to parse through text is a very powerful tool because it can be applied to many different areas, like working with csv files or even parsing data from internet webpages.
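
For csv files in particular, Python's standard library has a csv module that handles details a plain split(',') would miss, such as a comma inside a quoted field. The following is only a sketch: the file name and rows are made up, and the file is written first so the example can run on its own.

```python
import csv
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), 'grocery_demo.csv')

# The csv documentation recommends newline='' when opening csv files.
with open(demo_path, 'w', newline='') as f_out:
    writer = csv.writer(f_out)
    writer.writerow(['item', 'quantity'])
    writer.writerow(['apples', 3])
    writer.writerow(['bread, sliced', 1])   # this field contains a comma

with open(demo_path, newline='') as f_in:
    rows = list(csv.reader(f_in))

for row in rows:
    print(row)
```

Every value comes back as a string, so quantities still need int() before you do math with them.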