# File Processing Review
Covers: reading and writing files, split(), exceptions

## Opening a file
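- r: Opens a file for **reading only** (*the default*); the file must already exist
- r+: Opens a file for **both reading and writing**; the file must already exist
- w: Opens a file for **writing only**; creates the file, or empties it if it already exists
- w+: Opens a file for **writing and reading**; creates the file, or empties it if it already exists
- a: Opens a file for **appending**; creates the file if needed and writes at the end
- a+: Opens a file for both **appending and reading**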

In [None]:
# Open the file
my_file = open('test.txt', 'r')
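
The difference between `w` and `a` is easy to get wrong, so here is a minimal sketch of it. The file name `modes_demo.txt` and the temporary directory are assumptions made for this example only, chosen so that no real file is overwritten.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the notes above)
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

with open(path, 'w') as f:   # 'w' creates the file (or empties an existing one)
    f.write('first\n')

with open(path, 'a') as f:   # 'a' keeps the existing content and writes at the end
    f.write('second\n')

with open(path, 'r') as f:
    after_append = f.read()

with open(path, 'w') as f:   # reopening with 'w' discards the earlier content
    f.write('third\n')

with open(path, 'r') as f:
    after_rewrite = f.read()

print(repr(after_append))    # 'first\nsecond\n'
print(repr(after_rewrite))   # 'third\n'
```

Opening an existing file with `w` erases it immediately, even if nothing is written afterwards, so `a` is the safer choice when earlier content must be kept.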

## Reading + Closing a File

In [None]:
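# Read the entire document as one string
content = my_file.read()
# Read a single line; read() above already reached the end of the file,
# so this call returns an empty string ''
line = my_file.readline()
# Close the file
my_file.close()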
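
Because `read()` and `readline()` share one position in the file, the order of the calls matters. The sketch below shows this on a small scratch file; the file name `read_demo.txt` and its three lines are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file with three short lines
path = os.path.join(tempfile.mkdtemp(), 'read_demo.txt')
with open(path, 'w') as f:
    f.write('alpha\nbeta\ngamma\n')

with open(path, 'r') as f:
    first = f.readline()    # first line only, including its '\n'
    rest = f.read()         # everything after the first line
    at_end = f.readline()   # nothing left to read, so ''

# Looping over the file object yields one line at a time
with open(path, 'r') as f:
    lines = [line.rstrip('\n') for line in f]

print(repr(first))   # 'alpha\n'
print(repr(rest))    # 'beta\ngamma\n'
print(repr(at_end))  # ''
print(lines)         # ['alpha', 'beta', 'gamma']
```

Looping over the file is usually the simplest way to process a file line by line, since it does not load the whole file into memory at once.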

## Split Function
split(separator) - returns a list of strings obtained by breaking the string at each occurrence of the specified separator. If no separator is specified, the string is split on runs of consecutive whitespace (space, \n, \t).

In [None]:
x = content.split()
y = content.split(',', 2)  # Limit the number of splits to 2 (at most 3 items)
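
To make the three forms of `split()` concrete, here is a small sketch on in-memory strings. The sample strings are invented for illustration.

```python
record = 'apple,banana,,cherry pie'   # made-up sample text

by_whitespace = record.split()        # no separator: split on runs of whitespace
by_comma = record.split(',')          # explicit separator: empty fields are kept
limited = record.split(',', 2)        # at most 2 splits, the rest stays together

messy = '  spaced \t out\n'.split()   # leading/trailing whitespace is ignored

print(by_whitespace)  # ['apple,banana,,cherry', 'pie']
print(by_comma)       # ['apple', 'banana', '', 'cherry pie']
print(limited)        # ['apple', 'banana', ',cherry pie']
print(messy)          # ['spaced', 'out']
```

Note that an explicit separator produces an empty string for the empty field between the two commas, while the no-argument form never produces empty strings.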

## Writing to a file

In [None]:
#NOTE: Must open the file with appropriate mode to write
new_file = open('new_file.txt', 'w')
new_file.write(str(9+10) + '\n')  # Can only write strings
new_file.close()

In [None]:
# ALTERNATIVE METHOD (good practice)
with open('new_file.txt', 'w') as new_file:
    for i in range(100):
        new_file.write(str(i) + ', ')

### Using the print() function to write to a file

In [None]:
with open('new_file.txt', 'a') as new_file:  # The file must be open when print() runs
    print(9+10, file=new_file)  # No need to convert to str
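
`print()` accepts the same `sep` and `end` arguments when writing to a file as it does on screen. The sketch below assumes a scratch file named `print_demo.txt` in a temporary directory, which is not part of the notes above.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration
path = os.path.join(tempfile.mkdtemp(), 'print_demo.txt')

with open(path, 'w') as out:
    print(9 + 10, file=out)                    # converts to str and adds '\n'
    print('a', 'b', 'c', sep=',', file=out)    # custom separator between items
    print('no newline', end='', file=out)      # suppress the trailing newline

with open(path, 'r') as f:
    written = f.read()

print(repr(written))  # '19\na,b,c\nno newline'
```

Unlike `write()`, `print()` appends a newline by default, so pass `end=''` when the output should continue on the same line.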

## Exceptions for file

In [None]:
# Use try & except
try:
    f = open('invalid.txt', 'r')
except:  # A bare except catches every exception, not only file errors
    print("This is the except message")
    
# Using try & except with specified error types
try:
    f = open('invalid.txt', 'r')
except FileNotFoundError:
    print("File Not Found Error Message")
except IndexError:
    print("Index Error Message")
except:  # Optional catch-all for any other exception
    print("Generic Except Message")
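
One common way to use the pattern above is to fall back to a default value when a file is missing. The helper name `read_or_default` and the file `missing.txt` are hypothetical, introduced only for this sketch.

```python
import os
import tempfile

def read_or_default(path, default=''):
    """Return the file's text, or `default` if the file does not exist."""
    try:
        f = open(path, 'r')
    except FileNotFoundError as err:
        # err.filename holds the path that could not be opened
        print('Could not open file:', err.filename)
        return default
    else:
        # The else branch runs only when open() succeeded
        with f:
            return f.read()

# A path that is guaranteed not to exist inside a fresh temporary directory
missing = os.path.join(tempfile.mkdtemp(), 'missing.txt')
result = read_or_default(missing, default='fallback')
print(result)  # fallback
```

Catching only `FileNotFoundError` keeps unrelated bugs visible, since any other exception still stops the program with a traceback.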

## Application Example
Create a dictionary that uses each word as a key and the number of times that word appears as the value.

In [None]:
word_counts = {}
with open('Example.txt', 'r') as file:
    for line in file:
        # Uppercase the line, strip punctuation, then split on whitespace
        tokens = (line.upper().replace(',', '').replace(';', '').replace('(', '')
                  .replace(')', '').replace('!', '').replace('?', '').replace('.', '').split())
        for word in tokens:
            try:
                word_counts[word] += 1
            except KeyError:  # First time this word is seen
                word_counts[word] = 1
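
The same counting can be written without `try`/`except` by using `dict.get()`, and the chain of `replace()` calls can be swapped for `str.translate()`. This is an alternative sketch, not the method used above; it runs on two made-up lines of text (`sample_lines`) instead of `Example.txt` so that it works without any file.

```python
# Made-up input standing in for the lines of a file
sample_lines = ['Hello, world!', 'Hello (again); world?']

# Translation table that deletes the same punctuation characters as above
strip_punct = str.maketrans('', '', ',;()!?.')

word_counts = {}
for line in sample_lines:
    for word in line.upper().translate(strip_punct).split():
        # get() returns 0 when the word has not been seen yet
        word_counts[word] = word_counts.get(word, 0) + 1

print(word_counts)  # {'HELLO': 2, 'WORLD': 2, 'AGAIN': 1}
```

To count words in a real file, replace `sample_lines` with the open file object, since looping over a file also yields one line at a time.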