# Files

Python uses **file objects** to interact with external files on your computer. These files can be of any type: an audio file, a text file, an email, an Excel document, and so on. Note: You will probably need to install certain libraries or modules to work with some of these file types, but they are easily available. (We will cover downloading modules later in the course.)

Python has a built-in `open()` function that allows us to open and work with basic file types. First we will need a file, though. We're going to use some IPython magic to create a text file!

## Writing a File using IPython  
### The `%%writefile` cell magic is specific to Jupyter Notebooks! Alternatively, quickly create a simple .txt file with a text editor such as Notepad++.
  
### For details on IPython and Jupyter Notebook see https://ipython.org/

In [None]:
%%writefile test.txt
Hello, this is a quick test file.
This is the second line of the same file.

## Opening a File in Python

Let's begin by opening the file test.txt that is located in the same directory as this notebook. For now we will work with files located in the same directory as the notebook or .py script you are using.

It is very easy to get an error on this step:

In [None]:
myfile = open('test.txt', 'r')

`open('test.txt', 'r')`
- The first parameter is the file name.
- The second parameter, `'r'`, tells Python that we intend to read from the file.
 - Opening the file for reading is the default, so the `'r'` is not required.
- This mode requires that the file exists, and it only allows us to read from the file, not write to it.

### To be safe, you should test the result of the file open statement in a try/except block, since an exception usually means the file does not exist, or it has incorrect permissions.

In [None]:
try:
    myfile = open("test.txt", 'r')
except FileNotFoundError:
    print( "File not found.  Program aborted." )
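
The heading above also mentions permissions, which `FileNotFoundError` does not cover. Here is a minimal sketch that handles both cases. The helper name `try_open` and the temporary directory are made up for this illustration and are not used elsewhere in the lesson:

```python
import os
import tempfile

def try_open(path):
    """Hypothetical helper: return an open file object, or None on failure."""
    try:
        return open(path, 'r')
    except FileNotFoundError:
        print(f"File not found: {path}")
    except PermissionError:
        print(f"Permission denied: {path}")
    return None

# Use a fresh temporary directory so the path is guaranteed not to exist
with tempfile.TemporaryDirectory() as tmp_dir:
    missing_path = os.path.join(tmp_dir, 'no_such_file.txt')
    result = try_open(missing_path)

print(result)
```

Returning `None` is only one possible design; a real program might prefer to re-raise the exception or exit.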

### Let's try to open a file which does not exist

### first without the try/except

In [None]:
myfile = open("test_1.txt", 'r')

### Now the try/except

In [None]:
try:
    myfile = open("test_1.txt", 'r')
except FileNotFoundError:
    print( "File not found.  Program aborted." )

In [None]:
myfile = open('whoops.txt')

To avoid this error, make sure your .txt file is saved in the same folder as your notebook. To check your notebook's location, use **pwd**:

In [None]:
pwd    # pwd - print working directory
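
Outside of Jupyter the `pwd` magic is not available. A small sketch using the standard `os` module should give the same information from any Python script (`test.txt` is simply the file name used in this lesson):

```python
import os

current_dir = os.getcwd()                          # same idea as pwd
candidate = os.path.join(current_dir, 'test.txt')  # where open('test.txt') would look

print(current_dir)
print(os.path.exists(candidate))   # True only if test.txt is in this folder
```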

### Alternatively, to use files from any location on your computer, simply pass in the entire file path. 

On Windows you need to double each backslash (`\\`) so that Python doesn't treat a single `\` as the start of an escape sequence such as `\n`. A file path takes the form:

    myfile = open("C:\\Users\\YourUserName\\Home\\Folder\\myfile.txt")

For MacOS and Linux you use forward slashes:

    myfile = open("/Users/YourUserName/Folder/myfile.txt")
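
As an alternative to doubling every backslash, Python also accepts raw strings (prefix `r`). The sketch below only compares strings and does not open anything; the path is the same placeholder as in the Windows example:

```python
from pathlib import PureWindowsPath

escaped = "C:\\Users\\YourUserName\\Home\\Folder\\myfile.txt"
raw = r"C:\Users\YourUserName\Home\Folder\myfile.txt"

same = (escaped == raw)                 # both literals spell the same path
file_name = PureWindowsPath(raw).name   # last component of the path

print(same, file_name)
```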

In [None]:
# Open the test.txt we made earlier
my_file = open('test.txt')

### read() - reads the entire file into memory

In [None]:
# We can now read the file
my_file.read()

### What will the output be if we use the print() function?

In [None]:
# Open the test.txt we made earlier
my_file = open('test.txt')
print(my_file.read())
print('WHY is the output different?')

### What happens if we try to read it again?

In [None]:
my_file.read()

This happens because Python leaves a "cursor" at the end of the file after reading it, so the next read() starts at the cursor and finds nothing left to read. We can reset the "cursor" to the beginning of the file like this:

In [None]:
# Seek to the start of file (index 0)
my_file.seek(0)

In [None]:
# Now read again
my_file.read()
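
The cursor behaviour can also be checked in isolation. This is a minimal sketch, assuming a throwaway file named `cursor_demo.txt` in a temporary directory (a made-up name, so `test.txt` is left alone):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'cursor_demo.txt')
    out = open(path, 'w')
    out.write('Hello\n')
    out.close()

    demo = open(path)
    first_read = demo.read()     # whole file; the cursor is now at the end
    second_read = demo.read()    # nothing left, so an empty string
    demo.seek(0)                 # move the cursor back to the start
    position = demo.tell()       # tell() reports the current cursor position
    third_read = demo.read()     # whole file again
    demo.close()

print(repr(first_read), repr(second_read), position, repr(third_read))
```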

You can read all the lines of a file at once using the readlines method. Use caution with large files, since every line will be held in memory. We will learn how to iterate over large files later in this lesson.
- readlines loads each line as an item in a list

In [None]:
# Readlines returns a list of the lines in the file
my_file.seek(0)
my_file.readlines()

Of course, you can store the reference to the list in a variable:

In [None]:
my_file.seek(0)
lst = my_file.readlines()

In [None]:
lst
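
Each string returned by `readlines()` keeps its trailing `\n`. Here is a small sketch, assuming a temporary file (hypothetical name `lines_demo.txt`) with the same two lines as `test.txt`, showing one way to strip them:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'lines_demo.txt')
    out = open(path, 'w')
    out.write('Hello, this is a quick test file.\n'
              'This is the second line of the same file.\n')
    out.close()

    demo = open(path)
    raw_lines = demo.readlines()     # newline characters are kept
    demo.close()

# Remove the trailing newline from every line
clean_lines = [line.rstrip('\n') for line in raw_lines]

print(raw_lines)
print(clean_lines)
```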

#### Open the file for append (add a line to the end of the file)

In [None]:
my_file = open('test.txt','a')
my_file.write('This is line 3')
my_file.close()

In [None]:
my_file = open('test.txt','r')
my_file.readline()

In [None]:
my_file.readline()

In [None]:
my_file.readline()

When you have finished using a file, it is always good practice to close it.

In [None]:
my_file.close()
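
A common way to make sure a file is always closed is the `with` statement, which closes the file when the indented block ends, even if an error occurs inside it. This is a minimal sketch, assuming a temporary file (hypothetical name `with_demo.txt`) so that `test.txt` is not touched:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'with_demo.txt')

    # The file is closed automatically at the end of each with block
    with open(path, 'w') as out:
        out.write('line one\n')

    with open(path) as demo:
        text = demo.read()
        open_inside = not demo.closed   # still open inside the block

    closed_after = demo.closed          # closed once the block has ended

print(repr(text), open_inside, closed_after)
```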

## Writing to a File

By default, the `open()` function will only allow us to read the file. We need to pass the argument `'w'` to write over the file. For example:

In [None]:
# Add a second argument to the function, 'w' which stands for write.
# Passing 'w+' lets us read and write to the file

my_file = open('test.txt','w+')

### <strong><font color='red'>Use caution!</font></strong> 
Opening a file with `'w'` or `'w+'` clears (truncates) the original file, meaning that anything that was in the original file **is deleted**!

In [None]:
# Write to the file
my_file.write('This is a new line')

In [None]:
# Read the file
my_file.seek(0)
my_file.read()

In [None]:
my_file.close()  # always do this when you're done with a file
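
The truncation described in the warning above can be seen with a throwaway file. This sketch assumes a temporary file named `truncate_demo.txt` (a made-up name) rather than `test.txt`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'truncate_demo.txt')

    # Create a file with some existing contents
    demo = open(path, 'w')
    demo.write('original contents')
    demo.close()

    # Re-opening with 'w+' empties the file immediately
    demo = open(path, 'w+')
    size_after_open = os.path.getsize(path)
    chars_written = demo.write('This is a new line')  # write() returns the character count
    demo.seek(0)
    contents = demo.read()
    demo.close()

print(size_after_open, chars_written, repr(contents))
```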

## Appending to a File
Passing the argument `'a'` opens the file and puts the pointer at the end, so anything written is appended. Like `'w+'`, `'a+'` lets us read and write to a file. If the file does not exist, one will be created.

In [None]:
my_file = open('test.txt','a+')
my_file.write('\nThis is text being appended to test.txt')
my_file.write('\nAnd another line here.')

In [None]:
my_file.seek(0)
print(my_file.read())

In [None]:
my_file.close()

In [None]:
my_file = open('test.txt', 'r')
my_file.readlines()
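
To check that append mode creates a missing file and never erases earlier writes, here is a short sketch using a temporary file (the name `append_demo.txt` is an assumption for this example):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'append_demo.txt')
    existed_before = os.path.exists(path)   # the file has not been created yet

    demo = open(path, 'a')                  # 'a' creates the file if needed
    demo.write('first\n')
    demo.close()

    demo = open(path, 'a')                  # a second append keeps the first line
    demo.write('second\n')
    demo.close()

    demo = open(path)
    contents = demo.read()
    demo.close()

print(existed_before, repr(contents))
```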

### Appending with `%%writefile`
We can do the same thing using IPython cell magic:

In [None]:
%%writefile -a test.txt

Look at 'This' is text being appended to test.txt
And another line here.

In [None]:
my_file.seek(0)
my_file.readlines()

In [None]:
# What will we get if we call readlines() again?
my_file.readlines()

In [None]:
# What about this time?
my_file.seek(0)
my_file.readlines()

Add a blank line at the top of the `%%writefile -a` cell if you want the appended text to begin on its own line, since the cell contents are written literally and escape sequences like `\n` are not interpreted.

## Iterating through a File

Let's get a quick preview of a for loop by iterating over a text file. First let's make a new text file with some IPython Magic:

In [None]:
%%writefile test.txt
First Line
Second Line

Now we can use a little bit of flow control to tell the program to loop through every line of the file and do something:

In [None]:
for line in open('test.txt'):
    print(line, end='')

In [None]:
f = open('test.txt','r')
lst = f.readlines()
print(lst[1])

**What did we do here?**  For every line in this text file, read and then print the next line. It's important to note a few things here:

1. We could have called the "line" object anything (see example below).
2. By not calling `.read()` on the file, the whole text file was not stored in memory.
3. Notice the indent on the second line for print. This whitespace is required in Python.
4. This tells us that a file object is iterable, and each line is an item.

Regarding the first point above:

In [None]:
for asdf in open('test.txt'):
    print(asdf)
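
Because each line already ends in `\n`, the loop above prints a blank line between entries. One possible way around this is sketched below, using `enumerate` to number the lines and `rstrip()` to drop the newline. It assumes a temporary copy of the two-line file, under the hypothetical name `loop_demo.txt`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'loop_demo.txt')
    out = open(path, 'w')
    out.write('First Line\nSecond Line\n')
    out.close()

    numbered = []
    demo = open(path)
    # Iterating reads one line at a time instead of loading the whole file
    for number, line in enumerate(demo, start=1):
        numbered.append(f"{number}: {line.rstrip()}")
    demo.close()

for entry in numbered:
    print(entry)
```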

| 2nd parameter (mode) | Operation |
| --- | --- |
| `'r'` (or nothing) | Open file for reading (the default argument is `'r'`). |
| `'a'` | Open file for appending: writes to the end of existing data without erasing the prior contents. |
| `'r+'` | Open file for both reading and writing. |
| `'rb'` | Open file for reading in binary mode. Every byte is read as-is: `'\n'`, `'\r'` and non-printable characters are included in the data if present, and no end-of-line characters are added to or translated in the returned lines. |
| `'wb'` | Open file for writing in binary mode. |
| `'r+b'` | Open file for reading and writing in binary mode. |
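
The difference between text and binary mode shows up when a file contains Windows-style `\r\n` line endings. This is a minimal sketch, assuming a throwaway file (hypothetical name `binary_demo.txt`) in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'binary_demo.txt')

    # 'wb' writes the bytes exactly as given
    out = open(path, 'wb')
    out.write(b'line1\r\nline2\n')
    out.close()

    # 'rb' returns a bytes object with every byte untouched
    demo = open(path, 'rb')
    raw_bytes = demo.read()
    demo.close()

    # 'r' returns a str and, by default, translates '\r\n' to '\n'
    demo = open(path, 'r')
    text = demo.read()
    demo.close()

print(raw_bytes)
print(repr(text))
```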

### As an aside - practice using f-strings to format tables

#### Here is a program that creates a table of squares, cubes, square roots and cube roots for the first 20 positive integers.

In [None]:
print("Here is a table of squares, cubes, square roots\n"
      "and cube roots of the first 20 positive integers:")

print( "\n   k       k^2      k^3    sqrt k   cube rt k")

for k in range(1, 21):
    print(f"{k:4} {k ** 2:8d} {k ** 3:8d} {k ** (1/2):9.3f} {k ** (1/3):10.3f}")

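Take a look at the f-string in the program above. Each `{value:width}` field pads its value to a fixed minimum width, which keeps the columns aligned.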
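
To see what each format specifier contributes, here is a sketch that builds a single row for `k = 2` with the same f-string and then inspects it:

```python
k = 2

# :4 and :8d set a minimum width; :9.3f and :10.3f also fix three decimal places
row = f"{k:4} {k ** 2:8d} {k ** 3:8d} {k ** (1/2):9.3f} {k ** (1/3):10.3f}"

fields = row.split()   # the five values without their padding
width = len(row)       # 4 + 8 + 8 + 9 + 10 characters plus 4 separating spaces

print(repr(row))
print(fields, width)
```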

In [None]:
my_file = open('table', 'w')
my_file.write("Here is a table of squares, cubes, square roots\n"
               "and cube roots of the first 20 integers:\n")
my_file.write("\n   k       k^2      k^3    sqrt k   cube rt k\n"  )

for k in range(1, 21):
    my_file.write(f"{k:4} {k ** 2:8d} {k ** 3:8d} {k ** (1/2):9.3f} {k ** (1/3):10.3f}\n")

my_file.close()  # close so everything is written to disk before we read it back

In [None]:
my_file = open('table', 'r')
print(my_file.read())
my_file.close()