# Reading and Writing to File

In this notebook we will see how you can use python to create, read and write to a text file on your computer.

By the end of this notebook you will know about:
- `file` objects in python,
- Writing to a file,
- Reading a file and
- Preferred `file` object syntax.

## Files

Sometimes you may want to read data from, or write data to, a file such as a .csv or .txt file.

One way to do this is by using a `file` object in python. If you're interested, here's a link to the python documentation on `file` objects: <a href="https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files">https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files</a>.


#### A Note

Before moving forward I want to remind you to be mindful of the order in which you execute code chunks inside of `jupyter notebooks`. As we will see in the rest of the notebook, some code needs to be run before/after other code to work properly.

### Writing a `file` Object

We will start by seeing how to write to a file object.

In [44]:
## we'll store the object in the variable file
## we open a file by calling open(file_name, open_method)
## for example here we call open("new_file.txt", "w+")
file = open("new_file.txt", "w+")
file

<_io.TextIOWrapper name='new_file.txt' mode='w+' encoding='UTF-8'>

In [45]:
## We are done with the file for now
## So we must CLOSE the file
## this is an important step anytime
## you're done with a file, if you neglect
## to close the file your computer will just keep
## all of the files open until you close the 
## jupyter notebook
file.close()

Go to your jupyter home page. After running that code you should see the file, `new_file.txt`, show up. Why? When we call `open("new_file.txt", "w+")` it tells our computer to open the file, `new_file.txt`, and the `w` tells python that we want to write to it. Opening a file in a write mode creates the file if it does not exist, and empties it if it does. The `+` additionally tells python that we want to be able to read from the same `file` object.

If we left off the `+` the file would still have been created. A file not found error only occurs when you open a missing file in a mode that does not create it, such as `r`. Now that we've created the file we can open it whenever we want. Let's try below.
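A quick way to check how each mode treats a missing file is to try the modes in a scratch directory. The sketch below is an illustration only: the temporary directory and the file name `mode_demo.txt` are assumptions made so that the example never touches `new_file.txt`.

```python
import os
import tempfile

## work in a throwaway directory so this sketch never touches new_file.txt
with tempfile.TemporaryDirectory() as tmp_dir:
    ## mode_demo.txt is a hypothetical file name used only here
    path = os.path.join(tmp_dir, "mode_demo.txt")

    ## "r" does not create files, so opening a missing file fails
    try:
        open(path, "r")
        read_failed = False
    except FileNotFoundError:
        read_failed = True

    ## plain "w" creates the missing file, no "+" required
    file = open(path, "w")
    file.close()
    created_by_w = os.path.exists(path)

    ## "w+" also creates (or empties) the file and allows reading as well
    file = open(path, "w+")
    file.write("hello\n")
    file.seek(0)
    read_back = file.read()
    file.close()

print(read_failed, created_by_w, repr(read_back))
```

Running this should print `True True 'hello\n'`: the read attempt fails, `w` alone creates the file, and `w+` lets us read back what we just wrote.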

In [46]:
## now we'll reopen the file, but without the "+"
file = open("new_file.txt", "w")

In [47]:
## once a file is open in "w"rite mode
## we can write to it with file.write(string)
file.write("Now my file has a line in it.\n")

## the \n at the end of that line tells python to
## make a "newline" in the file
## let's write a second line
file.write("This is the second line in the file.\n")


## I'm done with the file for now, 
## let's close it.
file.close()

Before moving on to the next code chunk, open up the file and check that it contains what we told python to write.

In [48]:
## You code
## open the file in write mode
file = open("new_file.txt", "w")

## write the line "A third line, might be fine"
file.write("A third line, might be fine\n")

## close the file when you're done
file.close()


Now open your file again. What happened?

Opening a file in `w` mode erases any data that was already in it, so only what we write afterwards remains.

If we want to append to a file with text already written in it we must open with `a`, which stands for "a"ppend.

In [49]:
## opening the file with "a" instead of "w"
## tells python we want to append lines to the
## file instead of overwrite the existing content
file = open("new_file.txt", "a")

file.write("A new line, we won't overwrite the old lines this time!\n")

file.close()

Check it one more time! Did I lie to you?
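If you would rather check the difference between `w` and `a` in code than by opening the file by hand, here is a minimal sketch. It assumes a scratch file called `append_demo.txt` inside a temporary directory; both are hypothetical and chosen only to leave your own files alone.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    ## append_demo.txt is a hypothetical scratch file
    path = os.path.join(tmp_dir, "append_demo.txt")

    ## write a first line, then reopen with "w" and write another
    file = open(path, "w")
    file.write("first\n")
    file.close()

    file = open(path, "w")
    file.write("second\n")
    file.close()

    ## only the line from the second "w" session survives
    file = open(path, "r")
    after_w = file.read()
    file.close()

    ## now reopen with "a" and add a line at the end
    file = open(path, "a")
    file.write("third\n")
    file.close()

    file = open(path, "r")
    after_a = file.read()
    file.close()

print(repr(after_w))
print(repr(after_a))
```

The first `print` should show `'second\n'` and the second `'second\nthird\n'`, matching what we saw with `new_file.txt`.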

In [50]:
## You code
## Create a .csv file called "my_first_data.csv"
file = open("my_first_data.csv", "w+")

## write a line of column names
## "x","y","z"
## don't close the file yet
## Hint: don't forget the \n!
file.write("x,y,z\n")

# column names


6

In [51]:
## You code
## use a loop to write the following corresponding data
## in the correct order,
## you can close it when you're done
## also separate the x, y and z values with commas
x = [1,2,3,4]
y = [2,4,6,8]
z = [1,8,27,64]

## code here
## hint you'll have to cast the contents of x, y and z
## as strs before writing them to file
## don't forget the \n!
for i in range(len(x)):
    file.write(str(x[i]) + "," + str(y[i]) + "," + str(z[i]) + "\n")

file.close()

Open your file and compare it with `check_data_file.csv` to make sure it matches.
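The index-based loop is one way to write the rows. As an alternative sketch, python's built-in `zip` can walk through the three lists together so that no index is needed. The temporary directory and the file name `zip_demo.csv` are assumptions used to keep this example separate from `my_first_data.csv`.

```python
import os
import tempfile

x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
z = [1, 8, 27, 64]

with tempfile.TemporaryDirectory() as tmp_dir:
    ## zip_demo.csv is a hypothetical scratch file
    path = os.path.join(tmp_dir, "zip_demo.csv")

    file = open(path, "w")
    file.write("x,y,z\n")

    ## zip pairs up the i-th entries of x, y and z
    for x_val, y_val, z_val in zip(x, y, z):
        file.write(str(x_val) + "," + str(y_val) + "," + str(z_val) + "\n")

    file.close()

    ## read the file back to check what was written
    file = open(path, "r")
    csv_text = file.read()
    file.close()

print(csv_text)
```

The printed text should have the header line followed by the same four data rows that the index-based loop produces.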

### Reading a `file` Object

Now let's suppose that you have a file that contains data you would like to read in. Instead of `open(file_name,"w")` or `open(file_name,"a")`, you write `open(file_name,"r")`. You can then read the whole content as a single string with `.read()`. Let's see.

In [52]:
## open the file to read it with "r"
file = open("new_file.txt", "r")

print(file.read())

A third line, might be fine
A new line, we won't overwrite the old lines this time!



In [53]:
## You code
## try to reread the file's contents using .read()
print(file.read())




What happened?

When we called `read()` on our file object the cursor of the object went through all the text and returned it to us, but this process left the cursor at the end of the file's text. Here we can think of the cursor as our own eyeballs. Once you read through all of the text on a page, your eyes will be looking at the end of the page.

In order to call `read()` again and return all the text we need to return our cursor back to the beginning of the document (i.e. point our eyes back at the top of the page). This is done with a `seek()` call.

In [54]:
## Go back to the 0th item in the string
## aka the beginning of the file's contents
file.seek(0)

0

In [55]:
## You code
## let's try this again, this time storing the output
## of file.read() in a variable called file_text
file_text = file.read()

In [56]:
## You code
## print file_text with a print statement to check it worked
print(file_text)

A third line, might be fine
A new line, we won't overwrite the old lines this time!



In [57]:
## close the file
file.close()
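To watch the cursor behaviour described above in one place, here is a small sketch that uses `tell()`, a `file` method that reports the current cursor position. The scratch file `cursor_demo.txt` and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    ## cursor_demo.txt is a hypothetical scratch file
    path = os.path.join(tmp_dir, "cursor_demo.txt")

    file = open(path, "w")
    file.write("abc\n")
    file.close()

    file = open(path, "r")

    ## a freshly opened file starts with the cursor at position 0
    start_position = file.tell()

    ## the first read returns everything and leaves the cursor at the end
    first_read = file.read()

    ## nothing is left to read, so we get an empty string
    second_read = file.read()

    ## move the cursor back to the start and read again
    file.seek(0)
    rewound_position = file.tell()
    third_read = file.read()

    file.close()

print(start_position, repr(first_read), repr(second_read))
print(rewound_position, repr(third_read))
```

The second `read()` returns an empty string, while the read after `seek(0)` returns the full text again.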

In [58]:
## You code
## open your file "my_first_data.csv" in "r"ead mode
file = open("my_first_data.csv", "r")


In [59]:
## You code
## see what the readlines() command does
print(file.readlines())

['x,y,z\n', '1,2,1\n', '2,4,8\n', '3,6,27\n', '4,8,64\n']


In [60]:
## close the file
file.close()
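`readlines()` gives back a list with one string per line, newline characters included. One possible next step, sketched below, is to turn those strings into column names and rows of numbers. The list is copied from the output above so that the example runs on its own; the variable names `header` and `rows` are just illustrative choices.

```python
## the list returned by readlines(), copied from the output above
lines = ['x,y,z\n', '1,2,1\n', '2,4,8\n', '3,6,27\n', '4,8,64\n']

## strip() removes the trailing \n and split(",") breaks the line at commas
header = lines[0].strip().split(",")

rows = []
for line in lines[1:]:
    values = line.strip().split(",")
    ## the values are read in as strs, so cast them back to ints
    rows.append([int(value) for value in values])

print(header)
print(rows)
```

This should print `['x', 'y', 'z']` followed by `[[1, 2, 1], [2, 4, 8], [3, 6, 27], [4, 8, 64]]`.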

### Preferred Syntax

While we have been using a variable for the `file` object and then closing the object in a later code chunk, that is not the preferred syntax according to the python documentation.

It is good practice to use the `with` keyword when dealing with file objects. Let's see an example of this:

In [61]:
## first type with
## then type the open statement we have been using throughout
## then as whatever_name_you_want
## then a colon
with open("my_first_data.csv", "r") as file:
    ## indented, write the code for everything you want to do with your file
    print(file.readlines())
    
## NOTE! You do not need to write a .close statement with this syntax

['x,y,z\n', '1,2,1\n', '2,4,8\n', '3,6,27\n', '4,8,64\n']


This is preferred because once you exit the indented `with` block the file is automatically closed, even if the code inside the block raises an error.
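One way to convince yourself of this is to check the `closed` attribute of the `file` object after leaving a `with` block. The sketch below does so twice, once after a normal exit and once after an error raised on purpose. The scratch file `with_demo.txt` and the `ValueError` are assumptions made for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    ## with_demo.txt is a hypothetical scratch file
    path = os.path.join(tmp_dir, "with_demo.txt")

    ## a normal exit from the with block
    with open(path, "w") as file:
        file.write("some text\n")
    closed_after_normal_exit = file.closed

    ## an exit caused by an error inside the with block
    try:
        with open(path, "r") as file:
            raise ValueError("something went wrong inside the block")
    except ValueError:
        pass
    closed_after_error = file.closed

print(closed_after_normal_exit, closed_after_error)
```

Both values should be `True`, so the file was closed in each case without any call to `.close()`.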

--------------------------

This notebook was written for the Erdős Institute Cőde Data Science Boot Camp by Matthew Osborne, Ph. D., 2023.

Any potential redistributors must seek and receive permission from Matthew Tyler Osborne, Ph.D. prior to redistribution. Redistribution of the material contained in this repository is conditional on acknowledgement of Matthew Tyler Osborne, Ph.D.'s original authorship and sponsorship of the Erdős Institute as subject to the license (see License.md)