# File I/O

Name:

Date:

Learning Objectives:
By the end of this lesson, you should be able to:
1. Read and write data to and from a text file
2. Access numerical data inside of a binary-encoded file
3. Describe how data is written to and accessed from "pickle" files 

### Import modules required for this notebook

In [None]:
# import the struct and pickle modules


# Part 1: Reading and Writing Text Files

Reading data from and writing data to a text file are both done with the built-in `open` function.

### Reading Text Files
The `open` function takes a path to a file and an optional argument describing how the file will be opened. By default, `open` opens files in reading mode (`mode='r'`). Try this on some buoy data obtained from the [National Buoy Data Center](https://www.ndbc.noaa.gov/station_page.php?station=46269) for a site in Monterey Bay.

In [None]:
# open the buoy data file


# read all of the lines of the file


# close the file


# see how long the file is (how many characters are in the file)


# split the lines by the new line character to see how many lines there are


# print the first five lines
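
As a minimal sketch of the open, read, close sequence, the block below first writes a small stand-in file so that it runs anywhere. The file name `sample_buoy.txt` and its contents are made up for illustration and are not real buoy data.

```python
import os
import tempfile

# Build a small stand-in file so the example is self-contained.
# The name and contents are hypothetical, not the real buoy data.
sample_text = "#YY MM DD WVHT\n#yr mo dy m\n2022 01 01 1.5\n2022 01 02 2.1\n"
sample_path = os.path.join(tempfile.mkdtemp(), "sample_buoy.txt")
file_object = open(sample_path, "w")
file_object.write(sample_text)
file_object.close()

# open the file in the default reading mode (same as mode="r")
file_object = open(sample_path)

# read the whole file into a single string
contents = file_object.read()

# close the file once we are done with it
file_object.close()

# number of characters in the file
print(len(contents))

# split on the newline character to get the individual lines
lines = contents.split("\n")
print(len(lines))

# print the first two lines
for line in lines[:2]:
    print(line)
```

Because the file ends with a newline character, `split("\n")` returns one extra empty string at the end of the list.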


As can be seen above, the `file_object` must be opened, read, and then closed when finished. An open file holds on to operating-system resources, so it is generally not recommended to have many files open at the same time. The `with` keyword bundles the `open` and `close` steps together: it runs a block of commands while the file is open and then closes the file automatically. In other words:

In [None]:
# open the buoy data file using the with keyword
# the with block automatically closes the file


# see how long the file is (how many characters are in the file)


# split the lines by the new line character to see how many lines there are

# print the first five lines
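
The same pattern with the `with` keyword might look like the sketch below. It again uses a made-up stand-in file (`sample_buoy.txt` in a temporary directory) rather than the real buoy data.

```python
import os
import tempfile

# hypothetical stand-in file, created so the example is self-contained
sample_path = os.path.join(tempfile.mkdtemp(), "sample_buoy.txt")
with open(sample_path, "w") as file_object:
    file_object.write("line one\nline two\nline three\n")

# the with block closes the file automatically when the block ends
with open(sample_path) as file_object:
    contents = file_object.read()

# the file is already closed here, with no explicit close() call
print(file_object.closed)

# splitlines() splits on newlines without leaving a trailing empty string
lines = contents.splitlines()
print(len(contents), len(lines))
```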


### Writing Text Files
Writing text files is just as easy as reading them - just change the mode! For example, we may want to generate a quick readme note to include in our directory. We can do this as follows:

In [None]:
# define a string with a readme note

# output your note as a readme file
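
One way to write a short note is sketched below. The wording of the note and the file name `README.txt` are placeholders, and the file goes to a temporary directory so nothing in your working directory is overwritten.

```python
import os
import tempfile

# a placeholder readme note
readme_text = "This directory holds example files for the File I/O lesson.\n"

# hypothetical output location in a temporary directory
readme_path = os.path.join(tempfile.mkdtemp(), "README.txt")

# mode "w" creates the file, or overwrites it if it already exists
with open(readme_path, "w") as file_object:
    file_object.write(readme_text)

# read the note back to confirm what was written
with open(readme_path) as file_object:
    readme_check = file_object.read()
print(readme_check)
```

Note that mode `"w"` replaces any existing contents; mode `"a"` appends to the end of the file instead.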


When writing text to a text file, we can format our text and name our files in such a way that other programs can interpret the data. One of the clearest examples of this is data stored in a comma-separated value (CSV) format. These types of files can be read by Excel and many other programs. Let's try this ourselves:

In [None]:
# make a header line that has columns for year, month and day, separated by commas


# make two data lines with days of the year


# combine the header and the data lines, separated by a newline character


# write the output to a file


# open the file in a spreadsheet program - what does it look like?
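
A small sketch of building and writing comma-separated text follows. The two dates and the file name `dates.csv` are arbitrary examples chosen for illustration.

```python
import os
import tempfile

# header line with one column each for year, month, and day
header = "year,month,day"

# two data lines (arbitrary example dates)
data_lines = ["2022,1,15", "2022,7,4"]

# combine the header and data lines, separated by newline characters
output = "\n".join([header] + data_lines)

# write the output to a .csv file (hypothetical name, temporary directory)
csv_path = os.path.join(tempfile.mkdtemp(), "dates.csv")
with open(csv_path, "w") as file_object:
    file_object.write(output)

# read the file back and split each row at the commas
with open(csv_path) as file_object:
    rows = [line.split(",") for line in file_object.read().split("\n")]
print(rows)
```

For data that may itself contain commas or quotation marks, Python's built-in `csv` module handles the quoting for you.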

### 🤔 Mini-Exercise
Goal: Write a text file that describes the ocean conditions when the waves in Monterey Bay were biggest during the year 2022. 

Read in the wave data and search through the lines for the one with the biggest waves (`WVHT`, column 9). Then, write out a text file that includes the two header lines and the line with the biggest waves. Store your lines, with each component separated by commas, in a file called "Monterey Bay Biggest Waves 2022.csv".

In [None]:
# open the buoy file and read the lines


# split the lines at the new line indicator


# store the first two lines in variables "header" and "units"


# loop through the remainder of the lines and find the one with the biggest waves


# make an output string combining the lines for header, units, and the biggest wave day

# save as a csv with the file name Monterey Bay Biggest Waves 2022.csv


# Part 2: Working with binary data
In many older programs, data is often stored in binary files. While this practice is waning, it's still a good idea to be able to read and write data in binary format.

### Writing integer data to binary
To write a list of integers in binary format, use `bytearray` to convert the list of numbers to bytes (each value must be between 0 and 255), and open the file in `wb` mode to write in binary:

In [None]:
# declare a list of integer values

# convert the list to a bytearray

# open a file_object to write as a binary file

# write array into the file

# close the file
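
The steps above could be sketched as follows. The integer values and the file name `integers.bin` are arbitrary choices for illustration.

```python
import os
import tempfile

# declare a list of integer values (each must fit in one byte: 0 to 255)
values = [3, 14, 15, 92, 65]

# convert the list to a bytearray
byte_values = bytearray(values)

# open a file object for writing in binary mode
# (hypothetical file name, written to a temporary directory)
bin_path = os.path.join(tempfile.mkdtemp(), "integers.bin")
file_object = open(bin_path, "wb")

# write the bytes into the file
file_object.write(byte_values)

# close the file
file_object.close()

# each value takes exactly one byte on disk
file_size = os.path.getsize(bin_path)
print(file_size)
```

A value outside the 0 to 255 range, such as 300, would make `bytearray` raise a `ValueError`, which is why larger numbers need the `struct` approach described later in this part.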


### Reading integer data from a binary file
When reading from a binary file, be sure to use the `rb` mode.

In [None]:
# open the binary file for reading

# read the contents from the file 

# close the file

# print the lines

# the list command will return back the list, converting from a binary array
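
A sketch of reading the integers back is shown below. To keep it self-contained, it first writes a small file of its own; the file name `integers.bin` and the values are made up for illustration.

```python
import os
import tempfile

# write a small binary file first so there is something to read
bin_path = os.path.join(tempfile.mkdtemp(), "integers.bin")
with open(bin_path, "wb") as file_object:
    file_object.write(bytearray([3, 14, 15, 92, 65]))

# open the binary file for reading
file_object = open(bin_path, "rb")

# read the contents; in "rb" mode this returns a bytes object, not a string
contents = file_object.read()

# close the file
file_object.close()

# printing shows the raw bytes
print(contents)

# list() converts the bytes back into a list of integers
values = list(contents)
print(values)
```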

### Writing float data to a binary file
The `struct` module provides a mechanism by which various types of data can be encoded in binary and packed into a file. It's very uncommon to write out data in this format from Python, but it's common to receive such files from other programs. Here, we'll write a binary file with float values and then examine how we can read back the data.

In [None]:
# make up a list of 4 float values for testing

# define the format for output - either f or d (for float or double)

# define the output using the struct.pack method.
# the first character is the number of elements, the second is the format type

# write the binary data to a file called float.bin
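
One possible version of this cell is sketched below. The four values are arbitrary (chosen so that they are stored exactly in single precision), and `float.bin` is written to a temporary directory here.

```python
import os
import struct
import tempfile

# a list of 4 float values for testing
values = [1.5, 2.25, -3.0, 4.125]

# format character: "f" for 4-byte floats, "d" for 8-byte doubles
value_format = "f"

# build the format string: the count comes first, then the type ("4f")
pack_format = str(len(values)) + value_format

# pack the values into a bytes object
packed = struct.pack(pack_format, *values)
print(len(packed))

# write the binary data to a file called float.bin
bin_path = os.path.join(tempfile.mkdtemp(), "float.bin")
with open(bin_path, "wb") as file_object:
    file_object.write(packed)
```

With the `"f"` format each value occupies 4 bytes, so the packed data here is 16 bytes long; switching to `"d"` would double that.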


As you can see, when packing data into the binary file, the format must be specified. For a full list of different format types, see the [struct documentation page](https://docs.python.org/3/library/struct.html).

### Reading float data from a binary file
Reading structured data from a binary file is similar to writing - you need to know the record format as well as the number of items in the file.

In [None]:
# define the record format, identical to the code block above


# use the struct.calcsize function to determine the size of this format, in bytes

# make an empty list of values to keep track of during the file reading


# open the float.bin file and read in the 4 values we wrote previously

    # loop through the 4 values, reading a portion of size record_size


        # store the binary value into a variable


        # use the struct.unpack method to convert the binary representation to a float
        # then, add the value to the list
        
# print out the values
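
The reading loop might be sketched as follows. The block writes its own four-value `float.bin` into a temporary directory first, so it does not depend on any earlier cell; the values are arbitrary.

```python
import os
import struct
import tempfile

# record format for a single value: "f" is a 4-byte float
record_format = "f"

# size of one record, in bytes
record_size = struct.calcsize(record_format)

# write four test values first so the example is self-contained
bin_path = os.path.join(tempfile.mkdtemp(), "float.bin")
with open(bin_path, "wb") as file_object:
    file_object.write(struct.pack("4" + record_format, 1.5, 2.25, -3.0, 4.125))

# empty list to collect the values as they are read
values = []

with open(bin_path, "rb") as file_object:
    # loop through the 4 values, reading record_size bytes each time
    for _ in range(4):
        # store the raw bytes for one value
        record = file_object.read(record_size)

        # struct.unpack always returns a tuple, so take the first element
        value = struct.unpack(record_format, record)[0]
        values.append(value)

# print out the values
print(values)
```

Each call to `read(record_size)` advances the position in the file, which is why the loop picks up the values one after another.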


### 🤔 Mini-Exercise
Goal: Read in data from a binary file given information about its contents

A scientist using an old and outdated coding language has passed you a data file. They mention that the file has two columns of double-precision data, and each column has 119 rows. Read the file contents into two lists. Then, print the first and last values from each list (decoded from binary).

In [None]:
# define the record format for double-precision values


# use the struct.calcsize function to determine the size of this format, in bytes

# make two lists to store the data


# open the data_file file and read in the values into the 2 lists

    # loop through the 119 rows, reading two values from each row, with each value of size record_size 
    

        # read the first value, and store into a binary value
        

        # use the struct.unpack method to convert the binary representation to a float
        # and add to the first list
        
        # read the second value, and store into a binary value
        
        # use the struct.unpack method to convert the binary representation to a float
        # and add to the second list

# print the first and last entries in each list



Hint: The code block above may be very useful on Homework 5

# Part 3: Pickling Files

Pickle files are an extremely flexible data storage type unique to Python. As the name suggests, you can treat a pickle file like a pickle jar - it can hold almost any Python object (and, to continue the analogy, it preserves that data until the jar is opened). For example, you can create a dictionary and store it in a pickle file:

In [None]:
# make a dictionary for the days in a month

# store the dictionary in a pickle file
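
A minimal sketch of storing a dictionary with `pickle.dump` is shown below. The dictionary is shortened to three months, and the file name `days_in_month.pickle` is a made-up example.

```python
import os
import pickle
import tempfile

# a dictionary for the days in a month (shortened for illustration)
days_in_month = {"January": 31, "February": 28, "March": 31}

# hypothetical file name, written to a temporary directory
pickle_path = os.path.join(tempfile.mkdtemp(), "days_in_month.pickle")

# pickle files hold binary data, so open the file in "wb" mode
with open(pickle_path, "wb") as file_object:
    pickle.dump(days_in_month, file_object)

# the file now exists and holds the encoded dictionary
print(os.path.getsize(pickle_path) > 0)
```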


Then, reading from the pickle file is similar to writing:

In [None]:
# load the dictionary back in from the pickle file

# print the dictionary
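
Loading works the same way in reverse, with `pickle.load` and the `rb` mode. The sketch below stores a small dictionary first so that it can run on its own; the file name is again a placeholder.

```python
import os
import pickle
import tempfile

# store a small dictionary first so there is something to load
days_in_month = {"January": 31, "February": 28, "March": 31}
pickle_path = os.path.join(tempfile.mkdtemp(), "days_in_month.pickle")
with open(pickle_path, "wb") as file_object:
    pickle.dump(days_in_month, file_object)

# load the dictionary back in from the pickle file
with open(pickle_path, "rb") as file_object:
    loaded_days = pickle.load(file_object)

# print the dictionary
print(loaded_days)

# the loaded object is an equal copy, not the original object itself
print(loaded_days == days_in_month, loaded_days is days_in_month)
```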


However, to continue the analogy, a pickle jar does not need to hold only cucumbers. In other words, you can put a mix of different types of objects in a single pickle file:

In [None]:
# make a list of the years 2000 to 2020

# make a string to describe the contents

# write the string, year, and dict to the pickle file called date_data.pickle


Then, you can read the objects back in the same order they were written:

In [None]:
# load the objects back in from the pickle file

# print the objects
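
Writing and reading several objects could be sketched as follows, assuming each object is stored with its own `pickle.dump` call. The description string and the shortened dictionary are placeholders; `date_data.pickle` is written to a temporary directory here.

```python
import os
import pickle
import tempfile

# a list of the years 2000 to 2020 (range excludes its end point)
years = list(range(2000, 2021))

# a string to describe the contents (placeholder wording)
description = "Years 2000-2020 and the number of days in each month"

# a shortened days-in-month dictionary for illustration
days_in_month = {"January": 31, "February": 28, "March": 31}

# write the string, the years, and the dictionary, one dump call per object
pickle_path = os.path.join(tempfile.mkdtemp(), "date_data.pickle")
with open(pickle_path, "wb") as file_object:
    pickle.dump(description, file_object)
    pickle.dump(years, file_object)
    pickle.dump(days_in_month, file_object)

# each load call returns the next object, in the order they were written
with open(pickle_path, "rb") as file_object:
    loaded_description = pickle.load(file_object)
    loaded_years = pickle.load(file_object)
    loaded_days = pickle.load(file_object)

# print the objects
print(loaded_description)
print(loaded_years)
print(loaded_days)
```

An alternative design is to bundle the objects into one list or dictionary and store that with a single `pickle.dump` call, so the reader does not need to know how many objects to load.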