# json and pickle files

Today we're talking about saving Python objects like dictionaries, lists and strings.

If you haven't already found a need to save Python objects, you will.

If your Jupyter notebook is getting really long and slow, save that dictionary that you carefully built, shut down your kernel, and continue your analysis in another notebook. 

My general advice is, at a minimum, to keep data collection, cleaning, analysis, and visualization in different notebooks. It is also often good to separate different analyses, especially if they produce large objects that are held in memory, as these will slow down your notebook.

### <br><br>Saving and loading objects using built-in Python functions

You might already save and load dictionaries and lists with your own code by writing text files, and then opening and parsing them.

For example:

In [None]:
speed_dict = {"Peregrine falcon": "242 mph", 
              "Golden eagle": "150–200 mph", 
              "White-throated needletail swift": "105 mph", 
              "Eurasian hobby": "100 mph", 
              "Mexican free-tailed bat": "100 mph", 
              "Frigatebird": "95 mph", 
              "Rock dove (pigeon)": "92.5 mph", 
              "Spur-winged goose": "88 mph", 
              "Gyrfalcon": "80 mph", 
              "Grey-headed albatross": "79 mph", 
              "Cheetah": "68.0–75.0 mph", 
              "Sailfish": "67.85 mph", 
              "Anna's hummingbird": "61.06 mph", 
              "Swordfish": "60 mph", 
              "Pronghorn": "55.0 mph", 
              "Springbok": "55 mph", 
              "Quarter Horse": "55.0 mph", 
              "Blue wildebeest": "50.0 mph", 
              "Lion": "50.0 mph", 
              "Blackbuck": "50 mph"}

In [None]:
with open("fastestAnimals.txt", "w") as f:
    for k, v in speed_dict.items():
        f.write(k + "\t" + v + "\n")

<br><br>I often use a list comprehension or dictionary comprehension to change .txt files into a list or dictionary.

In [None]:
with open("fastestAnimals.txt", "r") as f:
    txt_speed_dict = {line.split("\t")[0]: line.rstrip("\n").split("\t")[1] for line in f}

In [None]:
txt_speed_dict

<br><br>**Do you have to write your own code?**

### <br><br>Serializing Python objects

The process of converting a Python object into a format that can be stored is called *serialization*. The process of reconstructing that data is called *deserialization*.
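
To make the two terms concrete, here is a minimal sketch of a round trip that stays in memory instead of writing a file. It uses the standard-library `json` module, which is introduced later in this lesson. The `record` dictionary is made-up example data, not part of the lesson's dataset.

```python
import json

# Made-up example data for illustration only
record = {"animal": "Cheetah", "top_speeds_mph": [68.0, 75.0]}

# Serialization: Python object -> text that could be stored in a file
serialized = json.dumps(record)
print(type(serialized))
print(serialized)

# Deserialization: stored text -> a new Python object with the same contents
restored = json.loads(serialized)
print(restored == record)
```

The restored object is equal to the original but is a separate copy, which is exactly what you get when you reload saved data in a different notebook.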

We will cover two different file formats for serialization: pickle and json. First let's go over the differences between the two and why you might want to use one or the other.

#### <br><br>pickle

- file names usually end in .pkl
- stores objects in binary format
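- can be used to save most Python objects, including instances of your own classes. Classes and functions are stored by name rather than by their code, so they must be importable when you load the file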
- pickle objects can only be opened in Python
- is binary, so is not human readable - you can't open the files and read the data
- WARNING: loading a pickle file can run arbitrary code, so a malicious file can do damage as soon as you open it!!! So never ever open pickle files you receive from someone else. Only open your own pickle files, and only if no one else has had access to them.
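
As a rough illustration of the points above, the sketch below pickles a dictionary containing a tuple, a set, a date, and an integer key, then restores it. The `observation` data, including the date, is invented for this example.

```python
import pickle
from datetime import date

# Invented example data mixing several Python types
observation = {
    "species": "Peregrine falcon",
    "speed_range_mph": (200, 242),       # tuple
    "habitats": {"coast", "tundra"},     # set
    "recorded_on": date(2020, 1, 15),    # hypothetical date
    1: "integer key",                    # non-string dictionary key
}

# pickle.dumps returns bytes, which is why pickle files are not human readable
blob = pickle.dumps(observation)
print(type(blob))

# Only load bytes that you created yourself (see the warning above)
restored = pickle.loads(blob)
print(restored == observation)
print(type(restored["speed_range_mph"]))
```

The tuple comes back as a tuple and the integer key stays an integer, because pickle records the Python type of each object.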

#### <br><br>json

- ends in .json
- can be used to store strings, integers, floats, booleans, `None`, lists, and dictionaries with string keys. Tuples are saved as lists. Cannot store classes or functions
- can be opened in other languages
- is human readable - you can look at the actual file and read your data
- loading a json file only reads data and never runs code, so it does not carry pickle's security risk
- read and write speed compared with pickle depends on the data, so time both if performance matters
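
The type limits listed above can be seen in a short sketch. The `sample` dictionary is invented for this example. Note what happens to the tuple and to the integer key after a round trip, and that a set cannot be saved at all.

```python
import json

# Invented example data for illustration only
sample = {
    "name": "Cheetah",
    "speed_range_mph": (68.0, 75.0),   # tuple
    "is_bird": False,
    "notes": None,
    1: "integer key",                  # non-string dictionary key
}

text = json.dumps(sample)
print(text)

restored = json.loads(text)
print(type(restored["speed_range_mph"]))   # the tuple came back as a list
print(list(restored.keys()))               # the key 1 came back as the string "1"

# Types that json does not know, such as sets, raise a TypeError
set_failed = False
try:
    json.dumps({"habitats": {"coast", "tundra"}})
except TypeError as err:
    set_failed = True
    print("TypeError:", err)
```

If you need tuples, sets, or non-string keys to survive exactly as they were, pickle is the simpler choice. Otherwise, convert them yourself after loading.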

### <br><br>Let's practice with pickle.

In [None]:
speed_dict

<br>First, we import the pickle module.

In [None]:
import pickle

<br>To write our dictionary to a pickle file, we open the file using a with statement, just as we did for the text file, but we have to open it in **write binary mode** (`"wb"`).

Inside the with block, we **dump** the data into the pickle file.

In [None]:
with open("fastestAnimals.pkl", "wb") as f:
    pickle.dump(speed_dict, f)

<br>To load a pickle file, we use the same syntax, except we open the file in **read binary mode** and we **load** the data.

In [None]:
with open("fastestAnimals.pkl", "rb") as f:
    pkl_speed_dict = pickle.load(f)

In [None]:
pkl_speed_dict
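
It can be worth confirming that what you loaded matches what you saved. The sketch below does this for a small stand-in dictionary, writing to a temporary folder so it leaves no files behind. The file name `check.pkl` and the variable names are placeholders chosen for this example.

```python
import pickle
import tempfile
from pathlib import Path

# Small stand-in for speed_dict so this cell runs on its own
original = {"Lion": "50.0 mph", "Blackbuck": "50 mph"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "check.pkl"   # placeholder file name

    # Save in write binary mode
    with open(path, "wb") as f:
        pickle.dump(original, f)

    # Load in read binary mode
    with open(path, "rb") as f:
        loaded = pickle.load(f)

print(loaded == original)   # same contents
print(loaded is original)   # but a separate object
```

In your own notebook, the equivalent check would be `pkl_speed_dict == speed_dict`.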

### <br><br>Pickle Exercise

In [None]:
speed_list = ["Peregrine falcon", 
              "Golden eagle", 
              "White-throated needletail swift", 
              "Eurasian hobby", 
              "Mexican free-tailed bat", 
              "Frigatebird", 
              "Rock dove (pigeon)", 
              "Spur-winged goose", 
              "Gyrfalcon", 
              "Grey-headed albatross", 
              "Cheetah", 
              "Sailfish", 
              "Anna's hummingbird", 
              "Swordfish", 
              "Pronghorn", 
              "Springbok", 
              "Quarter Horse", 
              "Blue wildebeest", 
              "Lion", 
              "Blackbuck"]

Run the code above to define `speed_list`. Then write code to save `speed_list` as a pickle file called `speedyAnimals.pkl`.

Now write code to load the pickle file you just created into a list called `pickle_list`.

In [None]:
print(pickle_list)

<br><br>Notice that with both the dictionary and the list, pickle automatically maintained the type of Python object without us needing to specify what type of object it was. 

### <br><br>Let's practice with json.

In [None]:
import json

<br>With json, we can open files in regular (text) write and read modes. We still use **dump** and **load**.

In [None]:
with open("fastestAnimal.json", "w") as f:
    json.dump(speed_dict, f)

In [None]:
with open("fastestAnimal.json", "r") as f:
    json_speed_dict = json.load(f)

In [None]:
json_speed_dict
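
If you open a json file in a text editor, you may notice two things: everything is on one line, and non-ASCII characters such as the en dash in "150–200 mph" appear as escape codes. The sketch below shows two optional `json` arguments, `indent` and `ensure_ascii`, that change how the text looks without changing the data. `small_dict` is a two-entry stand-in for `speed_dict`.

```python
import json

# Two-entry stand-in for speed_dict so this cell runs on its own
small_dict = {"Golden eagle": "150–200 mph", "Cheetah": "68.0–75.0 mph"}

# Default output: one line, with the en dash written as an escape code
default_text = json.dumps(small_dict)
print(default_text)

# indent adds line breaks; ensure_ascii=False keeps the en dash as-is
pretty_text = json.dumps(small_dict, indent=2, ensure_ascii=False)
print(pretty_text)

# Both versions load back to the same dictionary
print(json.loads(default_text) == json.loads(pretty_text) == small_dict)
```

The same arguments work with `json.dump` when writing to a file. If you use `ensure_ascii=False`, it is a good idea to open the file with `encoding="utf-8"` so the characters are written the same way on every system.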

### <br><br>Json Exercise

In [None]:
falcon = "Peregrine falcons are among the world's most common birds of prey and live on all continents except Antarctica. They prefer wide-open spaces, and thrive near coasts where shorebirds are common, but they can be found everywhere from tundra to deserts. Peregrines are even known to live on bridges and skyscrapers in major cities."
print(falcon)

Run the code above to store the string `falcon`. Write code to save the string as a json file called `falcon_info.json`. 

Write code to load the file you just created into a string called `json_string`.

In [None]:
print(json_string)