There are many ways to save the data you've collected to a file on your computer. When working on your final projects, you'll probably want to compartmentalize the code you use to *get* the data, the code you use to *process* it, and the code you use to *analyze* it. To accomplish this, you'll need a way to save your data *as files* so you have a persistent form of the data that you can pass from one piece of code to another (or from one person to another, if you're working in a group).

In this tutorial, we'll cover one of the easier ways to save and load data, using a module called ``pickle``. Later in the course, we'll learn how to use the data-structure library ``pandas``, which will provide us with more options.

First, we need some data. In homage to the first study we discussed in this class, let's pull some tweets from the accounts of Clinton and Trump. 

In [1]:
API_KEY = ""
API_SECRET = ""

In [2]:
import tweepy
auth = tweepy.AppAuthHandler(API_KEY, API_SECRET)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

In [3]:
clinton_tweets = []
trump_tweets = []

for status in tweepy.Cursor(api.user_timeline, id="HillaryClinton").items(100):
    clinton_tweets.append((status.text, status.favorite_count, status.retweet_count, status.source))
    
for status in tweepy.Cursor(api.user_timeline, id="realDonaldTrump").items(100):
    trump_tweets.append((status.text, status.favorite_count, status.retweet_count, status.source))

In [4]:
print(clinton_tweets[0])
print("*"*50)
print(trump_tweets[0])

("We'll never forget the horror of September 11, 2001. Today, let's honor the lives and tremendous spirit of the victims and responders. -H", 32987, 12418, 'TweetDeck')
**************************************************
('#NeverForget\nhttps://t.co/G5TMAUzy0z', 22270, 7784, 'Twitter for iPhone')


We've got our data: 100 tweets each from Clinton and Trump. For each tweet we have four features: text, favorite count, retweet count, and source. Now we want to save it. Once it's saved, we'll be able to load this data in whatever other programs we write to process and analyze the text, and we won't have to run this data-collection code again. How do we do that?

I'm going to use the module ``pickle``, which saves our data in a binary file on our computer. Let's import the module; it's part of Python's standard library, so it's already included with Anaconda. I'll use the alias ``pkl`` to type less.

In [5]:
import pickle as pkl

Here's how you save your data as a file:

```
pkl.dump([the variable you want to save], open([the name of the file to save to], "wb"))
```

Pay close attention to the second argument of the ``open`` function: ``wb``. This means WRITE, BINARY.

I'll save the Clinton tweets to ``clinton.pkl`` and the Trump tweets to ``trump.pkl``.

In [6]:
pkl.dump(clinton_tweets, open("clinton.pkl", "wb"))
pkl.dump(trump_tweets, open("trump.pkl", "wb"))
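
The one-liners above leave it to Python to close the file at some later point. A slightly more defensive variant, sketched here, wraps ``open`` in a ``with`` block so the file is flushed and closed as soon as the dump finishes. The sample tuples and the file name ``sample_tweets.pkl`` are made-up stand-ins for illustration, not real collected tweets.

```python
import os
import pickle as pkl
import tempfile

# Made-up stand-in data in the same shape as our tweets:
# (text, favorite_count, retweet_count, source)
sample_tweets = [
    ("first example tweet", 10, 2, "Twitter for iPhone"),
    ("second example tweet", 25, 7, "TweetDeck"),
]

# Write into a temporary folder so we don't clutter the working directory.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "sample_tweets.pkl")

# The with block closes the file for us, even if pkl.dump raises an error.
with open(out_path, "wb") as f:
    pkl.dump(sample_tweets, f)

# A non-empty file on disk tells us the dump actually wrote something.
print(os.path.getsize(out_path) > 0)
```

Both styles produce the same file; the ``with`` form just makes it explicit when the file is closed.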

Now I'm going to use some Python code to delete the ``clinton_tweets`` and ``trump_tweets`` variables from memory.

In [7]:
del clinton_tweets
del trump_tweets

The data we collected is gone, vanished.

In [8]:
clinton_tweets

NameError: name 'clinton_tweets' is not defined

No problem though! We were conscientious enough to save our data to a file. All we have to do is load it. This is the way you do so:

```
pkl.load(open([the name of the file to load], "rb"))
```

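Look at the second argument of the ``open`` function, ``rb``. This means READ, BINARY. I'm really emphasizing this because I've personally, repeatedly made the mistake of using ``rb`` when I'm WRITING a pickle and ``wb`` when READING a pickle. The second mistake is the costly one: opening an existing file with ``wb`` erases its contents immediately, before anything is read. Avoid this at all costs.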
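
To see why mixing up the modes matters, here is a small sketch of what happens when a pickle file is opened with ``wb`` by mistake. It uses a throwaway file named ``demo.pkl`` in a temporary folder and a made-up list, both hypothetical, so none of the real tweet files are touched.

```python
import os
import pickle as pkl
import tempfile

# Hypothetical file holding a small made-up list, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "demo.pkl")
with open(path, "wb") as f:
    pkl.dump(["some", "precious", "data"], f)
size_before = os.path.getsize(path)

# The mistake: opening the file with "wb" when we meant to read it.
# Just opening it in write mode truncates the file to zero bytes.
open(path, "wb").close()
size_after = os.path.getsize(path)

# Loading the now-empty file fails because there is nothing left to read.
try:
    with open(path, "rb") as f:
        pkl.load(f)
    load_failed = False
except EOFError:
    load_failed = True

print(size_before > 0, size_after, load_failed)
```

The file still exists afterwards, but it is empty, so the only way to get the data back is to collect it again.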

In [9]:
new_clinton = pkl.load(open("clinton.pkl", "rb"))
new_trump = pkl.load(open("trump.pkl", "rb"))

Our data is back.

In [10]:
print(new_clinton[0])
print("*"*50)
print(new_trump[0])

("We'll never forget the horror of September 11, 2001. Today, let's honor the lives and tremendous spirit of the victims and responders. -H", 32987, 12418, 'TweetDeck')
**************************************************
('#NeverForget\nhttps://t.co/G5TMAUzy0z', 22270, 7784, 'Twitter for iPhone')
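
Printing the first item is a quick sanity check. If you'd rather confirm the whole round trip programmatically, one possible sketch is to compare the loaded object against the original before deleting it. The data and the file name ``roundtrip.pkl`` below are made up for illustration.

```python
import os
import pickle as pkl
import tempfile

# Made-up stand-in data in the same shape as our tweets.
original = [("example tweet text", 5, 1, "TweetDeck")]

path = os.path.join(tempfile.mkdtemp(), "roundtrip.pkl")
with open(path, "wb") as f:
    pkl.dump(original, f)

with open(path, "rb") as f:
    restored = pkl.load(f)

# The restored object is an equal copy, not the same object in memory.
print(restored == original)
print(restored is original)
```

One caution worth keeping in mind: loading a pickle can run arbitrary code, so only load ``.pkl`` files that you or your group members created.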
