# Data Structure Representations 

  * When your blog entries are stored in a variable, they live in the computer's memory.
  * If your program exits, that data is lost.
  * To keep your program's data, you have to save it.
    * To do that you must pick a representation.
  * JavaScript Object Notation (JSON) is a popular data format.
  * JSON is well supported by Python.
  * You can easily save most Python data using JSON.

Here's how to use JSON in a program:

In [None]:
import json

post1 = {
    'title': 'First post!',
    'text': "This is my first post to my new blog.",
}
post2 = {
    'title': 'Ate Cereal for Breakfast.',
    'text': "I ate cereal today they were Heritage O's. High in fiber.",
}
bloggers = {
    'mike': {
        'name': 'Mike Matera',
        'email': 'matera@matera.com',
        'posts': [post1, post2],
    }
}

json.dumps(bloggers)

  * JSON data is very similar to how data is represented using Python literals 
  * The `dumps()` function converts Python data into a JSON string 
  * The `loads()` function does the opposite.

In [None]:
json_string = '{"mike": {"name": "Mike Matera", "email": "matera@matera.com", "posts": [{"title": "First post!", "text": "This is my first post to my new blog."}, {"title": "Ate Cereal for Breakfast.", "text": "I ate cereal today they were Heritage O\'s. High in fiber."}]}}'
data = json.loads(json_string)
print (data)
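
The round trip through `dumps()` and `loads()` is not always exact. A minimal sketch, using made-up sample data (the `record` dictionary is an illustration, not part of the blog program), shows two conversions worth knowing about: tuples come back as lists, and non-string dictionary keys come back as strings. Types with no JSON equivalent, such as `set`, raise a `TypeError`.

```python
import json

# Hypothetical sample data, chosen to show what JSON changes.
record = {'tags': ('python', 'json'), 1: 'integer key'}

restored = json.loads(json.dumps(record))
print(restored)

# Tuples become lists and the integer key becomes the string '1'.
tuple_became_list = restored['tags'] == ['python', 'json']
key_became_string = '1' in restored and 1 not in restored

# A set has no JSON equivalent, so dumps() refuses it.
try:
    json.dumps({'tags': {'python', 'json'}})
    set_rejected = False
except TypeError:
    set_rejected = True
```

For the blog data this does not matter, because it only uses dictionaries with string keys, lists, and strings.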

  * We can add functions to our blogging program that allow users to load and store the blog database. 

Here's a function that loads the blog database:

In [None]:
def load_blogs(filename):
    """Load the blog database from a JSON file."""
    with open(filename) as f:
        return json.load(f)

# Assumes blog.json already exists in the current directory.
blog_data = load_blogs('blog.json')
print(blog_data)

And a corresponding function that saves blogs:

In [None]:
def save_blogs(data, filename):
    """Save the blog database to a JSON file."""
    with open(filename, 'w') as f:
        json.dump(data, f)
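
As a quick check that the two functions agree with each other, the sketch below saves a small database and loads it back. It redefines both functions so that it runs on its own, uses `json.dump()` and `json.load()` (the file-based counterparts of `dumps()` and `loads()`), and writes to a temporary directory. The sample data and the file name are placeholders.

```python
import json
import os
import tempfile

def save_blogs(data, filename):
    """Write the blog database to filename as JSON."""
    with open(filename, 'w') as f:
        json.dump(data, f, indent=2)

def load_blogs(filename):
    """Read the blog database back from a JSON file."""
    with open(filename) as f:
        return json.load(f)

# Placeholder data standing in for the real blog database.
sample = {'mike': {'name': 'Mike Matera', 'posts': [{'title': 'First post!'}]}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'blog.json')
    save_blogs(sample, path)
    loaded = load_blogs(path)

print(loaded == sample)
```

The `indent=2` argument is optional; it only makes the saved file easier for people to read.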

# JSON and The Web 

  * JSON is used by many websites as a part of their official Application Programming Interface (API) 
  * An API is a way for a program to access a website 
  * APIs make it much easier for your program to get useful data. 
  * An API is reached through a special URL called an *endpoint*.

See what happens when you browse to Wikipedia's endpoint: 

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=Programming

Hard to read for humans but easy for Python! Here's code that makes the data available to a Python program:

In [None]:
import requests
import json
response = requests.get('https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=Programming')

print ('Status code:', response.status_code)

data = json.loads(response.text)
print (data)

  * The structure of the response can sometimes be a bit complicated. 
  * JSON APIs have to be flexible enough to handle huge responses.
    * When the response is large you need to do paging (fetching only a few results at a time).
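
Paging usually works as a loop: ask for one page, process it, and ask again until the server says there is nothing left. The sketch below illustrates that loop without touching the network. The `fetch_page()` function, its `offset` and `limit` parameters, and the `next_offset` key are all hypothetical stand-ins; each real API defines its own names for these.

```python
# Pretend server-side data: 7 results that the "API" hands out in pages.
ALL_RESULTS = [f'result-{n}' for n in range(7)]

def fetch_page(offset, limit=3):
    """Hypothetical stand-in for one API request returning one page."""
    page = {'items': ALL_RESULTS[offset:offset + limit]}
    if offset + limit < len(ALL_RESULTS):
        # Tell the caller where the next page starts.
        page['next_offset'] = offset + limit
    return page

def fetch_all():
    """Keep requesting pages until the response has no 'next_offset'."""
    items = []
    offset = 0
    while offset is not None:
        page = fetch_page(offset)
        items.extend(page['items'])
        offset = page.get('next_offset')
    return items

results = fetch_all()
print(len(results), 'results fetched')
```

In a real program `fetch_page()` would make an HTTP request and the continuation marker would come from the JSON response.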

Let's take a look at the contents of the response above:

In [None]:
import requests
import json
response = requests.get('https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=Programming')

print ('Status code:', response.status_code)

data = json.loads(response.text)

print ('Keys in the data:')
for key in data : 
  print ("\t", key)

print ('Keys under "query"')
for key in data['query'] : 
  print ("\t", key)

print ('Keys under "query/pages"')
for key in data['query']['pages'] : 
  print ("\t", key)

# Page IDs can change, so look the ID up instead of hard-coding it.
page_id = next(iter(data['query']['pages']))
page = data['query']['pages'][page_id]

print (f'Keys under "query/pages/{page_id}"')
for key in page : 
  print ("\t", key)

print ('Page ID:', page['pageid'])
print ('Page Title:', page['title'])
print ('Page Extract:', page['extract'])
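
Printing keys one level at a time gets tedious for deeply nested responses. One alternative, sketched below, is a small recursive helper that lists the whole key structure. The `describe()` function and the `sample` dictionary are illustrations: `sample` only mimics the shape of the response above, with placeholder values rather than real Wikipedia data.

```python
def describe(data, depth=0):
    """Return one line per key showing how a parsed JSON value is nested."""
    lines = []
    indent = '  ' * depth
    if isinstance(data, dict):
        for key, value in data.items():
            lines.append(f'{indent}{key}: {type(value).__name__}')
            lines.extend(describe(value, depth + 1))
    elif isinstance(data, list):
        for index, value in enumerate(data):
            lines.append(f'{indent}[{index}]: {type(value).__name__}')
            lines.extend(describe(value, depth + 1))
    return lines

# Placeholder values; only the nesting resembles the real response.
sample = {
    'query': {
        'pages': {
            '12345': {'pageid': 12345, 'title': 'Example', 'extract': '...'}
        }
    }
}

print('\n'.join(describe(sample)))
```

Passing the parsed response to `describe()` instead of `sample` would show the same kind of outline for the live data.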


## APIs in Practice 

  * Many sites offer an API.
  * Some require authentication and some don't 
  * APIs are often self-describing
    * They tell you what you can ask for

Here's an example of using the GitHub API:

In [None]:
import requests
import json

response = requests.get('https://api.github.com/')
data = json.loads(response.text)

for endpoint in data : 
  print (f'name: {endpoint} url: {data[endpoint]}')

user_response = requests.get(data['user_url'].format(user='mike-matera'))
user_data = json.loads(user_response.text)
for key in user_data : 
  print (f'key: {key} value: {user_data[key]}')
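
The `user_url` entry is used as a URL template: a string with a `{user}` placeholder that `str.format()` fills in. A minimal sketch of that step follows. It uses a made-up template on `api.example.com` rather than a real endpoint, and the `expand()` helper is hypothetical; it percent-encodes each value so that unusual characters cannot break the URL.

```python
from urllib.parse import quote

# Hypothetical template in the same style as the API's 'user_url' entry.
template = 'https://api.example.com/users/{user}'

def expand(template, **values):
    """Fill in a URL template, percent-encoding each value first."""
    safe = {name: quote(str(value), safe='') for name, value in values.items()}
    return template.format(**safe)

print(expand(template, user='mike-matera'))
print(expand(template, user='name with spaces'))
```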

## APIs That Require Authentication 

  * Many sites require you to use the API as a registered user.
  * You should **NEVER** send your password through an API.
  * As a registered user you can obtain an *access key*.
    * Access keys are secrets that let the site know who you are.
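
Because access keys are secrets, it helps to keep them out of your source code. One common approach, sketched here, is to read them from environment variables. The variable names `BLOG_API_KEY` and `BLOG_API_SECRET` and the `load_credentials()` helper are hypothetical; use whatever names suit your project.

```python
import os

def load_credentials(env=os.environ):
    """Return (key, secret) from the environment, or raise if one is missing."""
    try:
        return env['BLOG_API_KEY'], env['BLOG_API_SECRET']
    except KeyError as missing:
        raise RuntimeError(f'Set the {missing} environment variable first.') from None

# Demonstrate with a plain dictionary instead of the real environment.
fake_env = {'BLOG_API_KEY': 'demo-key', 'BLOG_API_SECRET': 'demo-secret'}
key, secret = load_credentials(fake_env)
print('Loaded a key of length', len(key))
```

Calling `load_credentials()` with no argument reads the real environment, so the secrets never appear in a file you might share.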

Check out what happens when you search the Twitter API without an access key: 

https://api.twitter.com/1.1/search/tweets.json?q=%40python

Here are instructions for getting a key from Twitter: 

https://developer.twitter.com/en/docs/basics/authentication/guides/access-tokens

  * When you create an application on Twitter you receive a: 
    * Consumer Key (API Key)
    * Consumer Secret (API Secret)
  * You use those keys to get an *access token*.
  * An access token works like a temporary password: it can expire or be revoked.
  * Every request you make must include an *access token*.
    * The keys alone won't work!

Here's a program that gets an access token from keys:

In [None]:
''' Use the Twitter API to get an access token. ''' 

import base64 
import requests 
import json

API_KEY = 'your-key-here'
API_SECRET = 'your-secret-here'

endpoint = 'https://api.twitter.com/oauth2/token'

auth_key = f'{API_KEY}:{API_SECRET}' 
auth_encoded = base64.b64encode(auth_key.encode('utf-8')).decode('utf-8')
postdata = { 'grant_type' : 'client_credentials' }
headers = {'Authorization' : f'Basic {auth_encoded}', 'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8'}

auth_response = requests.post(endpoint, data=postdata, headers=headers)
auth_data = json.loads(auth_response.text)

for key in auth_data : 
    print (f'{key}: {auth_data[key]}')

Executing the code retrieves and prints an access token, which you need in order to make requests.
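
The `Authorization` header in the program above is plain HTTP Basic authentication: the key and secret joined by a colon, then Base64-encoded. Base64 is an encoding, not encryption, so anyone who sees the header can recover the secret, which is why such requests should only go over HTTPS. The sketch below shows this with made-up credentials. The `bearer_header()` helper assumes the token is later sent as a `Bearer` token, the usual OAuth 2.0 convention; check the API's documentation to confirm.

```python
import base64

def basic_auth_header(key, secret):
    """Build the value of a Basic Authorization header."""
    encoded = base64.b64encode(f'{key}:{secret}'.encode('utf-8')).decode('ascii')
    return f'Basic {encoded}'

def bearer_header(token):
    """Assumed convention: send the access token as a Bearer token."""
    return {'Authorization': f'Bearer {token}'}

# Made-up credentials for illustration only.
header = basic_auth_header('demo-key', 'demo-secret')
print(header)

# Decoding the header recovers the original key and secret.
decoded = base64.b64decode(header.split(' ', 1)[1]).decode('utf-8')
print(decoded)
```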

## Do it the Python (Easy) Way 

  * Calling web APIs directly using `requests` is cumbersome.
  * Adding authentication makes it harder still.
  * Many popular websites have Python modules that automate most of the hard work.
  * You can use the Twitter API with ease once you install the module:

```bash
$ python3 -m pip install --upgrade --user python-twitter
```

With the `python-twitter` module installed you can easily get tweets:

In [None]:
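'''Print a user's recent tweets. Run as a script with the screen name as its only argument.'''

import sys
import twitter

# Exit with a usage message unless exactly one argument was given.
if len(sys.argv) != 2:
    sys.exit(f'usage: {sys.argv[0]} SCREEN_NAME')
user = sys.argv[1]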

api = twitter.Api(consumer_key='your-key-here',
                  consumer_secret='your-secret-here',
                  access_token_key='your-access-token',
                  access_token_secret='your-token-secret')

tweets = api.GetUserTimeline(screen_name=user)
for tweet in tweets : 
    print (tweet.text)