## A Gentle (Pythonic) Introduction to JSON 

Programming languages have lots of ways of representing data, called 
*data structures*. Very often you want to flatten, or *serialize*, these 
data structures so that they can be transmitted over the network or stored 
in a file for later. In the old days, you'd invent a file format of your own 
to do this and write the different bits of your data structure 
into it. But that's an expensive and error-prone way of doing things. 

Most modern languages have a way of automatically serializing data structures, 
and the one used by JavaScript took off in a big way, probably because it is 
very simple to understand and use, and is embedded in every Web browser. 
Conveniently in your case, it's also very similar to the way things are done 
natively in Python. This format is called JavaScript Object Notation or [JSON](https://en.wikipedia.org/wiki/JSON) for short.

You really only need to know about two types of structure to understand JSON. 
In Python parlance these are [lists](https://docs.python.org/2/tutorial/datastructures.html#more-on-lists) and [dictionaries](https://docs.python.org/2/tutorial/datastructures.html#dictionaries), and they are both 
built into the language itself. The same is true of other programming 
languages, like Ruby, Perl, PHP, Java and (of course) JavaScript.

### Lists

A list is a container into which you can put other things, knowing that their 
order will be preserved. For example, if I wrote:

In [2]:
numbers = [1, 2, 10]

I would create a list named `numbers` that contains the integers 1, 2 and 10 
in that order. Python will never rearrange a list: the order is guaranteed 
to stay the same unless you manipulate it yourself. 

You can have pretty much anything you like in a list. For example, this:

In [3]:
things = [1, "dog", 4.5]

creates a list named `things` that contains the integer `1`, the string `dog` 
and the real number `4.5`. 
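As a small sketch of what "order is preserved" means in practice, the example below reads items back by position and appends a new one. The extra item `"cat"` is just an illustrative value, not something from the examples above.

```python
things = [1, "dog", 4.5]

# Positions are counted from zero, so index 0 is the first item.
first = things[0]

# Negative indexes count backwards from the end of the list.
last = things[-1]

# append() adds to the end; the existing items keep their positions.
things.append("cat")

count = len(things)
print(first, last, count)
print(things)
```

Because nothing is ever reordered behind your back, `things[0]` is still `1` after the append.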

### Dictionaries

The next thing you need to know about is a dictionary. A dictionary makes an 
association between a *key* and a *value*. It's like a mini-database where you 
can put stuff in, give it a unique identifier, and then use that identifier 
later to retrieve it. For example, a dictionary called `ages` containing 
people's ages might look like:

In [4]:
ages = { "Anne" : 34, "Bob" : 29, "Alex" : 15 }

If you want, you can make this a bit more readable by using multiple lines:

In [5]:
ages = {
  "Anne" : 34,
  "Bob" : 29,
  "Alex" : 15 
}

This dictionary contains three entries, one with the key `Anne` and the value 
`34`, one with the key `Bob` and the value `29`, and a third with the key `Alex` 
and the value `15`.

So you can print out Anne's age:

In [6]:
print(ages['Anne'])

34
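Looking up a value is only one of the things you can do with a dictionary. The sketch below shows a few other common operations; the names `Carol` and `Dave` are made up for illustration.

```python
ages = {"Anne": 34, "Bob": 29, "Alex": 15}

# Assigning to a new key adds an entry.
ages["Carol"] = 41

# Assigning to an existing key replaces its value.
ages["Bob"] = 30

# "in" checks whether a key exists without raising an error.
has_dave = "Dave" in ages

# get() returns None (or a default you supply) when the key is missing.
dave_age = ages.get("Dave")

# items() lets you loop over keys and values together.
for name, age in ages.items():
    print(name, age)
```

Asking for `ages["Dave"]` directly would raise a `KeyError`, which is why `in` and `get()` are handy when you are not sure a key is present.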


### Composing Lists and Dictionaries

To make things a bit more interesting (and expressive) you can put lists and 
dictionaries inside one another. So you could have a list containing 
dictionaries, or a dictionary where each value is a list and so on. 

This turns out to be a fairly generic way of flattening complex data 
structures, and it's exactly what JSON is based on. JSON uses a 
notation that's very similar, though not quite identical, to the way that 
Python displays lists and dictionaries (JSON writes `true`, `false` and `null` 
where Python writes `True`, `False` and `None`, and JSON strings always use 
double quotes), so if you're familiar with one you can read the other. 

So let's imagine I want to represent a *person*. I can create a dictionary 
with specific keys and values. Something like...

In [7]:
person1 = { 
  "name": "Anne",
  "age": 34, 
  "shoesize": 6
}

and maybe a second person:

In [9]:
person2 = { 
  "name": "Bob",
  "age": 29, 
  "shoesize": 11
}

I could now put these two people into a list:

In [10]:
people = [person1, person2]
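To get at a value inside a nested structure like this, you chain the lookups: first the position in the list, then the key in the dictionary. A minimal sketch, reusing the two people defined above:

```python
person1 = {"name": "Anne", "age": 34, "shoesize": 6}
person2 = {"name": "Bob", "age": 29, "shoesize": 11}
people = [person1, person2]

# people[0] is the first dictionary; ["name"] then looks up a key in it.
first_name = people[0]["name"]

# Loop over the list to collect one field from every person.
names = [person["name"] for person in people]

# The same pattern works for numbers, for example adding up the ages.
total_age = sum(person["age"] for person in people)

print(first_name, names, total_age)
```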

### JSON

What might the list `people` look like if I wrote it out longhand? It 
would contain two dictionaries, `person1` and `person2`, and each of those 
would have three keys called `name`, `age` and `shoesize` associated with 
their respective values. In Python notation, it would look like this:

```python
[
  { 
    "name": "Anne",
    "age": 34,
    "shoesize": 6
  },
  {
    "name": "Bob",
    "age": 29,
    "shoesize": 11
  }
]
```

And that conveniently happens to be JSON notation too! That’s all JSON really 
is: a convenient way of describing data structures as combinations of 
(what in Python we would call) dictionaries and lists so they can be saved 
into files or transmitted over communications links (e.g. over the web or 
between applications).
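One small caveat: the two notations coincide for numbers, double-quoted strings, lists and dictionaries, but they differ for a few special values. The sketch below uses Python's standard `json` module to show the difference; the keys `member` and `nickname` are invented for the example.

```python
import json

# A Python dictionary using True and None.
record = {"name": "Anne", "member": True, "nickname": None}

# In JSON these become the lowercase words true and null.
text = json.dumps(record)
print(text)

# Parsing the JSON text gives back the original Python values.
restored = json.loads(text)
print(restored == record)
```

So a Python literal containing `True`, `False`, `None` or single-quoted strings is not valid JSON as written, even though the overall shape is the same.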


### Reading and Writing JSON

Python also conveniently comes with a [json](https://docs.python.org/3/library/json.html) module that makes it easy to read and write JSON. First you need to import it:

In [7]:
import json

And then you can *serialize* your Python data structures easily as JSON, using the `dumps` function:

In [10]:
person = {"name": "Anne", "age": 34, "shoesize": 6}
json_text = json.dumps(person)
print(json_text)

{"age": 34, "name": "Anne", "shoesize": 6}


If you want, you can serialize it in a more readable format using line breaks and indentation:

In [12]:
json_text = json.dumps(person, indent=2)
print(json_text)

{
  "age": 34,
  "name": "Anne",
  "shoesize": 6
}
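A note on key order: on Python 3.7 and later a dictionary keeps the order in which its keys were inserted, so you may see `name` first rather than the alphabetical order shown in the outputs here. If you want alphabetical keys regardless of Python version, `json.dumps` accepts a `sort_keys` argument, as in this sketch:

```python
import json

person = {"name": "Anne", "age": 34, "shoesize": 6}

# Default: keys come out in the dictionary's own order.
default_text = json.dumps(person)

# sort_keys=True: keys come out in alphabetical order.
sorted_text = json.dumps(person, sort_keys=True)

print(default_text)
print(sorted_text)
```

Sorted keys are useful when you want to compare two JSON files, since the same data then always produces the same text.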


And once you have the text you can *parse* it back into Python again with the `loads` function:

In [13]:
person = json.loads(json_text)
print(person)

{'age': 34, 'name': 'Anne', 'shoesize': 6}


Since JSON is serialized to and parsed from plain strings, you can also save it to a file and read it back later, or send it over the network, which is what a great many [Web APIs](https://en.wikipedia.org/wiki/Web_API) do.
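Saving to a file can be sketched with the module's `dump` and `load` functions, which work on file objects rather than strings. The file name `person.json` and the use of a temporary directory are assumptions made so the example can run anywhere.

```python
import json
import os
import tempfile

person = {"name": "Anne", "age": 34, "shoesize": 6}

# Hypothetical location: a file called person.json in a fresh temp directory.
path = os.path.join(tempfile.mkdtemp(), "person.json")

# json.dump writes the JSON text straight into an open file.
with open(path, "w", encoding="utf-8") as f:
    json.dump(person, f, indent=2)

# json.load reads JSON text from an open file and parses it.
with open(path, encoding="utf-8") as f:
    restored = json.load(f)

print(restored == person)
```

Note the naming: `dumps` and `loads` (with an "s") work with strings, while `dump` and `load` work with files.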

### Tweets

Twitter makes each tweet available as JSON through its [API](https://dev.twitter.com/overview/documentation). Its field guide provides a very good overview of the data you can find for each tweet. The text of a tweet is actually only about 2% of all the data in it.

Here is a particular (famous) tweet:

In [3]:
%%HTML

<blockquote class="twitter-tweet" lang="en"><p lang="en" dir="ltr">Wow...A man picks up burning tear gas can and throws it back at police. <a href="https://twitter.com/hashtag/ferguson?src=hash">#ferguson</a> pic by <a href="https://twitter.com/kodacohen">@kodacohen</a> <a href="https://twitter.com/stltoday">@stltoday</a> <a href="http://t.co/SASXU1yF3E">pic.twitter.com/SASXU1yF3E</a></p>&mdash; Lynden Steele (@manofsteele) <a href="https://twitter.com/manofsteele/status/499432557906104320">August 13, 2014</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>

And here is its JSON representation:


```json
{
  "contributors": null,
  "truncated": false,
  "text": "Wow...A man picks up burning tear gas can and throws it back at police. #ferguson pic by @kodacohen @stltoday http://t.co/SASXU1yF3E",
  "is_quote_status": false,
  "in_reply_to_status_id": null,
  "id": 499432557906104320,
  "favorite_count": 4952,
  "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
  "retweeted": false,
  "coordinates": null,
  "entities": {
    "symbols": [],
    "user_mentions": [
      {
        "id": 17073225,
        "indices": [
          89,
          99
        ],
        "id_str": "17073225",
        "screen_name": "kodacohen",
        "name": "Robert Cohen"
      },
      {
        "id": 6039302,
        "indices": [
          100,
          109
        ],
        "id_str": "6039302",
        "screen_name": "stltoday",
        "name": "STLtoday"
      }
    ],
    "hashtags": [
      {
        "indices": [
          72,
          81
        ],
        "text": "ferguson"
      }
    ],
    "urls": [],
    "media": [
      {
        "expanded_url": "http://twitter.com/manofsteele/status/499432557906104320/photo/1",
        "display_url": "pic.twitter.com/SASXU1yF3E",
        "url": "http://t.co/SASXU1yF3E",
        "media_url_https": "https://pbs.twimg.com/media/Bu5XQ6KCIAEPh81.jpg",
        "id_str": "499432556685565953",
        "sizes": {
          "small": {
            "h": 187,
            "resize": "fit",
            "w": 340
          },
          "large": {
            "h": 550,
            "resize": "fit",
            "w": 999
          },
          "medium": {
            "h": 330,
            "resize": "fit",
            "w": 600
          },
          "thumb": {
            "h": 150,
            "resize": "crop",
            "w": 150
          }
        },
        "indices": [
          110,
          132
        ],
        "type": "photo",
        "id": 499432556685565953,
        "media_url": "http://pbs.twimg.com/media/Bu5XQ6KCIAEPh81.jpg"
      }
    ]
  },
  "in_reply_to_screen_name": null,
  "id_str": "499432557906104320",
  "retweet_count": 8223,
  "in_reply_to_user_id": null,
  "favorited": false,
  "user": {
    "follow_request_sent": false,
    "has_extended_profile": false,
    "profile_use_background_image": true,
    "default_profile_image": false,
    "id": 16661744,
    "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme9/bg.gif",
    "verified": true,
    "profile_text_color": "666666",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/419932700590358528/f05wsGeS_normal.jpeg",
    "profile_sidebar_fill_color": "252429",
    "entities": {
      "url": {
        "urls": [
          {
            "url": "http://t.co/EXlC5LuwAF",
            "indices": [
              0,
              22
            ],
            "expanded_url": "http://www.stltoday.com",
            "display_url": "stltoday.com"
          }
        ]
      },
      "description": {
        "urls": []
      }
    },
    "followers_count": 2159,
    "profile_sidebar_border_color": "181A1E",
    "id_str": "16661744",
    "profile_background_color": "1A1B1F",
    "listed_count": 167,
    "is_translation_enabled": false,
    "utc_offset": -18000,
    "statuses_count": 2858,
    "description": "St. Louis Post-Dispatch   Director of Photography",
    "friends_count": 1647,
    "location": "St. Louis",
    "profile_link_color": "2FC2EF",
    "profile_image_url": "http://pbs.twimg.com/profile_images/419932700590358528/f05wsGeS_normal.jpeg",
    "following": false,
    "geo_enabled": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/16661744/1391829947",
    "profile_background_image_url": "http://abs.twimg.com/images/themes/theme9/bg.gif",
    "screen_name": "manofsteele",
    "lang": "en",
    "profile_background_tile": false,
    "favourites_count": 110,
    "name": "Lynden Steele",
    "notifications": false,
    "url": "http://t.co/EXlC5LuwAF",
    "created_at": "Thu Oct 09 03:53:26 +0000 2008",
    "contributors_enabled": false,
    "time_zone": "Central Time (US & Canada)",
    "protected": false,
    "default_profile": false,
    "is_translator": false
  },
  "geo": null,
  "in_reply_to_user_id_str": null,
  "possibly_sensitive": false,
  "possibly_sensitive_appealable": false,
  "lang": "en",
  "created_at": "Wed Aug 13 05:49:35 +0000 2014",
  "in_reply_to_status_id_str": null,
  "place": null,
  "extended_entities": {
    "media": [
      {
        "expanded_url": "http://twitter.com/manofsteele/status/499432557906104320/photo/1",
        "display_url": "pic.twitter.com/SASXU1yF3E",
        "url": "http://t.co/SASXU1yF3E",
        "media_url_https": "https://pbs.twimg.com/media/Bu5XQ6KCIAEPh81.jpg",
        "id_str": "499432556685565953",
        "sizes": {
          "small": {
            "h": 187,
            "resize": "fit",
            "w": 340
          },
          "large": {
            "h": 550,
            "resize": "fit",
            "w": 999
          },
          "medium": {
            "h": 330,
            "resize": "fit",
            "w": 600
          },
          "thumb": {
            "h": 150,
            "resize": "crop",
            "w": 150
          }
        },
        "indices": [
          110,
          132
        ],
        "type": "photo",
        "id": 499432556685565953,
        "media_url": "http://pbs.twimg.com/media/Bu5XQ6KCIAEPh81.jpg"
      }
    ]
  }
}
```

Quite a bit to chew on there, right?
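It becomes more manageable once you notice it is just dictionaries and lists nested inside one another. The sketch below parses a trimmed-down excerpt of the tweet (only a handful of the fields shown above, copied as they appear) and pulls out a few values by chaining lookups.

```python
import json

# A trimmed excerpt of the tweet's JSON; most fields are left out.
tweet_json = """
{
  "text": "Wow...A man picks up burning tear gas can and throws it back at police. #ferguson pic by @kodacohen @stltoday http://t.co/SASXU1yF3E",
  "retweet_count": 8223,
  "entities": {
    "hashtags": [{"indices": [72, 81], "text": "ferguson"}],
    "user_mentions": [
      {"screen_name": "kodacohen", "indices": [89, 99]},
      {"screen_name": "stltoday", "indices": [100, 109]}
    ]
  },
  "user": {"screen_name": "manofsteele", "name": "Lynden Steele"}
}
"""

tweet = json.loads(tweet_json)

# A dictionary inside a dictionary: the author's screen name.
author = tweet["user"]["screen_name"]

# A list of dictionaries inside a dictionary: hashtags and mentions.
hashtags = [tag["text"] for tag in tweet["entities"]["hashtags"]]
mentions = [m["screen_name"] for m in tweet["entities"]["user_mentions"]]

# "indices" holds start and end character positions within the text.
start, end = tweet["entities"]["hashtags"][0]["indices"]
hashtag_in_text = tweet["text"][start:end]

print(author, hashtags, mentions, hashtag_in_text)
```

The same approach works on the full JSON: find the key you want, then follow the path of keys and list positions that leads to it.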