![NASA](http://www.nasa.gov/sites/all/themes/custom/nasatwo/images/nasa-logo.svg)

<center>
<h1><font size="+3">GSFC Python Bootcamp</font></h1>
</center>

---
<center>
<H1 style="color:red">
Introduction to JSON
</H1>
</center>


In [None]:
from __future__ import print_function

## Reference Documents

* <a href="https://realpython.com/python-json/">Working With JSON Data in Python </a>

## <font color="red"> What is JSON?</font>

* JSON (JavaScript Object Notation) is a popular text format for representing structured data.
* It is language independent: Python, Perl, and many other languages can read and write it.
* It is widely used to exchange data between servers and web applications.
* It is built on two structures:

     - A collection of name/value pairs. This is realized as an object, record, dictionary, hash table, keyed list, or associative array.
     - An ordered list of values. This is realized as an array, vector, list, or sequence.
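
As a minimal sketch of these two structures in Python (the sample values are made up for illustration), a JSON object decodes to a `dict` and a JSON array decodes to a `list`:

```python
import json

# A JSON object (name/value pairs) and a JSON array (ordered values).
# The sample values are illustrative only.
obj = json.loads('{"acronym": "BLD", "latitude": 40.0}')
arr = json.loads('["coffee", "tea", "water"]')

print(type(obj).__name__, obj)
print(type(arr).__name__, arr)
```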

## <font color="red">JSON Data</font>

In [None]:
import json

The main functions of the `json` module are:

* `dump()`: encode a Python object as JSON and write it to a file.
* `load()`: read JSON from a file and decode it into a Python object.
* `dumps()`: encode a Python object as a JSON string.
* `loads()`: decode a JSON string into a Python object.
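
The four functions can be sketched side by side as follows. The `record` data is illustrative, and `io.StringIO` is used here only as a stand-in for a file opened in text mode, so the sketch does not touch the disk:

```python
import io
import json

record = {"acronym": "BLD", "latitude": 40.0}  # illustrative data

# dumps()/loads(): work with strings.
text = json.dumps(record)
from_text = json.loads(text)

# dump()/load(): work with file-like objects. io.StringIO stands in
# for a file opened in text mode, so nothing is written to disk.
buffer = io.StringIO()
json.dump(record, buffer)
buffer.seek(0)  # rewind before reading back
from_file = json.load(buffer)

print(text)
print(from_text == record, from_file == record)
```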

**Example of JSON Data**

```python
{
    "stations": [
        {
            "acronym": "BLD",
            "name": "Boulder Colorado",
            "latitude": 40.00,
            "longitude": -105.25
        },
        {
            "acronym": "BHD",
            "name": "Baring Head Wellington New Zealand",
            "latitude": -41.28,
            "longitude": 174.87
        }
    ]
}
```

**Another Example of JSON Data**

We consider an online service, <a href="http://ip-api.com">IP-API.com</a>, that returns GeoIP data in JSON format. Opening <a href="http://ip-api.com/json/54.148.84.95">http://ip-api.com/json/54.148.84.95</a> returns a JSON result like the following:


```python
{
  "as": "AS16509 Amazon.com, Inc.",
  "city": "Boardman",
  "country": "United States",
  "countryCode": "US",
  "isp": "Amazon",
  "lat": 45.8696,
  "lon": -119.688,
  "org": "Amazon",
  "query": "54.148.84.95",
  "region": "OR",
  "regionName": "Oregon",
  "status": "success",
  "timezone": "America\/Los_Angeles",
  "zip": "97818"
}
```

To see your own geolocation data in JSON format, open <a href="http://ip-api.com/json/">http://ip-api.com/json/</a>.

+ A JSON object is written as key-value pairs enclosed in `{` `}`; each key is a double-quoted string.
+ JSON data representation is very similar to Python dictionaries.
+ It supports primitive types, like strings and numbers, as well as nested lists and objects.
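
The similarity to Python dictionaries is close but not exact. A small sketch of one difference, using made-up data: JSON object keys are always strings, so a non-string dictionary key is converted when encoding and does not come back as its original type.

```python
import json

# Python dictionaries accept non-string keys; JSON objects do not.
py_dict = {1: "one", "flag": True, "missing": None}

encoded = json.dumps(py_dict)
decoded = json.loads(encoded)

print(encoded)  # the integer key is written as a string
print(decoded)  # the key comes back as a string, not an integer
```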

## <font color="red">Convert Python Data Types into JSON </font>

The table below lists Python types and their JSON equivalents when encoding.

| Python | JSON Equivalent |
| --- | ---  |
| `dict` | object |
| `list`, `tuple` | array |
| `str` | string |
| `int`, `float` | number |
| `True` | true |
| `False` | false |
| `None` | null |


In [None]:
print("Dictionary: ", json.dumps({"name": "John", "age": 30}))
print("List:       ", json.dumps(["apple", "bananas"]))
print("Tuple:      ", json.dumps(("apple", "bananas")))
print("String:     ", json.dumps("hello"))
print("Integer:    ", json.dumps(42))
print("Float:      ", json.dumps(31.76))
print("True:       ", json.dumps(True))
print("False:      ", json.dumps(False))
print("None:       ", json.dumps(None))
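
Note that the mapping in the table is not fully reversible. As a short sketch with illustrative values: both `list` and `tuple` are written as a JSON array, and an array is always decoded as a `list`, so a tuple does not survive a round trip.

```python
import json

original = {"children": ("Ann", "Billy"), "age": 30, "mpg": 27.5}

# Encode to JSON and decode again.
restored = json.loads(json.dumps(original))

print(restored)
print(type(original["children"]).__name__, "->",
      type(restored["children"]).__name__)
```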

**Convert JSON to Python Object (Dict)**

In [None]:
json_data = '{"acronym": "BLD", "name": "Boulder Colorado", \
              "latitude": 40.00, "longitude": -105.25}'
python_obj = json.loads(json_data)
print("Name:     ", python_obj["name"])
print("Acronym:  ", python_obj["acronym"])
print("Latitude: ", python_obj["latitude"])
print("Longitude:", python_obj["longitude"])

print()


In [None]:
print(json.dumps(python_obj, sort_keys=True, indent=4))

**Convert JSON to Python Object (List)**

In [None]:
array = '{"drinks": ["coffee", "tea", "water"]}'
data = json.loads(array)

print(json.dumps(data, sort_keys=True, indent=4))
print()
for element in data["drinks"]:
    print(element)

**Convert JSON to Python Object**

In [None]:
json_input = '{"stations": [{"acronym": "BLD", \
                                "name": "Boulder Colorado", \
                            "latitude": 40.00, \
                            "longitude": -105.25}, \
                            {"acronym": "BHD", \
                             "name": "Baring Head Wellington New Zealand",\
                             "latitude": -41.28, \
                             "longitude": 174.87}]}'

In [None]:
decoded = json.loads(json_input)
for x in decoded['stations']:
    print(x["name"])

In [None]:
print(json.dumps(decoded, sort_keys=True, indent=4))

**Convert Python Object (Dict) to JSON**

In [None]:
d = {}
d["name"] = "Boulder Colorado"
d["acronym"] = "BLD"
d["latitude"] = 40.00
d["longitude"] = -105.25
print(json.dumps(d, ensure_ascii=False))

**Convert Python Objects into JSON**

In [None]:
x = {
  "name": "John",
  "age": 30,
  "married": True,
  "divorced": False,
  "children": ("Ann","Billy"),
  "pets": None,
  "cars": [
    {"model": "BMW 230", "mpg": 27.5},
    {"model": "Ford Edge", "mpg": 24.1}
  ]
}

print("Python Object: \n\t", x)

In [None]:
for key in x:
    print(key+":\t", x[key])

In [None]:
x_js = json.dumps(x)
print("Corresponding JSON Object: \n\t", x_js)

**Going Back to Python**

In [None]:
y = json.loads(x_js)
for key in y:
    print(key+":\t", y[key])

**Convert a JSON String to a Python Object**

In [None]:
json_string = """
{
    "researcher": {
        "name": "Ford Prefect",
        "species": "Betelgeusian",
        "relatives": [
            {
                "name": "Zaphod Beeblebrox",
                "species": "Betelgeusian"
            }
        ]
    }
}
"""
data = json.loads(json_string)
print(json.dumps(data, sort_keys=True, indent=4))

## <font color="red"> Serialization and Deserialization</font>

**Serialization**

We use the `dump()` function, which takes two arguments:
* The data object to be serialized.
* The file object to which it will be written (opened in text mode).

In [None]:
file_name = "Sample.json"
with open(file_name, "w") as fid: 
     json.dump(x, fid)

**Deserializing JSON**

* Deserialization is the opposite of serialization: it converts JSON data back into the corresponding Python objects.
* We use the `load()` function to read from a file object (`loads()` reads from a string). The result is usually a `dict` or a `list`, depending on the root element of the JSON document.

In [None]:
with open(file_name, "r") as fid: 
     z = json.load(fid)
        
print(z)
for key in z:
    print(key+":\t", z[key])

**Example**

Consider the following JSON data (from NASA's Astronomy Picture of the Day API) that we write in a file named `apod.json`.

In [None]:
%%writefile apod.json
{
    "media_type": "image",
    "copyright": "Yin Hao",
    "date": "2018-10-30",
    "url": "https://apod.nasa.gov/apod/image/1810/Orionids_Hao_960.jpg",
    "explanation": "Meteors have been shooting out from the constellation of Orion. This was expected, as October is the time of year for the Orionids Meteor Shower.  Pictured here, over two dozen meteors were caught in successively added exposures last October over Wulan Hada volcano in Inner Mongolia, China. The featured image shows multiple meteor streaks that can all be connected to a single small region on the sky called the radiant, here visible just above and to the left of the belt of Orion, The Orionids meteors started as sand sized bits expelled from Comet Halley during one of its trips to the inner Solar System. Comet Halley is actually responsible for two known meteor showers, the other known as the Eta Aquarids and visible every May. An Orionids image featured on APOD one year ago today from the same location shows the same car. Next month, the Leonids Meteor Shower from Comet Tempel-Tuttle should also result in some bright meteor streaks. Follow APOD on: Facebook, Instagram, Reddit, or Twitter",
    "hdurl": "https://apod.nasa.gov/apod/image/1810/Orionids_Hao_2324.jpg",
    "title": "Orionids Meteors over Inner Mongolia",
    "service_version": "v1"
}

In [None]:
# Read the file
with open("apod.json", "r") as f:
    json_text = f.read()

In [None]:
# Decode the JSON string into a Python dictionary.
apod_dict = json.loads(json_text)
print(apod_dict['explanation'])

In [None]:
# Encode the Python dictionary into a JSON string.
new_json_string = json.dumps(apod_dict, indent=6)
print(new_json_string)

In [None]:
new_json_string = json.dumps(apod_dict, indent=6, sort_keys=True)
print(new_json_string)

## <font color='red'>Web Scraping with JSON</font>

**First Example**

In [None]:
import urllib.request
url = "https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY"
with urllib.request.urlopen(url) as page:
     page_content = page.read()

In [None]:
# To turn it into a string, for example, we use the decode method:
page_content.decode('utf-8')

In [None]:
# Process the data with JSON
json_page = json.loads(page_content)

In [None]:
# print the keys
for x in json_page:
    print(x)

In [None]:
# print the keys and their values
for x in json_page:
    print(x+" -->", json_page[x])
    print()

In [None]:
from pprint import pprint
pprint(json_page)

In [None]:
%matplotlib inline
from skimage import io

io.imshow(io.imread(json_page["url"]))
io.show()

In [None]:
io.imshow(io.imread(json_page["hdurl"]))
io.show()

**Second Example**

We will fetch Citi Bike NYC (bike sharing system) data from <a href="https://feeds.citibikenyc.com/stations/stations.json">https://feeds.citibikenyc.com/stations/stations.json</a> and convert it into a Python dictionary.

In [None]:
import requests

# Get the JSON data from Citi Bike NYC using the requests library
json_response = requests.get("https://feeds.citibikenyc.com/stations/stations.json")

# Check the type of the response text
print(type(json_response.text))


In [None]:
# Parse the response text with json.loads()
bike_dict = json.loads(json_response.text)

In [None]:
# Check type of bike_dict
print(type(bike_dict))

In [None]:
# List all the keys
for key in bike_dict:
    print(key)

In [None]:
# Print the first entry of the stationBeanList list
print(bike_dict['stationBeanList'][0]) 

In [None]:
print(json.dumps(bike_dict['stationBeanList'][0], indent=3)) 
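
Once the feed is a dictionary, ordinary Python tools apply. The sketch below filters a hand-written stand-in for the feed so it runs without network access. Apart from `stationBeanList`, the field names (`stationName`, `availableBikes`) and all values are assumptions made for illustration; check them against the keys printed above before using them on real data.

```python
import json

# Hand-written stand-in for the feed. The field names other than
# "stationBeanList" are assumptions made for illustration.
sample_text = """
{
    "stationBeanList": [
        {"stationName": "Station A", "availableBikes": 5},
        {"stationName": "Station B", "availableBikes": 0},
        {"stationName": "Station C", "availableBikes": 12}
    ]
}
"""

sample_dict = json.loads(sample_text)

# Keep only the stations that report at least one available bike.
with_bikes = [
    station["stationName"]
    for station in sample_dict["stationBeanList"]
    if station["availableBikes"] > 0
]
print(with_bikes)
```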

## <font color="red">Conclusion</font>

* JSON is standardized and language-independent (while `pickle` is specific to Python).
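* It is safer than `pickle` (unpickling untrusted data can execute arbitrary code) and can be faster for simple data, although the difference depends on the data and the Python version.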
* If you only need to use Python, then the `pickle` module is still a good choice for its ease of use and ability to reconstruct complete Python objects.
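
A small sketch of that trade-off, using illustrative data: `pickle` restores a `set` and a `tuple` exactly, while `json` has no set type and refuses to encode one.

```python
import json
import pickle

# Illustrative data containing types that JSON cannot represent exactly.
data = {"tags": {"ozone", "co2"}, "point": (40.0, -105.25)}

# pickle reconstructs the set and the tuple exactly.
restored = pickle.loads(pickle.dumps(data))
print(restored == data, type(restored["point"]).__name__)

# json has no set type, so encoding raises TypeError.
try:
    json.dumps(data)
except TypeError as err:
    print("json.dumps failed:", err)
    json_failed = True
else:
    json_failed = False
```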