## JSON

# JSON - JavaScript Object Notation
#### Specified and popularized by Douglas Crockford in the early 2000s.

* Goal - Human Readable, Machine Parsable

* Specification: https://www.json.org/

JSON (short for JavaScript Object Notation) is a text format for sharing data.

JSON is derived from the JavaScript programming language, but it is language-independent.

Parsers are available for many languages, including Python.

Files that store JSON usually have the .json extension.



In [None]:
# Sample JSON below, from https://json.org/example.html
# Question: why does Python syntax highlighting work properly on it? :)

In [None]:
{"widget": {
    "debug": "on",
    "window": {
        "title": "Sample Konfabulator Widget",
        "name": "main_window",
        "width": 500,
        "height": 500
    },
    "image": { 
        "src": "Images/Sun.png",
        "name": "sun1",
        "hOffset": 250,
        "vOffset": 250,
        "alignment": "center"
    },
    "text": {
        "data": "Click Here",
        "size": 36,
        "style": "bold",
        "name": "text1",
        "hOffset": 250,
        "vOffset": 100,
        "alignment": "center",
        "onMouseUp": "sun1.opacity = (sun1.opacity / 100) * 90;"
    }
}}    


In [None]:
# If this were a string starting with { (instead of a Python dict literal), it would be our JSON
mydata = {
    "firstName": "Jane",
    "lastName": "Doe",
    "hobbies": ["running", "sky diving", "dancing"],
    "age": 43,
    "children": [
        {
            "firstName": "Alice",
            "age": 7
        },
        {
            "firstName": "Bob",
            "age": 13
        }
    ]
}

In [None]:
type(mydata)

In [None]:
print(mydata)

In [None]:
mylist = list(range(10))
print(mylist)

The process of encoding JSON is usually called serialization. This term refers to the transformation of data into a series of bytes (hence serial) to be stored or transmitted across a network. You may also hear the term marshaling, but that’s a whole other discussion. Naturally, deserialization is the reciprocal process of decoding data that has been stored or delivered in the JSON standard.

All we’re talking about here is reading and writing. Think of it like this: encoding is for writing data to disk, while decoding is for reading data into memory.
Source: https://realpython.com/python-json/
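
To make the "series of bytes" point concrete, here is a minimal sketch. The `record` dictionary is made up for illustration. `json.dumps` produces a Python `str`, and encoding that string as UTF-8 gives the bytes that would actually be stored or sent over a network.

```python
import json

# Hypothetical sample data, used only for this illustration
record = {"name": "Jane", "age": 43}

# Serialize: Python object -> JSON text (str) -> bytes
text = json.dumps(record)
payload = text.encode("utf-8")

# Deserialize: bytes -> JSON text -> Python object
restored = json.loads(payload.decode("utf-8"))

print(type(text), type(payload))
print(restored == record)
```

When you use `json.dump` with a file opened in text mode, Python handles the encoding step for you.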

In [None]:
import json

In [None]:
with open("data_file.json", mode="w") as write_file:
    json.dump(mydata, write_file)

In [None]:
with open("numbers.json", mode="w") as write_file:
    json.dump(mylist, write_file)

In [None]:
# use json string in our program
json_string = json.dumps(mydata)
print(json_string)

In [None]:
print(mydata)

In [None]:
# Convert json_string back to a Python object
my_obj = json.loads(json_string)
my_obj

In [None]:
newlist = json.loads('[1,3,5,"Valdis"]')
newlist

In [None]:
badlist = json.loads('[1,3,5,"Vald"]')
badlist

In [None]:
type(json_string)

In [None]:
# In the example above, JSON and the Python object literal look the same, but there are some differences
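
A minimal sketch of a few such differences: each string below (made up for illustration) is close to a valid Python literal but is rejected by `json.loads` with a `json.JSONDecodeError`.

```python
import json

# Python-looking text that is NOT valid JSON (samples made up for illustration)
not_json = [
    "{'name': 'Jane'}",   # JSON strings need double quotes
    '{"done": True}',     # JSON booleans are lowercase: true / false
    '[1, 2, 3,]',         # JSON does not allow a trailing comma
]

errors = []
for text in not_json:
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # err.msg holds the parser's short explanation
        errors.append(err.msg)
        print(f"{text!r} rejected: {err.msg}")
```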

![object](../img/object.png)

![Array](../img/array.png)

![Value](../img/value.png)

Simple Python objects are translated to JSON according to a fairly intuitive conversion.

| Python | JSON |
|---|---|
| dict | object |
| list, tuple | array |
| str | string |
| int, float | number |
| True | true |
| False | false |
| None | null |
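
The conversion is not fully symmetric, which a short round trip can show. In this sketch the `sample` dictionary is made up for illustration: a tuple comes back as a list, and a non-string dictionary key comes back as a string, because JSON object keys are always strings.

```python
import json

# Hypothetical sample mixing several Python types from the table
sample = {"point": (1, 2), "active": True, "missing": None, 7: "seven"}

encoded = json.dumps(sample)
decoded = json.loads(encoded)

print(encoded)   # JSON text: true / null, array instead of tuple
print(decoded)   # Python again: the tuple is now a list, the key 7 is now "7"
```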

In [None]:
newlist = json.loads('[true,2,null, false, 555.333]')
newlist

In [None]:
# The first option most people want to change is whitespace. The indent keyword argument sets the indentation size for nested structures. Compare the default output for mydata (below) with the indented version in the next cell:

json.dumps(mydata)


In [None]:
# very useful for visibility!
print(json.dumps(mydata, indent=4))

In [None]:
with open("data_file.json", "w") as write_file:
    json.dump(mydata, write_file, indent=4)

In [None]:
with open("data_file.json", "r") as read_file:
    data = json.load(read_file)
data

In [None]:
type(data)

In [None]:
len(data)

In [None]:
# data is a dict here, so index it by key rather than by position
type(data["hobbies"]), type(data["children"][0])

Keep in mind that this method can return any of the allowed data types from the conversion table. This is only important if you’re loading in data you haven’t seen before. In most cases, the root object will be a dict or a list.
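
A minimal sketch of checking the root type before using freshly loaded data. `describe_root` is a hypothetical helper written for this example, not part of the `json` module.

```python
import json

def describe_root(json_text):
    """Return a short description of the top-level JSON value."""
    value = json.loads(json_text)
    if isinstance(value, dict):
        return f"object with keys {sorted(value)}"
    if isinstance(value, list):
        return f"array with {len(value)} items"
    return f"single {type(value).__name__} value"

print(describe_root('{"b": 1, "a": 2}'))
print(describe_root('[1, 2, 3]'))
print(describe_root('42'))
```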

If you've gotten JSON data in from another program or have otherwise obtained a string of JSON formatted data in Python, you can easily deserialize that with loads(), which naturally loads from a string:

In [None]:
json_string = """
{
    "researcher": {
        "name": "Ford Prefect",
        "species": "Betelgeusian",
        "relatives": [
            {
                "name": "Zaphod Beeblebrox",
                "species": "Betelgeusian"
            }
        ]
    }
}
"""
data = json.loads(json_string)
data

In [None]:
# get the top-level researcher object
data['researcher']

In [None]:
# get the list of relatives
data['researcher']['relatives']

In [None]:
# get the first relative
data['researcher']['relatives'][0]

In [None]:
# get value of relative's name
data['researcher']['relatives'][0]['name']

In [None]:
data['researcher']['relatives'][0]['name'].split()[0]

In [None]:
data['researcher']['relatives'][0]['name'].split()[0][:4]

In [None]:
type(data)

In [None]:
import json
import requests

In [None]:
## Let's get some data from https://jsonplaceholder.typicode.com/

In [None]:
response = requests.get("https://jsonplaceholder.typicode.com/todos")
print(response.status_code)
# Stop here on a 4xx/5xx status instead of trying to parse an error page
response.raise_for_status()
# Equivalent shortcut: todos = response.json()
todos = json.loads(response.text)


You can open https://jsonplaceholder.typicode.com/todos in a regular browser too.

In [None]:
type(todos)

In [None]:
len(todos)

In [None]:
todos[:10]

In [None]:
myl = [('Valdis', 40), ('Alice',35), ('Bob', 23),('Carol',70)]

In [None]:
# Lambda = anonymous function

In [None]:
def myfun(el):
    return el[1]
# same as myfun = lambda el: el[1]

In [None]:
sorted(myl, key = lambda el: el[1], reverse=True)

In [None]:
# Exercise: find the top 3 users with the most completed tasks!

# TIPS
# We need some sort of structure to store the per-user results before finding the top 3
# There are at least two good data structure choices here :)
# The simplest might actually be the best if we consider the userId values


In [None]:
todos[0]

In [None]:
todos[0]['userId']

In [None]:
todos[0]['completed']

In [None]:
# Here we create a new dictionary and count the completed tasks per userId
newdict = {}
for todo in todos:
    if todo['completed']:
        if todo['userId'] in newdict:
            newdict[todo['userId']] += 1
        else:
            newdict[todo['userId']] = 1

In [None]:
newdict

In [None]:
sorted(newdict.items())

In [None]:
bestworkers = sorted(newdict.items(), key=lambda el: el[1], reverse=True)
bestworkers[:3]

In [None]:
users = [ el['userId'] for el in todos]
len(users),users[:15]

In [None]:
uniqusers = set(users)
uniqusers

In [None]:
# dictionary comprehension (we could live without one): start every userId at 0
users = { el['userId'] : 0 for el in todos} 

In [None]:
users

In [None]:
users.keys()

In [None]:
users.values()

In [None]:
#{'completed': True,
# 'id': 8,
#  'title': 'quo adipisci enim quam ut ab',
#  'userId': 1}

In [None]:
# idiomatic: Boolean False is 0 and True is 1, so adding works, though it may not be very readable
for el in todos:
    users[el['userId']] += el['completed']

In [None]:
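# Same result as the loop above; could be useful in more complicated cases.
# Run only one of the two loops (or rebuild users first), otherwise every task is counted twice.
for el in todos:
    if el['completed']:
        users[el['userId']] += 1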

In [None]:
# there could also be a one-liner, or a solution using collections.Counter
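
One possible `Counter` version, as a sketch. `sample_todos` is a small made-up stand-in for the `todos` list so the example runs without a network call. With the real data you would iterate over `todos` instead.

```python
from collections import Counter

# Small stand-in for the `todos` list (made-up values)
sample_todos = [
    {"userId": 1, "completed": True},
    {"userId": 1, "completed": False},
    {"userId": 2, "completed": True},
    {"userId": 2, "completed": True},
    {"userId": 3, "completed": True},
]

# Count one entry per completed task, keyed by userId
completed = Counter(t["userId"] for t in sample_todos if t["completed"])

# most_common(3) returns up to three (userId, count) pairs, highest count first
top = completed.most_common(3)
print(top)
```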

In [None]:
users.items()

In [None]:
list(users.items())

In [None]:
userlist=list(users.items())

In [None]:
type(userlist[0])

In [None]:
# we pass an anonymous (lambda) function as the key
sorted(userlist, key=lambda el: el[1], reverse=True)[:3]

In [None]:
# let's try a simple way

In [None]:
# userId values are small positive integers, so they can index a list directly (index 0 stays unused)
mylist = [0] * 11

In [None]:
for el in todos:
    if el['completed']:
        mylist[el['userId']] +=1

In [None]:
mylist

In [None]:
mylist.index(max(mylist))

In [None]:
# getting more than the single top value this way is harder; we need to get a bit tricky
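
One way to get the top three from a plain list is to pair each index with its value before sorting. In this sketch `counts` is a made-up stand-in for `mylist`; the numbers are illustrative, not the real results.

```python
# Stand-in for mylist: index = userId, value = completed tasks (made-up numbers)
counts = [0, 11, 8, 7, 6, 12, 6, 9, 11, 8, 12]

# enumerate() pairs each index (userId) with its count, so the ids survive sorting
ranked = sorted(enumerate(counts), key=lambda pair: pair[1], reverse=True)
top3 = ranked[:3]
print(top3)
```

Python's sort is stable, so users with equal counts keep their original (ascending userId) order.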

# How about Pandas and JSON?

In [None]:
import pandas as pd

In [None]:
df = pd.read_json('https://jsonplaceholder.typicode.com/todos')

In [None]:
df

In [None]:
df.groupby(['userId'])['completed'].sum()

In [None]:
df.groupby(['userId'])['completed'].sum().sort_values()

In [None]:
df.groupby(['userId'])['completed'].sum().sort_values(ascending=False)

# Exercise: Find a public JSON API, get data, and convert it into a Pandas DataFrame

## Many possible sources

https://github.com/toddmotto/public-apis
    
### You want the ones without authorization and WITH CORS unless you are feeling adventurous and want to try with auth



In [None]:
## For authorization you generally need some sort of token (key)
# One example for the Zendesk API: https://develop.zendesk.com/hc/en-us/community/posts/360001652447-API-auth-in-python
# This will not work without a Zendesk account and token

# For an API token, append '/token' to your username and use the token as the password:
url = 'https://your_subdomain.zendesk.com/api/v2/users/123.json'
r = requests.get(url, auth=('user@example.com/token', 'your_token'))

# For an OAuth token, set an Authorization header:
access_token = 'your_access_token'  # placeholder, replace with a real OAuth token
bearer_token = 'Bearer ' + access_token
header = {'Authorization': bearer_token}
url = 'https://your_subdomain.zendesk.com/api/v2/users/123.json'
r = requests.get(url, headers=header)