[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/jkanclerz/data-science-workshop-2021/blob/main/01--edi/08--json.ipynb)

## JSON format

**J**ava**S**cript **O**bject **N**otation

- Items may have different fields
- Fields have keys and values
- Values may have different types (string, number, boolean, null)
- Values may themselves be JSON objects
- Values may be collections (arrays)
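
The properties listed above can be seen in a small sample. The data below is invented purely for illustration (the second record and the `address` field are assumptions, not part of any real API), so treat it as a sketch of what heterogeneous JSON tends to look like rather than a definitive schema.

```python
import json

# Hypothetical sample data, made up for this illustration only.
text = """
[
  {"name": "Jakub", "hobbies": ["lego", "bike"], "address": {"city": "Krakow"}},
  {"name": "Anna", "age": 30}
]
"""

items = json.loads(text)

# Items in the same collection may carry different fields.
field_sets = [sorted(item.keys()) for item in items]
print(field_sets)

# Values may be strings, collections, or nested objects.
value_types = {key: type(value).__name__ for key, value in items[0].items()}
print(value_types)
```

After parsing, a JSON object becomes a `dict` and a JSON array becomes a `list`, which is why ordinary Python indexing works on the result.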

### Documentation

- http://www.json.org/
- https://docs.python.org/3/library/json.html

```python
# Display the built-in documentation for the json module
import json
help(json)
```

### Most common source: web services

- MusicBrainz: https://musicbrainz.org/doc/Development/XML_Web_Service/Version_2
- NBP (National Bank of Poland): http://api.nbp.pl/

- https://musicbrainz.org/ws/2/artist/00a9f935-ba93-4fc8-a33a-993abe9c936b?fmt=json&inc=releases
- http://api.nbp.pl/api/exchangerates/tables/A?format=json
- https://api.chucknorris.io/

In [4]:
import json
help(json)

Help on package json:

NAME
    json

DESCRIPTION
    JSON (JavaScript Object Notation) <http://json.org> is a subset of
    JavaScript syntax (ECMA-262 3rd edition) used as a lightweight data
    interchange format.
    
    :mod:`json` exposes an API familiar to users of the standard library
    :mod:`marshal` and :mod:`pickle` modules.  It is derived from a
    version of the externally maintained simplejson library.
    
    Encoding basic Python object hierarchies::
    
        >>> import json
        >>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
        '["foo", {"bar": ["baz", null, 1.0, 2]}]'
        >>> print(json.dumps("\"foo\bar"))
        "\"foo\bar"
        >>> print(json.dumps('\u1234'))
        "\u1234"
        >>> print(json.dumps('\\'))
        "\\"
        >>> print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True))
        {"a": 0, "b": 0, "c": 0}
        >>> from io import StringIO
        >>> io = StringIO()
        >>> json.dump(['streaming API'], io

## Serialization / deserialization

In [5]:
person = {
    "name": "Jakub",
    "hobbies": ["lego", "bike", "DIY"]
}

In [6]:
type(person)

dict

In [7]:
person_as_string = json.dumps(person)

In [10]:
person_as_string

'{"name": "Jakub", "hobbies": ["lego", "bike", "DIY"]}'

In [11]:
type(person_as_string)

str

In [12]:
loaded_person = json.loads(person_as_string)

In [13]:
loaded_person

{'name': 'Jakub', 'hobbies': ['lego', 'bike', 'DIY']}

In [14]:
type(loaded_person)

dict
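
The round trip above returns an equal `dict`, but that is not guaranteed for every Python object. The sketch below uses a made-up dictionary (not from the workshop data) to show three conversions the `json` module performs: tuples become lists, `None` becomes `null`, and non-string keys become strings.

```python
import json

# Hypothetical input chosen to expose the type conversions.
original = {"point": (1, 2), "missing": None, 1: "one"}

encoded = json.dumps(original)
decoded = json.loads(encoded)

print(encoded)  # '{"point": [1, 2], "missing": null, "1": "one"}'
print(decoded)  # {'point': [1, 2], 'missing': None, '1': 'one'}

# The decoded object is no longer equal to the original.
print(decoded == original)  # False
```

This matters when a dictionary is saved and later compared with a fresh copy: equality can fail even though no data was lost.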

In [16]:
with open('var/jakub.json', 'w') as f:
    json.dump(person, f)

In [17]:
!cat var/jakub.json

{"name": "Jakub", "hobbies": ["lego", "bike", "DIY"]}

In [19]:
with open('var/jakub.json', 'r') as f:
    person_loaded_from_file = json.load(f)

In [20]:
person_loaded_from_file

{'name': 'Jakub', 'hobbies': ['lego', 'bike', 'DIY']}
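
Writing to `var/jakub.json` assumes that the `var` directory already exists. A slightly more defensive variant is sketched below. It writes into a temporary directory (an assumption made so the example cleans up after itself), creates the parent folder first, and uses `indent` to produce a human-readable file.

```python
import json
import tempfile
from pathlib import Path

person = {"name": "Jakub", "hobbies": ["lego", "bike", "DIY"]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "var" / "jakub.json"
    # Create the target directory if it is missing.
    path.parent.mkdir(parents=True, exist_ok=True)

    # indent=2 pretty-prints; ensure_ascii=False keeps non-ASCII text readable.
    with path.open("w", encoding="utf-8") as f:
        json.dump(person, f, indent=2, ensure_ascii=False)

    pretty_text = path.read_text(encoding="utf-8")

    with path.open("r", encoding="utf-8") as f:
        restored = json.load(f)

print(pretty_text)
print(restored == person)
```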

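## REST API
REST (Representational State Transfer) is an architectural style for web services. The APIs used below are queried over HTTP and return JSON.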

In [21]:
import requests

In [22]:
response = requests.get('https://api.chucknorris.io/jokes/random')

In [23]:
response.text

'{"categories":[],"created_at":"2020-01-05 13:42:22.089095","icon_url":"https://assets.chucknorris.host/img/avatar/chuck-norris.png","id":"TnN6vHAhSqCRPVQL53R-lw","updated_at":"2020-01-05 13:42:22.089095","url":"https://api.chucknorris.io/jokes/TnN6vHAhSqCRPVQL53R-lw","value":"When Chuck Norris shaves the razor blades get cut."}'

In [24]:
joke = json.loads(response.text)

In [25]:
joke

{'categories': [],
 'created_at': '2020-01-05 13:42:22.089095',
 'icon_url': 'https://assets.chucknorris.host/img/avatar/chuck-norris.png',
 'id': 'TnN6vHAhSqCRPVQL53R-lw',
 'updated_at': '2020-01-05 13:42:22.089095',
 'url': 'https://api.chucknorris.io/jokes/TnN6vHAhSqCRPVQL53R-lw',
 'value': 'When Chuck Norris shaves the razor blades get cut.'}
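
`json.loads` raises `json.JSONDecodeError` when the body is not valid JSON, for example when a server returns an HTML error page. The helper below, `parse_joke`, is a hypothetical name introduced for this sketch. It works on plain strings, so it can be tried without a network connection.

```python
import json


def parse_joke(body):
    """Return the 'value' field of a JSON body, or None if it cannot be read."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # The body was not JSON at all (e.g. an HTML error page).
        return None
    if not isinstance(data, dict):
        # Valid JSON, but not an object with named fields.
        return None
    return data.get("value")


good = '{"categories": [], "value": "When Chuck Norris shaves the razor blades get cut."}'
bad = "<html>503 Service Unavailable</html>"

print(parse_joke(good))
print(parse_joke(bad))
```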

In [27]:
def random_joke():
    r = requests.get('https://api.chucknorris.io/jokes/random')
    joke_as_dict = r.json()
    return joke_as_dict.get('value')

In [28]:
random_joke()

'Chuck Norris is forbidden from competing in paintball games... for very fucking obvious reasons.'

In [29]:
random_joke()

'It is scientifically impossible for Chuck Norris to have had a mortal father. He went back in time and fathered himself.'
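
`random_joke` above has no timeout and does not check the HTTP status code. One possible hardening is sketched here. The name `random_joke_safe`, the `get` parameter, and the `FakeResponse` stand-in are all assumptions added for illustration. Only `requests.get(..., timeout=...)`, `raise_for_status()`, and `json()` come from the `requests` library itself.

```python
JOKE_URL = "https://api.chucknorris.io/jokes/random"


def random_joke_safe(get=None, timeout=5.0):
    """Fetch a joke, failing fast on slow servers and HTTP errors."""
    if get is None:
        # Imported lazily so the sketch can also be exercised offline.
        import requests
        get = requests.get
    response = get(JOKE_URL, timeout=timeout)
    response.raise_for_status()  # raises on 4xx/5xx responses
    return response.json().get("value")


class FakeResponse:
    """Minimal offline stand-in for a requests response (hypothetical)."""

    def __init__(self, payload):
        self._payload = payload

    def raise_for_status(self):
        pass

    def json(self):
        return self._payload


def fake_get(url, timeout):
    # Returns canned data instead of contacting the network.
    return FakeResponse({"value": "a canned joke"})


offline_joke = random_joke_safe(get=fake_get)
print(offline_joke)
```

Calling `random_joke_safe()` with no arguments would use the real `requests.get`. Passing a replacement `get` keeps the function testable without network access.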

In [30]:
categories = requests.get('https://api.chucknorris.io/jokes/categories').json()

In [31]:
categories

['animal',
 'career',
 'celebrity',
 'dev',
 'explicit',
 'fashion',
 'food',
 'history',
 'money',
 'movie',
 'music',
 'political',
 'religion',
 'science',
 'sport',
 'travel']

### Complex structures

In [34]:
artist_releases = requests.get('https://musicbrainz.org/ws/2/artist/00a9f935-ba93-4fc8-a33a-993abe9c936b?fmt=json&inc=releases').json()

In [35]:
artist_releases

{'type-id': 'e431f5f6-b5d2-343d-8b36-72607fffb74b',
 'begin-area': {'type-id': None,
  'sort-name': 'Kitee',
  'disambiguation': '',
  'id': 'a62f5ab3-c6e2-4b1e-b35f-2d83cd20ab58',
  'name': 'Kitee',
  'type': None},
 'gender': None,
 'disambiguation': 'Finnish symphonic metal',
 'end-area': None,
 'type': 'Group',
 'end_area': None,
 'gender-id': None,
 'ipis': [],
 'id': '00a9f935-ba93-4fc8-a33a-993abe9c936b',
 'area': {'type-id': None,
  'id': '6a264f94-6ff1-30b1-9a81-41f7bfabd616',
  'iso-3166-1-codes': ['FI'],
  'sort-name': 'Finland',
  'disambiguation': '',
  'name': 'Finland',
  'type': None},
 'sort-name': 'Nightwish',
 'releases': [{'date': '1996',
   'packaging-id': None,
   'release-events': [{'area': {'id': '6a264f94-6ff1-30b1-9a81-41f7bfabd616',
      'iso-3166-1-codes': ['FI'],
      'type-id': None,
      'name': 'Finland',
      'type': None,
      'sort-name': 'Finland',
      'disambiguation': ''},
     'date': '1996'}],
   'status-id': '518ffc83-5cde-34df-8627-81bff

In [36]:
artist_releases.keys()

dict_keys(['type-id', 'begin-area', 'gender', 'disambiguation', 'end-area', 'type', 'end_area', 'gender-id', 'ipis', 'id', 'area', 'sort-name', 'releases', 'isnis', 'begin_area', 'country', 'life-span', 'name'])

In [38]:
artist_releases['releases'][0].keys()

dict_keys(['date', 'packaging-id', 'release-events', 'status-id', 'disambiguation', 'id', 'quality', 'packaging', 'text-representation', 'barcode', 'country', 'title', 'status'])
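
Calling `.keys()` level by level works, but for deeply nested responses it can help to list every path at once. The generator below, `key_paths`, is a hypothetical helper written for this sketch. The sample is a heavily trimmed excerpt of the response shown above, not the full payload.

```python
def key_paths(value, prefix=""):
    """Yield a path string for every leaf value in nested dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from key_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from key_paths(child, f"{prefix}[{index}]")
    else:
        yield prefix


# Trimmed excerpt of the artist response, for illustration only.
sample = {
    "sort-name": "Nightwish",
    "area": {"name": "Finland", "iso-3166-1-codes": ["FI"]},
    "releases": [{"title": "[demo]", "date": "1996"}],
}

paths = list(key_paths(sample))
for path in paths:
    print(path)
```

Note that empty lists and empty dicts (such as `'ipis': []`) produce no path in this sketch, because they contain no leaf values.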

In [42]:
just_titles = [(release['title'], release['date']) for release in artist_releases['releases']]

In [43]:
list(just_titles)

[('[demo]', '1996'),
 ('The Carpenter', '1997-09-30'),
 ('Angels Fall First', '1997-11-01'),
 ('Angels Fall First', '1997'),
 ('Oceanborn', '1998-12-07'),
 ('Sacrament of Wilderness', '1998'),
 ('Walking in the Air', '1999-03'),
 ('Oceanborn', '1999-04-21'),
 ('Passion and the Opera', '1999-05-31'),
 ('Oceanborn', '1999-05-31'),
 ('Oceanborn', '1999-05-31'),
 ('Oceanborn', '1999-07-01'),
 ('Sleeping Sun (4 Ballads of the Eclipse)', '1999-08-02'),
 ('Oceanborn', '1999-08-30'),
 ('Oceanborn', '1999-11-10'),
 ('The Pharaoh Sails to Hafenbahn', '1999-12-06'),
 ('Angels Fall First', '1999'),
 ('Angels Fall First', '1999'),
 ('Oceanborn', '1999'),
 ('Oceanborn', '1999'),
 ('Wishmaster', '2000-05-02'),
 ('Wishmaster', '2000-05-29'),
 ('The Kinslayer', '2000-05-29'),
 ('Wishmaster', '2000-05-29'),
 ('Wishmaster', '2000-06')]
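
The list above repeats the same title for each edition. One way to reduce it to the earliest date per title is sketched below, using a few rows copied from the output. It assumes the dates are ISO-style strings, so plain string comparison orders them; a partial date such as `'1997'` sorts before `'1997-11-01'`, which may or may not be what you want.

```python
# A few (title, date) rows taken from the output above.
releases = [
    {"title": "[demo]", "date": "1996"},
    {"title": "Angels Fall First", "date": "1997-11-01"},
    {"title": "Angels Fall First", "date": "1997"},
    {"title": "Oceanborn", "date": "1998-12-07"},
    {"title": "Oceanborn", "date": "1999-04-21"},
]

first_release = {}
for release in releases:
    title = release["title"]
    date = release.get("date")
    if not date:
        # Skip entries with a missing or empty date.
        continue
    if title not in first_release or date < first_release[title]:
        first_release[title] = date

print(first_release)
```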