# JSON

## Jupyter Notebook

Jupyter Notebooks are a way to interactively run Python code in a web browser. They are a great way to learn Python and to share your work with others. This notebook is a Jupyter Notebook. It is made up of cells. Each cell can contain text or code. This cell contains text. The next cell contains code. To run the code in a cell, click on the cell and then click the "Run" button in the toolbar above. You can also run a cell by clicking on the cell and then pressing "Shift + Enter" on your keyboard.

The name Jupyter is a combination of Julia, Python, and R, the three programming languages that Jupyter was originally designed to work with. Jupyter now supports many other programming languages, including C++, Java, and JavaScript. Jupyter is an open source project.

The main Jupyter page is here: https://jupyter.org/

## Markdown cells

Markdown cells are formatted using Markdown, a simple markup language for formatting text. Markdown is used in many places, including GitHub, Reddit, and Slack, and it is easy to learn. You can learn more about Markdown here: https://www.markdownguide.org/

* [Markdown guide at GitHub](https://guides.github.com/features/mastering-markdown/) is also a good resource.

I can insert images, such as the Python logo:
![Python logo](https://www.python.org/static/community_logos/python-logo-master-v3-TM.png)

In [5]:
# this is a program
# let's print our python version
import sys # this import will stay in place for the rest of the program
print("Python version is", sys.version)

Python version is 3.11.4 (tags/v3.11.4:d2340ef, Jun  7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)]


In [6]:
# variables persist across notebook cells
name = "Valdis"
age = 49
print(name, age)

Valdis 49


In [7]:
age += 1
# new age
print(f"New age is {age}")
# so I can run this cell multiple times and it will keep adding 1 to age

New age is 50


In [8]:
# one critique of notebooks is that it is not easy to see the order of execution
# you can execute cells out of order and that can be confusing

# your goal should be that the notebook runs correctly from top to bottom

In [9]:
# i can use ! syntax to run shell commands
!python --version

Python 3.11.4


## JSON

JSON stands for JavaScript Object Notation. It is a lightweight, text-based data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. Its syntax is based on a subset of the JavaScript programming language, and it is used in many places, including web applications and APIs.

JSON was popularized by Douglas Crockford, who wrote the book "JavaScript: The Good Parts" and created the website JSON.org.

JSON is used for:

* Configuration files (Visual Studio Code, etc.)
* Data storage and transfer over network
* Data exchange between web browser and server
* Interestingly, Jupyter Notebook files are also JSON files: .ipynb is just a JSON file with a special structure
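
To make the last point concrete, here is a minimal sketch that parses notebook-like text with the `json` module. The text is a hand-written, heavily simplified stand-in for an .ipynb file; the key names (`cells`, `cell_type`, `source`, `nbformat`) are assumptions about the usual notebook layout, and real notebooks carry more metadata.

```python
import json

# A simplified, hand-written stand-in for the contents of an .ipynb file.
# The key names are assumptions; real notebook files contain more fields.
notebook_text = """
{
  "cells": [
    {"cell_type": "markdown", "source": ["# JSON"]},
    {"cell_type": "code", "source": ["import json"]}
  ],
  "nbformat": 4
}
"""

notebook = json.loads(notebook_text)  # parse the text into Python objects
cell_types = [cell["cell_type"] for cell in notebook["cells"]]

print(type(notebook))  # <class 'dict'>
print(cell_types)      # ['markdown', 'code']
```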




### JSON philosophy

JSON is designed to be easy for humans to read and write and easy for machines to parse and generate.

**JSON is a text format.**
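
A short sketch of what "text format" means in practice: serializing a Python dictionary produces an ordinary string, and that string follows JSON syntax rather than Python syntax. The `settings` dictionary is a made-up example.

```python
import json

settings = {"debug": True, "theme": None, "retries": 3}  # made-up example data

as_json = json.dumps(settings)  # a str containing JSON text
as_repr = repr(settings)        # Python's own textual form, which is not JSON

print(type(as_json))  # <class 'str'>
print(as_json)        # {"debug": true, "theme": null, "retries": 3}
print(as_repr)        # {'debug': True, 'theme': None, 'retries': 3}
```

Note the double quotes, lowercase `true`, and `null` in the JSON text.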

In [10]:
## In Python we need json module to work with json data
import json # standard library module

In [11]:
my_data = {
    "name": "Valdis",
    "age": 49,
    "children": ("Rūta", "Maija", "Edīte"),
    "isMarried": True,
    "cars": [
        {"model": "Tesla X", "year": 2019},
        {"model": "Tesla S", "year": 2017},
    ],
    "friends": [
        {"name": "Aivars", "age": 49},
        {"name": "Aivis", "age": 49},
        {"name": "Aigars", "age": 49,
         "cars": [
             {"model": "Tesla X", "year": 2019},
            {"model": "Tesla 3", "year": 2022},
         ]
        },

    ]
}
# this is not JSON! This is a Python dictionary

In [12]:
# type
print(type(my_data))

<class 'dict'>


In [13]:
# let's get year for my last friend's last car
print(my_data["friends"][-1]["cars"][-1]["year"])
# so I have a dictionary: some values are strings, some are numbers, some are lists, some are dictionaries
# some of those lists have dictionaries inside
# and some of those dictionaries have lists inside
# and finally some of those lists have dictionaries inside

# so data can be hierarchical, not just tabular

2022
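
One way to explore hierarchical data without knowing its shape in advance is to walk it recursively. The sketch below is only an illustration; `iter_leaves` is a hypothetical helper name, and `sample` is a cut-down version of the dictionary above.

```python
def iter_leaves(node, path=()):
    """Yield (path, value) pairs for every non-container value in nested data."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_leaves(value, path + (key,))
    elif isinstance(node, (list, tuple)):
        for index, value in enumerate(node):
            yield from iter_leaves(value, path + (index,))
    else:
        yield path, node


# a cut-down version of the dictionary used in this notebook
sample = {
    "name": "Valdis",
    "cars": [{"model": "Tesla X", "year": 2019}],
}

leaves = list(iter_leaves(sample))
for path, value in leaves:
    print(path, "->", value)
```

Each path is the sequence of keys and indexes that would be used to reach the value, for example `sample["cars"][0]["year"]`.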


In [14]:
# JSON is a text format for storing and transporting data
# i can get a JSON string from my dictionary
json_data = json.dumps(my_data) # note dumps not dump - it creates a string from Python data structure
print(json_data)

{"name": "Valdis", "age": 49, "children": ["R\u016bta", "Maija", "Ed\u012bte"], "isMarried": true, "cars": [{"model": "Tesla X", "year": 2019}, {"model": "Tesla S", "year": 2017}], "friends": [{"name": "Aivars", "age": 49}, {"name": "Aivis", "age": 49}, {"name": "Aigars", "age": 49, "cars": [{"model": "Tesla X", "year": 2019}, {"model": "Tesla 3", "year": 2022}]}]}


In [None]:
# we see that some of the conversion is not perfect
# some structures such as tuples do not have a direct JSON equivalent
# so they are converted to lists

In [15]:
# now i can save this json string to a file
with open("my_data.json", "w") as f:
    f.write(json_data) # I just write the string to a file
# fine for ASCII-only files read by programs, but hard for people to read:
# everything is on one line and non-ASCII characters are escaped

In [16]:
# first let's fix indentation
# we can also write directly from any Python data structure using json.dump
with open("my_data_indented.json", "w") as f:
    # note: dump, not dumps, since we are writing to a file immediately
    json.dump(my_data, f, indent=4) # indent is optional but makes it easier to read

In [17]:
# then we can keep non-ASCII characters readable and store the file as UTF-8
with open("my_data_indented_utf8.json", "w", encoding="utf-8") as f:
    json.dump(my_data,
              f,
              indent=4,
              ensure_ascii=False) # writes non-ASCII characters as-is; encoding="utf-8" in open() sets the file encoding
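
The effect of `ensure_ascii` is easiest to see on a string rather than a file. A minimal sketch, reusing the names from the dictionary above:

```python
import json

children = ["Rūta", "Maija", "Edīte"]

escaped = json.dumps(children)                       # default: ensure_ascii=True
readable = json.dumps(children, ensure_ascii=False)  # keep characters as-is

print(escaped)   # ["R\u016bta", "Maija", "Ed\u012bte"]
print(readable)  # ["Rūta", "Maija", "Edīte"]

# both strings decode to the same Python list
print(json.loads(escaped) == json.loads(readable))  # True
```

Either form is valid JSON; the choice only affects how readable the text is for people.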

## Reading JSON

Reading JSON is also called parsing.

Related terms:

* serialization - writing JSON (and other formats)
* deserialization - reading JSON
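
As a quick sketch of both directions, the `json` module offers two pairs of functions: `dumps`/`loads` work with strings, and `dump`/`load` work with file objects. Here `io.StringIO` stands in for a real file, and `record` is made-up sample data.

```python
import io
import json

record = {"name": "Valdis", "age": 49, "isMarried": True}

# serialization to a string and deserialization back
text = json.dumps(record)
from_text = json.loads(text)

# serialization to a file-like object and back
# (io.StringIO stands in for a real file opened with open())
buffer = io.StringIO()
json.dump(record, buffer)
buffer.seek(0)  # rewind before reading
from_file = json.load(buffer)

print(from_text == record)  # True
print(from_file == record)  # True
```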

In [18]:
# let's read our utf-8 file back
# open() uses read mode ("r") by default
with open("my_data_indented_utf8.json", encoding="utf-8") as f:
    new_data = json.load(f) # we read the file and convert it to Python data structure
# so what data type?
print(type(new_data)) # it is a dictionary

<class 'dict'>


In [21]:
# again let's get last car year for my last friend
print(new_data["friends"][-1]["cars"][-1]["year"])

2022


In [None]:
# let's make a quick translation table between Python and JSON
# Python -> JSON
# dict -> object
# list, tuple -> array
# str -> string
# int, float -> number
# True -> true
# False -> false
# None -> null

# JSON -> Python
# object -> dict
# array -> list
# string -> str
# number (int) -> int
# number (real) -> float
# true -> True
# false -> False
# null -> None
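
The table above can be checked with a short round trip. This is only a sketch with arbitrary sample values; it encodes each value to JSON text and decodes it again, recording the type names on both sides.

```python
import json

samples = [{"a": 1}, [1, 2], (1, 2), "text", 42, 3.14, True, False, None]

rows = []
for value in samples:
    encoded = json.dumps(value)    # Python -> JSON text
    decoded = json.loads(encoded)  # JSON text -> Python
    rows.append((type(value).__name__, encoded, type(decoded).__name__))

for python_type, json_text, decoded_type in rows:
    print(f"{python_type:>8} -> {json_text:<10} -> {decoded_type}")
```

The tuple row is the one where the type before and after differs: it goes in as `tuple` and comes back as `list`.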


In [22]:
# so if we compare data types for my children
print(type(my_data["children"])) # tuple
print(type(new_data["children"])) # list

<class 'tuple'>
<class 'list'>


## Reading JSON data from web

JSON is widely used by various web APIs. For example, we can get weather data from the OpenWeatherMap API. We need to register and get an API key. Then we can use it to get weather data for any city in the world.

We will start by using a fake JSON API from https://jsonplaceholder.typicode.com/

We will use the requests library to get data from the web.

Requests is a third-party library built on top of urllib3 (not the standard library's urllib). It provides a simpler interface for making HTTP requests.

In [23]:
# if you do not have the requests module installed, you can install it with
# !pip install requests (from a notebook cell)
# or pip install requests (from the command line)
import requests # we need to import requests module to make http requests
# can we get version of requests module?
print(requests.__version__) # should not really matter but it is good to know

2.31.0


In [24]:
# then we need url address which we want to get data from
url = "https://jsonplaceholder.typicode.com/users"
print(f"Getting data from {url}")

Getting data from https://jsonplaceholder.typicode.com/users


In [25]:
# making a get request to our url
response = requests.get(url) # this makes a request to our url
# there are many options
# can use post, put, delete, patch, head, options
# can put headers in our request for authorization, content type, etc.
# first let's get a status code
print(response.status_code) # 200 is good, 404 is not found, 500 is server error

# full list of http status codes: https://en.wikipedia.org/wiki/

# good one is 418 I'm a teapot
# https://en.wikipedia.org/wiki/HTTP_418

200
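
Status codes fall into ranges, which makes them easy to handle in code. The sketch below uses the standard library's `http.HTTPStatus` to look up the standard phrase for a code; `describe_status` is a hypothetical helper name, not part of requests.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description of an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # not a code the standard library knows about
    if 100 <= code < 200:
        category = "informational"
    elif 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirect"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "other"
    return f"{code} {phrase} ({category})"


for code in (200, 404, 500):
    print(describe_status(code))
```

In a notebook, `describe_status(response.status_code)` would give a readable summary of the response received above.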


In [26]:
# now that we have a response in our memory we can get data from it
# let's get pure text for now
text_data = response.text # just a string
# first 100 characters
print(text_data[:100])

[
  {
    "id": 1,
    "name": "Leanne Graham",
    "username": "Bret",
    "email": "Sincere@april.


In [27]:
# we could change text(str) into our python data structure with
json_data = json.loads(text_data) # loads from string
print(type(json_data)) # list of dictionaries

<class 'list'>


In [28]:
# let's get first user's name
print(json_data[0]["name"])

Leanne Graham


In [29]:
# we do not really need this two-step process
# requests can give us json data directly
json_data_also = response.json() # this will convert json string to python data structure

# let's get first user's name
print(json_data_also[0]["name"])

Leanne Graham
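
A response body is not always valid JSON; a server that is down may return an HTML error page instead. A minimal sketch of defensive parsing with the standard `json` module follows. `parse_json_or_none` is a hypothetical helper, and both sample texts are made up for the illustration.

```python
import json


def parse_json_or_none(text):
    """Return parsed JSON, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        print(f"Could not parse JSON: {error.msg} at position {error.pos}")
        return None


good_text = '[{"id": 1, "name": "Leanne Graham"}]'
bad_text = "<html>Service unavailable</html>"  # e.g. an HTML error page

users = parse_json_or_none(good_text)
nothing = parse_json_or_none(bad_text)

print(users[0]["name"])  # Leanne Graham
print(nothing)           # None
```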


## Typical JSON structure

Typically, JSON data is an object (a dictionary in Python) with keys and values. Values can be strings, numbers, booleans, null, arrays (lists), or other objects.

Alternatively, JSON data can be an array of objects (a list of dictionaries), as in the users response above.

In [30]:
# so here if I want the longitude of the first person's address I would use
print(json_data_also[0]["address"]["geo"]["lng"])

81.1496
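
Chained indexing raises an error as soon as one key is missing. The sketch below shows one possible way to make nested lookups tolerant of missing keys. `get_nested` is a hypothetical helper, and `user` is a hand-written record shaped like one entry of the response; storing the coordinate as a string is an assumption about the API.

```python
def get_nested(data, *keys, default=None):
    """Follow keys/indexes into nested dicts and lists; return default if any step fails."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# hypothetical record shaped like one user from the response above
# (assumption: the coordinate is delivered as a string)
user = {"name": "Leanne Graham", "address": {"geo": {"lng": "81.1496"}}}

lng_text = get_nested(user, "address", "geo", "lng")
print(lng_text, type(lng_text))
print(float(lng_text))  # convert to a number before doing arithmetic
print(get_nested(user, "company", "name", default="n/a"))  # n/a
```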


## Awesome JSON APIs

There is a cool list of curated JSON APIs here:
* https://github.com/public-apis/public-apis

Many APIs have rate limits, and some require registration to get an API key, which we then pass in a request header.

Free APIs can sometimes be down, since they are mostly run by volunteers.
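
As a rough sketch of passing an API key in a header: the header name and the `Bearer` scheme below are assumptions, since every API documents its own convention, and `build_auth_headers` and `"demo-key"` are placeholders for illustration.

```python
def build_auth_headers(api_key):
    """Build request headers carrying an API key (the header scheme is an assumption)."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Accept": "application/json",
    }


headers = build_auth_headers("demo-key")  # "demo-key" is a placeholder, not a real key
print(headers)

# With the requests library this dictionary would be passed along with a timeout,
# so that an unresponsive API cannot hang the notebook:
# response = requests.get(url, headers=headers, timeout=10)
```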

In [31]:
cat_url = "https://cat-fact.herokuapp.com/facts"
cat_response = requests.get(cat_url)
# status code
print(cat_response.status_code) # we could use status code in if statement

200


In [32]:
cat_data = cat_response.json()
# type
print(type(cat_data)) # list

<class 'list'>


In [33]:
# let's print first 3 cat facts
for cat_fact in cat_data[:3]:
    print(cat_fact["text"])

Owning a cat can reduce the risk of stroke and heart attack by a third.
Most cats are lactose intolerant, and milk can cause painful stomach cramps and diarrhea. It's best to forego the milk and just give your cat the standard: clean, cool drinking water.
Domestic cats spend about 70 percent of the day sleeping and 15 percent of the day grooming.
