# JSON files

by [Luciano Gabbanelli](https://www.linkedin.com/in/luciano-gabbanelli-ph-d-75302218)

<img width=80 src="https://media.giphy.com/media/KAq5w47R9rmTuvWOWa/giphy.gif">

<img width=150 src="Images/Assembler.png">

***

In computer science, a data structure is a particular way of organising and storing data in a computer such that it can be accessed and modified efficiently. More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.

**Three different data structures:**

- Structured Data

- Unstructured Data

- Semi-structured Data

A final category is metadata. From a technical point of view, metadata is not a separate data structure, but it is a very important element, particularly for big data analysis and big data solutions.

Check the following image. Do you understand what type of data each part represents?

<img width=550 src="Images/data-structures.png">

Can you do a little research about the difference between each of these types of data?

## Let's dive into JSON

<table>
<tr>
<td>

<img width=350 src="Images/Jason.jpg">

</td>
<td>
    
- JSON is text, written with JavaScript Object Notation.
    <br>
    
- JSON is based on JavaScript, but is language independent.
    <br>
    
- JSON is a syntax for storing and exchanging data.
    <br>
    
- Files have a **.json** extension.
    <br>
    
- Python has a built-in package called `json`, which can be used to work with JSON data.
    <br>
    
- You can parse JSON
    <br>
    
- Or convert from Python to JSON
    
</td>
</tr>
</table>

Data in JSON format can be used in practically all programming languages (such as Java, C#, C, C++, PHP, JavaScript, Python, etc.). JSON stands for JavaScript Object Notation: a JSON file is plain text, not an executable script, and it is used to store and transfer data.

Python supports JSON through a built-in package called `json`. Let us start by importing this package to handle JSON documents in Python scripts:

In [2]:
import json

# some JSON:
mystring =  '{"name":"John", "age":30, "city":"New York"}'

# as you can see, you have a JSON-formatted string
print(mystring)
print(type(mystring))

{"name":"John", "age":30, "city":"New York"}
<class 'str'>


As can be seen, JSON represents objects in a textual way using *key : value* pairs/mapping, enclosed between curly brackets, { and },  and separated by commas.

Keys in JSON are always strings, enclosed in double quotes, " and ".

Values can be:

- basic types like string, number, boolean, null;

- arrays, between square brackets, [ and ];

- other JSON objects, between braces, { and }.
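
As a minimal sketch of the value types listed above, the following made-up document (the `sample` string and all of its keys are invented for illustration) holds one value of each kind. It is parsed with `json.loads()`, which the next section explains, to show the Python type each value becomes.

```python
import json

# Hypothetical JSON document, invented for illustration: one key per value type
sample = '''{
    "title": "Demo",
    "pages": 12,
    "price": 9.5,
    "available": true,
    "discount": null,
    "tags": ["a", "b"],
    "author": {"name": "Ada"}
}'''

parsed = json.loads(sample)

# Print each key with the Python type its value was converted to
for key, value in parsed.items():
    print(key, '->', type(value).__name__)
```

Note how `true` and `null` are written in lowercase in JSON, unlike `True` and `None` in Python.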

### Deserialization of JSON (`json.loads()`)

You will also find this called decoding, reading into Python, parsing, or converting from JSON to Python objects.

If you have a JSON string, you can parse it by using the json.loads() method:

In [3]:
# parse mystring:
mydict = json.loads(mystring)

# if you print the output mydict, 
# there doesn't seem to be much difference with mystring
print(mydict)

# however, there is a difference!
# the result is a Python dictionary:
print(type(mydict))

# we can access its elements:
print(mydict["age"])

# check out this!
print(type(mydict["age"]))

{'name': 'John', 'age': 30, 'city': 'New York'}
<class 'dict'>
30
<class 'int'>


**Important:**

1) When we talk about JSON objects or documents in Python we mean text strings that contain a JSON document.

2) The Python data structure that best represents a JSON object is the dictionary. Therefore, any JSON object can be represented as a dictionary, and it will be the default data structure that the json library will use in JSON-Python conversions.
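
One caveat to point 2, shown as a small sketch with made-up data (the `scores` dictionary is hypothetical): JSON object keys are always strings, so a Python dictionary with non-string keys does not come back unchanged after a round trip. The example uses `json.dumps()`, which is covered in the next section.

```python
import json

# Made-up dictionary with integer keys
scores = {1: "gold", 2: "silver"}

# Encoding turns the integer keys into JSON strings
encoded = json.dumps(scores)
print(encoded)

# Decoding gives back a dictionary, but the keys are now strings
decoded = json.loads(encoded)
print(decoded)

# So the round trip does not reproduce the original dictionary
print(decoded == scores)
```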

### Convert from Python to JSON

If you have a Python object, you can convert it into a JSON string by using the json.dumps() method.
You will also find this called encoding, serializing, or writing a Python object into a JSON string.

**Note:** once you import a library, you do not have to import it again.

In [4]:
# a Python object (dict):
newdict = {
    "name": "Evaristo Paramos",
    "age": 63,
    "city": "Guillarei, Pontevedra",
    "Occupation": ["Compositor", "Cantante", "Escritor"]
}

In [5]:
# since we have a dictionary, its type is dict
print('newdict is of type', type(newdict))

# and being a dictionary, we can access the values, calling them by their keys
newdict["age"]

newdict is of type <class 'dict'>


63

In [6]:
# convert into JSON:
myJSONstring = json.dumps(newdict)

# the result is a JSON string:
print(myJSONstring)
print('myJSONstring is of type', type(myJSONstring))

{"name": "Evaristo Paramos", "age": 63, "city": "Guillarei, Pontevedra", "Occupation": ["Compositor", "Cantante", "Escritor"]}
myJSONstring is of type <class 'str'>


To be precise, you can convert Python objects of the following types into JSON strings. Each one is converted into its JSON equivalent:

|  <div style="width:90px">Python</div>  | JSON      |
|:---------|:----------|
|  dict    | Object    |
|  list    | Array     |
|  tuple   | Array     |
|  str     | String    |
|  int     | Number    |
|  float   | Number    |
|  True    | true      |
|  False   | false     |
|  None    | null      |

**To sum up:** By default, the encoder understands several native Python types: **dict, list, tuple, str, int, float, bool, None**.

Check it out with your own eyes:

In [7]:
print(json.dumps({"name": "John", "age": 30}))
print(json.dumps(["apple", "bananas"]))
print(json.dumps(("apple", "bananas", 32.5)))
print(json.dumps("hello"))
print(json.dumps(42))
print(json.dumps(31.76))
print(json.dumps(True))
print(json.dumps(False))
print(json.dumps(None))

{"name": "John", "age": 30}
["apple", "bananas"]
["apple", "bananas", 32.5]
"hello"
42
31.76
true
false
null
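
The conversion table has two consequences that a short sketch can make visible (the `point` and `colours` values are made up for illustration). First, tuples and lists both become JSON arrays, so a tuple comes back as a list. Second, types outside the table, such as `set`, are rejected unless you tell the encoder what to do through the `default` parameter of `json.dumps()`.

```python
import json

# A tuple is encoded as a JSON array, so it comes back as a list
point = (1, 2)
restored = json.loads(json.dumps(point))
print(restored, type(restored))

# A set has no JSON equivalent: the encoder raises TypeError
colours = {"red", "green"}
try:
    json.dumps(colours)
except TypeError as error:
    print("Cannot encode:", error)

# One possible workaround: `default` receives any object the encoder
# does not understand; here it turns the set into a sorted list
print(json.dumps(colours, default=sorted))
```

Using `sorted` here is only one choice among many; any function that returns an encodable object would do.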


**Task:** Convert a Python object containing all the legal data types:

In [8]:
# let us define a python object
python_object = {
    # type your code here:
    # enter as much data as you want
    # cover all data types
}

# and transform it to JSON
JSON_object = json.dumps(python_object)

# print the object transformed
print(JSON_object)

{}


You can find a possible solution [here](https://www.w3schools.com/python/python_json.asp).

### You can give format to your JSON string

The above example prints a nearly unreadable JSON string; there are no indents, no line breaks.

The json.dumps() method has parameters to make it easier to read the result.

**Task:** Tune a parameter to indent the result. Run the following cell.

In [9]:
print ('JSON data:\n\n', json.dumps(newdict, indent=1), '\n\n-------------------------------------\n')
print(json.dumps(newdict, indent=4, separators=(". ", " = ")), '\n\n-------------------------------------\n')
print( json.dumps(newdict, indent=4, sort_keys=True))

JSON data:

 {
 "name": "Evaristo Paramos",
 "age": 63,
 "city": "Guillarei, Pontevedra",
 "Occupation": [
  "Compositor",
  "Cantante",
  "Escritor"
 ]
} 

-------------------------------------

{
    "name" = "Evaristo Paramos". 
    "age" = 63. 
    "city" = "Guillarei, Pontevedra". 
    "Occupation" = [
        "Compositor". 
        "Cantante". 
        "Escritor"
    ]
} 

-------------------------------------

{
    "Occupation": [
        "Compositor",
        "Cantante",
        "Escritor"
    ],
    "age": 63,
    "city": "Guillarei, Pontevedra",
    "name": "Evaristo Paramos"
}
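
Another `json.dumps()` parameter that can help readability is `ensure_ascii`. The sketch below uses a made-up `person` dictionary to show that, by default, accented characters are written as escape sequences, and that `ensure_ascii=False` keeps them as typed.

```python
import json

# Hypothetical record with accented characters, invented for illustration
person = {"name": "Andrés", "city": "A Coruña"}

# Default: non-ASCII characters are written as \uXXXX escape sequences
print(json.dumps(person))

# ensure_ascii=False keeps the characters as they are
print(json.dumps(person, ensure_ascii=False, indent=4))
```

Both strings are valid JSON and parse back to the same dictionary; the difference is only in how they look.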


## Tricky example:

Discuss the following Figure with your Squad:

<img width=500 src="Images/Extructura json.png">

In [10]:
# the code for this JSON string is:  
capo = '''[{
    "dorsal": 30,
    "name": "Lionel Andrés Messi Cuccittini",
    "demarcation": ["Forward", "Midfielder"],
    "team": "Paris Saint-Germain F.C."    
}]'''

**Task:** Discuss its type and print it. Then, parse it and print its type. Also print its first element and its type.

In [11]:
# type your code here:
print(capo)
print()
print(type(capo))

[{
    "dorsal": 30,
    "name": "Lionel Andrés Messi Cuccittini",
    "demarcation": ["Forward", "Midfielder"],
    "team": "Paris Saint-Germain F.C."    
}]

<class 'str'>


In [12]:
capo_but_in_Python = json.loads(capo)

In [13]:
print(capo_but_in_Python[0])
print()
print(type(capo_but_in_Python), type(capo_but_in_Python[0]))

{'dorsal': 30, 'name': 'Lionel Andrés Messi Cuccittini', 'demarcation': ['Forward', 'Midfielder'], 'team': 'Paris Saint-Germain F.C.'}

<class 'list'> <class 'dict'>


**Very optional task!**

**Check these two outputs:** They look the same but they're not. Try to understand the difference in the next few years :)

In [14]:
print ('DATA:', repr(capo_but_in_Python))
print()
print ('DATA:', str(capo_but_in_Python))
print()
print ('DATA:', repr(capo))
print()
print ('DATA:', str(capo))

DATA: [{'dorsal': 30, 'name': 'Lionel Andrés Messi Cuccittini', 'demarcation': ['Forward', 'Midfielder'], 'team': 'Paris Saint-Germain F.C.'}]

DATA: [{'dorsal': 30, 'name': 'Lionel Andrés Messi Cuccittini', 'demarcation': ['Forward', 'Midfielder'], 'team': 'Paris Saint-Germain F.C.'}]

DATA: '[{\n    "dorsal": 30,\n    "name": "Lionel Andrés Messi Cuccittini",\n    "demarcation": ["Forward", "Midfielder"],\n    "team": "Paris Saint-Germain F.C."    \n}]'

DATA: [{
    "dorsal": 30,
    "name": "Lionel Andrés Messi Cuccittini",
    "demarcation": ["Forward", "Midfielder"],
    "team": "Paris Saint-Germain F.C."    
}]
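
A hint for the outputs above, as a small sketch on a made-up string (`text` is hypothetical): `str()` gives the readable form that `print()` shows, while `repr()` gives the form you would type in Python source code, with quotes and with line breaks written as `\n`.

```python
import json

# A short JSON string containing a line break (made-up example)
text = '{"a": 1,\n "b": 2}'

# str() is what print() shows: the line break is rendered
print(str(text))

# repr() shows the value as Python source: quoted, with \n spelled out
print(repr(text))

# For the parsed dictionary, str() and repr() give the same text
data = json.loads(text)
print(str(data) == repr(data))
```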


## JSON from files: `dumps()` vs. `dump()`

JSON files are simple text files with a .json extension. 

A .json file can be created with any plain text editor. Care must be taken to always use plain text!  

**NOTE:** advanced word processors like Microsoft Word do not work with plain text and therefore will not be useful for working with .json files.
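
Before the video code, here is a minimal sketch of the two functions named in the heading. It uses made-up data and a temporary folder (the names `data`, `with_dumps.json` and `with_dump.json` are hypothetical), so it does not touch any of the course files.

```python
import json
import os
import tempfile

# Made-up data and a temporary folder, so no real file is overwritten
data = {"craft": "ISS", "crew": 7}
folder = tempfile.mkdtemp()
path_a = os.path.join(folder, "with_dumps.json")
path_b = os.path.join(folder, "with_dump.json")

# dumps(): build the JSON string first, then write it yourself
with open(path_a, "w", encoding="utf-8") as file_a:
    file_a.write(json.dumps(data))

# dump(): hand the open file to json and let it do the writing
with open(path_b, "w", encoding="utf-8") as file_b:
    json.dump(data, file_b)

# Both files end up with exactly the same text
with open(path_a, encoding="utf-8") as file_a, open(path_b, encoding="utf-8") as file_b:
    print(file_a.read() == file_b.read())
```

The result is the same either way; `dump()` simply saves you the intermediate string.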

In what follows, I leave the code from the videos. There you can find the difference between `dumps()` (with a final s, which returns a string) and `dump()` (without it, which writes to a file).

In [15]:
# to interact with an API and get a JSON object
import requests

responseObject = requests.get('http://api.open-notify.org/astros.json')
print(responseObject)
jsonData = responseObject.json()

<Response [200]>


Explore the page of the API server: [http://api.open-notify.org/](http://api.open-notify.org/). Run the previous cell to pull the data into Python and obtain an object called `jsonData`. Read about the `requests` library and discuss with your Squad.

In forthcoming modules we will focus properly on APIs; this is just a snack! 🥨

Run the following cell to see the data:

In [16]:
jsonData

{'message': 'success',
 'people': [{'name': 'Sergey Prokopyev', 'craft': 'ISS'},
  {'name': 'Dmitry Petelin', 'craft': 'ISS'},
  {'name': 'Frank Rubio', 'craft': 'ISS'},
  {'name': 'Nicole Mann', 'craft': 'ISS'},
  {'name': 'Josh Cassada', 'craft': 'ISS'},
  {'name': 'Koichi Wakata', 'craft': 'ISS'},
  {'name': 'Anna Kikina', 'craft': 'ISS'},
  {'name': 'Fei Junlong', 'craft': 'Shenzhou 15'},
  {'name': 'Deng Qingming', 'craft': 'Shenzhou 15'},
  {'name': 'Zhang Lu', 'craft': 'Shenzhou 15'}],
 'number': 10}

In this new object you have all the astronauts who are in space right now.

If the list does not match the one in the video, it is because astronauts have gone up or come back down since then.

**Question:** What type of object is `jsonData`? Does that make sense?

In [17]:
type(jsonData)

dict
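
Since the result is a dictionary whose `people` key holds a list of dictionaries, it can be explored with ordinary Python tools. The sketch below uses a shortened, hard-coded copy of the data shown above (called `sampleData` here, a hypothetical name, so that your real `jsonData` is left alone).

```python
from collections import Counter

# A shortened copy of the structure shown above (three of the entries)
sampleData = {
    "message": "success",
    "people": [
        {"name": "Frank Rubio", "craft": "ISS"},
        {"name": "Nicole Mann", "craft": "ISS"},
        {"name": "Fei Junlong", "craft": "Shenzhou 15"},
    ],
    "number": 3,
}

# "people" is a list of dictionaries, so we can loop over it
names = [person["name"] for person in sampleData["people"]]
print(names)

# Count how many astronauts are on each craft
per_craft = Counter(person["craft"] for person in sampleData["people"])
print(dict(per_craft))
```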

Let us now use `json.dumps()` to write the JSON data to a file.

In [18]:
jsonStringData = json.dumps(jsonData)

with open('FilesRW/ISSAstrosDumps.json','w') as myFile:
    print('Now writing ISS Astros data to a file...')
    print()
    myFile.write(jsonStringData)

Now writing ISS Astros data to a file...



Let us check that the file was correctly created:

In [19]:
import os
if os.path.exists("FilesRW/ISSAstrosDumps.json"):
    print("The file is in my computer!")
else:
    print("The file does not exist =(")

The file is in my computer!


Now let us print the content of our file. You can also open the file directly ;)

In [20]:
with open('FilesRW/ISSAstrosDumps.json','r') as myFile:
    print(myFile)

<_io.TextIOWrapper name='FilesRW/ISSAstrosDumps.json' mode='r' encoding='cp1252'>


**WTF!!! What is this?!**

Let us keep moving and continue with the `dump()` method. Maybe in the future you will understand...

In [21]:
# the with block closes the file automatically
with open('FilesRW/ISSAstrosDump.json','w') as jsonOutputFile:
    json.dump(jsonData, jsonOutputFile, indent=4)

In [22]:
with open('FilesRW/ISSAstrosDump.json','r') as myFile:
    print(myFile)

<_io.TextIOWrapper name='FilesRW/ISSAstrosDump.json' mode='r' encoding='cp1252'>


🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵🥵

Let us try to understand what is happening!

## JSON from files: `loads()` vs. `load()`

The following cells are for the second video.

Here is the webpage [http://api.open-notify.org/iss-now.json](http://api.open-notify.org/iss-now.json) where you can find the data.

In [23]:
pythonStringData = '''{"iss_position": {"latitude": "45.9436", "longitude": "-15.4598"}, 
"timestamp": 1675857125, "message": "success"}'''

print(pythonStringData)
print()
print('The object is of type', type(pythonStringData))

{"iss_position": {"latitude": "45.9436", "longitude": "-15.4598"}, 
"timestamp": 1675857125, "message": "success"}

The object is of type <class 'str'>


Your data probably doesn't match Travis Bonfigli's or mine. That's because, unsurprisingly, the International Space Station (ISS) is moving 🛰️

In [24]:
ISSJsonData = json.loads(pythonStringData)

print(type(ISSJsonData))
print()
print(ISSJsonData["timestamp"])
print()
print(ISSJsonData["iss_position"]["latitude"])

<class 'dict'>

1675857125

45.9436
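
Note that parsed values keep the types they had in the JSON text. In this document the coordinates are quoted, so they arrive as strings, while the timestamp is a number. The sketch below converts both; it assumes (this is not stated in the data itself) that the timestamp is Unix time, that is, seconds since 1970-01-01 UTC.

```python
import json
from datetime import datetime, timezone

# Same JSON text as in the cell above
pythonStringData = '''{"iss_position": {"latitude": "45.9436", "longitude": "-15.4598"}, 
"timestamp": 1675857125, "message": "success"}'''
ISSJsonData = json.loads(pythonStringData)

# The coordinates are quoted in the JSON, so they arrive as strings
latitude_text = ISSJsonData["iss_position"]["latitude"]
print(type(latitude_text))

# Convert to float before doing arithmetic with them
latitude = float(latitude_text)
print(round(latitude, 1))

# Assumption: the timestamp is Unix time (seconds since 1970-01-01 UTC)
moment = datetime.fromtimestamp(ISSJsonData["timestamp"], tz=timezone.utc)
print(moment.isoformat())
```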


Wait a minute!! We have not worked with a file from our computer.

Yes, I know. But in the Edpuzzle you already learned how to read a file from your computer by mixing the `open()`, `read()` / `readline()` / `readlines()` and `loads()` methods. Do you remember?
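
As a reminder, the two routes can be sketched side by side. The example writes a small temporary file first (the file name `example.json` and its content are hypothetical), so it runs without any of the course files.

```python
import json
import os
import tempfile

# Create a small temporary file to read back (hypothetical name and content)
path = os.path.join(tempfile.mkdtemp(), "example.json")
with open(path, "w", encoding="utf-8") as out_file:
    out_file.write('{"message": "success", "timestamp": 1675857125}')

# Route 1: read the whole file into a string, then parse it with loads()
with open(path, encoding="utf-8") as in_file:
    text = in_file.read()
from_loads = json.loads(text)

# Route 2: pass the open file to load(), which reads and parses in one step
with open(path, encoding="utf-8") as in_file:
    from_load = json.load(in_file)

# Both routes produce the same dictionary
print(from_loads == from_load)
```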

In [25]:
# open and parse the data
with open('FilesRW/ISS-now.json') as placeholder:
    pythonDictionary = json.load(placeholder)

print(type(pythonDictionary))
print()
print(pythonDictionary["timestamp"])
print()
print(pythonDictionary["iss_position"]["latitude"])

<class 'dict'>

1675857125

45.9436


🗿   ***Do you know what a placeholder is?***   🗿

**Final task:** Now you are ready to solve the problems we were having with opening the files ISSAstrosDump.json and ISSAstrosDumps.json.

Discuss with your Squad the meaning of

<_io.TextIOWrapper name='FilesRW/ISSAstrosDump.json' mode='r' encoding='cp1252'>

and how to access both files.

You can use `load()` for the former and `loads()` for the latter; or vice versa.