### An Introduction to JSON

Data stored in a CSV file must be flat; that is, it must fit into rows and columns. Most people refer to this type of data as structured or tabular. This data is tabular because the number of columns is the same for every row.  Individual rows may be missing a value for a column; however, these rows still have the same columns.  

This sort of data is convenient for machine learning because most models, such as neural networks, also expect incoming data to be of fixed dimensions. Real-world information is not always so tabular. Suppose the rows represent customers. Each customer might have several phone numbers and addresses. How would you describe such data using a fixed number of columns? It would be useful for each row to hold a list of these phone numbers and addresses, with a length that can vary from one customer to the next.

JavaScript Object Notation (JSON) is a standard file format that stores data hierarchically, much as eXtensible Markup Language (XML) does. JSON is nothing more than a hierarchy of lists and dictionaries whose leaves are strings, numbers, Booleans, and null. Programmers refer to this sort of data as semi-structured or hierarchical data. The following is a sample JSON file.

```
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    },
    {
      "type": "mobile",
      "number": "123 456-7890"
    }
  ],
  "children": [],
  "spouse": null
}
```

The above file may look somewhat like Python code. Curly braces define dictionaries (JSON calls them objects), and square brackets define lists (JSON calls them arrays). A JSON document has a single root element, which is usually a list or dictionary. JSON requires double quotes to enclose strings and names; single quotes are not allowed as string delimiters.
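
The quoting rule can be checked with Python's standard `json` module. The following is a minimal sketch; the strings `valid` and `invalid` and the flag `single_quotes_accepted` are made up for illustration.

```python
import json

valid = '{"name": "John"}'    # double quotes: valid JSON
invalid = "{'name': 'John'}"  # single quotes: a Python dict literal, not JSON

print(json.loads(valid)["name"])

try:
    json.loads(invalid)
    single_quotes_accepted = True
except json.JSONDecodeError as err:
    # The parser rejects the single-quoted property name
    single_quotes_accepted = False
    print(f"Rejected: {err}")
```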

JSON is modeled on JavaScript object literal syntax. JSON is also very close to valid Python code: in the following Python program, the only changes from the JSON file above are `True` in place of `true` and `None` in place of `null`.

In [1]:
jsonHardCoded = {
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": True,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    },
    {
      "type": "mobile",
      "number": "123 456-7890"
    }
  ],
  "children": [],
  "spouse": None
}

Generally, it is better to read JSON from files, strings, or the Internet than to hard-code it as demonstrated here. However, hard-coding can sometimes be useful for internal data structures.
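
The hard-coded dictionary above spells two values differently from the JSON file: `True` and `None` rather than `true` and `null`. As a small sketch of how the `json` module handles this, the example below parses a JSON string and shows the resulting Python values. The string `json_text` is a shortened, illustrative excerpt of the sample record.

```python
import json

# JSON spells its literals in lowercase: true, false, null
json_text = '{"isAlive": true, "spouse": null, "age": 27}'

parsed = json.loads(json_text)
print(parsed)                   # {'isAlive': True, 'spouse': None, 'age': 27}
print(type(parsed["isAlive"]))  # <class 'bool'>
print(parsed["spouse"] is None) # True
```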

Python supports JSON through the `json` module in its standard library. When a Python program loads JSON, the root list or dictionary is returned, as demonstrated by the following code.

In [2]:
import json

json_string = '{"first":"Jeff","last":"Heaton"}'
obj = json.loads(json_string)
print(f"First name: {obj['first']}")
print(f"Last name: {obj['last']}")

First name: Jeff
Last name: Heaton


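Python programs can also load JSON from a file or URL. The following example uses the third-party `requests` package to download a JSON file and parse it.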

In [3]:
import requests

r = requests.get("https://raw.githubusercontent.com/jeffheaton/"
                 +"t81_558_deep_learning/master/person.json")
print(r.json())

{'firstName': 'John', 'lastName': 'Smith', 'isAlive': True, 'age': 27, 'address': {'streetAddress': '21 2nd Street', 'city': 'New York', 'state': 'NY', 'postalCode': '10021-3100'}, 'phoneNumbers': [{'type': 'home', 'number': '212 555-1234'}, {'type': 'office', 'number': '646 555-4567'}, {'type': 'mobile', 'number': '123 456-7890'}], 'children': [], 'spouse': None}


In [11]:
r.json()['firstName']

'John'
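
Loading from a local file works much the same way. The following is a minimal sketch that first writes a small JSON file and then reads it back with `json.load`. The file name `person_sample.json` and the temporary directory are assumptions made so the example is self-contained.

```python
import json
import os
import tempfile

sample = {
    "firstName": "John",
    "phoneNumbers": [{"type": "home", "number": "212 555-1234"}],
}

# Write a small JSON file to a temporary directory (hypothetical file name)
path = os.path.join(tempfile.mkdtemp(), "person_sample.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(sample, f)

# json.load reads from an open file object; json.loads reads from a string
with open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded["phoneNumbers"][0]["number"])
```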

Python programs can easily generate JSON strings from Python dictionaries and lists.

In [14]:
python_obj = {"first":"Jeff","last":"Heaton"}
# python_obj is a regular Python dict; json.dumps serializes it to a JSON string
print(json.dumps(python_obj))

{"first": "Jeff", "last": "Heaton"}


In [18]:
jas = json.dumps(python_obj)
type(jas)

str
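
Because `json.dumps` returns a plain string, parsing that string again with `json.loads` should give back an equal object. The sketch below checks this round trip and also uses the `indent` parameter for more readable output. The names `pretty` and `restored` are introduced here for illustration.

```python
import json

python_obj = {"first": "Jeff", "last": "Heaton"}

# indent=2 pretty-prints the JSON across multiple lines
pretty = json.dumps(python_obj, indent=2)
print(pretty)

# Round trip: parse the string back into a Python dictionary
restored = json.loads(pretty)
print(restored == python_obj)  # True
```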

A data scientist will generally encounter JSON when they access web services to get their data.  A data scientist might use the techniques presented in this section to convert the semi-structured JSON data into tabular data for the program to use with a model such as a neural network.
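
As one possible illustration of that conversion, the sketch below flattens a shortened copy of the sample record into one row per phone number, so every row has the same columns. The column names (`phoneType`, `phoneNumber`, and so on) and the one-row-per-phone layout are assumptions for this example. Other layouts, such as one column per phone type, are equally reasonable.

```python
import json

person_json = """
{
  "firstName": "John",
  "lastName": "Smith",
  "address": {"city": "New York", "state": "NY"},
  "phoneNumbers": [
    {"type": "home", "number": "212 555-1234"},
    {"type": "office", "number": "646 555-4567"},
    {"type": "mobile", "number": "123 456-7890"}
  ]
}
"""

person = json.loads(person_json)

# Repeat the person-level fields on every row; vary only the phone fields
rows = [
    {
        "firstName": person["firstName"],
        "lastName": person["lastName"],
        "city": person["address"]["city"],
        "phoneType": phone["type"],
        "phoneNumber": phone["number"],
    }
    for phone in person["phoneNumbers"]
]

for row in rows:
    print(row)
```

Each dictionary in `rows` has the same five keys, so the list can be written to a CSV file or passed to a table library without further restructuring.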