![Erudio logo](../img/erudio-logo-small.png)

# Reading and Writing JSON

JavaScript Object Notation (JSON) is a widely used data exchange format.  As the name suggests, it is a format derived from JavaScript, but it is strictly language neutral. JSON is currently specified by Internet Engineering Task Force (IETF) RFC 8259.  

JSON is supported by a great many programming languages, whether in their standard libraries, as built-ins, or through widely available third-party libraries.  Many JSON strings are also identical to valid Python expressions for some data structure or scalar.

Let us start out by loading a few Python standard library modules (and one external package) that this lesson will utilize.

In [2]:
!pip install jsonpickle

Collecting jsonpickle
  Downloading jsonpickle-3.0.2-py3-none-any.whl (40 kB)
     ---------------------------------------- 40.7/40.7 kB 2.0 MB/s eta 0:00:00
Installing collected packages: jsonpickle
Successfully installed jsonpickle-3.0.2





In [3]:
import json
from pprint import pprint
from textwrap import fill
from dataclasses import dataclass, asdict
from datetime import datetime
from decimal import Decimal
from fractions import Fraction
from math import pi
import jsonpickle

## A String Representation

Let us create a dictionary and use the `json` module to serialize it in a string form. The examples in this lesson will largely follow those used in the lesson on Python pickles.

In [4]:
# Being still alive, lifespan is unknown & marked with NaN
my_data = dict(name="David", real_number=76.54, count=22, likes_python=True, 
               lifespan=float('nan'), end_of_time=float('inf'),
               pets=['Astrophe', 'Kachina', 'Jackson', 'Rebel'])

jstr = json.dumps(my_data)
print(fill(jstr, width=65))

{"name": "David", "real_number": 76.54, "count": 22,
"likes_python": true, "lifespan": NaN, "end_of_time": Infinity,
"pets": ["Astrophe", "Kachina", "Jackson", "Rebel"]}


In [5]:
print(fill(str(my_data), width=65))

{'name': 'David', 'real_number': 76.54, 'count': 22,
'likes_python': True, 'lifespan': nan, 'end_of_time': inf,
'pets': ['Astrophe', 'Kachina', 'Jackson', 'Rebel']}


## Almost Just Python

The JSON string representing the `my_data` dictionary is *almost* valid Python that we could copy-paste or `eval()`.  The main differences are the different spellings of `true` versus `True`, of `false` versus `False`, and of `null` versus `None`.
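
A small sketch of that near-equivalence, using the standard library's `ast.literal_eval()` as a safer stand-in for `eval()`. The sample strings are made up for illustration.

```python
import ast
import json

# A JSON document that happens to be a valid Python literal as well
shared = '{"name": "David", "count": 22, "pets": ["Astrophe", "Kachina"]}'
same_result = ast.literal_eval(shared) == json.loads(shared)

# true/false/null are not Python names, so this one only parses as JSON
json_only = '{"likes_python": true, "count": null}'
try:
    ast.literal_eval(json_only)
    parsed_as_python = True
except ValueError:
    parsed_as_python = False

print(same_result, parsed_as_python)
print(json.loads(json_only))
```

Even where the two notations overlap, `json.loads()` remains the appropriate tool for reading JSON; the comparison is only meant to show how close the syntaxes are.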

Another subtle issue occurred in the example, however.  The name `nan` is neither a Python keyword nor a built-in name, *nor* is it strictly part of the JSON spec.  This special class of floating-point values (Not-a-Number) is very useful for certain numeric purposes, so many JSON libraries add it as an informal extension.  

The JSON version is spelled `NaN`.  In Python, we could import the name `nan` from the `math` or `numpy` modules, or we can build it using the `float()` constructor.  The constants `Infinity` and `-Infinity`, which represent values defined by the IEEE-754 floating-point standard, are likewise often useful, but are not part of JSON narrowly.

In [6]:
try:
    json.dumps(my_data, allow_nan=False)
except Exception as err:
    print(err)

Out of range float values are not JSON compliant
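
The same strictness can be applied when reading. By default `json.loads()` accepts these extended constants; as a sketch, the `parse_constant` argument lets us refuse them instead. The helper name `reject_constant` is only an illustration.

```python
import json
import math

def reject_constant(name):
    # The decoder calls this with 'NaN', 'Infinity', or '-Infinity'
    raise ValueError(f"{name} is not valid in strict JSON")

text = '{"lifespan": NaN, "end_of_time": Infinity}'

# Default behavior: the extended constants become Python floats
lenient = json.loads(text)
print(math.isnan(lenient["lifespan"]), lenient["end_of_time"])

# Strict behavior: refuse the document instead
try:
    json.loads(text, parse_constant=reject_constant)
    strict_error = None
except ValueError as err:
    strict_error = str(err)
print(strict_error)
```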


## Same Values, Different Object

Serialization and deserialization will create an *equivalent* object, but not an identical object.  It should not be confused with a shared memory or concurrency mechanism (but serialization is a building block for *some* concurrency models).

In [7]:
# Avoid the NaN issue
my_data = dict(name="David", likes_python=True, count=None,
               pets=['Astrophe', 'Kachina', 'Jackson', 'Rebel'])

jstr = json.dumps(my_data)
new_data = json.loads(jstr)

In [8]:
print("Equality:", new_data == my_data)
print("Identity:", new_data is my_data)

Equality: True
Identity: False


## Serializing JSON to Files

The API of the `json` module generally matches that of `pickle`.  Along with the `dumps()` and `loads()`, the `json` module also has `dump()` and `load()`.  In all of these, the final 's' is a very compact way of expressing the idea that the function consumes or produces *strings* rather than files.  That naming convention has an old history; most likely newer methods that did not require backward compatibility would use more obvious names.

In contrast to pickle format, which is *usually* used to save files with serialized objects, JSON is *usually* used to create an in-memory string to send over various wire protocols. 

In [11]:
with open('tmp/data.json', 'w') as fh:
    json.dump(my_data, fh)

In [14]:
%pycat tmp/data.json

{"name": "David", "likes_python": true, "count": null, "pets": ["Astrophe", "Kachina", "Jackson", "Rebel"]}


## Reading Objects from Files

Reading JSON from a file, or from another file-like object, is exactly symmetrical with writing it.  With Python's so-called duck typing, anything with a `.read()` method that produces a JSON document (as text or as UTF-8 bytes) can be passed to `json.load()`.  Symmetrically, any object with a `.write()` method accepting strings can be passed to `json.dump()`.  See examples in the previous lesson for use of several file-like objects. In this respect, the `pickle` and `json` functions work the same way, except that `pickle` writes bytes whereas `json` writes text.

In [15]:
json.load(open('tmp/data.json'))

{'name': 'David',
 'likes_python': True,
 'count': None,
 'pets': ['Astrophe', 'Kachina', 'Jackson', 'Rebel']}
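
As a brief sketch of the duck typing involved, in-memory buffers from the `io` module can stand in for files on disk. The `record` dictionary here is just sample data.

```python
import io
import json

record = {"name": "David", "likes_python": True, "count": None}

# json.dump() writes text, so a StringIO works as the "file"
text_buffer = io.StringIO()
json.dump(record, text_buffer)
text_buffer.seek(0)
from_text = json.load(text_buffer)

# json.load() also accepts a binary file-like object holding UTF-8 JSON
byte_buffer = io.BytesIO(text_buffer.getvalue().encode("utf-8"))
from_bytes = json.load(byte_buffer)

print(from_text == record, from_bytes == record)
```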

# JSON Limitations

Only basic Python collections and scalars can be directly represented in JSON; however, these collections *can* be nested indefinitely.  Specifically, JSON allows for dictionaries (called "objects" in the spec) and lists (called "arrays" in the spec); JSON does not have a way of representing tuples, sets, `collections.deque`, `collections.Counter`, NumPy arrays, or other collections you might use in Python.  The keys for JSON objects may only be strings, unlike Python dictionaries that can use any hashable object. For many purposes, casting another collection to a list suffices to transmit the data.
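
A short sketch of what these restrictions look like in practice, using made-up data: tuples and non-string keys are silently converted, while sets are refused until cast to a list.

```python
import json

original = {1: ("a", "b"), "tags": [3, 4]}
round_trip = json.loads(json.dumps(original))
print(round_trip)              # the int key and the tuple have changed type
print(round_trip == original)

# Sets are refused outright ...
try:
    json.dumps({"tags": {"x", "y"}})
    set_error = None
except TypeError as err:
    set_error = str(err)
print(set_error)

# ... but casting to a list is enough to transmit the data
as_list = json.dumps({"tags": sorted({"x", "y"})})
print(as_list)
```

Note that the first case does not raise any error, so an equality check after a round trip is a reasonable way to notice such changes.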

The scalars supported by JSON are exclusively: the three literal names `true`, `false`, and `null`, strings, and numbers.  Strings are surrounded by double quotes, and may contain escaped Unicode code points.  JSON itself only contains a generic "number" datatype.  By default, numbers without decimal points will be interpreted as Python ints.  Numbers with decimals will be interpreted as Python floats.  Python allows other number types, such as `decimal.Decimal`, `fractions.Fraction`, or NumPy values of specific bit lengths.
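
To make the default number handling concrete, here is a minimal sketch with arbitrary sample values.

```python
import json

values = json.loads('[42, 42.0, 4.2e1, 12345678901234567890123, 0.1]')
for value in values:
    print(type(value).__name__, repr(value))

type_names = [type(value).__name__ for value in values]

# Digits beyond what a float can hold are dropped by the default parser
truncated = json.loads('1.0000000000000000001')
print(truncated == 1.0)
```

Integers of any size survive intact because Python ints are unbounded; a number written with a decimal point or an exponent becomes a platform-native float.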

## Serialization Failures

Any custom classes, including ones that represent special scalars, will fail by default in JSON serialization.

In [16]:
timestamp = datetime.fromisoformat('2020-05-24T00:55:10')
try:
    json.dumps(timestamp)
except Exception as err:
    print(err)

Object of type datetime is not JSON serializable


In [17]:
decnum = Decimal('3.1415')
try:
    json.dumps(decnum)
except Exception as err:
    print(err)

Object of type Decimal is not JSON serializable


## Forcing Serialization

We can customize how special Python datatypes are serialized and deserialized.  This should be done with caution, however, because it can also impact interoperability with other systems.  This might mean systems in other programming languages, or it might simply be other Python machines without the same customizations.  First let us handle extra serialization.

In [18]:
class ScalarEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, (Decimal, Fraction)):
            return float(o)
        elif isinstance(o, datetime):
            return datetime.isoformat(o)
        else:
            return super().default(o)

### Semi-Generic Types

Let us encode some data using the custom encoder we developed.

In [19]:
nums = [timestamp, 42, decnum, pi, Fraction(22, 7)] 
jnums = json.dumps(nums, cls=ScalarEncoder)
pprint(jnums, width=55)

('["2020-05-24T00:55:10", 42, 3.1415, '
 '3.141592653589793, 3.142857142857143]')


This customization will not introduce much compatibility concern.  The same "number" can be represented in different systems.  However, notice that in the JSON representation absolutely nothing distinguishes the float, Decimal, and Fraction we started with as several approximations of the transcendental number pi.  The timestamp has simply become a string, but one that contains all the underlying information.

Reading back in this JSON serialization will work fine, but with all non-integral numbers as platform-native floats.  We can change the default deserialization type for floats and ints if we would like to. We impose just one type for each of float and int.

In [20]:
json.loads(jnums)

['2020-05-24T00:55:10', 42, 3.1415, 3.141592653589793, 3.142857142857143]

In [21]:
json.loads(jnums, parse_float=Decimal, parse_int=Fraction)

['2020-05-24T00:55:10',
 Fraction(42, 1),
 Decimal('3.1415'),
 Decimal('3.141592653589793'),
 Decimal('3.142857142857143')]
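
If the original types must survive a round trip, one possible approach is to tag the values on the way out and restore them with `object_hook` on the way in. The following is only a sketch: the tag names (`__decimal__`, `__fraction__`, `__datetime__`) and the helper functions are invented for illustration, and any other system reading this JSON would need to know the same convention.

```python
import json
from datetime import datetime
from decimal import Decimal
from fractions import Fraction

def encode_tagged(o):
    # Wrap each special scalar in a one-key object naming its type
    if isinstance(o, Decimal):
        return {"__decimal__": str(o)}
    if isinstance(o, Fraction):
        return {"__fraction__": [o.numerator, o.denominator]}
    if isinstance(o, datetime):
        return {"__datetime__": o.isoformat()}
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")

def decode_tagged(d):
    # Called for every decoded JSON object; untagged dicts pass through
    if "__decimal__" in d:
        return Decimal(d["__decimal__"])
    if "__fraction__" in d:
        return Fraction(*d["__fraction__"])
    if "__datetime__" in d:
        return datetime.fromisoformat(d["__datetime__"])
    return d

values = [datetime(2020, 5, 24, 0, 55, 10), Decimal("3.1415"), Fraction(22, 7)]
text = json.dumps(values, default=encode_tagged)
restored = json.loads(text, object_hook=decode_tagged)
print(text)
print(restored == values)
```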

## Customizing Serialization

For complex objects, the `.__dict__` of the object often serves as a reasonable proxy for "the interesting data" inside the object. We saw a definition of a custom encoder and could enhance it to deal with additional types that way. However, this is about the point where you want to worry more about the actual utility of your serialization, especially if you will transmit it to other systems (i.e. running different programming languages).  

In [22]:
class RobustEncoder(ScalarEncoder):
    def default(self, o):
        try:
            return super().default(o)
        except TypeError:
            return o.__dict__

Let us create a custom instance that has some "problem" nested data, and serialize it using this new encoder.

In [23]:
@dataclass
class TestData:
    description: str
    timestamp: datetime
    numbers: list

In [24]:
test_data = TestData(description="Pi approximations",
                     timestamp=timestamp,
                     numbers=[decnum, pi, Fraction(22, 7)])
pprint(str(test_data), width=56)

("TestData(description='Pi approximations', "
 'timestamp=datetime.datetime(2020, 5, 24, 0, 55, 10), '
 "numbers=[Decimal('3.1415'), 3.141592653589793, "
 'Fraction(22, 7)])')


At this point we are able to serialize to JSON a custom class, albeit without specifically maintaining any information about the class it belongs to, only the underlying data.

In [25]:
pprint(json.dumps(test_data, cls=RobustEncoder))

('{"description": "Pi approximations", "timestamp": "2020-05-24T00:55:10", '
 '"numbers": [3.1415, 3.141592653589793, 3.142857142857143]}')


The example used a Data Class, but that was only because of the compact form of its definition.  The same example would work for any custom class.
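
For Data Classes specifically, the standard library function `dataclasses.asdict()` offers another route to the underlying data. The sketch below uses a hypothetical `Reading` class shaped like `TestData`, and a small `encode_datetime` helper passed through the `default` argument rather than an encoder subclass.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime

@dataclass
class Reading:  # hypothetical class, similar in shape to TestData
    description: str
    timestamp: datetime
    numbers: list

def encode_datetime(o):
    if isinstance(o, datetime):
        return o.isoformat()
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")

reading = Reading("Pi approximations",
                  datetime(2020, 5, 24, 0, 55, 10),
                  [3.1415, 22 / 7])
as_json = json.dumps(asdict(reading), default=encode_datetime)
print(as_json)

# Rebuilding the instance is possible, but the timestamp comes back as a string
rebuilt = Reading(**json.loads(as_json))
print(type(rebuilt.timestamp).__name__)
```

As with the encoder above, nothing in the JSON records which class the data came from; the receiving code has to know that already.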

# JSON Pickles

If your concern for interoperability is low, and you only wish to exchange data between reasonably similarly configured Python systems (or only persist objects on the same system), the third-party module `jsonpickle` does this abstraction for you.  This achieves round-tripping, which is often useful.  Its capabilities and limitations are essentially identical to `pickle` itself.  However, the binary pickle format is considerably more compact than the JSON string format.

In [26]:
jpkl = jsonpickle.encode(test_data, indent=True)
new_data = jsonpickle.decode(jpkl)
pprint(str(new_data), width=56)

("TestData(description='Pi approximations', "
 'timestamp=datetime.datetime(2020, 5, 24, 0, 55, 10), '
 "numbers=[Decimal('3.1415'), 3.141592653589793, "
 'Fraction(22, 7)])')


The various nested datatypes are fully preserved, as well as the class they belong to.

## The Verbose Format

Above, I used the `indent=True` option to produce more human-readable (but somewhat larger) JSON output.  It only modifies semantically meaningless whitespace.  A similar `indent` option exists in the standard `json` module itself.  Let us look at what is contained in this specialized JSON format. We will view it in several parts.

In [27]:
lines = jpkl.splitlines()
print('\n'.join(lines[:14]))

{
 "py/object": "__main__.TestData",
 "description": "Pi approximations",
 "timestamp": {
  "py/object": "datetime.datetime",
  "__reduce__": [
   {
    "py/type": "datetime.datetime"
   },
   [
    "B+QFGAA3CgAAAA=="
   ]
  ]
 },


In [28]:
print('\n'.join(lines[14:28]))

 "numbers": [
  {
   "py/reduce": [
    {
     "py/type": "decimal.Decimal"
    },
    {
     "py/tuple": [
      "3.1415"
     ]
    }
   ]
  },
  3.141592653589793,


In [29]:
print('\n'.join(lines[28:]))

  {
   "py/reduce": [
    {
     "py/type": "fractions.Fraction"
    },
    {
     "py/tuple": [
      "22/7"
     ]
    }
   ]
  }
 ]
}


# Sharing JSON Among Languages

JavaScript Object Notation (JSON) is designed as a data interchange format.  Specifically, it is probably used most commonly for RESTful (Representational State Transfer) web services.  While those might run in Python, there are numerous other programming languages and frameworks they might use; notably JavaScript is a prominent option.  Every widely used modern programming language has libraries supporting JSON.

For this lesson, we utilize an example Node.js server that is licensed as GPL v.3.0, and can be installed from Rob Kendal's GitHub repository at https://github.com/bpk68/api-server-starter.  That repository is accompanied by an excellent introductory article that describes the steps of creating a simple Node.js webserver.  I have modified that code only in minor ways for this lesson.  I will show two snippets of the JavaScript code used for illustration, but the focus here is on talking to the server from Python, not learning JavaScript or Node.js.

Let us start out by loading a few Python standard library modules, and one widely-used third-party package, that this lesson will utilize.

In [32]:
import json
from http import HTTPStatus
import requests

In [31]:
%%bash
cp node-server/data/users-start.json node-server/data/users.json

## Making REST Requests

This lesson—and *microservices* very commonly—will consist of calling a webserver with a *payload* formatted as JSON, and receiving a response, also usually formatted as JSON.  This structure allows many servers to interact in a manner similar to function calls, with both computation and state distributed among the various servers.  An older approach to this same architecture was XMLRPC, which in fact has a current but legacy Python standard library module `xmlrpc` to support it.

The server in this lesson provides a simple key/value database of users.  All users must have a name and a password, but they may also optionally have other data associated with them.  This design is obviously terrible from a security perspective, since "passwords" are transmitted and stored without encryption (as is other data), but that concern is not for this lesson.

The third-party package `requests` is recommended for HTTP clients, even in the Python standard library documentation itself.  The standard library module `urllib.request` has a less intuitive API, but will perform the same tasks if the third-party package is not available.  In our server, we can query the data it contains by making a GET request to the endpoint `/users`.
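
As a rough sketch of the standard-library route, the following uses `urllib.request` against a tiny stub server started in the same process. The stub (`StubUsers`) and its single record are assumptions made so that the example is self-contained; they are not the lesson's Node.js server.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener

class StubUsers(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the lesson's server."""
    def do_GET(self):
        body = json.dumps({"1": {"name": "Guido van Rossum"}}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet

# Port 0 asks the operating system for any free port
server = HTTPServer(("127.0.0.1", 0), StubUsers)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/users"

# An empty ProxyHandler bypasses any proxy configured in the environment
opener = build_opener(ProxyHandler({}))
with opener.open(url) as response:
    status = response.status
    content_type = response.headers["Content-Type"]
    users = json.load(response)  # the response is a file-like object

server.shutdown()
server.server_close()
print(status, content_type, users)
```

Because the response object has a `.read()` method, it can be handed directly to `json.load()`.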

### You need to start a local Node.js server before executing the cells below. The instructions to start one are in `node-server/README.md`
**NOTE: If you run the server on a port other than 3001, please update the cells below with the port number you use.**

A GET request does not pass any JSON body data; in principle it could pass URL parameters to communicate data, but that style is not used in this lesson.

In [2]:
# The URL of the RESTful server
url = 'http://localhost:3001/users'

# A response to the HTTP request
response = requests.get(url) 

# Show status code and load JSON body
print(response.status_code)
print(response.headers['Content-Type'])
json.loads(response.text)

200
application/json; charset=utf-8


{'1': {'name': 'Guido van Rossum',
  'password': 'unladenswallow',
  'details': {'profession': 'ex-BDFL'}},
 '2': {'name': 'Brendan Eich',
  'password': 'nontransitiveequality',
  'details': {'profession': 'Mozillan'}},
 '3': {'name': 'Ken Thompson',
  'password': 'p/q2-q4!',
  'details': {'profession': 'Unix Creator'}}}

### Unsuccessful Requests

A well-behaved webserver will return a status code indicating the nature of the problem with a request. A very small support function will help us show the response details.

In [3]:
def phrase(response):
    for st in HTTPStatus:
        if st.value == response.status_code:
            return f"{st.value} {st.phrase}"
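
An alternative sketch of the same idea looks the code up directly in the `HTTPStatus` enumeration and handles codes the enumeration does not know. The name `phrase_for` is hypothetical, and it takes the bare status code rather than a response object.

```python
from http import HTTPStatus

def phrase_for(status_code):
    try:
        status = HTTPStatus(status_code)
    except ValueError:
        # Servers may send codes that are not in the enumeration
        return f"{status_code} (unrecognized status)"
    return f"{status.value} {status.phrase}"

print(phrase_for(200))
print(phrase_for(404))
print(phrase_for(999))
```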

Trying a resource that simply does not exist.

In [4]:
url2 = 'http://localhost:3001/nonesuch'
response = requests.get(url2) 
print(phrase(response))
try:
    json.loads(response.text)
except Exception as err:
    print(err)

404 Not Found
Expecting value: line 1 column 1 (char 0)


At times we might see a status code that is neither 200 nor 404.  The 404 response above had no body that could be parsed as JSON, but other status codes are likely to have a body that is encoded as plain text or in another manner.  The `Content-Type` header is the clue we can use to decide whether to JSON decode the body.

In [5]:
url3 = 'http://localhost:3001/disabled'
response = requests.get(url3) 
print(phrase(response))
print(response.headers['Content-Type'])
response.text

410 Gone
text/plain; charset=utf-8


'Resource has been disabled'
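
One way to act on the `Content-Type` header is sketched below. The helper `decode_body` is hypothetical; it works on the header value and body text, which with `requests` would be `response.headers['Content-Type']` and `response.text`.

```python
import json

def decode_body(content_type, text):
    # Drop parameters such as "; charset=utf-8" before comparing
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(text)
    return text

# Header values as seen in the responses earlier in this lesson
users = decode_body("application/json; charset=utf-8",
                    '{"1": {"name": "Guido van Rossum"}}')
message = decode_body("text/plain; charset=utf-8",
                      "Resource has been disabled")
print(users)
print(message)
```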

## Pushing JSON

The way this server is configured, the same endpoint behaves differently if it receives a POST request rather than a GET request.  With a POST, a new record is added to the database.

In [6]:
headers = {'content-type': 'application/json'}
user = {"name": "David Mertz",  
        "password": "badpassword", 
        "details": {
            "profession": "Data Scientist", 
            "publisher": "INE"},
        "lucky_numbers": [12, 42, 55, 87]
       }

response = requests.post(url, data=json.dumps(user), headers=headers)
print(phrase(response))
response.text

200 OK


'new user id:4 added'

Let us make sure the database has the contents we hope for.

In [7]:
response = requests.get(url) 
json.loads(response.text)

{'1': {'name': 'Guido van Rossum',
  'password': 'unladenswallow',
  'details': {'profession': 'ex-BDFL'}},
 '2': {'name': 'Brendan Eich',
  'password': 'nontransitiveequality',
  'details': {'profession': 'Mozillan'}},
 '3': {'name': 'Ken Thompson',
  'password': 'p/q2-q4!',
  'details': {'profession': 'Unix Creator'}},
 '4': {'name': 'David Mertz',
  'password': 'badpassword',
  'details': {'profession': 'Data Scientist', 'publisher': 'INE'},
  'lucky_numbers': [12, 42, 55, 87]}}

The server may validate a POST request (or any request) in some manner, and return an appropriate status based on the JSON passed to it.

In [9]:
anon = {"password": "P4cC!^*8chWz8", "profession": "Hacker"}
response = requests.post(url, data=json.dumps(anon), headers=headers)
print(phrase(response))
response.text

400 Bad Request


'User property "name" is required'

# What the Server is Doing

The Node.js server has a bit of scaffolding to implement a server.  A very similar webserver could be implemented in Python or any other programming language.  While you may not be familiar with JavaScript, the code below should not be difficult to understand in outline.  This is the code that handles a POST to the `/users` route.

```javascript
// CREATE
app.post('/users', (req, res) => {
    // validation
    if (! req.body.hasOwnProperty('name')) {
        res.status(400).send('User property "name" is required');
    }
    // add the new user
    else {
        readFile(data => {
            const newUserId = Object.keys(data).length + 1;
            data[newUserId.toString()] = req.body;
            writeFile(JSON.stringify(data, null, 2), () => {
                res.status(200).send(`new user id:${newUserId} added`);
            });
        },
            true);
    }
});
```

Although the data file that stores the database is itself simply JSON, the server explicitly parses it as JSON to assure the format.  Setting the header immediately before the call to `res.send()` is redundant because the server can detect the type from the JSON object; I added it to illustrate that we are able to explicitly set it.  Very similar APIs are present in Python webservers.

```javascript
// READ
app.get('/users', (req, res) => {
    fs.readFile(dataPath, 'utf8', (err, data) => {
        if (err) {
            throw err;
        }
        // framework detects JSON, but set explicitly
        res.setHeader('Content-Type', 'application/json');
        res.send(JSON.parse(data));
    });
});
```

# JSON Schema

The prior lesson demonstrated communicating between a RESTful web server and a client.  Recall that we sent HTTP POST messages with a JSON body to a server and received JSON responses from GET queries.  One thing that was not done in the example was any validation of the format of these messages.  Or rather, there was one element of ad-hoc validation in that the server required the field "name" to be present in a user record.

Using JSON Schema, we can more precisely specify all the elements that may be present in an acceptable JSON document, including which are required versus optional, and indicate datatypes and nesting of containers.  JSON Schema can contain varying levels of detail.  We will look at some possible schemata to define a valid user with varying degrees of specificity.

Let us start out by loading Python standard library modules and the third-party `jsonschema` module.  We also create JSON strings for several users to validate.

In [33]:
import json
from jsonschema import validate, ValidationError

In [36]:
guido = json.loads("""{
  "name": "Guido van Rossum",
  "password": "unladenswallow",
  "details": {
    "profession": "ex-BDFL"
  }
}""")

In [37]:
roberto = json.loads("""{
  "name": "Roberto Sanchez",
  "password": "badpassword",
  "details": {
    "profession": "Software Developer",
    "publisher": "INE"
  },
  "lucky_numbers": [12, 42, 55, 87]
}""")

In [38]:
intruder = json.loads("""{
  "password": "P4cC!^*8chWz8", 
  "profession": "Hacker"
}""")

# Validation

A JSON Schema is itself a JSON document following certain specifications.  At the simplest, it needs to specify a type for the JSON being validated. The module `jsonschema` expects Python objects as both `instance` and `schema` arguments.  If you are beginning with JSON—which is, after all, the point of using it—you need to use the `json` module to convert both to Python objects first.

The API the `jsonschema` module uses might be surprising.  It raises an exception on failure, but passes silently on success.  Let us look at a couple of examples.

## Checking Scalars

In [39]:
try:
    validate(instance=99, schema={"type": "number"})
    print("99 is a number")
except ValidationError as err:
    print(err)    

99 is a number


In [40]:
try:
    validate(99, {"type": "string"})
    print("99 is a string")
except ValidationError as err:
    print(err)

99 is not of type 'string'

Failed validating 'type' in schema:
    {'type': 'string'}

On instance:
    99


In [41]:
try:
    validate("99", {"type": "number"})
    print('"99" is a number')
except ValidationError as err:
    print(err)

'99' is not of type 'number'

Failed validating 'type' in schema:
    {'type': 'number'}

On instance:
    '99'


## A Test Function

I find it easier to wrap the exception raising API with a function that will return either the error description as a string or None as a sentinel for "no errors."

In [42]:
def not_valid(instance, schema):
    try:
        validate(instance, schema)
        return None
    except ValidationError as err:
        return str(err)

The following is the pattern we will use for the remaining examples.

In [43]:
# The "walrus operator" requires Python 3.8+
if msg := not_valid("Ooops", {"type": "array"}):
    print(msg)

'Ooops' is not of type 'array'

Failed validating 'type' in schema:
    {'type': 'array'}

On instance:
    'Ooops'


# Checking Users

The simple examples above do not check structured collections. All user JSON records are what JavaScript calls "objects" but Python calls dicts.   For a JSON object, we need to define both the type and the properties we expect it to have.  We may specify keys as required, but validation will not prohibit inclusion of "cargo" in keys we have not specified.  Very often this is exactly the desired behavior; JSON often carries extra information that might be used by other consumers, but a particular consumer only needs to assure the parts it cares about are present.

In [44]:
schema = json.loads("""{
  "type" : "object",
  "required": ["name"],
  "properties" : {
    "name" : {"type" : "string"}
    }
}""")

Validate standard users.

In [45]:
for user in [guido, roberto]:
    if msg := not_valid(user, schema):
        print(msg, "\n--------------------")
    else:
        print(f"User {user['name']} validates correctly")

User Guido van Rossum validates correctly
User Roberto Sanchez validates correctly


The schema in this first pass suffices to check the constraint the server in the prior lesson imposed.  In fact, it checks slightly more in guaranteeing that the field "name" is a string.
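
For comparison, a hand-written check covering the same ground as this first schema might look like the sketch below. The function `check_user` is hypothetical, and it only approximates what the schema expresses declaratively; it is not how `jsonschema` works internally.

```python
def check_user(user):
    """Return an error string, or None when the record is acceptable."""
    if not isinstance(user, dict):
        return "user must be an object"
    if "name" not in user:
        return "'name' is a required property"
    if not isinstance(user["name"], str):
        return f"{user['name']!r} is not of type 'string'"
    return None

print(check_user({"name": "Guido van Rossum", "password": "unladenswallow"}))
print(check_user({"password": "P4cC!^*8chWz8", "profession": "Hacker"}))
print(check_user({"name": 99}))
```

Every new rule means more imperative code of this kind, which is the maintenance burden a schema avoids.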

In [46]:
barbara_feldon = json.loads("""{
  "name": 99, 
  "details": {"profession": "CONTROL Agent"}
}""")

We have two not-quite-conformant user JSON documents to validate. Each fails in a different way.

In [47]:
for user in [barbara_feldon, intruder]:
    if msg := not_valid(user, schema):
        print(msg, "\n--------------------")
    else:
        print(f"User {user['name']} validates correctly")

99 is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    99 
--------------------
'name' is a required property

Failed validating 'required' in schema:
    {'properties': {'name': {'type': 'string'}},
     'required': ['name'],
     'type': 'object'}

On instance:
    {'password': 'P4cC!^*8chWz8', 'profession': 'Hacker'} 
--------------------


## Nested Structure

A JSON Schema allows specification of nested structures, including type and cardinality, and also may optionally contain a number of annotations to describe the schema itself.  Let us add a few. In the expanded schema, we will require a password along with a name.  Notice that we describe several aspects of what the field "lucky_numbers" might look like, but we do not make it required.  Guido has none, but Roberto does; both should validate.

In [48]:
schema = json.loads("""{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "http://example.com/user.schema.json",
  "title": "User",
  "description": "A User of Our Computer System",
  "type" : "object",
  "required": ["name", "password"],
  "properties" : {
     "name" : {"type" : "string"},
     "password": {
         "description": "Use special characters and mixed case",
         "type": "string"},
     "lucky_numbers": {
         "description": "Up to 6 favorite numbers 1-100",
         "type": "array",
         "items": {
           "type": "number",
           "minimum": 1,
           "maximum": 100
         },
         "uniqueItems": true,
         "minItems": 0,
         "maxItems": 6
    }
  }
}""")

Our existing users continue to validate without a problem.

In [50]:
for user in [guido, roberto]:
    if msg := not_valid(user, schema):
        print(msg, "\n--------------------")
    else:
        print(f"User {user['name']} validates correctly")

User Guido van Rossum validates correctly
User Roberto Sanchez validates correctly


There are a few ways that validation might fail with the expanded schema.  Obviously, "password" was added as a required field, but the pattern there is the same as with "name".  The field "lucky_numbers" has more going on.  It might be omitted altogether for a valid user, but if it is included, it can only be an array (Python list) of numbers between 1 and 100; moreover, it can only have from zero to six numbers that must be distinct.
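
To spell out what those constraints amount to, here is a hand-written approximation in plain Python. The function `lucky_number_problems` is hypothetical and is not how `jsonschema` is implemented; unlike the `validate()` calls in this lesson, which report one error, it collects every problem it finds.

```python
def lucky_number_problems(numbers):
    """Return a list of problems; an empty list means acceptable."""
    if not isinstance(numbers, list):
        return ["not an array"]
    problems = []
    if len(numbers) > 6:
        problems.append("too long")
    seen = []
    for item in numbers:
        # bool is a subclass of int in Python, but JSON true/false
        # are not numbers
        if isinstance(item, bool) or not isinstance(item, (int, float)):
            problems.append(f"{item!r} is not a number")
        elif not 1 <= item <= 100:
            problems.append(f"{item!r} is outside 1-100")
        if item in seen:
            problems.append(f"{item!r} is repeated")
        seen.append(item)
    return problems

print(lucky_number_problems([12, 42, 55, 87]))
print(lucky_number_problems([9, 9, 9]))
print(lucky_number_problems([1000000, 200000]))
```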

In [51]:
the_count = json.loads("""{
  "name": "Count von Count",
  "password": "fourbananas",
  "lucky_numbers": ["one", "two", "three"]
}""")

if msg := not_valid(the_count, schema):
    print(msg, "\n--------------------")
else:
    print(f"User {the_count['name']} validates correctly")

'one' is not of type 'number'

Failed validating 'type' in schema['properties']['lucky_numbers']['items']:
    {'maximum': 100, 'minimum': 1, 'type': 'number'}

On instance['lucky_numbers'][0]:
    'one' 
--------------------


In [52]:
cantor = json.loads("""{
  "name": "Georg Cantor",
  "password": "omega_aleph",
  "lucky_numbers": [1, 2, 3, 4, 5, 6, 7, 8]
}""")

if msg := not_valid(cantor, schema):
    print(msg, "\n--------------------")
else:
    print(f"User {cantor['name']} validates correctly")

[1, 2, 3, 4, 5, 6, 7, 8] is too long

Failed validating 'maxItems' in schema['properties']['lucky_numbers']:
    {'description': 'Up to 6 favorite numbers 1-100',
     'items': {'maximum': 100, 'minimum': 1, 'type': 'number'},
     'maxItems': 6,
     'minItems': 0,
     'type': 'array',
     'uniqueItems': True}

On instance['lucky_numbers']:
    [1, 2, 3, 4, 5, 6, 7, 8] 
--------------------


In [53]:
revolution_9 = json.loads("""{
  "name": "Yoko Ono",
  "password": "grapefruit",
  "lucky_numbers": [9, 9, 9]
}""")

if msg := not_valid(revolution_9, schema):
    print(msg, "\n--------------------")
else:
    print(f"User {revolution_9['name']} validates correctly")

[9, 9, 9] has non-unique elements

Failed validating 'uniqueItems' in schema['properties']['lucky_numbers']:
    {'description': 'Up to 6 favorite numbers 1-100',
     'items': {'maximum': 100, 'minimum': 1, 'type': 'number'},
     'maxItems': 6,
     'minItems': 0,
     'type': 'array',
     'uniqueItems': True}

On instance['lucky_numbers']:
    [9, 9, 9] 
--------------------


In [54]:
go_big = json.loads("""{
  "name": "Leslie Knope",
  "password": "ilovepawnee",
  "lucky_numbers": [1000000, 200000]
}""")

if msg := not_valid(go_big, schema):
    print(msg, "\n--------------------")
else:
    print(f"User {go_big['name']} validates correctly")

1000000 is greater than the maximum of 100

Failed validating 'maximum' in schema['properties']['lucky_numbers']['items']:
    {'maximum': 100, 'minimum': 1, 'type': 'number'}

On instance['lucky_numbers'][0]:
    1000000 
--------------------


-------------
Materials licensed under [CC BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/) by the authors