### Schemas And Instance Validation

#### Why Validate?

> * a schema is a formal description of the structure of a dataset
> * the dominant schema standard for JSON is JSON Schema: https://json-schema.org
> * we define schemas in order to validate documents 
> * schemas are critical in implementing data quality and governance processes 

In [None]:
# document/object/instance
{
   "username": "coding_ninja",
   "email": "ninja@example.com",
   "age": 25
}

In [None]:
# schema
{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "username": { "type": "string" },
        "email": { "type": "string", "format": "email" },
        "age": { "type": "integer" }
    },
    "required": ["username", "email", "age"]
}
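For contrast, here is a hypothetical instance (not part of the original material) that the schema above would reject. It is only a sketch to show what validation catches: the required "email" property is missing, and "age" is a string rather than an integer.

```json
{
    "username": "coding_ninja",
    "age": "25"
}
```

Note that under draft 2020-12 the "format" keyword is an annotation by default, so whether a malformed email address is rejected depends on whether the validator has format checking enabled.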

#### Schema Construction

> * Let's interactively define a JSON schema: https://www.jsonschemavalidator.net/

In [None]:
{
    "productId": 12345,
    "title": "Super Widget Pro",
    "price": {
        "amount": 19.99,
        "currency": "USD"
    },
    "inStock": true,
    "categories": ["electronics", "gadgets"],
    "reviews": [
        { "rating": 4.5, "comment": "Great product!" },
        { "rating": 3, "comment": "It's okay." }
    ]
}
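One possible schema for the product document above is sketched below. It assumes every property is required and that "rating" may be any number; both are illustrative choices rather than requirements stated in the material.

```json
{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "productId": { "type": "integer" },
        "title": { "type": "string" },
        "price": {
            "type": "object",
            "properties": {
                "amount": { "type": "number" },
                "currency": { "type": "string" }
            },
            "required": ["amount", "currency"]
        },
        "inStock": { "type": "boolean" },
        "categories": {
            "type": "array",
            "items": { "type": "string" }
        },
        "reviews": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "rating": { "type": "number" },
                    "comment": { "type": "string" }
                },
                "required": ["rating", "comment"]
            }
        }
    },
    "required": ["productId", "title", "price", "inStock", "categories", "reviews"]
}
```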

#### More Schema Definitions 

#### Subschemas And Remote References: $ref & $defs

> * $ref enables schema modularization and reusability
> * $defs is the conventional named section for holding definitions in a schema
> * remote review definition: https://www.andybek.com/api/data/review-schema
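A minimal sketch of a local definition follows, assuming a review has the same shape as in the product example (a numeric "rating" and a string "comment"). The definition name "review" is an illustrative choice.

```json
{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "reviews": {
            "type": "array",
            "items": { "$ref": "#/$defs/review" }
        }
    },
    "$defs": {
        "review": {
            "type": "object",
            "properties": {
                "rating": { "type": "number" },
                "comment": { "type": "string" }
            },
            "required": ["rating", "comment"]
        }
    }
}
```

To use the remote definition instead, the "$ref" value would be the URL listed above rather than the "#/$defs/review" pointer. The contents of that remote schema are not reproduced here, and the validator must be able to retrieve it or have it preloaded.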

#### Applicators And Advanced Techniques

> * applicators (e.g. properties, items, allOf, anyOf, oneOf, not, if/then/else) apply subschemas to specific parts of an instance
> * they can be used to express highly specific and conditional validation rules

In [None]:
# Define a new property called purchaserContact that could either be an email address or a phone number.

In [None]:
"purchaserContact": {"type": "string"}

# validates (too permissive, any string is accepted):
    # "purchaserContact": "hey@andybek.com"
    # "purchaserContact": "hello"

In [None]:
"purchaserContactEmail": { "type": "string", "format": "email" },
"purchaserContactPhone": { "type": "string", "pattern": "^[0-9]{10}$" }

# validates:
    # "purchaserContactEmail": "hey@andybek.com"
    # "purchaserContactPhone": "6475274486"

# it would NOT validate:
    # "purchaserContactPhone": "hello"
    # "purchaserContactEmail": "hello"   (when the validator enforces "format")
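The attempts above can be combined into a single purchaserContact property with an applicator. The sketch below uses anyOf so that the value must match at least one of the two subschemas. Because the two alternatives cannot both match the same string, oneOf would behave the same way here.

```json
{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "purchaserContact": {
            "anyOf": [
                { "type": "string", "format": "email" },
                { "type": "string", "pattern": "^[0-9]{10}$" }
            ]
        }
    }
}
```

With a validator that enforces "format", this accepts "hey@andybek.com" and "6475274486" but rejects "hello". If format checking is disabled, the email branch accepts any string, so the rejection depends on the validator's configuration.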

#### Skill Challenge - Defining A Polymorphic JSON Schema

> * Define a schema that restrictively validates the following JSON document:
> * https://www.andybek.com/api/data/contentItems

* HINTS:
> * Focus on the "type" keyword (with values such as "array" and "object") and on "enum"
> * Consider using the $defs keyword to organize your schema.
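> * For "image/jpeg" content, "contentEncoding": "base64" declares that the string holds base64-encoded image data; note that many validators treat it as an annotation and do not check the encoding. For example: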

In [None]:
{
    "type": "string",
    "contentEncoding": "base64"
}

#### Solution