# Data validation (Part 1)

This is a quick example of how to use linkml-validator to validate an object against a given LinkML schema.

## Schema

First, define the schema as a YAML string:

In [1]:
schema = """

id: https://w3id.org/Example-Schema
name: Example-Schema
description: >-
  An Example Schema
version: 0.0.0
imports:
  - linkml:types

prefixes:
  linkml: https://w3id.org/linkml/
  example: https://w3id.org/example/

default_prefix: example

classes:
  named thing:
    slots:
      - id
      - name
      - type

slots:
  id:
    required: true

  name:
    range: string

  type:
    range: type_enum

enums:
  type_enum:
    permissible_values:
      X:
      Y:
      Z:

"""

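To make the constraints in this schema concrete, here is a hand-written sketch of the same rules in plain Python: `id` is required, `name` must be a string, and `type` must be one of the permissible values of `type_enum`. The names `ALLOWED_TYPES` and `check_named_thing` are hypothetical and exist only for illustration; this is not how linkml-validator evaluates a schema, and the messages are not the ones it produces.

```python
# Mirrors the permissible_values of type_enum in the schema above.
ALLOWED_TYPES = {"X", "Y", "Z"}


def check_named_thing(obj: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problems."""
    problems = []
    # slots.id has `required: true`
    if "id" not in obj:
        problems.append("'id' is required")
    # slots.name has `range: string` (and is optional)
    if "name" in obj and not isinstance(obj["name"], str):
        problems.append("'name' must be a string")
    # slots.type has `range: type_enum` (and is optional)
    if "type" in obj and obj["type"] not in ALLOWED_TYPES:
        problems.append(f"'type' must be one of {sorted(ALLOWED_TYPES)}")
    return problems


print(check_named_thing({"id": "obj1", "name": "Object 1", "type": "X"}))
print(check_named_thing({"name": "Object 3", "type": "Y"}))
```

The point of using a schema with a validator is that you do not have to write or maintain checks like these by hand.
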
## Data as an object

Then define the data as a JSON-like object (a Python dict):

In [2]:
data = {
    "id": "obj1",
    "name": "Object 1",
    "type": "X"
}

Now, you can instantiate the Validator with the defined schema:

In [3]:
from linkml_validator.validator import Validator

validator = Validator(schema=schema)

And then run the `validate` method to validate the defined object:

In [4]:
report = validator.validate(obj=data, target_class='NamedThing')
print(f"Object valid: {report.valid}")

Object valid: True


## Data as a list of objects

If your data is a list of objects:

In [5]:
data = [
    {
        "id": "obj1",
        "name": "Object 1",
        "type": "X"
    },
    {
        "id": "obj2",
        "name": "Object 2",
        "type": "Y"
    }
]

You can call the `validate` method on each object in the list:

In [6]:
for obj in data:
    report = validator.validate(obj=obj, target_class='NamedThing')
    print(f"Object valid: {report.valid}")

Object valid: True
Object valid: True
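
If you want a summary instead of one printed line per object, the loop above can be wrapped in a small helper. The sketch below is an assumption and not part of linkml-validator: `validate_all` is a hypothetical name, and it accepts any callable that returns an object with a `valid` attribute, so something like `lambda obj: validator.validate(obj=obj, target_class='NamedThing')` should fit. A stub stands in for the real validator so the snippet runs on its own.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable


def validate_all(objects: Iterable[dict], validate_fn: Callable[[dict], Any]) -> list:
    """Run validate_fn on each object and pair the object with its `valid` flag."""
    return [(obj, bool(validate_fn(obj).valid)) for obj in objects]


# Hypothetical stand-in for validator.validate: it only checks that 'id' is present.
def stub_validate(obj: dict) -> SimpleNamespace:
    return SimpleNamespace(valid="id" in obj)


sample = [{"id": "obj1", "type": "X"}, {"type": "Y"}]
outcomes = validate_all(sample, stub_validate)
n_valid = sum(1 for _, ok in outcomes if ok)
print(f"{n_valid} of {len(outcomes)} objects valid")
```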


## Validating invalid data

Let's assume we have a list of objects, two of which violate the schema:

In [7]:
data = [
    {
        "id": "obj1",
        "name": "Object 1",
        "type": "X"
    },
    {
        "id": "obj2",
        "name": "Object 2",
        "type": "Y"
    },
    {
        "name": "Object 3", # <-- Missing 'id' field
        "type": "Y"
    },
    {
        "id": "obj4",
        "name": "Object 4",
        "type": "ABC" # <-- Invalid enum value for 'type'
    }
]

When we run the validation on all the objects in the list, the last two objects should be reported as invalid:

In [8]:
for obj in data:
    report = validator.validate(obj=obj, target_class='NamedThing')
    print(f"Object valid: {report.valid}")

Object valid: True
Object valid: True
Object valid: False
Object valid: False


But why? To find out, print the messages attached to each failed report:

In [9]:
for obj in data:
    report = validator.validate(obj=obj, target_class='NamedThing')
    print(f"Object valid: {report.valid}")
    if not report.valid:
        for result in report.validation_results:
            for message in result.validation_messages:
                print(f"[{result.plugin_name}] {message.message} for {report.object}")

Object valid: True
Object valid: True
Object valid: False
[JsonSchemaValidationPlugin] 'id' is a required property for {'name': 'Object 3', 'type': 'Y'}
Object valid: False
[JsonSchemaValidationPlugin] 'ABC' is not one of ['X', 'Y', 'Z'] for {'id': 'obj4', 'name': 'Object 4', 'type': 'ABC'}
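
The nested loop above can also be folded into a helper that returns the messages rather than printing them, which is handy for logging or tests. This is a minimal sketch under stated assumptions: `collect_errors` is a hypothetical name, and the `Message`, `Result`, and `Report` dataclasses are stand-ins that mirror only the attributes used in the loop above (`valid`, `object`, `validation_results`, `plugin_name`, `validation_messages`, `message`). They are not the real linkml-validator classes; a report returned by `validator.validate` should work in their place as long as it exposes those attributes.

```python
from dataclasses import dataclass, field
from typing import Any, List


# Stand-in classes for illustration only; NOT the linkml-validator report types.
@dataclass
class Message:
    message: str


@dataclass
class Result:
    plugin_name: str
    validation_messages: List[Message] = field(default_factory=list)


@dataclass
class Report:
    object: Any
    valid: bool
    validation_results: List[Result] = field(default_factory=list)


def collect_errors(report) -> List[str]:
    """Flatten a report into '[plugin] message' strings; empty list if valid."""
    if report.valid:
        return []
    return [
        f"[{result.plugin_name}] {msg.message}"
        for result in report.validation_results
        for msg in result.validation_messages
    ]


# A hand-built report resembling the failure for 'Object 3' shown above.
bad = Report(
    object={"name": "Object 3", "type": "Y"},
    valid=False,
    validation_results=[
        Result("JsonSchemaValidationPlugin", [Message("'id' is a required property")])
    ],
)
errors = collect_errors(bad)
print(errors)
```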
