# Data validation (Part 2)

Let us consider a more realistic scenario: validating our data against the Biolink Model.


## Schema

We will use Biolink Model v3.1.1, referenced by its tagged URL so that the schema stays fixed between runs.

In [1]:
schema_url = "https://raw.githubusercontent.com/biolink/biolink-model/v3.1.1/biolink-model.yaml"

## Data

Our data is a list of four Gene objects. The last one is missing its `category` field:

In [2]:
data = [
    {
        "category": [
            "biolink:Gene"
        ],
        "id": "HGNC:10848",
        "name": "SHH (human)",
        "provided_by": [
            "graph_nodes.tsv"
        ],
        "taxon": "NCBITaxon:9606"
    },
    {
        "category": [
            "biolink:Gene"
        ],
        "id": "NCBIGene:6469",
        "name": "SHH",
        "provided_by": [
            "graph_nodes.tsv"
        ],
        "taxon": "NCBITaxon:9606"
    },
    {
        "category": [
            "biolink:Gene"
        ],
        "id": "HGNC:9398",
        "name": "OLIG2",
        "provided_by": [
            "graph_nodes.tsv"
        ],
        "taxon": "NCBITaxon:9606"
    },
    {
        "id": "HGNC:9399",  # <-- 'category' is missing from this object
        "name": "PRKCD",
        "provided_by": [
            "graph_nodes.tsv"
        ],
        "taxon": "NCBITaxon:9606"
    }
]
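
Before running the schema validator, a quick plain-Python check can show which records lack a given key. The following is only a sketch: `find_missing_keys` is a hypothetical helper, and the `required` tuple is an assumption made for illustration, not something read from the Biolink Model. It does not replace schema validation.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def find_missing_keys(
    records: Iterable[Dict],
    required: Sequence[str] = ("id", "category"),  # assumed keys, for illustration only
) -> List[Tuple[int, List[str]]]:
    """Return (index, missing_keys) pairs for records lacking any required key."""
    problems = []
    for index, record in enumerate(records):
        missing = [key for key in required if key not in record]
        if missing:
            problems.append((index, missing))
    return problems


# A two-record sample shaped like the data above.
sample = [
    {"id": "HGNC:9398", "category": ["biolink:Gene"], "name": "OLIG2"},
    {"id": "HGNC:9399", "name": "PRKCD"},
]
print(find_missing_keys(sample))  # [(1, ['category'])]
```

Calling `find_missing_keys(data)` on the full list would flag the same record by its position in the list.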

## Validate data against the schema

First, we instantiate the `Validator` with the URL of the Biolink Model YAML:

In [3]:
from linkml_validator.validator import Validator

validator = Validator(schema=schema_url)

Then we validate each object against the `Gene` class of the Biolink Model and print any validation messages:

In [4]:
# Validate each object as a Biolink `Gene` and report any failures.
for obj in data:
    report = validator.validate(obj=obj, target_class="Gene")
    print(f"Object valid: {report.valid}")
    if not report.valid:
        # Results are grouped by plugin; each result may carry several messages.
        for result in report.validation_results:
            for message in result.validation_messages:
                print(f"[{result.plugin_name}] {message.message} for {report.object}")

Object valid: True
Object valid: True
Object valid: True
Object valid: False
[JsonSchemaValidationPlugin] 'category' is a required property for {'id': 'HGNC:9399', 'name': 'PRKCD', 'provided_by': ['graph_nodes.tsv'], 'taxon': 'NCBITaxon:9606'}
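
For larger datasets it may be more convenient to collect the failures than to print them one by one. The sketch below assumes only that a report exposes the attributes used in the loop above (`valid`, `validation_results`, `plugin_name`, `validation_messages`, `message`). `collect_messages` and `fake_validate` are hypothetical names. `fake_validate` is a stand-in for the real validator so that the example runs without downloading the schema.

```python
from types import SimpleNamespace
from typing import Callable, Dict, Iterable, List


def collect_messages(validate: Callable, records: Iterable[dict]) -> Dict[int, List[str]]:
    """Map the index of each invalid record to its '[plugin] message' strings."""
    failures = {}
    for index, record in enumerate(records):
        report = validate(record)
        if report.valid:
            continue
        failures[index] = [
            f"[{result.plugin_name}] {message.message}"
            for result in report.validation_results
            for message in result.validation_messages
        ]
    return failures


def fake_validate(record: dict) -> SimpleNamespace:
    """Hypothetical stand-in that mimics the report shape used above."""
    if "category" in record:
        return SimpleNamespace(valid=True, validation_results=[])
    message = SimpleNamespace(message="'category' is a required property")
    result = SimpleNamespace(
        plugin_name="JsonSchemaValidationPlugin",
        validation_messages=[message],
    )
    return SimpleNamespace(valid=False, validation_results=[result])


records = [
    {"id": "HGNC:9398", "category": ["biolink:Gene"]},
    {"id": "HGNC:9399"},
]
failures = collect_messages(fake_validate, records)
print(failures)
```

With the real validator, the stand-in could be replaced by `lambda obj: validator.validate(obj=obj, target_class="Gene")`, after which `collect_messages` would be called on `data`.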
