# Using Schemas in Kosh

This notebook shows how to use schemas in Kosh to validate your metadata.


In [1]:
import kosh
import os

kosh_example_sql_file = "kosh_schemas_example.sql"

# Create and open a new store (erase if exists)
store = kosh.create_new_db(kosh_example_sql_file)
# create a dataset
ds = store.create()

Let's create a schema to validate our metadata.
A schema object takes two dictionaries as input:
one for the required attributes and one for the optional attributes.

For each attribute we need to provide validation functions or valid values:
 - If the validation is a callable, it is applied to the attribute's value and must return True.
 - If the validation is an instance of `type`, the attribute's value must be an instance of that type.
 - Otherwise, the value must match the validation.

A single attribute can also have several possible validations: define them as a list in the dictionary. If any one of them passes, the attribute is considered valid.
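
The rules above can be sketched in plain Python. This is only an illustration, not Kosh's actual implementation: `check_value` is a hypothetical helper, treating `None` as "any value" follows the `{"must": None}` example in this notebook, and testing for types before callables is an assumption made here because classes are themselves callable.

```python
def check_value(value, validation):
    """Hypothetical helper (not part of Kosh): apply one validation rule."""
    # A list means the value is valid if ANY entry accepts it
    if isinstance(validation, (list, tuple)):
        return any(check_value(value, entry) for entry in validation)
    # None places no constraint on the value
    if validation is None:
        return True
    # A type: the value must be an instance of it
    # (checked before callables, since classes are callable too)
    if isinstance(validation, type):
        return isinstance(value, validation)
    # A callable: its result decides (any truthy result counts as a pass here)
    if callable(validation):
        return bool(validation(value))
    # Anything else: the value must equal the validation
    return value == validation


print(check_value(5, int))
print(check_value("b", [1, "yes"]))
```

With these rules, `5` passes the `int` check, while `"b"` matches neither `1` nor `"yes"` and is rejected.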
 

Let's create a validation schema that requires our datasets to have the attribute "must" with any value, and allows an optional attribute "maybe" that must be either 1 or "yes".

In [2]:
required = {"must": None}
optional = {"maybe": [1, "yes"]}
schema = kosh.KoshSchema(required, optional)

Our current (blank) dataset will not validate. We can check this as follows:

In [3]:
try:
    schema.validate(ds)
except ValueError as err:
    print("As expected, we failed to validate with error:", err)

As expected, we failed to validate with error: Could not validate 1434ee59f0fe48d1b7109e21d7b86225
1 required attribute errors: {'must': AttributeError('Object 1434ee59f0fe48d1b7109e21d7b86225 does not have must attribute',)}
0 optional attributes errors: {}


In [4]:
# Let's add the attribute 
ds.must = "I have must"
# Validation now passes
schema.validate(ds)

True

Now let's require `must` to be an integer:

In [5]:
required = {"must": int}
optional = {"maybe": [1, "yes"]}
schema = kosh.KoshSchema(required, optional)
# it does not validate anymore
try:
    schema.validate(ds)
except ValueError as err:
    print("As expected, it now fails to validate with error:", err)

As expected, it now fails to validate with error: Could not validate 1434ee59f0fe48d1b7109e21d7b86225
1 required attribute errors: {'must': ValueError('value I have must failed validation',)}
0 optional attributes errors: {}


In [6]:
# Let's fix this
ds.must = 5
# It now validates
schema.validate(ds)

True

In [7]:
# Note that any extra attribute is ok but will not be checked for validation
ds.any = "hi"
schema.validate(ds)

True

In [8]:
# We can now enforce this schema subsequently
ds.schema = schema

In [9]:
# Now we cannot set `must` to a bad value
try:
    ds.must = 7.6
except ValueError as err:
    print("Failed to set attribute as it did not validate (must be int). Error:", err)

Failed to set attribute as it did not validate (must be int). Error: value 7.6 failed validation


In [10]:
# Still at 5
ds.must

5
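
One way to picture this "validate before storing" behavior is a small class that checks values in `__setattr__`. The sketch below is a stand-alone illustration under that assumption, not how Kosh implements it; `ValidatedRecord` and its `rules` argument are hypothetical names.

```python
class ValidatedRecord:
    """Hypothetical stand-in for a dataset with an enforced schema."""

    def __init__(self, rules):
        # Store the rules without going through our own __setattr__
        object.__setattr__(self, "_rules", rules)

    def __setattr__(self, name, value):
        expected_type = self._rules.get(name)
        # Validate first; the value is stored only if the check passes
        if expected_type is not None and not isinstance(value, expected_type):
            raise ValueError(f"value {value!r} failed validation")
        object.__setattr__(self, name, value)


record = ValidatedRecord({"must": int})
record.must = 5
try:
    record.must = 7.6
except ValueError as err:
    print("Rejected:", err)
print(record.must)
```

Because the check runs before the assignment, a rejected value never replaces the previous one, which is why the attribute above is still 5.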

In [11]:
# Similarly, optional attributes must validate
try:
    ds.maybe = "b"
except ValueError as err:
    print("Optional attributes must validate as well. Error:", err)

Optional attributes must validate as well. Error: Could not validate value 'b'


In [12]:
ds.maybe = "yes"
ds.maybe = 1

Sometimes we need more complex validation. Let's create a simple validation function:

In [13]:
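def isYes(value):
    # Strings pass if they start with "y" or "Y" (an empty string fails)
    if isinstance(value, str):
        return value.lower().startswith("y")
    # Integers pass only if equal to 1
    elif isinstance(value, int):
        return value == 1
    # Any other type fails
    return False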
    
required = {"must": int}
optional = {"maybe": isYes}
schema = kosh.KoshSchema(required, optional)

ds.schema = schema
ds.maybe = "y"
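
Since a validator is an ordinary Python function, it can be tried on sample values before it is attached to a schema. The sketch below uses a stand-alone copy named `is_yes` (a hypothetical name chosen for this example) so that it runs without a Kosh store.

```python
def is_yes(value):
    """Stand-alone copy of the validator idea, for experimenting."""
    # Strings pass if they start with "y" or "Y"
    if isinstance(value, str):
        return value.lower().startswith("y")
    # Integers pass only if equal to 1
    if isinstance(value, int):
        return value == 1
    # Any other type fails
    return False


# Collect the verdict for a few candidate values
results = {repr(v): is_yes(v) for v in ["y", "Yes", 1, "no", 0, 2.5, ""]}
for value, verdict in results.items():
    print(value, "->", verdict)
```

Checking edge cases such as the empty string or a float this way shows how the validator behaves on unexpected input before the schema starts rejecting assignments.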

We can also pass a list of possible validations:


In [14]:
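def isNo(value):
    # Strings pass if they start with "n" or "N" (an empty string fails)
    if isinstance(value, str):
        return value.lower().startswith("n")
    # Integers pass only if equal to 0
    elif isinstance(value, int):
        return value == 0
    # Any other type fails
    return False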
    
required = {"must": int}
optional = {"maybe": [isYes, isNo, "oui"]}
schema = kosh.KoshSchema(required, optional)

ds.schema = schema
ds.maybe = "N"
ds.maybe = 'No'
ds.maybe = 'oui'
ds.maybe = 'Yes'