# Table Schema extension: conditional constraints

This notebook presents an example implementation of conditional constraints between fields.

## Example
The chosen example is:

| observationType | scientificName |
|-----------------|----------------|
| animal          | Vulpes vulpes  |
| tree            | null           |
| animal          | null           |

The constraint to check is:

    if the observationType is an animal, the scientificName must not be null

This conditional constraint applies to each row: it holds for the first two rows but not for the last.
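
As a minimal sketch, the rule can be written in plain Python without any schema library. The helper name `row_is_valid` is hypothetical, and the string `"null"` stands for a missing value, as in the table above.

```python
# Rows of the example table; the string "null" marks a missing value.
rows = [
    {"observationType": "animal", "scientificName": "Vulpes vulpes"},
    {"observationType": "tree", "scientificName": "null"},
    {"observationType": "animal", "scientificName": "null"},
]

def row_is_valid(row):
    """Return True when the row satisfies the conditional constraint."""
    if row["observationType"] == "animal":
        return row["scientificName"] != "null"
    # The constraint says nothing about non-animal rows.
    return True

results = [row_is_valid(row) for row in rows]
print(results)  # [True, True, False]
```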

## Proposal

JSON Schema offers two solutions:

- schema composition (keywords `allOf`, `anyOf` and `oneOf`)
- conditional schema (keywords `if`, `then`, `else`)

The Table Schema solution can therefore consist of applying JSON Schema rules to each row.

Note: Both JSON Schema solutions are equivalent (`if A then B` is equivalent to `B or not A`).
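
The equivalence stated in the note can be checked by enumerating the four combinations of truth values. This is only an illustrative sketch; the function names `implies` and `composed` are hypothetical.

```python
from itertools import product

def implies(a, b):
    """Conditional form: if a then b."""
    if a:
        return b
    return True

def composed(a, b):
    """Composition form: b or not a."""
    return b or not a

# Compare both forms over the four combinations of truth values.
table = [(a, b, implies(a, b), composed(a, b))
         for a, b in product([True, False], repeat=2)]
equivalent = all(cond == comp for _, _, cond, comp in table)
print(equivalent)  # True
```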


## Python example
The next cell applies the proposal (both equivalent options are included).

Note: JSON Schema uses the `properties` keyword to designate the data to check; the Table Schema descriptor below omits it. In this example, a missing value is represented by the string `"null"`.


In [1]:
from frictionless import Resource, Schema

animal = Resource(data=[['observationType', 'scientificName'], 
                        ['animal',  'Vulpes vulpes'], 
                        ['tree',  'null'],
                        ['animal',  'null']
                        ])
schema = { "fields": [
                {"name": "observationType", "type": "string"}, 
                {"name": "scientificName", "type": "string"}], 
           "anyOf": [ 
                {"observationType": { "not": { "const": "animal" }}},
                {"scientificName": { "not": {"const": "null"}}}],
           "if":
                {"observationType": { "const": "animal" }},
           "then":
                {"scientificName": { "not": {"const": "null"}}}
}
animal.schema = Schema.from_descriptor(schema)

## Implementation
A row is represented in Table Schema as a JSON object:

 ```json
    { "observationType": "animal", "scientificName": "Vulpes vulpes" }
 ```
  
The JSON Schemas applicable to the rows are:

 ```json
    {"anyOf": [ 
            {"properties": {"observationType": { "not": { "const": "animal" }}}},
            {"properties": {"scientificName": { "not": {"const": "null"}}}}]}
 ```            
and 

 ```json
    {"if":
           {"properties": {"observationType": { "const": "animal" }}},
     "then":
           {"properties": {"scientificName": { "not": {"const": "null"}}}}}
 ```          
The implementation converts the schema into a JSON Schema (by adding the `properties` keyword) and then applies this JSON Schema to each row.
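
The conversion step alone can be sketched as follows. The helper `to_json_schema` and the two key tuples are hypothetical names, not part of frictionless or jsonschema. The resulting dictionaries match the two JSON Schemas listed above and could be passed to `jsonschema.validate` together with a row.

```python
import copy

COMPOSITION_KEYS = ("allOf", "anyOf", "oneOf")   # values are lists of sub-schemas
CONDITIONAL_KEYS = ("if", "then", "else")        # values are single sub-schemas

def to_json_schema(descriptor):
    """Wrap each sub-schema of the descriptor in a `properties` keyword."""
    json_schema = {}
    for key, value in descriptor.items():
        if key in COMPOSITION_KEYS:
            json_schema[key] = [{"properties": copy.deepcopy(sub)} for sub in value]
        elif key in CONDITIONAL_KEYS:
            json_schema[key] = {"properties": copy.deepcopy(value)}
    return json_schema

any_of = to_json_schema({"anyOf": [
    {"observationType": {"not": {"const": "animal"}}},
    {"scientificName": {"not": {"const": "null"}}}]})
if_then = to_json_schema({
    "if": {"observationType": {"const": "animal"}},
    "then": {"scientificName": {"not": {"const": "null"}}}})
print(any_of)
print(if_then)
```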

In [2]:
import attrs
import frictionless
import jsonschema
from frictionless import Check, Row
from frictionless.errors import RowError

def validate(resource):
    """Validate a resource with one Composition check per group of custom keywords."""
    custom = resource.schema.custom
    # One check per schema-composition keyword found in the schema
    checks = [Composition({key: custom[key]})
              for key in custom if key in ['allOf', 'anyOf', 'oneOf']]
    # A single check groups the conditional keywords `if`, `then` and `else`
    if 'if' in custom:
        checks += [Composition({key: custom[key]
                                for key in custom
                                if key in ['if', 'then', 'else']})]
    return frictionless.validate(resource, checks=checks)
    
class CompositionError(RowError):
    title = None
    type = 'Composition'
    description = None

@attrs.define(kw_only=True, repr=False)
class Composition(Check):
    """Check a Composition of schemas"""

    Errors = [CompositionError]
    
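    def __init__(self, descriptor):
        super().__init__()
        # Wrap every sub-schema in `properties` so that it applies to a row object
        if len(descriptor) == 1:
            # Composition keyword (allOf, anyOf, oneOf): a list of sub-schemas
            cat = list(descriptor)[0]
            self.__composition = {cat:[{'properties':desc} for desc in descriptor[cat]]}
        else:
            # Conditional keywords (if, then, else): one sub-schema per keyword
            self.__composition = {cat:{'properties':descriptor[cat]} for cat in list(descriptor)}
        self.__descriptor = descriptor 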
        
    def validate_row(self, row: Row):        
        try:
            jsonschema.validate(row, self.__composition)
        except Exception:
            note = 'the row is not conform to schema : ' + str(self.__descriptor)
            yield CompositionError.from_row(row, note=note)

## Tests
The `validate` function detects two errors, both on the last row:

- one raised by the `anyOf` keyword,
- one raised by the `if` keyword.

In [3]:
validate(animal)

{'valid': False,
 'errors': [],
 'tasks': [{'name': 'memory',
            'type': 'table',
            'valid': False,
            'place': '<memory>',
            'labels': ['observationType', 'scientificName'],
            'stats': {'errors': 2,
                      'seconds': 0.029,
                      'fields': 2,
                      'rows': 3},
            'errors': [{'type': 'Composition',
                        'message': 'Row Error',
                        'tags': ['#table', '#row'],
                        'note': "the row is not conform to schema : {'anyOf': "
                                "[{'observationType': {'not': {'const': "
                                "'animal'}}}, {'scientificName': {'not': "
                                "{'const': 'null'}}}]}",
                        'cells': ['animal', 'null'],
                        'rowNumber': 4},
                       {'type': 'Composition',
                        'message': 'Row Error',
                     

The test with the correct values ("Vulpes velox" for the last row) does not detect any errors.

In [4]:
animal_2 = Resource(data=[['observationType', 'scientificName'], 
                          ['animal',  'Vulpes vulpes'], 
                          ['tree',  'null'],
                          ['animal',  'Vulpes velox']
                         ])
animal_2.schema = Schema.from_descriptor(schema)

validate(animal_2)

{'valid': True,
 'errors': [],
 'tasks': [{'name': 'memory',
            'type': 'table',
            'valid': True,
            'place': '<memory>',
            'labels': ['observationType', 'scientificName'],
            'stats': {'errors': 0,
                      'seconds': 0.017,
                      'fields': 2,
                      'rows': 3},
            'errors': []}]}