# Python API usage

The public `eido` package interface provides three functions:

- `validate_project` to validate the entire PEP
- `validate_sample` to validate only a selected sample
- `validate_config` to validate only the config part of the PEP

## Entire PEP validation

In [7]:
from eido import validate_project, validate_sample, validate_config
from peppy import Project

Within Python, the `validate_project` function validates the entire PEP. It requires a `peppy.Project` object and either a path to a YAML schema file or an already loaded schema (`dict`) as inputs.

In [8]:
p = Project("../tests/data/peps/test_cfg.yaml")
validate_project(project=p, schema="../tests/data/schemas/test_schema.yaml")

from eido.eido import _load_yaml
s = _load_yaml("../tests/data/schemas/test_schema.yaml")
validate_project(project=p, schema=s)

If validation succeeds, no message is printed. A failed validation raises a corresponding `jsonschema.exceptions.ValidationError`:

In [9]:
validate_project(project=p, schema="../tests/data/schemas/test_schema_invalid.yaml")

ValidationError: 'invalid' is a required property

Failed validating 'required' in schema:
    {'description': 'test PEP schema',
     'properties': {'_samples': {'items': {'properties': {'genome': {'type': 'string'},
                                                          'protocol': {'type': 'string'},
                                                          'sample_name': {'type': 'string'}},
                                           'type': 'object'},
                                 'type': 'array'},
                    'dcc': {'properties': {'compute_packages': {'type': 'object'}},
                            'type': 'object'},
                    'invalid': {'type': 'string'}},
     'required': ['dcc', '_samples', 'invalid']}

On instance:
    {'_main_index_cols': 'sample_name',
     '_sample_table':   sample_name protocol genome
    0  GSM1558746      GRO   hg38
    1  GSM1480327      PRO   hg38,
     '_samples': [{'derived_cols_done': [],
                   'genome': 'hg38',
                   'merged': False,
                   'merged_cols': PathExAttMap: {},
                   'name': 'GSM1558746',
                   'paths': Paths object.,
                   'protocol': 'GRO',
                   'required_paths': None,
                   'results_subdir': '/Users/mstolarczyk/Uczelnia/UVA/code/eido/tests/data/peps/test/results_pipeline',
                   'sample_name': 'GSM1558746',
                   'sheet_attributes': ['sample_name',
                                        'protocol',
                                        'genome'],
                   'yaml_file': None},
                  {'derived_cols_done': [],
                   'genome': 'hg38',
                   'merged': False,
                   'merged_cols': PathExAttMap: {},
                   'name': 'GSM1480327',
                   'paths': Paths object.,
                   'protocol': 'PRO',
                   'required_paths': None,
                   'results_subdir': '/Users/mstolarczyk/Uczelnia/UVA/code/eido/tests/data/peps/test/results_pipeline',
                   'sample_name': 'GSM1480327',
                   'sheet_attributes': ['sample_name',
                                        'protocol',
                                        'genome'],
                   'yaml_file': None}],
     '_sections': {'name', 'metadata', 'implied_attributes'},
     '_subproject': None,
     '_subs_index_cols': ('sample_name', 'subsample_name'),
     '_subsample_table': None,
     'config_file': '/Users/mstolarczyk/Uczelnia/UVA/code/eido/tests/data/peps/test_cfg.yaml',
     'constant_attributes': {},
     'data_sources': None,
     'dcc': {'_file_path': '/Users/mstolarczyk/Uczelnia/UVA/code/pepenv/uva_rivanna.yaml',
             '_ro': True,
             '_wait_time': 10,
             'compute': {'partition': 'standard',
                         'submission_command': 'sbatch',
                         'submission_template': '/Users/mstolarczyk/Uczelnia/UVA/code/pepenv/templates/slurm_template.sub'},
             'compute_packages': {'default': {'partition': 'standard',
                                              'submission_command': 'sbatch',
                                              'submission_template': 'templates/slurm_template.sub'},
                                  'largemem': {'partition': 'largemem',
                                               'submission_command': 'sbatch',
                                               'submission_template': 'templates/slurm_template.sub'},
                                  'local': {'submission_command': 'sh',
                                            'submission_template': 'templates/localhost_template.sub'},
                                  'parallel': {'partition': 'parallel',
                                               'submission_command': 'sbatch',
                                               'submission_template': 'templates/slurm_template.sub'},
                                  'sigterm': {'partition': 'standard',
                                              'submission_command': 'sbatch',
                                              'submission_template': 'templates/slurm_sig_template.sub'},
                                  'singularity_local': {'singularity_args': '-B '
                                                                            '/ext:/ext',
                                                        'submission_command': 'sh',
                                                        'submission_template': 'templates/localhost_singularity_template.sub'},
                                  'singularity_slurm': {'singularity_args': '-B '
                                                                            '/sfs/lustre:/sfs/lustre,/nm/t1:/nm/t1',
                                                        'submission_command': 'sbatch',
                                                        'submission_template': 'templates/slurm_singularity_template.sub'}},
             'config_file': '/Users/mstolarczyk/Uczelnia/UVA/code/pepenv/uva_rivanna.yaml'},
     'derived_attributes': ['data_source'],
     'file_checks': False,
     'implied_attributes': {'organism': {'Homo sapiens': {'genome': 'hg38'}}},
     'metadata': {'output_dir': '/Users/mstolarczyk/Uczelnia/UVA/code/eido/tests/data/peps/test',
                  'pipeline_interfaces': [],
                  'sample_table': '/Users/mstolarczyk/Uczelnia/UVA/code/eido/tests/data/peps/test_sample_table.csv'},
     'name': 'test',
     'permissive': True}
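
The first line of the error carries the essential information: the schema lists `invalid` under `required`, but the project has no such attribute. The snippet below is a simplified, standalone sketch of that top-level `required` check. It is not how `jsonschema` implements the keyword; the `missing_required` helper and the trimmed-down dictionaries are hypothetical and only mirror keys visible in the output above.

```python
def missing_required(schema, instance):
    """Return the required top-level keys that are absent from the instance."""
    return [key for key in schema.get("required", []) if key not in instance]


# Trimmed-down stand-ins for the schema and the project shown in the error above
schema = {
    "description": "test PEP schema",
    "properties": {
        "dcc": {"type": "object"},
        "_samples": {"type": "array"},
        "invalid": {"type": "string"},
    },
    "required": ["dcc", "_samples", "invalid"],
}
project_attributes = {"dcc": {}, "_samples": [], "name": "test"}

missing = missing_required(schema, project_attributes)
print(missing)  # ['invalid']
```

Adding an `invalid` attribute to the project, or dropping it from the schema's `required` list, would make this particular check pass.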

## Config validation

Similarly, the config part of the PEP can be validated; the function inputs remain the same:

In [10]:
validate_config(project=p, schema="../tests/data/schemas/test_schema.yaml")
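
One way to picture "the config part" is a schema with the sample-level requirements set aside. The sketch below illustrates that idea on a plain dictionary. It is an assumption made for illustration, not a description of how `validate_config` works internally; the `config_only_schema` helper is hypothetical, and the `_samples` key is taken from the schema shown in the error output above.

```python
from copy import deepcopy


def config_only_schema(schema, samples_key="_samples"):
    """Return a copy of the schema without the sample-level section."""
    trimmed = deepcopy(schema)  # leave the caller's schema untouched
    trimmed.get("properties", {}).pop(samples_key, None)
    if samples_key in trimmed.get("required", []):
        trimmed["required"] = [k for k in trimmed["required"] if k != samples_key]
    return trimmed


# Hypothetical, trimmed-down schema with one config-level and one sample-level entry
schema = {
    "properties": {
        "dcc": {"type": "object"},
        "_samples": {"type": "array"},
    },
    "required": ["dcc", "_samples"],
}

config_schema = config_only_schema(schema)
print(sorted(config_schema["properties"]))  # ['dcc']
print(config_schema["required"])  # ['dcc']
```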

## Sample validation

To validate a specific `peppy.Sample` object within a PEP, also specify the `sample_name` argument, which can be either the `peppy.Sample.name` attribute (`str`) or the ID of the sample (`int`):

In [11]:
validate_sample(project=p, schema="../tests/data/schemas/test_schema.yaml", sample_name=0)
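
The dual meaning of `sample_name` can be pictured as a lookup that branches on the argument type. The snippet below is an illustrative sketch only: `FakeSample` and `select_sample` are hypothetical stand-ins, not `peppy` or `eido` code, and it assumes the integer form refers to the sample's position in the project. The sample names are borrowed from the test PEP above.

```python
from dataclasses import dataclass


@dataclass
class FakeSample:
    """Hypothetical stand-in for a sample; only the name is modelled."""
    name: str


def select_sample(samples, sample_name):
    """Pick a sample by position (int) or by its name (str)."""
    if isinstance(sample_name, int):
        return samples[sample_name]
    for sample in samples:
        if sample.name == sample_name:
            return sample
    raise KeyError(f"No sample named {sample_name!r}")


samples = [FakeSample("GSM1558746"), FakeSample("GSM1480327")]

by_index = select_sample(samples, 0)
by_name = select_sample(samples, "GSM1480327")
print(by_index.name, by_name.name)  # GSM1558746 GSM1480327
```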

## Output details

As shown above, the error raised by the `jsonschema` package is very detailed because the entire validated PEP is printed for the user's reference. Since this can get overwhelming for multi-sample PEPs, each of the `eido` functions presented above provides a way to limit the output to just the general information indicating the unmet schema requirements: the `exclude_case` argument.

In [12]:
validate_project(project=p, schema="../tests/data/schemas/test_schema_invalid.yaml", exclude_case=True)

ValidationError: 'invalid' is a required property
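
When checking a PEP from a script, it can be convenient to turn the exception into a short status message instead of letting it propagate. Below is a minimal sketch of such a wrapper. `short_report` and `fake_validate` are hypothetical names, and the stub only imitates a failing validator so that the snippet runs without a PEP on disk.

```python
def short_report(validate, error_type=Exception, **kwargs):
    """Run a validation callable; return None on success or the first error line."""
    try:
        validate(**kwargs)
    except error_type as err:
        lines = str(err).splitlines()
        return lines[0] if lines else type(err).__name__
    return None


def fake_validate(project, schema):
    """Hypothetical stub that fails when the 'invalid' attribute is missing."""
    if "invalid" not in project:
        raise ValueError("'invalid' is a required property\n\nFailed validating ...")


failed = short_report(fake_validate, error_type=ValueError, project={}, schema={})
passed = short_report(
    fake_validate, error_type=ValueError, project={"invalid": "x"}, schema={}
)
print(failed)  # 'invalid' is a required property
print(passed)  # None
```

In real use, one would presumably pass an `eido` function such as `validate_project` as `validate` and `jsonschema.exceptions.ValidationError` as `error_type`, along with the usual `project` and `schema` arguments.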