# DRS applications tutorial

The following DRS applications are mainly based on the validation functions seen in the API tutorial.

## Imports

In [1]:
import esgvoc

## DRS validation

DRS validation checks a character expression against the DRS specification of a project.

### Instantiation

Import the validator class:

In [2]:
from esgvoc.apps.drs.validator import DrsValidator

Instantiate a validator object, for example for the CMIP6Plus CV:

In [3]:
validator = DrsValidator(project_id="cmip6plus")

### Validation methods

Check the compliance of a DRS expression, for example a dataset id:

In [4]:
validator.validate_dataset_id(drs_expression="CMIP6Plus.CMIP.IPSL.MIROC6.amip.r2i2p1f2.ACmon.od550aer.gn")



You can also check directories and file names:

In [5]:
validator.validate_directory(drs_expression="CMIP6Plus/CMIP/NCC/MIROC6/amip/r2i2p1f2/ACmon/od550aer/gn/v20190923")



In [6]:
validator.validate_file_name(drs_expression="od550aer_ACmon_MIROC6_amip_r2i2p1f2_gn_201211-201212.nc")



The generic `validate` method takes the DRS type as an argument:

In [7]:
validator.validate(drs_expression="CMIP6Plus.CMIP.IPSL.MIROC6.amip.r2i2p1f2.ACmon.od550aer.gn", type='dataset_id')
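
When many expressions share the same DRS type, the generic method lends itself to a loop. The sketch below is hypothetical: `validate_all` and `StubValidator` are not part of esgvoc. It only assumes that the object it receives exposes `validate(drs_expression=..., type=...)` as called above and that the returned value is truthy when the expression is valid. The stub applies a toy rule so the snippet runs without any vocabulary installed.

```python
def validate_all(validator, expressions, drs_type):
    """Return {expression: report} for every expression of the given DRS type."""
    return {
        expression: validator.validate(drs_expression=expression, type=drs_type)
        for expression in expressions
    }


class StubValidator:
    """Stand-in for a real validator (assumption, for illustration only)."""

    def validate(self, drs_expression, type):
        # Toy rule, NOT the real CMIP6Plus specification:
        # a dataset id is accepted when it has nine dot-separated tokens.
        return type == "dataset_id" and len(drs_expression.split(".")) == 9


reports = validate_all(
    StubValidator(),
    ["CMIP6Plus.CMIP.IPSL.MIROC6.amip.r2i2p1f2.ACmon.od550aer.gn", "CMIP6Plus.CMIP"],
    "dataset_id",
)
rejected = [expression for expression, report in reports.items() if not report]
print(rejected)
```

With a real validator, the `StubValidator()` argument would presumably be replaced by the `validator` object created earlier.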



### Reporting

The value returned by the validator is not a string but a full report, which evaluates to true when the expression is valid:

In [8]:
report = validator.validate_file_name(drs_expression="od550aer_ACmon_MIROC6_amip_r2i2p1f2_gn.nc")
if report:
    print('valid')
else:
    print('invalid')

valid


The report also lists any errors and warnings (here, the period is missing at the end of the file name). See the full API documentation [here](https://esgf.github.io/esgf-vocab/api_documentation/drs.html#esgvoc.apps.drs.report.DrsValidationReport).

In [9]:
report.warnings

[missing token for time_range at position 7]
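
The combination seen above (a report that is truthy yet carries a warning) can be pictured with a small stand-in class. This is a sketch under an assumption: `MiniReport` is hypothetical, and the rule that only errors affect truthiness is inferred from the output above rather than taken from the esgvoc source.

```python
from dataclasses import dataclass, field


@dataclass
class MiniReport:
    """Hypothetical stand-in for a validation report (not the esgvoc class)."""

    expression: str
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    def __bool__(self):
        # Assumption: only errors make the expression invalid;
        # warnings are informational.
        return not self.errors


report_sketch = MiniReport(
    expression="od550aer_ACmon_MIROC6_amip_r2i2p1f2_gn.nc",
    warnings=["missing token for time_range at position 7"],
)
status = "valid" if report_sketch else "invalid"
print(status, report_sketch.warnings)
```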

The validator detects a wide range of issues, such as a blank token:

In [10]:
report = validator.validate_directory(drs_expression="CMIP6Plus/CMIP/ /NCC/MIROC6/amip/r2i2p1f2/ACmon/od550aer/gn/v20190923")
print(repr(report))
print(report.errors)

[blank token at column 16]


It also detects an invalid term:

In [11]:
report = validator.validate_directory(drs_expression="CMIP6Plus/CMIP_ERROR_HERE/NCC/MIROC6/amip/r2i2p1f2/ACmon/od550aer/gn/v20190923")
print(repr(report))
print(report.errors)

[token 'CMIP_ERROR_HERE' not compliant with activity_id at position 2]
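
To make the two kinds of location concrete (a character column for a blank token, a token position for an invalid term), here is a minimal sketch of a checker that produces messages in the same shape. Everything in it is an assumption for illustration: `check_directory` and `toy_spec` are hypothetical, the vocabulary covers a single collection, and the real parser handles many more cases.

```python
def check_directory(expression, allowed_by_position):
    """Return a list of issue strings for a '/'-separated expression.

    allowed_by_position maps a 1-based token position to a
    (collection name, set of allowed terms) pair.
    """
    issues = []
    column = 1  # 1-based column of the current token's first character
    for position, token in enumerate(expression.split("/"), start=1):
        if not token.strip():
            issues.append(f"blank token at column {column}")
        elif position in allowed_by_position:
            collection, terms = allowed_by_position[position]
            if token not in terms:
                issues.append(
                    f"token '{token}' not compliant with {collection} at position {position}"
                )
        column += len(token) + 1  # move past the token and its separator
    return issues


# Toy vocabulary covering only the second token (assumption, not the real CV).
toy_spec = {2: ("activity_id", {"CMIP"})}

blank = check_directory("CMIP6Plus/CMIP/ /NCC/MIROC6", toy_spec)
invalid = check_directory("CMIP6Plus/CMIP_ERROR_HERE/NCC/MIROC6", toy_spec)
print(blank)
print(invalid)
```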


The validation issues can be processed by implementing a [parser issue visitor](https://esgf.github.io/esgf-vocab/api_documentation/drs.html#esgvoc.apps.drs.report.ParserIssueVisitor) and a [validation issue visitor](https://esgf.github.io/esgf-vocab/api_documentation/drs.html#esgvoc.apps.drs.report.ValidationIssueVisitor):

In [14]:
class MyValidationVisitor(esgvoc.apps.drs.report.ParserIssueVisitor):
    def visit_invalid_token_issue(self, issue):
        print('Doing something automatically with an invalid token issue, other than printing it')
    # You should implement the other methods of ParserIssueVisitor and ValidationIssueVisitor too!

my_visitor = MyValidationVisitor()
report.errors[0].accept(my_visitor)

Doing something automatically with an invalid token issue, other than printing it
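
The `accept` call relies on the visitor pattern: each issue calls back the visitor method that matches its own type, so the caller never has to test issue types by hand. The sketch below shows the mechanism with toy classes. All names in it are hypothetical (including `visit_blank_token_issue`) and are not the esgvoc classes or their guaranteed method names.

```python
class BlankTokenIssue:
    """Toy parser issue (hypothetical, not the esgvoc class)."""

    def __init__(self, column):
        self.column = column

    def accept(self, visitor):
        # The issue picks the visitor method that matches its own type.
        return visitor.visit_blank_token_issue(self)


class InvalidTokenIssue:
    """Toy validation issue (hypothetical, not the esgvoc class)."""

    def __init__(self, token, collection):
        self.token = token
        self.collection = collection

    def accept(self, visitor):
        return visitor.visit_invalid_token_issue(self)


class SummaryVisitor:
    """Turns each kind of issue into a short machine-friendly tuple."""

    def visit_blank_token_issue(self, issue):
        return ("blank", issue.column)

    def visit_invalid_token_issue(self, issue):
        return ("invalid", issue.token, issue.collection)


issues = [BlankTokenIssue(16), InvalidTokenIssue("CMIP_ERROR_HERE", "activity_id")]
summary_visitor = SummaryVisitor()
summary = [issue.accept(summary_visitor) for issue in issues]
print(summary)
```

Returning a value from each `visit_...` method, as done here, is one design option; printing or logging inside the method works just as well.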


## DRS generation

DRS generation consists of generating a DRS expression from an unordered mapping of collections and tokens, or from a bag of unordered tokens.
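
Starting from a bag of tokens means that each token must first be matched to a collection. A minimal sketch of that matching step follows. It rests on assumptions: `assign_collections` and `toy_vocabulary` are hypothetical, the vocabulary is a tiny hand-written subset, and the strategy shown (exact membership lookup, error on ambiguity) is not necessarily the one esgvoc uses.

```python
def assign_collections(tokens, vocabulary):
    """Map each token of an unordered bag to the collection that contains it.

    Returns (mapping, unmatched). Raises ValueError when a token
    belongs to more than one collection.
    """
    mapping, unmatched = {}, []
    for token in tokens:
        matches = [name for name, terms in vocabulary.items() if token in terms]
        if len(matches) > 1:
            raise ValueError(f"token {token!r} matches several collections: {matches}")
        if matches:
            mapping[matches[0]] = token
        else:
            unmatched.append(token)
    return mapping, unmatched


# Toy vocabulary, an assumption for illustration (not the CMIP6Plus CV).
toy_vocabulary = {
    "mip_era": {"CMIP6Plus"},
    "activity_id": {"CMIP"},
    "source_id": {"MIROC6"},
    "experiment_id": {"amip"},
}

bag_mapping, unmatched = assign_collections(
    {"amip", "MIROC6", "CMIP", "unknown"}, toy_vocabulary
)
print(bag_mapping, unmatched)
```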

### Instantiation

Import the generator class:

In [17]:
from esgvoc.apps.drs.generator import DrsGenerator

Instantiate a generator object, for example for the CMIP6Plus CV:

In [18]:
generator = DrsGenerator("cmip6plus")

### Mapping

Build a dictionary that maps each collection to its token:

In [26]:
mapping = {
    'member_id': 'r2i2p1f2',
    'activity_id': 'CMIP',
    'source_id': 'MIROC6',
    'mip_era': 'CMIP6Plus',
    'experiment_id': 'amip',
    'variable_id': 'od550aer',
    'table_id': 'ACmon',
    'grid_label': 'gn',
    'version': 'v20190923',
    'institution_id': 'IPSL',
    'extra_information': 'some_value'
}

Then generate a DRS directory expression:

In [28]:
generator.generate_directory_from_mapping(mapping=mapping)



The generator successfully produces the directory expression even though the mapping contains extra information, which is the opposite of how DRS validation behaves. The same mapping can also generate the associated dataset id and file name expressions, *provided it has enough information!*
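
The two behaviours described here (extra keys are ignored, missing collections are a problem) can be sketched in a few lines. This is an illustration under assumptions: `build_directory` and `DIRECTORY_ORDER` are hypothetical, the collection order is inferred from the directory example in the validation section, and raising `KeyError` is a choice of this sketch rather than what the esgvoc generator does.

```python
# Collection order inferred from the directory example earlier in this tutorial;
# the real order comes from the project's DRS specification.
DIRECTORY_ORDER = [
    "mip_era", "activity_id", "institution_id", "source_id", "experiment_id",
    "member_id", "table_id", "variable_id", "grid_label", "version",
]


def build_directory(mapping, order=DIRECTORY_ORDER, separator="/"):
    """Join the mapped tokens in specification order, ignoring extra keys."""
    missing = [collection for collection in order if collection not in mapping]
    if missing:
        raise KeyError(f"missing collections: {missing}")
    return separator.join(mapping[collection] for collection in order)


sketch_mapping = {
    "member_id": "r2i2p1f2", "activity_id": "CMIP", "source_id": "MIROC6",
    "mip_era": "CMIP6Plus", "experiment_id": "amip", "variable_id": "od550aer",
    "table_id": "ACmon", "grid_label": "gn", "version": "v20190923",
    "institution_id": "IPSL", "extra_information": "some_value",
}
directory = build_directory(sketch_mapping)
print(directory)
```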

In [30]:
generator.generate_dataset_id_from_mapping(mapping=mapping)



In [32]:
generator.generate_file_name_from_mapping(mapping=mapping) # This one has a warning because the period is missing.

