### Confirm Curator Notebook Validation Remains Intact

Adapted from cxg_4.0.0_testing/405_category_of_bool.ipynb

For this schema update, most individual test cases are covered by pytest. In our curation workflow, however, we primarily access the Validator through the cellxgene-schema module and from the terminal via the subprocess module. Running this notebook during each iteration of curation validation confirms that our standard validation workflow remains intact.

Make sure to select the correct test environment, with the latest version of cellxgene-schema installed from the GitHub repo.

Also set scc_repo_loc to the directory of the local single-cell-curation repo.
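
One way to confirm which cellxgene-schema build the selected environment provides is to query the installed package metadata before running anything else. The sketch below uses the standard-library importlib.metadata module; the helper name installed_version is hypothetical, and it assumes the package is registered under the distribution name cellxgene-schema.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumption: the package is installed under the distribution name 'cellxgene-schema'
print(f"cellxgene-schema version: {installed_version('cellxgene-schema')}")
```

A result of None would suggest the wrong environment is selected.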

In [1]:
import os
import scanpy as sc
import subprocess

In [2]:
scc_repo_loc = os.path.expanduser('~/GitClones/CZI/single-cell-curation/')
try:
    # Short hashes of the checked-out commit and of the local main branch
    current_commit = subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD'], cwd=scc_repo_loc).decode('ascii').strip()
    main_commit = subprocess.check_output(['git', 'rev-parse', '--short', 'main'], cwd=scc_repo_loc).decode('ascii').strip()
except (FileNotFoundError, subprocess.CalledProcessError) as e:
    print(f"{e}: Please enter correct local location of single-cell-curation repo")
    # Define both names so save_and_test() does not raise NameError later
    current_commit = 'Incorrect repo location'
    main_commit = None
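
The commit lookup above can also be wrapped in a small function that returns None whenever the hash cannot be resolved. This is a minimal sketch, not part of the original notebook: the name short_commit is hypothetical, and it assumes git is available on the PATH when a real repo is queried.

```python
import subprocess


def short_commit(ref, repo_dir):
    """Return the short hash of `ref` in `repo_dir`, or None if it cannot be resolved."""
    try:
        out = subprocess.check_output(
            ['git', 'rev-parse', '--short', ref],
            cwd=repo_dir,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.CalledProcessError):
        # OSError covers a missing directory or a missing git executable;
        # CalledProcessError covers a directory that is not a git repo or an unknown ref.
        return None
    return out.decode('ascii').strip()
```

Calling short_commit('HEAD', scc_repo_loc) and short_commit('main', scc_repo_loc) would then yield either a hash or None for each, so the later comparison never hits an undefined name.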

In [3]:
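def validate(file):
    # Run the cellxgene-schema CLI the same way we do from the terminal
    validate_process = subprocess.run(['cellxgene-schema', 'validate', file], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    for line in validate_process.stdout.decode('utf-8').split('\n'):
        print(line)
    # The status line containing is_valid= is read from stderr
    for line in validate_process.stderr.decode('utf-8').split('\n'):
        print(line)
        if 'is_valid=' in line:
            valid = line.split('=')[-1].strip()
            return valid
    # No status line found; None never equals 'True' or 'False', so save_and_test reports ERROR
    return None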
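
The status extraction in validate can be exercised on its own, without invoking the CLI. The sketch below is illustrative only: parse_is_valid and STATUS_RE are hypothetical names, and it assumes the status is always rendered as the literal text True or False, as in the summary line this notebook prints.

```python
import re

# Assumption: the status token is spelled 'is_valid=True' or 'is_valid=False'
STATUS_RE = re.compile(r'is_valid=(True|False)')


def parse_is_valid(output):
    """Return 'True' or 'False' from validator output, or None if no status is present."""
    match = STATUS_RE.search(output)
    return match.group(1) if match else None


sample = "Validation complete in 0:00:01.213689 with status is_valid=True"
print(parse_is_valid(sample))
```

Returning a string keeps the result directly comparable with the expected argument passed to save_and_test.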

In [4]:
def save_and_test(adata, expected):
    adata.write(filename='test.h5ad')
    adata = sc.read_h5ad('test.h5ad')

    print("A valid h5ad")
    print('------------------')

    valid = validate('test.h5ad')
    print('------------------')
    
    if expected != valid:
        print('\033[1m\033[91mERROR\033[0m')
    else:
        print('\033[1m\033[92mPASSED\033[0m')
    
    if current_commit != main_commit:
        print('NOT ON MAIN BRANCH')
    else:
        print(f'Using CZI single-cell-curation commit: {current_commit}')

    os.remove('test.h5ad')

## Test Validator Pathway

In [5]:
adata = sc.read_h5ad("fixtures/valid.h5ad")
adata.obs['assay_ontology_term_id'] = 'EFO:0022490'
save_and_test(adata,'True')

A valid h5ad
------------------
Loading dependencies
Loading validator modules

Starting validation...
Validation complete in 0:00:01.213689 with status is_valid=True
------------------
[1m[92mPASSED[0m
Using CZI single-cell-curation commit: 15bdad5
