**fileExtensionMissingSynonym.ipynb**

EDAM Format concept is missing a synonym or label matching the file extension.

**Documentation:** https://github.com/edamontology/edamverify/blob/master/docs/fileExtensionMissingSynonym.md

Set constants for script return values. Load EDAM_dev.owl from GitHub into an RDF graph.

In [37]:
import io
import sys
from rdflib import ConjunctiveGraph, Namespace

# Constants for script return value as per https://github.com/edamontology/edamverify.
NOERR = 0
INFO  = 1
WARN  = 2
ERROR = 3

# Load EDAM_dev.owl into an RDF graph (a local copy is read; the GitHub URL is kept commented out).
print("Loading graph ...", end="")
g = ConjunctiveGraph()
# g.parse('https://raw.githubusercontent.com/edamontology/edamontology/master/EDAM_dev.owl', format='xml')
# Graph.load() was deprecated in rdflib; parse() is the supported call.
g.parse('EDAM_dev.owl', format='xml')
g.bind('edam', Namespace('http://edamontology.org#'))
print("done!")



Loading graph ...done!


Define SPARQL query to retrieve file extension of (EDAM Format) concepts. Run the query.

**NB:** BASE sets the base IRI to ``http://edamontology.org/``, the namespace of ``file_extension`` below.
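In SPARQL, `BASE` only resolves relative IRIs, while prefixed names such as `:file_extension`, `rdfs:label` and `oboInOwl:hasExactSynonym` are resolved through prefix declarations. The query in this notebook declares none, so it presumably relies on prefix bindings that the graph picked up from the OWL file. As a sketch of a more portable variant, the prefixes could be declared explicitly. The name `query_term_explicit` is hypothetical, and the default prefix is assumed to be the same IRI that the notebook uses for `BASE`.

```python
# Hypothetical variant of the query with every prefix declared explicitly,
# so it does not depend on the namespace bindings of the loaded graph.
# Assumption: the empty prefix maps to the same IRI used for BASE in the notebook.
query_term_explicit = """
PREFIX : <http://edamontology.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
SELECT ?id ?term ?ext ?exact_syn WHERE
{
  ?id rdfs:label ?term .
  ?id :file_extension ?ext .
  ?id oboInOwl:hasExactSynonym ?exact_syn .
}
"""
```

The string can be passed to `g.query(...)` in the same way as `query_term`.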

In [38]:
# Compile SPARQL query
query_term = """
BASE <http://edamontology.org/>
SELECT ?id ?term ?ext ?exact_syn WHERE
{
?id rdfs:label ?term .
?id :file_extension ?ext .
?id oboInOwl:hasExactSynonym ?exact_syn 
}
"""

# Run SPARQL query and collate results
errfound = False    
report = list()
results = g.query(query_term)

No <exactSynonym> or <rdfs:label> found corresponding to <file_extension> for these concepts:
http://edamontology.org/format_3749 (JSON-LD): jsonld
http://edamontology.org/format_3789 (XQuery): xq|xqy|xquery
http://edamontology.org/format_3556 (MHTML): mhtml|mht|eml
http://edamontology.org/format_3556 (MHTML): mhtml|mht|eml
http://edamontology.org/format_3556 (MHTML): mhtml|mht|eml
http://edamontology.org/format_3556 (MHTML): mhtml|mht|eml
http://edamontology.org/format_3556 (MHTML): mhtml|mht|eml
http://edamontology.org/format_3556 (MHTML): mhtml|mht|eml
http://edamontology.org/format_3556 (MHTML): mhtml|mht|eml
http://edamontology.org/format_3750 (YAML): yaml|yml
http://edamontology.org/format_3475 (TSV): tsv|tab
http://edamontology.org/format_3475 (TSV): tsv|tab
http://edamontology.org/format_1930 (FASTQ): fq
http://edamontology.org/format_1930 (FASTQ): fq
http://edamontology.org/format_3746 (BIOM format): biom


SystemExit: 2

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)


Analyse results of query.

In [None]:
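for r in results:
    # Debug aid: print(str(r['id']), str(r['term']), str(r['ext']), str(r['exact_syn']))
    concept_id = str(r['id'])  # renamed from `id` to avoid shadowing the built-in
    term = str(r['term'])
    ext = str(r['ext'])
    exact_syn = str(r['exact_syn'])

    # Flag the row when the extension matches neither this exact synonym nor the label.
    if (ext.lower() != exact_syn.lower()) and (ext.lower() != term.lower()):
        errfound = True
        report.append(concept_id + ' (' + term + '): ' + ext)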
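The query returns one row per exact synonym, so the loop above compares each synonym separately and can list the same concept several times (as with MHTML in the output shown earlier). A possible alternative, sketched below, groups rows by concept before comparing. It rests on two assumptions that the notebook does not confirm: that a `|` in the extension value separates several extensions, and that every extension should match the label or an exact synonym. The function name `find_missing_synonyms` and the sample rows are hypothetical.

```python
from collections import defaultdict


def find_missing_synonyms(rows):
    """Group (concept_id, label, ext, exact_syn) rows by concept.

    Returns one report line per concept that has at least one file
    extension matching neither its label nor any exact synonym.
    Assumption: '|' separates multiple extensions in `ext`.
    """
    labels = {}
    names = defaultdict(set)  # concept_id -> lower-cased label and synonyms
    exts = defaultdict(set)   # concept_id -> lower-cased extensions
    for concept_id, label, ext, exact_syn in rows:
        labels[concept_id] = label
        names[concept_id].update({label.lower(), exact_syn.lower()})
        exts[concept_id].update(
            e.strip().lower() for e in ext.split("|") if e.strip()
        )

    report = []
    for concept_id in sorted(exts):
        missing = sorted(exts[concept_id] - names[concept_id])
        if missing:
            report.append(
                f"{concept_id} ({labels[concept_id]}): {'|'.join(missing)}"
            )
    return report


# Made-up sample rows (not taken from EDAM) to illustrate the behaviour.
sample_rows = [
    ("http://example.org/format_a", "Sample format", "smp|sample", "smp"),
    ("http://example.org/format_a", "Sample format", "smp|sample", "Sample file"),
    ("http://example.org/format_b", "abc", "abc", "ABC format"),
]
sample_report = find_missing_synonyms(sample_rows)
print("\n".join(sample_report))
```

With real query results, the rows could be built as `(str(r['id']), str(r['term']), str(r['ext']), str(r['exact_syn']))` for each `r` in `results`.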

Write report and return appropriate value.

In [None]:
# Return exit code (sys.exit raises SystemExit, which Jupyter reports as an exception)
if errfound:
    print("No <exactSynonym> or <rdfs:label> found matching <file_extension> for these concepts:")
    print("\n".join(report))
    sys.exit(WARN)
else:
    print("No issues found.")
    sys.exit(NOERR)
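Because `sys.exit` raises `SystemExit` inside a notebook, it may be convenient to separate the decision from the exit call. The following is a minimal sketch under that assumption. The helper `summarise` is hypothetical, and the constants repeat the values defined at the top of the notebook.

```python
# Same values as the NOERR and WARN constants defined at the top of the notebook.
NOERR = 0
WARN = 2


def summarise(report):
    """Return (exit_code, message) without calling sys.exit.

    `report` is the list of report lines collected during the analysis.
    """
    if report:
        header = ("No <exactSynonym> or <rdfs:label> found matching "
                  "<file_extension> for these concepts:")
        return WARN, header + "\n" + "\n".join(report)
    return NOERR, "No issues found."


code, message = summarise([])
print(message)
```

A script wrapper could then call `sys.exit(code)` once, after printing `message`, while a notebook session can inspect `code` without triggering the exception.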