<a id="begin"></a>
# Generating COMBINE Archives from the SBML Test Suite

The [SBML test suite](http://sbml.org/Software/SBML_Test_Suite) is intended to test the accuracy of SBML ODE simulators by providing a normative standard against which simulation results can be compared. The test suite predates SED-ML and specifies the simulation parameters in a custom format. However, recent versions include the SED-ML necessary to reproduce the simulations. Here, we provide an automated script to convert these test cases into COMBINE archives, which can be imported into Tellurium.

This notebook is adapted from Stanley Gu's [original notebook](http://nbviewer.jupyter.org/github/stanleygu/sbmltest2archive/blob/master/create_archives.ipynb).

This notebook contains two parts: converting the SBML test cases to COMBINE archives and running the simulations in the archives. To skip to running the simulations, click [here](#simulations).

### Outstanding issues: cannot cd to notebook dir, need to figure out which cases to support by default.

In [16]:
import pprint, tellurium as te

# level and version of SBML to use
lv_string = 'l3v1'
# subset of the cases to convert and/or run
cases = te.getSupportedTestCases()
print('Using the following cases:')
pprint.PrettyPrinter().pprint(cases)
# maximum cutoff for passing a test case (per-variable)
max_threshold = 1e-3

# change to the directory of this notebook
%cd ~/devel/src/tellurium-combine-archive-test-cases/sbml-test-suite

<a id="simulations"></a>
## Step 7: Run all simulations in the COMBINE archives in Tellurium.

This runs the simulations in all COMBINE archives specified by the `cases` variable. By default, this will run all tests that Tellurium supports, which will take a **long** time. You can set the `cases` variable to a subset of tests at the [beginning](#begin) of this notebook.
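
For a quick check, `cases` can be narrowed before running this step. The sketch below assumes the case identifiers are zero-padded five-digit strings matching the test suite's directory names (for example `00001`). The `supported` list is a hypothetical stand-in for whatever `te.getSupportedTestCases()` returns, so that the example runs without Tellurium.

```python
# hypothetical stand-in for te.getSupportedTestCases();
# in the notebook the real list comes from Tellurium
supported = ['{:05d}'.format(i) for i in range(1, 101)]

# identifiers of the first ten test cases
wanted = {'{:05d}'.format(i) for i in range(1, 11)}

# keep only the wanted cases that are also supported, preserving order
cases = [case for case in supported if case in wanted]

print('Using {} cases, starting with {}'.format(len(cases), cases[0]))
```

Filtering against the supported list, rather than building the identifiers directly, avoids requesting a case that Tellurium cannot run.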

In [11]:
import os
from io import StringIO

import pandas as pd
import tellurium as te

lv_archive_path = os.path.join('archives', lv_string)
n_failures = 0
n_successes = 0

for case in cases:
    archive_name = case + '.omex'
    print('Running {}'.format(archive_name))
    case_path = os.path.join(lv_archive_path, archive_name)
    te.convertAndExecuteCombineArchive(case_path)

    # load the expected results bundled with the archive
    expected_csv = te.extractFileFromCombineArchive(case_path, case + '-results.csv')
    df = pd.read_csv(StringIO(expected_csv))
    # fetch the simulated results and drop the last row of the report
    report = te.getLastReport()
    report = report.drop(report.shape[0] - 1)
    df.columns = report.columns
    # difference between simulation and expected results
    diff = report.subtract(df)
    # largest per-variable mean squared error
    max_val = (diff**2).mean().max()
    if max_val > max_threshold:
        n_failures += 1
    else:
        n_successes += 1

print('Finished running tests: {} PASS, {} FAIL'.format(n_successes, n_failures))
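
The pass/fail criterion used in the loop above is the largest per-variable mean squared difference between simulated and expected values, compared against `max_threshold`. The following is a minimal sketch of that criterion in plain Python, so it runs without Tellurium or pandas. The helper name `max_mean_squared_error` and the sample data are hypothetical.

```python
def max_mean_squared_error(simulated, expected):
    """Return the largest per-variable mean squared difference.

    Both arguments map a variable name to a list of values sampled at
    the same time points.
    """
    worst = 0.0
    for name, sim_values in simulated.items():
        exp_values = expected[name]
        squared = [(s - e) ** 2 for s, e in zip(sim_values, exp_values)]
        worst = max(worst, sum(squared) / len(squared))
    return worst


# hypothetical data: two variables sampled at three time points
expected = {'S1': [1.0, 0.5, 0.25], 'S2': [0.0, 0.5, 0.75]}
simulated = {'S1': [1.0, 0.5, 0.25], 'S2': [0.0, 0.5, 0.85]}

max_threshold = 1e-3
error = max_mean_squared_error(simulated, expected)
print('PASS' if error <= max_threshold else 'FAIL')
```

Because the maximum is taken over variables, a single badly matching species is enough to fail a case even when all others agree.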

## Step 2: Download the SBML test cases

In [17]:

import os.path
import platform

# url for the test case archive
url = 'http://sourceforge.net/projects/sbml/files/test-suite/3.1.1/cases-archive/sbml-test-cases-2014-10-22.zip'
# local filename for the downloaded archive (defined up front so both branches can use it)
test_cases_filename = 'sbml-test-cases.zip'

if platform.system() != 'Linux':
    import urllib.request

    # download the test case archive
    with urllib.request.urlopen(url) as response, open(test_cases_filename, 'wb') as out_file:
        out_file.write(response.read())
else:
    !wget http://sourceforge.net/projects/sbml/files/test-suite/3.1.1/cases-archive/sbml-test-cases-2014-10-22.zip
    %mv sbml-test-cases-2014-10-22.zip sbml-test-cases.zip

print('Downloaded test case archive to {}'.format(os.path.abspath(test_cases_filename)))

--2017-08-25 11:35:07--  http://sourceforge.net/projects/sbml/files/test-suite/3.1.1/cases-archive/sbml-test-cases-2014-10-22.zip
Resolving sourceforge.net (sourceforge.net)... 216.34.181.60
Connecting to sourceforge.net (sourceforge.net)|216.34.181.60|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://sourceforge.net/projects/sbml/files/test-suite/3.1.1/cases-archive/sbml-test-cases-2014-10-22.zip [following]
--2017-08-25 11:35:07--  https://sourceforge.net/projects/sbml/files/test-suite/3.1.1/cases-archive/sbml-test-cases-2014-10-22.zip
Connecting to sourceforge.net (sourceforge.net)|216.34.181.60|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://sourceforge.net/projects/sbml/files/test-suite/3.1.1/cases-archive/sbml-test-cases-2014-10-22.zip/download [following]
--2017-08-25 11:35:08--  https://sourceforge.net/projects/sbml/files/test-suite/3.1.1/cases-archive/sbml-test-cases-2014-10-22.zip/

## Step 3: Extract the Test Case Archive

In [18]:
import os, errno, zipfile

# extract into the current directory
with zipfile.ZipFile(test_cases_filename) as z:
    z.extractall('.')
    
print('Extracted test cases to {}'.format(os.path.abspath('.')))

Extracted test cases to /home/poltergeist/devel/src/tellurium-combine-archive-test-cases/sbml-test-suite


## Step 5: Create COMBINE Archives

Here, we create COMBINE archives for SBML Level 3 Version 1. We could package all levels and versions in the same COMBINE archive, but this would drastically increase simulation time.

The level and version are set by `lv_string` at the [beginning](#begin) of this notebook, which picks out only the L3V1 SBML test cases (all SED-ML in this example is L1V1).

Next, we create the directory structure for the archives:

In [19]:
# create a function to make a new directory if it doesn't already exist
def mkdirp(path):
    try:
        os.makedirs(path)
    except OSError as exc:
        if exc.errno == errno.EEXIST and os.path.isdir(path):
            pass
        else:
            raise

# make a new directory for this level and version if it doesn't already exist
lv_archive_path = os.path.join('archives', lv_string)
mkdirp(lv_archive_path)

print('Created directory {}'.format(lv_archive_path))

Created directory archives/l3v1
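
On Python 3.2 and later, the standard library can do what `mkdirp` does in one call: `os.makedirs(path, exist_ok=True)`. The sketch below shows this in a temporary directory (used here only so the example leaves no files behind). Like `mkdirp`, it still raises an error if the path exists but is not a directory.

```python
import os
import tempfile

# work in a throwaway directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'archives', 'l3v1')
    # creates intermediate directories as needed
    os.makedirs(path, exist_ok=True)
    # a second call is a no-op instead of raising FileExistsError
    os.makedirs(path, exist_ok=True)
    created = os.path.isdir(path)

print('Directory created: {}'.format(created))
```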


## Step 6: Create the archives by including SBML, SED-ML, and results files.
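
The file selection in the cell that follows can be tried in isolation first. The sketch below applies the same three regular expressions to a hypothetical directory listing for one case. The listed filenames are assumptions modeled on the test suite's naming pattern, not read from disk.

```python
import re

case = '00001'
lv_string = 'l3v1'

# hypothetical directory listing for one test case
listing = [
    '00001-model.m',
    '00001-plot.jpg',
    '00001-results.csv',
    '00001-sbml-l2v4.xml',
    '00001-sbml-l3v1.xml',
    '00001-sbml-l3v1-sedml.xml',
    '00001-settings.txt',
]

# same patterns as in the archive-creation cell
regex_sbml = re.compile(case + r'-sbml-{}\.xml'.format(lv_string), re.IGNORECASE)
regex_sedml = re.compile(case + r'-sbml-{}-sedml\.xml'.format(lv_string), re.IGNORECASE)
regex_csv = re.compile(case + r'-results\.csv$', re.IGNORECASE)

sbmlfiles = sorted(f for f in listing if regex_sbml.search(f))
sedmlfiles = sorted(f for f in listing if regex_sedml.search(f))
csvfiles = sorted(f for f in listing if regex_csv.search(f))

print(sbmlfiles, sedmlfiles, csvfiles)
```

The SBML pattern does not pick up the SED-ML file because `-sedml` sits between the level/version string and the `.xml` extension.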

In [21]:
import re
from xml.dom import minidom
import xml.etree.ElementTree as ET

n_archives_written = 0

for case in cases:
    test_case_path = os.path.join('cases', 'semantic', case)
    ls = os.listdir(test_case_path)
    # Only L3V1:
    regex_sbml = re.compile(case + r'-sbml-{}\.xml'.format(lv_string), re.IGNORECASE)
    regex_sedml = re.compile(case + r'-sbml-{}-sedml\.xml'.format(lv_string), re.IGNORECASE)
    regex_csv = re.compile(case + r'-results\.csv$', re.IGNORECASE)
    # All levels/versions:
    # regex_sbml = re.compile(case + r'-sbml-l\dv\d\.xml', re.IGNORECASE)
    # regex_sedml = re.compile(case + r'-sbml-l\dv\d-sedml\.xml', re.IGNORECASE)

    sbmlfiles = sorted([file for file in ls if regex_sbml.search(file)])
    sedmlfiles = sorted([file for file in ls if regex_sedml.search(file)])
    csvfiles = sorted([file for file in ls if regex_csv.search(file)])
    plot_file = [file for file in ls if 'plot.jpg' in file][0]

    ET.register_namespace('', 'http://identifiers.org/combine.specifications/omex-manifest')
    manifest_template = '''<?xml version="1.0" encoding="UTF-8"?>
    <omexManifest
        xmlns="http://identifiers.org/combine.specifications/omex-manifest"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://identifiers.org/combine.specifications/omex-manifest combine.xsd "></omexManifest>
    '''

    doc = ET.fromstring(manifest_template)
    manifest = ET.SubElement(doc, 'content')
    manifest.attrib['format'] = 'http://identifiers.org/combine.specifications/omex-manifest'
    manifest.attrib['location'] = './manifest.xml'

    for sbmlfile in sbmlfiles:
        model = ET.SubElement(doc, 'content')
        model.attrib['format'] = 'http://identifiers.org/combine.specifications/sbml'
        model.attrib['location'] = './' + sbmlfile

    for sedmlfile in sedmlfiles:
        sedml = ET.SubElement(doc, 'content')
        sedml.attrib['format'] = 'http://identifiers.org/combine.specifications/sed-ml.level-1.version-1'
        sedml.attrib['location'] = './' + sedmlfile
        sedml.attrib['master'] = 'true'

    for csvfile in csvfiles:
        csv = ET.SubElement(doc, 'content')
        csv.attrib['format'] = 'http://identifiers.org/combine.specifications/csv'
        csv.attrib['location'] = './' + csvfile
        
    archive_files = sbmlfiles + sedmlfiles + csvfiles

    xml_str = ET.tostring(doc, encoding='UTF-8')
    # reparse the xml string to pretty print it
    reparsed = minidom.parseString(xml_str)
    pretty_xml_str = reparsed.toprettyxml(indent="    ")

    # use zipfile to create the COMBINE archive containing the manifest and case files
    from zipfile import ZipFile
    archive_name = case + '.omex'
    archive_path = os.path.join('archives', lv_string, archive_name)
    with ZipFile(archive_path, 'w') as archive:
        # write the manifest
        archive.writestr('manifest.xml', pretty_xml_str.encode('utf-8'))
        # store each file at the archive root, matching the './' locations in the manifest;
        # arcname avoids changing the working directory
        for f in archive_files:
            archive.write(os.path.join(test_case_path, f), arcname=f)
    
    # print the number of contained files (add +1 for the manifest)
    print('Created {} (containing {} files)'.format(archive_path, 1+len(archive_files)))
    
    n_archives_written += 1

print('Finished writing archives ({} total archives written)'.format(n_archives_written))
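
To check what ended up in an archive, the manifest can be read back with the standard library alone. The sketch below is one way to do that. The helper `list_manifest_entries` is hypothetical, and the small `demo.omex` archive is built only so the example runs on its own; in practice the path would point at one of the files under `archives/l3v1`.

```python
import os
import tempfile
import zipfile
import xml.etree.ElementTree as ET

OMEX_NS = 'http://identifiers.org/combine.specifications/omex-manifest'


def list_manifest_entries(archive_path):
    """Return (location, format) pairs from an archive's manifest.xml."""
    with zipfile.ZipFile(archive_path) as archive:
        root = ET.fromstring(archive.read('manifest.xml'))
    return [(c.get('location'), c.get('format'))
            for c in root.findall('{%s}content' % OMEX_NS)]


# build a tiny stand-in archive so the sketch runs on its own
manifest = (
    '<omexManifest xmlns="%s">'
    '<content location="./manifest.xml" format="%s"/>'
    '<content location="./model.xml" '
    'format="http://identifiers.org/combine.specifications/sbml"/>'
    '</omexManifest>' % (OMEX_NS, OMEX_NS)
)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'demo.omex')
    with zipfile.ZipFile(demo_path, 'w') as archive:
        archive.writestr('manifest.xml', manifest)
        archive.writestr('model.xml', '<sbml/>')
    entries = list_manifest_entries(demo_path)

for location, fmt in entries:
    print(location, fmt)
```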