In [1]:
import sys
print('Python %s on %s' % (sys.version, sys.platform))
sys.path.extend(['/Users/m097749/Source/cobrapy', '/Users/m097749/Source/cobrapy-modelseed'])

Python 2.7.11 (v2.7.11:6d1b6a68f775, Dec  5 2015, 12:54:16) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin


## ModelSEED for cobrapy

ModelSEED for cobrapy provides support for creating a COBRA model from a ModelSEED model and using the ModelSEED web service to create draft models from genomes available in the [Pathosystems Resource Integration Center](https://www.patricbrc.org/portal/portal/patric/Home) (PATRIC). If you are not a [registered PATRIC user](http://enews.patricbrc.org/faqs/workspace-faqs/registration-faqs/), you must complete a [new user registration](https://user.patricbrc.org/register/) to work with the ModelSEED web service.

Before using ModelSEED functions, you must first get an authentication token with your PATRIC username and password. The `get_token()` function stores the authentication token in the `.patric_config` file in your home directory. You only need to get an authentication token the first time you use the ModelSEED functions. Change `username` in the cell below to your PATRIC username and enter your password when prompted. The returned user ID identifies your ModelSEED workspace.


In [2]:
import modelseed
modelseed.get_token('username')

patric password: ········


u'mmundy@patricbrc.org'
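
Because the token is stored in `.patric_config` in your home directory, a script can check whether that file is already present before prompting for a password. The following is a minimal sketch, not part of the ModelSEED package: `has_patric_config()` is a hypothetical helper, and it only tests that the file exists, not that the token inside it is still valid.

```python
import os
import shutil
import tempfile


def has_patric_config(home_dir=None):
    """Return True if a .patric_config file exists in the given home directory."""
    if home_dir is None:
        home_dir = os.path.expanduser('~')
    return os.path.isfile(os.path.join(home_dir, '.patric_config'))


# Demonstrate on a throwaway directory so the real home directory is untouched.
demo_dir = tempfile.mkdtemp()
before = has_patric_config(demo_dir)   # no file has been created yet
open(os.path.join(demo_dir, '.patric_config'), 'w').close()
after = has_patric_config(demo_dir)    # the file now exists
shutil.rmtree(demo_dir)
```

In a notebook, you could call `modelseed.get_token('username')` only when `has_patric_config()` returns `False`.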

First, reconstruct a draft model for an organism with `reconstruct_modelseed_model()`. You need to provide a PATRIC genome ID to identify the organism. You can [search for genomes](https://www.patricbrc.org/portal/portal/patric/Genomes) on the PATRIC website among the thousands of bacterial organisms available. After a model is reconstructed, you refer to it by ID. By default, the ID of the model is the genome ID. You can give a model a different ID with the `model_id` parameter. Note that it takes a minute or two for the ModelSEED web service to run a function and return a result.

In [3]:
modelseed.reconstruct_modelseed_model('226186.12')

{'fba_count': 0,
 'gapfilled_reactions': 0,
 'gene_associated_reactions': 1034,
 'genome_ref': u'/mmundy@patricbrc.org/modelseed/226186.12/genome',
 'id': u'226186.12',
 'integrated_gapfills': 0,
 'name': u'Bacteroides thetaiotaomicron VPI-5482',
 'num_biomass_compounds': 85,
 'num_biomasses': 1,
 'num_compartments': 2,
 'num_compounds': 1202,
 'num_genes': 739,
 'num_reactions': 1034,
 'ref': u'/mmundy@patricbrc.org/modelseed/226186.12',
 'rundate': u'2016-12-12T17:21:01',
 'source': u'ModelSEED',
 'source_id': u'226186.12',
 'template_ref': u'/chenry/public/modelsupport/templates/GramNegative.modeltemplate',
 'type': u'GenomeScale',
 'unintegrated_gapfills': 0}
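
Since each call to the web service takes a minute or two, it can be worth catching an obviously malformed genome ID before submitting it. The sketch below assumes that a PATRIC genome ID has the form of two numbers separated by a period, as in `226186.12`. That pattern is an assumption based on the example above, and `looks_like_genome_id()` is a hypothetical helper, not a ModelSEED function.

```python
import re

# Assumed format: digits, a period, digits (for example '226186.12').
GENOME_ID_PATTERN = re.compile(r'^\d+\.\d+$')


def looks_like_genome_id(genome_id):
    """Return True if the string matches the assumed PATRIC genome ID format."""
    return GENOME_ID_PATTERN.match(genome_id) is not None


valid = looks_like_genome_id('226186.12')
invalid = looks_like_genome_id('Bacteroides thetaiotaomicron')
```

A string that passes this check is not guaranteed to identify a genome that exists in PATRIC.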

Next, gap fill the model using the ModelSEED algorithm with `gapfill_modelseed_model()`. By default, the ModelSEED model is gap filled on complete media. Use the `media_reference` parameter to specify a different medium. ModelSEED provides over 500 media in the `/chenry/public/modelsupport/media` folder (see below for how to show all of the available media). This step is optional if you want to use other gap fill algorithms in cobrapy.

In [4]:
modelseed.gapfill_modelseed_model('226186.12')

{'fba_count': 0,
 'gapfilled_reactions': 0,
 'gene_associated_reactions': 1034,
 'genome_ref': u'/mmundy@patricbrc.org/modelseed/226186.12/genome',
 'id': u'226186.12',
 'integrated_gapfills': 1,
 'name': u'Bacteroides thetaiotaomicron VPI-5482',
 'num_biomass_compounds': 85,
 'num_biomasses': 1,
 'num_compartments': 2,
 'num_compounds': 1253,
 'num_genes': 739,
 'num_reactions': 1129,
 'ref': u'/mmundy@patricbrc.org/modelseed/226186.12',
 'rundate': u'2016-12-12T17:21:01',
 'source': u'ModelSEED',
 'source_id': u'226186.12',
 'template_ref': u'/chenry/public/modelsupport/templates/GramNegative.modeltemplate',
 'type': u'GenomeScale',
 'unintegrated_gapfills': 0}
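
To see what gap filling changed, you can compare the statistics returned before and after the step. This is a minimal sketch with a hypothetical helper, `stats_changes()`. The two dictionaries below copy only a few of the values shown in the outputs above. In practice you would pass the dictionaries returned by `reconstruct_modelseed_model()` and `gapfill_modelseed_model()`.

```python
def stats_changes(before, after):
    """Return {key: after - before} for integer statistics that changed."""
    changes = {}
    for key, old_value in before.items():
        new_value = after.get(key)
        both_ints = isinstance(old_value, int) and isinstance(new_value, int)
        if both_ints and new_value != old_value:
            changes[key] = new_value - old_value
    return changes


# Subset of the statistics shown above, before and after gap filling.
before = {'integrated_gapfills': 0, 'num_compounds': 1202,
          'num_genes': 739, 'num_reactions': 1034}
after = {'integrated_gapfills': 1, 'num_compounds': 1253,
         'num_genes': 739, 'num_reactions': 1129}

changes = stats_changes(before, after)
```

For this model, the comparison reports 95 more reactions, 51 more compounds, and one integrated gap fill solution.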

Next, run a simulation using the ModelSEED flux balance analysis algorithm with `optimize_modelseed_model()`. Use the `media_reference` parameter to specify a different medium for the simulation. This step is optional if you want to run the analysis in cobrapy.

In [5]:
modelseed.optimize_modelseed_model('226186.12')

99.9203

Finally, create a COBRA model from the ModelSEED model with `create_cobra_model_from_modelseed_model()`. Duplicate metabolites may be removed when the COBRA model is created. Now you can analyze the model using all of the functionality in cobrapy.

In [6]:
model = modelseed.create_cobra_model_from_modelseed_model('226186.12')
model.id

u'226186.12'

### Managing your ModelSEED models and workspace

There are more functions for managing your ModelSEED models and workspace. Get current statistics about a ModelSEED model with `get_modelseed_model_stats()`.

In [7]:
modelseed.get_modelseed_model_stats('226186.12')

{'fba_count': 0,
 'gapfilled_reactions': 0,
 'gene_associated_reactions': 1034,
 'genome_ref': u'/mmundy@patricbrc.org/modelseed/226186.12/genome',
 'id': u'226186.12',
 'integrated_gapfills': 1,
 'name': u'Bacteroides thetaiotaomicron VPI-5482',
 'num_biomass_compounds': 85,
 'num_biomasses': 1,
 'num_compartments': 2,
 'num_compounds': 1253,
 'num_genes': 739,
 'num_reactions': 1129,
 'ref': u'/mmundy@patricbrc.org/modelseed/226186.12',
 'rundate': u'2016-12-12T17:21:01',
 'source': u'ModelSEED',
 'source_id': u'226186.12',
 'template_ref': u'/chenry/public/modelsupport/templates/GramNegative.modeltemplate',
 'type': u'GenomeScale',
 'unintegrated_gapfills': 0}

Get the details of the ModelSEED gap fill solutions with `get_modelseed_gapfill_solutions()`. There can be multiple gap fill solutions for a model, and the returned list is sorted from newest to oldest.

In [8]:
gf_solutions = modelseed.get_modelseed_gapfill_solutions('226186.12')

Get the number of reactions in a gap fill solution by checking the length of the dictionary stored under the `reactions` key of a solution.

In [9]:
len(gf_solutions[0]['reactions'])

97

The value of the `reactions` key is a dictionary keyed by reaction ID with details on the reactions added to the model.

In [10]:
gf_solutions[0]['reactions']['rxn00737_c0']

{u'compartment': u'c0',
 u'direction': u'>',
 u'reaction': u'~/fbamodel/template/reactions/id/rxn00737_c'}
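
Because every entry in the `reactions` dictionary has a `direction` value, you can summarize a gap fill solution by direction. The sketch below uses a hypothetical helper, `count_directions()`, on a small sample dictionary. Only the `rxn00737_c0` entry comes from the output above. The other two entries, and the `<` direction symbol, are made up for illustration.

```python
from collections import Counter


def count_directions(reactions):
    """Count the reactions in a gap fill solution by their direction symbol."""
    return Counter(details['direction'] for details in reactions.values())


sample_reactions = {
    # Entry copied from the output above.
    'rxn00737_c0': {'compartment': 'c0', 'direction': '>',
                    'reaction': '~/fbamodel/template/reactions/id/rxn00737_c'},
    # Hypothetical entries for illustration only.
    'rxnA_c0': {'compartment': 'c0', 'direction': '<'},
    'rxnB_c0': {'compartment': 'c0', 'direction': '>'},
}

direction_counts = count_directions(sample_reactions)
```

With real data, you would call `count_directions(gf_solutions[0]['reactions'])`.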

Get the details of the ModelSEED flux balance analysis (FBA) solutions with `get_modelseed_fba_solutions()`. There can be multiple FBA solutions for a model, and the returned list is sorted from newest to oldest.

In [11]:
fba_solutions = modelseed.get_modelseed_fba_solutions('226186.12')

In an FBA solution, the value of the `exchanges` key is a dictionary, keyed by metabolite ID, of the metabolites that can be exchanged with the boundary. Metabolites with a positive flux are consumed and metabolites with a negative flux are produced.

In [12]:
fba_solutions[0]['exchanges']['cpd00001_e0']

{'lower_bound': -1000, 'upper_bound': 100, 'x': -830.615}
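
The sign convention described above can be applied to sort the exchanged metabolites into consumed and produced groups. This is a minimal sketch with a hypothetical helper, `split_exchanges()`. It assumes that the `x` value is the flux, and it treats fluxes within a small tolerance of zero as inactive. Only the `cpd00001_e0` entry comes from the output above. The other entries are made up for illustration.

```python
def split_exchanges(exchanges, tolerance=1e-9):
    """Split exchange fluxes into consumed (positive) and produced (negative)."""
    consumed = {}
    produced = {}
    for metabolite_id, details in exchanges.items():
        flux = details['x']
        if flux > tolerance:
            consumed[metabolite_id] = flux
        elif flux < -tolerance:
            produced[metabolite_id] = flux
    return consumed, produced


sample_exchanges = {
    # Entry copied from the output above.
    'cpd00001_e0': {'lower_bound': -1000, 'upper_bound': 100, 'x': -830.615},
    # Hypothetical entries for illustration only.
    'cpdX_e0': {'lower_bound': -1000, 'upper_bound': 100, 'x': 5.0},
    'cpdY_e0': {'lower_bound': -1000, 'upper_bound': 100, 'x': 0.0},
}

consumed, produced = split_exchanges(sample_exchanges)
```

With real data, you would call `split_exchanges(fba_solutions[0]['exchanges'])`.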

The value of the `reactions` key is a dictionary keyed by reaction ID with details on the bounds and flux for every reaction in the model.

In [13]:
fba_solutions[0]['reactions']['rxn00737_c0']

{'lower_bound': 0, 'upper_bound': 1000, 'x': 32.6722}

Get a list of all of the ModelSEED models stored in your ModelSEED workspace with `list_modelseed_models()`. Omit the `print_output` parameter to return a list of statistics for your models.

In [14]:
modelseed.list_modelseed_models(print_output=True)

Model /mmundy@patricbrc.org/modelseed/226186.12 for organism Bacteroides thetaiotaomicron VPI-5482 with 1129 reactions and 1253 metabolites
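
If you work with the returned statistics instead of the printed output, you can build a similar one-line summary yourself. The sketch below is not the package's own formatting code. `describe_model()` is a hypothetical helper that assumes each statistics dictionary has the `ref`, `name`, `num_reactions`, and `num_compounds` keys shown in the outputs above.

```python
def describe_model(stats):
    """Build a one-line summary from a model statistics dictionary."""
    template = ('Model {ref} for organism {name} with {num_reactions} '
                'reactions and {num_compounds} metabolites')
    # Keys in the dictionary that the template does not use are ignored.
    return template.format(**stats)


# Subset of the statistics shown above for the gap filled model.
sample_stats = {
    'id': '226186.12',
    'name': 'Bacteroides thetaiotaomicron VPI-5482',
    'num_compounds': 1253,
    'num_reactions': 1129,
    'ref': '/mmundy@patricbrc.org/modelseed/226186.12',
}

summary = describe_model(sample_stats)
```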


If you no longer need a ModelSEED model, delete it from your ModelSEED workspace with `delete_modelseed_model()`.

In [15]:
modelseed.delete_modelseed_model('226186.12')

Get a list of objects in a folder in a ModelSEED workspace with `list_workspace_objects()`. For example, get a list of all of the media available for gap filling with this command:

In [16]:
modelseed.list_workspace_objects('/chenry/public/modelsupport/media', print_output=True)

Contents of /chenry/public/modelsupport/media:
-rr chenry    	       605	2015-05-11T05:39:01	media       	/chenry/public/modelsupport/media/Sulfate-N-Acetyl-D-galactosamine
-rr chenry    	       590	2015-05-11T05:39:01	media       	/chenry/public/modelsupport/media/Sulfate-L-Arabitol
-rr chenry    	       584	2015-05-11T05:39:01	media       	/chenry/public/modelsupport/media/Carbon-tricarballylate
-rr chenry    	       584	2015-05-11T05:39:01	media       	/chenry/public/modelsupport/media/Sulfate-Cystathionine
-rr chenry    	       590	2015-05-11T05:39:01	media       	/chenry/public/modelsupport/media/Sulfate-Thymidine
-rr chenry    	       582	2015-05-11T05:39:02	media       	/chenry/public/modelsupport/media/Phosphate-O-Phospho-L-Serine
-rr chenry    	       583	2015-05-11T05:39:02	media       	/chenry/public/modelsupport/media/Sulfate-L-Methionine
-rr chenry    	       590	2015-05-11T05:39:02	media       	/chenry/public/modelsupport/media/Sulfate-D-Galactose
-rr chenry    	       58

Get the metadata for an object in the workspace with `get_workspace_object_meta()`. The metadata for an object can include additional information about the contents or attributes of the object.

In [17]:
modelseed.get_workspace_object_meta('/chenry/public/modelsupport/templates/GramNegative.modeltemplate')

[u'GramNegative.modeltemplate',
 u'modeltemplate',
 u'/chenry/public/modelsupport/templates/',
 u'2016-11-16T06:52:49',
 u'49A0E23A-ABC9-11E6-87C4-6AEA682E0674',
 u'chenry',
 26559014,
 {},
 {u'is_folder': 0},
 u'r',
 u'r',
 u'']
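
The metadata is returned as a plain list, so the meaning of each value depends on its position. One way to make the values easier to use is to wrap them in a named tuple. In the sketch below, the field names are assumptions inferred from the values in the output above. They are not confirmed by the ModelSEED package, so check them against the workspace service documentation before relying on them.

```python
from collections import namedtuple

# Field names are assumed from the positions in the output above.
ObjectMeta = namedtuple('ObjectMeta', [
    'name', 'type', 'path', 'creation_time', 'id', 'owner', 'size',
    'user_meta', 'auto_meta', 'user_permission', 'global_permission',
    'shock_url',
])

# Values copied from the output above.
raw_meta = [
    'GramNegative.modeltemplate',
    'modeltemplate',
    '/chenry/public/modelsupport/templates/',
    '2016-11-16T06:52:49',
    '49A0E23A-ABC9-11E6-87C4-6AEA682E0674',
    'chenry',
    26559014,
    {},
    {'is_folder': 0},
    'r',
    'r',
    '',
]

meta = ObjectMeta(*raw_meta)
full_reference = meta.path + meta.name
```

After wrapping, `meta.owner` and `meta.size` are more readable than `raw_meta[5]` and `raw_meta[6]`.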

### Working with PATRIC genomes

Get summary information for a PATRIC genome with `get_genome_summary()`. Note that the information available in the summary can differ between genomes, depending on the source of the genome.

In [18]:
modelseed.get_genome_summary('226186.12')

{u'_version_': 1552608979212828700,
 u'assembly_accession': u'GCA_000011065.1',
 u'bioproject_accession': u'PRJNA399',
 u'biosample_accession': u'SAMN02604314',
 u'brc1_cds': 0,
 u'cell_shape': u'Rod',
 u'chromosomes': 1,
 u'class': u'Bacteroidia',
 u'comments': [u'Bacteroides thetaiotaomicron strain VPI-5482. This is the type strain for this organism and was isolated from the feces of a healthy adult.'],
 u'common_name': u'Bacteroides_thetaiotaomicron_VPI-5482',
 u'completion_date': u'2003-03-29T00:00:00Z',
 u'contigs': 0,
 u'date_inserted': u'2014-12-08T22:10:24.729Z',
 u'date_modified': u'2015-03-16T03:17:09.594Z',
 u'disease': [u'Peritonitis'],
 u'document_type': u'genome',
 u'family': u'Bacteroidaceae',
 u'gc_content': 42.9,
 u'genbank_accessions': u'AE015928,AY171301',
 u'genome_id': u'226186.12',
 u'genome_length': 6293399,
 u'genome_name': u'Bacteroides thetaiotaomicron VPI-5482',
 u'genome_status': u'Complete',
 u'genus': u'Bacteroides',
 u'gram_stain': u'-',
 u'habitat': u'Ho

Get the features for an annotated genome with `get_genome_features()`. Both PATRIC and RefSeq annotations are available.

In [19]:
features = modelseed.get_genome_features('226186.12', annotation='PATRIC')
features[0]

{u'accession': u'NC_004663',
 u'alt_locus_tag': u'VBIBacThe70966_r045',
 u'annotation': u'PATRIC',
 u'date_inserted': u'2014-10-21T00:21:20.047Z',
 u'date_modified': u'2014-10-22T09:01:13.537Z',
 u'end': 3034136,
 u'feature_id': u'PATRIC.226186.12.NC_004663.rRNA.3031256.3034136.rev',
 u'feature_type': u'rRNA',
 u'gene_id': 0,
 u'genome_id': u'226186.12',
 u'genome_name': u'Bacteroides thetaiotaomicron VPI-5482',
 u'gi': 0,
 u'location': u'complement(3031256..3034136)',
 u'na_length': 2881,
 u'na_sequence': u'aagaaagtaagcaagggcgcatggcggatgccttggctctcggaggcgatgaaggacgtgataagctgcgataagctctgggtaggtgcaaataacctttgatccagagatttccgaatgggacaacccggcattctgaaggaatgtcatccatctttgatggaagctaacgcagggaactgaaacatcttagtacctgtaggaaaagaaaataataatgattcccctagtagtggcgagcgaacggggaatagcccaaaccacccatgttacggcatgtgtggggttgtaggaccacgatgtcgcaagacatttgatgagtagaatcctctggaaagttgaaccatagacggtgatagtccggtatacgaagtcaaattaagcgtagtggtatcctgagtagcgcgggacacgagaaatcttgcgtgaatctgccgggaccatccggtaaggctaaatactcccgagagaccgatagcgaaccag