## Working with PATRIC genomes

ModelSEED for cobrapy provides functions for working with PATRIC genomes.

In [1]:
import modelseed

Get summary information for a PATRIC genome with `get_genome_summary()`. Note that the fields available in the summary vary from genome to genome, depending on the source of the genome.

In [2]:
modelseed.get_genome_summary('226186.12')

{u'_version_': 1552608979212828700,
 u'assembly_accession': u'GCA_000011065.1',
 u'bioproject_accession': u'PRJNA399',
 u'biosample_accession': u'SAMN02604314',
 u'brc1_cds': 0,
 u'cell_shape': u'Rod',
 u'chromosomes': 1,
 u'class': u'Bacteroidia',
 u'comments': [u'Bacteroides thetaiotaomicron strain VPI-5482. This is the type strain for this organism and was isolated from the feces of a healthy adult.'],
 u'common_name': u'Bacteroides_thetaiotaomicron_VPI-5482',
 u'completion_date': u'2003-03-29T00:00:00Z',
 u'contigs': 0,
 u'date_inserted': u'2014-12-08T22:10:24.729Z',
 u'date_modified': u'2015-03-16T03:17:09.594Z',
 u'disease': [u'Peritonitis'],
 u'document_type': u'genome',
 u'family': u'Bacteroidaceae',
 u'gc_content': 42.9,
 u'genbank_accessions': u'AE015928,AY171301',
 u'genome_id': u'226186.12',
 u'genome_length': 6293399,
 u'genome_name': u'Bacteroides thetaiotaomicron VPI-5482',
 u'genome_status': u'Complete',
 u'genus': u'Bacteroides',
 u'gram_stain': u'-',
 u'habitat': u'Ho
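
Because the set of summary fields varies between genomes, code that reads a summary may want to tolerate missing keys. The following is a minimal sketch of one way to do that. It assumes a hand-written dictionary (a few fields copied from the output above) in place of a live call to `get_genome_summary()`. Both `describe_genome()` and the field name `isolation_country` are hypothetical, chosen only to illustrate a field that is absent, and are not part of ModelSEED for cobrapy.

```python
# Stand-in for the dictionary returned by get_genome_summary().
# The values are copied from the output shown above; no server call is made.
summary = {
    'genome_id': '226186.12',
    'genome_name': 'Bacteroides thetaiotaomicron VPI-5482',
    'genome_status': 'Complete',
    'gc_content': 42.9,
}


def describe_genome(summary, fields):
    """Hypothetical helper: return (field, value) pairs.

    Fields that are missing from the summary are reported as None
    instead of raising KeyError.
    """
    return [(name, summary.get(name)) for name in fields]


# 'isolation_country' is an assumed field name used to show a missing key.
report = describe_genome(summary, ['genome_name', 'gc_content', 'isolation_country'])
for name, value in report:
    print('{0}: {1}'.format(name, value))
```

Using `dict.get()` keeps the same code working across genomes from different sources, at the cost of having to handle `None` values later.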

Get the features for an annotated genome with `get_genome_features()`. Both PATRIC and RefSeq annotations are available.

In [3]:
features = modelseed.get_genome_features('226186.12', annotation='PATRIC')
len(features)

4965

Each item in the returned list describes one feature in detail, including its type and DNA sequence. If the feature is a coding sequence, the item also includes the amino acid sequence.

In [4]:
features[100]

{u'aa_length': 695,
 u'aa_sequence': u'MMIRKTLTILAVSCMMYSCGTKTESNPFFTEFQTEYGVPSFDKIKLEHYEPAFLKGIEEQNQNIQAIIASPEVPTFDNTIVALDSSAPILDRVSAIFFNMTDAETTDELTELSIKMAPVLSEHEDNISLNQELFKRVNVVYQQKDSMNLTTEQKRLLDKTYKGFVRSGANLDAEKQARLREINKELSTLGITFSNNILNENNAFQLFVDKKEDLAGLPEWFCQSAAEEAKAAGQPGKWLFTLHNASRLPFLQYAENRPLREKMYKAYINRGNNNDKNDNKETIRKIVSLRLEKARLLGFNNYANFVLDETMSKNDSNVMSLLNNLWSYALPKAKAEAAELQQLMDKEGKGEKLEAWDWWYYTEKLRKEKYNLSEEDTKPYFKLENVREGAFAVANKLYGITLNKLEGIPTYHPDVEVFEVKDADGSQLGIFYVDYFPRSGKSGGAWMSNYREQQGATRPLVCNVCSFTKPVGDTPSLLTMDEVETLFHEFGHALHGLLTKCEYKGTSGTNVVRDFVELPSQINEHWATEPEVLKMYAKHYQTGEVIPDEIIEKILKQKTFNQGFMTTELLAAAILDMNLHMITDVKNLDMLAFEKEAMDKLGLIPEIAPRYRVTYFNHIIGGYAAGYYSYLWANVLDNDAFEAFKEHGIFDKNTADLFRYNVLEKGDSEDPMILYKNFRGAEPSLEPLLKNRGMK',
 u'aa_sequence_md5': u'5e357f79ce4c35c27824cc81d38127c6',
 u'accession': u'NC_004663',
 u'alt_locus_tag': u'VBIBacThe70966_2881',
 u'annotation': u'PATRIC',
 u'date_inserted': u'2014-10-21T02:19:49.692Z',
 u'date_modified': u'2014-10-27T18:24:47.095Z',
 u'ec': [u
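
Since only coding sequences carry an amino acid sequence, the presence of the `aa_sequence` key is one possible way to pick them out of the feature list. The sketch below assumes that features without a protein product simply omit the key, which the output above does not confirm. It uses a small hand-written list instead of the real `features` list: the first entry is a shortened version of the feature shown above, and the second entry is an invented placeholder.

```python
# Hand-written stand-in for the list returned by get_genome_features().
# Entry 1 reuses keys from the output above with a shortened sequence;
# entry 2 is an invented placeholder for a feature with no protein product.
features = [
    {
        'alt_locus_tag': 'VBIBacThe70966_2881',
        'aa_sequence': 'MMIRKTLTIL',  # first 10 residues only, for brevity
        'aa_length': 10,
    },
    {
        'alt_locus_tag': 'placeholder_noncoding_feature',
    },
]

# Assumption: a feature is a coding sequence when it has an 'aa_sequence' key.
coding = [feature for feature in features if 'aa_sequence' in feature]

# Sanity check that the reported length matches the sequence itself.
consistent = all(
    len(feature['aa_sequence']) == feature['aa_length'] for feature in coding
)

print('{0} of {1} features have an amino acid sequence'.format(
    len(coding), len(features)))
```

With the real list, the same comprehension could be applied directly to `features` as returned by `get_genome_features()`.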