## Playing with `eutils`
See [eutils on PyPI](https://pypi.org/project/eutils/).

The relative merits of [eutils](https://pypi.org/project/eutils/) and Biopython's [Bio.Entrez](https://biopython.org/DIST/docs/api/Bio.Entrez-module.html) are not yet clear to me.
One advantage of eutils seems to be that it caches requests.
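
To make the caching idea concrete, here is a minimal sketch of what "caching requests" means in general. It is not how eutils implements its cache; `CachedFetcher` and `fake_fetch` are hypothetical names made up for illustration, and no network request is made.

```python
import json


class CachedFetcher:
    """Toy request cache (hypothetical; not part of eutils)."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable(url, params) -> str
        self._cache = {}      # cache key -> response text
        self.misses = 0       # how many calls were passed on to fetch

    def get(self, url, params):
        # Build a stable key from the URL and the sorted parameters,
        # so the same query in a different argument order hits the cache.
        key = (url, json.dumps(params, sort_keys=True))
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._fetch(url, params)
        return self._cache[key]


def fake_fetch(url, params):
    # Placeholder for a real HTTP request; returns a dummy payload.
    return f"response for {url} {sorted(params.items())}"


fetcher = CachedFetcher(fake_fetch)
first = fetcher.get("efetch.fcgi", {"db": "gene", "id": 7124})
second = fetcher.get("efetch.fcgi", {"id": 7124, "db": "gene"})
print(first == second, fetcher.misses)
```

The second call is served from the dictionary, so only one call reaches `fake_fetch`. A persistent cache would store the responses on disk instead of in memory, which is what makes repeated notebook runs cheap.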

In [1]:
import os
import eutils

In [2]:
os.environ['NCBI_API_KEY'] = '1dfc9b8cead3b15249f3b59c933ab702b109'

In [3]:
print(os.environ['NCBI_API_KEY'])

1dfc9b8cead3b15249f3b59c933ab702b109
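
Printing the key in full leaves it in the notebook output. As a sketch of an alternative, the check below confirms that the variable is set while showing only the last few characters. The helper name `mask_key` is hypothetical and not part of eutils.

```python
import os


def mask_key(key, visible=4):
    """Return `key` with all but the last `visible` characters replaced by '*'."""
    if not key:
        return "<not set>"
    if len(key) <= visible:
        # Too short to reveal anything safely; mask everything.
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


# Shows "<not set>" when the environment variable is missing.
print(mask_key(os.environ.get("NCBI_API_KEY")))
```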


In [4]:
# Initialize a client. The client handles all caching and query
# throttling.
ec = eutils.Client(api_key=os.environ.get("NCBI_API_KEY", None), cache=True)

In [5]:
esr = ec.efetch(db='gene', id=7124)

In [6]:
[method for method in dir(esr.entrezgenes[0]) if '__' not in method]

['_root_tag',
 '_xml_root',
 'common_tax',
 'description',
 'gene_commentaries',
 'gene_id',
 'genus_species',
 'hgnc',
 'locus',
 'maploc',
 'references',
 'summary',
 'synonyms',
 'tax_id',
 'type']

In [7]:
eg = esr.entrezgenes[0]

In [8]:
eg.references

[<eutils.xmlfacades.genecommentary.GeneCommentary at 0x7efc3466aa90>,
 <eutils.xmlfacades.genecommentary.GeneCommentary at 0x7efc346cf358>,
 <eutils.xmlfacades.genecommentary.GeneCommentary at 0x7efc346cf3c8>,
 <eutils.xmlfacades.genecommentary.GeneCommentary at 0x7efc341b7358>,
 <eutils.xmlfacades.genecommentary.GeneCommentary at 0x7efc1ed03f28>,
 <eutils.xmlfacades.genecommentary.GeneCommentary at 0x7efc1ed03a58>,
 <eutils.xmlfacades.genecommentary.GeneCommentary at 0x7efc1ed03b38>,
 <eutils.xmlfacades.genecommentary.GeneCommentary at 0x7efc1ed039e8>,
 <eutils.xmlfacades.genecommentary.GeneCommentary at 0x7efc1ed03b70>]

In [9]:
# Fetch the TNF gene record (Gene ID 7124) again
egs = ec.efetch(db='gene', id=7124)

# One may fetch multiple genes at a time. These are returned as an
# EntrezgeneSet. We'll grab the first child, which returns
# an instance of the Entrezgene class.
eg = egs.entrezgenes[0]

# Easily access some basic information about the gene
eg.hgnc, eg.maploc, eg.description, eg.type, eg.genus_species

('TNF', '6p21.33', 'tumor necrosis factor', 'protein-coding', 'Homo sapiens')

In [10]:
# get a list of genomic references
sorted([(r.acv, r.label) for r in eg.references])

[('NC_000006.12', 'Chromosome 6 Reference GRCh38.p12 Primary Assembly'),
 ('NG_007462.1', 'RefSeqGene'),
 ('NT_113891.3', 'Chromosome 6 Reference GRCh38.p12 ALT_REF_LOCI_2'),
 ('NT_167244.2', 'Chromosome 6 Reference GRCh38.p12 ALT_REF_LOCI_1'),
 ('NT_167245.2', 'Chromosome 6 Reference GRCh38.p12 ALT_REF_LOCI_3'),
 ('NT_167246.2', 'Chromosome 6 Reference GRCh38.p12 ALT_REF_LOCI_4'),
 ('NT_167247.2', 'Chromosome 6 Reference GRCh38.p12 ALT_REF_LOCI_5'),
 ('NT_167248.2', 'Chromosome 6 Reference GRCh38.p12 ALT_REF_LOCI_6'),
 ('NT_167249.2', 'Chromosome 6 Reference GRCh38.p12 ALT_REF_LOCI_7')]
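
The `acv` strings above are RefSeq accession.version identifiers. The sketch below splits one into its parts and looks up what the two-letter prefix conventionally denotes. `split_acv` and the `REFSEQ_PREFIXES` table are hypothetical helpers, not part of eutils, and the table covers only the prefixes that appear in this notebook. The function assumes the identifier carries a `.version` suffix.

```python
# Conventional meaning of the RefSeq prefixes seen in this notebook.
REFSEQ_PREFIXES = {
    "NC": "chromosome / complete genomic molecule",
    "NG": "genomic region (e.g. RefSeqGene)",
    "NT": "genomic contig or scaffold",
    "NM": "mRNA",
    "NP": "protein",
}


def split_acv(acv):
    """Split 'NC_000006.12' into (accession, version, molecule type)."""
    # rpartition splits on the last '.', which separates the version.
    accession, _, version = acv.rpartition(".")
    prefix = accession.split("_", 1)[0]
    return accession, int(version), REFSEQ_PREFIXES.get(prefix, "unknown")


for acv in ["NC_000006.12", "NG_007462.1", "NT_113891.3"]:
    print(split_acv(acv))
```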

In [11]:
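# Get the first three products defined on GRCh38.
# Note: the commented-out sample output below appears to come from a
# different gene; these accessions are not TNF transcripts.
#>>> [p.acv for p in eg.references[0].products][:3]
#['NM_001126112.2', 'NM_001276761.1', 'NM_000546.5']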

# As a sample, grab the first product defined on this reference (order is arbitrary)
mrna = eg.references[0].products[0]
str(mrna)

'GeneCommentary(acv=NM_000594.3,type=mRNA,heading=Reference,label=None)'

In [12]:
# mrna.genomic_coords provides access to the exon definitions on this reference

mrna.genomic_coords.gi, mrna.genomic_coords.strand

('568815592', 1)

In [13]:
mrna.genomic_coords.intervals

[(31575566, 31575926),
 (31576533, 31576578),
 (31576766, 31576813),
 (31577115, 31578335)]
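
As a rough sketch of what can be done with these intervals, the code below derives exon and intron lengths from the four pairs printed above. It assumes the coordinates are 0-based with an inclusive end, which I have not verified against the eutils documentation; pass `inclusive_end=False` if the ends turn out to be exclusive. The function names are hypothetical, not part of eutils.

```python
# Exon intervals copied from the output above.
intervals = [
    (31575566, 31575926),
    (31576533, 31576578),
    (31576766, 31576813),
    (31577115, 31578335),
]


def exon_lengths(intervals, inclusive_end=True):
    """Length of each interval; adds 1 when the end coordinate is inclusive."""
    extra = 1 if inclusive_end else 0
    return [end - start + extra for start, end in intervals]


def intron_lengths(intervals, inclusive_end=True):
    """Size of the gaps between consecutive intervals, sorted by start."""
    ordered = sorted(intervals)
    extra = 1 if inclusive_end else 0
    return [
        next_start - prev_end - extra
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


print(exon_lengths(intervals))
print(intron_lengths(intervals))
print(sum(exon_lengths(intervals)))  # total exonic length
```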

In [14]:
# and the mrna has a product, the resulting protein:
str(mrna.products[0])

'GeneCommentary(acv=NP_000585.2,type=peptide,heading=Reference,label=None)'