Preface: I used Python 3 (v 5.5.0) to write this project. Other software used is marked below in **boldface**.

# Biological Database: Metabolism

In this lab, I used **NCBI GenBank** and **SIB SwissProt ExPASy** (accessed via **Entrez**) to find gene and protein data for enzymes in several metabolic pathways. In particular, I found data for the following enzymes from three organisms: _Drosophila melanogaster_, _Escherichia coli_, and _Homo sapiens_.

Glycolysis: hexokinase, phosphofructokinase, enolase, pyruvate kinase

Tricarboxylic Acid Cycle (TCA): aconitase, succinate dehydrogenase, fumarase, citrate synthase

Pentose Phosphate Pathway (PPP): G6P dehydrogenase, 6-phosphogluconate dehydrogenase, ribulose-phosphate 3-epimerase, ribose phosphate isomerase


For gene data, I found **NCBI GenBank** to be the most comprehensive; for protein data, **ExPASy**. I also referenced **KEGG PATHWAY** metabolic pathway charts. I chose these databases because they gave the most complete records for the enzymes above and were easy to read.

## 1. Gene Table

In [1]:
# I used Entrez to find gene data for the metabolic enzymes listed above. A sample (Homo sapiens hexokinase) is provided below. In particular, I looked for "CDS" data and looked up each enzyme's GeneID to confirm sequences.

from Bio import Entrez
Entrez.email = 'michael193@berkeley.edu'
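# Search the nucleotide database and return accession identifiers
handle = Entrez.esearch(db='nucleotide',
                        term='homo sapiens[ORGN] Hexokinase',
                        sort='relevance',
                        idtype='acc')
search_results = Entrez.read(handle)
handle.close()

# Fetch each hit as a GenBank flat-file record and print it
for i in search_results['IdList']:
    handle = Entrez.efetch(db='nucleotide',id=i,rettype='gb',retmode='text')
    print(handle.read())
    handle.close()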

LOCUS       AH005851                1676 bp    DNA     linear   PRI 10-JUN-2016
DEFINITION  Homo sapiens chromosome 10 hexokinase (HK) gene, partial cds.
ACCESSION   AH005851 AF029305 AF029306
VERSION     AH005851.2
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1676)
  AUTHORS   Murakami,K. and Piomelli,S.
  TITLE     The erythrocyte-specific hexokinase isozyme (HKR) and the common
            hexokinase isozyme (HKI) are produced from a single gene by
            alternate promoters
  JOURNAL   Blood 90 (10), 272a (1998)
REFERENCE   2  (bases 1 to 1676)
  AUTHORS   Murakami,K. and Piomelli,S.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-OCT-1997) Pediatrics, Columbia University, 630 West
            168th Street, New York, NY 10032, 

LOCUS       NM_001322365            4000 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 7, mRNA.
ACCESSION   NM_001322365
VERSION     NM_001322365.1
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 4000)
  AUTHORS   Liu X, Salokas K, Tamene F, Jiu Y, Weldatsadik RG, Ohman T and
            Varjosalo M.
  TITLE     An AP-MS- and BioID-compatible MAC-tag enables comprehensive
            mapping of protein interactions and subcellular localizations
  JOURNAL   Nat Commun 9 (1), 1188 (2018)
   PUBMED   29568061
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 4000)
  AUTHORS   Zhou Y, Ding BZ, Lin YP and Wang HB.
  TITLE     MiR-34a, as a suppressor, enhance the suscep

LOCUS       NM_001322366            3493 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 8, mRNA.
ACCESSION   NM_001322366
VERSION     NM_001322366.1
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3493)
  AUTHORS   Liu X, Salokas K, Tamene F, Jiu Y, Weldatsadik RG, Ohman T and
            Varjosalo M.
  TITLE     An AP-MS- and BioID-compatible MAC-tag enables comprehensive
            mapping of protein interactions and subcellular localizations
  JOURNAL   Nat Commun 9 (1), 1188 (2018)
   PUBMED   29568061
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 3493)
  AUTHORS   Zhou Y, Ding BZ, Lin YP and Wang HB.
  TITLE     MiR-34a, as a suppressor, enhance the suscep

LOCUS       NM_000188               3617 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 1, mRNA.
ACCESSION   NM_000188
VERSION     NM_000188.2
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3617)
  AUTHORS   Liu X, Salokas K, Tamene F, Jiu Y, Weldatsadik RG, Ohman T and
            Varjosalo M.
  TITLE     An AP-MS- and BioID-compatible MAC-tag enables comprehensive
            mapping of protein interactions and subcellular localizations
  JOURNAL   Nat Commun 9 (1), 1188 (2018)
   PUBMED   29568061
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 3617)
  AUTHORS   Zhou Y, Ding BZ, Lin YP and Wang HB.
  TITLE     MiR-34a, as a suppressor, enhance the susceptibili

LOCUS       NM_033500               3979 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 5, mRNA.
ACCESSION   NM_033500
VERSION     NM_033500.2
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3979)
  AUTHORS   Liu X, Salokas K, Tamene F, Jiu Y, Weldatsadik RG, Ohman T and
            Varjosalo M.
  TITLE     An AP-MS- and BioID-compatible MAC-tag enables comprehensive
            mapping of protein interactions and subcellular localizations
  JOURNAL   Nat Commun 9 (1), 1188 (2018)
   PUBMED   29568061
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 3979)
  AUTHORS   Zhou Y, Ding BZ, Lin YP and Wang HB.
  TITLE     MiR-34a, as a suppressor, enhance the susceptibili

LOCUS       NM_033496               3614 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 2, mRNA.
ACCESSION   NM_033496
VERSION     NM_033496.2
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3614)
  AUTHORS   Liu X, Salokas K, Tamene F, Jiu Y, Weldatsadik RG, Ohman T and
            Varjosalo M.
  TITLE     An AP-MS- and BioID-compatible MAC-tag enables comprehensive
            mapping of protein interactions and subcellular localizations
  JOURNAL   Nat Commun 9 (1), 1188 (2018)
   PUBMED   29568061
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 3614)
  AUTHORS   Zhou Y, Ding BZ, Lin YP and Wang HB.
  TITLE     MiR-34a, as a suppressor, enhance the susceptibili

LOCUS       NM_001358263            3998 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 10, mRNA.
ACCESSION   NM_001358263
VERSION     NM_001358263.1
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3998)
  AUTHORS   Liu X, Salokas K, Tamene F, Jiu Y, Weldatsadik RG, Ohman T and
            Varjosalo M.
  TITLE     An AP-MS- and BioID-compatible MAC-tag enables comprehensive
            mapping of protein interactions and subcellular localizations
  JOURNAL   Nat Commun 9 (1), 1188 (2018)
   PUBMED   29568061
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 3998)
  AUTHORS   Zhou Y, Ding BZ, Lin YP and Wang HB.
  TITLE     MiR-34a, as a suppressor, enhance the susce

LOCUS       NM_001322364            3695 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 6, mRNA.
ACCESSION   NM_001322364 XM_005269736
VERSION     NM_001322364.1
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3695)
  AUTHORS   Liu X, Salokas K, Tamene F, Jiu Y, Weldatsadik RG, Ohman T and
            Varjosalo M.
  TITLE     An AP-MS- and BioID-compatible MAC-tag enables comprehensive
            mapping of protein interactions and subcellular localizations
  JOURNAL   Nat Commun 9 (1), 1188 (2018)
   PUBMED   29568061
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 3695)
  AUTHORS   Zhou Y, Ding BZ, Lin YP and Wang HB.
  TITLE     MiR-34a, as a suppressor, enhan

LOCUS       NM_033497               3832 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 3, mRNA.
ACCESSION   NM_033497
VERSION     NM_033497.2
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3832)
  AUTHORS   Liu X, Salokas K, Tamene F, Jiu Y, Weldatsadik RG, Ohman T and
            Varjosalo M.
  TITLE     An AP-MS- and BioID-compatible MAC-tag enables comprehensive
            mapping of protein interactions and subcellular localizations
  JOURNAL   Nat Commun 9 (1), 1188 (2018)
   PUBMED   29568061
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 3832)
  AUTHORS   Zhou Y, Ding BZ, Lin YP and Wang HB.
  TITLE     MiR-34a, as a suppressor, enhance the susceptibili

LOCUS       NM_033498               3886 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 4, mRNA.
ACCESSION   NM_033498
VERSION     NM_033498.2
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3886)
  AUTHORS   Liu X, Salokas K, Tamene F, Jiu Y, Weldatsadik RG, Ohman T and
            Varjosalo M.
  TITLE     An AP-MS- and BioID-compatible MAC-tag enables comprehensive
            mapping of protein interactions and subcellular localizations
  JOURNAL   Nat Commun 9 (1), 1188 (2018)
   PUBMED   29568061
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 3886)
  AUTHORS   Zhou Y, Ding BZ, Lin YP and Wang HB.
  TITLE     MiR-34a, as a suppressor, enhance the susceptibili

LOCUS       NC_000010          133797422 bp    DNA     linear   CON 26-MAR-2018
DEFINITION  Homo sapiens chromosome 10, GRCh38.p12 Primary Assembly.
ACCESSION   NC_000010
VERSION     NC_000010.11
DBLINK      BioProject: PRJNA168
            Assembly: GCF_000001405.38
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 133797422)
  CONSRTM   International Human Genome Sequencing Consortium
  TITLE     Finishing the euchromatic sequence of the human genome
  JOURNAL   Nature 431 (7011), 931-945 (2004)
   PUBMED   15496913
REFERENCE   2  (bases 1 to 133797422)
  AUTHORS   Deloukas,P., Earthrowl,M.E., Grafham,D.V., Rubenfield,M.,
            French,L., Steward,C.A., Sims,S.K., Jones,M.C., Searle,S.,
            Scott,C., Howe,K., Hunt,S.E., Andrews,

LOCUS       XM_011539732            3871 bp    mRNA    linear   PRI 26-MAR-2018
DEFINITION  PREDICTED: Homo sapiens hexokinase 1 (HK1), transcript variant X2,
            mRNA.
ACCESSION   XM_011539732
VERSION     XM_011539732.1
DBLINK      BioProject: PRJNA168
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
COMMENT     MODEL REFSEQ:  This record is predicted by automated computational
            analysis. This record is derived from a genomic sequence
            (NC_000010.11) annotated using gene prediction method: Gnomon,
            supported by mRNA and EST evidence.
            Also see:
                Documentation of NCBI's Annotation Process
            
            ##Genome-Annotation-Data-START##
            Annotation Provider         :: NCBI
          

LOCUS       XM_005269737            3873 bp    mRNA    linear   PRI 26-MAR-2018
DEFINITION  PREDICTED: Homo sapiens hexokinase 1 (HK1), transcript variant X3,
            mRNA.
ACCESSION   XM_005269737
VERSION     XM_005269737.1
DBLINK      BioProject: PRJNA168
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
COMMENT     MODEL REFSEQ:  This record is predicted by automated computational
            analysis. This record is derived from a genomic sequence
            (NC_000010.11) annotated using gene prediction method: Gnomon,
            supported by mRNA and EST evidence.
            Also see:
                Documentation of NCBI's Annotation Process
            
            ##Genome-Annotation-Data-START##
            Annotation Provider         :: NCBI
          

LOCUS       NG_012077             138883 bp    DNA     linear   PRI 22-JUL-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), RefSeqGene (LRG_365) on chromosome
            10.
ACCESSION   NG_012077
VERSION     NG_012077.1
KEYWORDS    RefSeq; RefSeqGene.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 138883)
  AUTHORS   Andreoni F, Ruzzo A and Magnani M.
  TITLE     Structure of the 5' region of the human hexokinase type I (HKI)
            gene and identification of an additional testis-specific HKI mRNA
  JOURNAL   Biochim. Biophys. Acta 1493 (1-2), 19-26 (2000)
   PUBMED   10978502
REFERENCE   2  (bases 1 to 138883)
  AUTHORS   Bird,T.D.
  TITLE     Charcot-Marie-Tooth Neuropathy Type 4
  JOURNAL   (in) Adam MP, Ardinger HH, Pagon RA, Wallace SE, Bean LJH, S

LOCUS       XM_024447969            4061 bp    mRNA    linear   PRI 26-MAR-2018
DEFINITION  PREDICTED: Homo sapiens hexokinase 1 (HK1), transcript variant X1,
            mRNA.
ACCESSION   XM_024447969
VERSION     XM_024447969.1
DBLINK      BioProject: PRJNA168
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
COMMENT     MODEL REFSEQ:  This record is predicted by automated computational
            analysis. This record is derived from a genomic sequence
            (NC_000010.11) annotated using gene prediction method: Gnomon,
            supported by mRNA and EST evidence.
            Also see:
                Documentation of NCBI's Annotation Process
            
            ##Genome-Annotation-Data-START##
            Annotation Provider         :: NCBI
          

LOCUS       AH006101               18171 bp    DNA     linear   PRI 28-JUL-2016
DEFINITION  Homo sapiens clone YAC 908_E_9 hexokinase 1 isoform tb (HK1),
            hexokinase 1 isoform tc (HK1), and hexokinase 1 isoform ta (HK1)
            genes, partial sequence; and hexokinase (HK1) gene, complete cds,
            alternatively spliced.
ACCESSION   AH006101 AF000431 AF016349-AF016365 AF163908-AF163913
VERSION     AH006101.4
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 18171)
  AUTHORS   Ruzzo,A., Andreoni,F. and Magnani,M.
  TITLE     Structure of the human hexokinase type I gene and nucleotide
            sequence of the 5' flanking region
  JOURNAL   Biochem. J. 331 (Pt 2), 607-613 (1998)
   PUBMED   9531504
REFERENCE   2  (bases 1 to 57

LOCUS       NM_000189               7109 bp    mRNA    linear   PRI 16-SEP-2018
DEFINITION  Homo sapiens hexokinase 2 (HK2), mRNA.
ACCESSION   NM_000189
VERSION     NM_000189.4
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 7109)
  AUTHORS   Liu G, Zhang CD, Wang J and Jia WC.
  TITLE     Inhibition of the oxidative stress-induced miR-125b protects
            glucose metabolic disorders of human retinal pigment epithelium
            (RPE) cells
  JOURNAL   Cell. Mol. Biol. (Noisy-le-grand) 64 (4), 1-5 (2018)
   PUBMED   29631677
  REMARK    GeneRIF: Overexpression of miR-125b inhibits cellular glucose
            metabolism through direct targeting of hexokinase 2.
            Publication Status: Online-Only
REFERENCE   2  (bases 1 to 710

LOCUS       CH471083            28940744 bp    DNA     linear   CON 23-MAR-2015
DEFINITION  Homo sapiens 211000035830871 genomic scaffold, whole genome shotgun
            sequence.
ACCESSION   CH471083 AADB02000000
VERSION     CH471083.1
DBLINK      BioProject: PRJNA1431
            BioSample: SAMN02981219
KEYWORDS    WGS.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 28940744)
  AUTHORS   Venter,J.C., Adams,M.D., Myers,E.W., Li,P.W., Mural,R.J.,
            Sutton,G.G., Smith,H.O., Yandell,M., Evans,C.A., Holt,R.A.,
            Gocayne,J.D., Amanatides,P., Ballew,R.M., Huson,D.H., Wortman,J.R.,
            Zhang,Q., Kodira,C.D., Zheng,X.H., Chen,L., Skupski,M.,
            Subramanian,G., Thomas,P.D., Zhang,J., Gabor Miklos,G.L.,
            Nelson,C., Brod

LOCUS       NM_025130               3719 bp    mRNA    linear   PRI 24-JUN-2018
DEFINITION  Homo sapiens hexokinase domain containing 1 (HKDC1), mRNA.
ACCESSION   NM_025130
VERSION     NM_025130.3
KEYWORDS    RefSeq.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3719)
  AUTHORS   Evstafieva AG, Kovaleva IE, Shoshinova MS, Budanov AV and Chumakov
            PM.
  TITLE     Implication of KRT16, FAM129A and HKDC1 genes as ATF4 regulated
            components of the integrated stress response
  JOURNAL   PLoS ONE 13 (2), e0191107 (2018)
   PUBMED   29420561
  REMARK    GeneRIF: results suggest a conditional regulation of KRT16 gene by
            ATF4 that may be inhibited in normal cells, but engaged during
            cancer progression. Potential roles of K
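
The printed records above are long, so a compact summary can help when scanning the hits. The following is a minimal sketch, not part of the original lab code: it uses only the standard library to pull the accession, length, and molecule type out of each `LOCUS` line. The regular expression and the helper name `summarize_locus_lines` are my own assumptions, written against the `LOCUS` layout visible in the output above; other GenBank records may format this line differently.

```python
import re

# Pattern for the LOCUS line layout seen in the output above:
# name, length in bp, molecule type, topology, division, date.
LOCUS_RE = re.compile(
    r"^LOCUS\s+(?P<name>\S+)\s+(?P<length>\d+) bp\s+(?P<mol>\S+)\s+"
    r"(?P<topology>\S+)\s+(?P<division>\S+)\s+(?P<date>\S+)\s*$"
)


def summarize_locus_lines(text):
    """Return one dict per LOCUS line found in GenBank flat-file text."""
    records = []
    for line in text.splitlines():
        match = LOCUS_RE.match(line)
        if match:
            info = match.groupdict()
            info["length"] = int(info["length"])
            records.append(info)
    return records


# A few lines copied from the output above, used as sample input.
sample = """\
LOCUS       AH005851                1676 bp    DNA     linear   PRI 10-JUN-2016
DEFINITION  Homo sapiens chromosome 10 hexokinase (HK) gene, partial cds.
LOCUS       NM_000188               3617 bp    mRNA    linear   PRI 17-JUN-2018
DEFINITION  Homo sapiens hexokinase 1 (HK1), transcript variant 1, mRNA.
LOCUS       NC_000010          133797422 bp    DNA     linear   CON 26-MAR-2018
"""

summary = summarize_locus_lines(sample)

# Keep only mRNA records, e.g. to skip whole-chromosome entries.
mrna_only = [r["name"] for r in summary if r["mol"] == "mRNA"]
print(mrna_only)  # ['NM_000188']
```

Filtering on molecule type or length in this way is one possible route to setting aside hits such as the full chromosome 10 record (NC_000010) before reading the rest by hand.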

In [2]:
# I created a database (my.db) to store all the data I found through NCBI GenBank. The following code manually adds each data entry into the database table "genes".

import sqlite3
conn = sqlite3.connect('my.db')
c = conn.cursor()

c.execute("""CREATE TABLE genes (id INT PRIMARY KEY ASC, 
                                name TEXT,
                                species TEXT,
                                description TEXT, 
                                chromosome TEXT, 
                                start INT, 
                                end INT, 
                                strand VARCHAR(1));""")

# Glycolysis

    # hexokinase/glucokinase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (3098,'HK1','Homo sapiens','hexokinase 1','10',69269991,69401882,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (45875,'Hex-A','Drosophila melanogaster','hexokinase A','X',9585675,9589813,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (946858,'Glk','Escherichia coli','glucokinase','N/A',2508461,2509426,'-');""")

    # phosphofructokinase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (5213,'PFKM','Homo sapiens','phosphofructokinase, muscle','12',48105253,48146404,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (36060,'Pfk','Drosophila melanogaster','phosphofructokinase','2R',10109740,10117457,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (946230,'pfkB','Escherichia coli','6-phosphofructokinase II','N/A',1806370,1807299,'-');""")

    # enolase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (2026,'ENO2','Homo sapiens','enolase 2','12',6914450,6923696,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (33351,'Eno','Drosophila melanogaster','enolase','2L',1724768,1729636,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (945032,'eno','Escherichia coli','enolase','N/A',2906643,2907941,'-');""")

    # pyruvate kinase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (5313,'PKLR','Homo sapiens','pyruvate kinase L/R','1',155289293,155301434,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (42620,'PyK','Drosophila melanogaster','pyruvate kinase','3R',22367536,22372746,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (946179,'pykF','Escherichia coli','pyruvate kinase I','N/A',1755698,1757110,'-');""")

# Tricarboxylic acid cycle

    # aconitase/aconitate hydratase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (50,'ACO2','Homo sapiens','aconitase 2','22',41468756,41528989,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (44149,'Acon','Drosophila melanogaster','aconitase','2L',21168570,21173185,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (946724,'acnA','Escherichia coli','aconitate hydratase 1','N/A',1335831,1338506,'-');""")

    # succinate dehydrogenase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (6389,'SDHA','Homo sapiens','succinate dehydrogenase complex flavoprotein subunit A','5',218223,264816,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (37228,'SdhA','Drosophila melanogaster','succinate dehydrogenase, subunit A (flavoprotein)','2R',19391863,19395971,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (945402,'sdhA','Escherichia coli','succinate dehydrogenase, flavoprotein subunit','N/A',755907,757673,'-');""")

    # fumarase/fumarate hydratase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (2271,'FH','Homo sapiens','fumarate hydratase','1',241497557,241519785,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (31606,'CG4095','Drosophila melanogaster','CG4095','X',6681767,6683696,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (946826,'fumA','Escherichia coli','fumarate hydratase (fumarase A), aerobic Class I','N/A',1686731,1688377,'-');""")

    # citrate synthase
    
c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (1431,'CS','Homo sapiens','citrate synthase','12',56271699,56300391,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (36760,'kdn','Drosophila melanogaster','knockdown','X',6350629,6361323,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (945323,'gltA','Escherichia coli','citrate synthase','N/A',753185,754468,'-');""")
    
# Pentose phosphate pathway

    # G6P dehydrogenase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (9563,'H6PD','Homo sapiens','hexose-6-phosphate dehydrogenase','1',9234767,9271337,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (32974,'Zw','Drosophila melanogaster','zwischenferment','X',19667252,19672353,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (946370,'zwf','Escherichia coli','glucose-6-phosphate 1-dehydrogenase','N/A',1934839,1936314,'-');""")

    # 6-phosphogluconate dehydrogenase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (5226,'PGD','Homo sapiens','phosphogluconate dehydrogenase','1',10398992,10420511,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (31185,'Pgd','Drosophila melanogaster','phosphogluconate dehydrogenase','X',2145136,2148566,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (946554,'gnd','Escherichia coli','6-phosphogluconate dehydrogenase, decarboxylating','N/A',2099862,2101268,'-');""")
    
    # ribulose phosphate 3-epimerase
    
c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (6120,'RPE','Homo sapiens','ribulose-5-phosphate-3-epimerase','2',210002565,210023363,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (35692,'CG30499','Drosophila melanogaster','CG30499','2R',7564434,7565751,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (947896,'rpe','Escherichia coli','D-ribulose-5-phosphate 3-epimerase','N/A',3514382,3515059,'-');""")
    
    # ribose phosphate isomerase

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (22934,'RPIA','Homo sapiens','ribose 5-phosphate isomerase A','2',88691658,88750935,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (246599,'Rpi','Drosophila melanogaster','ribose-5-phosphate isomerase','2R',23355183,23356441,'-');""")

c.execute("""INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
                        VALUES (947407,'rpiA','Escherichia coli','ribose 5-phosphate isomerase, constitutive','N/A',3058666,3059325,'-');""")
    
conn.commit()

c.execute("SELECT * FROM genes;")
print(c.fetchall())

[(3098, 'HK1', 'Homo sapiens', 'hexokinase 1', '10', 69269991, 69401882, '-'), (45875, 'Hex-A', 'Drosophila melanogaster', 'hexokinase A', 'X', 9585675, 9589813, '-'), (946858, 'Glk', 'Escherichia coli', 'glucokinase', 'N/A', 2508461, 2509426, '-'), (5213, 'PFKM', 'Homo sapiens', 'phosphofructokinase, muscle', '12', 48105253, 48146404, '-'), (36060, 'Pfk', 'Drosophila melanogaster', 'phosphofructokinase', '2R', 10109740, 10117457, '-'), (946230, 'pfkB', 'Escherichia coli', '6-phosphofructokinase II', 'N/A', 1806370, 1807299, '-'), (2026, 'ENO2', 'Homo sapiens', 'enolase 2', '12', 6914450, 6923696, '-'), (33351, 'Eno', 'Drosophila melanogaster', 'enolase', '2L', 1724768, 1729636, '-'), (945032, 'eno', 'Escherichia coli', 'enolase', 'N/A', 2906643, 2907941, '-'), (5313, 'PKLR', 'Homo sapiens', 'pyruvate kinase L/R', '1', 155289293, 155301434, '-'), (42620, 'PyK', 'Drosophila melanogaster', 'pyruvate kinase', '3R', 22367536, 22372746, '-'), (946179, 'pykF', 'Escherichia coli', 'pyruvate k
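
The repeated `INSERT` statements above could also be written as a single parameterized call. The sketch below is an alternative illustration, not the code used for this lab: it assumes an in-memory database (so `my.db` is left untouched), reuses the same `genes` schema, and loads only the three hexokinase/glucokinase rows from above through `executemany` with `?` placeholders.

```python
import sqlite3

# In-memory database so this sketch does not modify my.db.
conn = sqlite3.connect(":memory:")
c = conn.cursor()

# Same schema as the "genes" table above.
c.execute("""CREATE TABLE genes (id INT PRIMARY KEY ASC,
                                name TEXT,
                                species TEXT,
                                description TEXT,
                                chromosome TEXT,
                                start INT,
                                end INT,
                                strand VARCHAR(1));""")

# Rows copied from the hexokinase/glucokinase entries above.
rows = [
    (3098, "HK1", "Homo sapiens", "hexokinase 1", "10", 69269991, 69401882, "-"),
    (45875, "Hex-A", "Drosophila melanogaster", "hexokinase A", "X", 9585675, 9589813, "-"),
    (946858, "Glk", "Escherichia coli", "glucokinase", "N/A", 2508461, 2509426, "-"),
]

# One statement, many rows: each tuple fills the ? placeholders in order.
c.executemany(
    """INSERT INTO genes (id,name,species,description,chromosome,start,end,strand)
       VALUES (?,?,?,?,?,?,?,?);""",
    rows,
)
conn.commit()

# Placeholders work for queries too, which avoids hand-built SQL strings.
c.execute("SELECT id, name, chromosome FROM genes WHERE species = ?;",
          ("Homo sapiens",))
human_genes = c.fetchall()
print(human_genes)  # [(3098, 'HK1', '10')]
conn.close()
```

Keeping the data in a Python list separates the records from the SQL, so adding a gene means adding one tuple rather than another full statement.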

## 2. Pathway Table

In [3]:
# The following code adds another table, "pathways", to the database. Only 12 enzymes are included for brevity (isoforms of enzymes have different names but all operate in their respective pathways).

conn = sqlite3.connect('my.db')
c = conn.cursor()

c.execute("""CREATE TABLE pathways (name TEXT,
                                description TEXT, 
                                strand VARCHAR(1));""")

# Glycolysis

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('hexokinase','glycolysis','-');""")

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('phosphofructokinase','glycolysis','-');""")

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('enolase','glycolysis','-');""")

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('pyruvate kinase','glycolysis','-');""")

# Tricarboxylic acid cycle

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('aconitase','TCA','-');""")

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('succinate dehydrogenase','TCA','-');""")

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('fumarase','TCA','-');""")

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('citrate synthase','TCA','-');""")

# Pentose phosphate pathway

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('G6P dehydrogenase','PPP','-');""")

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('6-phosphogluconate dehydrogenase','PPP','-');""")

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('ribulose-phosphate 3-epimerase','PPP','-');""")

c.execute("""INSERT INTO pathways (name,description,strand)
                        VALUES ('ribose phosphate isomerase','PPP','-');""")

conn.commit()

c.execute("SELECT * FROM pathways;")
print(c.fetchall())

[('hexokinase', 'glycolysis', '-'), ('phosphofructokinase', 'glycolysis', '-'), ('enolase', 'glycolysis', '-'), ('pyruvate kinase', 'glycolysis', '-'), ('aconitase', 'TCA', '-'), ('succinate dehydrogenase', 'TCA', '-'), ('fumarase', 'TCA', '-'), ('citrate synthase', 'TCA', '-'), ('G6P dehydrogenase', 'PPP', '-'), ('6-phosphogluconate dehydrogenase', 'PPP', '-'), ('ribulose-phosphate 3-epimerase', 'PPP', '-'), ('ribose phosphate isomerase', 'PPP', '-')]


## 3. Enzyme Table

In [4]:
# The following code adds a third table, "enzymes", to the database. Only 12 enzymes are listed, as multiple genes encode the same enzyme (though as different isoforms). All data are from SIB ExPASy.

conn = sqlite3.connect('my.db')
c = conn.cursor()

c.execute("""CREATE TABLE enzymes (name TEXT, 
                                function TEXT, 
                                commission TEXT,
                                strand VARCHAR(1));""")

# Glycolysis

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('hexokinase','transfers phosphate group from ATP to glucose','2.7.1.1','-');""")

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('ADP-specific phosphofructokinase','transfers phosphate group from ADP to D-fructose-6-phosphate','2.7.1.146','-');""")

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('phosphopyruvate hydratase','converts 2-phospho-D-glycerate to phosphoenolpyruvate','4.2.1.11','-');""")

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('pyruvate kinase','transfers phosphate group from phosphoenolpyruvate to ADP',
                        '2.7.1.40','-');""")

# Tricarboxylic acid cycle

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('aconitate hydratase','isomerizes citrate to isocitrate','4.2.1.3','-');""")

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('succinate dehydrogenase (quinone)','converts succinate to fumarate','1.3.5.1','-');""")

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('fumarate hydratase','converts fumarate to malate','4.2.1.2','-');""")

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('citrate synthase','reacts acetyl-CoA and oxaloacetate to form citrate and CoA','2.3.3.16','-');""")

# Pentose phosphate pathway

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('glucose-6-phosphate dehydrogenase','oxidizes G6P with NADP+ to form 6-phosphogluconate and NADPH','1.1.1.49','-');""")

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('phosphogluconate 2-dehydrogenase','oxidizes 6-phosphogluconate with NADP+ to form 6-phospho-2-dehydro-gluconate','1.1.1.43','-');""")

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('ribulose-phosphate 3-epimerase','converts ribulose 5-phosphate to xylulose 5-phosphate','5.1.3.1','-');""")

c.execute("""INSERT INTO enzymes (name,function,commission,strand)
                        VALUES ('ribose-5-phosphate isomerase','isomerization between ribulose 5-phosphate and ribose 5-phosphate','5.3.1.6','-');""")

conn.commit()

c.execute("SELECT * FROM enzymes;")
print(c.fetchall())

[('hexokinase', 'transfers phosphate group from ATP to glucose', '2.7.1.1', '-'), ('ADP-specific phosphofructokinase', 'transfers phosphate group from ADP to D-fructose-6-phosphate', '2.7.1.146', '-'), ('phosphopyruvate hydratase', 'converts 2-phospho-D-glycerate to phosphoenolpyruvate', '4.2.1.11', '-'), ('pyruvate kinase', 'transfers phosphate group from phosphoenolpyruvate to ADP', '2.7.1.40', '-'), ('aconitate hydratase', 'isomerizes citrate to isocitrate', '4.2.1.3', '-'), ('succinate dehydrogenase (quinone)', 'converts succinate to fumarate', '1.3.5.1', '-'), ('fumarate hydratase', 'converts fumarate to malate', '4.2.1.2', '-'), ('citrate synthase', 'reacts acetyl-CoA and oxaloacetate to form citrate and CoA', '2.3.3.16', '-'), ('glucose-6-phosphate dehydrogenase', 'oxidizes G6P with NADP+ to form 6-phosphogluconate and NADPH', '1.1.1.49', '-'), ('phosphogluconate 2-dehydrogenase', 'oxidizes 6-phosphogluconate with NADP+ to form 6-phospho-2-dehydro-gluconate', '1.1.1.43', '-'), (

## 4. Associative "Tables"

Rather than encode tables that associate the pathway enzymes with one another, I describe their relationships in prose:

There is significant overlap between glycolysis, the TCA cycle, and the PPP. Of the enzymes I added to the database, _hexokinase_ is the notable example: it phosphorylates glucose to glucose-6-phosphate, which feeds both glycolysis and the PPP. An enzyme that belongs to multiple pathways has a _one-to-many_ relationship with the pathway table.
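
A minimal sketch of how this enzyme-to-pathway association could be stored, should it ever be encoded. The table name "enzyme_pathway" and the in-memory database are assumptions for illustration only; they are not part of my.db.

```python
import sqlite3

# In-memory database so the sketch does not touch my.db
conn = sqlite3.connect(':memory:')
c = conn.cursor()

# Hypothetical link table: one row per (enzyme, pathway) pair
c.execute("""CREATE TABLE enzyme_pathway (enzyme TEXT,
                                          pathway TEXT,
                                          PRIMARY KEY (enzyme, pathway));""")

links = [('hexokinase', 'glycolysis'),
         ('hexokinase', 'PPP'),
         ('enolase', 'glycolysis'),
         ('aconitase', 'TCA')]
c.executemany("INSERT INTO enzyme_pathway (enzyme, pathway) VALUES (?, ?);", links)
conn.commit()

# Enzymes that appear in more than one pathway
c.execute("""SELECT enzyme, COUNT(*) FROM enzyme_pathway
             GROUP BY enzyme
             HAVING COUNT(*) > 1;""")
shared = c.fetchall()
print(shared)
```

With these sample rows, only hexokinase is reported as shared between pathways.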

The order in which enzymes appear in each pathway could be represented by adding another column to the table. For example, I could add an "order" column holding an integer for each enzyme's position, such as hexokinase in glycolysis (1) or the PPP (1), and so on. The rows could then be sorted by querying with an "ORDER BY" clause on that column in ascending order ("ASC"). Because ORDER is an SQL keyword, the column name would need to be quoted or given a different name.
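
A sketch of the ordering idea, under stated assumptions: the table "pathway_steps" and the column "step_order" are hypothetical names (chosen because ORDER is an SQL keyword), and the step numbers follow the standard ten-step numbering of glycolysis. Sorting is done in the query with an ORDER BY clause.

```python
import sqlite3

# In-memory database so the sketch does not touch my.db
conn = sqlite3.connect(':memory:')
c = conn.cursor()

# Hypothetical table with an integer position for each enzyme in its pathway
c.execute("""CREATE TABLE pathway_steps (enzyme TEXT,
                                         pathway TEXT,
                                         step_order INTEGER);""")

# Inserted out of order on purpose
steps = [('pyruvate kinase', 'glycolysis', 10),
         ('hexokinase', 'glycolysis', 1),
         ('enolase', 'glycolysis', 9),
         ('phosphofructokinase', 'glycolysis', 3)]
c.executemany("INSERT INTO pathway_steps VALUES (?, ?, ?);", steps)
conn.commit()

# Sort by position within the pathway
c.execute("""SELECT enzyme, step_order FROM pathway_steps
             WHERE pathway = ?
             ORDER BY step_order ASC;""", ('glycolysis',))
ordered = c.fetchall()
print(ordered)
```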

Each enzyme in the enzyme table has a _one-to-many_ relationship with genes in the gene table. That is, many genes from different organisms encode the same enzyme (though as different isoforms).
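
A sketch of how this enzyme-to-gene relationship could be queried with a join. The simplified schemas below are assumptions: the "enzyme" column linking each gene to an enzyme name is hypothetical and does not exist in the tables created above. The gene IDs and symbols are taken from the gene table output.

```python
import sqlite3

# In-memory database so the sketch does not touch my.db
conn = sqlite3.connect(':memory:')
c = conn.cursor()

# Simplified, hypothetical schemas: each gene row names the enzyme it encodes
c.execute("CREATE TABLE enzymes (name TEXT PRIMARY KEY, commission TEXT);")
c.execute("""CREATE TABLE genes (id INTEGER PRIMARY KEY,
                                 name TEXT,
                                 organism TEXT,
                                 enzyme TEXT REFERENCES enzymes(name));""")

c.executemany("INSERT INTO enzymes VALUES (?, ?);",
              [('hexokinase', '2.7.1.1'),
               ('phosphopyruvate hydratase', '4.2.1.11')])
c.executemany("INSERT INTO genes VALUES (?, ?, ?, ?);",
              [(3098, 'HK1', 'Homo sapiens', 'hexokinase'),
               (45875, 'Hex-A', 'Drosophila melanogaster', 'hexokinase'),
               (2026, 'ENO2', 'Homo sapiens', 'phosphopyruvate hydratase')])
conn.commit()

# One enzyme row can match many gene rows
c.execute("""SELECT e.name, g.name, g.organism
             FROM enzymes AS e
             JOIN genes AS g ON g.enzyme = e.name
             ORDER BY e.name, g.id;""")
pairs = c.fetchall()
for row in pairs:
    print(row)
```

Here hexokinase matches two gene rows (HK1 and Hex-A), while each gene row points back to exactly one enzyme.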