# Code examples for: The UniProt website API: facilitating programmatic access to protein knowledge.

## There are two versions of this notebook:

1. Interactive [Google Colab Notebook](https://colab.research.google.com/drive/16gG2a0BpIMe3zr0VLlisRWKGDkxaemk1?usp=sharing)
2. Static [ipynb file on github](https://github.com/ebi-uniprot/uniprot-manual/blob/main/notebooks/uniprot-rest-tutorial.ipynb)


## Set up the working environment:

In [1]:
from io import StringIO
from collections import defaultdict
import json
import requests
import pandas as pd

# Increase display width of pandas columns
pd.set_option('max_colwidth', 200)


# URL for the REST API used throughout the notebook
WEBSITE_API = "https://rest.uniprot.org"

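# Make a GET request to the REST API and return the result in a convenient data structure:
# text for FASTA, a pandas DataFrame for TSV, and a list (search results) or dict (single entry) for JSON.
# The `format` argument is unused; the response type is inferred from the path.
def get(path, format="JSON"):
    # Paths in this notebook start with "/", so strip it to avoid a double slash in the URL
    response = requests.get(f"{WEBSITE_API}/{path.lstrip('/')}")
    # Raise an HTTPError for 4xx/5xx responses
    response.raise_for_status()
    if "fasta" in path:
        return response.text
    if "format=tsv" in path:
        return pd.read_csv(StringIO(response.text), sep="\t")
    response_json = response.json()
    if "results" in response_json:
        return response_json["results"]
    return response_json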
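
The queries passed to `get` in this notebook contain spaces, parentheses and colons. A minimal sketch of how such a path could be percent-encoded before it is sent is shown below. `build_search_path` is a hypothetical helper that is not part of the notebook, and it assumes the API accepts standard form encoding (`+` for spaces) in the same way as the unencoded strings used elsewhere here.

```python
from urllib.parse import urlencode

WEBSITE_API = "https://rest.uniprot.org"


def build_search_path(namespace, clauses, fields=None, fmt=None):
    """Build a percent-encoded search path (hypothetical helper, not part of the notebook)."""
    # Join clauses such as "gene:ACE2" into "(gene:ACE2) AND (...)"
    query = " AND ".join(f"({clause})" for clause in clauses)
    params = {"query": query}
    if fields:
        params["fields"] = ",".join(fields)
    if fmt:
        params["format"] = fmt
    # urlencode escapes parentheses and colons and turns spaces into "+"
    return f"{namespace}/search?{urlencode(params)}"


path = build_search_path(
    "uniprotkb",
    ["gene:ACE2", "taxonomy_id:9606", "reviewed:true"],
    fmt="tsv",
)
print(f"{WEBSITE_API}/{path}")
```

Because the encoded path still contains the literal text `format=tsv`, the format detection inside `get` would continue to work with paths built this way.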

## Use case 1: Cross-database querying using the Website API

For this example we will take the sequence of a SARS-CoV-2 protein that was sequenced from a nasal swab in Washington in March 2020 and use it to understand the mechanism of SARS-CoV-2 transmission to wild white-tailed deer during the global pandemic.

Search for EMBL ID QIZ14413 in UniParc and return a UniParc ID:

In [3]:
embl_id = "QIZ14413"
uniparc_QIZ14413_search = get(f"/uniparc/search?query=(dbid:{embl_id}) AND (database:embl-cds)&fields=upi")
if not uniparc_QIZ14413_search:
    raise ValueError(f"No UniParc entries found for {embl_id}.")

print(json.dumps(uniparc_QIZ14413_search, indent=2))

[
  {
    "uniParcId": "UPI00131F240A",
    "oldestCrossRefCreated": "2020-01-14",
    "mostRecentCrossRefUpdated": "2025-02-05"
  }
]


In [4]:
try:
    uniparc_id = uniparc_QIZ14413_search[0]["uniParcId"]
except (IndexError, KeyError) as e:
    print(f"Index or key not found: {e}")

Get the complete UPI00131F240A entry and filter the cross-referenced databases for our specific EMBL ID QIZ14413:

In [5]:
uniparc_UPI00131F240A = get(f"/uniparc/{uniparc_id}")
embl_QIZ14413_xrefs = [xref for xref in uniparc_UPI00131F240A["uniParcCrossReferences"] if xref["database"] == "EMBL" and xref["id"] == embl_id]
try:
    embl_QIZ14413_xref = embl_QIZ14413_xrefs[0]
except IndexError as e:
    print(f"EMBL {embl_id} cross-reference not found. Error: {e}")

embl_QIZ14413_xref

{'database': 'EMBL',
 'id': 'QIZ14413',
 'versionI': 1,
 'version': 1,
 'active': True,
 'created': '2020-04-14',
 'lastUpdated': '2024-07-17',
 'geneName': 'S',
 'proteinName': 'Surface glycoprotein',
 'organism': {'scientificName': 'Severe acute respiratory syndrome coronavirus 2',
  'commonName': '2019-nCoV',
  'taxonId': 2697049}}
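
The filter-then-index pattern used above can also be written with `next()` and a default value, which avoids the `try`/`except` block. The sketch below runs offline against a trimmed stand-in for a UniParc entry; `first_matching_xref` and `sample_entry` are illustrative names, and only keys visible in the output above are used.

```python
def first_matching_xref(entry, database, xref_id):
    """Return the first cross-reference matching database and id, or None if absent."""
    return next(
        (
            xref
            for xref in entry.get("uniParcCrossReferences", [])
            if xref.get("database") == database and xref.get("id") == xref_id
        ),
        None,
    )


# Trimmed stand-in for a UniParc entry (not a real API response)
sample_entry = {
    "uniParcCrossReferences": [
        {"database": "UniProtKB/Swiss-Prot", "id": "P0DTC2", "active": True},
        {"database": "EMBL", "id": "QIZ14413", "active": True},
    ]
}

match = first_matching_xref(sample_entry, "EMBL", "QIZ14413")
missing = first_matching_xref(sample_entry, "EMBL", "NOT_PRESENT")
print(match)
print(missing)
```

Returning `None` leaves it to the caller to decide whether a missing cross-reference is an error.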

Search UPI00131F240A for active UniProtKB Reviewed/Swiss-Prot entries:

In [6]:
active_swissprot_xrefs = get(f"/uniparc/{uniparc_id}/databases?active=true&dbTypes=UniProtKB%2FSwiss-Prot")
try:
    # Try to get the first cross-reference
    active_swissprot_xref = active_swissprot_xrefs[0]
except IndexError as e:
    raise ValueError(f"No active Swiss-Prot entries found for {embl_id}: {e}")
print(json.dumps(active_swissprot_xref, indent=2))

{
  "database": "UniProtKB/Swiss-Prot",
  "id": "P0DTC2",
  "versionI": 1,
  "version": 1,
  "active": true,
  "created": "2020-04-22",
  "lastUpdated": "2025-02-05",
  "geneName": "S",
  "proteinName": "Spike glycoprotein",
  "organism": {
    "scientificName": "Severe acute respiratory syndrome coronavirus 2",
    "commonName": "2019-nCoV",
    "taxonId": 2697049
  }
}


P0DTC2 is the active Swiss-Prot entry cross-referenced by UPI00131F240A. We will now retrieve the P0DTC2 entry from UniProtKB and return the protein name, gene name and taxonomic lineage in TSV format:

In [7]:
active_swissprot_id = active_swissprot_xref["id"]
df_uniprotkb_P0DTC2 = get(f"/uniprotkb/{active_swissprot_id}?format=tsv&fields=protein_name,gene_names,lineage")
if df_uniprotkb_P0DTC2.empty:
    raise ValueError(f"No UniProtKB entry found for {active_swissprot_id}.")

df_uniprotkb_P0DTC2

Unnamed: 0,Protein names,Gene Names,Taxonomic lineage
0,Spike glycoprotein (S glycoprotein) (E2) (Peplomer protein) [Cleaved into: Spike protein S1; Spike protein S2; Spike protein S2'],S 2,"Viruses (superkingdom), Riboviria (clade), Orthornavirae (kingdom), Pisuviricota (phylum), Pisoniviricetes (class), Nidovirales (order), Cornidovirineae (suborder), Coronaviridae (family), Orthoco..."


Find the proteins that spike glycoprotein interacts with by returning the `cc_subunit` field from P0DTC2. Filter the comments to keep those whose molecule includes "Spike protein S1":

In [8]:
uniprotkb_P0DTC2_cc_subunit = get(f"/uniprotkb/{active_swissprot_id}?fields=cc_subunit")
uniprotkb_P0DTC2_spike_comments = [
    text["value"]
    for comment in uniprotkb_P0DTC2_cc_subunit["comments"]
    for text in comment["texts"]
    # A comment may not name a molecule, so fall back to an empty string
    if "Spike protein S1" in comment.get("molecule", "")
]
if not uniprotkb_P0DTC2_spike_comments:
    raise ValueError(f"No 'Spike protein S1' comments found for {active_swissprot_id}.")

uniprotkb_P0DTC2_spike_comments

["Binds to host ACE2 (PubMed:32075877, PubMed:32132184, PubMed:32155444, PubMed:32221306, PubMed:32225175, PubMed:32225176, PubMed:33607086). RBD also interacts with the N-linked glycan on 'Asn-90' of ACE2 (PubMed:33607086). Cleavage of S generates a polybasic C-terminal sequence on S1 that binds to host Neuropilin-1 (NRP1) and Neuropilin-2 (NRP2) receptors (PubMed:33082293, PubMed:33082294). Interacts with host integrin alpha-5/beta-1 (ITGA5:ITGB1) and with ACE2 in complex with integrin alpha-5/beta-1 (ITGA5:ITGB1) (PubMed:33102950). May interact via cytoplasmic c-terminus with M protein (PubMed:33229438). May interact (via N-terminus) with host bilirubin and biliverdin, thereby preventing antibody binding to the SARS-CoV-2 spike NTD via an allosteric mechanism (PubMed:33888467). May interact with host LRRC15, thereby allowing attachment to host cells (PubMed:36735681)"]
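
The comment text embeds its supporting literature as `PubMed:<id>` tokens. As a rough sketch, these can be pulled out with a regular expression so that the evidence behind each statement can be followed up. `extract_pubmed_ids` is an illustrative helper, and `sample_comment` is a shortened excerpt of the output above rather than the full text.

```python
import re


def extract_pubmed_ids(comment_text):
    """Return the distinct PubMed IDs cited in a comment, in order of first appearance."""
    # dict.fromkeys removes duplicates while preserving order
    return list(dict.fromkeys(re.findall(r"PubMed:(\d+)", comment_text)))


# Shortened excerpt of the comment shown above
sample_comment = (
    "Binds to host ACE2 (PubMed:32075877, PubMed:33607086). "
    "RBD also interacts with the N-linked glycan on 'Asn-90' of ACE2 (PubMed:33607086)."
)

pubmed_ids = extract_pubmed_ids(sample_comment)
print(pubmed_ids)
```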

Results from the previous query indicate that interaction with human ACE2 is the primary infection initiation mechanism.
Now we will search UniProtKB for reviewed entries with the gene name ACE2 and the taxonomy ID 9606 (human):

In [9]:
spike_binding_gene = "ACE2"
human_taxonomy_id = 9606
swissprot_human_spike_binding_gene_search = get(f"/uniprotkb/search?query=(gene:{spike_binding_gene}) AND (taxonomy_id:{human_taxonomy_id}) AND (reviewed:true)&format=tsv")
if swissprot_human_spike_binding_gene_search.empty:
    raise ValueError(f"No Swiss-Prot entries found for {spike_binding_gene} in human.")

swissprot_human_spike_binding_gene_search

Unnamed: 0,Entry,Entry Name,Reviewed,Protein names,Gene Names,Organism,Length
0,Q9BYF1,ACE2_HUMAN,reviewed,Angiotensin-converting enzyme 2 (EC 3.4.17.23) (Angiotensin-converting enzyme homolog) (ACEH) (Angiotensin-converting enzyme-related carboxypeptidase) (ACE-related carboxypeptidase) (EC 3.4.17.-) ...,ACE2 UNQ868/PRO1885,Homo sapiens (Human),805


Building on the interaction information discovered in the previous step, we now look for amino acid regions of ACE2 that spike glycoprotein may bind to, by filtering the features of Q9BYF1 for those of type "Region":

In [10]:
swissprot_human_spike_binding_entry_id = swissprot_human_spike_binding_gene_search.loc[0]["Entry"]
swissprot_human_spike_binding_entry = get(f"/uniprotkb/{swissprot_human_spike_binding_entry_id}")
swissprot_human_spike_binding_regions = [feature for feature in swissprot_human_spike_binding_entry["features"] if feature['type'] == "Region"]
if not swissprot_human_spike_binding_regions:
    raise ValueError(f"No 'Region' features found for {swissprot_human_spike_binding_entry_id}.")

print(json.dumps(swissprot_human_spike_binding_regions, indent=2))

[
  {
    "type": "Region",
    "location": {
      "start": {
        "value": 30,
        "modifier": "EXACT"
      },
      "end": {
        "value": 41,
        "modifier": "EXACT"
      }
    },
    "description": "Interaction with SARS-CoV spike glycoprotein",
    "evidences": [
      {
        "evidenceCode": "ECO:0000269",
        "source": "PubMed",
        "id": "15791205"
      }
    ]
  },
  {
    "type": "Region",
    "location": {
      "start": {
        "value": 82,
        "modifier": "EXACT"
      },
      "end": {
        "value": 84,
        "modifier": "EXACT"
      }
    },
    "description": "Interaction with SARS-CoV spike glycoprotein",
    "evidences": [
      {
        "evidenceCode": "ECO:0000269",
        "source": "PubMed",
        "id": "15791205"
      }
    ]
  },
  {
    "type": "Region",
    "location": {
      "start": {
        "value": 353,
        "modifier": "EXACT"
      },
      "end": {
        "value": 357,
        "modifier": "EXACT"
      }

The regions spanning residues 30-41, 82-84 and 353-357 are identified as spike glycoprotein interacting regions by their free-text descriptions. What experimental evidence is there to support this?
Search Q9BYF1 for mutagenesis features that fall within residues 30-41, 82-84 and 353-357:

In [11]:
# Function used to filter the features by type (Mutagenesis) and position
def is_sarscov_mutagenesis_region(feature):
    if feature['type'] != "Mutagenesis":
        return False
    start = feature["location"]["start"]["value"]
    end = feature["location"]["end"]["value"]
    # These feature bounds are determined from binding regions in cell above
    return (30 <= start and end <= 41) or \
      (82 <= start and end <= 84) or \
      (353 <= start and end <= 357)
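
The bounds in the function above are hard-coded from the previous output. One possible alternative, sketched here under the assumption that features follow the same `type`/`location`/`description` layout shown earlier, is to derive the bounds from the "Region" features themselves. All function names and the sample data below are illustrative, and the position 90 mutation is invented purely to show a feature being excluded.

```python
def region_bounds(features, description_contains=""):
    """Collect (start, end) pairs from 'Region' features whose description contains the given text."""
    return [
        (f["location"]["start"]["value"], f["location"]["end"]["value"])
        for f in features
        if f["type"] == "Region" and description_contains in f.get("description", "")
    ]


def mutagenesis_within(features, bounds):
    """Keep 'Mutagenesis' features that lie entirely inside any (start, end) bound."""
    selected = []
    for f in features:
        if f["type"] != "Mutagenesis":
            continue
        start = f["location"]["start"]["value"]
        end = f["location"]["end"]["value"]
        if any(low <= start and end <= high for low, high in bounds):
            selected.append(f)
    return selected


def make_feature(feature_type, start, end, description=""):
    """Build a minimal feature dict for the offline example below."""
    return {
        "type": feature_type,
        "location": {"start": {"value": start}, "end": {"value": end}},
        "description": description,
    }


sample_features = [
    make_feature("Region", 30, 41, "Interaction with SARS-CoV spike glycoprotein"),
    make_feature("Region", 82, 84, "Interaction with SARS-CoV spike glycoprotein"),
    make_feature("Mutagenesis", 31, 31, "Abolishes interaction with SARS-CoV spike glycoprotein."),
    # Hypothetical feature outside both regions, included only to show exclusion
    make_feature("Mutagenesis", 90, 90, "Hypothetical mutation outside the regions."),
]

bounds = region_bounds(sample_features, description_contains="spike glycoprotein")
hits = mutagenesis_within(sample_features, bounds)
print(bounds)
print([hit["description"] for hit in hits])
```

Filtering on the description text matters because an entry may carry "Region" features that are unrelated to spike binding.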

Create a dataframe from a filtered list of dicts:

In [12]:
df_sarscov_mutagenesis = pd.DataFrame([
    dict(start=feature["location"]["start"]["value"], end=feature["location"]["end"]["value"],description=feature["description"])
    for feature in swissprot_human_spike_binding_entry["features"] if is_sarscov_mutagenesis_region(feature)
])
if df_sarscov_mutagenesis.empty:
    raise ValueError(f"No mutagenesis data found for {swissprot_human_spike_binding_entry_id}.")

df_sarscov_mutagenesis

Unnamed: 0,start,end,description
0,31,31,Abolishes interaction with SARS-CoV spike glycoprotein.
1,31,31,Increases slightly the interaction with RBD domain of SARS-CoV-2 spike protein.
2,33,33,Increases slightly the interaction with RBD domain of SARS-CoV-2 spike protein.
3,34,34,Increases slightly the interaction with RBD domain of SARS-CoV-2 spike protein.
4,37,37,No effect on interaction with SARS-CoV spike glycoprotein.
5,38,38,No effect on interaction with SARS-CoV spike glycoprotein.
6,39,39,Increases slightly the interaction with RBD domain of SARS-CoV-2 spike protein.
7,40,40,Increases slightly the interaction with RBD domain of SARS-CoV-2 spike protein.
8,41,41,Strongly inhibits interaction with SARS-CoV spike glycoprotein.
9,41,41,Increases slightly the interaction with RBD domain of SARS-CoV-2 spike protein.


The results confirm that mutations of several residues within the interaction regions alter binding of the SARS-CoV and SARS-CoV-2 spike glycoproteins.

After confirmation of spike glycoprotein interaction with ACE2 protein regions in human, we will now investigate if the same interaction sites are conserved in white-tailed deer. First we will find the white-tailed deer (TaxID 9874) entry for ACE2 in UniProtKB:

In [13]:
white_tailed_dear_taxonomy_id = "9874"
white_tailed_dear_spike_binding_search = get(f"/uniprotkb/search?query=(gene:{spike_binding_gene}) AND (taxonomy_id:{white_tailed_dear_taxonomy_id})&format=tsv")
if white_tailed_dear_spike_binding_search.empty:
    raise ValueError(f"No UniProtKB entries found for {spike_binding_gene} in white-tailed deer.")

white_tailed_dear_spike_binding_search

Unnamed: 0,Entry,Entry Name,Reviewed,Protein names,Gene Names,Organism,Length
0,A0A6J0Z472,A0A6J0Z472_ODOVR,unreviewed,Angiotensin-converting enzyme (EC 3.4.-.-),ACE2,Odocoileus virginianus texanus,679


Align Q9BYF1 and A0A6J0Z472 to investigate the sequence conservation of human ACE2 and white-tailed deer ACE2:

In [14]:
trembl_white_tailed_dear_spike_binding_entry_id = white_tailed_dear_spike_binding_search.loc[0]["Entry"]
accessions_to_align = ",".join([
    swissprot_human_spike_binding_entry_id,
    trembl_white_tailed_dear_spike_binding_entry_id
])
print(f"Fetching FASTAs for {accessions_to_align}")
spike_binding_fastas = get(f"/uniprotkb/accessions?accessions={accessions_to_align}&format=fasta")
# Fail if the response does not contain exactly two records or if either accession is missing
if spike_binding_fastas.count(">") != 2 \
        or swissprot_human_spike_binding_entry_id not in spike_binding_fastas \
        or trembl_white_tailed_dear_spike_binding_entry_id not in spike_binding_fastas:
    raise ValueError(f"FASTAs not found for {accessions_to_align}.")

print(spike_binding_fastas)

Fetching FASTAs for Q9BYF1,A0A6J0Z472
>sp|Q9BYF1|ACE2_HUMAN Angiotensin-converting enzyme 2 OS=Homo sapiens OX=9606 GN=ACE2 PE=1 SV=2
MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLASWNYNTNITEENVQ
NMNNAGDKWSAFLKEQSTLAQMYPLQEIQNLTVKLQLQALQQNGSSVLSEDKSKRLNTIL
NTMSTIYSTGKVCNPDNPQECLLLEPGLNEIMANSLDYNERLWAWESWRSEVGKQLRPLY
EEYVVLKNEMARANHYEDYGDYWRGDYEVNGVDGYDYSRGQLIEDVEHTFEEIKPLYEHL
HAYVRAKLMNAYPSYISPIGCLPAHLLGDMWGRFWTNLYSLTVPFGQKPNIDVTDAMVDQ
AWDAQRIFKEAEKFFVSVGLPNMTQGFWENSMLTDPGNVQKAVCHPTAWDLGKGDFRILM
CTKVTMDDFLTAHHEMGHIQYDMAYAAQPFLLRNGANEGFHEAVGEIMSLSAATPKHLKS
IGLLSPDFQEDNETEINFLLKQALTIVGTLPFTYMLEKWRWMVFKGEIPKDQWMKKWWEM
KREIVGVVEPVPHDETYCDPASLFHVSNDYSFIRYYTRTLYQFQFQEALCQAAKHEGPLH
KCDISNSTEAGQKLFNMLRLGKSEPWTLALENVVGAKNMNVRPLLNYFEPLFTWLKDQNK
NSFVGWSTDWSPYADQSIKVRISLKSALGDKAYEWNDNEMYLFRSSVAYAMRQYFLKVKN
QMILFGEEDVRVANLKPRISFNFFVTAPKNVSDIIPRTEVEKAIRMSRSRINDAFRLNDN
SLEFLGIQPTLGPPNQPPVSIWLIVFGVVMGVIVVGIVILIFTGIRDRKKKNKARSGENP
YASIDISKGENNPGFQNTDDVQTSF
>tr|A0A6J0Z472|A0A6J0Z472_ODOVR Angiotensin-con

In [15]:
# Submit an alignment job to EBI's Clustal Omega (clustalo) service
spike_binding_align_job = requests.post("https://www.ebi.ac.uk/Tools/services/rest/clustalo/run", data={
    "email": "example@example.com",
    "iterations": 0,
    "outfmt": "clustal_num",
    "order": "aligned",
    "sequence": spike_binding_fastas
})

# Documentation here https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Clustal+Omega+Help+and+Documentation#ClustalOmegaHelpandDocumentation-RESTAPI

spike_binding_align_job_id = spike_binding_align_job.text
print(spike_binding_align_job_id)

# Get the job status
spike_binding_align_job_status = requests.get(f"https://www.ebi.ac.uk/Tools/services/rest/clustalo/status/{spike_binding_align_job_id}")
print(spike_binding_align_job_status.text)

clustalo-R20250402-132456-0671-48435719-p1m
QUEUED


In [16]:
# Run the following again to check the status until finished
spike_binding_align_job_status = requests.get(f"https://www.ebi.ac.uk/Tools/services/rest/clustalo/status/{spike_binding_align_job_id}")
print(spike_binding_align_job_status.text)

FINISHED
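
Rather than re-running the status cell by hand, the check can be wrapped in a polling loop. The sketch below takes the status lookup as a function argument so that it can be exercised offline. `wait_for_job` is a hypothetical helper. `QUEUED` and `FINISHED` are the statuses seen above, while treating `RUNNING` as a further pending state is an assumption about the service.

```python
import time


def wait_for_job(fetch_status, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll fetch_status() until it returns 'FINISHED'; raise on other end states or on timeout."""
    pending = {"QUEUED", "RUNNING"}  # "RUNNING" is an assumed pending state
    for _ in range(max_polls):
        status = fetch_status().strip()
        if status == "FINISHED":
            return status
        if status not in pending:
            raise RuntimeError(f"Job ended with status {status!r}")
        sleep(poll_seconds)
    raise TimeoutError(f"Job still pending after {max_polls} polls")


# Offline stand-in for the status endpoint
fake_statuses = iter(["QUEUED", "RUNNING", "FINISHED"])
final_status = wait_for_job(lambda: next(fake_statuses), poll_seconds=0)
print(final_status)
```

In the notebook, `fetch_status` could be a small function that performs the same `requests.get` call on the status URL as the cell above and returns `.text`.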


In [17]:
# Get the results of the job
spike_binding_align_job_results = requests.get(f"https://www.ebi.ac.uk/Tools/services/rest/clustalo/result/{spike_binding_align_job_id}/aln-clustal_num")
spike_binding_alignment = spike_binding_align_job_results.text
if not spike_binding_alignment:
    raise ValueError(f"Alignment results not found for job {spike_binding_align_job_id}.")

print(spike_binding_alignment)

# * : Fully conserved residues.
# : : Conservation between groups of strongly similar properties (Gonnet PAM 250 score > 0.5).
# . : Conservation between groups of weakly similar properties (Gonnet PAM 250 score ≤ 0.5).
#   : Non-conserved residues.

CLUSTAL O(1.2.4) multiple sequence alignment


sp|Q9BYF1|ACE2_HUMAN                MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLASWNYNTNITEENVQ	60
tr|A0A6J0Z472|A0A6J0Z472_ODOVR      MTGSFWLLLSLVAVTAAQSTTEEQAKTFLEKFNHEAEDLSYQSSLASWNYNTNITDENVQ	60
                                    *:.* *************** ********:********* ***************:****

sp|Q9BYF1|ACE2_HUMAN                NMNNAGDKWSAFLKEQSTLAQMYPLQEIQNLTVKLQLQALQQNGSSVLSEDKSKRLNTIL	120
tr|A0A6J0Z472|A0A6J0Z472_ODOVR      KMNEARAKWSAFYEEQSRMAKTYSLEEIQNLTLKRQLKALQQSGTSVLSAEKSKRLNTIL	120
                                    :**:*  ***** :*** :*: * *:******:* **:****.*:**** :*********

sp|Q9BYF1|ACE2_HUMAN                NTMSTIYSTGKVCNPDNPQECLLLEPGLNEIMANSLDYNERLWAWESWRSEVGKQLRPLY	180
tr|A0A6J0Z472|A0A6J0Z472_ODOVR      NTMSTIYSTGKVLDPN-TQECLALEPGLDDIMENSRDYNRRLWAWEGWRAEVGKQLRPLY	179
                                    ************ :*:  **** *****::** ** ***.******.**:**********

sp|Q9BYF1|ACE2_HUMAN                EEYVVLKNEMARANHYEDY

Focus the alignment on positions 30-41, 82-84 and 353-357 of the human ACE2 sequence, which interact with spike glycoprotein:

In [18]:
def get_alignment_subsequence(alignment, highlight_start, highlight_end):
    """
    Returns a subsequence of an alignment between the positions highlight_start and highlight_end (inclusive)
    with positions defined based on the first (reference) sequence in the alignment.
    """
    try:
        # Split the alignment into lines and ignore header/footer lines.
        alignment_lines = alignment.split("\n")[3:-1]
    except Exception as e:
        print("Error splitting alignment into lines:", e)
        return ""

    try:
        # Determine the indices where the actual sequence data starts and ends on each line.
        sequence_start_index = alignment_lines[0].rfind(" ") + 1
        sequence_end_index = alignment_lines[0].rfind("\t")
    except Exception as e:
        print("Error determining sequence start and end indices:", e)
        return ""

    # Build a dictionary mapping sequence labels to their full sequence.
    sequences = defaultdict(str)
    for line in alignment_lines:
        try:
            if line:
                label = line[:sequence_start_index]
                sequence_fragment = line[sequence_start_index:sequence_end_index]
                sequences[label] += sequence_fragment
        except Exception as e:
            print("Error processing alignment line:", line, "Error:", e)

    result = ""
    for seq_index, (label, sequence) in enumerate(sequences.items()):
        try:
            if seq_index == 0:
                # For the reference sequence, adjust the indices to account for gap characters.
                subsequence_start = highlight_start + sequence[:highlight_start].count("-") - 1
                subsequence_end = highlight_end + sequence[:highlight_end].count("-")
                adjusted_position = highlight_end
            elif seq_index == len(sequences) - 1:
                adjusted_position = ""
            else:
                adjusted_position = highlight_end - sequence[:highlight_end].count("-")

            result += f"{label}{sequence[subsequence_start:subsequence_end]} {adjusted_position}\n"
        except Exception as e:
            print("Error processing sequence for label", label, "Error:", e)

    return result

highlight_ranges = ((30, 41), (82, 84), (353, 357))
for start, end in highlight_ranges:
    try:
        print(f"Positions {start}-{end}:")
        print(get_alignment_subsequence(spike_binding_alignment, start, end))
    except Exception as e:
        print("Error processing highlight range:", start, end, "Error:", e)


Positions 30-41:
sp|Q9BYF1|ACE2_HUMAN                DKFNHEAEDLFY 41
tr|A0A6J0Z472|A0A6J0Z472_ODOVR      EKFNHEAEDLSY 41
                                    :********* * 

Positions 82-84:
sp|Q9BYF1|ACE2_HUMAN                MYP 84
tr|A0A6J0Z472|A0A6J0Z472_ODOVR      TYS 84
                                     *  

Positions 353-357:
sp|Q9BYF1|ACE2_HUMAN                KGDFR 357
tr|A0A6J0Z472|A0A6J0Z472_ODOVR      KGDFR 356
                                    ***** 
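
To put a number on the conservation seen in each region, the aligned fragments can be compared column by column. This is a minimal sketch using the fragments printed above. `region_identity` is an illustrative helper, and it counts only exact matches, ignoring the strong and weak similarity classes that the consensus line distinguishes.

```python
def region_identity(reference, other):
    """Return (identical, compared) counts for two equal-length aligned fragments, skipping gap columns."""
    if len(reference) != len(other):
        raise ValueError("Aligned fragments must have the same length.")
    pairs = [(a, b) for a, b in zip(reference, other) if a != "-" and b != "-"]
    identical = sum(a == b for a, b in pairs)
    return identical, len(pairs)


# Fragments copied from the output above: (human, white-tailed deer)
regions = {
    "30-41": ("DKFNHEAEDLFY", "EKFNHEAEDLSY"),
    "82-84": ("MYP", "TYS"),
    "353-357": ("KGDFR", "KGDFR"),
}

identity_by_region = {}
for name, (human, deer) in regions.items():
    identity_by_region[name] = region_identity(human, deer)
    identical, compared = identity_by_region[name]
    print(f"Positions {name}: {identical}/{compared} identical")
```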



## Use case 2: ID mapping service

Map human gene names to UniProtKB/Swiss-Prot entries.
The input is a list of 16 gene names, restricted to human (TaxID 9606), to be mapped to UniProtKB/Swiss-Prot entries:

In [19]:
accessions = [
    "ACTB",
    "TUBA1A",
    "HBA1",
    "HBB",
    "G6PD",
    "LDHA",
    "CYP2D6",
    "KRT8",
    "SLC22A5",
    "MT-CO1",
    "SOD2",
    "FGA",
    "MTR",
    "MT-ATP6",
    "VEGFA",
    "TP53",
]

# POST request with parameters provided as a dict
idmapping_job = requests.post(f"{WEBSITE_API}/idmapping/run", data={
    "from": "Gene_Name",
    "to": "UniProtKB-Swiss-Prot",
    "ids": accessions,
    "taxId": "9606",
})

idmapping_job_id = idmapping_job.json()['jobId']
print("Job ID:", idmapping_job_id)

Job ID: ebaefff87d020d2ce35d71658ebd00b40efd1e46


Once the job is complete, fetch the results using the job ID:

In [21]:
idmapping_job_results = get(f"/idmapping/status/{idmapping_job_id}")

Determine the number of results that have been found:

In [22]:
if not len(idmapping_job_results):
    raise ValueError(f"No results found for {accessions}.")

len(idmapping_job_results)

16
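
A matching count does not by itself show that every input was mapped exactly once, since one name could map to several entries while another maps to none. The sketch below checks this offline, assuming each result carries `from` and `to` keys as in the cells that follow. `summarise_mapping` is an illustrative helper and `NOT_A_GENE` is a made-up input.

```python
from collections import Counter


def summarise_mapping(requested_ids, results):
    """Return (unmapped, multi_mapped) for a list of requested IDs and mapping results."""
    counts = Counter(result["from"] for result in results)
    unmapped = [rid for rid in requested_ids if counts[rid] == 0]
    multi_mapped = {rid: n for rid, n in counts.items() if n > 1}
    return unmapped, multi_mapped


# Offline sample; "NOT_A_GENE" is a made-up name used to show an unmapped input
sample_requested = ["ACTB", "HBA1", "NOT_A_GENE"]
sample_results = [
    {"from": "ACTB", "to": {"primaryAccession": "P60709"}},
    {"from": "HBA1", "to": {"primaryAccession": "P69905"}},
]

unmapped, multi_mapped = summarise_mapping(sample_requested, sample_results)
print("Unmapped:", unmapped)
print("Mapped more than once:", multi_mapped)
```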

Convert into a Pandas DataFrame and create the column `fromGene` for each row:

In [23]:
try:
    df_idmapping_job_results = pd.DataFrame([{"fromGene": result["from"], **result["to"]} for result in idmapping_job_results])
except NameError as e:
    print(f"Name not found: {e}")

df_idmapping_job_results

Unnamed: 0,fromGene,entryType,primaryAccession,secondaryAccessions,uniProtkbId,entryAudit,annotationScore,organism,proteinExistence,proteinDescription,genes,comments,features,keywords,references,uniProtKBCrossReferences,sequence,extraAttributes,geneLocations
0,ACTB,UniProtKB reviewed (Swiss-Prot),P60709,"[P02570, P70514, P99021, Q11211, Q64316, Q75MN2, Q96B34, Q96HG5]",ACTB_HUMAN,"{'firstPublicDate': '1986-07-21', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '1988-04-01', 'entryVersion': 214, 'sequenceVersion': 1}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'value': 'Actin, cytoplasmic 1'}, 'ecNumbers': [{'evidences': [{'evidenceCode': 'ECO:0000250', 'source': 'UniProtKB', 'id': 'P68137'}], 'value': '3.6.4.-'}]}, 'al...",[{'geneName': {'value': 'ACTB'}}],"[{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000250', 'source': 'UniProtKB', 'id': 'Q6QAQ1'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '25255767'}, {'evidenceCode': 'ECO:0000...","[{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 375, 'modifier': 'EXACT'}}, 'description': 'Actin, cytoplasmic 1', 'featureId': 'PRO_0000367073'}, {'ty...","[{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0007', 'category': 'PTM', 'name': 'Acetylation'}, {'id': 'KW-0067', 'category': 'Ligand', 'name': 'ATP-binding'...","[{'referenceNumber': 1, 'citation': {'id': '6322116', 'citationType': 'journal article', 'authors': ['Ponte P.', 'Ng S.Y.', 'Engel J.', 'Gunning P.', 'Kedes L.'], 'citationCrossReferences': [{'dat...","[{'database': 'EMBL', 'id': 'X00351', 'properties': [{'key': 'ProteinId', 'value': 'CAA25099.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', ...",{'value': 'MDDDIAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDL...,"{'countByCommentType': 
{'FUNCTION': 1, 'CATALYTIC ACTIVITY': 1, 'SUBUNIT': 1, 'INTERACTION': 35, 'SUBCELLULAR LOCATION': 1, 'PTM': 7, 'DISEASE': 5, 'MISCELLANEOUS': 1, 'SIMILARITY': 1, 'CAUTION': ...",
1,TUBA1A,UniProtKB reviewed (Swiss-Prot),Q71U36,"[A8K0B8, G3V1U9, P04687, P05209]",TBA1A_HUMAN,"{'firstPublicDate': '1987-08-13', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '2004-07-05', 'entryVersion': 194, 'sequenceVersion': 1}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'value': 'Tubulin alpha-1A chain'}, 'ecNumbers': [{'evidences': [{'evidenceCode': 'ECO:0000250', 'source': 'UniProtKB', 'id': 'P68363'}], 'value': '3.6.5.-'}]}, '...","[{'geneName': {'value': 'TUBA1A'}, 'synonyms': [{'value': 'TUBA3'}]}]","[{'texts': [{'value': 'Tubulin is the major constituent of microtubules, a cylinder consisting of laterally associated linear protofilaments composed of alpha- and beta-tubulin heterodimers. Micro...","[{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 451, 'modifier': 'EXACT'}}, 'description': 'Tubulin alpha-1A chain', 'featureId': 'PRO_0000048111'}, {'...","[{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0007', 'category': 'PTM', 'name': 'Acetylation'}, {'id': 'KW-0025', 'category': 'Coding sequence diversity', 'n...","[{'referenceNumber': 1, 'citation': {'id': '3839072', 'citationType': 'journal article', 'authors': ['Hall J.L.', 'Cowan N.J.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '3839072'}...","[{'database': 'EMBL', 'id': 'X01703', 'properties': [{'key': 'ProteinId', 'value': 'CAA25855.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': '...",{'value': 'MRECISIHVGQAGVQIGNACWELYCLEHGIQPDGQMPSDKTIGGGDDSFNTFFSETGAGKHVPRAVFVDLEPTVIDEVRTGTYRQLFHPEQLITGKEDAANNYARGHYTIGKEIIDLVLDRIRKLADQCTGLQGFLVFHSFGGGTGSGFTSLLMERLSVDYGKKSKLEFSIYPAPQVSTAVVEPY...,"{'countByCommentType': 
{'FUNCTION': 1, 'CATALYTIC ACTIVITY': 1, 'COFACTOR': 1, 'SUBUNIT': 1, 'INTERACTION': 9, 'SUBCELLULAR LOCATION': 1, 'ALTERNATIVE PRODUCTS': 2, 'TISSUE SPECIFICITY': 1, 'PTM':...",
2,HBA1,UniProtKB reviewed (Swiss-Prot),P69905,"[P01922, Q1HDT5, Q3MIF5, Q53F97, Q96KF1, Q9NYR7, Q9UCM0]",HBA_HUMAN,"{'firstPublicDate': '1986-07-21', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '2007-01-23', 'entryVersion': 214, 'sequenceVersion': 2}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'value': 'Hemoglobin subunit alpha'}}, 'alternativeNames': [{'fullName': {'value': 'Alpha-globin'}}, {'fullName': {'value': 'Hemoglobin alpha chain'}}], 'contains...","[{'geneName': {'value': 'HBA1'}}, {'geneName': {'value': 'HBA2'}}]","[{'texts': [{'value': 'Involved in oxygen transport from the lung to the various peripheral tissues'}], 'commentType': 'FUNCTION'}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'sourc...","[{'type': 'Initiator methionine', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 1, 'modifier': 'EXACT'}}, 'description': 'Removed', 'evidences': [{'evidenceCode': 'ECO:...","[{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0007', 'category': 'PTM', 'name': 'Acetylation'}, {'id': 'KW-0903', 'category': 'Technical term', 'name': 'Dire...","[{'referenceNumber': 1, 'citation': {'id': '7448866', 'citationType': 'journal article', 'authors': ['Michelson A.M.', 'Orkin S.H.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '7448...","[{'database': 'EMBL', 'id': 'J00153', 'properties': [{'key': 'ProteinId', 'value': 'AAB59407.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': '...","{'value': 'MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR', 'length': 142, 'molWeight': 15258, 
'crc6...","{'countByCommentType': {'FUNCTION': 2, 'SUBUNIT': 2, 'INTERACTION': 13, 'TISSUE SPECIFICITY': 1, 'PTM': 1, 'DISEASE': 4, 'MISCELLANEOUS': 1, 'SIMILARITY': 1, 'SEQUENCE CAUTION': 1, 'WEB RESOURCE':...",
3,HBB,UniProtKB reviewed (Swiss-Prot),P68871,"[A4GX73, B2ZUE0, P02023, Q13852, Q14481, Q14510, Q45KT0, Q549N7, Q6FI08, Q6R7N2, Q8IZI1, Q9BX96, Q9UCD6, Q9UCP8, Q9UCP9]",HBB_HUMAN,"{'firstPublicDate': '1986-07-21', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '2007-01-23', 'entryVersion': 214, 'sequenceVersion': 2}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'value': 'Hemoglobin subunit beta'}}, 'alternativeNames': [{'fullName': {'value': 'Beta-globin'}}, {'fullName': {'value': 'Hemoglobin beta chain'}}], 'contains': ...",[{'geneName': {'value': 'HBB'}}],"[{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '28066926'}], 'value': 'Involved in oxygen transport from the lung to the various peripheral tissues'}], 'comme...","[{'type': 'Initiator methionine', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 1, 'modifier': 'EXACT'}}, 'description': 'Removed', 'evidences': [{'evidenceCode': 'ECO:...","[{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0007', 'category': 'PTM', 'name': 'Acetylation'}, {'id': 'KW-1055', 'category': 'Disease', 'name': 'Congenital ...","[{'referenceNumber': 1, 'citation': {'id': '1019344', 'citationType': 'journal article', 'authors': ['Marotta C.', 'Forget B.', 'Cohen-Solal M.', 'Weissman S.M.'], 'citationCrossReferences': [{'da...","[{'database': 'EMBL', 'id': 'M25079', 'properties': [{'key': 'ProteinId', 'value': 'AAA35597.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', ...","{'value': 'MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH', 'length': 147, 
'molWeight': 15998, ...","{'countByCommentType': {'FUNCTION': 3, 'SUBUNIT': 1, 'INTERACTION': 5, 'TISSUE SPECIFICITY': 1, 'PTM': 3, 'MASS SPECTROMETRY': 1, 'POLYMORPHISM': 1, 'DISEASE': 4, 'MISCELLANEOUS': 1, 'SIMILARITY':...",
4,G6PD,UniProtKB reviewed (Swiss-Prot),P11413,"[D3DWX9, Q16000, Q16765, Q8IU70, Q8IU88, Q8IUA6, Q96PQ2]",G6PD_HUMAN,"{'firstPublicDate': '1989-10-01', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '2007-01-23', 'entryVersion': 275, 'sequenceVersion': 4}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'value': 'Glucose-6-phosphate 1-dehydrogenase'}, 'shortNames': [{'value': 'G6PD'}], 'ecNumbers': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed...",[{'geneName': {'value': 'G6PD'}}],"[{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '15858258'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '24769394'}, {'evidenceCode': 'ECO:00002...","[{'type': 'Initiator methionine', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 1, 'modifier': 'EXACT'}}, 'description': 'Removed', 'evidences': [{'evidenceCode': 'ECO:...","[{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0007', 'category': 'PTM', 'name': 'Acetylation'}, {'id': 'KW-0025', 'category': 'Coding sequence diversity', 'n...","[{'referenceNumber': 1, 'citation': {'id': '3515319', 'citationType': 'journal article', 'authors': ['Persico M.G.', 'Viglietto G.', 'Martini G.', 'Toniolo D.', 'Paonessa G.', 'Moscatelli C.', 'Do...","[{'database': 'EMBL', 'id': 'X03674', 'properties': [{'key': 'ProteinId', 'value': 'CAA27309.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', ...",{'value': 'MAEQVALSRTQVCGILREELFQGDAFHQSDTHIFIIMGASGDLAKKKIYPTIWWLFRDGLLPENTFIVGYARSRLTVADIRKQSEPFFKATPEEKLKLEDFFARNSYVAGQYDDAASYQRLNSHMNALHLGSQANRLFYLALPPTVYEAVTKNIHESCMSQIGWNRIIVEKPFGRDLQSSDRLSN...,"{'countByCommentType': {'FUNCTION': 1, 
'CATALYTIC ACTIVITY': 1, 'BIOPHYSICOCHEMICAL PROPERTIES': 1, 'PATHWAY': 1, 'SUBUNIT': 1, 'INTERACTION': 3, 'SUBCELLULAR LOCATION': 1, 'ALTERNATIVE PRODUCTS':...",
5,LDHA,UniProtKB reviewed (Swiss-Prot),P00338,"[B4DKQ2, B7Z5E3, D3DQY3, F8W819, Q53G53, Q6IBM7, Q6ZNV1, Q9UDE8, Q9UDE9]",LDHA_HUMAN,"{'firstPublicDate': '1986-07-21', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '2007-01-23', 'entryVersion': 264, 'sequenceVersion': 2}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'evidences': [{'evidenceCode': 'ECO:0000305'}], 'value': 'L-lactate dehydrogenase A chain'}, 'shortNames': [{'value': 'LDH-A'}], 'ecNumbers': [{'evidences': [{'ev...","[{'geneName': {'evidences': [{'evidenceCode': 'ECO:0000312', 'source': 'HGNC', 'id': 'HGNC:6535'}], 'value': 'LDHA'}, 'orfNames': [{'value': 'PIG19'}]}]","[{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '11276087'}], 'value': 'Interconverts simultaneously and stereospecifically pyruvate and lactate with concomita...","[{'type': 'Initiator methionine', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 1, 'modifier': 'EXACT'}}, 'description': 'Removed', 'evidences': [{'evidenceCode': 'ECO:...","[{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0007', 'category': 'PTM', 'name': 'Acetylation'}, {'id': 'KW-0025', 'category': 'Coding sequence diversity', 'n...","[{'referenceNumber': 1, 'citation': {'id': '3838278', 'citationType': 'journal article', 'authors': ['Tsujibo H.', 'Tiano H.F.', 'Li S.S.-L.'], 'citationCrossReferences': [{'database': 'PubMed', '...","[{'database': 'EMBL', 'id': 'X02152', 'properties': [{'key': 'ProteinId', 'value': 'CAA26088.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', ...",{'value': 
'MATLKDQLIYNLLKEEQTPQNKITVVGVGAVGMACAISILMKDLADELALVDVIEDKLKGEMMDLQHGSLFLRTPKIVSGKDYNVTANSKLVIITAGARQQEGESRLNLVQRNVNIFKFIIPNVVKYSPNCKLLIVSNPVDILTYVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGVHPLSC...,"{'countByCommentType': {'FUNCTION': 1, 'CATALYTIC ACTIVITY': 1, 'ACTIVITY REGULATION': 1, 'PATHWAY': 1, 'SUBUNIT': 1, 'INTERACTION': 7, 'SUBCELLULAR LOCATION': 1, 'ALTERNATIVE PRODUCTS': 5, 'TISSU...",
6,CYP2D6,UniProtKB reviewed (Swiss-Prot),P10635,"[Q16752, Q2XND6, Q2XND7, Q2XNE0, Q6B012, Q6NXU8]",CP2D6_HUMAN,"{'firstPublicDate': '1989-07-01', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '2005-12-20', 'entryVersion': 240, 'sequenceVersion': 2}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'evidences': [{'evidenceCode': 'ECO:0000303', 'source': 'PubMed', 'id': '18698000'}], 'value': 'Cytochrome P450 2D6'}, 'ecNumbers': [{'evidences': [{'evidenceCode...","[{'geneName': {'evidences': [{'evidenceCode': 'ECO:0000303', 'source': 'PubMed', 'id': '21289075'}, {'evidenceCode': 'ECO:0000312', 'source': 'HGNC', 'id': 'HGNC:2625'}], 'value': 'CYP2D6'}, 'syno...","[{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '10681376'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '16352597'}, {'evidenceCode': 'ECO:00002...","[{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 497, 'modifier': 'EXACT'}}, 'description': 'Cytochrome P450 2D6', 'featureId': 'PRO_0000051731'}, {'typ...","[{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0025', 'category': 'Coding sequence diversity', 'name': 'Alternative splicing'}, {'id': 'KW-0153', 'category': ...","[{'referenceNumber': 1, 'citation': {'id': '3410476', 'citationType': 'journal article', 'authors': ['Gonzalez F.J.', 'Vilbois F.', 'Hardwick J.P.', 'McBride O.W.', 'Nebert D.W.', 'Gelboin H.V.', ...","[{'database': 'EMBL', 'id': 'M20403', 'properties': [{'key': 'ProteinId', 'value': 'AAA52153.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', ...",{'value': 
'MGLEALVPLAVIVAIFLLLVDLMHRRQRWAARYPPGPLPLPGLGNLLHVDFQNTPYCFDQLRRRFGDVFSLQLAWTPVVVLNGLAAVREALVTHGEDTADRPPVPITQILGFGPRSQGVFLARYGPAWREQRRFSVSTLRNLGLGKKSLEQWVTEEAACLCAAFANHSGRPFRPNGLLDKAVSNV...,"{'countByCommentType': {'FUNCTION': 1, 'CATALYTIC ACTIVITY': 12, 'COFACTOR': 1, 'BIOPHYSICOCHEMICAL PROPERTIES': 1, 'PATHWAY': 3, 'SUBCELLULAR LOCATION': 1, 'ALTERNATIVE PRODUCTS': 2, 'INDUCTION':...",
7,KRT8,UniProtKB reviewed (Swiss-Prot),P05787,"[A8K4H3, B0AZN5, F8VXB4, Q14099, Q14716, Q14717, Q53GJ0, Q6DHW5, Q6GMY0, Q6P4C7, Q96J60]",K2C8_HUMAN,"{'firstPublicDate': '1988-11-01', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '2007-01-23', 'entryVersion': 261, 'sequenceVersion': 7}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'value': 'Keratin, type II cytoskeletal 8'}}, 'alternativeNames': [{'fullName': {'value': 'Cytokeratin-8'}, 'shortNames': [{'value': 'CK-8'}]}, {'fullName': {'val...","[{'geneName': {'value': 'KRT8'}, 'synonyms': [{'value': 'CYK8'}]}]","[{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '16000376'}], 'value': 'Together with KRT19, helps to link the contractile apparatus to dystrophin at the costa...","[{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 483, 'modifier': 'EXACT'}}, 'description': 'Keratin, type II cytoskeletal 8', 'featureId': 'PRO_0000063...","[{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0007', 'category': 'PTM', 'name': 'Acetylation'}, {'id': 'KW-0025', 'category': 'Coding sequence diversity', 'n...","[{'referenceNumber': 1, 'citation': {'id': '1691124', 'citationType': 'journal article', 'authors': ['Krauss S.', 'Franke W.W.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '1691124'...","[{'database': 'EMBL', 'id': 'M34482', 'properties': [{'key': 'ProteinId', 'value': 'AAA35763.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': '...",{'value': 
'MSIRVTQKSYKVSTSGPRAFSSRSYTSGPGSRISSSSFSRVGSSNFRGGLGGGYGGASGMGGITAVTVNQSLLSPLVLEVDPNIQAVRTQEKEQIKTLNNKFASFIDKVRFLEQQNKMLETKWSLLQQQKTARSNMDNMFESYINNLRRQLETLGQEKLKLEAELGNMQGLVEDFKNKYEDEINK...,"{'countByCommentType': {'FUNCTION': 1, 'SUBUNIT': 2, 'INTERACTION': 28, 'SUBCELLULAR LOCATION': 1, 'ALTERNATIVE PRODUCTS': 2, 'TISSUE SPECIFICITY': 1, 'PTM': 3, 'DISEASE': 1, 'MISCELLANEOUS': 1, '...",
8,SLC22A5,UniProtKB reviewed (Swiss-Prot),O76082,"[A2Q0V1, B2R844, D3DQ87, Q6ZQZ8, Q96EH6]",S22A5_HUMAN,"{'firstPublicDate': '2000-12-01', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '1998-11-01', 'entryVersion': 213, 'sequenceVersion': 1}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'evidences': [{'evidenceCode': 'ECO:0000305'}], 'value': 'Organic cation/carnitine transporter 2'}}, 'alternativeNames': [{'fullName': {'value': 'High-affinity so...","[{'geneName': {'evidences': [{'evidenceCode': 'ECO:0000312', 'source': 'HGNC', 'id': 'HGNC:10969'}], 'value': 'SLC22A5'}, 'synonyms': [{'evidences': [{'evidenceCode': 'ECO:0000303', 'source': 'Pub...","[{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '10454528'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '10525100'}, {'evidenceCode': 'ECO:00002...","[{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 557, 'modifier': 'EXACT'}}, 'description': 'Organic cation/carnitine transporter 2', 'featureId': 'PRO_...","[{'id': 'KW-0025', 'category': 'Coding sequence diversity', 'name': 'Alternative splicing'}, {'id': 'KW-0067', 'category': 'Ligand', 'name': 'ATP-binding'}, {'id': 'KW-1003', 'category': 'Cellular...","[{'referenceNumber': 1, 'citation': {'id': '9618255', 'citationType': 'journal article', 'authors': ['Wu X.', 'Prasad P.D.', 'Leibach F.H.', 'Ganapathy V.'], 'citationCrossReferences': [{'database...","[{'database': 'EMBL', 'id': 'AF057164', 'properties': [{'key': 'ProteinId', 'value': 'AAC24828.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL'...",{'value': 
'MRDYDEVTAFLGEWGPFQRLIFFLLSASIIPNGFTGLSSVFLIATPEHRCRVPDAANLSSAWRNHTVPLRLRDGREVPHSCRRYRLATIANFSALGLEPGRDVDLGQLEQESCLDGWEFSQDVYLSTIVTEWNLVCEDDWKAPLTISLFFVGVLLGSFISGQLSDRFGRKNVLFVTMGMQTGFSF...,"{'countByCommentType': {'FUNCTION': 2, 'CATALYTIC ACTIVITY': 10, 'ACTIVITY REGULATION': 1, 'BIOPHYSICOCHEMICAL PROPERTIES': 1, 'SUBUNIT': 1, 'INTERACTION': 6, 'SUBCELLULAR LOCATION': 2, 'ALTERNATI...",
9,MT-CO1,UniProtKB reviewed (Swiss-Prot),P00395,[Q34770],COX1_HUMAN,"{'firstPublicDate': '1986-07-21', 'lastAnnotationUpdateDate': '2025-02-05', 'lastSequenceUpdateDate': '1986-07-21', 'entryVersion': 228, 'sequenceVersion': 1}",5.0,"{'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarch...",1: Evidence at protein level,"{'recommendedName': {'fullName': {'value': 'Cytochrome c oxidase subunit 1'}, 'ecNumbers': [{'value': '7.1.1.9'}]}, 'alternativeNames': [{'fullName': {'value': 'Cytochrome c oxidase polypeptide I'...","[{'geneName': {'value': 'MT-CO1'}, 'synonyms': [{'value': 'COI'}, {'value': 'COXI'}, {'value': 'MTCO1'}]}]","[{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000250', 'source': 'UniProtKB', 'id': 'P00401'}], 'value': 'Component of the cytochrome c oxidase, the last enzyme in the mitochondrial electron t...","[{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 513, 'modifier': 'EXACT'}}, 'description': 'Cytochrome c oxidase subunit 1', 'featureId': 'PRO_00001833...","[{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0106', 'category': 'Ligand', 'name': 'Calcium'}, {'id': 'KW-0186', 'category': 'Ligand', 'name': 'Copper'}, {'i...","[{'referenceNumber': 1, 'citation': {'id': '7219534', 'citationType': 'journal article', 'authors': ['Anderson S.', 'Bankier A.T.', 'Barrell B.G.', 'de Bruijn M.H.L.', 'Coulson A.R.', 'Drouin J.',...","[{'database': 'EMBL', 'id': 'V00662', 'properties': [{'key': 'ProteinId', 'value': 'CAA24028.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': '...",{'value': 
'MFADRWLFSTNHKDIGTLYLLFGAWAGVLGTALSLLIRAELGQPGNLLGNDHIYNVIVTAHAFVMIFFMVMPIMIGGFGNWLVPLMIGAPDMAFPRMNNMSFWLLPPSLLLLLASAMVEAGAGTGWTVYPPLAGNYSHPGASVDLTIFSLHLAGVSSILGAINFITTIINMKPPAMTQYQTPLFV...,"{'countByCommentType': {'FUNCTION': 1, 'CATALYTIC ACTIVITY': 1, 'COFACTOR': 2, 'PATHWAY': 1, 'SUBUNIT': 1, 'INTERACTION': 3, 'SUBCELLULAR LOCATION': 1, 'DISEASE': 6, 'SIMILARITY': 1}, 'countByFeat...",[{'geneEncodingType': 'Mitochondrion'}]
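
Several columns in the table above hold nested dictionaries (for example the organism and audit cells), which are hard to read once pandas truncates them. The sketch below is one possible way to flatten such records into dotted column names before building a DataFrame. It is not part of the original notebook: the `flatten` helper is hypothetical, and the key names `primaryAccession`, `uniProtkbId`, `organism` and `entryAudit` are assumptions, since the header row is not shown in this part of the output. The values are copied from the last two rows above.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys (hypothetical helper)."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries, extending the dotted prefix
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars and lists are kept unchanged
            flat[full_key] = value
    return flat


# Illustrative records: values taken from the rows above, key names assumed
records = [
    {
        "primaryAccession": "O76082",
        "uniProtkbId": "S22A5_HUMAN",
        "organism": {"scientificName": "Homo sapiens", "commonName": "Human", "taxonId": 9606},
        "entryAudit": {"entryVersion": 213, "sequenceVersion": 1},
    },
    {
        "primaryAccession": "P00395",
        "uniProtkbId": "COX1_HUMAN",
        "organism": {"scientificName": "Homo sapiens", "commonName": "Human", "taxonId": 9606},
        "entryAudit": {"entryVersion": 228, "sequenceVersion": 1},
    },
]

flat_records = [flatten(r) for r in records]
for row in flat_records:
    print(row["primaryAccession"], row["organism.taxonId"], row["entryAudit.entryVersion"])
```

The flattened list can be passed to `pd.DataFrame(flat_records)` to get one column per nested field. `pd.json_normalize(records)` produces similar dotted column names without a custom helper.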
