## Add Catalytic Activity to DB for UniProt Entries

The UniProt fetcher was extended to store catalytic activity in the database. This covers the individual reactions, identified by their Rhea IDs, together with the associated substrates and products. ChEBI IDs and SMILES strings are included as well.
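
As a rough illustration of the data described above, the following sketch models one reaction with its substrates and products. The class names `Molecule` and `Reaction`, their fields, and the placeholder Rhea ID are assumptions made for illustration and are not pyeed's actual schema. Water (CHEBI:15377, SMILES `O`) is the only real compound used.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Molecule:
    """Hypothetical reaction participant (not a pyeed class)."""

    chebi_id: str
    smiles: str


@dataclass
class Reaction:
    """Hypothetical reaction record keyed by a Rhea ID (not a pyeed class)."""

    rhea_id: str
    substrates: list = field(default_factory=list)
    products: list = field(default_factory=list)

    def chebi_ids(self):
        # Collect the ChEBI IDs of all participants on both sides
        return {m.chebi_id for m in self.substrates + self.products}


water = Molecule(chebi_id="CHEBI:15377", smiles="O")
# "RHEA:PLACEHOLDER" stands in for a real Rhea ID
reaction = Reaction(rhea_id="RHEA:PLACEHOLDER", substrates=[water])
print(reaction.chebi_ids())
```

How the fetcher actually lays out these records in the database may differ; the sketch only shows the relationship between a reaction, its participants, and their identifiers.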

In [9]:
%reload_ext autoreload
%autoreload 2
import json
import sys

from loguru import logger

from pyeed import Pyeed
from pyeed.analysis.standard_numbering import StandardNumberingTool

logger.remove()
level = logger.add(sys.stderr, level="WARNING")

In [2]:
uri = "bolt://129.69.129.130:7687"
user = "neo4j"
password = "12345678"

eedb = Pyeed(uri, user=user, password=password)

📡 Connected to database.


In [3]:
uniprot_ids = [
    "A0A0K8P6T7",
    "R4YKL9",
    "A0A1Z2SIQ1",
    "A0A0G3BI90",
    "A0A1B0Z6Y3",
    "A4Y035",
    "P19833",
    "A0A061D1S0",
    "W6R2Y2",
    "A0A1W6L588",
    "G9BY57",
]

eedb.fetch_from_primary_db(uniprot_ids, db="uniprot")

## NCBI to UniProt Mapper 

This function maps NCBI IDs to UniProt and UniParc IDs. It produces two JSON files, each containing a dictionary that maps the NCBI IDs to the corresponding UniProt or UniParc IDs.

In [None]:
# Same entries as above, but identified by their NCBI IDs; useful for later testing
ncbi_ids = ["GAP38373.1",
            "CCK74972.1",
            "ASA57064.1",
            "AKJ29164.1",
            "ANP21910.1",
            "ABP86951.1",
            "CAA37220.1", 
            "MDH1341286.1", 
            "CDM40731.1", 
            "ARN19491.1", 
            "AEV21261.1"] 

eedb.database_id_mapper(ncbi_ids, file_name="mapping_test")

✅ Downloaded: GAP38373.1.fasta
✅ Downloaded: CCK74972.1.fasta
✅ Downloaded: ASA57064.1.fasta
✅ Downloaded: AKJ29164.1.fasta
✅ Downloaded: ANP21910.1.fasta
✅ Downloaded: ABP86951.1.fasta
✅ Downloaded: CAA37220.1.fasta
✅ Downloaded: MDH1341286.1.fasta
✅ Downloaded: CDM40731.1.fasta
✅ Downloaded: ARN19491.1.fasta
✅ Downloaded: AEV21261.1.fasta


In [None]:
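# Check that the mapped UniProt IDs match the IDs fetched above
with open("mapping_test_uniprot.json", "r") as f:
    mapping = json.load(f)

# Flatten the values (each may be a single ID or a list of IDs) and deduplicate
values = [v for sub in mapping.values() for v in (sub if isinstance(sub, list) else [sub])]
values = list(set(values))

# Print any mapped ID that is not in uniprot_ids
for v in values:
    if v not in uniprot_ids:
        print(v)

print(values)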

['A0A1Z2SIQ1', 'W6R2Y2', 'A0A061D1S0', 'A0A0G3BI90', 'A0A1W6L588', 'R4YKL9', 'A0A1B0Z6Y3', 'G9BY57', 'P19833', 'A4Y035', 'A0A0K8P6T7']
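
The loop above only reports mapped IDs that are absent from `uniprot_ids`; it would not notice an expected ID that never appears in the mapping. A minimal sketch of a two-way comparison follows. The helper names and the toy IDs are hypothetical. The only assumption carried over from the cell above is that each mapping value is either a single ID or a list of IDs.

```python
def flatten_mapping_values(mapping):
    """Return the set of all target IDs, whether stored as a string or a list."""
    flat = set()
    for value in mapping.values():
        if isinstance(value, list):
            flat.update(value)
        else:
            flat.add(value)
    return flat


def compare_ids(expected, mapping):
    """Return (missing, unexpected) IDs as sorted lists."""
    mapped = flatten_mapping_values(mapping)
    expected_set = set(expected)
    missing = sorted(expected_set - mapped)      # expected but never mapped
    unexpected = sorted(mapped - expected_set)   # mapped but not expected
    return missing, unexpected


# Toy data, not real accessions
toy_expected = ["UP_1", "UP_2", "UP_3"]
toy_mapping = {"NCBI_A": "UP_1", "NCBI_B": ["UP_2", "UP_4"]}

missing, unexpected = compare_ids(toy_expected, toy_mapping)
print("missing:", missing)
print("unexpected:", unexpected)
```

With the real data, the call would be `compare_ids(uniprot_ids, mapping)`; two empty lists mean both ID sets agree.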


In [38]:
with open("mapping_test_uniparc.json", "r") as f:
    mapping = json.load(f)

values = list(mapping.values())
print(values)

['UPI0006A8DE61', 'UPI0003429338', 'UPI000480AD3A', 'UPI00064004B7', 'UPI0003A9F0B5', 'UPI0000E79739', 'UPI000012E6AB', 'UPI000273702F', 'UPI000273702F', 'UPI000A201A5A', 'UPI000243C995']
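
`UPI000273702F` appears twice in the output above, so two NCBI IDs apparently share one UniParc entry. Inverting the mapping shows which ones. This is a sketch on toy data: `group_by_target` is a hypothetical helper, and it assumes each mapping value is a single string, as the printed output suggests.

```python
from collections import defaultdict


def group_by_target(mapping):
    """Return target IDs that more than one source ID maps to."""
    groups = defaultdict(list)
    for source_id, target_id in mapping.items():
        groups[target_id].append(source_id)
    # Keep only targets shared by at least two source IDs
    return {target: sources for target, sources in groups.items() if len(sources) > 1}


# Toy data, not real accessions
toy_mapping = {"NCBI_A": "UPI_X", "NCBI_B": "UPI_Y", "NCBI_C": "UPI_Y"}
shared = group_by_target(toy_mapping)
print(shared)
```

Applied to the dictionary loaded from `mapping_test_uniparc.json`, the result would list the NCBI IDs behind the repeated UniParc ID.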
