## Running profet

Example notebook showing how to run profet from Python.

In [1]:
import profet

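ONLY_ALPHAFOLD = "F4HvG8"  # identifier found only in the AlphaFold database
ONLY_PDB = "7U6Q"          # identifier found only in the PDB
BOTH = "A0A023FDY8"        # identifier found in both databases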


Create a fetcher with default settings and confirm that its default database is the PDB. With this default, every search tries the PDB first and falls back to the AlphaFold database only if the structure is not found there.

In [4]:
fetcher = profet.Fetcher()
print('Current default database: ', fetcher.get_default_db())

Current default database:  pdb
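
The fallback order described above can be sketched without touching the network. This is a minimal illustration, not profet's implementation: `first_available`, `KNOWN`, and `LOOKUPS` are hypothetical names, and the toy availability data is taken from the outputs shown later in this notebook.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for a real availability lookup; not part of profet.
Lookup = Callable[[str], bool]


def first_available(
    identifier: str, lookups: Dict[str, Lookup], default_db: str = "pdb"
) -> Optional[str]:
    """Return the first database that holds the identifier, default first."""
    # A stable sort moves the default database to the front and keeps
    # the remaining databases in their original order.
    order = sorted(lookups, key=lambda name: name != default_db)
    for name in order:
        if lookups[name](identifier):
            return name
    return None


# Toy availability data (assumption: copied from this notebook's outputs).
KNOWN = {
    "pdb": {"7U6Q", "A0A023FDY8"},
    "alphafold": {"F4HvG8", "A0A023FDY8"},
}
LOOKUPS = {name: (lambda ident, ids=ids: ident in ids) for name, ids in KNOWN.items()}

print(first_available("7U6Q", LOOKUPS))
print(first_available("F4HvG8", LOOKUPS))
```

An identifier present in both databases resolves to whichever database is passed as `default_db`.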


For each protein, check its availability, i.e., which databases hold a structure file for it.

In [13]:
available_only_AF = fetcher.check_db(ONLY_ALPHAFOLD)
available_only_pdb = fetcher.check_db(ONLY_PDB)
available_both = fetcher.check_db(BOTH)

print('Database available for ', ONLY_ALPHAFOLD, ' is ', available_only_AF)
print('Database available for ', ONLY_PDB, ' is ', available_only_pdb)
print('Databases available for ', BOTH, ' are ', available_both)

Querying RCSB Search using the following parameters:
 {"query": {"type": "terminal", "service": "full_text", "parameters": {"value": "F4HvG8"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"} 

Querying RCSB Search using the following parameters:
 {"query": {"type": "terminal", "service": "full_text", "parameters": {"value": "7U6Q"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"} 

Querying RCSB Search using the following parameters:
 {"query": {"type": "terminal", "service": "full_text", "parameters": {"value": "A0A023FDY8"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"} 

Database available for  F4HvG8  is  ['alphafold']
Database available for  7U6Q  is  ['pdb']
Databases available for  A0A023FDY8  are  ['pdb', 'alphafold']
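
When many identifiers are checked, it can help to invert the result and list the identifiers held by each database. The sketch below assumes `check_db` returns a list of database names, as the printed output above suggests; `group_by_database` is a hypothetical helper, and the data is copied from that output rather than fetched.

```python
from collections import defaultdict
from typing import Dict, List

# Availability lists copied from the output above.
availability = {
    "F4HvG8": ["alphafold"],
    "7U6Q": ["pdb"],
    "A0A023FDY8": ["pdb", "alphafold"],
}


def group_by_database(results: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each database name to the identifiers available in it."""
    grouped = defaultdict(list)
    for identifier, databases in results.items():
        for db in databases:
            grouped[db].append(identifier)
    return dict(grouped)


by_db = group_by_database(availability)

# Identifiers that can be fetched from more than one database.
in_both = [ident for ident, dbs in availability.items() if len(dbs) > 1]

print(by_db)
print(in_both)
```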


To get the files, call the fetcher's `get_file` method.

In [16]:
PDB = fetcher.get_file(ONLY_PDB)
AF = fetcher.get_file(ONLY_ALPHAFOLD)

Querying RCSB Search using the following parameters:
 {"query": {"type": "terminal", "service": "full_text", "parameters": {"value": "7U6Q"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"} 

Structure available on defaulted database: pdb
Sending GET request to https://files.rcsb.org/download/7U6Q.pdb.gz to fetch 7U6Q's pdb file as a string.
Querying RCSB Search using the following parameters:
 {"query": {"type": "terminal", "service": "full_text", "parameters": {"value": "F4HvG8"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"} 

Structure available on alternative database: alphafold
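
The log above shows the PDB file being requested as a gzip-compressed download and returned as a string. A rough sketch of those two steps, using only the standard library, might look as follows. The URL pattern is taken from the log line; `rcsb_download_url` and `decode_gzipped_pdb` are hypothetical helpers, and the sample payload is generated locally instead of being downloaded.

```python
import gzip


def rcsb_download_url(pdb_id: str) -> str:
    """Build the download URL, following the pattern seen in the log above."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb.gz"


def decode_gzipped_pdb(payload: bytes) -> str:
    """Decompress a gzip payload and decode it to text."""
    return gzip.decompress(payload).decode("utf-8")


# Locally generated stand-in for a downloaded payload (not real PDB data).
sample = gzip.compress(b"HEADER    HYPOTHETICAL SAMPLE\nEND\n")
text = decode_gzipped_pdb(sample)

print(rcsb_download_url("7U6Q"))
print(text.splitlines()[0])
```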


When a structure is available in both databases, pass `db` to choose the source; without it, the default database is used:

In [21]:
from_pdb = fetcher.get_file(BOTH)
from_AF = fetcher.get_file(BOTH, db='alphafold')

Querying RCSB Search using the following parameters:
 {"query": {"type": "terminal", "service": "full_text", "parameters": {"value": "A0A023FDY8"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"} 

Structure available on defaulted database: pdb
Sending GET request to https://files.rcsb.org/download/7S4N.pdb.gz to fetch 7S4N's pdb file as a string.
Querying RCSB Search using the following parameters:
 {"query": {"type": "terminal", "service": "full_text", "parameters": {"value": "A0A023FDY8"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"} 

Structure available on defaulted database: alphafold


To save the files locally, set the parameter `filesave` to `True`.

In [22]:
PDB = fetcher.get_file(ONLY_PDB, filesave=True)

Querying RCSB Search using the following parameters:
 {"query": {"type": "terminal", "service": "full_text", "parameters": {"value": "7U6Q"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"} 

Structure available on defaulted database: pdb
Sending GET request to https://files.rcsb.org/download/7U6Q.pdb.gz to fetch 7U6Q's pdb file as a string.
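
The notebook does not show where, or under what name, the file is written. For cases where more control is needed, structure text that is already in memory can be written out by hand. The sketch below is an assumption-laden illustration: `save_structure`, its parameters, and the `<identifier>.<extension>` naming are hypothetical and are not taken from profet.

```python
import tempfile
from pathlib import Path


def save_structure(
    content: str, identifier: str, directory: str = ".", extension: str = "pdb"
) -> Path:
    """Write structure text to <directory>/<identifier>.<extension>."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / f"{identifier}.{extension}"
    path.write_text(content, encoding="utf-8")
    return path


# Usage with a temporary directory and placeholder content (not real PDB data).
with tempfile.TemporaryDirectory() as tmp:
    saved = save_structure("HEADER    HYPOTHETICAL SAMPLE\nEND\n", "7U6Q", tmp)
    saved_name = saved.name
    saved_text = saved.read_text(encoding="utf-8")

print(saved_name)
```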


The fetcher can also report its search history. Each time a protein is searched for and found in a database, the fetcher caches the result.

In [24]:
fetcher.search_history()

{'7U6Q': ['pdb'], 'F4HvG8': ['alphafold'], 'A0A023FDY8': ['pdb', 'alphafold']}
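
The recording behaviour can be pictured as a dictionary keyed by identifier that is updated only on successful searches. The class below is a minimal sketch under that assumption, not profet's code: `HistoryRecorder` and the toy `lookup` function are hypothetical, and the data mirrors the history printed above.

```python
from typing import Callable, Dict, List


class HistoryRecorder:
    """Wrap an availability lookup and remember every successful search."""

    def __init__(self, lookup: Callable[[str], List[str]]) -> None:
        self._lookup = lookup
        self._history: Dict[str, List[str]] = {}

    def check(self, identifier: str) -> List[str]:
        found = self._lookup(identifier)
        if found:
            # Only identifiers found in at least one database are recorded.
            self._history[identifier] = found
        return found

    def search_history(self) -> Dict[str, List[str]]:
        # Return a copy so callers cannot modify the internal record.
        return dict(self._history)


# Toy lookup built from the history printed above (no network access).
TOY_DATA = {
    "7U6Q": ["pdb"],
    "F4HvG8": ["alphafold"],
    "A0A023FDY8": ["pdb", "alphafold"],
}


def lookup(identifier: str) -> List[str]:
    return TOY_DATA.get(identifier, [])


recorder = HistoryRecorder(lookup)
for ident in ["7U6Q", "F4HvG8", "A0A023FDY8", "UNKNOWN"]:
    recorder.check(ident)

history = recorder.search_history()
print(history)
```

The identifier that is not found never enters the history, which matches the description that only proteins found in a database are cached.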