In [1]:
! pip install chembl_webresource_client

Collecting chembl_webresource_client
  Downloading chembl_webresource_client-0.10.9-py3-none-any.whl.metadata (1.4 kB)
Collecting requests-cache~=1.2 (from chembl_webresource_client)
  Downloading requests_cache-1.2.1-py3-none-any.whl.metadata (9.9 kB)
Collecting cattrs>=22.2 (from requests-cache~=1.2->chembl_webresource_client)
  Downloading cattrs-24.1.2-py3-none-any.whl.metadata (8.4 kB)
Collecting url-normalize>=1.4 (from requests-cache~=1.2->chembl_webresource_client)
  Downloading url_normalize-1.4.3-py2.py3-none-any.whl.metadata (3.1 kB)
Downloading chembl_webresource_client-0.10.9-py3-none-any.whl (55 kB)
Downloading requests_cache-1.2.1-py3-none-any.whl (61 kB)
Downloading cattrs-24.1.2-py3-none-any.whl (66 kB)

In [2]:
import pandas as pd
from chembl_webresource_client.new_client import new_client
target = new_client.target

In [4]:
#Diseases to use as search keywords for ChEMBL targets
diseases = ["coronavirus", "diabetes", "influenza", "prostate cancer", "breast cancer"]
combined_df = pd.DataFrame()

for disease in diseases:
    print(f"Searching for targets related to {disease}...")

    target = new_client.target
    target_search = target.search(disease)
    targets = pd.DataFrame.from_dict(target_search)

    if targets.empty:
        print(f"No targets found for {disease}. Skipping...")
        continue

    #Displaying the number of targets found for the current disease
    print(f"Found {len(targets)} targets for {disease}.")

    for i in range(min(len(targets), 3)):  #Limiting to the first 3 targets for simplicity
        selected_target = targets.target_chembl_id[i]
        print(f"Fetching bioactivity data for target {selected_target} ({targets['pref_name'][i]})...")

        activity = new_client.activity
        res = activity.filter(target_chembl_id=selected_target).filter(standard_type="IC50")

        if not res:
            print(f"No bioactivity data found for target {selected_target}.")
            continue

        df = pd.DataFrame.from_dict(res)
        if df.empty:
            print(f"Bioactivity data is empty for target {selected_target}.")
            continue

        #Filtering out missing standard values
        df = df[df.standard_value.notna()]

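        #Classifying each compound by IC50: >= 10000 inactive, <= 1000 active, otherwise intermediate
        bioactivity_class = []
        for value in df.standard_value:
            ic50 = float(value)  #Converting once instead of in every comparison
            if ic50 >= 10000:
                bioactivity_class.append("inactive")
            elif ic50 <= 1000:
                bioactivity_class.append("active")
            else:
                bioactivity_class.append("intermediate")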

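        wanted_selections = ["molecule_chembl_id", "canonical_smiles", "standard_value"]
        #Copying the selection so new columns are added to an independent DataFrame, not a slice of df
        f_df = df[wanted_selections].copy()
        f_df['bioactivity'] = bioactivity_class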

        #Adding disease and target information
        f_df['disease'] = disease
        f_df['target_chembl_id'] = selected_target
        f_df['target_name'] = targets['pref_name'][i]  #Adding target name for clarity

        combined_df = pd.concat([combined_df, f_df], ignore_index=True)

combined_df.to_csv("bioactivity_data_combined.csv", index=False)
print("Data collection complete. Saved as bioactivity_data_combined.csv.")


Searching for targets related to coronavirus...
Found 10 targets for coronavirus.
Fetching bioactivity data for target CHEMBL613732 (Coronavirus)...
No bioactivity data found for target CHEMBL613732.
Fetching bioactivity data for target CHEMBL612744 (Feline coronavirus)...
Fetching bioactivity data for target CHEMBL5209664 (Murine coronavirus)...
Searching for targets related to diabetes...
Found 13 targets for diabetes.
Fetching bioactivity data for target CHEMBL1914266 (Islet amyloid polypeptide)...
Fetching bioactivity data for target CHEMBL235 (Peroxisome proliferator-activated receptor gamma)...


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  f_df['bioactivity'] = bioactivity_class

Fetching bioactivity data for target CHEMBL2459 (Peroxisome proliferator-activated receptor gamma)...


Searching for targets related to influenza...
Found 10 targets for influenza.
Fetching bioactivity data for target CHEMBL613128 (unidentified influenza virus)...


Fetching bioactivity data for target CHEMBL613129 (Influenza B virus)...


Fetching bioactivity data for target CHEMBL613740 (Influenza A virus)...


Searching for targets related to prostate cancer...
Found 117 targets for prostate cancer.
Fetching bioactivity data for target CHEMBL4148 (L-type amino acid transporter 3)...


Fetching bioactivity data for target CHEMBL613656 (Prostate)...
No bioactivity data found for target CHEMBL613656.
Fetching bioactivity data for target CHEMBL3112376 (Alpha-ketoglutarate-dependent dioxygenase alkB homolog 3)...


Searching for targets related to breast cancer...
Found 105 targets for breast cancer.
Fetching bioactivity data for target CHEMBL4739687 (ABC-type xenobiotic transporter)...
No bioactivity data found for target CHEMBL4739687.
Fetching bioactivity data for target CHEMBL5990 (Breast cancer type 1 susceptibility protein)...


Fetching bioactivity data for target CHEMBL614788 (Breast cancer cell line)...
Data collection complete. Saved as bioactivity_data_combined.csv.
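
The thresholding used in the loop above (IC50 >= 10000 is "inactive", <= 1000 is "active", anything in between is "intermediate") can also be written without a Python loop. The following is a minimal sketch, not the notebook's own method: the function name `label_bioactivity`, its threshold parameters, and the example molecule IDs are hypothetical and exist only for illustration. It takes an explicit `.copy()` of the selected columns, which is the usual way to avoid the "A value is trying to be set on a copy of a slice" warning when adding columns afterwards.

```python
import pandas as pd


def label_bioactivity(df, active_max=1000.0, inactive_min=10000.0):
    """Return selected columns of df plus a 'bioactivity' label per row.

    Hypothetical helper: thresholds mirror the ones used in the loop above.
    """
    wanted = ["molecule_chembl_id", "canonical_smiles", "standard_value"]
    # Drop rows without a standard value and take an explicit copy, so the
    # column assignments below act on an independent DataFrame.
    out = df.loc[df["standard_value"].notna(), wanted].copy()
    # Values may arrive as strings; convert them for comparison.
    # Non-numeric strings become NaN here and keep the default label below
    # (the loop version would raise a ValueError on them instead).
    values = pd.to_numeric(out["standard_value"], errors="coerce")
    out["bioactivity"] = "intermediate"
    out.loc[values <= active_max, "bioactivity"] = "active"
    out.loc[values >= inactive_min, "bioactivity"] = "inactive"
    return out


# Made-up rows for illustration only; these are not real ChEMBL records.
example = pd.DataFrame({
    "molecule_chembl_id": ["MOL_A", "MOL_B", "MOL_C", "MOL_D"],
    "canonical_smiles": ["CCO", "CCN", "CCC", "CCCl"],
    "standard_value": ["500.0", "5000.0", "20000.0", None],
})
labelled = label_bioactivity(example)
print(labelled)
```

Inside the loop, the returned frame could take the place of `f_df` before the disease and target columns are added, assuming the same three columns are present in the activity results.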

