# **CDD - CCR5 : Descriptor Calculation and Dataset Preparation Part 03**

Khalid El Akri

[*'Chem Code Professor' YouTube channel*](http://youtube.com/@chemcodeprofessor)

In this Jupyter notebook, we build a real-life **data science project** that you can include in your **data science portfolio**. Specifically, we build a machine learning model using BindingDB bioactivity data for CCR5 inhibitors.

In **Part 03**, we calculate molecular descriptors, which are quantitative descriptions of the compounds in the dataset. We then assemble them into a dataset for model building in Part 04.

---

## **Download PaDEL-Descriptor**

In [1]:
! curl -L -O https://github.com/chemcodeprofessor/data/raw/master/PaDel-Descriptor.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  206k    0  206k    0     0   266k      0 --:--:-- --:--:-- --:--:--  266k


In [1]:
! curl -L -O https://github.com/chemcodeprofessor/data/raw/master/PaDel-Descriptor.sh

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  206k    0  206k    0     0   210k      0 --:--:-- --:--:-- --:--:--  210k
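Both downloads above report the same transfer size, which can happen when `curl` saves a web page instead of the intended file. The following is a minimal sketch of a check that the downloaded archive really is a ZIP file, using only the standard library. The helper name `looks_like_zip` is hypothetical and not part of the original notebook.

```python
import zipfile
from pathlib import Path


def looks_like_zip(path):
    """Return True if `path` is an existing file with a valid ZIP structure."""
    p = Path(path)
    # is_zipfile() inspects the file's magic number and end-of-archive record,
    # so an HTML error page saved under a .zip name is reported as False.
    return p.is_file() and zipfile.is_zipfile(str(p))
```

Calling `looks_like_zip("PaDel-Descriptor.zip")` before unzipping gives an early warning if the download went wrong.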


In [2]:
! ls -l

total 90040
-rw-r--r--@  1 akrikhalid  staff  22654982 May 29 20:39 C-C chemokine receptor type 5_Inhibitors_3316.sdf
-rw-r--r--@  1 akrikhalid  staff   5861006 May 29 20:38 C-C chemokine receptor type 5_Inhibitors_3316.tsv
-rw-r--r--   1 akrikhalid  staff    437936 Jun  5 22:57 CCR5_bioa_data_preprocessed.csv
-rw-r--r--@  1 akrikhalid  staff    239166 Jun  5 22:54 CCR5_inhibitors 3316_Part 01.ipynb
-rw-r--r--@  1 akrikhalid  staff    256981 Jun  5 22:59 CCR5_inhibitors 3316_Part 02.ipynb
-rw-r--r--@  1 akrikhalid  staff     89689 May 29 10:53 CCR5_inhibitors 3316_Part 03.ipynb
-rw-r--r--@  1 akrikhalid  staff     76160 May 29 11:06 CCR5_inhibitors 3316_Part 04.ipynb
-rw-r--r--@  1 akrikhalid  staff    563191 May 29 11:20 CCR5_inhibitors 3316_Part_05.ipynb
-rw-r--r--   1 akrikhalid  staff   6052996 Jun  5 22:49 Output3_C-C chemokine receptor type 5_Inhibitors_3316.csv
drwxr-xr-x  21 akrikhalid  staff       672 May 24 13:06 PaDel-Descriptor
...

## **UnZip PaDEL-Descriptor.zip**

In [None]:
import zipfile

with zipfile.ZipFile("PaDel-Descriptor.zip", "r") as zip_ref:
    zip_ref.extractall()
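As a hedged variation on the cell above, the sketch below tests the archive for corrupt members before extracting and returns the list of extracted names, so you can confirm that the `.jar` and `.xml` files are where the shell script expects them. The function name `extract_and_list` and its `target_dir` parameter are illustrative assumptions.

```python
import zipfile


def extract_and_list(zip_path, target_dir="."):
    """Extract `zip_path` into `target_dir` and return the member names."""
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        # testzip() returns the name of the first corrupt member, or None.
        corrupt = zip_ref.testzip()
        if corrupt is not None:
            raise zipfile.BadZipFile(f"corrupt member: {corrupt}")
        zip_ref.extractall(target_dir)
        return zip_ref.namelist()
```

For example, `extract_and_list("PaDel-Descriptor.zip")` extracts into the current directory, matching the behavior of the original cell.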

In [3]:
! ls -l

total 90040
-rw-r--r--@  1 akrikhalid  staff  22654982 May 29 20:39 C-C chemokine receptor type 5_Inhibitors_3316.sdf
-rw-r--r--@  1 akrikhalid  staff   5861006 May 29 20:38 C-C chemokine receptor type 5_Inhibitors_3316.tsv
-rw-r--r--   1 akrikhalid  staff    437936 Jun  5 22:57 CCR5_bioa_data_preprocessed.csv
-rw-r--r--@  1 akrikhalid  staff    239166 Jun  5 22:54 CCR5_inhibitors 3316_Part 01.ipynb
-rw-r--r--@  1 akrikhalid  staff    256981 Jun  5 22:59 CCR5_inhibitors 3316_Part 02.ipynb
-rw-r--r--@  1 akrikhalid  staff     89689 May 29 10:53 CCR5_inhibitors 3316_Part 03.ipynb
-rw-r--r--@  1 akrikhalid  staff     76160 May 29 11:06 CCR5_inhibitors 3316_Part 04.ipynb
-rw-r--r--@  1 akrikhalid  staff    563191 May 29 11:20 CCR5_inhibitors 3316_Part_05.ipynb
-rw-r--r--   1 akrikhalid  staff   6052996 Jun  5 22:49 Output3_C-C chemokine receptor type 5_Inhibitors_3316.csv
drwxr-xr-x  21 akrikhalid  staff       672 May 24 13:06 PaDel-Descriptor
...

## **Loading CCR5 bioactivity data**

Load the curated BindingDB bioactivity data that was pre-processed in Parts 1 and 2 of this machine learning project series. Here we use the **CCR5_bioa_data_preprocessed.csv** file, which contains the pIC50 values that we will use to build a regression model.

In [17]:
import pandas as pd

In [18]:
df = pd.read_csv('CCR5_bioa_data_preprocessed.csv')

In [19]:
df

| Unnamed: 0 | mol_bdID | mol_smiles | bioactivity_class | MW | LogP | NumHDonors | NumHAcceptors | pIC50 |
|---|---|---|---|---|---|---|---|---|
| 0 | 50853863 | `Cc1ccnc(C)c1C(=O)N1CCC(C)(N2CCN([C@@H](C)c3ccc...` | active | 502.625 | 5.47924 | 0.0 | 4.0 | 9.30103 |
| 1 | 50961778 | `Cc1cccc(C)c1C(=O)N1CCC(C)(N2CCC(N(c3ccccc3)c3c...` | active | 482.672 | 5.99584 | 0.0 | 4.0 | 9.30103 |
| 2 | 50448211 | `Cc1cc(Cl)nc(C)c1C(=O)NCC[C@@H](C)N1CCC(N(Cc2cc...` | active | 612.196 | 5.49634 | 3.0 | 6.0 | 9.30103 |
| 3 | 50417247 | `Cc1cccc(C)c1C(=O)N1CCC(C)(N2CCC(N(Cc3ccccc3)C(...` | active | 487.688 | 5.20134 | 0.0 | 3.0 | 9.30103 |
| 4 | 50260844 | `O=C(O)[C@@H](CC1CCC1)N1C[C@H](CN2CCC(CCCc3cccc...` | active | 506.706 | 6.21940 | 1.0 | 3.0 | 9.30103 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2941 | 50260998 | `CCn1nc(Cc2ccc(Oc3ccccc3)cc2)cc1C1CCN(C[C@H]2CN...` | active | 678.893 | 8.35370 | 1.0 | 6.0 | 9.30103 |
| 2942 | 50264307 | `CC[C@@H](C)[C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(C...` | active | 666.882 | 7.60620 | 1.0 | 6.0 | 9.30103 |
| 2943 | 50260929 | `Cc1cccc(C)c1C(=O)N1CCC(C)(N2CCC(Nc3ccc(Br)cc3)...` | active | 484.482 | 5.63714 | 1.0 | 3.0 | 9.30103 |
| 2944 | 50260935 | `CC[C@@H](C)[C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(C...` | active | 658.781 | 6.96830 | 1.0 | 6.0 | 9.30103 |
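The `Unnamed: 0` column in the output above is typically a row index that an earlier `to_csv` call wrote to disk. The sketch below uses a made-up two-row CSV (not the real data) to show how passing `index_col=0` to `pd.read_csv` reads that column back as the index instead of a data column.

```python
import io

import pandas as pd

# Toy CSV with an unnamed first column, mimicking a saved DataFrame index.
csv_text = ",mol_bdID,pIC50\n0,111,6.5\n1,222,7.1\n"

plain = pd.read_csv(io.StringIO(csv_text))
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(plain.columns.tolist())    # ['Unnamed: 0', 'mol_bdID', 'pIC50']
print(indexed.columns.tolist())  # ['mol_bdID', 'pIC50']
```

This is optional here, because the next cell selects only `mol_smiles` and `mol_bdID`, so the extra column never reaches PaDEL.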


In [20]:
selection = ['mol_smiles','mol_bdID']
df_selection = df[selection]
df_selection.to_csv('molecule.smi', sep='\t', index=False, header=False)
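Before handing the `.smi` file to PaDEL, it may be worth checking the two exported columns for rows that could cause trouble later, such as missing SMILES strings or repeated molecule IDs. This is a minimal sketch under the assumption that the column names match the DataFrame shown earlier; the helper `check_smiles_table` is hypothetical.

```python
import pandas as pd


def check_smiles_table(df, smiles_col="mol_smiles", id_col="mol_bdID"):
    """Count rows that may break descriptor calculation or later joins."""
    return {
        "rows": len(df),
        # A missing SMILES string gives PaDEL nothing to parse.
        "missing_smiles": int(df[smiles_col].isna().sum()),
        # Repeated IDs make it ambiguous which descriptor row belongs to which pIC50.
        "duplicate_ids": int(df[id_col].duplicated().sum()),
    }
```

Running `check_smiles_table(df)` on the loaded data returns a small dictionary of counts that you can inspect before writing `molecule.smi`.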

In [21]:
! head -6 molecule.smi

Cc1ccnc(C)c1C(=O)N1CCC(C)(N2CCN([C@@H](C)c3ccc(C(F)(F)F)cc3)[C@@H](C)C2)CC1	50853863
Cc1cccc(C)c1C(=O)N1CCC(C)(N2CCC(N(c3ccccc3)c3cccnc3)CC2)CC1	50961778
Cc1cc(Cl)nc(C)c1C(=O)NCC[C@@H](C)N1CCC(N(Cc2ccsc2)C(=O)NCc2ccc(C(=O)O)cc2)CC1	50448211
Cc1cccc(C)c1C(=O)N1CCC(C)(N2CCC(N(Cc3ccccc3)C(=O)C3CC3)CC2)CC1	50417247
O=C(O)[C@@H](CC1CCC1)N1C[C@H](CN2CCC(CCCc3ccccc3)CC2)[C@@H](c2cccc(F)c2)C1	50260844
CCCCOc1ccc(Cc2cc(C3CCN(C[C@H]4CN([C@@H](C(=O)O)C(C)C)C[C@@H]4c4cccc(F)c4)CC3)n(CC)n2)cc1	50417235


In [22]:
! cat molecule.smi | wc -l

    2946


## **Calculate fingerprint descriptors**

### **Calculate PaDEL descriptors**

In [23]:
! cat PaDel-Descriptor.sh

java -Xms1G -Xmx1G -Djava.awt.headless=true -jar ./PaDEL-Descriptor/PaDEL-Descriptor.jar -removesalt -standardizenitro -fingerprints -descriptortypes ./PaDEL-Descriptor/PubchemFingerprinter.xml -dir ./ -file descriptors_output.csv
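The script above is a single `java` call: the `-Xms1G`/`-Xmx1G` options set the JVM heap size, `-descriptortypes` points at the PubChem fingerprint definition, `-dir` names the folder containing `molecule.smi`, and `-file` names the output CSV. As a sketch only, the same command can be assembled in Python, which makes the options easier to change. The function `build_padel_command` and its parameter names are assumptions for illustration; the flags themselves are copied from the script.

```python
import shlex


def build_padel_command(
    jar="./PaDEL-Descriptor/PaDEL-Descriptor.jar",
    descriptor_xml="./PaDEL-Descriptor/PubchemFingerprinter.xml",
    input_dir="./",
    output_csv="descriptors_output.csv",
    heap="1G",
):
    """Return the PaDEL command from the shell script as an argument list."""
    return [
        "java", f"-Xms{heap}", f"-Xmx{heap}", "-Djava.awt.headless=true",
        "-jar", jar,
        "-removesalt", "-standardizenitro", "-fingerprints",
        "-descriptortypes", descriptor_xml,
        "-dir", input_dir,
        "-file", output_csv,
    ]


cmd = build_padel_command()
print(shlex.join(cmd))  # prints the same line as the shell script
```

To run it from the notebook, the list can be passed to `subprocess.run(cmd, check=True)`, which raises an error if Java exits with a non-zero status.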


In [24]:
! bash PaDel-Descriptor.sh

Processing 50853863 in molecule.smi (1/2946). 
Processing 50961778 in molecule.smi (2/2946). 
Processing 50448211 in molecule.smi (3/2946). 
Processing 50417247 in molecule.smi (4/2946). 
Processing 50260844 in molecule.smi (5/2946). Average speed: 4.52 s/mol.
Processing 50417235 in molecule.smi (6/2946). Average speed: 2.49 s/mol.
Processing 50986660 in molecule.smi (7/2946). Average speed: 1.71 s/mol.
Processing 50879830 in molecule.smi (8/2946). Average speed: 1.31 s/mol.
Processing 50719084 in molecule.smi (9/2946). Average speed: 1.37 s/mol.
Processing 50986655 in molecule.smi (10/2946). Average speed: 1.18 s/mol.
Processing 50417237 in molecule.smi (11/2946). Average speed: 1.02 s/mol.
Processing 50260948 in molecule.smi (12/2946). Average speed: 0.92 s/mol.
Processing 50986651 in molecule.smi (13/2946). Average speed: 0.86 s/mol.
Processing 50260869 in molecule.smi (14/2946). Average speed: 0.83 s/mol.
Processing 50260882 in molecule.smi (15/2946). Average speed: 0.80 s/mol.
...

Processing 50304422 in molecule.smi (114/2946). Average speed: 0.32 s/mol.
Processing 50260822 in molecule.smi (115/2946). Average speed: 0.32 s/mol.
Processing 50417270 in molecule.smi (116/2946). Average speed: 0.32 s/mol.
Processing 50260968 in molecule.smi (117/2946). Average speed: 0.32 s/mol.
Processing 50260932 in molecule.smi (118/2946). Average speed: 0.32 s/mol.
Processing 51080201 in molecule.smi (119/2946). Average speed: 0.31 s/mol.
Processing 51080202 in molecule.smi (120/2946). Average speed: 0.31 s/mol.
Processing 51080203 in molecule.smi (121/2946). Average speed: 0.31 s/mol.
Processing 50961792 in molecule.smi (122/2946). Average speed: 0.31 s/mol.
Processing 50853846 in molecule.smi (123/2946). Average speed: 0.31 s/mol.
Processing 50450932 in molecule.smi (124/2946). Average speed: 0.31 s/mol.
Processing 50417242 in molecule.smi (125/2946). Average speed: 0.31 s/mol.
Processing 50260817 in molecule.smi (126/2946). Average speed: 0.31 s/mol.
...

Processing 50260986 in molecule.smi (224/2946). Average speed: 0.27 s/mol.
Processing 50260990 in molecule.smi (225/2946). Average speed: 0.27 s/mol.
Processing 50260941 in molecule.smi (226/2946). Average speed: 0.27 s/mol.
Processing 606873 in molecule.smi (227/2946). Average speed: 0.27 s/mol.
Processing 51118978 in molecule.smi (228/2946). Average speed: 0.27 s/mol.
Processing 50706841 in molecule.smi (229/2946). Average speed: 0.27 s/mol.
Processing 51089533 in molecule.smi (230/2946). Average speed: 0.27 s/mol.
Processing 51080186 in molecule.smi (231/2946). Average speed: 0.27 s/mol.
Processing 51080188 in molecule.smi (233/2946). Average speed: 0.27 s/mol.
Processing 51080187 in molecule.smi (232/2946). Average speed: 0.27 s/mol.
Processing 50986653 in molecule.smi (234/2946). Average speed: 0.27 s/mol.
Processing 50879818 in molecule.smi (235/2946). Average speed: 0.27 s/mol.
Processing 50260857 in molecule.smi (236/2946). Average speed: 0.27 s/mol.
...

Processing 51002626 in molecule.smi (334/2946). Average speed: 0.24 s/mol.
Processing 50417225 in molecule.smi (335/2946). Average speed: 0.24 s/mol.
Processing 51080147 in molecule.smi (336/2946). Average speed: 0.24 s/mol.
Processing 50853857 in molecule.smi (337/2946). Average speed: 0.24 s/mol.
Processing 50157377 in molecule.smi (338/2946). Average speed: 0.24 s/mol.
Processing 50157397 in molecule.smi (339/2946). Average speed: 0.24 s/mol.
Processing 50223246 in molecule.smi (340/2946). Average speed: 0.24 s/mol.
Processing 50341756 in molecule.smi (342/2946). Average speed: 0.24 s/mol.
Processing 606878 in molecule.smi (341/2946). Average speed: 0.24 s/mol.
Processing 606785 in molecule.smi (343/2946). Average speed: 0.24 s/mol.
Processing 50188587 in molecule.smi (344/2946). Average speed: 0.24 s/mol.
Processing 50188593 in molecule.smi (345/2946). Average speed: 0.24 s/mol.
Processing 50188604 in molecule.smi (346/2946). Average speed: 0.24 s/mol.
...

Processing 51132243 in molecule.smi (445/2946). Average speed: 0.23 s/mol.
Processing 50610630 in molecule.smi (446/2946). Average speed: 0.23 s/mol.
Processing 50260867 in molecule.smi (447/2946). Average speed: 0.23 s/mol.
Processing 51132219 in molecule.smi (449/2946). Average speed: 0.23 s/mol.
Processing 50853871 in molecule.smi (448/2946). Average speed: 0.23 s/mol.
Processing 50264273 in molecule.smi (450/2946). Average speed: 0.23 s/mol.
Processing 50914933 in molecule.smi (451/2946). Average speed: 0.23 s/mol.
Processing 51002638 in molecule.smi (452/2946). Average speed: 0.23 s/mol.
Processing 50986806 in molecule.smi (453/2946). Average speed: 0.23 s/mol.
Processing 50260820 in molecule.smi (454/2946). Average speed: 0.23 s/mol.
Processing 51132220 in molecule.smi (455/2946). Average speed: 0.23 s/mol.
Processing 51132188 in molecule.smi (456/2946). Average speed: 0.23 s/mol.
Processing 50264215 in molecule.smi (457/2946). Average speed: 0.23 s/mol.
...

Processing 50847034 in molecule.smi (556/2946). Average speed: 0.21 s/mol.
Processing 50719086 in molecule.smi (557/2946). Average speed: 0.21 s/mol.
Processing 51080141 in molecule.smi (558/2946). Average speed: 0.21 s/mol.
Processing 50853853 in molecule.smi (559/2946). Average speed: 0.21 s/mol.
Processing 51080140 in molecule.smi (560/2946). Average speed: 0.21 s/mol.
Processing 50295085 in molecule.smi (561/2946). Average speed: 0.21 s/mol.
Processing 51431504 in molecule.smi (562/2946). Average speed: 0.21 s/mol.
Processing 50417217 in molecule.smi (563/2946). Average speed: 0.21 s/mol.
Processing 50215967 in molecule.smi (564/2946). Average speed: 0.21 s/mol.
Processing 50914945 in molecule.smi (565/2946). Average speed: 0.21 s/mol.
Processing 50879280 in molecule.smi (566/2946). Average speed: 0.21 s/mol.
Processing 50610637 in molecule.smi (567/2946). Average speed: 0.21 s/mol.
Processing 50417229 in molecule.smi (568/2946). Average speed: 0.21 s/mol.
...

Processing 50914940 in molecule.smi (666/2946). Average speed: 0.21 s/mol.
Processing 50260969 in molecule.smi (667/2946). Average speed: 0.21 s/mol.
Processing 50628087 in molecule.smi (668/2946). Average speed: 0.21 s/mol.
Processing 50450924 in molecule.smi (669/2946). Average speed: 0.21 s/mol.
Processing 50450929 in molecule.smi (670/2946). Average speed: 0.21 s/mol.
Processing 50450941 in molecule.smi (671/2946). Average speed: 0.21 s/mol.
Processing 50547950 in molecule.smi (672/2946). Average speed: 0.21 s/mol.
Processing 51021986 in molecule.smi (673/2946). Average speed: 0.21 s/mol.
Processing 51008173 in molecule.smi (674/2946). Average speed: 0.21 s/mol.
Processing 51008198 in molecule.smi (675/2946). Average speed: 0.21 s/mol.
Processing 606806 in molecule.smi (676/2946). Average speed: 0.21 s/mol.
Processing 50264252 in molecule.smi (679/2946). Average speed: 0.21 s/mol.
Processing 50853832 in molecule.smi (677/2946). Average speed: 0.21 s/mol.
...

Processing 50558741 in molecule.smi (776/2946). Average speed: 0.20 s/mol.
Processing 50450919 in molecule.smi (777/2946). Average speed: 0.20 s/mol.
Processing 50450926 in molecule.smi (778/2946). Average speed: 0.20 s/mol.
Processing 50450928 in molecule.smi (779/2946). Average speed: 0.20 s/mol.
Processing 50450940 in molecule.smi (780/2946). Average speed: 0.20 s/mol.
Processing 50417233 in molecule.smi (781/2946). Average speed: 0.20 s/mol.
Processing 50417213 in molecule.smi (782/2946). Average speed: 0.20 s/mol.
Processing 50215956 in molecule.smi (783/2946). Average speed: 0.20 s/mol.
Processing 50193743 in molecule.smi (784/2946). Average speed: 0.20 s/mol.
Processing 50193794 in molecule.smi (785/2946). Average speed: 0.20 s/mol.
Processing 50191299 in molecule.smi (786/2946). Average speed: 0.20 s/mol.
Processing 51080123 in molecule.smi (787/2946). Average speed: 0.20 s/mol.
Processing 50320344 in molecule.smi (788/2946). Average speed: 0.20 s/mol.
...

Processing 606872 in molecule.smi (887/2946). Average speed: 0.21 s/mol.
Processing 50264244 in molecule.smi (888/2946). Average speed: 0.21 s/mol.
Processing 51187564 in molecule.smi (889/2946). Average speed: 0.21 s/mol.
Processing 51187580 in molecule.smi (890/2946). Average speed: 0.21 s/mol.
Processing 50015721 in molecule.smi (891/2946). Average speed: 0.21 s/mol.
Processing 50022828 in molecule.smi (892/2946). Average speed: 0.21 s/mol.
Processing 51021980 in molecule.smi (893/2946). Average speed: 0.21 s/mol.
Processing 51080113 in molecule.smi (894/2946). Average speed: 0.21 s/mol.
Processing 50630113 in molecule.smi (895/2946). Average speed: 0.21 s/mol.
Processing 50558740 in molecule.smi (896/2946). Average speed: 0.22 s/mol.
Processing 50558757 in molecule.smi (897/2946). Average speed: 0.22 s/mol.
Processing 50547944 in molecule.smi (898/2946). Average speed: 0.22 s/mol.
Processing 50417252 in molecule.smi (899/2946). Average speed: 0.22 s/mol.
...

Processing 50320324 in molecule.smi (997/2946). Average speed: 0.22 s/mol.
Processing 50296734 in molecule.smi (998/2946). Average speed: 0.22 s/mol.
Processing 50260953 in molecule.smi (999/2946). Average speed: 0.22 s/mol.
Processing 50193758 in molecule.smi (1000/2946). Average speed: 0.22 s/mol.
Processing 606879 in molecule.smi (1001/2946). Average speed: 0.22 s/mol.
Processing 606861 in molecule.smi (1002/2946). Average speed: 0.22 s/mol.
Processing 50264260 in molecule.smi (1003/2946). Average speed: 0.22 s/mol.
Processing 50879286 in molecule.smi (1004/2946). Average speed: 0.22 s/mol.
Processing 606869 in molecule.smi (1005/2946). Average speed: 0.22 s/mol.
Processing 606856 in molecule.smi (1006/2946). Average speed: 0.22 s/mol.
Processing 606859 in molecule.smi (1007/2946). Average speed: 0.22 s/mol.
Processing 51209258 in molecule.smi (1008/2946). Average speed: 0.22 s/mol.
Processing 50912825 in molecule.smi (1009/2946). Average speed: 0.22 s/mol.
...

Processing 50015722 in molecule.smi (1106/2946). Average speed: 0.22 s/mol.
Processing 50417284 in molecule.smi (1107/2946). Average speed: 0.22 s/mol.
Processing 606846 in molecule.smi (1108/2946). Average speed: 0.22 s/mol.
Processing 50912839 in molecule.smi (1109/2946). Average speed: 0.22 s/mol.
Processing 51080288 in molecule.smi (1110/2946). Average speed: 0.22 s/mol.
Processing 51080286 in molecule.smi (1111/2946). Average speed: 0.22 s/mol.
Processing 51080287 in molecule.smi (1112/2946). Average speed: 0.22 s/mol.
Processing 50706852 in molecule.smi (1113/2946). Average speed: 0.22 s/mol.
Processing 50604985 in molecule.smi (1114/2946). Average speed: 0.22 s/mol.
Processing 50440387 in molecule.smi (1115/2946). Average speed: 0.22 s/mol.
Processing 50440389 in molecule.smi (1116/2946). Average speed: 0.22 s/mol.
Processing 50440376 in molecule.smi (1117/2946). Average speed: 0.22 s/mol.
Processing 50320338 in molecule.smi (1118/2946). Average speed: 0.22 s/mol.
...

Processing 50912828 in molecule.smi (1215/2946). Average speed: 0.21 s/mol.
Processing 51088023 in molecule.smi (1216/2946). Average speed: 0.21 s/mol.
Processing 51080279 in molecule.smi (1217/2946). Average speed: 0.21 s/mol.
Processing 50610636 in molecule.smi (1218/2946). Average speed: 0.21 s/mol.
Processing 50558747 in molecule.smi (1219/2946). Average speed: 0.21 s/mol.
Processing 50558749 in molecule.smi (1220/2946). Average speed: 0.21 s/mol.
Processing 50579374 in molecule.smi (1222/2946). Average speed: 0.21 s/mol.
Processing 50566061 in molecule.smi (1221/2946). Average speed: 0.21 s/mol.
Processing 50440379 in molecule.smi (1223/2946). Average speed: 0.21 s/mol.
Processing 50343952 in molecule.smi (1224/2946). Average speed: 0.21 s/mol.
Processing 50343963 in molecule.smi (1225/2946). Average speed: 0.21 s/mol.
Processing 50341753 in molecule.smi (1226/2946). Average speed: 0.21 s/mol.
Processing 50296761 in molecule.smi (1227/2946). Average speed: 0.21 s/mol.
...

Processing 51080275 in molecule.smi (1324/2946). Average speed: 0.21 s/mol.
Processing 50861017 in molecule.smi (1325/2946). Average speed: 0.21 s/mol.
Processing 50440395 in molecule.smi (1326/2946). Average speed: 0.21 s/mol.
Processing 50264233 in molecule.smi (1328/2946). Average speed: 0.21 s/mol.
Processing 50264276 in molecule.smi (1327/2946). Average speed: 0.21 s/mol.
Processing 50191309 in molecule.smi (1329/2946). Average speed: 0.21 s/mol.
Processing 50961796 in molecule.smi (1330/2946). Average speed: 0.21 s/mol.
Processing 51021985 in molecule.smi (1331/2946). Average speed: 0.21 s/mol.
Processing 606816 in molecule.smi (1332/2946). Average speed: 0.21 s/mol.
Processing 50295090 in molecule.smi (1333/2946). Average speed: 0.21 s/mol.
Processing 51080273 in molecule.smi (1334/2946). Average speed: 0.21 s/mol.
Processing 51080274 in molecule.smi (1335/2946). Average speed: 0.21 s/mol.
Processing 50579369 in molecule.smi (1336/2946). Average speed: 0.21 s/mol.
...

Processing 50343970 in molecule.smi (1432/2946). Average speed: 0.21 s/mol.
Processing 50188625 in molecule.smi (1433/2946). Average speed: 0.21 s/mol.
Processing 50188633 in molecule.smi (1434/2946). Average speed: 0.21 s/mol.
Processing 50173565 in molecule.smi (1435/2946). Average speed: 0.21 s/mol.
Processing 50173596 in molecule.smi (1436/2946). Average speed: 0.21 s/mol.
Processing 50610647 in molecule.smi (1438/2946). Average speed: 0.21 s/mol.
Processing 50173597 in molecule.smi (1437/2946). Average speed: 0.21 s/mol.
Processing 50448202 in molecule.smi (1439/2946). Average speed: 0.21 s/mol.
Processing 50264234 in molecule.smi (1440/2946). Average speed: 0.21 s/mol.
Processing 50260773 in molecule.smi (1441/2946). Average speed: 0.21 s/mol.
Processing 50193902 in molecule.smi (1442/2946). Average speed: 0.21 s/mol.
Processing 50915722 in molecule.smi (1443/2946). Average speed: 0.21 s/mol.
Processing 51080254 in molecule.smi (1444/2946). Average speed: 0.21 s/mol.
...

Processing 50191303 in molecule.smi (1540/2946). Average speed: 0.21 s/mol.
Processing 51444878 in molecule.smi (1541/2946). Average speed: 0.21 s/mol.
Processing 50558742 in molecule.smi (1542/2946). Average speed: 0.21 s/mol.
Processing 50320335 in molecule.smi (1543/2946). Average speed: 0.21 s/mol.
Processing 50547916 in molecule.smi (1544/2946). Average speed: 0.21 s/mol.
Processing 50173567 in molecule.smi (1545/2946). Average speed: 0.21 s/mol.
Processing 50173589 in molecule.smi (1546/2946). Average speed: 0.21 s/mol.
Processing 50173594 in molecule.smi (1547/2946). Average speed: 0.21 s/mol.
Processing 50173604 in molecule.smi (1548/2946). Average speed: 0.21 s/mol.
Processing 50193913 in molecule.smi (1549/2946). Average speed: 0.21 s/mol.
Processing 50288534 in molecule.smi (1550/2946). Average speed: 0.21 s/mol.
Processing 50440381 in molecule.smi (1551/2946). Average speed: 0.21 s/mol.
Processing 50440393 in molecule.smi (1552/2946). Average speed: 0.21 s/mol.
...

Processing 50915733 in molecule.smi (1648/2946). Average speed: 0.21 s/mol.
Processing 50847036 in molecule.smi (1649/2946). Average speed: 0.21 s/mol.
Processing 50547926 in molecule.smi (1651/2946). Average speed: 0.21 s/mol.
Processing 50621916 in molecule.smi (1650/2946). Average speed: 0.21 s/mol.
Processing 50320327 in molecule.smi (1652/2946). Average speed: 0.21 s/mol.
Processing 51080247 in molecule.smi (1653/2946). Average speed: 0.21 s/mol.
Processing 51080248 in molecule.smi (1654/2946). Average speed: 0.21 s/mol.
Processing 50879812 in molecule.smi (1655/2946). Average speed: 0.21 s/mol.
Processing 50199676 in molecule.smi (1656/2946). Average speed: 0.21 s/mol.
Processing 50548411 in molecule.smi (1657/2946). Average speed: 0.21 s/mol.
Processing 50417282 in molecule.smi (1658/2946). Average speed: 0.21 s/mol.
Processing 50915716 in molecule.smi (1659/2946). Average speed: 0.21 s/mol.
Processing 50929576 in molecule.smi (1660/2946). Average speed: 0.21 s/mol.
...

Processing 50193779 in molecule.smi (1756/2946). Average speed: 0.21 s/mol.
Processing 50260807 in molecule.smi (1757/2946). Average speed: 0.21 s/mol.
Processing 50568235 in molecule.smi (1758/2946). Average speed: 0.21 s/mol.
Processing 50914924 in molecule.smi (1759/2946). Average speed: 0.21 s/mol.
Processing 50547923 in molecule.smi (1760/2946). Average speed: 0.21 s/mol.
Processing 50706850 in molecule.smi (1761/2946). Average speed: 0.21 s/mol.
Processing 51080243 in molecule.smi (1762/2946). Average speed: 0.21 s/mol.
Processing 50264258 in molecule.smi (1763/2946). Average speed: 0.21 s/mol.
Processing 51187574 in molecule.smi (1764/2946). Average speed: 0.21 s/mol.
Processing 51187571 in molecule.smi (1765/2946). Average speed: 0.21 s/mol.
Processing 50839252 in molecule.smi (1766/2946). Average speed: 0.21 s/mol.
Processing 50839243 in molecule.smi (1767/2946). Average speed: 0.21 s/mol.
Processing 50630097 in molecule.smi (1768/2946). Average speed: 0.21 s/mol.
...

Processing 50630081 in molecule.smi (1864/2946). Average speed: 0.21 s/mol.
Processing 50188562 in molecule.smi (1865/2946). Average speed: 0.20 s/mol.
Processing 51020226 in molecule.smi (1866/2946). Average speed: 0.20 s/mol.
Processing 50547938 in molecule.smi (1867/2946). Average speed: 0.20 s/mol.
Processing 50879816 in molecule.smi (1868/2946). Average speed: 0.20 s/mol.
Processing 51246340 in molecule.smi (1869/2946). Average speed: 0.20 s/mol.
Processing 50628086 in molecule.smi (1870/2946). Average speed: 0.20 s/mol.
Processing 50630102 in molecule.smi (1871/2946). Average speed: 0.20 s/mol.
Processing 50288545 in molecule.smi (1872/2946). Average speed: 0.20 s/mol.
Processing 51080236 in molecule.smi (1873/2946). Average speed: 0.20 s/mol.
Processing 51088028 in molecule.smi (1874/2946). Average speed: 0.20 s/mol.
Processing 50188585 in molecule.smi (1875/2946). Average speed: 0.20 s/mol.
Processing 50191284 in molecule.smi (1876/2946). Average speed: 0.20 s/mol.
...

Processing 50450934 in molecule.smi (1972/2946). Average speed: 0.20 s/mol.
Processing 50288511 in molecule.smi (1973/2946). Average speed: 0.20 s/mol.
Processing 50296762 in molecule.smi (1974/2946). Average speed: 0.20 s/mol.
Processing 50375901 in molecule.smi (1975/2946). Average speed: 0.20 s/mol.
Processing 50342442 in molecule.smi (1976/2946). Average speed: 0.20 s/mol.
Processing 51080231 in molecule.smi (1977/2946). Average speed: 0.20 s/mol.
Processing 50861013 in molecule.smi (1978/2946). Average speed: 0.20 s/mol.
Processing 51236535 in molecule.smi (1979/2946). Average speed: 0.20 s/mol.
Processing 50915728 in molecule.smi (1980/2946). Average speed: 0.20 s/mol.
Processing 51080230 in molecule.smi (1981/2946). Average speed: 0.20 s/mol.
Processing 50193774 in molecule.smi (1982/2946). Average speed: 0.20 s/mol.
Processing 51080229 in molecule.smi (1983/2946). Average speed: 0.20 s/mol.
Processing 50547918 in molecule.smi (1984/2946). Average speed: 0.20 s/mol.
...

Processing 50288524 in molecule.smi (2081/2946). Average speed: 0.20 s/mol.
Processing 50343955 in molecule.smi (2082/2946). Average speed: 0.20 s/mol.
Processing 50839229 in molecule.smi (2083/2946). Average speed: 0.20 s/mol.
Processing 50157393 in molecule.smi (2084/2946). Average speed: 0.20 s/mol.
Processing 50915714 in molecule.smi (2085/2946). Average speed: 0.20 s/mol.
Processing 50173634 in molecule.smi (2086/2946). Average speed: 0.20 s/mol.
Processing 50173637 in molecule.smi (2087/2946). Average speed: 0.20 s/mol.
Processing 50625059 in molecule.smi (2088/2946). Average speed: 0.20 s/mol.
Processing 51236501 in molecule.smi (2089/2946). Average speed: 0.20 s/mol.
Processing 51187526 in molecule.smi (2090/2946). Average speed: 0.20 s/mol.
Processing 51187545 in molecule.smi (2091/2946). Average speed: 0.20 s/mol.
Processing 50264259 in molecule.smi (2092/2946). Average speed: 0.20 s/mol.
Processing 50375887 in molecule.smi (2093/2946). Average speed: 0.20 s/mol.
...

Processing 50173569 in molecule.smi (2190/2946). Average speed: 0.20 s/mol.
Processing 50264231 in molecule.smi (2191/2946). Average speed: 0.20 s/mol.
Processing 50264236 in molecule.smi (2192/2946). Average speed: 0.20 s/mol.
Processing 50264296 in molecule.smi (2193/2946). Average speed: 0.20 s/mol.
Processing 50274630 in molecule.smi (2194/2946). Average speed: 0.20 s/mol.
Processing 50188591 in molecule.smi (2195/2946). Average speed: 0.20 s/mol.
Processing 50188623 in molecule.smi (2196/2946). Average speed: 0.20 s/mol.
Processing 50188628 in molecule.smi (2197/2946). Average speed: 0.20 s/mol.
Processing 50191265 in molecule.smi (2198/2946). Average speed: 0.20 s/mol.
Processing 50191270 in molecule.smi (2199/2946). Average speed: 0.20 s/mol.
Processing 50191279 in molecule.smi (2200/2946). Average speed: 0.20 s/mol.
Processing 50191292 in molecule.smi (2201/2946). Average speed: 0.20 s/mol.
Processing 50191296 in molecule.smi (2202/2946). Average speed: 0.20 s/mol.
...

Processing 50173563 in molecule.smi (2298/2946). Average speed: 0.20 s/mol.
Processing 50364233 in molecule.smi (2299/2946). Average speed: 0.20 s/mol.
Processing 51187529 in molecule.smi (2300/2946). Average speed: 0.20 s/mol.
Processing 50364381 in molecule.smi (2301/2946). Average speed: 0.20 s/mol.
Processing 51246339 in molecule.smi (2302/2946). Average speed: 0.20 s/mol.
Processing 50264222 in molecule.smi (2303/2946). Average speed: 0.20 s/mol.
Processing 51187501 in molecule.smi (2304/2946). Average speed: 0.20 s/mol.
Processing 51187494 in molecule.smi (2305/2946). Average speed: 0.20 s/mol.
Processing 51187519 in molecule.smi (2306/2946). Average speed: 0.20 s/mol.
Processing 50364256 in molecule.smi (2308/2946). Average speed: 0.20 s/mol.
Processing 51187491 in molecule.smi (2307/2946). Average speed: 0.20 s/mol.
Processing 50364259 in molecule.smi (2309/2946). Average speed: 0.20 s/mol.
Processing 50364260 in molecule.smi (2310/2946). Average speed: 0.20 s/mol.
...

Processing 51379822 in molecule.smi (2406/2946). Average speed: 0.20 s/mol.
Processing 50653311 in molecule.smi (2407/2946). Average speed: 0.20 s/mol.
Processing 51394205 in molecule.smi (2408/2946). Average speed: 0.20 s/mol.
Processing 51394204 in molecule.smi (2409/2946). Average speed: 0.20 s/mol.
Processing 51008193 in molecule.smi (2410/2946). Average speed: 0.20 s/mol.
Processing 51008209 in molecule.smi (2411/2946). Average speed: 0.20 s/mol.
Processing 51186858 in molecule.smi (2412/2946). Average speed: 0.20 s/mol.
Processing 50173640 in molecule.smi (2413/2946). Average speed: 0.20 s/mol.
Processing 50173584 in molecule.smi (2414/2946). Average speed: 0.20 s/mol.
Processing 50288553 in molecule.smi (2415/2946). Average speed: 0.20 s/mol.
Processing 50604999 in molecule.smi (2416/2946). Average speed: 0.20 s/mol.
Processing 51187551 in molecule.smi (2417/2946). Average speed: 0.20 s/mol.
Processing 50364252 in molecule.smi (2418/2946). Average speed: 0.20 s/mol.
...

Processing 50220385 in molecule.smi (2514/2946). Average speed: 0.20 s/mol.
Processing 50191274 in molecule.smi (2515/2946). Average speed: 0.20 s/mol.
Processing 50191281 in molecule.smi (2516/2946). Average speed: 0.20 s/mol.
Processing 50074016 in molecule.smi (2517/2946). Average speed: 0.20 s/mol.
Processing 50074019 in molecule.smi (2518/2946). Average speed: 0.20 s/mol.
Processing 50074021 in molecule.smi (2519/2946). Average speed: 0.20 s/mol.
Processing 50285597 in molecule.smi (2521/2946). Average speed: 0.20 s/mol.
Processing 50282732 in molecule.smi (2520/2946). Average speed: 0.20 s/mol.
Processing 51080211 in molecule.smi (2522/2946). Average speed: 0.20 s/mol.
Processing 51008196 in molecule.smi (2523/2946). Average speed: 0.20 s/mol.
Processing 50945676 in molecule.smi (2524/2946). Average speed: 0.20 s/mol.
Processing 50949663 in molecule.smi (2525/2946). Average speed: 0.20 s/mol.
Processing 50949671 in molecule.smi (2526/2946). Average speed: 0.20 s/mol.
...

Processing 50391723 in molecule.smi (2622/2946). Average speed: 0.20 s/mol.
Processing 50094518 in molecule.smi (2623/2946). Average speed: 0.20 s/mol.
Processing 50653369 in molecule.smi (2624/2946). Average speed: 0.20 s/mol.
Processing 51080536 in molecule.smi (2625/2946). Average speed: 0.20 s/mol.
Processing 50094522 in molecule.smi (2626/2946). Average speed: 0.20 s/mol.
Processing 51080540 in molecule.smi (2627/2946). Average speed: 0.20 s/mol.
Processing 50364261 in molecule.smi (2628/2946). Average speed: 0.20 s/mol.
Processing 50375876 in molecule.smi (2629/2946). Average speed: 0.20 s/mol.
Processing 50375904 in molecule.smi (2630/2946). Average speed: 0.20 s/mol.
Processing 50364226 in molecule.smi (2631/2946). Average speed: 0.20 s/mol.
Processing 50364228 in molecule.smi (2632/2946). Average speed: 0.20 s/mol.
Processing 50364229 in molecule.smi (2633/2946). Average speed: 0.20 s/mol.
Processing 50364230 in molecule.smi (2634/2946). Average speed: 0.20 s/mol.
...

Processing 51080541 in molecule.smi (2730/2946). Average speed: 0.20 s/mol.
Processing 50653315 in molecule.smi (2731/2946). Average speed: 0.20 s/mol.
Processing 51080526 in molecule.smi (2732/2946). Average speed: 0.20 s/mol.
Processing 50653327 in molecule.smi (2733/2946). Average speed: 0.20 s/mol.
Processing 50945671 in molecule.smi (2734/2946). Average speed: 0.20 s/mol.
Processing 50945672 in molecule.smi (2735/2946). Average speed: 0.20 s/mol.
Processing 50949660 in molecule.smi (2736/2946). Average speed: 0.20 s/mol.
Processing 50391729 in molecule.smi (2737/2946). Average speed: 0.20 s/mol.
Processing 50653367 in molecule.smi (2738/2946). Average speed: 0.20 s/mol.
Processing 50653310 in molecule.smi (2739/2946). Average speed: 0.20 s/mol.
Processing 51236539 in molecule.smi (2740/2946). Average speed: 0.20 s/mol.
Processing 51080538 in molecule.smi (2741/2946). Average speed: 0.20 s/mol.
Processing 50653331 in molecule.smi (2742/2946). Average speed: 0.20 s/mol.
Processing 5

Processing 50417187 in molecule.smi (2839/2946). Average speed: 0.20 s/mol.
Processing 50417236 in molecule.smi (2840/2946). Average speed: 0.20 s/mol.
Processing 50272396 in molecule.smi (2841/2946). Average speed: 0.20 s/mol.
Processing 50272391 in molecule.smi (2842/2946). Average speed: 0.20 s/mol.
Processing 51002478 in molecule.smi (2843/2946). Average speed: 0.20 s/mol.
Processing 51031928 in molecule.smi (2844/2946). Average speed: 0.20 s/mol.
Processing 50719122 in molecule.smi (2845/2946). Average speed: 0.20 s/mol.
Processing 50986649 in molecule.smi (2846/2946). Average speed: 0.20 s/mol.
Processing 50417184 in molecule.smi (2847/2946). Average speed: 0.20 s/mol.
Processing 50215954 in molecule.smi (2848/2946). Average speed: 0.20 s/mol.
Processing 50919703 in molecule.smi (2849/2946). Average speed: 0.20 s/mol.
Processing 50220368 in molecule.smi (2850/2946). Average speed: 0.20 s/mol.
Processing 50220376 in molecule.smi (2851/2946). Average speed: 0.20 s/mol.
Processing 5

Descriptor calculation completed in 9 mins 50.381 secs . Average speed: 0.20 s/mol.


#### Descriptor calculation completed in 9 mins 50.381 secs. Average speed: 0.20 s/mol.

## Let's check the descriptor output file `descriptors_output.csv`

In [25]:
! ls -l

total 45136
-rw-r--r--   1 akrikhalid  staff   437936 Jun  5 22:57 CCR5_bioa_data_preprocessed.csv
-rw-r--r--@  1 akrikhalid  staff   239166 Jun  5 22:54 CCR5_inhibitors 3316_Part 01.ipynb
-rw-r--r--@  1 akrikhalid  staff   256981 Jun  5 22:59 CCR5_inhibitors 3316_Part 02.ipynb
-rw-r--r--@  1 akrikhalid  staff   328932 Jun  5 23:44 CCR5_inhibitors 3316_Part 03.ipynb
-rw-r--r--@  1 akrikhalid  staff    76160 May 29 11:06 CCR5_inhibitors 3316_Part 04.ipynb
-rw-r--r--@  1 akrikhalid  staff   563191 May 29 11:20 CCR5_inhibitors 3316_Part_05.ipynb
-rw-r--r--   1 akrikhalid  staff  6052996 Jun  5 22:49 Output3_C-C chemokine receptor type 5_Inhibitors_3316.csv
drwxr-xr-x  21 akrikhalid  staff      672 May 24 13:06 PaDel-Descriptor
-rw-r--r--@  1 akrikhalid  staff      231 May 24 10:18 PaDel-Descriptor.sh
-rw-r--r--@  1 akrikhalid  staff   211058 May 29 10:05 PaDel-Descriptor.zip
-rw-r--r--   1 akrikhalid  staff  5234382 Jun  5 23:33 descriptors_output.csv
-rw-r--r--   1

## **Preparing the X and Y Data Matrices**

### **X data matrix**

In [26]:
df2_X = pd.read_csv('descriptors_output.csv')

In [27]:
df2_X

Unnamed: 0,Name,PubchemFP0,PubchemFP1,PubchemFP2,PubchemFP3,PubchemFP4,PubchemFP5,PubchemFP6,PubchemFP7,PubchemFP8,...,PubchemFP871,PubchemFP872,PubchemFP873,PubchemFP874,PubchemFP875,PubchemFP876,PubchemFP877,PubchemFP878,PubchemFP879,PubchemFP880
0,50853863,1,1,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,50448211,1,1,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,50417247,1,1,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,50961778,1,1,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,50260844,1,1,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2941,50260929,1,1,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2942,50264307,1,1,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2943,50260998,1,1,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2944,50260935,1,1,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [28]:
df2_X = df2_X.drop(columns = ["Name"])

In [29]:
df2_X

Unnamed: 0,PubchemFP0,PubchemFP1,PubchemFP2,PubchemFP3,PubchemFP4,PubchemFP5,PubchemFP6,PubchemFP7,PubchemFP8,PubchemFP9,...,PubchemFP871,PubchemFP872,PubchemFP873,PubchemFP874,PubchemFP875,PubchemFP876,PubchemFP877,PubchemFP878,PubchemFP879,PubchemFP880
0,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
1,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
3,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
4,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2941,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2942,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2943,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2944,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
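The `Name` column holds compound identifiers rather than features, which is why it is removed from the X matrix. The following is a minimal sketch on toy data (hypothetical identifiers and only three fingerprint bits instead of 881) of the same drop, with two optional extras that are not part of the original notebook: keeping the identifiers aside for later use, and checking that every remaining column is a 0/1 fingerprint bit.

```python
import pandas as pd

# Toy stand-in for descriptors_output.csv (hypothetical values, 3 bits only)
toy = pd.DataFrame({
    "Name": [50853863, 50448211, 50417247],
    "PubchemFP0": [1, 1, 0],
    "PubchemFP1": [0, 1, 1],
    "PubchemFP2": [0, 0, 0],
})

# Keep the identifiers aside before removing them from the feature matrix
ids = toy["Name"]
X = toy.drop(columns=["Name"])

# Every remaining column should contain only 0/1 fingerprint bits
is_binary = bool(X.isin([0, 1]).all().all())
print(X.shape, is_binary)
```

If `is_binary` came back `False` on the real file, that would suggest a non-fingerprint column (or a failed calculation) slipped into the matrix.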


### **Y variable**

#### **Convert IC50 to pIC50**

In [30]:
df2_Y = df['pIC50']
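The cell above reuses a `pIC50` column that already exists in `df`. For reference, the conversion itself is $\mathrm{pIC50} = -\log_{10}(\mathrm{IC50}\ \text{in mol/L})$. The sketch below assumes the IC50 values are expressed in nM; the helper name `ic50_nm_to_pic50` is hypothetical and not taken from the notebook.

```python
import numpy as np
import pandas as pd

def ic50_nm_to_pic50(ic50_nm):
    """Convert IC50 values given in nM to pIC50 = -log10(IC50 in mol/L)."""
    ic50_molar = pd.Series(ic50_nm, dtype="float64") * 1e-9  # nM -> mol/L
    return -np.log10(ic50_molar)

# Assumed example inputs: 0.5 nM, 30 nM and 1000 nM (= 1 µM)
example = ic50_nm_to_pic50([0.5, 30.0, 1000.0])
print(example.round(6).tolist())
```

Under this assumption, an IC50 of 0.5 nM maps to a pIC50 of about 9.30 and 1 µM maps to 6.0, so larger pIC50 values mean more potent compounds.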

In [31]:
df2_Y.info()

<class 'pandas.core.series.Series'>
RangeIndex: 2946 entries, 0 to 2945
Series name: pIC50
Non-Null Count  Dtype  
--------------  -----  
2946 non-null   float64
dtypes: float64(1)
memory usage: 23.1 KB


In [32]:
df2_Y.describe()

count    2946.000000
mean        7.306422
std         1.473506
min         2.000000
25%         6.346787
50%         7.522879
75%         8.431798
max        11.522879
Name: pIC50, dtype: float64

## **Combining X and Y variables**

In [33]:
dataset2 = pd.concat([df2_X,df2_Y], axis=1)

In [34]:
dataset2

Unnamed: 0,PubchemFP0,PubchemFP1,PubchemFP2,PubchemFP3,PubchemFP4,PubchemFP5,PubchemFP6,PubchemFP7,PubchemFP8,PubchemFP9,...,PubchemFP872,PubchemFP873,PubchemFP874,PubchemFP875,PubchemFP876,PubchemFP877,PubchemFP878,PubchemFP879,PubchemFP880,pIC50
0,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,9.30103
1,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,9.30103
2,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,9.30103
3,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,9.30103
4,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,9.30103
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2941,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,9.30103
2942,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,9.30103
2943,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,9.30103
2944,1,1,1,1,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,9.30103
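`pd.concat(..., axis=1)` aligns rows by index, so the combination above relies on `descriptors_output.csv` listing the compounds in the same order as `df`. If that order is ever in doubt, one alternative is to join on the compound identifier before dropping it. The sketch below uses toy data, and the column name `molecule_id` is an assumption rather than a name from the notebook.

```python
import pandas as pd

# Hypothetical toy data: fingerprints come back in a different order
# than the activity table
fingerprints = pd.DataFrame({
    "Name": [102, 101, 103],
    "PubchemFP0": [1, 0, 1],
    "PubchemFP1": [0, 0, 1],
})
activities = pd.DataFrame({
    "molecule_id": [101, 102, 103],  # hypothetical ID column name
    "pIC50": [6.0, 7.5, 9.3],
})

# Join on the identifier; validate="one_to_one" raises an error
# if an ID is duplicated on either side
merged = fingerprints.merge(
    activities,
    left_on="Name",
    right_on="molecule_id",
    how="inner",
    validate="one_to_one",
)

# Drop the identifier columns once the rows are matched
dataset = merged.drop(columns=["Name", "molecule_id"])
print(dataset)
```

With this approach each fingerprint row is paired with the pIC50 of the same compound regardless of row order, at the cost of requiring unique identifiers.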


In [35]:
dataset2['pIC50'].describe()

count    2946.000000
mean        7.306422
std         1.473506
min         2.000000
25%         6.346787
50%         7.522879
75%         8.431798
max        11.522879
Name: pIC50, dtype: float64

In [37]:
dataset2.to_csv('CCR5_bioa_data_preprocessed_pIC50_pubchem_fp.csv', index=False)
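Passing `index=False` keeps the row index out of the file, so reloading it does not produce an extra `Unnamed: 0` column. A small round-trip sketch on toy data (hypothetical values, written to a temporary directory rather than the project folder) illustrates this:

```python
import os
import tempfile

import pandas as pd

# Toy dataset with two fingerprint bits and a pIC50 column (hypothetical values)
toy = pd.DataFrame({
    "PubchemFP0": [1, 0],
    "PubchemFP1": [0, 1],
    "pIC50": [9.30103, 7.522879],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_dataset.csv")
    toy.to_csv(path, index=False)   # do not write the row index
    reloaded = pd.read_csv(path)

# The reloaded frame has exactly the original columns, with no index column added
print(list(reloaded.columns))
pd.testing.assert_frame_equal(reloaded, toy)
```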

In [38]:
! ls -l

total 113136
-rw-r--r--@  1 akrikhalid  staff  22654982 May 29 20:39 C-C chemokine receptor type 5_Inhibitors_3316.sdf
-rw-r--r--@  1 akrikhalid  staff   5861006 May 29 20:38 C-C chemokine receptor type 5_Inhibitors_3316.tsv
-rw-r--r--   1 akrikhalid  staff    437936 Jun  5 22:57 CCR5_bioa_data_preprocessed.csv
-rw-r--r--   1 akrikhalid  staff   5252224 Jun  5 23:46 CCR5_bioa_data_preprocessed_pIC50_pubchem_fp.csv
-rw-r--r--@  1 akrikhalid  staff    239166 Jun  5 22:54 CCR5_inhibitors 3316_Part 01.ipynb
-rw-r--r--@  1 akrikhalid  staff    256981 Jun  5 22:59 CCR5_inhibitors 3316_Part 02.ipynb
-rw-r--r--@  1 akrikhalid  staff    328485 Jun  5 23:48 CCR5_inhibitors 3316_Part 03.ipynb
-rw-r--r--@  1 akrikhalid  staff     76160 May 29 11:06 CCR5_inhibitors 3316_Part 04.ipynb
-rw-r--r--@  1 akrikhalid  staff    563191 May 29 11:20 CCR5_inhibitors 3316_Part_05.ipynb
-rw-r--r--   1 akrikhalid  staff   6052996 Jun  5 22:49 Output3_C-C chemokine receptor type 5_Inhibitors_3316.csv
dr