## Investigation of Duplicate InChI-Keys in the 'ML-ready' DFs

### Problem Statement
In the notebook 'pubchem-sparse-matrix' on the Jovian server, it became evident that the Metadata_broad_sample identifier of the cp_annotations.csv file is not suitable as a unique identifier: the DataFrame (DF) that results from joining on Metadata_broad_sample contains duplicate InChI-Keys. The notebook 'Troubleshooting' in this directory showed that the identifier columns CPD_SMILES and inchikey contain duplicate values.
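
As a minimal sketch of what "not suitable as a unique identifier" means here, the snippet below counts how many distinct Metadata_broad_sample values share one InChI-Key. The toy rows and the names `toy`, `mbs_per_key` and `ambiguous_keys` are made up for illustration and are not taken from cp_annotations.csv.

```python
import pandas as pd

# Hypothetical toy annotations: the InChI-Key "AAA" is listed under two MBS.
toy = pd.DataFrame({
    "inchikey": ["AAA", "AAA", "BBB"],
    "Metadata_broad_sample": ["BRD-1", "BRD-2", "BRD-3"],
})

# Number of distinct MBS per InChI-Key; a value above 1 means the MBS
# does not identify the compound uniquely.
mbs_per_key = toy.groupby("inchikey")["Metadata_broad_sample"].nunique()
ambiguous_keys = mbs_per_key[mbs_per_key > 1]
print(ambiguous_keys)
```

On the real annotations the same two lines could be applied to the DF loaded from cp_annotations.csv, assuming the column names used in this notebook.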

### Aim of this Notebook
In this notebook, the cause of the duplicate InChI-Keys is investigated. For this purpose, the columns 'inchikey', 'CPD_SMILES' and 'Metadata_broad_sample' (MBS) are taken from the file 'cp_annotations.csv', and all rows whose InChI-Key is not duplicated are dropped. The DF is then sorted by InChI-Key to reveal the competing MBS identifiers. The resulting DF is merged with the file 'Raw_Image_data.txt' on the column 'Metadata_broad_sample'. All CellProfiler columns and invariant metadata columns are dropped from the resulting DF. The DF is then inspected visually to find patterns that relate InChI-Keys and MBS to other information available in the DF. Finally, hypotheses are formulated and tested against the data. Here the hypothesis is that the MBS of a given InChI-Key correspond to the 'Metadata_Plate_Map_Name' column (MPMN) and the 'Metadata_mmoles_per_liter' column (CONC), or to just the MPMN, or to just the CONC. This was found to be true.

### Summary of the Steps Undertaken
1. Get Relevant Identifiers from Annotations File
2. Get Relevant Rows in sorted order
3. Merge with Raw Data
4. Inspect Metadata Columns to find Patterns
5. Find precise relation of MBS to other Metadata
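
Steps 1 to 3 can be sketched on toy data as follows. This is only an illustration under assumptions: the two small DataFrames `annotations` and `raw` are invented stand-ins for cp_annotations.csv and the raw image data, and a left merge is used here as a shortcut, whereas the notebook cells use an outer merge followed by `dropna`.

```python
import pandas as pd

# Hypothetical stand-in for cp_annotations.csv
annotations = pd.DataFrame({
    "inchikey": ["AAA", "AAA", "BBB", None],
    "CPD_SMILES": ["C", "C", "CC", "CCC"],
    "Metadata_broad_sample": ["BRD-1", "BRD-2", "BRD-3", "BRD-4"],
})
# Hypothetical stand-in for the metadata columns of the raw image data
raw = pd.DataFrame({
    "Metadata_broad_sample": ["BRD-1", "BRD-1", "BRD-2", "BRD-3"],
    "Metadata_Plate_Map_Name": ["P1", "P1", "P2", "P3"],
})

# Steps 1 and 2: drop missing keys, keep only duplicated InChI-Keys, sort them
dupes = annotations.dropna(subset=["inchikey"])
dupes = dupes[dupes["inchikey"].duplicated(keep=False)].sort_values("inchikey")

# Step 3: attach the raw metadata to every remaining MBS
merged = dupes.merge(raw, how="left", on="Metadata_broad_sample")
print(merged)
```

Only the rows of "AAA" survive, once per matching raw-data row, which is the shape that steps 4 and 5 then inspect.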

### Conclusion
Different MBS that are related to the same InChI-Key correspond one-to-one to both the MPMN and the CONC, or to just the MPMN, or to just the CONC. This was shown in step 5. In the corresponding Jupyter cell (see below), three conditional statements compare the number of unique MBS of one InChI-Key with the number of unique MPMN and unique CONC, with the number of unique MPMN alone, and with the number of unique CONC alone. If these numbers are equal, subsub-DFs are built from the unique values of the respective columns and their MBS lists are compared. Identical MBS lists in each of the subsub-DFs mean that the MBS correspond to the respective column (MPMN, CONC or both), which was the case for all 400 unique MBS that had duplicate InChI-Keys.
This means that the preprocessing of 'Raw_Image_data.txt' has to be revisited: more rows can be combined, for example those that share the same concentration range.
The printed sub-DFs also show that the CONC values of the different MBS of one compound are fairly close together, so they might be combined into one averaged value.
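
The correspondence described above can also be checked with a `groupby`, as a rough alternative to the nested loops of step 5. The sketch below is not the code used in this notebook: the toy rows and the helper `is_one_to_one` are hypothetical, and only the column names are taken from the data.

```python
import pandas as pd

# Made-up rows: "AAA" has one plate map and one concentration per MBS,
# "BBB" has two MBS on the same plate map but with different concentrations.
toy = pd.DataFrame({
    "inchikey": ["AAA"] * 4 + ["BBB"] * 4,
    "Metadata_broad_sample": ["BRD-1", "BRD-1", "BRD-2", "BRD-2",
                              "BRD-3", "BRD-3", "BRD-4", "BRD-4"],
    "Metadata_Plate_Map_Name": ["P1", "P1", "P2", "P2",
                                "P3", "P3", "P3", "P3"],
    "Metadata_mmoles_per_liter": [5.0, 5.0, 4.9, 4.9,
                                  5.0, 5.0, 5.2, 5.2],
})

def is_one_to_one(group, left, right):
    """True if each `left` value pairs with exactly one `right` value and vice versa."""
    pairs = group[[left, right]].drop_duplicates()
    return pairs[left].is_unique and pairs[right].is_unique

rows = []
for key, group in toy.groupby("inchikey"):
    rows.append({
        "inchikey": key,
        "mbs_vs_mpmn": is_one_to_one(group, "Metadata_broad_sample", "Metadata_Plate_Map_Name"),
        "mbs_vs_conc": is_one_to_one(group, "Metadata_broad_sample", "Metadata_mmoles_per_liter"),
    })
summary = pd.DataFrame(rows).set_index("inchikey")
print(summary)
```

In this toy example "AAA" would fall into the "MPMN & CONC" case and "BBB" into the "CONC only" case. Unlike the loop in step 5, this variant does not depend on the order in which the unique values first appear.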

In [71]:
# Prerequisite

import pandas as pd
df = pd.read_csv("../../input/cp_annotations.csv")

In [72]:
# 1. Get Relevant Identifiers from Annotations File
df = df.loc[:,['inchikey','CPD_SMILES','Metadata_broad_sample']]
df['duplicate'] = df['inchikey'].duplicated(keep=False)
df['duplicateSMILES'] = df['CPD_SMILES'].duplicated(keep=False)

In [73]:
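# 2. Get Relevant Rows in sorted order
# keep only rows whose InChI-Key occurs more than once
df = df.query('duplicate == True')
# missing InChI-Keys are also flagged as duplicates of each other, so drop them
df = df.dropna(subset=['inchikey'])
df = df.sort_values(by=['inchikey'])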

In [74]:
# Non-essential control step: does every remaining SMILES / InChI-Key occur exactly twice?
smiles_twice = df.CPD_SMILES.value_counts().eq(2).all()
inchikey_twice = df.inchikey.value_counts().eq(2).all()
unique_smiles = df.CPD_SMILES.nunique()
unique_inchikey = df.inchikey.nunique()

print('SMILES only twice? {}!'.format(smiles_twice))
print('InChIkeys only twice? {}!'.format(inchikey_twice))
print('#uniqueSMILES: {}'.format(unique_smiles))
print('#uniqueInChIKey: {}'.format(unique_inchikey))

SMILES only twice? False!
InChIkeys only twice? True!
#uniqueSMILES: 202
#uniqueInChIKey: 200


In [75]:
# 3. Merge with Raw Data
meta_data_cols = ['Metadata_broad_sample', 'Metadata_Plate','Metadata_Plate_Map_Name',
                  'Metadata_mmoles_per_liter', 'Metadata_pert_id','Metadata_pert_mfc_id', 'Metadata_pert_well']

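# read only the needed metadata columns of the raw data
raw_meta = pd.read_csv("../../input/Raw_Image_data.txt", usecols=meta_data_cols, sep='\t')
# the outer merge also keeps raw rows without an annotation entry;
# the dropna below removes them again, since their 'duplicate' flag is missing
df = pd.merge(left=df, right=raw_meta, how='outer', on='Metadata_broad_sample')
df = df.dropna(subset=['duplicate'])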

In [76]:
# 4. Inspect Metadata Columns to find Patterns

pd.set_option('max_colwidth', 25)

with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    print(df[['inchikey','Metadata_broad_sample', 'Metadata_Plate_Map_Name']])

                      inchikey   Metadata_broad_sample Metadata_Plate_Map_Name
0     APDMZOWNVCUUKT-LQWHRV...  BRD-K40294449-001-02-1            H-CBLD-002-4
1     APDMZOWNVCUUKT-LQWHRV...  BRD-K40294449-001-02-1            H-CBLD-002-4
2     APDMZOWNVCUUKT-LQWHRV...  BRD-K40294449-001-02-1            H-CBLD-002-4
3     APDMZOWNVCUUKT-LQWHRV...  BRD-K40294449-001-02-1            H-CBLD-002-4
4     APDMZOWNVCUUKT-LQWHRV...  BRD-K40294449-001-01-3            H-CBLD-004-4
5     APDMZOWNVCUUKT-LQWHRV...  BRD-K40294449-001-01-3            H-CBLD-004-4
6     APDMZOWNVCUUKT-LQWHRV...  BRD-K40294449-001-01-3            H-CBLD-004-4
7     APDMZOWNVCUUKT-LQWHRV...  BRD-K40294449-001-01-3            H-CBLD-004-4
8     APDMZOWNVCUUKT-OPYAII...  BRD-K76703685-001-01-3            H-CBLD-004-4
9     APDMZOWNVCUUKT-OPYAII...  BRD-K76703685-001-01-3            H-CBLD-004-4
10    APDMZOWNVCUUKT-OPYAII...  BRD-K76703685-001-01-3            H-CBLD-004-4
11    APDMZOWNVCUUKT-OPYAII...  BRD-K76703685-001-01

In [78]:
# 5. Find precise relation of MBS to other Metadata

# make a list of the unique InChI-Keys
unique_inchikeys = df.inchikey.drop_duplicates().to_list()

# variables for counting the MBS that match MPMN and CONC respectively
MBS_matching_MPMN = 0
MBS_matching_CONC = 0
MBS_matching_MPMN_CONC = 0

# iterate through all unique InChI-Keys
for inchi in unique_inchikeys:
    # slice the df into a sub-df corresponding to the current unique inchikey
    sub_df_inchikey = df.query('inchikey == @inchi')
    
    # make lists of unique MBS MPMN and CONC present in the sub_df
    unique_MBS = sub_df_inchikey.Metadata_broad_sample.drop_duplicates().to_list()
    unique_MPMN = sub_df_inchikey.Metadata_Plate_Map_Name.drop_duplicates().to_list()
    unique_CONC = sub_df_inchikey.Metadata_mmoles_per_liter.drop_duplicates().to_list()
    
    # does the number of unique MBS match both the number of unique MPMN and the number of unique CONC entries for this sub_df?
    if (len(unique_MBS) == len(unique_MPMN)) and (len(unique_MBS) == len(unique_CONC)):
        # cycle through the number of unique MBS entries (== number of unique MPMN & CONC entries)
        for i in range(len(unique_MBS)):
            
            # create subsub_dfs corresponding to the MBS, the MPMN and the CONC
            subsub_df_MBS = sub_df_inchikey.query('Metadata_broad_sample == @unique_MBS[@i]')
            subsub_df_MPMN = sub_df_inchikey.query('Metadata_Plate_Map_Name == @unique_MPMN[@i]')
            subsub_df_CONC = sub_df_inchikey.query('Metadata_mmoles_per_liter == @unique_CONC[@i]')
            
            # are the subsub_dfs identical to each other ?
            if (subsub_df_MPMN.Metadata_broad_sample.to_list() == subsub_df_MBS.Metadata_broad_sample.to_list() == subsub_df_CONC.Metadata_broad_sample.to_list()):
                # if so, the given MBS matches both the MPMN and the CONC
                MBS_matching_MPMN_CONC += 1
                ####### print statement just to see how close different concentrations are
                print(sub_df_inchikey)
            # are the subsub_dfs different from each other ?
            else:
                # that means MBS does not correlate with MPMN and CONC even though their numbers match
                print("#MBS = #MPMN = #CONC but the subsubdfs are not identical!\n", sub_df_inchikey)
    
    # does the number of unique MBS match the number of unique MPMN entries for this sub_df?
    elif len(unique_MBS) == len(unique_MPMN):
        # cycle through the number of unique MBS entries (== number of unique MPMN entries)
        for i in range(len(unique_MBS)):
            
            # create subsub_dfs corresponding to either the MBS or MPMN
            subsub_df_MBS = sub_df_inchikey.query('Metadata_broad_sample == @unique_MBS[@i]')
            subsub_df_MPMN = sub_df_inchikey.query('Metadata_Plate_Map_Name == @unique_MPMN[@i]')
            
            # are the subsub_dfs identical to each other ?
            if (subsub_df_MPMN.Metadata_broad_sample.to_list() == subsub_df_MBS.Metadata_broad_sample.to_list()):
                # if so the given MBS matches with the MPMN
                MBS_matching_MPMN += 1
            # are the subsub_dfs different from each other ?
            else:
                # that means MBS does not correlate with MPMN even though their numbers match
                print("#MBS = #MPMN but the subsubdfs are not identical!\n", sub_df_inchikey)
    
    # does the number of unique MBS match the number of unique CONC entries for this sub_df?
    elif len(unique_MBS) == len(unique_CONC):
        # cycle through the number of unique MBS entries (== number of unique CONC entries)
        for i in range(len(unique_MBS)):
            
            # create subsub_dfs corresponding to either the MBS or CONC
            subsub_df_MBS = sub_df_inchikey.query('Metadata_broad_sample == @unique_MBS[@i]')
            subsub_df_CONC = sub_df_inchikey.query('Metadata_mmoles_per_liter == @unique_CONC[@i]')
            
            # are the subsub_dfs identical to each other ?
            if (subsub_df_MBS.Metadata_broad_sample.to_list() == subsub_df_CONC.Metadata_broad_sample.to_list()):
                #if so the given MBS matches with the CONC
                MBS_matching_CONC += 1
            # are the subsub_dfs different from each other ?
            else:
                # that means MBS does not correlate with CONC even though their numbers match
                print("#MBS = #CONC but the subsubdfs are not identical!\n", sub_df_inchikey)
    else:
        print("# of unique MBS does not match # of either MPMN or CONC\n", inchi)

print('MBS matching MPMN & CONC: {}'.format(MBS_matching_MPMN_CONC))
print('MBS matching MPMN only:   {}'.format(MBS_matching_MPMN))
print('MBS matching CONC only:   {}'.format(MBS_matching_CONC))
print('MBS matching either MPMN & CONC or (CONC or MPMN): {}'.format(MBS_matching_CONC+MBS_matching_MPMN+MBS_matching_MPMN_CONC))
print('total unique MBS:  {}'.format(df.Metadata_broad_sample.drop_duplicates().shape[0]))

                   inchikey                CPD_SMILES   Metadata_broad_sample  \
0  APDMZOWNVCUUKT-LQWHRV...  C[C@@H](CO)N1C[C@@H](...  BRD-K40294449-001-02-1   
1  APDMZOWNVCUUKT-LQWHRV...  C[C@@H](CO)N1C[C@@H](...  BRD-K40294449-001-02-1   
2  APDMZOWNVCUUKT-LQWHRV...  C[C@@H](CO)N1C[C@@H](...  BRD-K40294449-001-02-1   
3  APDMZOWNVCUUKT-LQWHRV...  C[C@@H](CO)N1C[C@@H](...  BRD-K40294449-001-02-1   
4  APDMZOWNVCUUKT-LQWHRV...  C[C@@H](CO)N1C[C@@H](...  BRD-K40294449-001-01-3   
5  APDMZOWNVCUUKT-LQWHRV...  C[C@@H](CO)N1C[C@@H](...  BRD-K40294449-001-01-3   
6  APDMZOWNVCUUKT-LQWHRV...  C[C@@H](CO)N1C[C@@H](...  BRD-K40294449-001-01-3   
7  APDMZOWNVCUUKT-LQWHRV...  C[C@@H](CO)N1C[C@@H](...  BRD-K40294449-001-01-3   

  duplicate duplicateSMILES  Metadata_Plate Metadata_Plate_Map_Name  \
0      True            True           24734            H-CBLD-002-4   
1      True            True           24735            H-CBLD-002-4   
2      True            True           24736            H-

                    inchikey                CPD_SMILES  \
64  BSKZHQCAOYVBFU-RELGLD...  CN1[C@@H]2CC[C@@H](CC...   
65  BSKZHQCAOYVBFU-RELGLD...  CN1[C@@H]2CC[C@@H](CC...   
66  BSKZHQCAOYVBFU-RELGLD...  CN1[C@@H]2CC[C@@H](CC...   
67  BSKZHQCAOYVBFU-RELGLD...  CN1[C@@H]2CC[C@@H](CC...   
68  BSKZHQCAOYVBFU-RELGLD...  CN1[C@@H]2CC[C@@H](CC...   
69  BSKZHQCAOYVBFU-RELGLD...  CN1[C@@H]2CC[C@@H](CC...   
70  BSKZHQCAOYVBFU-RELGLD...  CN1[C@@H]2CC[C@@H](CC...   
71  BSKZHQCAOYVBFU-RELGLD...  CN1[C@@H]2CC[C@@H](CC...   

     Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
64  BRD-K06039860-001-01-4      True            True           26640   
65  BRD-K06039860-001-01-4      True            True           26641   
66  BRD-K06039860-001-01-4      True            True           26642   
67  BRD-K06039860-001-01-4      True            True           26643   
68  BRD-K06039860-001-02-2      True            True           26767   
69  BRD-K06039860-001-02-2      True         

                     inchikey                CPD_SMILES  \
111  BWTMYHAPSQQINJ-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
112  BWTMYHAPSQQINJ-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
113  BWTMYHAPSQQINJ-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
114  BWTMYHAPSQQINJ-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
115  BWTMYHAPSQQINJ-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
116  BWTMYHAPSQQINJ-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
117  BWTMYHAPSQQINJ-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
118  BWTMYHAPSQQINJ-IECBHU...  C[C@H](CO)N1C[C@@H](C...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
111  BRD-K47380281-001-02-2      True            True           24726   
112  BRD-K47380281-001-02-2      True            True           24731   
113  BRD-K47380281-001-02-2      True            True           24732   
114  BRD-K47380281-001-02-2      True            True           24733   
115  BRD-K47380281-001-01-4      True            True           26679   
116  BRD-K47380281-001-01-4   

                     inchikey                CPD_SMILES  \
143  CMMUKIKJYLMGMW-YQQAZP...  C[C@@H](CO)N1C[C@H](C...   
144  CMMUKIKJYLMGMW-YQQAZP...  C[C@@H](CO)N1C[C@H](C...   
145  CMMUKIKJYLMGMW-YQQAZP...  C[C@@H](CO)N1C[C@H](C...   
146  CMMUKIKJYLMGMW-YQQAZP...  C[C@@H](CO)N1C[C@H](C...   
147  CMMUKIKJYLMGMW-YQQAZP...  C[C@@H](CO)N1C[C@H](C...   
148  CMMUKIKJYLMGMW-YQQAZP...  C[C@@H](CO)N1C[C@H](C...   
149  CMMUKIKJYLMGMW-YQQAZP...  C[C@@H](CO)N1C[C@H](C...   
150  CMMUKIKJYLMGMW-YQQAZP...  C[C@@H](CO)N1C[C@H](C...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
143  BRD-K57882732-001-01-1      True            True           26679   
144  BRD-K57882732-001-01-1      True            True           26680   
145  BRD-K57882732-001-01-1      True            True           26794   
146  BRD-K57882732-001-01-1      True            True           26795   
147  BRD-K57882732-001-02-9      True            True           24734   
148  BRD-K57882732-001-02-9   

                     inchikey                CPD_SMILES  \
187  DEGZSTUNWMGQCY-GSHUGG...  C[C@H](CO)N1C[C@H](C)...   
188  DEGZSTUNWMGQCY-GSHUGG...  C[C@H](CO)N1C[C@H](C)...   
189  DEGZSTUNWMGQCY-GSHUGG...  C[C@H](CO)N1C[C@H](C)...   
190  DEGZSTUNWMGQCY-GSHUGG...  C[C@H](CO)N1C[C@H](C)...   
191  DEGZSTUNWMGQCY-GSHUGG...  C[C@H](CO)N1C[C@H](C)...   
192  DEGZSTUNWMGQCY-GSHUGG...  C[C@H](CO)N1C[C@H](C)...   
193  DEGZSTUNWMGQCY-GSHUGG...  C[C@H](CO)N1C[C@H](C)...   
194  DEGZSTUNWMGQCY-GSHUGG...  C[C@H](CO)N1C[C@H](C)...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
187  BRD-K19471718-001-02-6      True            True           24726   
188  BRD-K19471718-001-02-6      True            True           24731   
189  BRD-K19471718-001-02-6      True            True           24732   
190  BRD-K19471718-001-02-6      True            True           24733   
191  BRD-K19471718-001-01-8      True            True           26679   
192  BRD-K19471718-001-01-8   

                     inchikey                CPD_SMILES  \
226  DTQFALQSFDATTM-LZJOCL...  C[C@H](CO)N1C[C@@H](C...   
227  DTQFALQSFDATTM-LZJOCL...  C[C@H](CO)N1C[C@@H](C...   
228  DTQFALQSFDATTM-LZJOCL...  C[C@H](CO)N1C[C@@H](C...   
229  DTQFALQSFDATTM-LZJOCL...  C[C@H](CO)N1C[C@@H](C...   
230  DTQFALQSFDATTM-LZJOCL...  C[C@H](CO)N1C[C@@H](C...   
231  DTQFALQSFDATTM-LZJOCL...  C[C@H](CO)N1C[C@@H](C...   
232  DTQFALQSFDATTM-LZJOCL...  C[C@H](CO)N1C[C@@H](C...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
226  BRD-K83527041-001-02-4      True            True           24740   
227  BRD-K83527041-001-02-4      True            True           24750   
228  BRD-K83527041-001-02-4      True            True           24751   
229  BRD-K83527041-001-01-6      True            True           26679   
230  BRD-K83527041-001-01-6      True            True           26680   
231  BRD-K83527041-001-01-6      True            True           26794   
232  BRD-K835270

                     inchikey                CPD_SMILES  \
255  DZRIWTKXHLXUTO-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
256  DZRIWTKXHLXUTO-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
257  DZRIWTKXHLXUTO-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
258  DZRIWTKXHLXUTO-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
259  DZRIWTKXHLXUTO-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
260  DZRIWTKXHLXUTO-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
261  DZRIWTKXHLXUTO-IECBHU...  C[C@H](CO)N1C[C@@H](C...   
262  DZRIWTKXHLXUTO-IECBHU...  C[C@H](CO)N1C[C@@H](C...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
255  BRD-K75755028-001-02-8      True            True           24726   
256  BRD-K75755028-001-02-8      True            True           24731   
257  BRD-K75755028-001-02-8      True            True           24732   
258  BRD-K75755028-001-02-8      True            True           24733   
259  BRD-K75755028-001-01-0      True            True           26679   
260  BRD-K75755028-001-01-0   

                     inchikey                CPD_SMILES  \
287  FOLOUAULJSJVDY-GZBFAF...  CNC[C@H]1OCc2cnnn2CCC...   
288  FOLOUAULJSJVDY-GZBFAF...  CNC[C@H]1OCc2cnnn2CCC...   
289  FOLOUAULJSJVDY-GZBFAF...  CNC[C@H]1OCc2cnnn2CCC...   
290  FOLOUAULJSJVDY-GZBFAF...  CNC[C@H]1OCc2cnnn2CCC...   
291  FOLOUAULJSJVDY-GZBFAF...  CNC[C@H]1OCc2cnnn2CCC...   
292  FOLOUAULJSJVDY-GZBFAF...  CNC[C@H]1OCc2cnnn2CCC...   
293  FOLOUAULJSJVDY-GZBFAF...  CNC[C@H]1OCc2cnnn2CCC...   
294  FOLOUAULJSJVDY-GZBFAF...  CNC[C@H]1OCc2cnnn2CCC...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
287  BRD-K78824440-001-02-9      True            True           24726   
288  BRD-K78824440-001-02-9      True            True           24731   
289  BRD-K78824440-001-02-9      True            True           24732   
290  BRD-K78824440-001-02-9      True            True           24733   
291  BRD-K78824440-001-01-1      True            True           26679   
292  BRD-K78824440-001-01-1   

                     inchikey                CPD_SMILES  \
331  GHLJLHKVBJDFKH-BPYKYC...  C[C@H](CO)N1C[C@H](C)...   
332  GHLJLHKVBJDFKH-BPYKYC...  C[C@H](CO)N1C[C@H](C)...   
333  GHLJLHKVBJDFKH-BPYKYC...  C[C@H](CO)N1C[C@H](C)...   
334  GHLJLHKVBJDFKH-BPYKYC...  C[C@H](CO)N1C[C@H](C)...   
335  GHLJLHKVBJDFKH-BPYKYC...  C[C@H](CO)N1C[C@H](C)...   
336  GHLJLHKVBJDFKH-BPYKYC...  C[C@H](CO)N1C[C@H](C)...   
337  GHLJLHKVBJDFKH-BPYKYC...  C[C@H](CO)N1C[C@H](C)...   
338  GHLJLHKVBJDFKH-BPYKYC...  C[C@H](CO)N1C[C@H](C)...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
331  BRD-K15489726-001-01-4      True            True           26679   
332  BRD-K15489726-001-01-4      True            True           26680   
333  BRD-K15489726-001-01-4      True            True           26794   
334  BRD-K15489726-001-01-4      True            True           26795   
335  BRD-K15489726-001-02-2      True            True           24734   
336  BRD-K15489726-001-02-2   

                     inchikey                CPD_SMILES  \
362  GMDADKPJAYHJFG-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
363  GMDADKPJAYHJFG-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
364  GMDADKPJAYHJFG-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
365  GMDADKPJAYHJFG-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
366  GMDADKPJAYHJFG-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
367  GMDADKPJAYHJFG-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
368  GMDADKPJAYHJFG-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
369  GMDADKPJAYHJFG-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
362  BRD-K66035321-001-02-0      True            True           24726   
363  BRD-K66035321-001-02-0      True            True           24731   
364  BRD-K66035321-001-02-0      True            True           24732   
365  BRD-K66035321-001-02-0      True            True           24733   
366  BRD-K66035321-001-01-2      True            True           26679   
367  BRD-K66035321-001-01-2   

                     inchikey                CPD_SMILES  \
416  HEIBNMHSSNMFIU-RLFYNM...  C[C@@H](CO)N1C[C@H](C...   
417  HEIBNMHSSNMFIU-RLFYNM...  C[C@@H](CO)N1C[C@H](C...   
418  HEIBNMHSSNMFIU-RLFYNM...  C[C@@H](CO)N1C[C@H](C...   
419  HEIBNMHSSNMFIU-RLFYNM...  C[C@@H](CO)N1C[C@H](C...   
420  HEIBNMHSSNMFIU-RLFYNM...  C[C@@H](CO)N1C[C@H](C...   
421  HEIBNMHSSNMFIU-RLFYNM...  C[C@@H](CO)N1C[C@H](C...   
422  HEIBNMHSSNMFIU-RLFYNM...  C[C@@H](CO)N1C[C@H](C...   
423  HEIBNMHSSNMFIU-RLFYNM...  C[C@@H](CO)N1C[C@H](C...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
416  BRD-K94672699-001-02-0      True            True           24726   
417  BRD-K94672699-001-02-0      True            True           24731   
418  BRD-K94672699-001-02-0      True            True           24732   
419  BRD-K94672699-001-02-0      True            True           24733   
420  BRD-K94672699-001-01-2      True            True           26679   
421  BRD-K94672699-001-01-2   

                     inchikey                CPD_SMILES  \
455  HYXYHDZNHQRBPW-CSODHU...  C[C@H](CO)N1C[C@H](C)...   
456  HYXYHDZNHQRBPW-CSODHU...  C[C@H](CO)N1C[C@H](C)...   
457  HYXYHDZNHQRBPW-CSODHU...  C[C@H](CO)N1C[C@H](C)...   
458  HYXYHDZNHQRBPW-CSODHU...  C[C@H](CO)N1C[C@H](C)...   
459  HYXYHDZNHQRBPW-CSODHU...  C[C@H](CO)N1C[C@H](C)...   
460  HYXYHDZNHQRBPW-CSODHU...  C[C@H](CO)N1C[C@H](C)...   
461  HYXYHDZNHQRBPW-CSODHU...  C[C@H](CO)N1C[C@H](C)...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
455  BRD-K06507315-001-01-1      True            True           26679   
456  BRD-K06507315-001-01-1      True            True           26680   
457  BRD-K06507315-001-01-1      True            True           26794   
458  BRD-K06507315-001-01-1      True            True           26795   
459  BRD-K06507315-001-02-9      True            True           24740   
460  BRD-K06507315-001-02-9      True            True           24750   
461  BRD-K065073

                     inchikey                CPD_SMILES  \
483  IGJDKOMBBXSAPT-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
484  IGJDKOMBBXSAPT-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
485  IGJDKOMBBXSAPT-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
486  IGJDKOMBBXSAPT-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
487  IGJDKOMBBXSAPT-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
488  IGJDKOMBBXSAPT-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
489  IGJDKOMBBXSAPT-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
483  BRD-K49755501-001-02-7      True            True           24740   
484  BRD-K49755501-001-02-7      True            True           24750   
485  BRD-K49755501-001-02-7      True            True           24751   
486  BRD-K49755501-001-01-9      True            True           26679   
487  BRD-K49755501-001-01-9      True            True           26680   
488  BRD-K49755501-001-01-9      True            True           26794   
489  BRD-K497555

                     inchikey                CPD_SMILES  \
522  JATGFWCPGFHVCF-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
523  JATGFWCPGFHVCF-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
524  JATGFWCPGFHVCF-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
525  JATGFWCPGFHVCF-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
526  JATGFWCPGFHVCF-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
527  JATGFWCPGFHVCF-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
528  JATGFWCPGFHVCF-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
522  BRD-K14470344-001-01-7      True            True           26679   
523  BRD-K14470344-001-01-7      True            True           26680   
524  BRD-K14470344-001-01-7      True            True           26794   
525  BRD-K14470344-001-01-7      True            True           26795   
526  BRD-K14470344-001-02-5      True            True           24740   
527  BRD-K14470344-001-02-5      True            True           24750   
528  BRD-K144703

                     inchikey                CPD_SMILES  \
564  JYOVEVCXPRGJRX-SQGPQF...  C[C@H](CO)N1C[C@H](C)...   
565  JYOVEVCXPRGJRX-SQGPQF...  C[C@H](CO)N1C[C@H](C)...   
566  JYOVEVCXPRGJRX-SQGPQF...  C[C@H](CO)N1C[C@H](C)...   
567  JYOVEVCXPRGJRX-SQGPQF...  C[C@H](CO)N1C[C@H](C)...   
568  JYOVEVCXPRGJRX-SQGPQF...  C[C@H](CO)N1C[C@H](C)...   
569  JYOVEVCXPRGJRX-SQGPQF...  C[C@H](CO)N1C[C@H](C)...   
570  JYOVEVCXPRGJRX-SQGPQF...  C[C@H](CO)N1C[C@H](C)...   
571  JYOVEVCXPRGJRX-SQGPQF...  C[C@H](CO)N1C[C@H](C)...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
564  BRD-K48005659-001-01-0      True            True           26679   
565  BRD-K48005659-001-01-0      True            True           26680   
566  BRD-K48005659-001-01-0      True            True           26794   
567  BRD-K48005659-001-01-0      True            True           26795   
568  BRD-K48005659-001-02-8      True            True           24734   
569  BRD-K48005659-001-02-8   

                     inchikey                CPD_SMILES  \
603  KFMYUMLTCHBBFY-RLSLOF...  C[C@@H](CO)N1C[C@H](C...   
604  KFMYUMLTCHBBFY-RLSLOF...  C[C@@H](CO)N1C[C@H](C...   
605  KFMYUMLTCHBBFY-RLSLOF...  C[C@@H](CO)N1C[C@H](C...   
606  KFMYUMLTCHBBFY-RLSLOF...  C[C@@H](CO)N1C[C@H](C...   
607  KFMYUMLTCHBBFY-RLSLOF...  C[C@@H](CO)N1C[C@H](C...   
608  KFMYUMLTCHBBFY-RLSLOF...  C[C@@H](CO)N1C[C@H](C...   
609  KFMYUMLTCHBBFY-RLSLOF...  C[C@@H](CO)N1C[C@H](C...   
610  KFMYUMLTCHBBFY-RLSLOF...  C[C@@H](CO)N1C[C@H](C...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
603  BRD-K18391257-001-02-4      True            True           24734   
604  BRD-K18391257-001-02-4      True            True           24735   
605  BRD-K18391257-001-02-4      True            True           24736   
606  BRD-K18391257-001-02-4      True            True           24739   
607  BRD-K18391257-001-01-6      True            True           26679   
608  BRD-K18391257-001-01-6   

                     inchikey                CPD_SMILES  \
642  KMYKLRXROVKHNJ-IMWIBF...  C[C@@H](CO)N1C[C@@H](...   
643  KMYKLRXROVKHNJ-IMWIBF...  C[C@@H](CO)N1C[C@@H](...   
644  KMYKLRXROVKHNJ-IMWIBF...  C[C@@H](CO)N1C[C@@H](...   
645  KMYKLRXROVKHNJ-IMWIBF...  C[C@@H](CO)N1C[C@@H](...   
646  KMYKLRXROVKHNJ-IMWIBF...  C[C@@H](CO)N1C[C@@H](...   
647  KMYKLRXROVKHNJ-IMWIBF...  C[C@@H](CO)N1C[C@@H](...   
648  KMYKLRXROVKHNJ-IMWIBF...  C[C@@H](CO)N1C[C@@H](...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
642  BRD-K30382953-001-01-6      True            True           26679   
643  BRD-K30382953-001-01-6      True            True           26680   
644  BRD-K30382953-001-01-6      True            True           26794   
645  BRD-K30382953-001-01-6      True            True           26795   
646  BRD-K30382953-001-02-4      True            True           24740   
647  BRD-K30382953-001-02-4      True            True           24750   
648  BRD-K303829

                     inchikey                CPD_SMILES  \
691  KPZGSTQZFNZEDR-SQWLQE...  C[C@H](CO)N1C[C@H](C)...   
692  KPZGSTQZFNZEDR-SQWLQE...  C[C@H](CO)N1C[C@H](C)...   
693  KPZGSTQZFNZEDR-SQWLQE...  C[C@H](CO)N1C[C@H](C)...   
694  KPZGSTQZFNZEDR-SQWLQE...  C[C@H](CO)N1C[C@H](C)...   
695  KPZGSTQZFNZEDR-SQWLQE...  C[C@H](CO)N1C[C@H](C)...   
696  KPZGSTQZFNZEDR-SQWLQE...  C[C@H](CO)N1C[C@H](C)...   
697  KPZGSTQZFNZEDR-SQWLQE...  C[C@H](CO)N1C[C@H](C)...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
691  BRD-K65135171-001-02-0      True            True           24740   
692  BRD-K65135171-001-02-0      True            True           24750   
693  BRD-K65135171-001-02-0      True            True           24751   
694  BRD-K65135171-001-01-2      True            True           26679   
695  BRD-K65135171-001-01-2      True            True           26680   
696  BRD-K65135171-001-01-2      True            True           26794   
697  BRD-K651351

                     inchikey                CPD_SMILES  \
729  MESWQEBVROHANN-JQVVWY...  C[C@H](CO)N1C[C@H](C)...   
730  MESWQEBVROHANN-JQVVWY...  C[C@H](CO)N1C[C@H](C)...   
731  MESWQEBVROHANN-JQVVWY...  C[C@H](CO)N1C[C@H](C)...   
732  MESWQEBVROHANN-JQVVWY...  C[C@H](CO)N1C[C@H](C)...   
733  MESWQEBVROHANN-JQVVWY...  C[C@H](CO)N1C[C@H](C)...   
734  MESWQEBVROHANN-JQVVWY...  C[C@H](CO)N1C[C@H](C)...   
735  MESWQEBVROHANN-JQVVWY...  C[C@H](CO)N1C[C@H](C)...   
736  MESWQEBVROHANN-JQVVWY...  C[C@H](CO)N1C[C@H](C)...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
729  BRD-K16079243-001-01-6      True            True           26679   
730  BRD-K16079243-001-01-6      True            True           26680   
731  BRD-K16079243-001-01-6      True            True           26794   
732  BRD-K16079243-001-01-6      True            True           26795   
733  BRD-K16079243-001-02-4      True            True           24734   
734  BRD-K16079243-001-02-4   

                     inchikey                CPD_SMILES  \
775  MNAJUJYQFCFYAB-LMMKCT...  CC(C)COC(=O)N(C)C[C@@...   
776  MNAJUJYQFCFYAB-LMMKCT...  CC(C)COC(=O)N(C)C[C@@...   
777  MNAJUJYQFCFYAB-LMMKCT...  CC(C)COC(=O)N(C)C[C@@...   
778  MNAJUJYQFCFYAB-LMMKCT...  CC(C)COC(=O)N(C)C[C@@...   
779  MNAJUJYQFCFYAB-LMMKCT...  CC(C)COC(=O)N(C)C[C@@...   
780  MNAJUJYQFCFYAB-LMMKCT...  CC(C)COC(=O)N(C)C[C@@...   
781  MNAJUJYQFCFYAB-LMMKCT...  CC(C)COC(=O)N(C)C[C@@...   
782  MNAJUJYQFCFYAB-LMMKCT...  CC(C)COC(=O)N(C)C[C@@...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
775  BRD-K70102743-001-01-4      True            True           26679   
776  BRD-K70102743-001-01-4      True            True           26680   
777  BRD-K70102743-001-01-4      True            True           26794   
778  BRD-K70102743-001-01-4      True            True           26795   
779  BRD-K70102743-001-02-2      True            True           24726   
780  BRD-K70102743-001-02-2   

                     inchikey                CPD_SMILES  \
812  MSGDIDWYZNYHQU-DRLORS...  C[C@H](CO)N1C[C@@H](C...   
813  MSGDIDWYZNYHQU-DRLORS...  C[C@H](CO)N1C[C@@H](C...   
814  MSGDIDWYZNYHQU-DRLORS...  C[C@H](CO)N1C[C@@H](C...   
815  MSGDIDWYZNYHQU-DRLORS...  C[C@H](CO)N1C[C@@H](C...   
816  MSGDIDWYZNYHQU-DRLORS...  C[C@H](CO)N1C[C@@H](C...   
817  MSGDIDWYZNYHQU-DRLORS...  C[C@H](CO)N1C[C@@H](C...   
818  MSGDIDWYZNYHQU-DRLORS...  C[C@H](CO)N1C[C@@H](C...   
819  MSGDIDWYZNYHQU-DRLORS...  C[C@H](CO)N1C[C@@H](C...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
812  BRD-K02960110-001-01-3      True            True           26679   
813  BRD-K02960110-001-01-3      True            True           26680   
814  BRD-K02960110-001-01-3      True            True           26794   
815  BRD-K02960110-001-01-3      True            True           26795   
816  BRD-K02960110-001-02-1      True            True           24726   
817  BRD-K02960110-001-02-1   

                     inchikey                CPD_SMILES  \
851  MSGDIDWYZNYHQU-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
852  MSGDIDWYZNYHQU-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
853  MSGDIDWYZNYHQU-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
854  MSGDIDWYZNYHQU-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
855  MSGDIDWYZNYHQU-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
856  MSGDIDWYZNYHQU-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
857  MSGDIDWYZNYHQU-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
851  BRD-K57026098-001-01-3      True            True           26679   
852  BRD-K57026098-001-01-3      True            True           26680   
853  BRD-K57026098-001-01-3      True            True           26794   
854  BRD-K57026098-001-01-3      True            True           26795   
855  BRD-K57026098-001-02-1      True            True           24740   
856  BRD-K57026098-001-02-1      True            True           24750   
857  BRD-K570260

                     inchikey                CPD_SMILES  \
908  NSEOCVGIAYKUAZ-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
909  NSEOCVGIAYKUAZ-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
910  NSEOCVGIAYKUAZ-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
911  NSEOCVGIAYKUAZ-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
912  NSEOCVGIAYKUAZ-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
913  NSEOCVGIAYKUAZ-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
914  NSEOCVGIAYKUAZ-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   
915  NSEOCVGIAYKUAZ-UWVAXJ...  C[C@@H](CO)N1C[C@@H](...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
908  BRD-K78395012-001-01-0      True            True           26679   
909  BRD-K78395012-001-01-0      True            True           26680   
910  BRD-K78395012-001-01-0      True            True           26794   
911  BRD-K78395012-001-01-0      True            True           26795   
912  BRD-K78395012-001-02-8      True            True           24734   
913  BRD-K78395012-001-02-8   

                     inchikey                CPD_SMILES  \
951  OIASBGFOQSUHIL-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
952  OIASBGFOQSUHIL-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
953  OIASBGFOQSUHIL-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
954  OIASBGFOQSUHIL-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
955  OIASBGFOQSUHIL-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
956  OIASBGFOQSUHIL-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
957  OIASBGFOQSUHIL-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   
958  OIASBGFOQSUHIL-VVZHRX...  C[C@H](CO)N1C[C@H](C)...   

      Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
951  BRD-K03817886-001-02-4      True            True           24726   
952  BRD-K03817886-001-02-4      True            True           24731   
953  BRD-K03817886-001-02-4      True            True           24732   
954  BRD-K03817886-001-02-4      True            True           24733   
955  BRD-K03817886-001-01-6      True            True           26679   
956  BRD-K03817886-001-01-6   

                      inchikey                CPD_SMILES  \
1000  PUHFBKBPROGAIN-OFQRWU...  C[C@@H](CO)N1C[C@H](C...   
1001  PUHFBKBPROGAIN-OFQRWU...  C[C@@H](CO)N1C[C@H](C...   
1002  PUHFBKBPROGAIN-OFQRWU...  C[C@@H](CO)N1C[C@H](C...   
1003  PUHFBKBPROGAIN-OFQRWU...  C[C@@H](CO)N1C[C@H](C...   
1004  PUHFBKBPROGAIN-OFQRWU...  C[C@@H](CO)N1C[C@H](C...   
1005  PUHFBKBPROGAIN-OFQRWU...  C[C@@H](CO)N1C[C@H](C...   
1006  PUHFBKBPROGAIN-OFQRWU...  C[C@@H](CO)N1C[C@H](C...   
1007  PUHFBKBPROGAIN-OFQRWU...  C[C@@H](CO)N1C[C@H](C...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1000  BRD-K30653168-001-01-1      True            True           26679   
1001  BRD-K30653168-001-01-1      True            True           26680   
1002  BRD-K30653168-001-01-1      True            True           26794   
1003  BRD-K30653168-001-01-1      True            True           26795   
1004  BRD-K30653168-001-02-9      True            True           24726   
1005  BRD-K3065

                      inchikey                CPD_SMILES  \
1039  QHTRHHVBBSPOHC-YCRNBW...  COc1ccccc1CN(C)C[C@H]...   
1040  QHTRHHVBBSPOHC-YCRNBW...  COc1ccccc1CN(C)C[C@H]...   
1041  QHTRHHVBBSPOHC-YCRNBW...  COc1ccccc1CN(C)C[C@H]...   
1042  QHTRHHVBBSPOHC-YCRNBW...  COc1ccccc1CN(C)C[C@H]...   
1043  QHTRHHVBBSPOHC-YCRNBW...  COc1ccccc1CN(C)C[C@H]...   
1044  QHTRHHVBBSPOHC-YCRNBW...  COc1ccccc1CN(C)C[C@H]...   
1045  QHTRHHVBBSPOHC-YCRNBW...  COc1ccccc1CN(C)C[C@H]...   
1046  QHTRHHVBBSPOHC-YCRNBW...  COc1ccccc1CN(C)C[C@H]...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1039  BRD-K09757845-001-01-8      True            True           26679   
1040  BRD-K09757845-001-01-8      True            True           26680   
1041  BRD-K09757845-001-01-8      True            True           26794   
1042  BRD-K09757845-001-01-8      True            True           26795   
1043  BRD-K09757845-001-02-6      True            True           24726   
1044  BRD-K0975

                      inchikey                CPD_SMILES  \
1090  QVHCTLOUCAXGOY-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1091  QVHCTLOUCAXGOY-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1092  QVHCTLOUCAXGOY-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1093  QVHCTLOUCAXGOY-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1094  QVHCTLOUCAXGOY-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1095  QVHCTLOUCAXGOY-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1096  QVHCTLOUCAXGOY-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1097  QVHCTLOUCAXGOY-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1090  BRD-K25595086-001-02-8      True            True           24734   
1091  BRD-K25595086-001-02-8      True            True           24735   
1092  BRD-K25595086-001-02-8      True            True           24736   
1093  BRD-K25595086-001-02-8      True            True           24739   
1094  BRD-K25595086-001-01-0      True            True           26679   
1095  BRD-K2559

                      inchikey                CPD_SMILES  \
1129  RIHXBZGHHXCGPP-MIZPHK...  C[C@H](CO)N1C[C@H](C)...   
1130  RIHXBZGHHXCGPP-MIZPHK...  C[C@H](CO)N1C[C@H](C)...   
1131  RIHXBZGHHXCGPP-MIZPHK...  C[C@H](CO)N1C[C@H](C)...   
1132  RIHXBZGHHXCGPP-MIZPHK...  C[C@H](CO)N1C[C@H](C)...   
1133  RIHXBZGHHXCGPP-MIZPHK...  C[C@H](CO)N1C[C@H](C)...   
1134  RIHXBZGHHXCGPP-MIZPHK...  C[C@H](CO)N1C[C@H](C)...   
1135  RIHXBZGHHXCGPP-MIZPHK...  C[C@H](CO)N1C[C@H](C)...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1129  BRD-K05229696-001-01-2      True            True           26679   
1130  BRD-K05229696-001-01-2      True            True           26680   
1131  BRD-K05229696-001-01-2      True            True           26794   
1132  BRD-K05229696-001-01-2      True            True           26795   
1133  BRD-K05229696-001-02-0      True            True           24740   
1134  BRD-K05229696-001-02-0      True            True           24750   
1

                      inchikey                CPD_SMILES  \
1168  RMZGJSLXUIRDMP-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1169  RMZGJSLXUIRDMP-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1170  RMZGJSLXUIRDMP-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1171  RMZGJSLXUIRDMP-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1172  RMZGJSLXUIRDMP-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1173  RMZGJSLXUIRDMP-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1174  RMZGJSLXUIRDMP-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1175  RMZGJSLXUIRDMP-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1168  BRD-K64560329-001-02-1      True            True           24734   
1169  BRD-K64560329-001-02-1      True            True           24735   
1170  BRD-K64560329-001-02-1      True            True           24736   
1171  BRD-K64560329-001-02-1      True            True           24739   
1172  BRD-K64560329-001-01-3      True            True           26679   
1173  BRD-K6456

                      inchikey                CPD_SMILES  \
1213  SUQCROIYCBRRRP-ZMSDIM...  C[C@H](CO)N1C[C@H](C)...   
1214  SUQCROIYCBRRRP-ZMSDIM...  C[C@H](CO)N1C[C@H](C)...   
1215  SUQCROIYCBRRRP-ZMSDIM...  C[C@H](CO)N1C[C@H](C)...   
1216  SUQCROIYCBRRRP-ZMSDIM...  C[C@H](CO)N1C[C@H](C)...   
1217  SUQCROIYCBRRRP-ZMSDIM...  C[C@H](CO)N1C[C@H](C)...   
1218  SUQCROIYCBRRRP-ZMSDIM...  C[C@H](CO)N1C[C@H](C)...   
1219  SUQCROIYCBRRRP-ZMSDIM...  C[C@H](CO)N1C[C@H](C)...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1213  BRD-K54332480-001-01-4      True            True           26679   
1214  BRD-K54332480-001-01-4      True            True           26680   
1215  BRD-K54332480-001-01-4      True            True           26794   
1216  BRD-K54332480-001-01-4      True            True           26795   
1217  BRD-K54332480-001-02-2      True            True           24740   
1218  BRD-K54332480-001-02-2      True            True           24750   
1

                      inchikey                CPD_SMILES  \
1264  UOJDQXZVAPFQKD-AOIWGV...  CC(C)COC(=O)N(C)C[C@@...   
1265  UOJDQXZVAPFQKD-AOIWGV...  CC(C)COC(=O)N(C)C[C@@...   
1266  UOJDQXZVAPFQKD-AOIWGV...  CC(C)COC(=O)N(C)C[C@@...   
1267  UOJDQXZVAPFQKD-AOIWGV...  CC(C)COC(=O)N(C)C[C@@...   
1268  UOJDQXZVAPFQKD-AOIWGV...  CC(C)COC(=O)N(C)C[C@@...   
1269  UOJDQXZVAPFQKD-AOIWGV...  CC(C)COC(=O)N(C)C[C@@...   
1270  UOJDQXZVAPFQKD-AOIWGV...  CC(C)COC(=O)N(C)C[C@@...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1264  BRD-K83190598-001-01-0      True            True           26679   
1265  BRD-K83190598-001-01-0      True            True           26680   
1266  BRD-K83190598-001-01-0      True            True           26794   
1267  BRD-K83190598-001-01-0      True            True           26795   
1268  BRD-K83190598-001-02-8      True            True           24740   
1269  BRD-K83190598-001-02-8      True            True           24750   
1

                      inchikey                CPD_SMILES  \
1307  WIMCLPXWPPGBKL-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1308  WIMCLPXWPPGBKL-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1309  WIMCLPXWPPGBKL-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1310  WIMCLPXWPPGBKL-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1311  WIMCLPXWPPGBKL-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1312  WIMCLPXWPPGBKL-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1313  WIMCLPXWPPGBKL-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   
1314  WIMCLPXWPPGBKL-FHAGJX...  C[C@H](CO)N1C[C@@H](C...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1307  BRD-K94834134-001-02-9      True            True           24726   
1308  BRD-K94834134-001-02-9      True            True           24731   
1309  BRD-K94834134-001-02-9      True            True           24732   
1310  BRD-K94834134-001-02-9      True            True           24733   
1311  BRD-K94834134-001-01-1      True            True           26679   
1312  BRD-K9483

                      inchikey                CPD_SMILES  \
1346  WJAIMBKFVIKGAD-AOJNWG...  C[C@@H](CO)N1C[C@H](C...   
1347  WJAIMBKFVIKGAD-AOJNWG...  C[C@@H](CO)N1C[C@H](C...   
1348  WJAIMBKFVIKGAD-AOJNWG...  C[C@@H](CO)N1C[C@H](C...   
1349  WJAIMBKFVIKGAD-AOJNWG...  C[C@@H](CO)N1C[C@H](C...   
1350  WJAIMBKFVIKGAD-AOJNWG...  C[C@@H](CO)N1C[C@H](C...   
1351  WJAIMBKFVIKGAD-AOJNWG...  C[C@@H](CO)N1C[C@H](C...   
1352  WJAIMBKFVIKGAD-AOJNWG...  C[C@@H](CO)N1C[C@H](C...   
1353  WJAIMBKFVIKGAD-AOJNWG...  C[C@@H](CO)N1C[C@H](C...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1346  BRD-K68458302-001-02-5      True            True           24734   
1347  BRD-K68458302-001-02-5      True            True           24735   
1348  BRD-K68458302-001-02-5      True            True           24736   
1349  BRD-K68458302-001-02-5      True            True           24739   
1350  BRD-K68458302-001-01-7      True            True           26679   
1351  BRD-K6845

                      inchikey                CPD_SMILES  \
1385  WJAIMBKFVIKGAD-VRUMLP...  C[C@H](CO)N1C[C@H](C)...   
1386  WJAIMBKFVIKGAD-VRUMLP...  C[C@H](CO)N1C[C@H](C)...   
1387  WJAIMBKFVIKGAD-VRUMLP...  C[C@H](CO)N1C[C@H](C)...   
1388  WJAIMBKFVIKGAD-VRUMLP...  C[C@H](CO)N1C[C@H](C)...   
1389  WJAIMBKFVIKGAD-VRUMLP...  C[C@H](CO)N1C[C@H](C)...   
1390  WJAIMBKFVIKGAD-VRUMLP...  C[C@H](CO)N1C[C@H](C)...   
1391  WJAIMBKFVIKGAD-VRUMLP...  C[C@H](CO)N1C[C@H](C)...   
1392  WJAIMBKFVIKGAD-VRUMLP...  C[C@H](CO)N1C[C@H](C)...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1385  BRD-K64100337-001-01-8      True            True           26679   
1386  BRD-K64100337-001-01-8      True            True           26680   
1387  BRD-K64100337-001-01-8      True            True           26794   
1388  BRD-K64100337-001-01-8      True            True           26795   
1389  BRD-K64100337-001-02-6      True            True           24734   
1390  BRD-K6410

                      inchikey                CPD_SMILES  \
1422  WMEBRZUCENJNIY-UEXGIB...  C[C@H](CO)N1C[C@H](C)...   
1423  WMEBRZUCENJNIY-UEXGIB...  C[C@H](CO)N1C[C@H](C)...   
1424  WMEBRZUCENJNIY-UEXGIB...  C[C@H](CO)N1C[C@H](C)...   
1425  WMEBRZUCENJNIY-UEXGIB...  C[C@H](CO)N1C[C@H](C)...   
1426  WMEBRZUCENJNIY-UEXGIB...  C[C@H](CO)N1C[C@H](C)...   
1427  WMEBRZUCENJNIY-UEXGIB...  C[C@H](CO)N1C[C@H](C)...   
1428  WMEBRZUCENJNIY-UEXGIB...  C[C@H](CO)N1C[C@H](C)...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1422  BRD-K27248268-001-02-2      True            True           24740   
1423  BRD-K27248268-001-02-2      True            True           24750   
1424  BRD-K27248268-001-02-2      True            True           24751   
1425  BRD-K27248268-001-01-4      True            True           26679   
1426  BRD-K27248268-001-01-4      True            True           26680   
1427  BRD-K27248268-001-01-4      True            True           26794   
1

                      inchikey                CPD_SMILES  \
1459  WYPGSQIBOYUSAK-CZSZKK...  C[C@@H](CO)N1C[C@@H](...   
1460  WYPGSQIBOYUSAK-CZSZKK...  C[C@@H](CO)N1C[C@@H](...   
1461  WYPGSQIBOYUSAK-CZSZKK...  C[C@@H](CO)N1C[C@@H](...   
1462  WYPGSQIBOYUSAK-CZSZKK...  C[C@@H](CO)N1C[C@@H](...   
1463  WYPGSQIBOYUSAK-CZSZKK...  C[C@@H](CO)N1C[C@@H](...   
1464  WYPGSQIBOYUSAK-CZSZKK...  C[C@@H](CO)N1C[C@@H](...   
1465  WYPGSQIBOYUSAK-CZSZKK...  C[C@@H](CO)N1C[C@@H](...   
1466  WYPGSQIBOYUSAK-CZSZKK...  C[C@@H](CO)N1C[C@@H](...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1459  BRD-K04749191-001-01-0      True            True           26679   
1460  BRD-K04749191-001-01-0      True            True           26680   
1461  BRD-K04749191-001-01-0      True            True           26794   
1462  BRD-K04749191-001-01-0      True            True           26795   
1463  BRD-K04749191-001-02-8      True            True           24726   
1464  BRD-K0474

                      inchikey                CPD_SMILES  \
1506  XYWPMUCHOREPKK-AXHZCL...  C[C@@H](CO)N1C[C@H](C...   
1507  XYWPMUCHOREPKK-AXHZCL...  C[C@@H](CO)N1C[C@H](C...   
1508  XYWPMUCHOREPKK-AXHZCL...  C[C@@H](CO)N1C[C@H](C...   
1509  XYWPMUCHOREPKK-AXHZCL...  C[C@@H](CO)N1C[C@H](C...   
1510  XYWPMUCHOREPKK-AXHZCL...  C[C@@H](CO)N1C[C@H](C...   
1511  XYWPMUCHOREPKK-AXHZCL...  C[C@@H](CO)N1C[C@H](C...   
1512  XYWPMUCHOREPKK-AXHZCL...  C[C@@H](CO)N1C[C@H](C...   
1513  XYWPMUCHOREPKK-AXHZCL...  C[C@@H](CO)N1C[C@H](C...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1506  BRD-K60129749-001-02-1      True            True           24726   
1507  BRD-K60129749-001-02-1      True            True           24731   
1508  BRD-K60129749-001-02-1      True            True           24732   
1509  BRD-K60129749-001-02-1      True            True           24733   
1510  BRD-K60129749-001-01-3      True            True           26679   
1511  BRD-K6012

                      inchikey                CPD_SMILES  \
1550  YRBTUIBMTXMQGX-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
1551  YRBTUIBMTXMQGX-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
1552  YRBTUIBMTXMQGX-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
1553  YRBTUIBMTXMQGX-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
1554  YRBTUIBMTXMQGX-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
1555  YRBTUIBMTXMQGX-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
1556  YRBTUIBMTXMQGX-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   
1557  YRBTUIBMTXMQGX-DXIQSL...  C[C@@H](CO)N1C[C@@H](...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1550  BRD-K86033856-001-01-6      True            True           26679   
1551  BRD-K86033856-001-01-6      True            True           26680   
1552  BRD-K86033856-001-01-6      True            True           26794   
1553  BRD-K86033856-001-01-6      True            True           26795   
1554  BRD-K86033856-001-02-4      True            True           24734   
1555  BRD-K8603

                      inchikey                CPD_SMILES  \
1588  ZMFVFBQYBHAIKE-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1589  ZMFVFBQYBHAIKE-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1590  ZMFVFBQYBHAIKE-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1591  ZMFVFBQYBHAIKE-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1592  ZMFVFBQYBHAIKE-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1593  ZMFVFBQYBHAIKE-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1594  ZMFVFBQYBHAIKE-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   
1595  ZMFVFBQYBHAIKE-VCBZYW...  C[C@@H](CO)N1C[C@@H](...   

       Metadata_broad_sample duplicate duplicateSMILES  Metadata_Plate  \
1588  BRD-K60520051-001-02-3      True            True           24726   
1589  BRD-K60520051-001-02-3      True            True           24731   
1590  BRD-K60520051-001-02-3      True            True           24732   
1591  BRD-K60520051-001-02-3      True            True           24733   
1592  BRD-K60520051-001-01-5      True            True           26679   
1593  BRD-K6052