### The following code will generate a list of blinded compounds along with PCR plate assignments for each blinded compound

<p> For our screens, we aliquot 20 mM stock solutions into 96-well PCR plates. This strategy blinds the experimenters to the identity of each test compound. The compound plate map is used to unblind the compounds after the screen is complete. A compound list of any size (as long as it contains at least 90 compounds) can be passed to the following code, which selects a subset of 90 compounds. Six reference and control conditions are then added to the 90 selected compounds, and the code returns a fully mapped and randomized list of 96 testable conditions.</p>

In [1]:
# Loading the necessary packages
import pandas as pd
import numpy as np
from datetime import date
import random

1. Read in and clean up the compound data

In [None]:
compound_df = pd.read_csv('/Volumes/LaCie/_2021_08_screen/<compound list.csv>')

In [47]:
# Promoting the first data row to the header; the original header row holds instructions on how to fill in the form

new_header = compound_df.iloc[0] #Grab the first row for the header
compound_df = compound_df[1:] #Take the data less the header row
compound_df.columns = new_header #Set the header row as the df header
compound_df = compound_df.drop(compound_df.index[0]) #Drop the example row from the dataset
compound_df.head()

Unnamed: 0,Compound,CAS ID,Molecular Weight,Vendor,Using compound stock currently in lab?,Ordered?,Grant Charged,"Solid, Liquid or Chemical Library? (S, L, CL)",Quantity ordered (g or L),Received?,Storage location,Toxicity,Compound Classification,Notes
2,(-)-Cedrene,469-61-4,204.35,MCE,False,True,,CL,100 mL @ 50mM solution,True,-80,,,(-)-Cedrene (α-cedrene) is a sesquiterpene con...
3,(-)-Huperzine A,102518-79-6,242.32,MCE,False,True,,CL,100 mL @ 50mM solution,True,-80,,,(-)-Huperzine A (Huperzine A) is an alkaloid i...
4,"2,5-Dihydroxybenzoic acid",490-79-9,154.12,MCE,False,True,,CL,100 mL @ 50mM solution,True,-80,,,"2,5-Dihydroxybenzoic acid is a derivative of b..."
5,4-Methoxybenzaldehyde,123-11-5,136.15,MCE,False,True,,CL,100 mL @ 50mM solution,True,-80,,,4-Methoxybenzaldehyde is a naturally occurring...
6,5-Aminolevulinic acid (hydrochloride),5451-09-2,167.59,MCE,False,True,,CL,100 mL @ 50mM solution,True,-80,,,5-Aminolevulinic acid hydrochloride (5-ALA hyd...
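
The header clean-up above can be checked on a small in-memory frame before running it on the real file. This is a minimal sketch that assumes the form layout described above (the real column names in the first data row, followed by one example row); the helper name `promote_header` and the placeholder values are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the raw form export (placeholder values are hypothetical):
# row 0 holds the real column names, row 1 is the example entry.
raw = pd.DataFrame(
    [
        ["Compound", "CAS ID"],            # real header row
        ["Example compound", "00-00-0"],   # example row to discard
        ["Camphor", "76-22-2"],
        ["Citronellol", "106-22-9"],
    ],
    columns=["Fill in one row per compound", "Unnamed: 1"],
)

def promote_header(df):
    """Use the first row as the header and drop the example row below it."""
    out = df.iloc[2:].copy()         # skip the header row and the example row
    out.columns = list(df.iloc[0])   # first row becomes the column names
    return out

cleaned = promote_header(raw)
print(cleaned)
```

As in the cell above, the surviving rows keep their original index labels, so the first real compound has index 2.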


2. Some of the compounds for this screen were obtained from a compound library vendor. We will use all of the compounds in the compound library; the rest will be randomly selected from the remaining list.

In [48]:
CL = compound_df[compound_df["Solid, Liquid or Chemical Library? (S, L, CL)"] == "CL"] # Subset the compound library compounds
to_select = compound_df[(compound_df["Solid, Liquid or Chemical Library? (S, L, CL)"] == "S") | (compound_df["Solid, Liquid or Chemical Library? (S, L, CL)"] == "L")]
selected = to_select.sample(n=(90 - len(CL))) # Randomly sample non-CL compounds to fill the remaining slots (90 total)
final = pd.concat([CL, selected], axis=0) # Combine the compound library data with the randomly selected compounds
print(len(final)) # Check that the length of the final list = 90

# Dropping columns that aren't necessary for this step, i.e. inventory maintenance, etc.
final = final.iloc[:, 0:2]

90
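
The selection step fails with a pandas error if the library already has more than 90 compounds or if too few solid/liquid compounds remain. The sketch below wraps the same logic in a function with explicit checks. The function name `select_compounds`, the `seed` parameter, and the toy data are assumptions for illustration, not part of the original workflow.

```python
import pandas as pd

TYPE_COL = "Solid, Liquid or Chemical Library? (S, L, CL)"

def select_compounds(df, n_total=90, seed=None):
    """Keep every CL compound, then randomly fill the remaining slots from S/L."""
    library = df[df[TYPE_COL] == "CL"]
    pool = df[df[TYPE_COL].isin(["S", "L"])]
    n_needed = n_total - len(library)
    if n_needed < 0:
        raise ValueError(f"{len(library)} library compounds exceed {n_total} slots")
    if n_needed > len(pool):
        raise ValueError(f"need {n_needed} more compounds, only {len(pool)} available")
    picked = pool.sample(n=n_needed, random_state=seed)
    return pd.concat([library, picked], axis=0)

# Toy data (hypothetical): 3 library compounds and 5 others, 6 slots to fill.
toy = pd.DataFrame({
    "Compound": [f"cmpd_{i}" for i in range(8)],
    TYPE_COL: ["CL"] * 3 + ["S", "L", "S", "L", "S"],
})
chosen = select_compounds(toy, n_total=6, seed=0)
print(len(chosen))
```

Raising early with a readable message makes it clearer whether the compound list or the plate size needs to change.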


3. Adding in the reference and control compounds. We should have a final dataframe with 96 total conditions

In [49]:
ref_data_dict = {'Compound': ['DMSO', 'H2O', 'Diacetyl', 'Isoamyl alcohol', '2-nonanone', '1-octanol'],
            'CAS ID' : ['67-68-5', '7732-18-5', '431-03-8', '123-51-3', '821-55-6', '111-87-5']}
ref_data = pd.DataFrame(ref_data_dict, columns=['Compound', 'CAS ID'])
final = pd.concat([final, ref_data], sort = True)
print(len(final))

96
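
A length of 96 does not rule out a reference compound that was also picked as a test compound. The following is a minimal sketch of a duplicate check keyed on CAS ID; the helper name `check_conditions` and the toy frames are hypothetical.

```python
import pandas as pd

def check_conditions(df, expected=96, key="CAS ID"):
    """Raise on a wrong row count; return rows whose key appears more than once."""
    if len(df) != expected:
        raise ValueError(f"expected {expected} conditions, found {len(df)}")
    return df[df.duplicated(subset=key, keep=False)]

# Toy example (hypothetical selection that already contains 1-octanol).
screen = pd.DataFrame({"Compound": ["Camphor", "1-octanol"],
                       "CAS ID": ["76-22-2", "111-87-5"]})
refs = pd.DataFrame({"Compound": ["DMSO", "1-octanol"],
                     "CAS ID": ["67-68-5", "111-87-5"]})
combined = pd.concat([screen, refs], ignore_index=True)

dupes = check_conditions(combined, expected=4)
print(dupes)
```

An empty result means every condition is distinct; any rows returned show which compound would occupy two wells.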


4. Shuffling the final list of 96 compounds to randomize and blind

In [53]:
shuffled = final.sample(frac=1).reset_index(drop=True)
shuffled

Unnamed: 0,CAS ID,Compound
0,5451-09-2,5-Aminolevulinic acid (hydrochloride)
1,78-83-1,Isobutanol
2,112-39-0,Methyl palmitate
3,116-26-7,Safranal
4,168316-95-8,Spinosad
5,476-66-4,Ellagic acid
6,76-22-2,Camphor
7,106-22-9,Citronellol
8,20283-92-5,Rosmarinic acid
9,520-18-3,Kaempferol
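
The shuffle above gives a different order on every run, so the plate map cannot be regenerated if the saved copy is lost. One option, sketched below, is to pass a fixed `random_state` to `sample`. The seed value and the short toy list are hypothetical, and the seed would need to be stored with the unblinding key rather than shared with the experimenters.

```python
import pandas as pd

# Toy list of conditions (a few names taken from the tables above).
conditions = pd.DataFrame(
    {"Compound": ["DMSO", "H2O", "Camphor", "Safranal", "Kaempferol"]}
)

SEED = 20210801  # hypothetical value; any fixed integer works

# The same seed reproduces the same shuffled order.
first = conditions.sample(frac=1, random_state=SEED).reset_index(drop=True)
second = conditions.sample(frac=1, random_state=SEED).reset_index(drop=True)

print(first.equals(second))
```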


5. Adding column and row values to reflect each compound's placement in the working stock library

In [62]:
lets = np.array(['B', 'C', 'D', 'E', 'F', 'G']) # Row values used on each PCR plate

lets = list(np.repeat(lets, 4)) # Repeating each row letter once per column (24 wells per plate)
lets = lets*4 # Repeating the 24-well pattern for all 4 plates (96 conditions)

nums = ['2', '3', '4', '5'] # PCR plate column values
nums = nums*24 # Extending the list of numbers to accommodate 96 conditions

shuffled['Num'] = nums # Adding column values to the randomized/shuffled df of compounds
shuffled['Let'] = lets # Adding row values to the randomized/shuffled df of compounds
shuffled["Compound Well"] =  shuffled["Let"] + shuffled["Num"] #Combining the row/column value into 1 field
#compound_locs = compound_locs.drop(['Num', 'Let'], axis=1)
shuffled.head()

Unnamed: 0,CAS ID,Compound,Num,Let,Compound Well
0,5451-09-2,5-Aminolevulinic acid (hydrochloride),2,B,B2
1,78-83-1,Isobutanol,3,B,B3
2,112-39-0,Methyl palmitate,4,B,B4
3,116-26-7,Safranal,5,B,B5
4,168316-95-8,Spinosad,2,C,C2
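
The same well labels can be built in one step with `itertools.product`, which makes the row-major fill order explicit. This is an alternative sketch rather than the approach used in the cell above; the variable names are chosen for illustration.

```python
from itertools import product

rows = ["B", "C", "D", "E", "F", "G"]   # plate rows in use
cols = ["2", "3", "4", "5"]             # plate columns in use
n_plates = 4

# Row-major order within a plate: every column of row B, then row C, and so on.
wells_per_plate = [r + c for r, c in product(rows, cols)]
all_wells = wells_per_plate * n_plates

print(wells_per_plate[:5])  # ['B2', 'B3', 'B4', 'B5', 'C2']
print(len(all_wells))       # 96
```

The first five labels match the `Compound Well` column shown above.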


6. We need to assign each compound to a PCR plate.

In [64]:
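plate_nums = np.repeat(list(range(1,5)), 24) # Plate indices 1-4, each repeated 24 times (one per well)
plate_nums = list(map(str, plate_nums)) # Converting the indices to strings
plate_nums = ["1-" + num for num in plate_nums] # Prefixing to give plate labels "1-1" through "1-4"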
shuffled['Plate Number'] = plate_nums # Adding the plate numbers to the df
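
Unblinding relies on each plate and well pair pointing to exactly one compound. The sketch below rebuilds only the layout columns (no compound names) and checks that each plate holds 24 wells with no repeated pair. It is a stand-alone sanity check under the layout described above, not part of the original notebook.

```python
from itertools import product

import numpy as np
import pandas as pd

# 24 well labels per plate, repeated for 4 plates.
wells = [r + c for r, c in product("BCDEFG", "2345")] * 4
# Plate labels "1-1" to "1-4", 24 consecutive rows each.
plates = ["1-" + str(p) for p in np.repeat(np.arange(1, 5), 24)]

layout = pd.DataFrame({"Plate Number": plates, "Compound Well": wells})

per_plate = layout.groupby("Plate Number").size()
has_clash = layout.duplicated(subset=["Plate Number", "Compound Well"]).any()

print(per_plate.to_dict())
print(has_clash)
```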

7. Generating plate maps for each individual PCR working stock plate. The person creating the working stock plates needs to follow these maps so that the compounds map correctly during the unblinding process.

In [70]:
gb = shuffled.groupby('Plate Number')
for i, (plate, hold) in enumerate(gb, start=1): # Groups are visited in sorted plate order
    gp_name = "Plate" + str(i) # Output file name: Plate1, Plate2, ...
    pvt = hold.pivot(index="Let", columns="Num", values="Compound") # Plate rows as index, plate columns as columns
    pvt.to_csv('/Volumes/LaCie/_2021_08_screen/' + gp_name + '.csv')
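
After the screen, unblinding amounts to joining the recorded results back to the randomized table on plate and well. The sketch below assumes the full randomized table was saved as the key and that results are recorded by plate number and well. The `results` frame, its `score` column, and the example rows are all hypothetical.

```python
import pandas as pd

# Hypothetical key: a few rows shaped like the randomized table built above.
key = pd.DataFrame({
    "Plate Number": ["1-1", "1-1", "1-2"],
    "Compound Well": ["B2", "B3", "B2"],
    "Compound": ["Safranal", "Camphor", "DMSO"],
})

# Hypothetical blinded results: experimenters record plate and well only.
results = pd.DataFrame({
    "Plate Number": ["1-2", "1-1", "1-1"],
    "Compound Well": ["B2", "B3", "B2"],
    "score": [0.02, 0.61, -0.35],
})

# Left join keeps every result row; validate fails if a well maps to two compounds.
unblinded = results.merge(
    key, on=["Plate Number", "Compound Well"], how="left", validate="many_to_one"
)
print(unblinded)
```

Using `validate="many_to_one"` makes the merge raise an error if the key contains a repeated plate and well pair, rather than silently duplicating result rows.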

In [71]:
#shuffled.to_csv('/Volumes/LaCie/_2021_08_screen/S3_randomized_compounds.csv')