# Results of the Cytometry experiment

### Summary of the experiment

1. Data gathering from [BBBC022](https://data.broadinstitute.org/bbbc/BBBC022/) 

2. Compound selection and data conditioning (`Compound_Images_List.ipynb`)

3. Segmentation and DNA content analysis (for UNET)
    * Segmentation (`batch_prediction.py`)
    * DNA content analysis

4. Segmentation and DNA content analysis (for CellProfiler)
    * Segmentation
    * DNA content analysis

5. Results exploitation (`Results.ipynb`)


### 1. Data gathering

The dataset, [BBBC022](https://data.broadinstitute.org/bbbc/BBBC022/), comes from the [Broad Bioimage Benchmark Collection](https://data.broadinstitute.org/bbbc/index.html). The channel of interest is the Hoechst 33342 channel, so only the `.zip` files whose names end with `w1` need to be downloaded. All the files must be decompressed and stored in a `/data` folder.
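The selection and extraction step described above could be scripted roughly as follows. This is a minimal sketch, not part of the original workflow: the helper names `is_hoechst_archive` and `extract_hoechst_archives`, and the idea of a separate download folder, are assumptions made for illustration.

```python
from pathlib import Path
import zipfile


def is_hoechst_archive(name):
    """Return True for a .zip file whose name (without extension) ends with 'w1'."""
    path = Path(name)
    return path.suffix.lower() == ".zip" and path.stem.endswith("w1")


def extract_hoechst_archives(download_dir, data_dir):
    """Extract every matching archive found in download_dir into data_dir.

    Both directory arguments are hypothetical; adapt them to the local layout.
    Returns the names of the archives that were extracted.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(download_dir).glob("*.zip")):
        if is_hoechst_archive(archive.name):
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(data_dir)
            extracted.append(archive.name)
    return extracted
```

Archives for the other channels are simply skipped, so the whole download folder can be passed without sorting the files by hand.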


### 2. Compound selection and data conditioning

The list of the available compounds is printed by the first cells of `Compound_Images_List.ipynb`. In the section "Compound selection" we can choose which compound to analyse.

Once the compound is selected, we can run the entire notebook `Compound_Images_List.ipynb` for the desired compound. It should output a `.csv` file in the current folder with the list of the images corresponding to the compound. It should also create the structure of the directories in order to use the `batch_prediction.py` code.

It creates a `data/COMPOUND` directory with three subdirectories : 
* `/COMPOUND/images` : folder of the images corresponding to the compound (filled by `Compound_Images_List.ipynb`)
* `/COMPOUND/masks` : folder of the masks found with UNET (computed by `batch_prediction.py`)
* `/COMPOUND/cp` : folder of the intermediary output of the DNA content analysis with CellProfiler

It also creates a `/COMPOUND` folder in another location where the results of the CellProfiler segmentation will be stored.
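For reference, the three subdirectories listed above can be recreated by hand with a few lines of standard-library code. This is only a sketch of the layout, assuming a hypothetical helper name `create_compound_dirs`; the notebook itself may build the folders differently, and the separate CellProfiler results folder is not covered here.

```python
from pathlib import Path


def create_compound_dirs(data_root, compound):
    """Create data_root/COMPOUND with its images, masks and cp subfolders.

    Returns a dict mapping each subfolder name to its path.
    """
    compound_dir = Path(data_root) / compound
    created = {}
    for name in ("images", "masks", "cp"):
        sub = compound_dir / name
        # exist_ok=True makes the call safe to repeat for the same compound
        sub.mkdir(parents=True, exist_ok=True)
        created[name] = sub
    return created
```

Calling `create_compound_dirs("data", "Etoposide")` would, under these assumptions, produce `data/Etoposide/images`, `data/Etoposide/masks` and `data/Etoposide/cp`.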


### 3. Segmentation and DNA content analysis (for UNET)

#### 3. a) Segmentation

Use the `batch_prediction.py` file to process the images. It should be called by :

```python batch_prediction.py experiment_name image_list.csv input_dir output_dir```

The recommended structure is :

```python3 batch_prediction.py experiment1 ./../../DSB\ paper/COMPOUND_image_list.csv ./../../UNET/COMPOUND/images/ ./../../UNET/COMPOUND/masks/```

where `COMPOUND` is the selected compound for the compound analysis.


*Warnings:* 
* This command should be called in the folder where `batch_prediction.py` is
* python3 can be used instead of python
* The experiment name can be chosen by the user
* The list of images is the output of the compound analysis
* The `model.hdf5` file should be located in the `experiment/experiment_name` folder

This step takes care of the segmentation using UNET. It outputs the mask corresponding to each image listed in `COMPOUND_image_list.csv`.
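When several compounds have to be processed, the command above can be assembled programmatically. The sketch below is an illustration only: the helper names `build_prediction_command` and `run_prediction` are hypothetical, and the paths in the usage line are the recommended ones from this section. Passing the arguments as a list means the space in `DSB paper` does not need to be escaped.

```python
import subprocess


def build_prediction_command(experiment_name, image_list, input_dir, output_dir,
                             python="python3"):
    """Return the argument list for one batch_prediction.py call."""
    return [python, "batch_prediction.py", experiment_name,
            image_list, input_dir, output_dir]


def run_prediction(script_dir, command):
    """Run the command from the folder that contains batch_prediction.py."""
    # check=True raises CalledProcessError if the script exits with an error
    subprocess.run(command, cwd=script_dir, check=True)


compound = "COMPOUND"  # placeholder for the selected compound
command = build_prediction_command(
    "experiment1",
    "./../../DSB paper/" + compound + "_image_list.csv",
    "./../../UNET/" + compound + "/images/",
    "./../../UNET/" + compound + "/masks/",
)
```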


#### 3. b) DNA content analysis

Use a CellProfiler pipeline to compute the integrated intensity of the images, masked by the segmentation masks from the prediction. The pipeline consists of a `MeasureObjectIntensity` module and a few other modules that save the results.

For the compounds analysed here (Gabazine, CYCLOPIAZONIC ACID, ESTRADIOL, DMSO, Proglumide, PIZOTIFEN, CATECHIN PENTABENZOATE, TRIADIMEFON, 3-HYDROXYCOUMARIN, Etoposide, DACTINOMYCIN, Colchicine, Blebbistatin), the projects are saved in their respective folders (`/data/COMPOUND`). The pipeline for any other compound must have exactly the same structure to be usable with this notebook.


### 4. Segmentation and DNA content analysis (for CellProfiler)

The segmentation and the DNA content analysis for CellProfiler are done using the same pipeline. The pipeline consists of one segmentation step (`IdentifyPrimaryObjects` module) and one measurement step (`MeasureObjectIntensity` module) that computes the DNA content of the objects.


### 5. Results exploitation

DNA content analysis is the purpose of this notebook.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

import math

import skimage
from skimage import io

from os import listdir
from os.path import isfile, join

## Results gathering


The following functions load the `.csv` files corresponding to the DNA content analysis for UNET and CellProfiler. If the directory structure described above is respected, they should work as is; otherwise the paths may need to be adapted.

They create for each segmentation method (UNET and CP) a dataframe containing the information of the analysis :

| Index | ImageNumber | ObjectNumber | Integrated_Intensity | Plate |
|:-----:|-------------|--------------|----------------------|-------|
| **0** | 1           | 1            | 2.245075             | 20589 |
| **1** | 1           | 2            | 1.579263             | 20589 |
| ...   | ...         | ...          | ...                  | ...   |


Each *ImageNumber* corresponds to an image, each *ObjectNumber* corresponds to one found nucleus. The *Integrated_Intensity* field shows the result of the DNA content analysis, and we keep the information of the plate.

For each compound analysis, we have 4 dataframes of the above type :

|  Compound\Method | UNET         | CP         |
|:----------------:|--------------|------------|
| **Compound X :** | df_unet_cmpd | df_cp_cmpd |
|       **DMSO :** | df_unet_dmso | df_cp_dmso |



In [2]:
def construct_dataframe_unet(data_path, compound_number):
    compounds = ['Gabazine', 'CYCLOPIAZONIC ACID', 
                 'ESTRADIOL', 'DMSO', 
                 'Proglumide', 'PIZOTIFEN', 
                 'CATECHIN PENTABENZOATE', 'TRIADIMEFON',
                 '3-HYDROXYCOUMARIN', 'Etoposide',
                 'DACTINOMYCIN', 'Colchicine',
                 'Blebbistatin']

    compounds2 = ['Gabazine', 'CYCLOPIAZONIC_ACID', 
                  'ESTRADIOL', 'DMSO', 
                  'Proglumide', 'PIZOTIFEN', 
                  'CATECHIN_PENTABENZOATE', 'TRIADIMEFON',
                  '3-HYDROXYCOUMARIN', 'Etoposide',
                  'DACTINOMYCIN', 'Colchicine',
                  'Blebbistatin']

    compound = compounds[compound_number]
    compound2 = compounds2[compound_number]

    mypath = data_path + compound
    onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f)) and f not in ['UNET_MASKS.csv', 'UNET_Image.csv']]
    masks_end = 'MASKS.csv'
    image_end = 'Image.csv'
    masks_file = mypath + '/'
    image_file = mypath + '/'

    for f in onlyfiles:
        if (len(f) < len(masks_end) or len(f) < len(image_end)):
            continue

        if (f[len(f)-len(masks_end):] == masks_end):
            masks_file += f

        if (f[len(f)-len(image_end):] == image_end):
            image_file += f

    df_masks = pd.read_csv(masks_file)
    df_image = pd.read_csv(image_file)

    df1 = df_masks[['ImageNumber', 'ObjectNumber','Intensity_IntegratedIntensity_'+compound2+'_IMAGES']]
    df2 = df_image[['FileName_'+compound2+'_IMAGES']]

    ser = []
    for i in range(0, len(df1)):
        ser.append(df2['FileName_'+compound2+'_IMAGES'][df1.ImageNumber.iloc[i]-1][18:23])

    ser = np.array(ser)
    df1 = df1.assign(Plate=pd.Series(ser).values)

    df1 = df1.rename(index=str, columns={'Intensity_IntegratedIntensity_'+compound2+'_IMAGES': "Integrated_Intensity"})

    return df1, compound

In [3]:
def construct_dataframe_unet2(data_path, compound_number):
    compounds = ['Gabazine', 'CYCLOPIAZONIC ACID', 
                 'ESTRADIOL', 'DMSO', 
                 'Proglumide', 'PIZOTIFEN', 
                 'CATECHIN PENTABENZOATE', 'TRIADIMEFON',
                 '3-HYDROXYCOUMARIN', 'Etoposide',
                 'DACTINOMYCIN', 'Colchicine',
                 'Blebbistatin']

    compound = compounds[compound_number]

    mypath = data_path + compound
    masks_end = 'MASKS.csv'
    image_end = 'Image.csv'
    masks_file = mypath + '/UNET_DilateObjects.csv'
    image_file = mypath + '/UNET_Image.csv'
    
    df_masks = pd.read_csv(masks_file)
    df_image = pd.read_csv(image_file)

    df1 = df_masks[['ImageNumber', 'ObjectNumber','Intensity_IntegratedIntensity_IMAGES']]
    df2 = df_image[['FileName_IMAGES']]

    ser = []
    for i in range(0, len(df1)):
        ser.append(df2['FileName_IMAGES'][df1.ImageNumber.iloc[i]-1][18:23])

    ser = np.array(ser)
    df1 = df1.assign(Plate=pd.Series(ser).values)

    df1 = df1.rename(index=str, columns={'Intensity_IntegratedIntensity_IMAGES': "Integrated_Intensity"})

    return df1, compound

In [4]:
def construct_dataframe_cp(data_path, compound_number):
    compounds = ['Gabazine', 'CYCLOPIAZONIC ACID', 
                 'ESTRADIOL', 'DMSO', 
                 'Proglumide', 'PIZOTIFEN', 
                 'CATECHIN PENTABENZOATE', 'TRIADIMEFON',
                 '3-HYDROXYCOUMARIN', 'Etoposide',
                 'DACTINOMYCIN', 'Colchicine',
                 'Blebbistatin']
    compound = compounds[compound_number]
    compounds2 = ['Gabazine', 'CYCLOPIAZONIC_ACID', 
                  'ESTRADIOL', 'DMSO', 
                  'Proglumide', 'PIZOTIFEN', 
                  'CATECHIN_PENTABENZOATE', 'TRIADIMEFON',
                  '3-HYDROXYCOUMARIN', 'Etoposide',
                  'DACTINOMYCIN', 'Colchicine',
                  'Blebbistatin']
    compound2 = compounds2[compound_number]

    compound_path = data_path + compound + '/'
    
    df1 = pd.read_csv(compound_path + 'Nuclei.csv')
    df2 = pd.read_csv(compound_path + 'Image.csv')

    df1 = df1[['ImageNumber', 'ObjectNumber','Intensity_IntegratedIntensity_DNA']]
    df2 = df2[['FileName_DNA']]
    
    ser = []
    for i in range(0, len(df1)):
        ser.append(df2['FileName_DNA'][df1.ImageNumber.iloc[i]-1][18:23])

    ser = np.array(ser)
    df1 = df1.assign(Plate=pd.Series(ser).values)

    df1 = df1.rename(index=str, columns={'Intensity_IntegratedIntensity_DNA': "Integrated_Intensity"})
    
    return df1

## Image Plotting 

For each of the two methods and each compound, the two functions below display an image next to its corresponding mask. The UNET version also displays the image seen through the mask.

In [5]:
def plot_image_and_mask_unet(data_path, compound, i):
    mypath = data_path + compound
    
    image_path = mypath + '/images/'
    masks_path = mypath + '/masks/'
    
    onlyfiles = [f for f in listdir(image_path) if isfile(join(image_path, f))]
    im_end = '.tif'
    files = [f for f in onlyfiles if f[-4:]==im_end]
    
    file = files[i]
#     print(file)
    
    ipath = image_path + file
    mpath = masks_path + file
    
    im = skimage.io.imread(ipath)
    ma = skimage.io.imread(mpath)

    plt.figure(figsize=(15,10))
    plt.subplot(1,2,1)
    plt.imshow(im)
    plt.subplot(1,2,2)
#     plt.imshow(ma>0)
    plt.imshow(ma)
    
    plt.figure(figsize=(7,6))
    plt.imshow(im*(ma>0))

In [6]:
def plot_image_and_mask_cp(data_path_im, data_path_cp, compound, i):
    mypath = data_path_im + compound
    
    image_path = mypath + '/images/'
    
    onlyfiles = [f for f in listdir(image_path) if isfile(join(image_path, f))]
    im_end = '.tif'
    files = [f for f in onlyfiles if f[-4:]==im_end]
    
    file = files[i]
#     print(file)
    ipath = image_path + file
    
    
    mpath = data_path_cp + compound +'/' + file[:-4] + '_Overlay.tiff'
#     print(mpath)
    
    im = skimage.io.imread(ipath)
    ma = skimage.io.imread(mpath)
    plt.figure(figsize=(15,10))
    plt.subplot(1,2,1)
    plt.imshow(im)
    plt.subplot(1,2,2)
    plt.imshow(ma)
    

## Normalization

Each plate corresponds to different experimental conditions. Hence, we cannot compare the intensity of a cell in plate A with the intensity of a cell in plate B without normalizing them.

To normalize the intensities over the different plates, we use the log-transform of the histograms. 


### Splitting and log-intensities

We first split the dataframe into a dictionary with each plate as entry (`split_per_plates`) and then, we compute the log10 of each intensity value (`compute_log_intensities`), for each plate.

This gives us a dictionary of dataframes :

{'20589':

| Index | ImageNumber | ObjectNumber | Integrated_Intensity | Plate | Log_Integrated_Intensity |
|:-----:|-------------|--------------|----------------------|-------|--------------------------|
| **0** | 1           | 1            | 2.245075             | 20589 | 0.351231                 |
| **1** | 1           | 2            | 1.579263             | 20589 | 0.198454                 |
| ...   | ...         | ...          | ...                  | ...   | ...                      |

'20590':

| Index | ImageNumber | ObjectNumber | Integrated_Intensity | Plate | Log_Integrated_Intensity |
|:-----:|-------------|--------------|----------------------|-------|--------------------------|
| **0** | 10          | 1            | 3.067079             | 20590 | 0.486725                 |
| **1** | 10          | 2            | 2.338338             | 20590 | 0.368907                 |
| ...   | ...         | ...          | ...                  | ...   | ...                      |

...}

In [7]:
def split_per_plates(df):
    plates = df.Plate.unique()
    dfs = {}

    for p in plates:
        dfs[p] = df[df.Plate == p]
    
    return dfs

def compute_log_intensities(dfs):
    for p in dfs:
        LII = np.log10(dfs[p].Integrated_Intensity.values)
        dfs[p] = dfs[p].assign(Log_Integrated_Intensity = pd.Series(LII).values)
    return dfs

### Normalization

To normalize the intensities over the plates, we use the log intensities. A typical DNA content distribution has its global maximum at position 2N and a local maximum at position 4N. This is what we obtain for each plate, but the position of 2N is not the same for all plates (and therefore neither is the position of 4N). Because a shift in log space is a multiplicative rescaling in the original space, shifting the histogram in log space makes both modes match across plates at once. That is why we use the log transform.
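A small worked example may help to see why a single shift aligns both modes. The peak positions below are made-up numbers chosen for illustration, not values measured on BBBC022, and `apply_log_shift` is a hypothetical helper.

```python
import math

# Illustrative 2N peak positions for two plates (assumed values)
peak_2n_plate_a = 2.0
peak_2n_plate_b = 3.0

# Shift, in log10 units, that moves the 2N peak of plate B onto plate A
shift = math.log10(peak_2n_plate_a) - math.log10(peak_2n_plate_b)


def apply_log_shift(intensity, shift):
    """Shift an intensity in log10 space and map it back to linear space."""
    return 10 ** (math.log10(intensity) + shift)


# The 4N peak sits at twice the 2N peak, so the same shift aligns it as well
aligned_2n = apply_log_shift(peak_2n_plate_b, shift)
aligned_4n = apply_log_shift(2 * peak_2n_plate_b, shift)
```

Up to floating-point rounding, `aligned_2n` equals the 2N position of plate A and `aligned_4n` equals twice that value, which is the 4N position of plate A.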

Hence, in order to normalize, we compute the histogram of the log intensities. The goal is to find the main peak for each plate and to align these peaks. To find the main peak of a plate, we first smooth the histogram to reduce the influence of noise, then take the position of the maximum.

Once we have computed the position of the maximum for each plate, we need to align them so that the main peak of each histogram is at the same position. To do so, we pick one of the positions as the reference and compute the shift needed to reach it from each of the others. We can then apply those shifts to the log intensities (one shift per plate).

Since the compound could itself influence the cell cycle, we compute the shifts from the DMSO control. Because cells on the same plate share the same experimental conditions, the shift computed to normalize DMSO on a plate can also be used to normalize the compound on that plate.
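The shift computation can be sketched in plain Python as follows. The peak bin indices and log intensities are illustrative assumptions, and `compute_shifts` and `apply_shifts` are hypothetical helpers. Taking the rightmost peak as the reference and using a bin width of `(2.5 - (-1.0)) / 200` mirror the choices made in the `normalization` function of this notebook.

```python
def compute_shifts(peak_bins, step):
    """Shift (in log10 units) that moves each plate's peak onto the rightmost peak."""
    target = max(peak_bins.values())
    return {plate: (target - pos) * step for plate, pos in peak_bins.items()}


def apply_shifts(log_values_per_plate, shifts):
    """Add the per-plate shift to every log intensity of that plate."""
    return {
        plate: [value + shifts[plate] for value in values]
        for plate, values in log_values_per_plate.items()
    }


step = (2.5 - (-1.0)) / 200  # histogram bin width in log10 units

# Assumed positions (bin indices) of the main DMSO peak on two plates
dmso_peak_bins = {"20589": 80, "20590": 90}
shifts = compute_shifts(dmso_peak_bins, step)

# Shifts derived from DMSO are applied to the compound measured on the same plates
compound_logs = {"20589": [0.35, 0.20], "20590": [0.49, 0.37]}
normalized_logs = apply_shifts(compound_logs, shifts)
```

In this example plate `20590` already holds the reference peak, so its values are unchanged, while plate `20589` is moved by ten bins.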


Finally, at the end of the function, we remove the plates that are not in the compound dictionary from the dictionary of the DMSO to keep only the plates of interest.



In [8]:
def normalization(dfs_cmpd, dfs_dmso, plot_histograms=False, return_shifts=False):
    dfs = dfs_dmso.copy()
    dfs2 = dfs_cmpd.copy()
    
    nbins = 200
    hrange = (-1.0, 2.5)
    step = (hrange[1]-hrange[0])/nbins

    n_mask = 7
    mask = (1/n_mask)*np.ones(n_mask)

    maxima = {}

    for plate in dfs:
        histogram = np.histogram(
            a=dfs[plate].Log_Integrated_Intensity,
            bins = nbins,
            range = hrange
        )
        
        sig = np.convolve(histogram[0], mask)
        maxima[plate] = np.argmax(sig)

    maxis = np.fromiter(maxima.values(), dtype=float)
    max_pos = np.max(maxis)

    shifts = {}
    for plate in maxima:
        shifts[plate] = (max_pos-maxima[plate])*step

    for plate in dfs2:
        vals = dfs2[plate].Log_Integrated_Intensity.values
        dfs2[plate] = dfs2[plate].assign(normalized_logInt = pd.Series(vals+shifts[plate]).values)
        dfs2[plate] = dfs2[plate].assign(Normalized_Intensity = pd.Series(np.power(10, dfs2[plate].normalized_logInt)).values)
        dfs_cmpd[plate] = dfs2[plate].assign(Normalized_Intensity = pd.Series(dfs2[plate].Normalized_Intensity).values)
        
        vals = dfs[plate].Log_Integrated_Intensity.values
        dfs[plate] = dfs[plate].assign(normalized_logInt = pd.Series(vals+shifts[plate]).values)
        dfs[plate] = dfs[plate].assign(Normalized_Intensity = pd.Series(np.power(10, dfs[plate].normalized_logInt)).values)
        dfs_dmso[plate] = dfs[plate].assign(Normalized_Intensity = pd.Series(dfs[plate].Normalized_Intensity).values)
        
            
    if plot_histograms:
        for plate in dfs_cmpd:
            plt.figure(figsize=(20,8))
            
#             print("Plate {} : {}".format(plate,len(dfs[plate])))
            plt.subplot(1, 2, 1)
            plt.title(plate + ' DMSO')
            p1 = plt.hist(dfs_dmso[plate].Integrated_Intensity, nbins, (0, 30), label='Before normalization')
            p2 = plt.hist(dfs_dmso[plate].Normalized_Intensity, nbins, (0, 30), label='After normalization')
            plt.xlabel('Intensities')
            plt.ylabel('number of cells')
            plt.legend(frameon = False)
            
            plt.subplot(1, 2, 2)
            plt.title(plate + ' COMPOUND')
            plt.hist(dfs_cmpd[plate].Integrated_Intensity, nbins, (0, 30), label='Before normalization')
            plt.hist(dfs_cmpd[plate].Normalized_Intensity, nbins, (0, 30), label='After normalization')
            plt.xlabel('Intensities')
            plt.ylabel('number of cells')
            plt.legend(frameon = False)
    
    dfs_dmso2 = dfs_dmso.copy()
    for plate in dfs_dmso2:
        if plate not in dfs_cmpd.keys():
            del dfs_dmso[plate]
            
    if return_shifts:
        return dfs_cmpd, dfs_dmso, shifts
    
    return dfs_cmpd, dfs_dmso

## Regrouping the plates

Now that all the plates are normalized, we need to merge them together in order to compute the Z' factor for all the images. That is the purpose of the function below.

In [9]:
def regroup_plates(df, dfs):
    df2 = df.copy()
    pls = df2.Plate.unique()
    pls2 = dfs.keys()
    
    plates = list(set(list(pls))-set(list(pls2)))
    
    for p in plates:
        df2 = df2[df2.Plate != p]

    intensities = []

    for plate in pls2:
        intensities = intensities + list(dfs[plate].Normalized_Intensity.values)
    
    
    df2 = df2.assign(Normalized_Intensity=pd.Series(np.array(intensities)).values)
    
    return df2

In [10]:
def regroup_plates2(dfs):
    # Stack the per-plate dataframes back into a single dataframe.
    # pd.concat is used because DataFrame.append was removed in pandas 2.0.
    return pd.concat([dfs[plate] for plate in dfs])

## Z factor calculation


We compute the Z factor to determine if we can see a significant difference between the compound experiment results and the DMSO experiment results. 


### Percentages computation

Using the function below, we compute for each image the fraction of cells whose intensity is above a given threshold. By choosing the threshold between the two modes of the DNA distribution, we obtain, for each image, the proportion of cells with an intensity close to 4N. Note that the returned values are fractions between 0 and 1, not values out of 100.
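The per-image computation can be illustrated on a handful of values. This sketch uses plain tuples instead of a dataframe; the intensities, the threshold and the helper name `fraction_above_threshold` are assumptions made for the example.

```python
from collections import defaultdict


def fraction_above_threshold(rows, threshold):
    """Fraction of nuclei above threshold, per image.

    rows is an iterable of (image_number, normalized_intensity) pairs.
    """
    per_image = defaultdict(list)
    for image_number, intensity in rows:
        per_image[image_number].append(intensity)
    return {
        image: sum(value > threshold for value in values) / len(values)
        for image, values in per_image.items()
    }


# Illustrative data: image 1 has four nuclei, image 2 has three
rows = [(1, 2.1), (1, 4.2), (1, 1.9), (1, 3.9),
        (2, 2.0), (2, 2.2), (2, 4.1)]
fractions = fraction_above_threshold(rows, threshold=3.0)
```

With this data, two of the four nuclei of image 1 and one of the three nuclei of image 2 lie above the threshold.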

In [11]:
def compute_percentages(df, threshold, display_hist=False, cmpd_name=''):
    images = df.ImageNumber.unique()
    percentages = []
    for im in images:
        vals = df[df.ImageNumber == im].Normalized_Intensity.values
        percentages.append(np.sum(vals>threshold)/vals.shape[0])
        
    if display_hist:
        vals = np.array(percentages)

        plt.figure(figsize=(10,8))
        # Use the cmpd_name argument for the title instead of a global variable
        plt.title(cmpd_name + ' percentages')
        # density=True replaces the normed argument, which recent Matplotlib releases no longer accept
        plt.hist(vals, 100, (0.0, 1.0), label='Values', density=True)

        plt.xlabel('Proportion of intensity values above threshold')
        plt.ylabel('Number of images')
        plt.legend(frameon = False)
        
    return percentages

### Threshold computation

To compute the threshold, we first find the positions of the 2N and 4N peaks in the DMSO distribution, and we take the midpoint between the two positions.
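The idea can be sketched on a toy histogram in plain Python. The counts are invented, the helper names are hypothetical, and the sketch assumes a bimodal histogram with a second peak to the right of the main one. It converts bin indices to intensities using bin centers and the lower bound of the histogram, which is a choice made for this sketch and not necessarily how `compute_threshold` maps indices.

```python
def moving_average(counts, width=3):
    """Centered moving average with zero padding at both ends."""
    half = width // 2
    padded = [0] * half + list(counts) + [0] * half
    return [sum(padded[i:i + width]) / width for i in range(len(counts))]


def two_main_peaks(signal):
    """Index of the highest local maximum and of the highest one to its right."""
    peaks = [i for i in range(1, len(signal) - 1)
             if signal[i - 1] < signal[i] > signal[i + 1]]
    first = max(peaks, key=lambda i: signal[i])
    right = [i for i in peaks if i > first]
    second = max(right, key=lambda i: signal[i])
    return first, second


def midpoint_threshold(counts, lower, step):
    """Midpoint, in intensity units, between the two main peaks."""
    smooth = moving_average(counts)
    p1, p2 = two_main_peaks(smooth)

    def center(i):
        # Bin i covers [lower + i*step, lower + (i+1)*step)
        return lower + (i + 0.5) * step

    return (center(p1) + center(p2)) / 2


# Toy bimodal histogram: a large peak in bin 3 and a smaller one in bin 8
counts = [0, 2, 9, 20, 9, 3, 2, 5, 9, 4, 1, 0]
threshold = midpoint_threshold(counts, lower=0.0, step=0.5)
```

Here the two peaks have their centers at 1.75 and 4.25, so the threshold falls at 3.0.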

In [12]:
def compute_threshold(df_DMSO, display=False):
    nbins = 200
    hrange = (-1.0, 30)
    step = (hrange[1]-hrange[0])/nbins

    n_mask = 3
    mask = (1/n_mask)*np.ones(n_mask)

    histogram2 = np.histogram(
        a=df_DMSO.Normalized_Intensity,
        bins = nbins,
        range = hrange
    )
    sig = np.convolve(histogram2[0], mask)
    maxima = np.argmax(sig)*step
    
    local_maxima = np.r_[True, sig[1:] > sig[:-1]] & np.r_[sig[:-1] > sig[1:], True]
    pos_max = np.squeeze(np.argwhere(local_maxima==True))
    val_max = sig[local_maxima==True]
    
    argmax1 = np.argmax(val_max)
    pmax1 = pos_max[argmax1]
    max1 = val_max[argmax1]
    
    pos_max = pos_max[argmax1+1:]
    val_max = val_max[argmax1+1:]
    
    argmax2 = np.argmax(val_max)
    pmax2 = pos_max[argmax2]
    
    if display:
        height = np.array([0, 1100])
        plt.figure()
        plt.plot(np.array(range(-1, nbins+1))*step, sig)
        plt.plot(np.array([pmax1, pmax1])*step, height, 'r')
        plt.plot(np.array([pmax2, pmax2])*step, height, 'r')
    
    thres = (pmax1+pmax2)/2*step
    if display:
        print("Threshold = {}".format(thres))
    
    return thres

### Z factor computation


The formula of the Z factor is :

$$Z' = 1-{3(\sigma _{p}+\sigma _{n}) \over |\mu _{p}-\mu _{n}|}$$

where $p$ and $n$ denote the positive and negative controls (here the compound and DMSO, respectively), and $\mu$ and $\sigma$ are the mean and the standard deviation of the per-image percentages.
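To get a feel for the scale of $Z'$, the formula can be evaluated on two small, invented sets of per-image fractions. The sketch uses the population standard deviation (`statistics.pstdev`), which matches the default behaviour of `np.std`; the helper name `z_prime` and all the numbers are assumptions for illustration.

```python
from statistics import mean, pstdev


def z_prime(pos, neg):
    """Z' = 1 - 3 * (sigma_p + sigma_n) / |mu_p - mu_n|."""
    return 1 - 3 * (pstdev(pos) + pstdev(neg)) / abs(mean(pos) - mean(neg))


# Well separated, tight distributions: Z' close to 1
z_separated = z_prime([0.80, 0.82, 0.81], [0.10, 0.11, 0.12])

# Means still differ, but the spread is large compared to the gap: Z' below 0
z_overlapping = z_prime([0.5, 0.6, 0.7], [0.1, 0.2, 0.3])
```

A value of $Z'$ approaches 1 only when the spread of both groups is small compared to the distance between their means; it becomes negative as soon as three standard deviations on each side exceed that distance.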

In [13]:
def compute_Z_factor(percentages_pos, percentages_neg):
    p = np.array(percentages_pos)
    n = np.array(percentages_neg)
    
    mean_p = np.mean(p)
    std_p  = np.std(p)
    mean_n = np.mean(n)
    std_n  = np.std(n)
    
    return 1 - 3*(std_p+std_n)/abs(mean_p-mean_n)  

## Results

In [14]:
for cmp_number in range(0, 12):
    if cmp_number == 3:
        continue
    
    data_path = './../UNET/'

    df_unet_cmpd, compound = construct_dataframe_unet2(data_path, cmp_number)
    df_unet_dmso, dmso = construct_dataframe_unet2(data_path, 3)


    df_plates_unet_cmpd, df_plates_unet_dmso = normalization(
        compute_log_intensities(split_per_plates(df_unet_cmpd)),
        compute_log_intensities(split_per_plates(df_unet_dmso)) , 
        plot_histograms=False)

    df_unet_cmpd = regroup_plates2(df_plates_unet_cmpd)
    df_unet_dmso = regroup_plates2(df_plates_unet_dmso)




    data_path_cp = './../CP/'

    df_cp_cmpd = construct_dataframe_cp(data_path_cp, cmp_number)
    df_cp_dmso = construct_dataframe_cp(data_path_cp, 3)

    df_plates_cp_cmpd, df_plates_cp_dmso = normalization(
        compute_log_intensities(split_per_plates(df_cp_cmpd)),
        compute_log_intensities(split_per_plates(df_cp_dmso)),
        plot_histograms=False)

    df_cp_cmpd = regroup_plates2(df_plates_cp_cmpd)
    df_cp_dmso = regroup_plates2(df_plates_cp_dmso)




    threshold_unet = compute_threshold(df_unet_dmso, display=False)
    threshold_cp = compute_threshold(df_cp_dmso, display=False)

    per_unet_cmpd = compute_percentages(df_unet_cmpd, threshold_unet)
    per_unet_dmso = compute_percentages(df_unet_dmso, threshold_unet)

    per_cp_cmpd = compute_percentages(df_cp_cmpd, threshold_cp)
    per_cp_dmso = compute_percentages(df_cp_dmso, threshold_cp)

    ZprimeUNET = compute_Z_factor(per_unet_cmpd, per_unet_dmso)
    ZprimeCP = compute_Z_factor(per_cp_cmpd, per_cp_dmso)


    print("Compound : {}".format(compound))
    print("Z' for UNET = {}".format(ZprimeUNET))
    print("Z' for CellProfiler = {}".format(ZprimeCP))
    print('')

Compound : Gabazine
Z' for UNET = -28.066965940664478
Z' for CellProfiler = -45.06801417449648

Compound : CYCLOPIAZONIC ACID
Z' for UNET = -29.996334244753626
Z' for CellProfiler = -55.24586616428868

Compound : ESTRADIOL
Z' for UNET = -3.2878556884273085
Z' for CellProfiler = -5.741775063390136

Compound : Proglumide
Z' for UNET = -9.799619561853978
Z' for CellProfiler = -17.22519559743383

Compound : PIZOTIFEN
Z' for UNET = -39.49409987195662
Z' for CellProfiler = -56.720938485163025

Compound : CATECHIN PENTABENZOATE
Z' for UNET = -59.539427374904186
Z' for CellProfiler = -18.167502634304967

Compound : TRIADIMEFON
Z' for UNET = -73.22188573510053
Z' for CellProfiler = -23.432665810567336

Compound : 3-HYDROXYCOUMARIN
Z' for UNET = -16.548270650652213
Z' for CellProfiler = -51.00284333300692

Compound : Etoposide
Z' for UNET = -1.850535951808781
Z' for CellProfiler = -1.7130831419201842

Compound : DACTINOMYCIN
Z' for UNET = -23.764863521752513
Z' for CellProfiler = -301.0464092118