# Metadata file

## Why?

We want to create a metadata file that contains the critical information needed to label our samples in an image-based profiling experiment.

## How?

We need to provide:

1) What's the plate type (6-, 12-, 24- or 96-well plate)?
2) What's the plate layout?

    - Provide an xlsx (Excel) file;
    - The treatments should be placed in the file exactly where they are on the plate;
    - Provide at least these columns:

        - 'plate_map_name': a name for this layout, which should change whenever the layout changes;
        - 'well_position': generated by our code; you only need to supply the plate type (6, 12, 24 or 96 wells);
        - 'cell_type': the name of the cell line;
        - 'compound': the name of the treatment applied to that well;
        - 'control_type': `negcon` for negative control, `trt` for treated sample, and `poscon` for positive control;
    
    
    Example of what a cell in the Excel file should contain. SEPARATE THE FIELDS WITH SPACES:

            - platemap_rep1 Huh7 AgNP trt
            - platemap_rep1 Huh7 Non-treated negcon

3) What are the headers of your metadata file?

    - The default is:

    `metadata_cols = ["well_position", "plate_map_name", "cell_type", "compound", "control_type"]`

    - To include any additional columns, append the extra values to the end of each cell and add the new column names to `metadata_cols`.
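
The cell format described above can be sketched as follows. This is only an illustration: `parse_cell` and `CELL_FIELDS` are hypothetical names that are not part of `metadata_utils`, and the extra `replicate` column is made up for the example. Because the fields are split on whitespace, a single value must not contain a space (hence `Non-treated` rather than `Non treated`).

```python
# Hypothetical helper for illustration only; not part of metadata_utils.
CELL_FIELDS = ["plate_map_name", "cell_type", "compound", "control_type"]


def parse_cell(cell_text, fields=CELL_FIELDS):
    """Split one Excel cell on whitespace and pair each token with a column name."""
    tokens = cell_text.split()
    if len(tokens) != len(fields):
        raise ValueError(
            f"expected {len(fields)} fields, got {len(tokens)}: {cell_text!r}"
        )
    return dict(zip(fields, tokens))


# A cell with the minimal columns
record = parse_cell("platemap_rep1 Huh7 Non-treated negcon")
print(record)

# A cell with one extra value at the end, plus the matching extra column name
extended = parse_cell(
    "platemap_rep1 Huh7 AgNP trt rep1", CELL_FIELDS + ["replicate"]
)
print(extended)
```

A cell whose number of space-separated values does not match the number of column names raises an error here, which makes a mistyped cell easy to spot.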

In [1]:
path_to_scripts = r"C:\Users\Fer\Documents\GitHub"

In [2]:
import easygui as eg
import openpyxl
import csv
import sys
import os

sys.path.append(path_to_scripts)

from scripts_notebooks_fossa.metadata import metadata_utils

%load_ext autoreload
%autoreload 2

## 1. What's the plate type?

In [3]:
well_plate = 96

In [4]:
wells = metadata_utils.plate_wells(plate_type=well_plate)

Number of wells: 96
['A1', 'A2', 'A3', 'A4', 'A5', 'A6', 'A7', 'A8', 'A9', 'A10', 'A11', 'A12', 'B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B8', 'B9', 'B10', 'B11', 'B12', 'C1', 'C2', 'C3', 'C4', 'C5', 'C6', 'C7', 'C8', 'C9', 'C10', 'C11', 'C12', 'D1', 'D2', 'D3', 'D4', 'D5', 'D6', 'D7', 'D8', 'D9', 'D10', 'D11', 'D12', 'E1', 'E2', 'E3', 'E4', 'E5', 'E6', 'E7', 'E8', 'E9', 'E10', 'E11', 'E12', 'F1', 'F2', 'F3', 'F4', 'F5', 'F6', 'F7', 'F8', 'F9', 'F10', 'F11', 'F12', 'G1', 'G2', 'G3', 'G4', 'G5', 'G6', 'G7', 'G8', 'G9', 'G10', 'G11', 'G12', 'H1', 'H2', 'H3', 'H4', 'H5', 'H6', 'H7', 'H8', 'H9', 'H10', 'H11', 'H12']
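
For reference, a well list like the one printed above could be built as sketched below. This is not the actual `metadata_utils.plate_wells` implementation; `well_names` and `PLATE_SHAPES` are hypothetical names, and the row and column counts assume the standard plate layouts.

```python
import string

# Assumed standard layouts: plate type -> (number of rows, number of columns)
PLATE_SHAPES = {6: (2, 3), 12: (3, 4), 24: (4, 6), 96: (8, 12)}


def well_names(plate_type):
    """Return well names in row-major order (A1, A2, ..., then B1, ...)."""
    rows, cols = PLATE_SHAPES[plate_type]
    return [
        f"{string.ascii_uppercase[r]}{c}"
        for r in range(rows)
        for c in range(1, cols + 1)
    ]


wells_sketch = well_names(96)
print("Number of wells:", len(wells_sketch))
print(wells_sketch[:3], wells_sketch[-1])
```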


## 2. What's the plate layout?

In [5]:
pathname = eg.fileopenbox("Choose the Excel file")
file_name, extension = os.path.splitext(os.path.basename(pathname))
print(pathname)
print(file_name)

C:\Users\Fer\Documents\GitHub\2022_09_09_LiveCellPainting_fossa_Cimini\metadata\layout\2023_new_batch_DILI\layout_day1.xlsx
layout_day1


### Access the layout file

In [6]:
reading = openpyxl.load_workbook(pathname)
layout = reading.active

In [7]:
samples = metadata_utils.get_samples(layout, plate_type=well_plate)

['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'Huh7 Amiodarone 1 trt', 'Huh7 Amiodarone 1 trt', 'Huh7 Amiodarone 1 trt', 'Huh7 Amiodarone 1 trt', 'Huh7 Amiodarone 10 trt', 'Huh7 Amiodarone 10 trt', 'Huh7 Amiodarone 10 trt', 'Huh7 Amiodarone 10 trt', 'Huh7 Cyclophosphamide 1 trt', 'Huh7 Cyclophosphamide 10 trt', 'None', 'None', 'Huh7 Lovastatin 1 trt', 'Huh7 Lovastatin 1 trt', 'Huh7 Lovastatin 1 trt', 'Huh7 Lovastatin 1 trt', 'Huh7 Lovastatin 10 trt', 'Huh7 Lovastatin 10 trt', 'Huh7 Lovastatin 10 trt', 'Huh7 Lovastatin 10 trt', 'Huh7 Cyclophosphamide 1 trt', 'Huh7 Cyclophosphamide 10 trt', 'None', 'None', 'Huh7 Non-treated 0 negcon', 'Huh7 Non-treated 0 negcon', 'Huh7 Non-treated 0 negcon', 'Huh7 Non-treated 0 negcon', 'Huh7 Non-treated 0 negcon', 'Huh7 Non-treated 0 negcon', 'Huh7 Non-treated 0 negcon', 'Huh7 Non-treated 0 negcon', 'Huh7 Cyclophosphamide 1 trt', 'Huh7 Cyclophosphamide 10 trt', 'None', 'None', 'Huh7 Etoposide 1 

## 3. What are the headers of your metadata file?

In [29]:
# metadata_cols = ["well_position", "cell_type", "compound", "concentration_uM", "control_type"]
metadata_cols = metadata_utils.get_example_to_name_metadata_cols(samples)

['Huh7', 'Amiodarone', '1', 'trt']


In [30]:
sublist = metadata_utils.generate_rows_lists(samples, wells)
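
The internals of `generate_rows_lists` are not shown in this notebook. A plausible sketch, assuming it simply pairs each well with its whitespace-split sample string, is given below; `rows_from_samples` is a hypothetical name.

```python
# Hypothetical stand-in for illustration; the real generate_rows_lists may differ.
def rows_from_samples(samples, wells):
    """Prefix each sample's whitespace-split fields with its well position."""
    return [[well] + sample.split() for well, sample in zip(wells, samples)]


demo_wells = ["A1", "B2", "D2"]
demo_samples = ["None", "Huh7 Amiodarone 1 trt", "Huh7 Non-treated 0 negcon"]

demo_rows = rows_from_samples(demo_samples, demo_wells)
for row in demo_rows:
    print(row)
```

Under this assumption an empty well (`'None'`) yields a two-element row, shorter than the header, so a length check at writing time would leave it out of the CSV file.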

## Finally, generate the metadata CSV file

- Also generate a txt file that can be used later with pycytominer.

In [33]:
metadata_cols.insert(0, "plate_map_name")
# prepend the plate map name (the layout file name) to every row
sublist_add_platemap = [[file_name] + row for row in sublist]

In [59]:
os.makedirs("platemap", exist_ok=True)

# build the paths with os.path.join so they work on any operating system
csv_path = os.path.join("platemap", f"{file_name}.csv")
txt_path = os.path.join("platemap", f"{file_name}.txt")

with open(csv_path, 'w', newline='') as csvfile:  # create the metadata CSV file
    writer = csv.writer(csvfile, delimiter=',')  # comma delimited
    # write the header row
    writer.writerow(metadata_cols)
    # write only the rows that have one value per header column
    for sub in sublist_add_platemap:
        if len(sub) == len(metadata_cols):
            writer.writerow(sub)

# also generate a txt file with the same content
with open(csv_path, 'r') as f_in, open(txt_path, 'w') as f_out:
    # read the CSV content and write it unchanged into the TXT file
    f_out.write(f_in.read())
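
The writing loop above leaves out any row whose length differs from the header, without reporting it. A small sketch of how the skipped rows could be listed before writing is shown below; `split_by_length` and the demo values are hypothetical and only illustrate the check.

```python
# Hypothetical helper for illustration only; not part of metadata_utils.
def split_by_length(rows, header):
    """Separate rows that match the header length from those that do not."""
    kept = [row for row in rows if len(row) == len(header)]
    skipped = [row for row in rows if len(row) != len(header)]
    return kept, skipped


demo_header = ["plate_map_name", "well_position", "cell_type",
               "compound", "concentration_uM", "control_type"]
demo_rows = [
    ["layout_day1", "A1", "None"],                            # empty well
    ["layout_day1", "B2", "Huh7", "Amiodarone", "1", "trt"],  # complete row
    ["layout_day1", "B3", "Huh7", "Amiodarone", "trt"],       # missing a value
]

kept, skipped = split_by_length(demo_rows, demo_header)
print(f"{len(kept)} row(s) would be written, {len(skipped)} skipped:")
for row in skipped:
    print("  skipped:", row)
```

Empty wells are expected in the skipped list; any other entry usually points to a cell in the layout file with a missing or extra value.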