# `namer()`

`namer()` imports qPCR data and labels the wells. Its output is the input to all other equipt functions.

`namer()` first calls an importer function that reads the Ct values from an .xlsx or .csv file. In the current distribution this is done by `lc480_importer()`, which is specific to the output of the Roche LightCycler 480. For other instruments and file formats, users can write their own importer and supply it to `namer()`. The only requirement is that it returns a pandas DataFrame with one column named 'Pos' that contains the well positions and another named 'Cp' that contains the Ct values. `namer()` does not use the 'Pos' column, but keeping it lets the user easily verify that the wells were named correctly.
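As a minimal sketch of what such an importer could look like, the function below reads a CSV from an imaginary instrument and returns the two required columns. The function name `my_importer` and the input column names 'Well' and 'Ct' are hypothetical, and it is an assumption that `namer()` calls the importer with the file path as its only argument.

```python
import io
import pandas as pd

def my_importer(ct_file):
    """Hypothetical importer for an instrument whose CSV export
    has 'Well' and 'Ct' columns (assumed names, not from equipt)."""
    raw = pd.read_csv(ct_file)
    return pd.DataFrame({
        'Pos': raw['Well'],
        # Non-numeric calls such as 'Undetermined' become NaN
        'Cp': pd.to_numeric(raw['Ct'], errors='coerce'),
    })

# Small in-memory file standing in for a real export
demo = io.StringIO("Well,Sample,Ct\nA1,s1,17.51\nA2,s1,Undetermined\n")
imported = my_importer(demo)
```

A function like this would be passed to `namer()` through its `importer` parameter.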

## Using `namer()` for the LightCycler 480

In [2]:
import equipt
import pandas as pd

This example uses data from an efficiency curve analysis of seven primer sets tested on four dilutions of a single cDNA sample. The first few lines of the .csv file exported by the LightCycler 480 look like this:

In [13]:
with open('data/22.11.22_PrimerCurve_Ct.csv','r') as f:
    print(f.read()[:486])

Experiment: DH_22.11.22_PrimerCurve  Selected Filter: SYBR Green I / HRM Dye (465-510)
Include	Color	Pos	Name	Cp	Concentration	Standard	Status
True	255	A1	Sample 1	17.51		0	
True	16711680	A2	Sample 2	17.54		0	? - Detector Call uncertain
True	255	A3	Sample 3	17.55		0	
True	255	A4	Sample 4	18.49		0	
True	255	A5	Sample 5	18.52		0	
True	255	A6	Sample 6	18.54		0	
True	255	A7	Sample 7	19.53		0	
True	255	A8	Sample 8	19.59		0	
True	255	A9	Sample 9	19.59		0	
True	255	A10	Sample 10	20.64		0	


The first line contains the experiment name and the filter set used, the second line contains the column names, and the remaining lines contain tab-separated values for each well. Before moving on to `namer()`, let's look at the output of the importer function:

In [17]:
equipt.lc480_importer('data/22.11.22_PrimerCurve_Ct.csv').iloc[:6]

|   | Pos | Cp |
|---|-----|-------|
| 0 | A1 | 17.51 |
| 1 | A2 | 17.54 |
| 2 | A3 | 17.55 |
| 3 | A4 | 18.49 |
| 4 | A5 | 18.52 |
| 5 | A6 | 18.54 |


The importer skips the header line and drops every column except 'Pos' and 'Cp'. The 'Cp' column name is required by subsequent analyses, whereas `namer()` itself relies only on the relative order of the rows to label the wells.
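The same result can be approximated directly with pandas. This is only a sketch of the parsing step described above, not the actual implementation of `lc480_importer()`; the in-memory text reproduces the first lines of the export shown earlier.

```python
import io
import pandas as pd

# First lines of a LightCycler 480 export: one header line,
# then tab-separated column names and values
export = io.StringIO(
    "Experiment: DH_22.11.22_PrimerCurve  Selected Filter: SYBR Green I / HRM Dye (465-510)\n"
    "Include\tColor\tPos\tName\tCp\tConcentration\tStandard\tStatus\n"
    "True\t255\tA1\tSample 1\t17.51\t\t0\t\n"
    "True\t16711680\tA2\tSample 2\t17.54\t\t0\t? - Detector Call uncertain\n"
)

# Skip the experiment header and keep only the two columns of interest
parsed = pd.read_csv(export, sep='\t', skiprows=1, usecols=['Pos', 'Cp'])
```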

We can now use `namer()` to automatically label the wells. `namer()` takes six named parameters plus optional keyword arguments. Their documentation is reproduced below:

ct_file : str
        Path to a CSV or Excel file containing the qPCR data. Currently only
        data output from a Lightcycler 480 is supported, but the structure of 
        namer() allows for other importers to be written without disrupting the
        rest of the function.
        
    Params
    ______
        
    primers : list of strings
        A list, in order, of the primers. See documentation for supported plate
        arrangements.
        
    samples : list of strings
        A list, in order, of the sample names. See documentation for supported
        plate arrangements.
        
    reps : int
        Number of replicate wells. 2, 3, or 4.
        
    config : str
        A description of how the samples are arranged: 'square' or 'line'. See
        documentation for additional details. Default 'line'
        
    importer : a custom importer function or None
        A user-supplied function that imports data from their qPCR instrument 
        to a Pandas Dataframe with columns 'Pos', for the well position, and
        'Cp' for the Ct values. If None, namer() defaults to an importer for
        data from the Roche Lightcycler 480. Default None
        
    **kwargs : dictionary
        
        with_dil : list of strings
            List of names of samples that have dilution curves.
            
        dil_series : list of ints
            List of dilution factors in order on plate. Dilutions
            should be entered as integers (e.g. a 1:10 dilution 
            should be entered as 10).
            
        dil_rest : int or None
            The dilution of samples that do not have a dilution 
            series. If None, with_dil should contain all samples.

The `**kwargs` parameters are only needed if one or more samples have a dilution series; otherwise they can be omitted. For this experiment, the following parameter values were used:

In [18]:
primers = ['Fus (112734868c1)',
         'Fus (15029724a1)',
         'Ewsr1 (6679715a1)',
         'Ewsr1 (88853580c2)',
         'Taf15 (141803447c1)',
         'Taf15 (141803447c2)',
         'Tsix exon4']

samples = ['mESC total cDNA']

reps = 3

config = 'line'

kwargs = {'with_dil':samples,
         'dil_series':[20,40,80,160],
         'dil_rest':None}

Supplying these to `namer()` gives the following output:

In [21]:
df = equipt.namer('data/22.11.22_PrimerCurve_Ct.csv',
            primers,
            samples,
            reps,
            config,
            **kwargs)

df.iloc[:6]

['mESC total cDNA_20', 'mESC total cDNA_40', 'mESC total cDNA_80', 'mESC total cDNA_160']


|   | Pos | Cp | Primer | Name | NamePrim |
|---|-----|-------|--------|------|----------|
| 0 | A1 | 17.51 | Fus (112734868c1) | mESC total cDNA_20 | mESC total cDNA_20Fus (112734868c1) |
| 1 | A2 | 17.54 | Fus (112734868c1) | mESC total cDNA_20 | mESC total cDNA_20Fus (112734868c1) |
| 2 | A3 | 17.55 | Fus (112734868c1) | mESC total cDNA_20 | mESC total cDNA_20Fus (112734868c1) |
| 3 | A4 | 18.49 | Fus (112734868c1) | mESC total cDNA_40 | mESC total cDNA_40Fus (112734868c1) |
| 4 | A5 | 18.52 | Fus (112734868c1) | mESC total cDNA_40 | mESC total cDNA_40Fus (112734868c1) |
| 5 | A6 | 18.54 | Fus (112734868c1) | mESC total cDNA_40 | mESC total cDNA_40Fus (112734868c1) |


`namer()` has correctly labeled the primer, assigned sample names with the dilution factor appended after an underscore, and created a 'NamePrim' column (the sample name joined to the primer name) that makes replicate wells easy to identify. This output can be supplied to any of the other tools in equipt.
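To make the labeling logic concrete, here is a rough sketch that reproduces the six rows above. It is not equipt's code: `label_wells` is a hypothetical helper, and the assumed ordering (each primer in turn, each sample dilution in turn, `reps` consecutive wells) is only inferred from the output shown. The last line illustrates how 'NamePrim' can be used to group replicates.

```python
import pandas as pd

def label_wells(df, primers, names, reps):
    """Hypothetical helper: label wells by row order, ignoring 'Pos'."""
    # One (primer, sample) label per well, replicates adjacent
    labels = [(p, n) for p in primers for n in names for _ in range(reps)]
    if len(df) > len(labels):
        raise ValueError("more wells than primer/sample/replicate labels")
    labels = labels[:len(df)]
    out = df.copy()
    out['Primer'] = [p for p, _ in labels]
    out['Name'] = [n for _, n in labels]
    out['NamePrim'] = out['Name'] + out['Primer']
    return out

# The six wells shown in the output above
wells = pd.DataFrame({'Pos': ['A1', 'A2', 'A3', 'A4', 'A5', 'A6'],
                      'Cp': [17.51, 17.54, 17.55, 18.49, 18.52, 18.54]})

# Sample names with the dilution factor appended after an underscore
names = [f"{s}_{d}" for s in ['mESC total cDNA'] for d in [20, 40, 80, 160]]

labeled = label_wells(wells, ['Fus (112734868c1)'], names, reps=3)

# Replicate wells share a 'NamePrim' value, so they can be averaged together
rep_means = labeled.groupby('NamePrim', sort=False)['Cp'].mean()
```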

## Using a custom importer function with `namer()`