# miRNA-Seq

## Validation by Real-Time PCR

In this experiment, I tested the amplification of vvi-mir156(d) in the Female E/F sample, using several input amounts in both the cDNA synthesis and the qRT-PCR.

### Loading required modules

In [None]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

### Preparing data

Reactions in the plate were set up as follows:

* Columns 01, 02, 03 and 04: 5.0 µL of cDNA was loaded into the Real-Time reaction (l5)
* Columns 05, 06, 07 and 08: 1.0 µL of cDNA was loaded into the Real-Time reaction (l1)
* Columns 09, 10, 11 and 12: 0.5 µL of cDNA was loaded into the Real-Time reaction (l0)


* Row A: cDNA was synthesized with Female E/F 1st elution, diluted to 1/5 (rna5)
* Row B: cDNA was synthesized with Female E/F 1st elution, diluted to 1/50 (rna50)
* Row C: cDNA was synthesized with Female E/F 1st elution, diluted to 1/100 (rna100)
* Row D: cDNA was synthesized with Female E/F 1st elution, diluted to 1/500 (rna500)
* Row E: cDNA was synthesized with Female E/F 1st elution, diluted to 1/1000 (rna1000)


* Columns 01/02, 05/06 and 09/10: cDNA without amplification by TaqMan (raw)
* Columns 03/04, 07/08 and 11/12: cDNA after amplification/enrichment by TaqMan (amp)

* Duplicates are on the same row, in adjacent columns (e.g. A01 and A02 are replicates of the same reaction) (r1 / r2)



As **all samples are Female E/F 1st elution**, that information is not included in the reaction names. The same applies to the gene: **all reactions use vvi-miR156** with primers vvi-mir156F + oligo(dT).

In [None]:
columns = {
    'Well/Cycles': 'cycle',
    
    'A01': 'l5_rna5_raw_r1', 'A02': 'l5_rna5_raw_r2',
    'A03': 'l5_rna5_amp_r1', 'A04': 'l5_rna5_amp_r2',
    'A05': 'l1_rna5_raw_r1', 'A06': 'l1_rna5_raw_r2',
    'A07': 'l1_rna5_amp_r1', 'A08': 'l1_rna5_amp_r2',
    'A09': 'l0_rna5_raw_r1', 'A10': 'l0_rna5_raw_r2',
    'A11': 'l0_rna5_amp_r1', 'A12': 'l0_rna5_amp_r2',
    
    'B01': 'l5_rna50_raw_r1', 'B02': 'l5_rna50_raw_r2',
    'B03': 'l5_rna50_amp_r1', 'B04': 'l5_rna50_amp_r2',
    'B05': 'l1_rna50_raw_r1', 'B06': 'l1_rna50_raw_r2',
    'B07': 'l1_rna50_amp_r1', 'B08': 'l1_rna50_amp_r2',
    'B09': 'l0_rna50_raw_r1', 'B10': 'l0_rna50_raw_r2',
    'B11': 'l0_rna50_amp_r1', 'B12': 'l0_rna50_amp_r2',
    
    'C01': 'l5_rna100_raw_r1', 'C02': 'l5_rna100_raw_r2',
    'C03': 'l5_rna100_amp_r1', 'C04': 'l5_rna100_amp_r2',
    'C05': 'l1_rna100_raw_r1', 'C06': 'l1_rna100_raw_r2',
    'C07': 'l1_rna100_amp_r1', 'C08': 'l1_rna100_amp_r2',
    'C09': 'l0_rna100_raw_r1', 'C10': 'l0_rna100_raw_r2',
    'C11': 'l0_rna100_amp_r1', 'C12': 'l0_rna100_amp_r2',
    
    'D01': 'l5_rna500_raw_r1', 'D02': 'l5_rna500_raw_r2',
    'D03': 'l5_rna500_amp_r1', 'D04': 'l5_rna500_amp_r2',
    'D05': 'l1_rna500_raw_r1', 'D06': 'l1_rna500_raw_r2',
    'D07': 'l1_rna500_amp_r1', 'D08': 'l1_rna500_amp_r2',
    'D09': 'l0_rna500_raw_r1', 'D10': 'l0_rna500_raw_r2',
    'D11': 'l0_rna500_amp_r1', 'D12': 'l0_rna500_amp_r2',
    
    'E01': 'l5_rna1000_raw_r1', 'E02': 'l5_rna1000_raw_r2',
    'E03': 'l5_rna1000_amp_r1', 'E04': 'l5_rna1000_amp_r2',
    'E05': 'l1_rna1000_raw_r1', 'E06': 'l1_rna1000_raw_r2',
    'E07': 'l1_rna1000_amp_r1', 'E08': 'l1_rna1000_amp_r2',
    'E09': 'l0_rna1000_raw_r1', 'E10': 'l0_rna1000_raw_r2',
    'E11': 'l0_rna1000_amp_r1', 'E12': 'l0_rna1000_amp_r2'
}
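As a cross-check on the hand-typed dictionary above, the same names can be derived from the plate layout rules. This is only a sketch: the helper `build_well_names` and the variable `generated_columns` are hypothetical names introduced here, and the rules encoded are the ones listed in the plate description (blocks of four columns per loading volume, raw/amp pairs within each block, odd column = r1).

```python
def build_well_names():
    """Rebuild the well -> reaction-name mapping from the plate layout rules."""
    names = {'Well/Cycles': 'cycle'}
    # Plate rows A-E hold the RNA dilutions used in the cDNA synthesis
    rows = {'A': 'rna5', 'B': 'rna50', 'C': 'rna100',
            'D': 'rna500', 'E': 'rna1000'}
    # Each block of four columns shares one cDNA loading volume
    loads = ['l5', 'l1', 'l0']
    for row, rna in rows.items():
        for col in range(1, 13):
            load = loads[(col - 1) // 4]
            # First two columns of a block are raw, last two are amplified
            treatment = 'raw' if (col - 1) % 4 < 2 else 'amp'
            # Odd columns hold replicate 1, even columns replicate 2
            replicate = 'r1' if col % 2 == 1 else 'r2'
            names[f'{row}{col:02d}'] = f'{load}_{rna}_{treatment}_{replicate}'
    return names


generated_columns = build_well_names()
```

Comparing `generated_columns == columns` in the notebook would flag any typo in the manual mapping.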

### Opening file

In [None]:
rfu = pd.read_csv('amplification_data_rfu.tsv',
                  sep='\t',
                  header=0,
                  index_col=0)

rfu = rfu.rename(columns = columns)
rfu.head()

### Plotting amplification curves by group

Each reaction was run in duplicate, so we will now draw one plot per reaction, with both replicates in the same plot.

In [None]:
# Strip the replicate suffix (_r1 / _r2) from every reaction name;
# 'cycle' labels the index, not a reaction, so it is skipped
columns_list = [name.rsplit('_', 1)[0]
                for name in columns.values() if name != 'cycle']

# Remove the duplicates while keeping the plate order
columns_list = list(dict.fromkeys(columns_list))
columns_list

for sample in columns_list:
    # One figure per reaction, titled with the reaction name
    rfu[[sample + '_r1', sample + '_r2']].plot(title=sample)

### Considerations

Of the amplifications detected, these are the most relevant:

* l1_rna5_amp
* l1_rna50_amp
* l0_rna5_amp
* l0_rna50_amp

To make them easier to compare, I will plot these again:

In [None]:
most_significant = [
    'l1_rna5_amp',
    'l1_rna50_amp',
    'l0_rna5_amp',
    'l0_rna50_amp'
]

for sample in most_significant:
    # One figure per reaction, titled with the reaction name
    rfu[[sample + '_r1', sample + '_r2']].plot(title=sample)

At both loading volumes (l1 and l0), the rna5 reactions (where the RNA used in the cDNA synthesis was diluted to 1/5) show better results than the rna50 reactions (dilution to 1/50).

In fact, the l1 reactions (where 1 µL of cDNA was loaded into the Real-Time reaction) have a lower Ct than the l0 reactions (where 0.5 µL of cDNA was loaded), which is consistent with the larger amount of template.

We may conclude that the amplification/enrichment step of the TaqMan kit makes a relevant difference, as the raw reactions amplify too poorly to be considered.
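The Ct comparison above is read from the plots. As a rough way to put a number on it, a Ct can be estimated as the fractional cycle at which a curve first crosses a fluorescence threshold. The sketch below is not the instrument software's method: `estimate_ct` is a hypothetical helper, the threshold has to be chosen by the analyst, and the toy values are invented purely to show the interpolation.

```python
def estimate_ct(cycles, rfu_values, threshold):
    """Return the fractional cycle where the curve first rises through
    `threshold` (linear interpolation), or None if it never does."""
    points = list(zip(cycles, rfu_values))
    for (c0, y0), (c1, y1) in zip(points, points[1:]):
        # The crossing lies between two consecutive readings
        if y0 < threshold <= y1:
            return c0 + (threshold - y0) * (c1 - c0) / (y1 - y0)
    return None


# Toy curve (invented values, not experimental data)
toy_cycles = [1, 2, 3, 4, 5]
toy_rfu = [10.0, 20.0, 40.0, 80.0, 160.0]
toy_ct = estimate_ct(toy_cycles, toy_rfu, threshold=60.0)
```

With the notebook's data this could be called as `estimate_ct(rfu.index, rfu['l1_rna5_amp_r1'], threshold)`, assuming the index of `rfu` holds the cycle numbers.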

### Observing melting curves

In [None]:
melting = pd.read_csv('melting_data_rfu.tsv',
                      sep='\t',
                      header=0,
                      index_col=0)

melting = melting.rename(columns = columns)
melting.head()

#### Calculating the melting derivative

The melting derivative will be calculated according to the formula:
-ΔRFU / Δtemperature

In this case, the temperature difference between consecutive readings is always 0.5, as that was the step set in the experiment.

We will compute the derivatives only for the selected samples (this requires the `most_significant` list defined above).

In [None]:
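def derivate_melting(values, step=0.5):
    """Return -delta(RFU) / delta(temperature) for consecutive readings.

    `step` is the temperature difference between two readings.
    """
    # Difference between each reading and the previous one
    delta_value = [current - previous
                   for previous, current in zip(values, values[1:])]

    # Negate and divide by the temperature step
    return [-delta / step for delta in delta_value]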

In [None]:
for sample in most_significant:
    # Plot both replicates of the same reaction on one figure
    for replicate in ('_r1', '_r2'):
        derivate = derivate_melting(melting[sample + replicate].values.tolist())
        plt.plot(derivate, label=sample + replicate)

    plt.title(sample)
    plt.legend()
    plt.show()
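The highest peak of each curve plotted above marks the temperature at which fluorescence drops fastest, which is usually read as the melting temperature of the main product. A minimal sketch to locate it is shown below. `melting_peak` is a hypothetical helper, the temperature of each slope is taken as the midpoint between the two readings, and the toy values are invented for illustration.

```python
def melting_peak(temperatures, rfu_values):
    """Return (temperature, -dRFU/dT) at the highest point of the
    negative derivative of a melting curve."""
    readings = list(zip(temperatures, rfu_values))
    best = None
    for (t0, y0), (t1, y1) in zip(readings, readings[1:]):
        slope = -(y1 - y0) / (t1 - t0)
        # Keep the steepest drop seen so far, placed at the interval midpoint
        if best is None or slope > best[1]:
            best = ((t0 + t1) / 2, slope)
    return best


# Toy melting curve (invented values, not experimental data)
toy_temperatures = [70.0, 70.5, 71.0, 71.5, 72.0]
toy_melting = [100.0, 98.0, 90.0, 70.0, 68.0]
peak_temperature, peak_height = melting_peak(toy_temperatures, toy_melting)
```

On the notebook's data the call would look like `melting_peak(melting.index, melting['l0_rna5_amp_r1'])`, assuming the index of `melting` holds the temperatures.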

### Considerations on the melting curve

Again, the results using the 1/5 dilution in the cDNA synthesis are better than those using the 1/500 dilution. When 1 µL of cDNA is loaded, the reaction is less specific. The second peak in l0_rna5_amp is too small to cause any problems.

## Main conclusion

The best conditions are:

* Use the first elution of RNA, diluted 1/5, for the cDNA synthesis.
* Do the cDNA synthesis as recommended by the manufacturer, including the final amplification/enrichment.
* Load 0.5 µL of cDNA into the Real-Time reaction.