### This script builds your redo list from a Geneious export (specifically, the tabular plate view of your cycle sequence plate) and creates a new FIMS sheet for your reference.

Make sure you export Geneious data from either the forward or the reverse plate, not both; otherwise each well will appear twice. Note: Mark all of your good sequences as passed in Geneious before exporting, because every well that is not marked "passed" ends up on the redo list.

In [1]:
import pandas as pd

Enter the name of your Geneious export file inside the (' ') below:

In [2]:
Geneious_df = pd.read_csv('CapeVerdeP06SeqGeneiousExport.csv')
Geneious_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 192 entries, 0 to 191
Data columns (total 20 columns):
Plate                 192 non-null object
Well                  192 non-null object
GELImage              0 non-null float64
Extraction Barcode    192 non-null int64
Extraction ID         192 non-null object
Workflow ID           192 non-null object
Locus                 192 non-null object
Date                  192 non-null object
Reaction state        192 non-null object
Primer                192 non-null object
Direction             192 non-null object
Reaction Cocktail     192 non-null object
Cleanup performed     192 non-null object
Cleanup method        192 non-null object
Technician            192 non-null object
notes                 192 non-null object
Extraction BCID       0 non-null float64
# Traces              192 non-null int64
# Passed Sequences    192 non-null int64
# Sequences           192 non-null int64
dtypes: float64(2), int64(4), object(14)
memory usage: 30.1+ 
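The advice above about exporting only one direction can be checked once the file is loaded. The following is a minimal sketch on a toy frame, not part of the original script; `export_df` is a hypothetical stand-in for `Geneious_df`, and it assumes the `Direction` and `Well` columns shown in the output above.

```python
import pandas as pd

# Toy stand-in for the Geneious export (hypothetical data; the real frame has more columns).
export_df = pd.DataFrame({
    'Well': ['A1', 'A1', 'A2'],
    'Direction': ['Forward', 'Reverse', 'Forward'],
})

# Distinct read directions present in the export.
directions = export_df['Direction'].unique().tolist()

# Wells that occur more than once, which is what a mixed export produces.
duplicated_wells = (
    export_df.loc[export_df['Well'].duplicated(keep=False), 'Well'].unique().tolist()
)

if len(directions) > 1:
    print(f'Export mixes directions {directions}; repeated wells: {duplicated_wells}')
```

If the message prints for your own export, re-export from a single plate before continuing.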

Enter the name of your FIMS file inside the (' ') below:

In [3]:
FIMS_df = pd.read_excel('FY18CapeVerde_P06.xlsx', sheet_name='Samples')
FIMS_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 96 entries, 0 to 95
Data columns (total 22 columns):
materialSampleID             96 non-null object
institutionCode              96 non-null object
kingdom                      96 non-null object
phylum                       96 non-null object
scientificName               96 non-null object
yearCollected                96 non-null int64
locality                     96 non-null object
country                      96 non-null object
tissuePlate                  96 non-null object
tissueWell                   96 non-null object
tissueType                   96 non-null object
collectionCode               96 non-null object
taxonRemarks                 4 non-null object
catalogNumber                96 non-null int64
voucherCatalogNumber         96 non-null object
identifiedBy                 96 non-null object
genbankSpecimenVoucher       96 non-null object
samplingProtocol             96 non-null object
dayCollected                 96 non-

Geneious does not zero-pad well numbers (it writes A1 rather than A01), while the FIMS uses two-digit well numbers. The following portion of the script normalizes the Geneious values so that they match the FIMS.

In [4]:
# Map unpadded wells (A1..H9) to their zero-padded form (A01..H09).
wellCorrection = {f'{row}{col}': f'{row}0{col}' for row in 'ABCDEFGH' for col in range(1, 10)}

In [9]:
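# Assign the result back; calling replace(..., inplace=True) on a selected column
# may not update the DataFrame in recent pandas versions.
Geneious_df["Well"] = Geneious_df["Well"].replace(wellCorrection)
Geneious_df.head()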

Unnamed: 0,Plate,Well,GELImage,Extraction Barcode,Extraction ID,Workflow ID,Locus,Date,Reaction state,Primer,Direction,Reaction Cocktail,Cleanup performed,Cleanup method,Technician,notes,Extraction BCID,# Traces,# Passed Sequences,# Sequences
0,CapeVerde_P06_Seq01_dgjg_F,A01,,302217688,USNM:IZ:1524467.1.1,COI_workflow205675,COI,Mon May 20 00:00:00 EDT 2019,passed,jgLCO1490,Forward,standard,Yes,Sephadex,Allison,Cleanup performed May 21st. \n6 wells cherry p...,,1,1,1
1,CapeVerde_P06_Seq01_dgjg_F,A02,,302217696,USNM:IZ:1524475.1.2,COI_workflow205676,COI,Mon May 20 00:00:00 EDT 2019,failed,jgLCO1490,Forward,standard,Yes,Sephadex,Allison,Cleanup performed May 21st. \n6 wells cherry p...,,1,0,1
2,CapeVerde_P06_Seq01_dgjg_F,A03,,302217704,USNM:IZ:1524484.1.2,COI_workflow205677,COI,Mon May 20 00:00:00 EDT 2019,failed,jgLCO1490,Forward,standard,Yes,Sephadex,Allison,Cleanup performed May 21st. \n6 wells cherry p...,,1,0,1
3,CapeVerde_P06_Seq01_dgjg_F,A04,,302217712,USNM:IZ:1524493.1.2,COI_workflow205678,COI,Mon May 20 00:00:00 EDT 2019,failed,jgLCO1490,Forward,standard,Yes,Sephadex,Allison,Cleanup performed May 21st. \n6 wells cherry p...,,1,0,1
4,CapeVerde_P06_Seq01_dgjg_F,A05,,302217720,USNM:IZ:1524507.1.1,COI_workflow205679,COI,Mon May 20 00:00:00 EDT 2019,failed,jgLCO1490,Forward,standard,Yes,Sephadex,Allison,Cleanup performed May 21st. \n6 wells cherry p...,,1,0,1
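As an alternative to a lookup table, the same normalization can be sketched with a regular expression. This is an illustrative variant rather than the script's own method; `pad_wells` is a hypothetical helper name, and it assumes wells consist of a row letter A-H followed by a column number.

```python
import pandas as pd

def pad_wells(wells: pd.Series) -> pd.Series:
    """Zero-pad single-digit well columns (A1 -> A01); leave A10..H12 untouched."""
    # ^([A-H])(\d)$ matches only a row letter followed by exactly one digit.
    return wells.str.replace(r'^([A-H])(\d)$', r'\g<1>0\g<2>', regex=True)

wells = pd.Series(['A1', 'B9', 'C10', 'H12'])
padded = pad_wells(wells)
print(padded.tolist())  # ['A01', 'B09', 'C10', 'H12']
```

The anchored pattern keeps already padded or two-digit wells unchanged, so the function is safe to run more than once.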


This next part will pull out all the wells from the Geneious export that are not marked as "passed":

In [10]:
redos_df = Geneious_df[Geneious_df['Reaction state']!='passed']
redos_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 122 entries, 1 to 191
Data columns (total 20 columns):
Plate                 122 non-null object
Well                  122 non-null object
GELImage              0 non-null float64
Extraction Barcode    122 non-null int64
Extraction ID         122 non-null object
Workflow ID           122 non-null object
Locus                 122 non-null object
Date                  122 non-null object
Reaction state        122 non-null object
Primer                122 non-null object
Direction             122 non-null object
Reaction Cocktail     122 non-null object
Cleanup performed     122 non-null object
Cleanup method        122 non-null object
Technician            122 non-null object
notes                 122 non-null object
Extraction BCID       0 non-null float64
# Traces              122 non-null int64
# Passed Sequences    122 non-null int64
# Sequences           122 non-null int64
dtypes: float64(2), int64(4), object(14)
memory usage: 20.0+ 
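Before relying on the filter above, it can help to see which reaction states are actually present, since anything other than "passed" is treated as a redo. A minimal sketch on toy data follows; the state labels other than "passed" and "failed" are hypothetical examples, not values confirmed by this export.

```python
import pandas as pd

# Toy 'Reaction state' column; 'not run' is an illustrative label only.
states = pd.Series(['passed', 'failed', 'failed', 'not run'], name='Reaction state')

# Count of wells per state.
counts = states.value_counts()
print(counts)

# Same filter as the script: keep everything that is not 'passed'.
redo_states = states[states != 'passed']
print(len(redo_states), 'wells would be listed as redos')
```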

This will merge the non-passed wells with the information from the FIMS:

In [11]:
redosWithFIMS_df = FIMS_df.merge(redos_df, how='right', left_on='tissueOtherCatalogNumbers', 
                                 right_on='Extraction Barcode')
redosWithFIMS_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 122 entries, 0 to 121
Data columns (total 42 columns):
materialSampleID             122 non-null object
institutionCode              122 non-null object
kingdom                      122 non-null object
phylum                       122 non-null object
scientificName               122 non-null object
yearCollected                122 non-null int64
locality                     122 non-null object
country                      122 non-null object
tissuePlate                  122 non-null object
tissueWell                   122 non-null object
tissueType                   122 non-null object
collectionCode               122 non-null object
taxonRemarks                 6 non-null object
catalogNumber                122 non-null int64
voucherCatalogNumber         122 non-null object
identifiedBy                 122 non-null object
genbankSpecimenVoucher       122 non-null object
samplingProtocol             122 non-null object
dayCollected     
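Because the merge above uses `how='right'`, a redo well whose barcode has no match in the FIMS is kept, but with empty FIMS columns. One way to surface such rows is the `indicator` parameter of `merge`. The sketch below uses small hypothetical frames (`fims` and `redos`) in place of `FIMS_df` and `redos_df`.

```python
import pandas as pd

# Hypothetical stand-ins for FIMS_df and redos_df.
fims = pd.DataFrame({
    'tissueOtherCatalogNumbers': [101, 102],
    'tissueWell': ['A01', 'A02'],
})
redos = pd.DataFrame({
    'Extraction Barcode': [101, 103],
    'Well': ['A01', 'A03'],
})

# indicator=True adds a '_merge' column: 'both' or 'right_only' for a right merge.
merged = fims.merge(
    redos,
    how='right',
    left_on='tissueOtherCatalogNumbers',
    right_on='Extraction Barcode',
    indicator=True,
)

# Redo wells that found no FIMS record.
unmatched = merged[merged['_merge'] == 'right_only']
print(unmatched[['Extraction Barcode', 'Well']])
```

An empty `unmatched` frame suggests every redo barcode was found in the FIMS; the two key columns also need compatible dtypes for the match to work.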

Now we check that the merge worked by confirming that, in every row, the Well value from Geneious matches the tissueWell value from the FIMS.

In [13]:
wellValidation_df = redosWithFIMS_df[['Well','tissueWell']]
print(wellValidation_df)

    Well tissueWell
0    B01        B01
1    B01        B01
2    C01        C01
3    C01        C01
4    D01        D01
..   ...        ...
117  C12        C12
118  G12        G12
119  G12        G12
120  H12        H12
121  H12        H12

[122 rows x 2 columns]
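The printed table is truncated, so a mismatch in the hidden middle rows would go unnoticed. A direct comparison of the two columns covers every row. This is a minimal sketch on a toy frame; `merged` is a hypothetical stand-in for `redosWithFIMS_df`.

```python
import pandas as pd

# Toy stand-in for redosWithFIMS_df; the last row is a deliberate mismatch.
merged = pd.DataFrame({
    'Well': ['B01', 'C01', 'D01'],
    'tissueWell': ['B01', 'C01', 'D02'],
})

# Rows where the Geneious well and the FIMS well disagree.
# A missing tissueWell (NaN) also compares as unequal, so unmatched rows show up too.
mismatches = merged[merged['Well'] != merged['tissueWell']]

if mismatches.empty:
    print('All wells match.')
else:
    print(f'{len(mismatches)} mismatched row(s):')
    print(mismatches)
```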


And finally we save the output to an Excel spreadsheet.

In [14]:
redosWithFIMS_df.to_excel('Redos.xlsx', index=False)