### RULES
Requires a blank row between different phyla (e.g., an empty row above Diatom, Dinoflagellate, etc.).
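A minimal sketch of why the blank row matters, using a made-up toy table (the frame `toy` and its `Label` column are hypothetical stand-ins, not the real LIS sheet): the row directly after each fully empty row is treated as a phylum header and then forward-filled onto the species rows below it.

```python
import pandas as pd

# Toy stand-in for the LIS sheet (hypothetical data): a blank row precedes each phylum header.
toy = pd.DataFrame({
    "Label":   [None, "Diatom", "Skeletonema", None, "Dinoflagellate", "Prorocentrum"],
    "Species": [None, None, "Skeletonema costatum", None, None, "Prorocentrum micans"],
})

# Rows where both columns are empty are the separators; the next row holds the phylum name.
blank = toy["Label"].isna() & toy["Species"].isna()
header_idx = toy.index[blank] + 1

# Place the phylum names at the header rows, then forward-fill down to the species rows.
toy["Phylum"] = toy.loc[header_idx, "Label"]
toy["Phylum"] = toy["Phylum"].ffill()

# Keep only real species rows.
result = toy.dropna(subset=["Species"]).reset_index(drop=True)
print(result[["Phylum", "Species"]])
```

If a phylum header is not preceded by an empty row, its species would silently inherit the previous phylum, which is why the layout requirement is stated up front.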

All species of the following genera are assumed to be mixotrophs: {Ochromonas}.

1. Everything after "Unknown flagellates" is assumed irrelevant and is deleted.
2. Diatoms are NOT mixotrophs (all diatom rows are removed).
3. Remove all "[name]-like" entries (no genus specified).
4. Remove all "[genus name] spp." and "[genus name] sp." entries.
5. Check "cysts of [species]" entries against the Mixotroph Database.
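Rules 3 and 4 can be sketched on a few illustrative names (the list below is made up for demonstration, not taken from the LIS file). The point of the sketch is that the dot in "sp." has to be escaped: as a bare regex it matches any character, so a full species name whose epithet starts with "sp" would be dropped by mistake.

```python
import pandas as pd

# Illustrative names only (hypothetical input, not the real LIS data).
names = pd.Series([
    "Prorocentrum micans",
    "Gymnodinium-like",
    "Chaetoceros spp.",
    "Ochromonas sp.",
    "Protoperidinium spinulosum",  # contains "sp" but is a full species name
])

# Rule 3: literal match on "-like" (regex=False avoids any pattern interpretation).
is_like = names.str.contains("-like", regex=False)

# Rule 4: "sp." or "spp." as a standalone token; the dot is escaped so it is literal.
is_sp = names.str.contains(r"\bspp?\.", regex=True)

kept = names[~(is_like | is_sp)].tolist()
print(kept)
```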

Status key:  
Confirmed := name appears explicitly in the Mixotroph Database  
Unsure (sp. in mdb) := the Mixotroph Database lists the genus only as "[genus name] sp." (e.g., Ochromonas sp. for Ochromonas danica)  
Unsure (inexact name) := the LIS name is contained in a longer Mixotroph Database name, or vice versa (e.g., Chattonella marina in Chattonella marina var. ovata)  
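The status key can be read as an ordered set of checks. The following is a small sketch, not the notebook's actual implementation: the function `classify` and the set `mdb_names` are hypothetical, and the example names are the ones used in the key itself.

```python
def classify(lis_name, mdb_names):
    """Return a status string for one LIS species name, or None if no rule applies."""
    # Exact match against the database.
    if lis_name in mdb_names:
        return "Confirmed"
    # Database lists only "<genus> sp." for this genus.
    genus = lis_name.split()[0]
    if f"{genus} sp." in mdb_names:
        return "Unsure (sp. in mdb)"
    # One name is contained in the other, in either direction.
    if any(lis_name in m or m in lis_name for m in mdb_names):
        return "Unsure (inexact name)"
    return None


# Hypothetical miniature database for illustration.
mdb_names = {"Ochromonas sp.", "Chattonella marina var. ovata", "Prorocentrum micans"}
examples = ["Prorocentrum micans", "Ochromonas danica", "Chattonella marina", "Skeletonema costatum"]
statuses = {name: classify(name, mdb_names) for name in examples}
print(statuses)
```

The order matters: a name that qualifies for several statuses gets the strongest one, since later checks only run when earlier ones fail.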

### QUESTIONS TO ASK

1. Should I be considering "cysts of Lingulodinium polyedrum" mixotrophs?
2. How should I handle these "unsure" cases? - see status key above

In [152]:
import pandas as pd
import numpy as np

from datetime import datetime

pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option("future.no_silent_downcasting", True)

In [153]:
mdb = pd.read_csv("MDB - 3Dec2022.csv")
mdb.columns = mdb.iloc[1]
mdb = mdb.drop([0, 1]).reset_index(drop=True)

# edit mdb so that species ending in the standalone token "sp" now end in "sp."
# (\b keeps names that merely end in the letters "sp" from being altered)
mdb['Species Name'] = mdb['Species Name'].str.replace(r'\bsp$', 'sp.', regex=True)

mdb.head()

1,Species Name,Taxonomic Group,AphiaID,Additional notes,Gene markers,PR2 Accession Number,GenBank Accession Number,Reference to sequence,MFT,Evidence of mixoplankton activity,Reference to mixoplankton activity,can form colonies,has benthic stages,has parasitic stages,HABs,L (μm),W (μm) or diameter (μm),Depth (μm) or other information,size class,colony size,Reference to mixoplankton size,ability to use NO3,Reference to NO3 usage,Mode of feeding,Presumed prey size,tested min size (μm),tested max size (μm),ingested min size (μm),ingested max size (μm),# prey spp tested,Foram ontogenetic stage (20 μm),presumed prey size class for col AG,Foram ontogenetic stage (20-200 μm),Presumed prey size (µm) for col AI,Foram ontogenetic stage (200 μm - adult),Presumed prey size (µm) for col AK,References for prey sizes,example prey types,prey taxonomic group,References for prey types,Plastid source,plastid taxonomic group,indicative size in environment,References for plastids,Symbiont source,Symbiont taxonomic group,References for taxonomic information,symbiont size (environment),symbiont size (in hospite),indicative size,symbionts per host,References and additional notes for symbionts,NaN,ALSK,ANTA,APLR,ARAB,ARCH,ARCT,AUSE,AUSW,BENG,BERS,BPLR,BRAZ,CAMR,CARB,CCAL,CHIL,CHIN,CNRY,EAFR,ETRA,FKLD,GFST,GUIA,GUIN,INDE,INDW,ISSG,KURO,MEDI,MONS,NADR,NASE,NASW,NATR,NECS,NEWZ,NPPF,NPSW,NPTG,NWCS,PEQD,PNEC,PSAE,PSAW,REDS,SANT,SARC,SATL,SPSG,SSTC,SUND,TASM,WARM,WTRA
0,Acanthochiasma sp.,Radiolaria,368427,Acantharia,18S_rRNA_nucleus;18S_rRNA_nucleus;18S_rRNA_nuc...,HM103395.1.1099_U;HM103418.1.1104_U;JN811207.1...,HM103395;HM103418;JN811207;GU825020;HM103399;H...,"Quaiser,A.. Comparative metagenomics of bathyp...",eSNCM,endosymbionts,"Decelle J, Siano R, Probert I et al (2012b) Mu...",no,not recorded,not recorded,not recorded,100,150,not recorded,micro,not applicable,"Decelle J, Siano R, Probert I et al (2012b) Mu...",not recorded,this work,prey capture using axopodia; engulfment after ...,pico-micro,not recorded,not recorded,not recorded,not recorded,not recorded,not applicable,not applicable,not applicable,not applicable,not applicable,not applicable,"Swanberg, N.R. and Caron, D.A., 1991. Patterns...","tintinnids, mollusc larvae, copepods, ostracod...","Diatomeae, Ciliophora, Copepoda, Mollusca, Ost...","Swanberg, N.R. and Harbison, G.R., 1980. The e...",not applicable,not applicable,not applicable,not applicable,"Chrysochromulina, Azadinium sp., Pelagodinium ...","Haptophyta, Dinoflagellata","Decelle J, Siano R, Probert I et al (2012b) Mu...",not recorded,4-15 um,nano,5 (<100 μm); 10-29 (100-200 μm); 29-45 (200-30...,"Decelle J, Siano R, Probert I et al (2012b) Mu...",-,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded
1,Acanthometra fusca,Radiolaria,not registered,Acantharia,18S_rRNA_nucleus;18S_rRNA_nucleus;18S_rRNA_nuc...,KC172856.1.1696_U;EU446351.1.1552_U;JN811165.1...,KC172856;EU446351;JN811165,"Decelle,J.. Diversity, ecology and biogeochemi...",eSNCM,endosymbionts,"Michaels, A.F., 1988. Vertical distribution an...",no,not recorded,not recorded,not recorded,100-300,100-300,not recorded,micro-meso,not applicable,"Michaels, A.F., 1988. Vertical distribution an...",not recorded,this work,prey capture using axopodia; engulfment after ...,pico-micro,not recorded,not recorded,not recorded,not recorded,not recorded,not applicable,not applicable,not applicable,not applicable,not applicable,not applicable,"Swanberg, N.R. and Caron, D.A., 1991. Patterns...","tintinnids, mollusc larvae, copepods, ostracod...","Diatomeae, Ciliophora, Copepoda, Mollusca, Ost...","Swanberg, N.R. and Harbison, G.R., 1980. The e...",not applicable,not applicable,not applicable,not applicable,"Phaeocystis, Chrysochromulina; Gymnoxanthella ...","Haptophyta, Dinoflagellata","Gastrich, M.D., 1987. Ulstructure of a new int...",3-5 μm,8-10 μm,nano,10-29 (100-200 μm); 29-45 (200-300 μm); 40 (~3...,"symbiont size: Decelle, J., Stryhanyuk, H., Ga...",-,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded
2,Acanthodesmia vinculata,Radiolaria,493675,Acantharia,not recorded,not recorded,not recorded,not recorded,eSNCM,endosymbionts,"Yuasa, T., Horiguchi, T., Mayama, S. and Takah...",no,not recorded,not recorded,not recorded,200,200,not recorded,meso,not applicable,"Zhang, L., Suzuki, N., Nakamura, Y. and Tuji, ...",not recorded,this work,prey capture using axopodia; engulfment after ...,pico-micro,not recorded,not recorded,not recorded,not recorded,not recorded,not applicable,not applicable,not applicable,not applicable,not applicable,not applicable,this work,"tintinnids, mollusc larvae, copepods, ostracod...","Diatomeae, Ciliophora, Copepoda, Mollusca, Ost...","Swanberg, N.R. and Harbison, G.R., 1980. The e...",not applicable,not applicable,not applicable,not applicable,Gymnoxanthella radiolariae; Zooxanthella nutri...,Dinoflagellata,"Yuasa, T., Horiguchi, T., Mayama, S. and Takah...",9.1–11.4 um (length); 5.7–9.4 um (width); oval...,nano,nano,> 50,"symbiont number: Suzuki, N. and Not, F., 2015....",-,not recorded,not recorded,not recorded,not recorded,2,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,29,not recorded,not recorded,not recorded,not recorded,18,not recorded,not recorded,4,1,not recorded,not recorded,not recorded,not recorded,33,not recorded,not recorded,not recorded,not recorded,1,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,12,2,not recorded,not recorded,not recorded,not recorded,not recorded,2,8,not recorded,not recorded,not recorded,15,not recorded
3,Acanthometra pellucida,Radiolaria,235750,Acantharia,18S_rRNA_nucleus;18S_rRNA_nucleus;18S_rRNA_nuc...,JN811196.1.1668_U;JQ697712.1.1693_U;JQ697708.1...,JN811196;JQ697712;JQ697708;JN811190;JQ697711;J...,"Decelle,J.. Molecular Phylogeny and Morphologi...",eSNCM,endosymbionts,"Febvre, J. and Febvre-Chevalier, C., 1979. Ult...",no,not recorded,not recorded,not recorded,100-300,100-300,not recorded,micro-meso,not applicable,"Michaels, A.F., 1988. Vertical distribution an...",not recorded,this work,prey capture using axopodia; engulfment after ...,pico-micro,not recorded,not recorded,not recorded,not recorded,not recorded,not applicable,not applicable,not applicable,not applicable,not applicable,not applicable,"Swanberg, N.R. and Caron, D.A., 1991. Patterns...","tintinnids, mollusc larvae, copepods, ostracod...","Diatomeae, Ciliophora, Copepoda, Mollusca, Ost...","Swanberg, N.R. and Harbison, G.R., 1980. The e...",not applicable,not applicable,not applicable,not applicable,"Phaeocystis, Chrysochromulina; Gymnoxanthella ...","Haptophyta, Dinoflagellata","Febvre, J. and Febvre-Chevalier, C., 1979. Ult...",3-5 μm,8-10 μm,nano,10-29 (100-200 μm); 29-45 (200-300 μm); 40 (~3...,"symbiont size: Decelle, J., Stryhanyuk, H., Ga...",-,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,17,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,1,31,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded
4,Acanthometron sp.,Radiolaria,391880,Acantharia,not recorded,not recorded,not recorded,not recorded,eSNCM,endosymbionts,"Febvre, J. and Febvre-Chevalier, C., 1979. Ult...",no,not recorded,not recorded,not recorded,~100-400,~100-400,not recorded,micro-meso,not applicable,"assumption based on Mars Brisbin, M., Mesrop, ...",not recorded,this work,prey capture using axopodia; engulfment after ...,pico-micro,not recorded,not recorded,not recorded,not recorded,not recorded,not applicable,not applicable,not applicable,not applicable,not applicable,not applicable,"Swanberg, N.R. and Caron, D.A., 1991. Patterns...","tintinnids, mollusc larvae, copepods, ostracod...","Diatomeae, Ciliophora, Copepoda, Mollusca, Ost...","Swanberg, N.R. and Harbison, G.R., 1980. The e...",not applicable,not applicable,not applicable,not applicable,"Prymnesiophyceae; Phaeocystis, Chrysochromulina","Haptophyta, Dinoflagellata","Mars Brisbin, M., Mesrop, L.Y., Grossmann, M.M...",3-5 μm,8-10 μm,nano,5 (<100 μm); 10-29 (100-200 μm); 29-45 (200-30...,"symbiont size: Decelle, J., Stryhanyuk, H., Ga...",-,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded,not recorded


In [154]:
csv_name = "LIS_2019-Phytoplankton_Final Report Data.xlsx - 2019 LIS phytoplankton count"

In [155]:
# import and clean LIS data
lis = pd.read_csv(f"inputs/{csv_name}.csv")
original_headers = lis.columns  # save original column headers

phylum_ind = lis[lis.iloc[:, 0] == "Phylum"].index[0]
lis.columns = lis.iloc[phylum_ind]  # reset column headers
lis = lis.iloc[phylum_ind+2:].reset_index(drop=True)  

In [156]:
# remove rows after unknown flagellates
unknown_flagellates_ind = lis[lis["Phylum"] == "Unknown flagellates"].index[0] 
lis = lis.iloc[:unknown_flagellates_ind]
lis = lis.iloc[:lis.last_valid_index()+1]  # remove trailing nan rows

In [157]:
# remove rows that contain "TOTAL"
lis = lis[~lis["Phylum"].str.contains("TOTAL", na=False)].reset_index(drop=True)  

In [158]:
# construct correct phylum column
actual_phylum_ind = lis[lis["Species"].isna() & lis["Phylum"].isna()].index + 1
lis = lis.rename(columns={"Phylum": "Genus"}) # rename phylum column to genus
lis.insert(0, 'Phylum', lis["Genus"].iloc[actual_phylum_ind])  # reconstruct phylum column
lis['Phylum'] = lis['Phylum'].ffill()  # forwardfill phylum

lis['Genus'] = lis['Species'].str.split().str[0]  # fill genus using first word of species name

lis = lis.dropna(subset=['Species']).reset_index(drop=True) # delete rows with na in Species column

SPECIES_COL = lis.columns.get_loc("Species")
lis.iloc[:, SPECIES_COL+1:] = lis.iloc[:, SPECIES_COL+1:].replace(",", "", regex=True).astype(float)  # ensure numerical values are floats

In [159]:
# add Status column
lis.insert(0, 'Status', None)

In [160]:
# store blocks of known mixotroph genera so they can be added back after filtering
ochromonas_ind = lis[lis["Species"].str.contains("Ochromonas", regex=False)].index
ochromonas_block = lis.loc[ochromonas_ind]

In [161]:
# remove based on hard-coded rules (NOT RESETTING INDEX IN ORDER TO ADD BLOCKS BACK CORRECTLY)
lis = lis[lis["Phylum"] != "Diatom"]  # remove all diatoms
lis = lis[~lis["Species"].str.contains("-like", regex=False)]  # remove species containing "-like"
# remove all sp. / spp.; the dot is escaped because an unescaped "sp." also matches e.g. "spi"
lis = lis[~lis["Species"].str.contains(r"\bspp?\.", regex=True)]

In [162]:
# check "cysts of"
CYSTS_LEN = len("cysts of ")
cysts_of = lis[lis["Species"].str.contains("cysts of", regex=False)]["Species"].str.slice(CYSTS_LEN)
filtered = cysts_of.isin(mdb['Species Name'])
lis.loc[filtered[filtered].index, "Status"] = "Confirmed"
lis.loc[filtered[filtered].index, "Genus"] = cysts_of.str.split().str[0]

In [163]:
# add back stored blocks of known mixotrophs and mark as Confirmed
lis = pd.concat([lis, ochromonas_block]).sort_index().drop_duplicates()
lis.loc[ochromonas_ind, "Status"] = "Confirmed"

In [165]:
# check for a direct match (among rows without a status) and mark all Trues as "Confirmed"
filtered = lis["Species"].isin(mdb['Species Name'])
lis.loc[filtered[filtered].index, "Status"] = "Confirmed"

# check (among rows still without a status) if the mdb lists "[genus] sp." and mark all Trues as "Unsure (sp. in mdb)"
# no drop_duplicates here: every species of a matching genus must be marked, not only the first one
genus_to_check = lis[lis['Status'].isnull()]['Species'].str.split().str[0] + " sp."
filtered = genus_to_check.isin(mdb['Species Name'])
lis.loc[filtered[filtered].index, "Status"] = "Unsure (sp. in mdb)"

In [166]:
# check (among rows still without a status) if the LIS name is contained in an mdb name and mark all Trues as "Unsure (inexact name)"
filtered = lis[lis['Status'].isnull()]["Species"].apply(lambda x: mdb["Species Name"].str.contains(x, regex=False).any())
lis.loc[filtered[filtered].index, "Status"] = "Unsure (inexact name)"

# and vice versa: an mdb name contained in the LIS name
# (plain substring test; joining raw names into a regex would treat ".", "(" etc. as pattern syntax)
mdb_names = mdb['Species Name'].dropna()
filtered = lis[lis['Status'].isnull()]["Species"].apply(lambda x: any(name in x for name in mdb_names))
lis.loc[filtered[filtered].index, "Status"] = "Unsure (inexact name)"
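When database names are joined into a single regular expression for a containment check, characters such as the dot in "sp." act as wildcards unless they are escaped. A small sketch with made-up names (both series below are hypothetical, not taken from the MDB or LIS files) compares a raw join with one passed through `re.escape`:

```python
import re
import pandas as pd

# Hypothetical database names, including a missing value as real data might.
mdb_names = pd.Series(["Ochromonas sp.", "Chattonella marina var. ovata", None])
# Hypothetical survey names to test against.
lis_names = pd.Series(["Ochromonas spX (hypothetical)", "Chattonella marina var. ovata f. test"])

# Raw join: "." matches any character, so "sp." also matches "spX".
raw = "|".join(mdb_names.dropna())
# Escaped join: every name is matched literally.
escaped = "|".join(re.escape(n) for n in mdb_names.dropna())

raw_hits = lis_names.str.contains(raw, regex=True).tolist()
escaped_hits = lis_names.str.contains(escaped, regex=True).tolist()
print(raw_hits, escaped_hits)
```

Dropping missing values before the join also matters, since `str.join` raises a `TypeError` when the sequence contains a non-string such as `None` or `NaN`.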

In [167]:
# drop all rows whose Status is still None (no rule matched)
lis = lis.dropna(subset=['Status']).reset_index(drop=True)

In [178]:
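# attach MFT, evidence, and size class from the mdb (left join keeps every LIS row)
# NOTE: run this cell once; re-running merges the same columns again and produces _x/_y suffixes
mdb_cols = ['Species Name', 'MFT', 'Evidence of mixoplankton activity', 'size class']
lis = pd.merge(lis, mdb[mdb_cols], left_on='Species', right_on='Species Name', how='left').drop(columns=['Species Name'])
# move the three new columns so they sit right after Species
lis = lis[lis.columns[:4].to_list() + lis.columns[-3:].to_list() + lis.columns[4:-3].to_list()]
lis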

1,Status,Phylum,Genus,Species,MFT_y,Evidence of mixoplankton activity_y,size class_y,1/3/19,1/3/19.1,1/3/19.2,1/3/19.3,1/3/19.4,1/3/19.5,1/3/19.6,1/3/19.7,1/3/19.8,1/3/19.9,1/3/19.10,1/3/19.11,1/3/19.12,1/3/19.13,1/3/19.14,1/3/19.15,1/3/19.16,1/3/19.17,1/7/19,1/7/19.1,1/7/19.2,1/7/19.3,1/7/19.4,1/7/19.5,1/7/19.6,1/7/19.7,1/7/19.8,1/7/19.9,1/7/19.10,1/7/19.11,1/7/19.12,1/7/19.13,1/7/19.14,1/7/19.15,1/7/19.16,1/7/19.17,1/2/19,1/2/19.1,1/2/19.2,1/2/19.3,1/2/19.4,1/2/19.5,1/2/19.6,1/2/19.7,1/2/19.8,1/2/19.9,1/2/19.10,1/2/19.11,1/2/19.12,1/2/19.13,1/2/19.14,1/2/19.15,1/2/19.16,1/2/19.17,1/2/19.18,1/2/19.19,1/2/19.20,1/2/19.21,1/2/19.22,1/2/19.23,1/2/19.24,1/2/19.25,1/2/19.26,1/2/19.27,1/2/19.28,1/2/19.29,1/2/19.30,1/2/19.31,1/3/19.18,1/3/19.19,1/3/19.20,1/3/19.21,1/3/19.22,1/3/19.23,1/3/19.24,1/3/19.25,1/3/19.26,1/3/19.27,1/3/19.28,1/3/19.29,1/3/19.30,1/3/19.31,1/3/19.32,1/3/19.33,1/3/19.34,1/3/19.35,1/7/19.18,1/7/19.19,1/7/19.20,1/7/19.21,1/7/19.22,1/7/19.23,1/7/19.24,1/7/19.25,1/7/19.26,1/7/19.27,1/7/19.28,1/7/19.29,1/7/19.30,1/7/19.31,1/7/19.32,1/7/19.33,1/7/19.34,1/7/19.35,1/2/19.32,1/2/19.33,1/2/19.34,1/2/19.35,1/2/19.36,1/2/19.37,1/2/19.38,1/2/19.39,1/2/19.40,1/2/19.41,1/2/19.42,1/2/19.43,1/2/19.44,1/2/19.45,1/2/19.46,1/2/19.47,1/2/19.48,1/2/19.49,1/2/19.50,1/2/19.51,1/2/19.52,1/2/19.53,1/2/19.54,1/2/19.55,1/2/19.56,1/2/19.57,1/2/19.58,1/2/19.59,1/2/19.60,1/2/19.61,1/2/19.62,1/2/19.63,2/7/19,2/7/19.1,2/7/19.2,2/7/19.3,2/7/19.4,2/7/19.5,2/7/19.6,2/7/19.7,2/7/19.8,2/7/19.9,2/7/19.10,2/7/19.11,2/7/19.12,2/7/19.13,2/7/19.14,2/7/19.15,2/7/19.16,2/7/19.17,2/7/19.18,2/7/19.19,2/7/19.20,2/7/19.21,2/7/19.22,2/7/19.23,2/7/19.24,2/7/19.25,2/7/19.26,2/7/19.27,2/7/19.28,2/7/19.29,2/7/19.30,2/7/19.31,2/6/19,2/6/19.1,2/6/19.2,2/6/19.3,2/6/19.4,2/6/19.5,2/6/19.6,2/6/19.7,2/4/19,2/4/19.1,2/4/19.2,2/4/19.3,2/4/19.4,2/4/19.5,2/4/19.6,2/4/19.7,2/4/19.8,2/4/19.9,2/4/19.10,2/4/19.11,2/4/19.12,2/4/19.13,2/4/19.14,2/4/19.15,2/4/19.16,2/4/19.17,2/4/19.18,2/4/19.19,2/4/19.20,2/4/19.21,2/4/1
9.22,2/4/19.23,2/4/19.24,2/4/19.25,2/4/19.26,2/4/19.27,2/4/19.28,2/4/19.29,2/4/19.30,2/4/19.31,2/7/19.32,2/7/19.33,2/7/19.34,2/7/19.35,2/7/19.36,2/7/19.37,2/7/19.38,2/7/19.39,2/7/19.40,2/7/19.41,2/7/19.42,2/7/19.43,2/7/19.44,2/7/19.45,2/7/19.46,2/7/19.47,2/7/19.48,2/7/19.49,2/7/19.50,2/7/19.51,2/7/19.52,2/7/19.53,2/7/19.54,2/7/19.55,2/7/19.56,2/7/19.57,2/7/19.58,2/7/19.59,2/7/19.60,2/7/19.61,2/7/19.62,2/7/19.63,2/6/19.8,2/6/19.9,2/6/19.10,...,11/5/19,11/5/19.1,11/5/19.2,11/5/19.3,11/5/19.4,11/5/19.5,11/5/19.6,11/5/19.7,11/5/19.8,11/5/19.9,11/5/19.10,11/5/19.11,11/5/19.12,11/5/19.13,11/5/19.14,11/5/19.15,11/5/19.16,11/5/19.17,11/5/19.18,11/5/19.19,11/5/19.20,11/5/19.21,11/5/19.22,11/6/19,11/6/19.1,11/6/19.2,11/6/19.3,11/6/19.4,11/6/19.5,11/6/19.6,11/6/19.7,11/4/19,11/4/19.1,11/4/19.2,11/4/19.3,11/4/19.4,11/4/19.5,11/4/19.6,11/4/19.7,11/4/19.8,11/4/19.9,11/4/19.10,11/4/19.11,11/4/19.12,11/4/19.13,11/4/19.14,11/4/19.15,11/4/19.16,11/4/19.17,11/4/19.18,11/4/19.19,11/4/19.20,11/4/19.21,11/4/19.22,11/4/19.23,11/4/19.24,11/4/19.25,11/4/19.26,11/4/19.27,11/4/19.28,11/4/19.29,11/4/19.30,11/4/19.31,11/5/19.23,11/5/19.24,11/5/19.25,11/5/19.26,11/5/19.27,11/5/19.28,11/5/19.29,11/5/19.30,11/5/19.31,11/5/19.32,11/5/19.33,11/5/19.34,11/5/19.35,11/5/19.36,11/5/19.37,11/5/19.38,11/5/19.39,11/5/19.40,11/5/19.41,11/5/19.42,11/5/19.43,11/5/19.44,11/5/19.45,11/5/19.46,11/5/19.47,11/5/19.48,11/5/19.49,11/5/19.50,11/5/19.51,11/5/19.52,11/5/19.53,11/5/19.54,11/6/19.8,11/6/19.9,11/6/19.10,11/6/19.11,11/6/19.12,11/6/19.13,11/6/19.14,11/6/19.15,11/4/19.32,11/4/19.33,11/4/19.34,11/4/19.35,11/4/19.36,11/4/19.37,11/4/19.38,11/4/19.39,11/4/19.40,11/4/19.41,11/4/19.42,11/4/19.43,11/4/19.44,11/4/19.45,11/4/19.46,11/4/19.47,11/4/19.48,11/4/19.49,11/4/19.50,11/4/19.51,11/4/19.52,11/4/19.53,11/4/19.54,11/4/19.55,11/4/19.56,11/4/19.57,11/4/19.58,11/4/19.59,11/4/19.60,11/4/19.61,11/4/19.62,11/4/19.63,12/6/19,12/6/19.1,12/6/19.2,12/6/19.3,12/6/19.4,12/6/19.5,12/6/19.6,12/6/19.7,12/6/19.8,12/6/19.9,12/6/1
9.10,12/6/19.11,12/6/19.12,12/6/19.13,12/6/19.14,12/6/19.15,12/6/19.16,12/6/19.17,12/5/19,12/5/19.1,12/16/19,12/16/19.1,12/16/19.2,12/16/19.3,12/16/19.4,12/16/19.5,12/16/19.6,12/16/19.7,12/16/19.8,12/16/19.9,12/16/19.10,12/16/19.11,12/16/19.12,12/16/19.13,12/16/19.14,12/16/19.15,12/16/19.16,12/16/19.17,12/4/19,12/4/19.1,12/4/19.2,12/4/19.3,12/4/19.4,12/4/19.5,12/4/19.6,12/4/19.7,12/4/19.8,12/4/19.9,12/4/19.10,12/4/19.11,12/4/19.12,12/4/19.13,12/4/19.14,12/4/19.15,12/4/19.16,12/4/19.17,12/6/19.18,12/6/19.19,12/6/19.20,12/6/19.21,12/6/19.22,12/6/19.23,12/6/19.24,12/6/19.25,12/6/19.26,12/6/19.27,12/6/19.28,12/6/19.29,12/6/19.30,12/6/19.31,12/6/19.32,12/6/19.33,12/6/19.34,12/6/19.35,12/5/19.2,12/5/19.3,12/16/19.18,12/16/19.19,12/16/19.20,12/16/19.21,12/16/19.22,12/16/19.23,12/16/19.24,12/16/19.25,12/16/19.26,12/16/19.27,12/16/19.28,12/16/19.29,12/16/19.30,12/16/19.31,12/16/19.32,12/16/19.33,12/16/19.34,12/16/19.35,12/4/19.18,12/4/19.19,12/4/19.20,12/4/19.21,12/4/19.22,12/4/19.23,12/4/19.24,12/4/19.25,12/4/19.26,12/4/19.27,12/4/19.28,12/4/19.29,12/4/19.30,12/4/19.31,12/4/19.32,12/4/19.33,12/4/19.34,12/4/19.35,MFT_x,Evidence of mixoplankton activity_x,size class_x
0,Confirmed,Dinoflagellate,Akashiwo,Akashiwo sanguinea,CM,"ingestion of ciliates, Isochrysis, Cryptophyte...",micro,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,CM,"ingestion of ciliates, Isochrysis, Cryptophyte...",micro
1,Confirmed,Dinoflagellate,Dinophysis,Dinophysis acuminata,pSNCM,photosynthetic Dinophysis spp. obtain plastids...,micro,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...,,176.0,,,,,,,,176.0,,,,,,,,176.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,176.0,,,,,,,,176.0,,,,,,,,176.0,,,,,,,,176.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,176.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,176.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,pSNCM,photosynthetic Dinophysis spp. obtain plastids...,micro
2,Confirmed,Dinoflagellate,Dinophysis,Dinophysis miles,pSNCM*,This species retains chloroplasts from cryptop...,micro,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,pSNCM*,This species retains chloroplasts from cryptop...,micro
3,Confirmed,Dinoflagellate,Dinophysis,Dinophysis norvegica,pSNCM,photosynthetic Dinophysis spp. obtain plastids...,micro,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,pSNCM,photosynthetic Dinophysis spp. obtain plastids...,micro
4,Confirmed,Dinoflagellate,Gambierdiscus,Gambierdiscus toxicus,CM,"presence of feeding vacuoles, unknown prey",micro,352.0,,,,,,352.0,,,,,,352.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,352.0,,,,,,352.0,,,,,,352.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,352.0,,,,,,,,352.0,,,,,,,,352.0,,,,,,,,352.0,,,,,,,,,,,,,,,,,,,,176.0,,,,,,,,176.0,,,,,,,,176.0,,,,,,,,176.0,,,,352.0,,,,,,,,352.0,,,,,,,,352.0,,,,,,,,352.0,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,88.0,,,,,,88.0,,,,,,88.0,,,,,,,,,88.0,,,,,,88.0,,,,,,88.0,,,,,,,,,,,,,,,,,,,,,,,88.0,,,,,,88.0,,,,,,88.0,,,,,,,,,88.0,,,,,,88.0,,,,,,88.0,,,,,,,,,,,,,,,,,,,,CM,"presence of feeding vacuoles, unknown prey",micro
5,Confirmed,Dinoflagellate,Gonyaulax,Gonyaulax polygramma,CM,"ingestion of cryptophyte species, Amphidinium ...",micro,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,CM,"ingestion of cryptophyte species, Amphidinium ...",micro
6,Confirmed,Dinoflagellate,Heterocapsa,Heterocapsa circularisquama,CM,bacteria in food vacuoles,nano,17600.0,17600.0,30800.0,8800.0,2904.0,,17600.0,17600.0,30800.0,8800.0,2904.0,,17600.0,17600.0,30800.0,8800.0,2904.0,,8800.0,,17600.0,13200.0,1452.0,,8800.0,,17600.0,13200.0,1452.0,,8800.0,,17600.0,13200.0,1452.0,,22000.0,22000.0,,,1452.0,,,,22000.0,22000.0,,,1452.0,,,,22000.0,22000.0,,,1452.0,,,,22000.0,22000.0,,,1452.0,,,,17600.0,17600.0,30800.0,8800.0,2904.0,,17600.0,17600.0,30800.0,8800.0,2904.0,,17600.0,17600.0,30800.0,8800.0,2904.0,,8800.0,,17600.0,13200.0,1452.0,,8800.0,,17600.0,13200.0,1452.0,,8800.0,,17600.0,13200.0,1452.0,,22000.0,22000.0,,,1452.0,,,,22000.0,22000.0,,,1452.0,,,,22000.0,22000.0,,,1452.0,,,,22000.0,22000.0,,,1452.0,,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,704.0,,,,704.0,,,,1452.0,2904.0,1452.0,,,2904.0,1452.0,,1452.0,2904.0,1452.0,,,2904.0,1452.0,,1452.0,2904.0,1452.0,,,2904.0,1452.0,,1452.0,2904.0,1452.0,,,2904.0,1452.0,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,13200.0,8800.0,,,704.0,,,...,158400.0,184800.0,92400.0,13200.0,13200.0,,2904.0,123200.0,158400.0,184800.0,92400.0,13200.0,13200.0,,2904.0,123200.0,158400.0,184800.0,92400.0,13200.0,13200.0,,2904.0,,22000.0,704.0,8800.0,,22000.0,704.0,8800.0,83600.0,8800.0,1452.0,,4400.0,1452.0,2904.0,2904.0,83600.0,8800.0,1452.0,,4400.0,1452.0,2904.0,2904.0,83600.0,8800.0,1452.0,,4400.0,1452.0,2904.0,2904.0,83600.0,8800.0,1452.0,,4400.0,1452.0,2904.0,2904.0,123200.0,158400.0,184800.0,92400.0,13200.0,13200.0,,2904.0,123200.0,158400.0,184800.0,92400.0,13200.0,13200.0,,2904.0,123200.0,158400.0,184800.0,92400.0,13200.0,13200.0,,2904.0,123200.0,158400.0,184800.0,92400.0,13200.0,13200.0,,2904.0,,22000.0,704.0,8800.0,,22000.0,704.0,8800.0,83600.0,8800.0,1452.0,,4400.0,1452.0,2904.0,2904.0,83600.0,8800.0,1452.0,,4400.0,1452.0,2904.0,2904
.0,83600.0,8800.0,1452.0,,4400.0,1452.0,2904.0,2904.0,83600.0,8800.0,1452.0,,4400.0,1452.0,2904.0,2904.0,8800.0,13200.0,2904.0,13200.0,13200.0,2904.0,8800.0,13200.0,2904.0,13200.0,13200.0,2904.0,8800.0,13200.0,2904.0,13200.0,13200.0,2904.0,13200.0,352.0,1452.0,2904.0,1452.0,,,,1452.0,2904.0,1452.0,,,,1452.0,2904.0,1452.0,,,,,,,2904.0,704.0,,,,,2904.0,704.0,,,,,2904.0,704.0,,8800.0,13200.0,2904.0,13200.0,13200.0,2904.0,8800.0,13200.0,2904.0,13200.0,13200.0,2904.0,8800.0,13200.0,2904.0,13200.0,13200.0,2904.0,13200.0,352.0,1452.0,2904.0,1452.0,,,,1452.0,2904.0,1452.0,,,,1452.0,2904.0,1452.0,,,,,,,2904.0,704.0,,,,,2904.0,704.0,,,,,2904.0,704.0,,CM,bacteria in food vacuoles,nano
7,Confirmed,Dinoflagellate,Noctiluca,Noctiluca scintillans,eSNCM,endosymbionts,meso,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,eSNCM,endosymbionts,meso
8,Confirmed,Dinoflagellate,Prorocentrum,Prorocentrum lima,CM,"presence of feeding vacuoles, unknown prey",micro,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,CM,"presence of feeding vacuoles, unknown prey",micro
9,Confirmed,Dinoflagellate,Prorocentrum,Prorocentrum micans,CM,"consumed Isochrysis galbana, Heterosigma akash...",micro,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,CM,"consumed Isochrysis galbana, Heterosigma akash...",micro


In [151]:
filtered_mdb = mdb[['Species Name','MFT', 'Evidence of mixoplankton activity', 'size class']].loc[mdb['Species Name'].isin(lis['Species'])]
lis1 = pd.merge(lis, filtered_mdb, left_on='Species', right_on='Species Name', how='left').drop(columns='Species Name')
filtered_mdb

1,Species Name,MFT,Evidence of mixoplankton activity,size class
12,Akashiwo sanguinea,CM,"ingestion of ciliates, Isochrysis, Cryptophyte...",micro
111,Dinophysis acuminata,pSNCM,photosynthetic Dinophysis spp. obtain plastids...,micro
116,Dinophysis miles,pSNCM*,This species retains chloroplasts from cryptop...,micro
117,Dinophysis norvegica,pSNCM,photosynthetic Dinophysis spp. obtain plastids...,micro
147,Gambierdiscus toxicus,CM,"presence of feeding vacuoles, unknown prey",micro
167,Gonyaulax polygramma,CM,"ingestion of cryptophyte species, Amphidinium ...",micro
186,Heterocapsa circularisquama,CM,bacteria in food vacuoles,nano
189,Heterosigma akashiwo,CM,uptake of eubacteria,micro
235,Lingulodinium polyedra,CM,"ingestion of Skeletonema costatum, Synechococcus",micro
253,Noctiluca scintillans,eSNCM,endosymbionts,meso


In [109]:
# totals = lis.groupby('Phylum', as_index=False, sort=False).sum()
# totals

# # empty text-containing columns
# totals = totals.drop("Status", axis=1)
# totals.insert(0, 'Status', "") 
# totals["Genus"] = ""
# totals["Species"] = ""

# # rename to TOTAL "   "
# totals["Phylum"] = totals["Phylum"].str.upper().apply(lambda x: "TOTAL " + x + "S")

# # add in line skips
# totals = totals.set_index(lis.groupby(['Phylum']).tail(1).index + 0.1)
# empty_df = pd.DataFrame("", index=lis.groupby(['Phylum']).tail(1).index+0.2, columns=totals.columns)
# totals = pd.concat([totals, empty_df]).sort_index()

In [171]:
# lis = pd.concat([lis, totals]).sort_index().reset_index(drop=True)

In [172]:
# replace Nans with zero
lis = lis.fillna(0)

In [173]:
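# add back multiheader
# NOTE: padding is added only at the front, so this assumes every added column precedes the
# original ones; columns placed after Species (e.g. by the merge cell) shift the original headers
needed_cols = pd.Series(np.full(len(lis.columns) - len(original_headers), None)) 
original_headers = pd.concat([needed_cols, original_headers.to_series()], ignore_index=True)
lis.columns = pd.MultiIndex.from_arrays([original_headers, lis.columns])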
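The header restoration can be illustrated on a tiny frame. This is a sketch under assumptions: the station labels and the frame `df` are hypothetical, and the extra column is assumed to sit in front of the original ones so that front padding lines the two header levels up.

```python
import pandas as pd

# Hypothetical top-level labels saved from the raw file (one per original column).
original_headers = pd.Index(["Station A", "Station B"])

# Cleaned frame with one extra column ("Status") added in front of the original two.
df = pd.DataFrame([["Confirmed", 1.0, 2.0]], columns=["Status", "1/3/19", "1/7/19"])

# Pad the top level with None for each added column, then stack both levels.
pad = [None] * (len(df.columns) - len(original_headers))
top = pad + list(original_headers)
df.columns = pd.MultiIndex.from_arrays([top, df.columns])

print(df.columns.tolist())
```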

In [174]:
lis


Unnamed: 0_level_0,NaN,NaN,"(Note: S: surface water sample, B: bottom water sample",Unnamed: 1,A4S Note: Skeletonema bloom,B3S,C1S Note: low cell abundance,D3S,E1S Note: low cell abundance,F2S Note: low cell abundance,...,A4B Note: Skeletonema spp. Bloom,B3B Note: Skeletonema spp. Bloom,C1B Note: Low cell abundance,D3B Note: Very low cell abundance,E1B Note: Low cell abundance.2,F2B Note: some debris; very low cell abundance,H4B Note: Large amount of debris; low cell abundance,I2B Note: debris; very low cell abundance,J2B Note: Very low cell abundance .1,K2B Note: Very low cell abundance.4
1,Status,Phylum,Genus,Species,1/3/19,1/3/19,1/3/19,1/7/19,1/7/19,1/7/19,...,12/6/19,12/6/19,12/6/19,12/5/19,12/16/19,12/16/19,12/16/19,12/4/19,12/4/19,12/4/19
0,Confirmed,Dinoflagellate,Akashiwo,Akashiwo sanguinea,0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
1,Confirmed,Dinoflagellate,Dinophysis,Dinophysis acuminata,0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
2,Confirmed,Dinoflagellate,Dinophysis,Dinophysis miles,0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
3,Confirmed,Dinoflagellate,Dinophysis,Dinophysis norvegica,0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
4,Confirmed,Dinoflagellate,Gambierdiscus,Gambierdiscus toxicus,352,0.0,0.0,0.0,0.0,0.0,...,88.0,0.0,0.0,0.0,0.0,88.0,0.0,0.0,0.0,0
5,Confirmed,Dinoflagellate,Gonyaulax,Gonyaulax polygramma,0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
6,Confirmed,Dinoflagellate,Heterocapsa,Heterocapsa circularisquama,17600,17600.0,30800.0,8800.0,0.0,17600.0,...,13200.0,13200.0,2904.0,352.0,0.0,0.0,0.0,2904.0,704.0,0
7,Confirmed,Dinoflagellate,Noctiluca,Noctiluca scintillans,0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
8,Confirmed,Dinoflagellate,Prorocentrum,Prorocentrum lima,0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
9,Confirmed,Dinoflagellate,Prorocentrum,Prorocentrum micans,0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0


In [41]:
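# save dataframe to excel; the formatted timestamp avoids the spaces and colons of str(datetime.now()),
# which are not valid in filenames on some systems
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
lis.to_excel(f"outputs/{csv_name}-{timestamp}.xlsx")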