# Automated Multiple Reaction Monitoring (MRM)-profiling and Ozone Electrospray Ionizaton (OzESI)-MRM Informatics Platform for High-throughput Lipidomics


In this jupyter notebook you will automate the data analysis of the lipidome. This is a challenging problem to perform manually due to the diverse nature of lipids and the many potential isomers. In this notebook you will analyze mzML files containing data from lipid MRMs, with ozone off and ozone on. The goal is to identify possible double-bond locations in a lipid, in this case a TAG (triacylglycerols).

In [1]:
from IPython.display import Image

![title](Figures/agilent_lcms.png)

The examples shown here were run on an Agilent 6495C Triple Quadrupole LC/MS (example shown above) that has been connected to an ozone line (not shown in picture) for ozoneolysis of lipids.

![title](Figures/TAG_example.png)
Here is an example of a TAG. Notice how many possibilities there are for locations of one double-bond there could be and how convoluted the analysis can become! This image is obtained from LipidMaps.org

Import all necessary libraries

In [2]:
#Import all the necessary libraries
import pymzml
import csv
import os
import pandas as pd
import numpy as np
import math
from matplotlib import pyplot as plt
import re
import plotly.express as px
from collections import defaultdict

import plotly.io as pio
import json
import plotly.graph_objs as go
import matplotlib.colors as mcolors

import json
import ipywidgets as widgets
from IPython.display import display


No module named 'ms_deisotope._c.averagine' averagine
No module named 'ms_deisotope._c.scoring'
No module named 'ms_deisotope._c.deconvoluter_base'
No module named 'ms_deisotope._c.deconvoluter_base'
No module named 'ms_deisotope._c.deconvoluter_base'


In [3]:
###Importing Variables for all functions

data_base_name_location = 'lipid_database/Lipid_Database.xlsx'####Lipid database with Standard Carnitines
mzml_folder = './data_mzml/Brain_5xFAD/5_09_23/'
tolerance = 0.3
remove_std = True

# Example usage:
folder_name_to_save = 'Brain_5xFAD/'
file_name_to_save = 'Brain_5xFAD_3_tolerance'
save_data= True


###Think I should Reorient for 1 folder with all mzml, plots, csvs, pre edgeR

###Label_File
##Need to drop equisplash among others ##Drop Position Should be something Unique
label_file = "./Labels/5xFAD_brain/5_9_23.csv"
blank_name = "Blank_Blank_Blank_Blank_Blank" ##Maybe Standardize the position so that we can automatically take it - Or a second reading file
splash_name = "SPLASH_splash_splash_splash_splash"
clean_name = "IPA_clean_clean_clean_clean"
remove_list = ["SPLASH_splash_splash_splash_splash","IPA_clean_clean_clean_clean"]
Pre_edge_r_path = "Pre_EdgeR/5xFAD_brain/5_09_23/"

plots_2_save_path = "Plots/5xFAD_brain/5_09_23/"

In [29]:
json1 = {"Genotype": ["5xFAD"]}
json2 = {"Sample Name": ["blank_033123"]}
json_list_singles = [json1]

In [2]:
###Importing Variables for all functions

# data_base_name_location = 'lipid_database/Lipid_Database.xlsx'####Lipid database with Standard Carnitines
# mzml_folder = './data_mzml/Brain_5xFAD/5_09_23/'
# tolerance = 0.3
# remove_std = True

# # Example usage:
# folder_name_to_save = 'Brain_5xFAD/'
# file_name_to_save = 'Brain_5xFAD_3_tolerance'
# save_data= True


# ###Think I should Reorient for 1 folder with all mzml, plots, csvs, pre edgeR

# ###Label_File
# ##Need to drop equisplash among others ##Drop Position Should be something Unique
# label_file = "./Labels/5xFAD_brain/5_9_23.csv"
# blank_name = "Blank_Blank_Blank_Blank_Blank" ##Maybe Standardize the position so that we can automatically take it - Or a second reading file
# splash_name = "SPLASH_splash_splash_splash_splash"
# clean_name = "IPA_clean_clean_clean_clean"
# remove_list = ["SPLASH_splash_splash_splash_splash","IPA_clean_clean_clean_clean"]
# Pre_edge_r_path = "Pre_EdgeR/5xFAD_brain/5_09_23/"

# plots_2_save_path = "Plots/5xFAD_brain/5_09_23/"

In [4]:
##Labels DF and Labels List
labels_df = pd.read_csv(label_file)
labels_list = list(labels_df)
labels_list = labels_list +["Class","Lipid"]

In [5]:
##Json files

##Examples

# json1 = {"Cage": ["FAD131"]}
# json2 = {"Cage": ["DOD73"]}
# json3 = {"Cage": ["FAD131", "DOD73"]}
# json4 = {"Sex": ["Male"], "Genotype": ["5xFAD"], "Name": ["FAD131-5xFAD-M2liver"]}

json1 = {"Genotype": ["WT"]}
json2 = {"Genotype": ["5xFAD"]}
json3 = {"Genotype": ["WT"],"Sex": ["Male"]}
json4 = {"Genotype": ["WT"],"Sex": ["Female"]}
json5 = {"Genotype": ["5xFAD"],"Sex": ["Male"]}
json6 = {"Genotype": ["5xFAD"],"Sex": ["Female"]}

json7 = {"Genotype": ["WT"],"Brain Region": ["hippocampus"]}
json8 = {"Genotype": ["WT"],"Brain Region": ["cortex "]}
json7A = {"Genotype": ["WT"],"Brain Region": ["cerebellum"]}
json8A = {"Genotype": ["WT"],"Brain Region": ["diencephalon"]}

json9 = {"Genotype": ["5xFAD"],"Brain Region": ["hippocampus"]}
json10 = {"Genotype": ["5xFAD"],"Brain Region": ["cortex "]}
json11 = {"Genotype": ["5xFAD"],"Brain Region": ["cerebellum"]}
json12 = {"Genotype": ["5xFAD"],"Brain Region": ["diencephalon"]}

json13 = {"Genotype": ["WT"],"Brain Region": ["hippocampus"],"Sex": ["Male"]}
json14 = {"Genotype": ["WT"],"Brain Region": ["cortex "],"Sex": ["Male"]}
json15 = {"Genotype": ["WT"],"Brain Region": ["cerebellum"],"Sex": ["Male"]}
json16 = {"Genotype": ["WT"],"Brain Region": ["diencephalon"],"Sex": ["Male"]}
json17 = {"Genotype": ["WT"],"Brain Region": ["hippocampus"],"Sex": ["Female"]}
json18 = {"Genotype": ["WT"],"Brain Region": ["cortex "],"Sex": ["Female"]}
json19 = {"Genotype": ["WT"],"Brain Region": ["cerebellum"],"Sex": ["Female"]}
json20 = {"Genotype": ["WT"],"Brain Region": ["diencephalon"],"Sex": ["Female"]}


json21 = {"Genotype": ["5xFAD"],"Brain Region": ["hippocampus"],"Sex": ["Male"]}
json22 = {"Genotype": ["5xFAD"],"Brain Region": ["cortex "],"Sex": ["Male"]}
json23 = {"Genotype": ["5xFAD"],"Brain Region": ["cerebellum"],"Sex": ["Male"]}
json24 = {"Genotype": ["5xFAD"],"Brain Region": ["diencephalon"],"Sex": ["Male"]}
json25 = {"Genotype": ["5xFAD"],"Brain Region": ["hippocampus"],"Sex": ["Female"]}
json26 = {"Genotype": ["5xFAD"],"Brain Region": ["cortex "],"Sex": ["Female"]}
json27 = {"Genotype": ["5xFAD"],"Brain Region": ["cerebellum"],"Sex": ["Female"]}
json28 = {"Genotype": ["5xFAD"],"Brain Region": ["diencephalon"],"Sex": ["Female"]}


json_list_singles = [json1,json2,json3,json4,json5,json6,json7,json7A,json8A,json8,json9,json10,json11,json12,json13,json14,
                     json15,json16,json17,json18,json19,json20,json21,json22,json23,json25,json26,json27,json28]

In [6]:
###Comparisons
json_list_pairs = [[json2,json1],[json5,json3],[json6,json4],
                   [json9,json7],[json10,json8],[json11,json7A],[json12,json8A],
                   [json21,json13],[json22,json14],[json23,json15],[json24,json16],[json25,json17],[json26,json18],
                  [json27,json19],[json28,json20]]

##Maybe make this all possible combinations in json_list_singles

MAKE CLASSES FOR EACH LIPID

In [7]:

###All functions

#Function to read in MRM database
#Option to remove STDs from database##Not finished need option to use another database with no qualitative ACs


def read_mrm_list(filename,remove_std = True):
    mrm_list_new = pd.read_excel(filename, sheet_name=None)
    mrm_list_new = pd.concat(mrm_list_new, ignore_index=True)
    mrm_list_offical = mrm_list_new[['Compound Name', 'Parent Ion', 'Product Ion', 'Class']]
    # Add underscore to middle of columns names
    mrm_list_offical.columns = mrm_list_offical.columns.str.replace(' ', '_')
    # Round Parent Ion and Product Ion to 1 decimal place
    mrm_list_offical['Parent_Ion'] = np.round(mrm_list_offical['Parent_Ion'],1)
    mrm_list_offical['Product_Ion'] = np.round(mrm_list_offical['Product_Ion'],1)
    # Create transition column by combining Parent Ion and Product Ion with arrow between numbers
    mrm_list_offical['Transition'] = mrm_list_offical['Parent_Ion'].astype(str) + ' -> ' + mrm_list_offical['Product_Ion'].astype(str)
    # Change column compound name to lipid
    mrm_list_offical = mrm_list_offical.rename(columns={'Compound_Name': 'Lipid'})
    # Make a column called Class match lipid column to lipid types
    if remove_std == True:
        lipid_class = mrm_list_offical['Class'].unique()
        lipid_class_to_keep = ['PS','PG','CE','PC', 'DAG', 'PE', 'TAG', 'FA', 'Cer', 'CAR', 'PI','SM']
        mrm_list_offical = mrm_list_offical[mrm_list_offical['Class'].isin(lipid_class_to_keep)]
    return mrm_list_offical



def mzml_parser(file_name):
    df = pd.DataFrame(columns=['Lipid','Parent_Ion','Product_Ion','Intensity','Transition','Class','Sample_ID'])
    data_folder = os.listdir(file_name) #Path to the mzml files
    data_folder.sort()
    path_to_mzml_files = file_name
    for file in data_folder:
            if file.endswith('.mzML'):

                    run = pymzml.run.Reader(path_to_mzml_files+file, skip_chromatogram=False) #Load the mzml file into the run object



                    df_all = pd.DataFrame(columns=['Lipid','Parent_Ion','Product_Ion','Intensity','Transition','Class','Sample_ID']) #Create empty pandas dataframe to store the data

                    #create pandas dataframe to store the data with the columns Parent Ion, Product Ion, Intensity, Transition Lipid and Class
                   
                    q1_mz = 0 #Create empty variables to store the Q1 and Q3 m/z values
                    q3_mz = 0
                    count = 0 #Create a counter to keep track of the number of transitions
                    for spectrum in run:


                            for element in spectrum.ID.split(' '):
                                    intensity_store = np.array([])
                                    if 'Q1' in element:
                                            q1 = element.split('=')
                                            q1_mz= np.round((float(q1[1])),1)

                                    if 'Q3' in element:
                                
                                            q3 = element.split('=')
  
                                            q3_mz=np.round(float(q3[1]),1)


                                            for mz,intensity in spectrum.peaks(): #Get the m/z and intensity values from the spectrum
                                                    intensity_store = np.append(intensity_store,intensity) #Store the intensity values in an array



                                    if 'Q3' in element:
                                            # print(intensity_sum)
                                            intensity_sum = np.sum(intensity_store) #Sum the intensity values
                                            df_all.loc[count,'Parent_Ion'] = q1_mz #Store the Q1 and Q3 m/z values in the pandas dataframe
                                            df_all.loc[count,'Product_Ion'] = q3_mz
                                            #round the Q1 and Q3 m/z values to 1 decimal places
                                            df_all.loc[count,'Parent_Ion'] = np.round(df_all.loc[count,'Parent_Ion'],1)
                                            df_all.loc[count,'Product_Ion'] = np.round(df_all.loc[count,'Product_Ion'],1)
                                            df_all.loc[count,'Intensity'] = intensity_sum #Store the intensity values in the pandas dataframe
                                            df_all.loc[count,'Transition'] = str(q1_mz)+ ' -> '+ str(q3_mz) #Store the transition values in the pandas dataframe
                                            #add file name to Sample_ID column without the mzmL extension
                                            df_all.loc[count,'Sample_ID'] = file[:-5]
                                            count+=1

            #append df_all to df
            df = df.append(df_all, ignore_index=True)
    return df

# Function to create an ion dictionary from an MRM database DataFrame
def create_ion_dict(mrm_database):
    ion_dict = defaultdict(list)
    # Iterate through the rows of the MRM database DataFrame
    for index, row in mrm_database.iterrows():
        # Add a tuple with Lipid and Class to the ion dictionary using Parent_Ion and Product_Ion as the key
        ion_dict[(row['Parent_Ion'], row['Product_Ion'])].append((row['Lipid'], row['Class']))
    return ion_dict

# Function to check if the absolute difference between two values is within a given tolerance
def within_tolerance(a, b, tolerance=0.1):
    return abs(a - b) <= tolerance

# Function to match the ions in a DataFrame row with the ions in an ion dictionary
def match_ions(row, ion_dict, tolerance=0.1):
    ions = (row['Parent_Ion'], row['Product_Ion'])
    matched_lipids = []
    matched_classes = []

    # Iterate through the ion dictionary
    for key, value in ion_dict.items():
        # Check if both the Parent_Ion and Product_Ion values are within the specified tolerance
        if within_tolerance(ions[0], key[0], tolerance) and within_tolerance(ions[1], key[1], tolerance):
            # If within tolerance, extend the matched_lipids and matched_classes lists with the corresponding values
            matched_lipids.extend([match[0] for match in value])
            matched_classes.extend([match[1] for match in value])

    # If any matches were found, update the Lipid and Class columns in the row
    if matched_lipids and matched_classes:
        row['Lipid'] = ' | '.join(matched_lipids)
        row['Class'] = ' | '.join(matched_classes)

    return row

####Combined functions for Matching

def match_lipids_parser(mrm_database,df, tolerance=0.3):
    ion_dict = create_ion_dict(mrm_database)
    # Assuming you have the df DataFrame to apply the match_ions function
    df_matched = df.apply(lambda row: match_ions(row, ion_dict=ion_dict, tolerance=tolerance), axis=1)


    df_matched = df_matched.dropna()
    
    return df_matched


def save_dataframe(df, folder_name, file_name, max_attempts=5):
    folder_path = f'data_results/data/data_matching/{folder_name}'
    os.makedirs(folder_path, exist_ok=True)

    for i in range(max_attempts):
        file_path = f'{folder_path}/{file_name}.csv'
        if not os.path.isfile(file_path):
            df.to_csv(file_path, index=False)
            print(f"Saved DataFrame to {file_path}")
            break
    else:
        print(f"Failed to save DataFrame after {max_attempts} attempts.")
        return None

##Adds labels and method type
def add_labels(labels_df,matched_df):
    for _, group_row in labels_df.iterrows():
        for index, burda_row in matched_df.iterrows():
            if group_row['Sample Name'].lower() in burda_row['Sample_ID'].lower():
                # Add group_row data to the corresponding burda_row if the condition is met
                for col in labels_df.columns:
                    if col not in matched_df.columns:
                        matched_df[col] = None
                    matched_df.at[index, col] = group_row[col]

    matched_df['method_type'] = matched_df['Sample_ID'].apply(lambda x: x.split('_')[0])

    return matched_df
    
def subtract_blank(labels_df,matched_df,remove_list,blank_name):
    ###Removing Stuff
    for i in remove_list:
        matched_df = matched_df[matched_df['Sample Name'] != i]
    
    matched_df = matched_df.dropna()

    ###Subtracting the Blank
    blank_intensities_df = matched_df[matched_df['Sample Name'] == blank_name][['Lipid',"Transition", 'Intensity', 'method_type']]
    blank_intensities_df.columns = ['Lipid',"Transition", 'Blank_Intensity', 'method_type']


# Get the unique names in the Name column, excluding the blank_name
    unique_names = matched_df['Sample Name'].unique()
    print(unique_names)
    ###Keep this line to drop the blank
    # unique_names = unique_names[unique_names != blank_name] ###This drops the blank
    merged_df = pd.DataFrame()  
    for name in unique_names:
        temp_df = matched_df[matched_df['Sample Name'] == name]
        print(len(temp_df),"TEMP")
        numbers = np.array((temp_df["Intensity"] ))
        numbers1 = np.array((blank_intensities_df['Blank_Intensity'] ))
        print(len(temp_df),"TEMP")
        numbers2 = numbers - numbers1
        numbers2[numbers2<0] = 0
        # Merge the blank intensities DataFrame with the temporary DataFrame
        temp_df["Blank Subtraction"] = numbers2
        print(len(temp_df),"TEMP")
        merged_df = merged_df.append(temp_df)



    merged_df['Class'] = merged_df['Class'].replace({'TAG | TAG': 'TAG', 'FA | FA': 'FA'})

    return merged_df




def filter_dataframe(df, json_filter):
    for column, values in json_filter.items():
        df = df[df[column].isin(values)]
    return df


# Function to convert HEX colors to RGBA with specified alpha value, and then back to HEX
def hex_to_rgba_hex(hex_color, alpha=1):
    rgba_color = list(mcolors.hex2color(hex_color))
    rgba_color.append(alpha)
    return mcolors.to_hex(rgba_color, keep_alpha=True)

def json_to_string(json_dict):
    result = []
    for key, values in json_dict.items():
        values_str = ' '.join(values)
        result.append(f"{key}: {values_str}")
    return ' | '.join(result)





def make_pie_chart_no_replicates(merged_df,save_path,json_list_singles,labels_list,show_percentages=False):

    ##Maybe filter?
    for i in range(len(json_list_singles)):
        plot_title = 'Sum of Intensity by Class for' + json_to_string(json_list_singles[i])
        save_name = save_path+json_to_string(json_list_singles[i]).replace(" | ","__")
        filtered_df = filter_dataframe(merged_df,json_list_singles[i])
        filtered_df = filtered_df[filtered_df["Sample Name"] != blank_name]

        groupby_columns = labels_list

        # Aggregate functions to apply
        aggregations = {
            'Intensity': 'mean',
            'Blank Subtraction': 'mean'
        }

        # Group the DataFrame and apply the aggregation functions

        filtered_df_grouped = filtered_df.groupby(groupby_columns).agg(aggregations).reset_index()
        grouped_df_sum_avg = filtered_df_grouped.groupby('Class')['Blank Subtraction'].sum().reset_index()


        textinfo = 'label+percent' if show_percentages else "none"




        fig1 = px.pie(grouped_df_sum_avg, values='Blank Subtraction', names='Class', title=plot_title)
        fig1.update_traces(marker=dict(colors=[lipid_class_colors_alpha[lipid.upper()] for lipid in grouped_df_sum_avg['Class']]),
                        hovertemplate='%{label}: %{percent:.2%}', textinfo=textinfo)
        pio.write_html(fig1,  save_name + "Sum Pie.html")

        # Save the plot as plot.png
        pio.write_image(fig1, save_name + "Sum Pie.svg")
    return



def average_pie_chart_no_repeats(merged_df,save_path,json_list_singles,labels_list,show_percentages=False):

    for i in range(len(json_list_singles)):
        plot_title = 'Mean of Intensity by Class for' + json_to_string(json_list_singles[i])
        save_name = save_path+json_to_string(json_list_singles[i]).replace(" | ","__")
        filtered_df = filter_dataframe(merged_df,json_list_singles[i])
        filtered_df = filtered_df[filtered_df["Sample Name"] != blank_name]

        groupby_columns = labels_list

        # Aggregate functions to apply
        aggregations = {
            'Intensity': 'mean',
            'Blank Subtraction': 'mean'
        }

        # Group the DataFrame and apply the aggregation functions

        filtered_df_grouped = filtered_df.groupby(groupby_columns).agg(aggregations).reset_index()
        grouped_df_avg = filtered_df_grouped.groupby('Class')['Blank Subtraction'].mean().reset_index()


        textinfo = 'label+percent' if show_percentages else "none"




        fig1 = px.pie(grouped_df_avg, values='Blank Subtraction', names='Class', title=plot_title)
        fig1.update_traces(marker=dict(colors=[lipid_class_colors_alpha[lipid.upper()] for lipid in grouped_df_avg['Class']]),
                        hovertemplate='%{label}: %{percent:.2%}', textinfo=textinfo)
        pio.write_html(fig1,  save_name + "Average Pie.html")

        # Save the plot as plot.png
        pio.write_image(fig1, save_name + "Average Pie .svg")
    return


def make_bar_plot_comparisons(merged_df, save_path, json_list_pairs,labels_list):

#     merged_df = merged_df[merged_df["Name"] != blank_name]

    ###This plot makes relative bar plots of comparison for total Intensity and average intensity
    ###need to have a better group by column method list to add perhaps with the propper added names from first parser
    ##Should be Lipid, Class and other important info from labeling dataframe
    
    for i in range(len(json_list_pairs)):
        json1 = json_list_pairs[i][0]
        json2 = json_list_pairs[i][1]
        custom_name1 = json_to_string(json1)
        custom_name2 = json_to_string(json2)
        plot_tile = custom_name1+ " vs "+custom_name2
        save_name  = save_path+plot_tile.replace(" | ","__")

#         merged_df = merged_df[merged_df["Name"] != blank_name]

        groupby_columns = labels_list

        aggregations = {
            'Intensity': 'mean',
            'Blank Subtraction': 'mean'
        }
        merged_df1 = merged_df[merged_df["Sample Name"] != blank_name]
        merged_df1 = merged_df1.groupby(groupby_columns).agg(aggregations).reset_index()

        filtered_df1 = filter_dataframe(merged_df1, json1)
        filtered_df2 = filter_dataframe(merged_df1, json2)
        
#         filtered_df1 = filtered_df1[filtered_df1["Sample Name"] != blank_name]
#         filtered_df2 = filtered_df2[filtered_df2["Sample Name"] != blank_name]
        
        filtered_df1['Class'] = filtered_df1['Class'].str.upper()
        filtered_df2['Class'] = filtered_df2['Class'].str.upper()

        filtered_df1_avg = filtered_df1.groupby('Class')['Blank Subtraction'].mean().reset_index()
        filtered_df2_avg = filtered_df2.groupby('Class')['Blank Subtraction'].mean().reset_index()



        combined_max = filtered_df1_avg.set_index('Class')[['Blank Subtraction']].combine(filtered_df2_avg.set_index('Class')[['Blank Subtraction']], np.maximum)


        normalized_df1_avg = filtered_df1_avg.set_index('Class').divide(combined_max)
        normalized_df2_avg = filtered_df2_avg.set_index('Class').divide(combined_max)

        trace1 = go.Bar(
            x=normalized_df1_avg.index,
            y=normalized_df1_avg['Blank Subtraction'],
            name=custom_name1,
            marker=dict(color='red'),
        )

        trace2 = go.Bar(
            x=normalized_df2_avg.index,
            y=normalized_df2_avg['Blank Subtraction'],
            name=custom_name2,
            marker=dict(color='blue'),
        )

        layout = go.Layout(
            title=plot_tile,
            xaxis=dict(title="Class"),
            yaxis=dict(title="Normalized Mean Intensity Blank Subtraction"),
            barmode="group",
        )

        fig = go.Figure(data=[trace1, trace2], layout=layout)

        pio.write_image(fig, save_name + "bar.svg")
        pio.write_html(fig, save_name + "bar.html")
    return



lipid_classes = ["CAR", "CE", "Cer", "FA", "PC", "PE", "PG", "PI", "PS", "SM", "TAG",'DAG','TAG | DAG','DAG | CE','TAG | DAG | CE']
lipid_colors = ["#a6cee3", "#1f78b4", "#b2df8a", "#33a02c", "#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00", "#808080", "#cab2d6", "#6a3d9a",'#8dd3c7', '#ffffb3', '#bebada', '#fb8072', '#80b1d3']

lipid_class_colors = dict(zip([lipid.upper() for lipid in lipid_classes], lipid_colors))


# Update colors with 0.5 alpha
alpha = 0.5
transparent_colors = [hex_to_rgba_hex(color, alpha) for color in lipid_colors]
lipid_class_colors_alpha = dict(zip([lipid.upper() for lipid in lipid_classes], transparent_colors))
    

def full_parse(data_base_name_location,mzml_folder, folder_name_to_save, file_name_to_save,tolerance,remove_std = True,
               save_data=False):
    mrm_database = read_mrm_list(data_base_name_location,remove_std=remove_std)
    df = mzml_parser(mzml_folder)
    df_matched = match_lipids_parser(mrm_database,df, tolerance=tolerance)

    
    if save_data == True:
        
        save_dataframe(df_matched, folder_name_to_save, file_name_to_save)

    return df_matched





In [8]:
df_matched = full_parse(data_base_name_location,mzml_folder, folder_name_to_save, 
                        file_name_to_save,tolerance, remove_std = remove_std,save_data=save_data)

df_matched = add_labels(labels_df,df_matched)
merged_df = subtract_blank(labels_df,df_matched,remove_list,blank_name)


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mrm_list_offical['Parent_Ion'] = np.round(mrm_list_offical['Parent_Ion'],1)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mrm_list_offical['Product_Ion'] = np.round(mrm_list_offical['Product_Ion'],1)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mrm_list_offical['Transition'] = mrm_list_offical['

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_

  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)
  df = df.append(df_all, ignore_index=True)


Failed to save DataFrame after 5 attempts.
['Blank_Blank_Blank_Blank_Blank' 'DOD93_F4_Female_5xFAD_cerebellum'
 'DOD93_F4_Female_5xFAD_cortex ' 'DOD93_F4_Female_5xFAD_diencephalon'
 'DOD93_F4_Female_5xFAD_hippocampus' 'DOD94_F3_Female_WT_cerebellum'
 'DOD94_F3_Female_WT_cortex ' 'DOD94_F3_Female_WT_diencephalon'
 'DOD94_F3_Female_WT_hippocampus' 'DOD94_F4_Female_5xFAD_cerebellum'
 'DOD94_F4_Female_5xFAD_cortex ' 'DOD94_F4_Female_5xFAD_diencephalon'
 'DOD94_F4_Female_5xFAD_hippocampus' 'FAD184_F3_Female_WT_cerebellum'
 'FAD184_F3_Female_WT_cortex ' 'FAD184_F3_Female_WT_diencephalon'
 'FAD184_F3_Female_WT_hippocampus' 'FAD184_F4_Female_WT_cerebellum'
 'FAD184_F4_Female_WT_cortex ' 'FAD184_F4_Female_WT_diencephalon'
 'FAD184_F4_Female_WT_hippocampus' 'FAD185_M1_Male_5xFAD_cerebellum'
 'FAD185_M1_Male_5xFAD_cortex ' 'FAD185_M1_Male_5xFAD_diencephalon'
 'FAD185_M1_Male_5xFAD_hippocampus' 'FAD185_M3_Male_5xFAD_cerebellum'
 'FAD185_M3_Male_5xFAD_cortex ' 'FAD185_M3_Male_5xFAD_diencephalon'
 '

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  mer

3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  mer

3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  mer

3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  mer

3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP
3256 TEMP


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  merged_df = merged_df.append(temp_df)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp_df["Blank Subtraction"] = numbers2
  mer

In [16]:
def add_subclass_and_length(merged_df):
    merged_df = merged_df.reset_index(drop=True)
    merged_df.loc[merged_df['Class'] != 'Cer', 'Lipid'] = merged_df.loc[merged_df['Class'] != 'Cer', 'Lipid'].str.replace(',', '|')




    # merged_df = merged_df[merged_df['Class'] != 'PG']
    # merged_df = merged_df[merged_df['Class'] != 'PE']
    # merged_df = merged_df[merged_df['Class'] != 'PI']
    # merged_df = merged_df[merged_df['Class'] != 'PS']
    # merged_df = merged_df[merged_df['Class'] != 'SM']

    print(set(list(merged_df["Class"])))
    print(list(set(list(merged_df["Class"]))))

    subclasses = []
    chain_length = []
    saturation = []
    between_parenthese = []
    FA_Chain = []
    ##Would need to recheck for CE -O and all other classes if tolerance or MRMs are added
    ##Should add natural Multiples and replace the , with a |

    string = "DG(24:0)_C16:0"
    print(string[string.find("_C")+2:])

    print(len(merged_df))
    # exit()

    for i in range(len(merged_df)):
        jj = merged_df["Class"][i]
        xx = merged_df["Lipid"][i]
        subclass = ""

        if jj == "FA": ##Maybe add chain length and saturation 
            subclasses.append("FA")

            if " | " in xx:

                temp_lipid = xx.split("|")
                temp_string_length = ""
                temp_string_saturation = ""
                between_parents_temp = ""
                for i in temp_lipid:
                    if len(temp_string_length) >0:
                        temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                        between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                    else:

                        temp_string_length = i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = i[i.find(":")+1:i.find(")")]
                        between_parents_temp = i[i.find("("):i.find(")")+1]

                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(between_parents_temp)
                FA_Chain.append(between_parents_temp)


            else:


                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                FA_Chain.append(xx[xx.find("("):xx.find(")")+1])


        elif jj == "CAR":
            FA_Chain.append("None")
            subclasses.append("CAR")

            if "(" in xx:
                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])

            else:
                chain_length.append("0")
                saturation.append("0")    
                between_parenthese.append("None")


        ###What about O????
        ######
        elif jj == "CE":
            FA_Chain.append("None")

            if "O2" not in xx:
                subclasses.append("CE")

                # if "(" in xx:
                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])

            elif "O2" in xx:
                subclasses.append("CE-O")

                # if "(" in xx:
                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(";")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])



        elif jj == "Cer":
            FA_Chain.append("None")

            if "CerP" in xx:
                subclasses.append("CerP")
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                saturation.append("NA")

                chain_length.append("NA")

            elif "1-O-" in xx:
                subclasses.append("1-O-Cer")
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                saturation.append("NA")
                chain_length.append("NA")


            elif "omega-linoleoyloxy-GlcCer(" in xx:
                subclasses.append("omega-linoleoyloxy-GlcCer")
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                saturation.append("NA")
                chain_length.append("NA")


            elif "omega-linoleoyloxy-Cer" in xx:

                subclasses.append("omega-linoleoyloxy-Cer")
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                saturation.append("NA")

                chain_length.append("NA")

            else:
                subclasses.append("Cer")
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                saturation.append("NA")

                chain_length.append("NA")

        elif jj == "DAG" or jj== "DAG | CE" or jj=="TAG | DAG" or jj == "TAG | DAG | CE":
            if " | " in xx:

                temp_lipid = xx.split("|")
                temp_string_length = ""
                temp_string_saturation = ""
                between_parents_temp = ""
                FA_Chain_temp = ""
                temp_sub_class = ""
                for i in temp_lipid:
                    if "DG" in i:
                        if len(temp_string_length) >0:
                            temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                            temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                            between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                            FA_Chain_temp = FA_Chain_temp+" | "+ i[i.find("_C")+2:]
                            if "DG(dO" in i:
                                temp_sub_class = temp_sub_class +" | "+ "DG-DO"
                            elif "DG(" in i:
                                temp_sub_class = temp_sub_class +" | "+"DG"
                            elif "DG(P" in i:
                                temp_sub_class = temp_sub_class +" | "+"DG-P"
                            else:
                                temp_sub_class = temp_sub_class +" | "+ "DG-O"      
                        else:

                            temp_string_length = i[i.find("(")+1:i.find(":")]
                            temp_string_saturation = i[i.find(":")+1:i.find(")")]
                            between_parents_temp = i[i.find("("):i.find(")")+1]
                            FA_Chain_temp = i[i.find("_C")+2:]
                            if "DG(dO" in i:
                                temp_sub_class = "DG-DO"

                            elif "DG(P" in i:
                                temp_sub_class = "DG-P"
                            elif "DG(O" in i:
                                temp_sub_class = "DG-O" 
                            else:
                                temp_sub_class = "DG"

                    elif "CE" in i:                            
                        if len(temp_string_length) >0:
                            temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                            temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                            between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                            FA_Chain_temp = FA_Chain_temp+" | "+ "NA"
                        else:
                            temp_string_length =  i[i.find("(")+1:i.find(":")]
                            temp_string_saturation = i[i.find(":")+1:i.find(")")]
                            between_parents_temp = i[i.find("("):i.find(")")+1]
                            FA_Chain_temp =  "NA"

                    elif "TG" in i:
        ##Add TAG for this one after I do individual TAG
                        if len(temp_string_length) >0:
                            temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                            temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                            between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                            FA_Chain_temp = FA_Chain_temp+" | "+ i[i.find("_FA")+3:]
                            if "TG(O" in i:
                                temp_sub_class = temp_sub_class +" | "+ "TG(O"
                            else:
                                temp_sub_class = temp_sub_class +" | "+"TG"

                        else:

                            temp_string_length = i[i.find("(")+1:i.find(":")]
                            temp_string_saturation = i[i.find(":")+1:i.find(")")]
                            between_parents_temp = i[i.find("("):i.find(")")+1]
                            FA_Chain_temp = i[i.find("_FA")+3:]
                            if "TG(O" in i:
                                temp_sub_class = "TG(O"
                            else:
                                temp_sub_class = "TG"



                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(between_parents_temp)
                FA_Chain.append(FA_Chain_temp)
                subclasses.append(temp_sub_class)


            else:


                # exit()
                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                FA_Chain.append(xx[xx.find("_C")+2:])

                if "DG(dO" in xx:
                    temp_sub_class = "DG-DO"

                elif "DG(P" in xx:
                    temp_sub_class = "DG-P"
                elif "DG(O" in xx:
                    temp_sub_class = "DG-O" 
                else:
                    temp_sub_class = "DG"
                subclasses.append(temp_sub_class)


        elif jj == "TAG":

            if " | " in xx:

                temp_lipid = xx.split("|")
                temp_string_length = ""
                temp_string_saturation = ""
                between_parents_temp = ""
                FA_Chain_temp = ""
                temp_sub_class = ""
                for i in temp_lipid:
                    if len(temp_string_length) >0:
                        temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                        between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                        FA_Chain_temp = FA_Chain_temp+" | "+ i[i.find("_FA")+3:]
                        if "TG(O" in i:
                            temp_sub_class = temp_sub_class +" | "+ "TG(O"
                        else:
                            temp_sub_class = temp_sub_class +" | "+"TG"

                    else:

                        temp_string_length = i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = i[i.find(":")+1:i.find(")")]
                        between_parents_temp = i[i.find("("):i.find(")")+1]
                        FA_Chain_temp = i[i.find("_FA")+3:]
                        if "TG(O" in i:
                            temp_sub_class = "TG(O"
                        else:
                            temp_sub_class = "TG"

                subclasses.append(temp_sub_class)
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(between_parents_temp)
                FA_Chain.append(FA_Chain_temp)


            else:

                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                FA_Chain.append(xx[xx.find("_FA")+3:])

                if "TG(O" in xx:
                    temp_sub_class = "TG(O"
                else:
                    temp_sub_class = "TG"
                subclasses.append(temp_sub_class)

        elif jj == "PC":
            if " | " in xx:

                temp_lipid = xx.split("|")
                temp_string_length = ""
                temp_string_saturation = ""
                between_parents_temp = ""

                temp_sub_class = ""
                for i in temp_lipid:
                    if len(temp_string_length) >0:
                        temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                        between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                        if "LPC(O-" in i:
                            temp_sub_class =temp_sub_class +" | "+ "LPC(O-"
                        elif "LPC(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPC(P-"    
                        elif "LPC" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPC"    
                        elif "PC(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PC(P-"
                        elif "PC(O-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PC(O-"      
                        else:
                            temp_sub_class = temp_sub_class +" | "+"PC"

                    else:

                        temp_string_length = i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = i[i.find(":")+1:i.find(")")]
                        between_parents_temp = i[i.find("("):i.find(")")+1]

                        if "LPC(O-" in i:
                            temp_sub_class = "LPC(O-"
                        elif "LPC(P-" in i:
                            temp_sub_class = "LPC(P-"    
                        elif "LPC" in i:
                            temp_sub_class = "LPC"    
                        elif "PC(P-" in i:
                            temp_sub_class = "PC(P-"
                        elif "PC(O-" in i:
                            temp_sub_class = "PC(O-"      
                        else:
                            temp_sub_class = "PC"

                subclasses.append(temp_sub_class)
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(between_parents_temp)
                FA_Chain.append("NA")


            else:

                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                FA_Chain.append("NA")
                if "LPC(O-" in xx:
                    temp_sub_class = "LPC(O-"
                elif "LPC(P-" in xx:
                    temp_sub_class = "LPC(P-"    
                elif "LPC" in xx:
                    temp_sub_class = "LPC"    
                elif "PC(P-" in xx:
                    temp_sub_class = "PC(P-"
                elif "PC(O-" in xx:
                    temp_sub_class = "PC(O-"      
                else:
                    temp_sub_class = "PC"
                subclasses.append(temp_sub_class)

        elif jj == "PE":
            if " | " in xx:

                temp_lipid = xx.split("|")
                temp_string_length = ""
                temp_string_saturation = ""
                between_parents_temp = ""

                temp_sub_class = ""
                for i in temp_lipid:
                    if len(temp_string_length) >0:
                        temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                        between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                        if "LPE(O-" in i:
                            temp_sub_class =temp_sub_class +" | "+ "LPE(O-"
                        elif "LPE(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPE(P-"    
                        elif "LPE" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPE"    
                        elif "PE(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PE(P-"
                        elif "PE(O-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PE(O-"      
                        else:
                            temp_sub_class = temp_sub_class +" | "+"PE"

                    else:

                        temp_string_length = i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = i[i.find(":")+1:i.find(")")]
                        between_parents_temp = i[i.find("("):i.find(")")+1]

                        if "LPE(O-" in i:
                            temp_sub_class = "LPE(O-"
                        elif "LPE(P-" in i:
                            temp_sub_class = "LPE(P-"    
                        elif "LPE" in i:
                            temp_sub_class = "LPE"    
                        elif "PE(P-" in i:
                            temp_sub_class = "PE(P-"
                        elif "PE(O-" in i:
                            temp_sub_class = "PE(O-"      
                        else:
                            temp_sub_class = "PE"

                subclasses.append(temp_sub_class)
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(between_parents_temp)
                FA_Chain.append("NA")


            else:

                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                FA_Chain.append("NA")
                if "LPE(O-" in xx:
                    temp_sub_class = "LPE(O-"
                elif "LPE(P-" in xx:
                    temp_sub_class = "LPE(P-"    
                elif "LPE" in xx:
                    temp_sub_class = "LPE"    
                elif "PE(P-" in xx:
                    temp_sub_class = "PE(P-"
                elif "PE(O-" in xx:
                    temp_sub_class = "PE(O-"      
                else:
                    temp_sub_class = "PE"
                subclasses.append(temp_sub_class)

        elif jj == "PG":
            if " | " in xx:

                temp_lipid = xx.split("|")
                temp_string_length = ""
                temp_string_saturation = ""
                between_parents_temp = ""

                temp_sub_class = ""
                for i in temp_lipid:
                    if len(temp_string_length) >0:
                        temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                        between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                        if "LPG(O-" in i:
                            temp_sub_class =temp_sub_class +" | "+ "LPG(O-"
                        elif "LPG(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPG(P-"    
                        elif "LPG" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPG"    
                        elif "PG(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PG(P-"
                        elif "PG(O-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PG(O-"      
                        else:
                            temp_sub_class = temp_sub_class +" | "+"PG"

                    else:

                        temp_string_length = i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = i[i.find(":")+1:i.find(")")]
                        between_parents_temp = i[i.find("("):i.find(")")+1]

                        if "LPG(O-" in i:
                            temp_sub_class = "LPG(O-"
                        elif "LPG(P-" in i:
                            temp_sub_class = "LPG(P-"    
                        elif "LPG" in i:
                            temp_sub_class = "LPG"    
                        elif "PG(P-" in i:
                            temp_sub_class = "PG(P-"
                        elif "PG(O-" in i:
                            temp_sub_class = "PG(O-"      
                        else:
                            temp_sub_class = "PG"

                subclasses.append(temp_sub_class)
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(between_parents_temp)
                FA_Chain.append("NA")


            else:

                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                FA_Chain.append("NA")
                if "LPG(O-" in xx:
                    temp_sub_class = "LPG(O-"
                elif "LPG(P-" in xx:
                    temp_sub_class = "LPG(P-"    
                elif "LPG" in xx:
                    temp_sub_class = "LPG"    
                elif "PG(P-" in xx:
                    temp_sub_class = "PG(P-"
                elif "PG(O-" in xx:
                    temp_sub_class = "PG(O-"      
                else:
                    temp_sub_class = "PG"
                subclasses.append(temp_sub_class)

        elif jj == "PI":
            if " | " in xx:

                temp_lipid = xx.split("|")
                temp_string_length = ""
                temp_string_saturation = ""
                between_parents_temp = ""

                temp_sub_class = ""
                for i in temp_lipid:
                    if len(temp_string_length) >0:
                        temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                        between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                        if "LPI(O-" in i:
                            temp_sub_class =temp_sub_class +" | "+ "LPI(O-"
                        elif "LPI(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPI(P-"    
                        elif "LPI" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPI"    
                        elif "PI(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PI(P-"
                        elif "PI(O-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PI(O-"      
                        else:
                            temp_sub_class = temp_sub_class +" | "+"PI"

                    else:

                        temp_string_length = i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = i[i.find(":")+1:i.find(")")]
                        between_parents_temp = i[i.find("("):i.find(")")+1]

                        if "LPI(O-" in i:
                            temp_sub_class = "LPI(O-"
                        elif "LPI(P-" in i:
                            temp_sub_class = "LPI(P-"    
                        elif "LPI" in i:
                            temp_sub_class = "LPI"    
                        elif "PI(P-" in i:
                            temp_sub_class = "PI(P-"
                        elif "PI(O-" in i:
                            temp_sub_class = "PI(O-"      
                        else:
                            temp_sub_class = "PI"

                subclasses.append(temp_sub_class)
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(between_parents_temp)
                FA_Chain.append("NA")


            else:

                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                FA_Chain.append("NA")
                if "LPI(O-" in xx:
                    temp_sub_class = "LPI(O-"
                elif "LPI(P-" in xx:
                    temp_sub_class = "LPI(P-"    
                elif "LPI" in xx:
                    temp_sub_class = "LPI"    
                elif "PI(P-" in xx:
                    temp_sub_class = "PI(P-"
                elif "PI(O-" in xx:
                    temp_sub_class = "PI(O-"      
                else:
                    temp_sub_class = "PI"
                subclasses.append(temp_sub_class)


    #####
        elif jj == "PS":
            if " | " in xx:

                temp_lipid = xx.split("|")
                temp_string_length = ""
                temp_string_saturation = ""
                between_parents_temp = ""

                temp_sub_class = ""
                for i in temp_lipid:
                    if len(temp_string_length) >0:
                        temp_string_length = temp_string_length+" | "+ i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = temp_string_saturation+" | "+ i[i.find(":")+1:i.find(")")]
                        between_parents_temp = between_parents_temp +" | "+ i[i.find("("):i.find(")")+1]
                        if "LPS(O-" in i:
                            temp_sub_class =temp_sub_class +" | "+ "LPS(O-"
                        elif "LPS(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPS(P-"    
                        elif "LPS" in i:
                            temp_sub_class = temp_sub_class +" | "+"LPS"    
                        elif "PS(P-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PS(P-"
                        elif "PS(O-" in i:
                            temp_sub_class = temp_sub_class +" | "+"PS(O-"      
                        else:
                            temp_sub_class = temp_sub_class +" | "+"PS"

                    else:

                        temp_string_length = i[i.find("(")+1:i.find(":")]
                        temp_string_saturation = i[i.find(":")+1:i.find(")")]
                        between_parents_temp = i[i.find("("):i.find(")")+1]

                        if "LPS(O-" in i:
                            temp_sub_class = "LPS(O-"
                        elif "LPS(P-" in i:
                            temp_sub_class = "LPS(P-"    
                        elif "LPS" in i:
                            temp_sub_class = "LPS"    
                        elif "PS(P-" in i:
                            temp_sub_class = "PS(P-"
                        elif "PS(O-" in i:
                            temp_sub_class = "PS(O-"      
                        else:
                            temp_sub_class = "PS"

                subclasses.append(temp_sub_class)
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(between_parents_temp)
                FA_Chain.append("NA")


            else:

                temp_string_length = xx[xx.find("(")+1:xx.find(":")]
                temp_string_saturation = xx[xx.find(":")+1:xx.find(")")]
                chain_length.append(temp_string_length)
                saturation.append(temp_string_saturation)
                between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
                FA_Chain.append("NA")
                if "LPS(O-" in xx:
                    temp_sub_class = "LPS(O-"
                elif "LPS(P-" in xx:
                    temp_sub_class = "LPI(P-"    
                elif "LPS" in xx:
                    temp_sub_class = "LPS"    
                elif "PS(P-" in xx:
                    temp_sub_class = "PS(P-"
                elif "PS(O-" in xx:
                    temp_sub_class = "PS(O-"      
                else:
                    temp_sub_class = "PS"
                subclasses.append(temp_sub_class)



        elif jj == "SM":
            FA_Chain.append("None")

            subclasses.append("SM")
            between_parenthese.append(xx[xx.find("("):xx.find(")")+1])
            saturation.append("NA")

            chain_length.append("NA")

    merged_df["subclasses"] = subclasses
    merged_df["chain_length"] = subclasses
    merged_df["saturation"] = saturation
    merged_df["between_parenthese"] = between_parenthese
    merged_df["FA_Chain"] = FA_Chain

    return merged_df


In [17]:
merged_df = add_subclass_and_length(merged_df)

{'PE', 'CE', 'DAG | CE', 'TAG | DAG | CE', 'PC', 'DAG', 'TAG | DAG', 'PS', 'TAG', 'FA', 'PI', 'SM', 'CAR', 'Cer', 'PG'}
['PE', 'CE', 'DAG | CE', 'TAG | DAG | CE', 'PC', 'DAG', 'TAG | DAG', 'PS', 'TAG', 'FA', 'PI', 'SM', 'CAR', 'Cer', 'PG']
16:0
146520


In [18]:
list(merged_df)

['Lipid',
 'Parent_Ion',
 'Product_Ion',
 'Intensity',
 'Transition',
 'Class',
 'Sample_ID',
 'Sample Name',
 'Cage',
 'Position',
 'Dilution Factor',
 'Sex2',
 'Sex',
 'Genotype',
 'Brain Region',
 'method_type',
 'Blank Subtraction',
 'subclasses',
 'chain_length',
 'saturation',
 'between_parenthese',
 'FA_Chain']

In [15]:
merged_df['Class'][7]

'CAR'

In [69]:
json_list_pairs

[[{'Genotype': ['WT'], 'Brain Region': [], 'Sex': [], 'Gender': []},
  {'Genotype': ['5xFAD'], 'Brain Region': [], 'Sex': [], 'Gender': []}],
 [{'Genotype': ['5xFAD'],
   'Brain Region': ['diencephalon'],
   'Sex': ['Male'],
   'Gender': ['Female']},
  {'Genotype': [],
   'Brain Region': ['cortex'],
   'Sex': ['Male'],
   'Gender': ['Male']}],
 [{'Genotype': ['WT', '5xFAD'], 'Brain Region': [], 'Sex': [], 'Gender': []},
  {'Genotype': [],
   'Brain Region': ['diencephalon', 'hippocampus', 'cortex'],
   'Sex': [],
   'Gender': []}]]

In [80]:
# List to hold pairs of JSON objects
json_list_pairs = []

def remove_empty_entries(json_list_pairs):
    cleaned_list_pairs = [
        [
            {key: value for key, value in pair_dict.items() if value} for pair_dict in pair
        ] for pair in json_list_pairs
    ]
    return cleaned_list_pairs

# Initialize widgets_dict1 and widgets_dict2 to be filled later
widgets_dict1 = {}
widgets_dict2 = {}

# Create a function that displays the pair widgets
def display_pair_widgets():
    global widgets_dict1, widgets_dict2
    widgets_dict1 = {key: widgets.SelectMultiple(options=value, description=key) for key, value in main_json.items()}
    widgets_dict2 = {key: widgets.SelectMultiple(options=value, description=key) for key, value in main_json.items()}
    
    for key in main_json.keys():
        display(widgets.HBox([widgets_dict1[key], widgets_dict2[key]]))

display_pair_widgets()

# Define what to do on 'Generate JSON files' button click
def on_generate_clicked(b):
    # Build new_json based on the values selected in the widgets
    new_json1 = {key: list(widget.value) for key, widget in widgets_dict1.items()}
    new_json2 = {key: list(widget.value) for key, widget in widgets_dict2.items()}

    # Add new JSON objects to the list
    pair = [new_json1, new_json2]
    json_list_pairs.append(pair)
    
    # Print the new JSON objects
    print(json.dumps(new_json1, indent=2))
    print(json.dumps(new_json2, indent=2))

def on_add_more_clicked(b):
    # Build new_json based on the values selected in the widgets
    new_json1 = {key: list(widget.value) for key, widget in widgets_dict1.items()}
    new_json2 = {key: list(widget.value) for key, widget in widgets_dict2.items()}

    # Add new JSON objects to the list
    pair = [new_json1, new_json2]
    json_list_pairs.append(pair)
    
    # Clear current selection
    for widget in widgets_dict1.values():
        widget.value = []
    for widget in widgets_dict2.values():
        widget.value = []

# Create the buttons
generate_button = widgets.Button(description='Finish')
generate_button.on_click(on_generate_clicked)

add_more_button = widgets.Button(description='Add more JSON pairs')
add_more_button.on_click(on_add_more_clicked)

# Display the buttons
display(widgets.HBox([generate_button, add_more_button]))

HBox(children=(SelectMultiple(description='Genotype', options=('WT', '5xFAD'), value=()), SelectMultiple(descr…

HBox(children=(SelectMultiple(description='Brain Region', options=('diencephalon', 'hippocampus', 'cortex'), v…

HBox(children=(SelectMultiple(description='Sex', options=('Female', 'Male'), value=()), SelectMultiple(descrip…

HBox(children=(SelectMultiple(description='Gender', options=('Female', 'Male'), value=()), SelectMultiple(desc…

HBox(children=(Button(description='Finish', style=ButtonStyle()), Button(description='Add more JSON pairs', st…

{
  "Genotype": [
    "WT"
  ],
  "Brain Region": [],
  "Sex": [
    "Female"
  ],
  "Gender": []
}
{
  "Genotype": [
    "5xFAD"
  ],
  "Brain Region": [],
  "Sex": [
    "Female"
  ],
  "Gender": []
}


In [82]:
json_list_pairs = remove_empty_entries(json_list_pairs)

json_list_pairs

[[{'Genotype': ['5xFAD']}, {'Genotype': ['WT']}],
 [{'Genotype': ['WT'], 'Sex': ['Female']},
  {'Genotype': ['5xFAD'], 'Sex': ['Female']}]]

In [84]:
def get_unique_json_objects(json_list_pairs):
    json_set = set()
    for pair in json_list_pairs:
        for json_obj in pair:
            json_set.add(json.dumps(json_obj))
    
    json_list_singles = [json.loads(json_str) for json_str in json_set]
    return json_list_singles

In [85]:
json_list_singles = get_unique_json_objects(json_list_pairs)

In [86]:
json_list_singles

[{'Genotype': ['5xFAD']},
 {'Genotype': ['WT'], 'Sex': ['Female']},
 {'Genotype': ['WT']},
 {'Genotype': ['5xFAD'], 'Sex': ['Female']}]

In [9]:
len(df_matched) ##20405 df matched .3 tolerance

20408

In [10]:
len(merged_df)

20382

In [37]:
len(merged_df)/len(merged_df["Sample Name"].unique())

3397.0

In [24]:
groupby_columns = labels_list

# Aggregate functions to apply
aggregations = {
    'Intensity': 'mean',
    'Blank Subtraction': 'mean'
}

# Group the DataFrame and apply the aggregation functions

filtered_df_grouped = merged_df.groupby(groupby_columns).agg(aggregations).reset_index()

In [38]:
len(set(merged_df["Transition"]))

3232

In [27]:
len(set(filtered_df_grouped["Lipid"]))

3161

In [26]:
len(set(filtered_df_grouped["Lipid"]))

3161

In [30]:
# Count the unique instances of Class column
filtered_df_grouped = filter_dataframe(filtered_df_grouped,json2)
class_counts = filtered_df_grouped['Class'].value_counts()

# Save the counts to a CSV file
class_counts.to_csv('class_counts.csv')

# Convert Series to DataFrame for Plotly
df_class_counts = class_counts.reset_index()
df_class_counts.columns = ['Class', 'Count']

# Create a pie chart
fig = px.pie(df_class_counts, values='Count', names='Class', color='Class', 
             color_discrete_map=lipid_class_colors_alpha)
fig.update_layout(title_text='Count of Lipids')
pio.write_html(fig, "Relative_Count of Lipids.html")
fig.show()

In [32]:
lipid_classes = ["AC", "CE", "CER", "FFA", "PC", "PE", "PG", "PI", "PS", "SM", "TAG",'DAG','TAG | DAG','DAG | CE','TAG | DAG | CE']
lipid_colors = ["#a6cee3", "#1f78b4", "#b2df8a", "#33a02c", "#fb9a99", "#e31a1c", "#fdbf6f", "#ff7f00", "#808080", "#cab2d6", "#6a3d9a",'#8dd3c7', '#ffffb3', '#bebada', '#fb8072', '#80b1d3']

lipid_class_colors = dict(zip([lipid.upper() for lipid in lipid_classes], lipid_colors))


# Update colors with 0.5 alpha
alpha = 0.5
transparent_colors = [hex_to_rgba_hex(color, alpha) for color in lipid_colors]
lipid_class_colors_alpha = dict(zip([lipid.upper() for lipid in lipid_classes], transparent_colors))
    

In [36]:
class_counts = df_new['type'].value_counts()

# Save the counts to a CSV file
class_counts.to_csv('class_counts_old.csv')

# Convert Series to DataFrame for Plotly
df_class_counts = class_counts.reset_index()
df_class_counts.columns = ['type', 'Count']

# Create a pie chart
fig = px.pie(df_class_counts, values='Count', names='type', color='type', 
             color_discrete_map=lipid_class_colors_alpha)
fig.update_layout(title_text='Count of Lipids Old')
pio.write_html(fig, "Relative_Count of Lipids Old.html")
fig.show()

In [18]:
filtered_df_grouped.to_csv("Counting stuff.csv",index=False)

In [17]:
3232-3161

71

In [11]:
make_pie_chart_no_replicates(merged_df,plots_2_save_path,json_list_singles,labels_list)

In [12]:
average_pie_chart_no_repeats(merged_df,plots_2_save_path,json_list_singles,labels_list)

In [25]:
make_bar_plot_comparisons(merged_df, plots_2_save_path, json_list_pairs,labels_list)



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



In [26]:

def prep_edge_R(merged_df,json_list_pairs,Pre_edge_r_path,blank_name,labels_list):
    for i in range(len(json_list_pairs)):
        json1 = json_list_pairs[i][0]
        json2 = json_list_pairs[i][1]

        json_blank = {"Sample Name":[blank_name]}

        filtered_df1 = filter_dataframe(merged_df,json1)
        filtered_df2 = filter_dataframe(merged_df,json2)
        filtered_blank = filter_dataframe(merged_df,json_blank)



        filtered_df1 = filtered_df1.groupby(labels_list)['Intensity'].mean().reset_index()
        filtered_df2 = filtered_df2.groupby(labels_list)['Intensity'].mean().reset_index()
        filtered_blank = filtered_blank.groupby(labels_list)['Intensity'].mean().reset_index()



        reformatted_df1 = filtered_df1.pivot(index=['Lipid', 'Class'], columns='Sample Name', values='Intensity').reset_index()
        reformatted_df2 = filtered_df2.pivot(index=['Lipid', 'Class'], columns='Sample Name', values='Intensity').reset_index()
        reformatted_blank = filtered_blank.pivot(index=['Lipid', 'Class'], columns='Sample Name', values='Intensity').reset_index()

        num_value_columns_df1 = reformatted_df1.shape[1] - 2  # subtract 2 for 'Lipid' and 'Class'
        num_value_columns_df2 = reformatted_df2.shape[1] - 2  # subtract 2 for 'Lipid' and 'Class'



        combined_df = reformatted_df1.merge(reformatted_df2, on=['Lipid', 'Class'], how='inner')

        combined_df = combined_df.merge(reformatted_blank, on=['Lipid', 'Class'], how='inner')




        ##Aquiring Names ##Add an extra string option

        title1 = json_to_string(json1)
        title2 = json_to_string(json2)

        title = title1 +" vs "+title2
        title = title.replace(" | ","__")
        length1 = num_value_columns_df1
        length2 = num_value_columns_df2

        combined_df["Title1"] = title1
        combined_df["Title2"] = title2
        combined_df["Title"] = title
        combined_df["length1"] = length1
        combined_df["length2"] = length2
        combined_df["Blank_name"] = blank_name




        combined_df.to_csv(Pre_edge_r_path+title+".csv",index=False)
    return combined_df



In [27]:
combined_df = prep_edge_R(merged_df,json_list_pairs,Pre_edge_r_path,blank_name,labels_list)

In [None]:
###Adding Subclasses

In [29]:
list_of_lipids = list(merged_df["Lipid"])

In [30]:
list_of_lipids

['CAR_QUAL',
 'CAR',
 'CAR(2:0)_QUAL',
 'CAR(2:0)',
 'CAR(3:1)_QUAL',
 'CAR(3:1)',
 'CAR(3:0)_QUAL',
 'CAR(3:0)',
 'CAR(4:1)_QUAL',
 'CAR(4:1)',
 'CAR(4:0)_QUAL',
 'CAR(4:0)',
 'CAR(5:1)_QUAL',
 'CAR(5:1)',
 'CAR(5:0)_QUAL',
 'CAR(5:0)',
 'CAR(6:1)_QUAL',
 'CAR(6:1)',
 'CAR(6:0)_QUAL',
 'CAR(6:0)',
 'CAR(7:0)_QUAL',
 'CAR(7:0)',
 'CAR(8:1)_QUAL',
 'CAR(8:1)',
 'CAR(8:0)_QUAL',
 'CAR(8:0)',
 'CAR(9:0)_QUAL',
 'CAR(9:0)',
 'CAR(10:3)_QUAL',
 'CAR(10:3)',
 'CAR(10:2)_QUAL',
 'CAR(10:2)',
 'CAR(10:1)_QUAL',
 'CAR(10:1)',
 'CAR(10:0)_QUAL',
 'CAR(10:0)',
 'CAR(11:0)_QUAL',
 'CAR(11:0)',
 'CAR(12:0)_QUAL',
 'CAR(12:0)',
 'CAR(14:2)_QUAL',
 'CAR(14:2)',
 'CAR(14:1)_QUAL',
 'CAR(14:1)',
 'CAR(14:0)_QUAL',
 'CAR(14:0)',
 'CAR(16:2)_QUAL',
 'CAR(16:2)',
 'CAR(16:1)_QUAL',
 'CAR(16:1)',
 'CAR(16:0)_QUAL',
 'CAR(16:0)',
 'CAR(17:0)_QUAL',
 'CAR(17:0)',
 'CAR(18:4)_QUAL',
 'CAR(18:4)',
 'CAR(18:3)_QUAL',
 'CAR(18:3)',
 'CAR(18:2)_QUAL',
 'CAR(18:2)',
 'CAR(18:1)_QUAL',
 'CAR(18:1)',
 'CAR(18:0)_QUA

In [None]:
###Below here is not cleaned up

In [22]:
#import visualization libraries
import umap
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

  from .autonotebook import tqdm as notebook_tqdm


In [25]:
#Plotting functions

def plot_transition_vs_intensity(df):
    fig = px.bar(df, x="Transition", y="Intensity", color="Lipid", hover_data=['Lipid', 'Class'])
    fig.show()

def plot_class_vs_intensity_bar(df):
    fig = px.bar(df, x="Class", y="Intensity", color="Class", hover_data=['Lipid', 'Class'])
    fig.show()

def plot_class_vs_intensity_pie(df):
    fig = px.pie(df, values='Intensity', names='Class', title='Lipid Class')
    fig.show()

def plot_intensity_heatmap(df):
    fig = go.Figure(data=go.Heatmap(
        z=df['Intensity'],
        x=df['Lipid'],
        y=df['Class'],
        colorscale='Viridis'))
    fig.show()

# Example usage:
# Assuming you have the df_matching DataFrame
plot_transition_vs_intensity(df_matched)
plot_class_vs_intensity_bar(df_matched)
plot_class_vs_intensity_pie(df_matched)
plot_intensity_heatmap(df_matched)
