# Jupyter Notebook UI to analyze baseline data from tap-habituation experiments!

### Beginner Essentials:
1. Shift-Enter to run each cell. After you run, you should see an output "done step #". If not, an error has occured
2. When inputting your own code/revising the code, make sure you close all your quotation marks '' and brackets (), [], {}.
3. Don't leave any commas (,) hanging! (make sure an object always follows a comma. If there is nothing after a comma, remove the comma!
4. Learning to code? Each line of code is annotated to help you understand how this code works!

**Run all cells/steps sequentially, even the ones that do not need input**

## Step-by-Step Analysis of the Jupyter Notebook

| Step | Purpose | Key Actions |
|------|---------|-------------|
| **1. Import Packages** | Load required Python libraries for data analysis | Imports `pandas`, `numpy`, `matplotlib`, etc. | 
| **2. Pick Filepath** | Select the folder containing experimental data files (.dat or .trv) | Input required: Uses `FileChooser` widget to select directory | 
| **3. User-Defined Variables** | Set experiment parameters | Defines: `bin`  | 
| **4. Construct Filelist** | Find all files in selected folder | Sets working directory and scans `folder_path` using; Displays no. of `.trv` files found in the folder |
| **5. Process Data Function** | Define functions to load, clean, and analyze raw data | - `ProcessData()`: Loads files, calculates metrics (reversal probability, speed) |
| **6.1 Process Data** | Apply processing to all strains| - Checks `filelist` for unique strain names (e.g., "N2") <br>- Runs `ProcessData()` for each strain | 
| **7. Grouping & Naming** | Combine data from all strains | - Concatenates DataFrames<br>- Assigns dataset names (e.g., "N2") | 
| **Output CSV** | Save processed data | Exports `Baseline_data` to CSV |

### Key Notes:
- User Input Required: Steps 2 (file selection), 3 (parameters), 6.1 (strain verification)
- Output: Final CSV contains all analyzed tap response data

# 1. Importing Packages Required (No input required, just run)

In [188]:
import pandas as pd #<- package used to import and organize data
import numpy as np #<- package used to import and organize data
import seaborn as sns #<- package used to plot graphs
from matplotlib import pyplot as plt #<- package used to plot graphs
import os #<- package used to work with system filepaths
from ipywidgets import widgets #<- widget tool to generate button
from IPython.display import display #<- displays button
from ipyfilechooser import FileChooser
# from tkinter import Tk, filedialog #<- Tkinter is a GUI package
from tqdm.notebook import tqdm
import warnings
# import dask.dataframe as dd
# import pingouin as pg
pd.set_option('display.max_columns', 50)
print("done step 1")

done step 1


In [189]:
warnings.filterwarnings("ignore", category=RuntimeWarning)

## 2. Pick filepath (just run and click button from output)

Run the following cell and click the button 'Select Folder' to pick a filepath.

**Important: Later on, this script uses the total file path for each file to import and group data. That means if your folder has whatever your strain is named, the script will not work.**

(ex. if your folder has "N2" in it this script sees all files inside this folder as having the "N2" search key)

**An easy fix is to just rename your folder to something else (make your strains lower-case, or just have the date)**

In [100]:
starting_directory = '/Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025/'
chooser = FileChooser(starting_directory)
display(chooser)

FileChooser(path='/Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025', filename='', title='', show_hidden=Fals…

In [101]:
print(chooser.selected_path)
folder_path=chooser.selected_path

/Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025


In [102]:
screens = ['PD_Screen', 'ASD_Screen', 'G-Proteins_Screen', 'Glia_Genes_Screen', 'Neuron_Genes_Screen', 'Miscellaneous', 'ASD_WGS_Screen']

screen_chooser = widgets.Select(options=screens, value=screens[0], description='Screen:')
display(screen_chooser)

Select(description='Screen:', options=('PD_Screen', 'ASD_Screen', 'G-Proteins_Screen', 'Glia_Genes_Screen', 'N…

In [103]:
Screen=screen_chooser.value
print(Screen)

Glia_Genes_Screen


# 3. User Defined Variables (Add input here)

Here, we add some constants to help you blaze through this code.

3.1: Setting time bins


3.2: Setting view range for your graph
- Top, bottom = y axis view range
- left, right = x axis view range



In [104]:
# Setting 1s Bins
bins = np.linspace(0,1200,1201) # np.linspace(start, end, steps in between)
print(bins)


print("done step 3")

[0.000e+00 1.000e+00 2.000e+00 ... 1.198e+03 1.199e+03 1.200e+03]
done step 3


# 4. Construct filelist from folder path (No input required, just run)

In [105]:
os.chdir(folder_path) # setting your working directory so that your images will be saved here

filelist = list() # empty list
for root, dirs, files in os.walk(folder_path): # this for loop goes through your folder 
    for name in files:
        if name.endswith('.dat'): # and takes out all files with a .dat (file that contains your data)
            if "_" in name.split(".")[-2]:
                filepath = os.path.join(root, name) # Notes down the file path of each data file
                filelist.append(filepath) # saves it into the list

if not filelist:
    raise FileNotFoundError("No .dat files found in the selected folder!")
else:
    print(f"Number of .dat files to process: {len(filelist)}")
    # print(f"Example of first and last file saved: {filelist[0]}, {filelist[-1]}") 

print('done step 4')

Number of .dat files to process: 317
done step 4


# 5. Process Data Function (No input required, just run)

In [106]:
def ProcessData(strain, experiment_counter): 
    """
    Filters and processes .dat files matching the given strain.

    Parameters: 
        strain (str): keyword to match in the files

    Returns:
        dict: N (Plate number) and Dataframe with required columns 
              ("time", "dura", "dist", "prob", "speed", "plate", "Date",
              "Plate_id", "Screen")

    """
    strain_filelist = [x for x in filelist if strain in x] # Goes through the list and filters for keyword
    Strain_N = len(strain_filelist) # Finds the number of plates per strain
    if Strain_N == 0:
        raise AssertionError ('{} is not a good identifier'.format(strain))
    else:
        pass
        print(f'Strain {strain}')
        print(f'Number of plates: {Strain_N}') 
        
        # visiting files in this strain
        strain_filelist = [file for file in filelist if strain in file]
        df_list=[]
        for i, file in enumerate(strain_filelist):
            if file.split('/')[-1].startswith('._'):
                pass
            else:
                try:
                    print(f"File: {file}")
                    df= pd.read_csv(file, sep=' ', header = None, encoding_errors='ignore')
                    df['Plate_id'] = file.split('/')[-2]+"_"+ file.split('/')[-1].split('_')[-1].split('.')[0]
                    df['Date'] = file.split('/')[-2].split('_')[0]
                    df['Screen'] = file.split('/')[-4]
                    df['Experiment'] = experiment_counter
                    experiment_counter = 1+experiment_counter
                    df_list.append(df)
                except:
                    print(f"error in file {file}")
                    pass
        DF_Total = pd.concat(df_list, ignore_index = True)
        DF_Total = DF_Total.rename( 
                    {0:'Time',
                    1:'n',
                    2:'Number',
                    3:'Instantaneous Speed',
                    4:'Interval Speed',
                    5:'Bias',
                    6:'Tap',
                    7:'Puff',
                    8:'x',
                    9:'y',
                    10:'Morphwidth',
                    11:'Midline',
                    12:'Area',
                    13:'Angular Speed',
                    14:'Aspect Ratio',
                    15:'Kink',
                    16:'Curve',
                    17:'Crab',
                    18:'Pathlength'}, axis=1)
        
        # check function here for NaN Columns
        DF_Total['plate'] = 0

        print("---------------------------------------------------------------------------------------------------------------------------------------------------------------------------")

    return{
            'N': Strain_N,
            'Confirm':DF_Total,
            'experiment_counter': experiment_counter
            # 'Final': DF_Final
    }



def assign_taps(df, tolerances):
    """
    Assigns tap number to each row in the DataFrame based on time tolerances.

    Parameters:
        df (pd.DataFrame): The DataFrame to modify
        tolerances (list of tuples): Each tuple is (lower, upper) time range

    Returns:
        None
    """
    df['taps'] = np.nan
    df['taps'][0] = 0
    for taps, tolerance in enumerate(psa_tolerances): #[(99, 101), (109,111), ...]
        tap_lower,tap_upper = tolerance
        TimesInTapRange = df['Time'].between(tap_lower,tap_upper, inclusive="both")
        df.loc[TimesInTapRange,'taps'] = int(taps)+1 # set the tap to i where times are between
    # df.bfill(inplace=True)


def insert_plates(df):   
    """
    Inserts a plate column into a dataframe.
    
    Parameters:
        df (pd.DataFrame): any dataframe
    
    Returns: 
        pd.DataFrame: dataframe with a plate column
    """
    df['plate']=(df['taps'] ==1).cumsum()


print('done step 5')

done step 5


# 6.1 Process Data

Create a dictionary `StrainNames` that contains all the genotype/strain names from each file path

In [107]:
genotype=[]
for f in filelist:
    genotype.append(f.split('/')[-3])

genotypes=np.unique(genotype).tolist()

if Screen =="Neuron_Genes_Screen":
    genotypes.insert(0, genotypes.pop(genotypes.index("N2_XJ1")))
    genotypes.insert(0, genotypes.pop(genotypes.index("N2_N2")))
else:
    genotypes.insert(0, genotypes.pop(genotypes.index("N2")))

nstrains = list(range(1, len(genotypes) + 1))
StrainNames = {nstrains[i]: genotypes[i] for i in range(len(nstrains))}

print(f"Number of genotypes/strains in the experiment: {len(genotypes)}")

# Display the first 5 Strain names in the experiment
for k in list(StrainNames)[:5]:
    print(f"{k}: {StrainNames[k]}")


print("done step 6.1")

# <---------------- Test element to use for dictionary buidling -------------------
# s = '/Users/Joseph/Desktop/OnFoodOffFoodTest/N2_OnFood/20220401_163048/N2_10x1_n96h20C_360sA0401_ka.00065.dat'
# slist=s.split('/')[5]
# print(slist)
# print(list(range(1,5+1)))

Number of genotypes/strains in the experiment: 25
1: N2
2: AMshABLATE_nsIs109
3: ced-10_n3246
4: ced-5_n2002
5: delm-1_ok1226
done step 6.1


# 6.2 Process Data (just run this cell)

Pass each strain through `ProcessData()` function 

In [168]:
# 3.1 Input
number_of_taps = 30 # Taps in your experiment (N)

# 3.2 Input
ISI = 10  # ISI in your experiment
first_tap = 600 # when is your first tap? check your TRV files

# Here, open up one of the trv files to determine the times for each of these taps. 

# Record number of taps (N+1), e.g., if number_of_taps = 30, taps = [1, 2, 3, ..., 31]
taps = np.arange(1, number_of_taps+2).tolist()

# Assign tolerance to each tap
lower = np.arange(first_tap+7.0, first_tap+7.0+(number_of_taps*ISI), ISI) # (first tap, last tap+10s, ISI)
upper = np.arange(first_tap+9.5, first_tap+9.5+(number_of_taps*ISI), ISI) # (first tap, last tap+10s, ISI)
psa_tolerances = [(float(l), float(u)) for l, u in zip(lower, upper)]
psa_tolerances.append((1197.5,1199)) # (N+1)th tap

# Display taps with tolerances 
for i in taps:
    print(f"Tap {i}, tolerance: {psa_tolerances[i-1]}")

print("done step 3")

Tap 1, tolerance: (607.0, 609.5)
Tap 2, tolerance: (617.0, 619.5)
Tap 3, tolerance: (627.0, 629.5)
Tap 4, tolerance: (637.0, 639.5)
Tap 5, tolerance: (647.0, 649.5)
Tap 6, tolerance: (657.0, 659.5)
Tap 7, tolerance: (667.0, 669.5)
Tap 8, tolerance: (677.0, 679.5)
Tap 9, tolerance: (687.0, 689.5)
Tap 10, tolerance: (697.0, 699.5)
Tap 11, tolerance: (707.0, 709.5)
Tap 12, tolerance: (717.0, 719.5)
Tap 13, tolerance: (727.0, 729.5)
Tap 14, tolerance: (737.0, 739.5)
Tap 15, tolerance: (747.0, 749.5)
Tap 16, tolerance: (757.0, 759.5)
Tap 17, tolerance: (767.0, 769.5)
Tap 18, tolerance: (777.0, 779.5)
Tap 19, tolerance: (787.0, 789.5)
Tap 20, tolerance: (797.0, 799.5)
Tap 21, tolerance: (807.0, 809.5)
Tap 22, tolerance: (817.0, 819.5)
Tap 23, tolerance: (827.0, 829.5)
Tap 24, tolerance: (837.0, 839.5)
Tap 25, tolerance: (847.0, 849.5)
Tap 26, tolerance: (857.0, 859.5)
Tap 27, tolerance: (867.0, 869.5)
Tap 28, tolerance: (877.0, 879.5)
Tap 29, tolerance: (887.0, 889.5)
Tap 30, tolerance: (897

In [201]:
DataLists = [0] # generates empty list at index 0 because we want indexing to start at 1 
                # when I say #1, I want the first point, not the second point

experiment_counter = 1

# the loop below goes through the dictionary in step 6.1 and processes data
# and appends all data into a list of dataframes
for s in tqdm(StrainNames.values()): 
    if not s == '':
        result = ProcessData(s, experiment_counter)
        DataLists.append(result['Confirm'])
        experiment_counter = result['experiment_counter'] 


# the loop below assigns taps and plates to the processed data
for df in DataLists[1:]: 
    assign_taps(df, psa_tolerances)
    insert_plates(df)


print('done step 6.2')

  0%|          | 0/25 [00:00<?, ?it/s]

Strain N2
Number of plates: 75
File: /Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025/mgl-2_tm355/20241024_171133/N2_5x4_f96h20C_600s30x10s10s_B1024.dat
File: /Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025/N2/20240724_025822/N2_5x4_f96h20C_600s30x10s10s_A0724.dat
File: /Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025/N2/20240724_035049/N2_5x4_f96h20C_600s30x10s10s_A0724.dat
File: /Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025/N2/20240724_094826/N2_5x4_f96h20C_600s31x10s10s_B0724.dat
File: /Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025/N2/20240724_095505/N2_5x4_f96h20C_600s31x10s10s_C0724.dat
File: /Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025/N2/20240724_103519/N2_5x4_f96h20C_600s31x10s10s_B0724.dat
File: /Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025/N2/20240724_104235/N2_5x4_f96h20C_600s31x10s10s_C0724.dat
File: /Volumes/RankinLabMehak_SSD/Glia_Genes_Screen_2025/N2/20240727_144831/N2_5x4_f96h20C_600s31x10s10s_B0727.dat
File: /Volumes/RankinLabMehak_SSD/Glia_G

You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  df['taps'][0] = 0
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['taps'][0] = 0
You are setting values thro

done step 6.2


You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  df['taps'][0] = 0
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['taps'][0] = 0
You are setting values thro

# Convert float64 data to float32 to reduce memory load (can also convert to 16 if needed)

For plain english:

float16 = 4 decimal points

float32 = 8 decimal points

float64 = 16 decimal points

more decimal points = more data/memory that computer has to keep track of

In [202]:
# commented out this section in case memory load needs to be reduced

for n in tqdm(DataLists[1:]):
    print(n)
    TestData=n
    TestData[TestData.select_dtypes(np.float64).columns]=TestData.select_dtypes(np.float64).astype(np.float32)
    

  0%|          | 0/25 [00:00<?, ?it/s]

             Time   n  Number  Instantaneous Speed  Interval Speed   Bias  \
0           0.007   0       0               0.0000          0.0000  0.000   
1           0.052   0       0               0.0000          0.0000  0.000   
2           0.098   0       0               0.0000          0.0000  0.000   
3           0.138   0       0               0.0000          0.0000  0.000   
4           0.176   0       0               0.0000          0.0000  0.000   
...           ...  ..     ...                  ...             ...    ...   
1638676  1199.799  76      28               0.1016          0.0852  0.115   
1638677  1199.848  76      28               0.1184          0.1005  0.000   
1638678  1199.926  75      28               0.0865          0.0726  0.000   
1638679  1200.007  75      28               0.0000          0.0000  0.000   
1638680  1200.087  76      28               0.0000          0.0000  0.000   

         Tap  Puff        x        y  Morphwidth  Midline      Area  \
0   

# 7. Grouping Data and Naming

In [203]:
base=pd.concat(df.assign(dataset=StrainNames.get(i+1)) for i, df in enumerate(DataLists[1:]))

base[['Gene', 'Allele']] = base['dataset'].str.split(pat='_', n=1, expand=True)

base['Screen']=Screen

base['Allele'] = base['Allele'].fillna('N2')

### Creating `Baseline_data` 

This step takes all the individual strain data (processed in Step 6) and combines them into single dataframe, filters for time window 490s - 590s, drops unwanted columns. 


In [204]:
# Baseline_data = base.drop(columns=["Tap", "Puff", "x","y", "Experiment", "taps"]).dropna().reset_index(drop=True) 

# Baseline_data = Baseline_data[((Baseline_data.Time<=590.0)&(Baseline_data.Time >=490.0))] 

# Baseline_data.head()

In [205]:
# Baseline_data.shape

## Creating Post Stimulus Data 

In [206]:
base.dtypes

Time                   float32
n                        int64
Number                   int64
Instantaneous Speed    float32
Interval Speed         float32
Bias                   float32
Tap                      int64
Puff                     int64
x                      float32
y                      float32
Morphwidth             float32
Midline                float32
Area                   float32
Angular Speed          float32
Aspect Ratio           float32
Kink                   float32
Curve                  float32
Crab                   float32
Pathlength             float32
Plate_id                object
Date                    object
Screen                  object
Experiment               int64
plate                    int64
taps                   float32
dataset                 object
Gene                    object
Allele                  object
dtype: object

In [207]:
# similar filters as baseline data

Post_stimulus_data_pre = base[((base.Time>599.000))]

Post_stimulus_data_pre = Post_stimulus_data_pre.drop(columns=["Puff", "x","y"]).dropna().reset_index()

# Post_stimulus_data_pre['Time'] = Post_stimulus_data_pre['Time']

# Add continuous tap numbers from 1 to 31 for each experiment
# E.g., Experiment 1 has taps 1-31, Experiment 2 has taps 1-31 and so on..
# Post_stimulus_data_pre['Tap_num'] = Post_stimulus_data_pre.groupby(['Experiment'])['Tap'].cumsum()

Post_stimulus_data_pre.head()

Unnamed: 0,index,Time,n,Number,Instantaneous Speed,Interval Speed,Bias,Tap,Morphwidth,Midline,Area,Angular Speed,Aspect Ratio,Kink,Curve,Crab,Pathlength,Plate_id,Date,Screen,Experiment,plate,taps,dataset,Gene,Allele
0,15071,607.017029,15,12,0.0483,0.0561,0.083,0,0.0666,0.773,0.073751,7.3,0.354,65.300003,34.900002,0.013,2.632,20241024_171133_B1024,20241024,Glia_Genes_Screen,1,1,1.0,N2,N2,N2
1,15072,607.057983,15,12,0.0524,0.059,0.167,0,0.0648,0.7765,0.072535,7.8,0.352,65.0,35.200001,0.0104,2.633,20241024_171133_B1024,20241024,Glia_Genes_Screen,1,2,1.0,N2,N2,N2
2,15073,607.093994,15,12,0.0531,0.0669,0.167,0,0.0633,0.7715,0.071199,6.8,0.349,65.0,34.599998,0.0082,2.633,20241024_171133_B1024,20241024,Glia_Genes_Screen,1,3,1.0,N2,N2,N2
3,15074,607.142029,15,12,0.0572,0.0764,0.25,0,0.0644,0.7734,0.072232,7.1,0.354,64.400002,34.799999,0.0058,2.635,20241024_171133_B1024,20241024,Glia_Genes_Screen,1,4,1.0,N2,N2,N2
4,15075,607.179016,15,12,0.0611,0.0784,0.25,0,0.0658,0.7809,0.073568,6.8,0.345,64.900002,34.700001,0.0055,2.636,20241024_171133_B1024,20241024,Glia_Genes_Screen,1,5,1.0,N2,N2,N2


In [208]:
# # Create windows from 7s to 9.5s post a tap ("Tap"=1) for each experiment
# # and concatenate all these wondows into a single dataframe

# Post_stimulus_data = []

# for exp in Post_stimulus_data_pre['Experiment'].unique(): # loop through each experiment separately 
#     df = Post_stimulus_data_pre[Post_stimulus_data_pre['Experiment'] == exp]  
#     tap_times = df[df['Tap'] == 1]['Time']  # get times where tap occured

#     for t in tap_times: 
#         window = df[(df['Time'] >= t + 7) & (df['Time'] <= t + 9.5)]
#         Post_stimulus_data.append(window)

# Post_stimulus_data = pd.concat(Post_stimulus_data)

# Post_stimulus_data.head()

In [217]:
# Aggregate columns by "Experiment" + "taps" by taking their means

# Post_stimulus_data_pre['Time'] = Post_stimulus_data_pre['Time'].astype('float32')

Post_stimulus_data = Post_stimulus_data_pre.groupby(['Experiment','Screen','Date','Plate_id','Gene','Allele','dataset', "taps"]).agg({
    'Time': 'min', # take minimum valu of time instead of mean
    'n': 'mean',
    'Number': 'mean',
    'Instantaneous Speed': 'mean',
    'Interval Speed' : 'mean',
    'Bias': 'mean',
    'Tap': 'mean',
    'Morphwidth': 'mean',
    'Midline': 'mean',
    'Area': 'mean',
    'Angular Speed': 'mean',
    'Aspect Ratio': 'mean',
    'Kink': 'mean',
    'Curve': 'mean',
    'Crab': 'mean',
    'Pathlength': 'mean'
})

Post_stimulus_data = Post_stimulus_data.reset_index()

Post_stimulus_data

Unnamed: 0,Experiment,Screen,Date,Plate_id,Gene,Allele,dataset,taps,Time,n,Number,Instantaneous Speed,Interval Speed,Bias,Tap,Morphwidth,Midline,Area,Angular Speed,Aspect Ratio,Kink,Curve,Crab,Pathlength
0,1,Glia_Genes_Screen,20241024,20241024_171133_B1024,N2,N2,N2,1.0,607.017029,15.000000,12.000000,0.058189,0.056352,0.274145,0.0,0.063419,0.774055,0.071953,5.437097,0.323339,66.690323,34.417740,0.007847,2.674403
1,1,Glia_Genes_Screen,20241024,20241024_171133_B1024,N2,N2,N2,2.0,617.025024,16.193548,12.000000,0.094568,0.079076,0.567242,0.0,0.069426,0.794218,0.077992,13.196774,0.394581,70.537094,37.046772,0.013332,2.905065
2,1,Glia_Genes_Screen,20241024,20241024_171133_B1024,N2,N2,N2,3.0,627.020020,17.000000,13.758065,0.110121,0.069482,0.593984,0.0,0.074361,0.819671,0.083514,9.058064,0.339935,72.608070,30.751612,0.012123,2.678081
3,1,Glia_Genes_Screen,20241024,20241024_171133_B1024,N2,N2,N2,4.0,637.020020,20.000000,17.032258,0.111542,0.062279,0.847194,0.0,0.080366,0.819053,0.085719,8.482259,0.295935,49.640324,31.625807,0.012297,2.752871
4,1,Glia_Genes_Screen,20241024,20241024_171133_B1024,N2,N2,N2,5.0,647.075012,20.629630,19.000000,0.093339,0.054163,0.803926,0.0,0.077781,0.849924,0.088974,4.809259,0.261130,50.259258,29.918518,0.007341,3.041685
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9852,318,Glia_Genes_Screen,20250313,20250313_220015_C0313,ztf-16,ok1916,ztf-16_ok1916,27.0,867.038025,86.444444,68.666667,0.121700,0.069508,0.824222,0.0,0.109219,0.947672,0.123068,7.602778,0.287528,42.944443,29.627777,0.012017,5.267722
9853,318,Glia_Genes_Screen,20250313,20250313_220015_C0313,ztf-16,ok1916,ztf-16_ok1916,28.0,877.036011,79.894737,58.842105,0.114174,0.067363,0.773632,0.0,0.110263,0.939184,0.122659,7.642105,0.298079,46.328949,30.213158,0.012016,5.339684
9854,318,Glia_Genes_Screen,20250313,20250313_220015_C0313,ztf-16,ok1916,ztf-16_ok1916,29.0,887.143005,79.487805,52.512195,0.121276,0.072363,0.752512,0.0,0.109734,0.939059,0.123724,8.119512,0.300122,43.748779,30.168293,0.012920,5.445244
9855,318,Glia_Genes_Screen,20250313,20250313_220015_C0313,ztf-16,ok1916,ztf-16_ok1916,30.0,897.077026,91.393939,62.636364,0.112712,0.067391,0.786091,0.0,0.110891,0.945524,0.124778,6.393939,0.288394,42.569698,29.084848,0.011427,4.664848


In [210]:
print('done step 7')

done step 7


# Save dataframe as `.csv`

In [88]:
Baseline_data.to_csv(f"{Screen}_baseline_output.csv")
print('saved Baseline data as .csv!')

saved Baseline data as .csv!


In [None]:
Post_stimulus_data.to_csv(f"{Screen}_post_stimulus.csv")
print('saved Post stimulus data as .csv!')

saved Post stimulus data as .csv!


# Done!

In [None]:
Post_stimulus_data[['T']]