# **CellTracksColab - Track clustering analysis**
---
<font size = 4>Explore Spatial Clustering in Track Data with CellTracksColab: This Colab Notebook is designed to help you determine whether tracks exhibit spatial clustering. Before beginning, ensure that your data is properly loaded in the CellTracksColab format for optimal analysis.


In [None]:
# @title #MIT License

print("""
**MIT License**

Copyright (c) 2023 Guillaume Jacquemet

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.""")

--------------------------------------------------------
# **Part 0. Prepare the Google Colab session (skip this section when using a local installation)**
--------------------------------------------------------

## **0.1. Install key dependencies**
---
<font size = 4>

In [None]:
#@markdown ##Play to install
!git clone https://github.com/CellMigrationLab/CellTracksColab.git
!pip install -r "CellTracksColab/requirements.txt"

## **0.2. Mount your Google Drive**
---
<font size = 4> To use this notebook on the data present in your Google Drive, you need to mount your Google Drive to this notebook.

<font size = 4> Play the cell below to mount your Google Drive and follow the instructions.

<font size = 4> Once this is done, your data are available in the **Files** tab on the top left of the notebook.

In [None]:
#@markdown ##Play the cell to connect your Google Drive to Colab

from google.colab import drive
drive.mount('/gdrive')

--------------------------------------------------------
# **Part 1. Prepare the session and load the data**
--------------------------------------------------------

## **1.1 Load key dependencies**
---
<font size = 4>

In [None]:
#@markdown ##Play to load the dependencies

import ipywidgets as widgets
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import numpy as np
import itertools
from matplotlib.gridspec import GridSpec
import requests
import os
from tqdm.notebook import tqdm  # Needed by save_dataframe_with_progress below

# Current version of the notebook the user is running
current_version = "1.0.1"
Notebook_name = 'Track_Clustering'

# URL to the raw content of the version file in the repository
version_url = "https://raw.githubusercontent.com/guijacquemet/CellTracksColab/main/Notebook/latest_version.txt"

# Function to define colors for formatting messages
class bcolors:
    WARNING = '\033[91m'  # Red color for warning messages
    ENDC = '\033[0m'      # Reset color to default

# Check if this is the latest version of the notebook
try:
    All_notebook_versions = pd.read_csv(version_url, dtype=str)
    print('Notebook version: ' + current_version)

    # Check if 'Version' column exists in the DataFrame
    if 'Version' in All_notebook_versions.columns:
        Latest_Notebook_version = All_notebook_versions[All_notebook_versions["Notebook"] == Notebook_name]['Version'].iloc[0]
        print('Latest notebook version: ' + Latest_Notebook_version)

        if current_version == Latest_Notebook_version:
            print("This notebook is up-to-date.")
        else:
            print(bcolors.WARNING + "A new version of this notebook has been released. We recommend that you download it at https://github.com/guijacquemet/CellTracksColab" + bcolors.ENDC)
    else:
        print("The 'Version' column is not present in the version file.")
except requests.exceptions.RequestException as e:
    print("Unable to fetch the latest version information. Please check your internet connection.")
except Exception as e:
    print("An error occurred:", str(e))

#----------------------- Key functions -----------------------------#

# Function to calculate Cohen's d
def cohen_d(group1, group2):
    diff = group1.mean() - group2.mean()
    n1, n2 = len(group1), len(group2)
    var1 = group1.var()
    var2 = group2.var()
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    d = diff / np.sqrt(pooled_var)
    return d

def save_dataframe_with_progress(df, path, desc="Saving", chunk_size=50000):
    """Save a DataFrame with a progress bar."""

    # Create a tqdm instance for progress tracking
    with tqdm(total=len(df), unit="rows", desc=desc) as pbar:
        # Open the file for writing (newline="" avoids blank lines on Windows)
        with open(path, "w", newline="") as f:
            # Write the header once at the beginning
            df.head(0).to_csv(f, index=False)

            # Write the rows in slices of chunk_size
            for start in range(0, len(df), chunk_size):
                chunk = df.iloc[start:start + chunk_size]
                chunk.to_csv(f, header=False, index=False)
                pbar.update(len(chunk))

def check_for_nans(df, df_name):
    """
    Checks the given DataFrame for NaN values and prints the count for each column containing NaNs.

    Args:
    df (pd.DataFrame): DataFrame to be checked for NaN values.
    df_name (str): The name of the DataFrame as a string, used for printing.
    """
    # Check if the DataFrame has any NaN values and print a warning if it does.
    nan_columns = df.columns[df.isna().any()].tolist()

    if nan_columns:
        for col in nan_columns:
            nan_count = df[col].isna().sum()
            print(f"Column '{col}' in {df_name} contains {nan_count} NaN values.")
    else:
        print(f"No NaN values found in {df_name}.")

## **1.2. Load your CellTracksColab dataset**
---

<font size = 4> Please ensure that your data was properly processed using CellTracksColab.
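
<font size = 4> One way to check this before running the analysis is to confirm that the columns read later in this notebook are present. The snippet below is a minimal sketch: `missing_columns` is a hypothetical helper (it is not part of the `celltracks` module), and the column list is taken from the cells of this notebook rather than from a formal specification.

```python
import pandas as pd

# Columns that the cells in this notebook read from the spot table
REQUIRED_SPOT_COLUMNS = ["File_name", "TRACK_ID", "Unique_ID",
                         "POSITION_X", "POSITION_Y", "POSITION_T"]

def missing_columns(df, required):
    """Return the required column names that are absent from df (hypothetical helper)."""
    return [col for col in required if col not in df.columns]

# Toy spot table standing in for merged_spots_df
toy_spots = pd.DataFrame({
    "File_name": ["fov1", "fov1"],
    "TRACK_ID": [0, 0],
    "POSITION_X": [1.0, 2.0],
    "POSITION_Y": [3.0, 4.0],
    "POSITION_T": [0, 1],
})

missing = missing_columns(toy_spots, REQUIRED_SPOT_COLUMNS)
print("Missing columns:", missing)
```

<font size = 4> Once your data is loaded, the same call can be applied to `merged_spots_df`; an empty list means every column used by the plotting and clustering cells is available.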


In [None]:
# This cell imports the shared celltracks module so that the same functions can be tested in all notebooks.
# Once they work, this code can be moved into section 1.1 (Load key dependencies).
import sys
# The following paths may vary slightly depending on where Jupyter Lab is running locally.
# They should point to the root of the GitHub repository (this also applies in Colab).
sys.path.append("../")
sys.path.append("CellTracksColab/")
import celltracks

In [None]:
#@markdown ###Provide the path to your Track table

Track_table = ''  # @param {type: "string"}

#@markdown ###Provide the path to your Result folder

Results_Folder = ""  # @param {type: "string"}

if not os.path.exists(Results_Folder):
    os.makedirs(Results_Folder, exist_ok=True)  # Create Results_Folder if it doesn't exist

# Print the location of the result folder
print(f"Result folder is located at: {Results_Folder}")

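# The tables are loaded through CellTracks.LoadTrackingData() further down

print("Loading tracking data....")

#@markdown ###Describe your dataset
Data_Dims = "2D" #@param ["2D", "3D"]
Data_Type = "TrackMate Files" #@param ["TrackMate Files", "TrackMate Table"]

#@markdown ###Provide the path to your data folder (used when the test dataset is not selected)
Folder_path = ''  # @param {type: "string"}

#@markdown ###Provide the path to your Spot table (used when Data_Type is "TrackMate Table")
Spot_table = ''  # @param {type: "string"}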


#@markdown ###Or use a test dataset (up to 10 min download)
Use_test_dataset = False #@param {type:"boolean"}

# Update the parameters to load the data
CellTracks = celltracks.TrackingData()
if Use_test_dataset:
    # Download the test dataset
    CellTracks.DownloadTestData()
else:
    CellTracks.Folder_path = Folder_path
    if Data_Type == "TrackMate Table":
        CellTracks.Spot_table = Spot_table
        CellTracks.Track_table = Track_table

CellTracks.Results_Folder = Results_Folder
CellTracks.data_type = Data_Type
CellTracks.data_dims = Data_Dims

# Load data
CellTracks.LoadTrackingData()
merged_spots_df = CellTracks.spots_data
merged_tracks_df = CellTracks.tracks_data
check_for_nans(merged_spots_df, "merged_spots_df")
check_for_nans(merged_tracks_df, "merged_tracks_df")
print("...Done")



## **1.4. Visualise your tracks**
---

In [None]:
# @title ##Run the cell and choose the file you want to inspect

import ipywidgets as widgets
from ipywidgets import interact
import matplotlib.pyplot as plt

if not os.path.exists(Results_Folder+"/Tracks"):
    os.makedirs(Results_Folder+"/Tracks")  # Create the Tracks subfolder if it doesn't exist

# Extract unique filenames from the dataframe
filenames = merged_spots_df['File_name'].unique()

# Create a Dropdown widget with the filenames
filename_dropdown = widgets.Dropdown(
    options=filenames,
    value=filenames[0] if len(filenames) > 0 else None,  # Default selected value
    description='File Name:',
)

def plot_coordinates(filename):
    if filename:
        # Filter the DataFrame based on the selected filename
        filtered_df = merged_spots_df[merged_spots_df['File_name'] == filename]

        plt.figure(figsize=(10, 8))
        for unique_id in filtered_df['Unique_ID'].unique():
            unique_df = filtered_df[filtered_df['Unique_ID'] == unique_id].sort_values(by='POSITION_T')
            plt.plot(unique_df['POSITION_X'], unique_df['POSITION_Y'], marker='o', linestyle='-', markersize=2)

        plt.xlabel('POSITION_X')
        plt.ylabel('POSITION_Y')
        plt.title(f'Coordinates for {filename}')
        plt.savefig(f"{Results_Folder}/Tracks/Tracks_{filename}.pdf")
        plt.show()
    else:
        print("No valid filename selected")

# Link the Dropdown widget to the plotting function
interact(plot_coordinates, filename=filename_dropdown)


# **Part 2: Assess spatial clustering using Ripley's L function**

<font size = 4>Ripley's L function operates on points, so each track must be reduced to a single representative position before the analysis. The choice of that position determines which aspect of the movement the analysis captures. Within each field of view (FOV), the "beginning" point shows where tracked objects first appear, and the "end" point shows where they were last detected. The "middle" point is the position recorded halfway through the track. The "average" and "median" points summarise the overall location of each track, with the median being less sensitive to a few outlying positions. Choose the option that best matches your research question.






## **2.1. Choose the point to use for each track**

<font size = 4>Use the interactive plot below to compare the candidate analysis points ("beginning", "end", "middle", "average", or "median") for each field of view. The selected point of every track is marked in red on top of the track, which shows how the choice changes the spatial distribution of the points that will be analysed.
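
<font size = 4>To make the five options concrete, the sketch below applies them to a small made-up track. The coordinates are invented for illustration; only the column names match the ones used in this notebook.

```python
import pandas as pd

# Toy track with five time points, already sorted by POSITION_T
track = pd.DataFrame({
    "POSITION_T": [0, 1, 2, 3, 4],
    "POSITION_X": [0.0, 1.0, 2.0, 3.0, 10.0],
    "POSITION_Y": [0.0, 0.0, 1.0, 1.0, 5.0],
})
xy = track[["POSITION_X", "POSITION_Y"]]

# One candidate analysis point per option
candidates = {
    "beginning": xy.iloc[0],
    "end": xy.iloc[-1],
    "middle": xy.iloc[len(xy) // 2],
    "average": xy.mean(),
    "median": xy.median(),
}
for name, point in candidates.items():
    print(f"{name:>9}: x={point['POSITION_X']:.1f}, y={point['POSITION_Y']:.1f}")
```

<font size = 4>In this toy track the last position is far from the others, so the average (3.2, 1.4) is pulled towards it while the median (2.0, 1.0) is not.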







In [None]:
# @title ##Run the cell and choose the analysis point you want to use

import ipywidgets as widgets
from ipywidgets import interact
import matplotlib.pyplot as plt
import os

def select_analysis_point(track, analysis_option):
    if analysis_option == "beginning":
        point = track.iloc[0][['POSITION_X', 'POSITION_Y']]
    elif analysis_option == "end":
        point = track.iloc[-1][['POSITION_X', 'POSITION_Y']]
    elif analysis_option == "middle":
        middle_index = len(track) // 2
        point = track.iloc[middle_index][['POSITION_X', 'POSITION_Y']]
    elif analysis_option == "average":
        point = track[['POSITION_X', 'POSITION_Y']].mean()
    elif analysis_option == "median":
        point = track[['POSITION_X', 'POSITION_Y']].median()
    else:
        point = pd.Series([np.nan, np.nan], index=['POSITION_X', 'POSITION_Y'])

    return point

if not os.path.exists(Results_Folder+"/Tracks"):
    os.makedirs(Results_Folder+"/Tracks")  # Create the Tracks subfolder if it doesn't exist

# Extract unique filenames from the dataframe
filenames = merged_spots_df['File_name'].unique()

# Create Dropdown widgets with labels and fixed width
filename_dropdown = widgets.Dropdown(
    options=filenames,
    value=filenames[0] if len(filenames) > 0 else None,  # Default selected value
    description='File Name:',
    layout=widgets.Layout(width='300px'),  # Adjust width as needed
)

analysis_option_dropdown = widgets.Dropdown(
    options=["beginning", "end", "middle", "average", "median"],
    value="beginning",
    description='Point:',
    layout=widgets.Layout(width='300px'),  # Adjust width as needed
)

# Define the plotting function
def plot_coordinates(filename, analysis_option):
    if filename:
        # Filter the DataFrame based on the selected filename
        filtered_df = merged_spots_df[merged_spots_df['File_name'] == filename]

        plt.figure(figsize=(10, 8))
        for unique_id in filtered_df['Unique_ID'].unique():
            unique_df = filtered_df[filtered_df['Unique_ID'] == unique_id].sort_values(by='POSITION_T')
            plt.plot(unique_df['POSITION_X'], unique_df['POSITION_Y'], marker='o', linestyle='-', markersize=2)

            # Find and mark the selected analysis point
            analysis_point = select_analysis_point(unique_df, analysis_option)
            if not analysis_point.isna().any():
                plt.scatter(analysis_point['POSITION_X'], analysis_point['POSITION_Y'], color='red', s=50)

        plt.xlabel('POSITION_X')
        plt.ylabel('POSITION_Y')
        plt.title(f'Coordinates for {filename} ({analysis_option} point)')
        plt.savefig(f"{Results_Folder}/Tracks/Tracks_{filename}_{analysis_option}.pdf")
        plt.show()
    else:
        print("No valid filename selected")

# Link both Dropdown widgets to the plotting function
interact(plot_coordinates, filename=filename_dropdown, analysis_option=analysis_option_dropdown)


## **2.2. Compute Ripley's L function for each FOV**

<font size = 4>This code aims to compute Ripley's L function for each Field of View (FOV) in a dataset of tracked objects. Ripley's L function is a spatial statistics tool used to analyze the spatial distribution of points or objects in a given area. In this analysis, we are interested in understanding how objects are distributed within each FOV.
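
<font size = 4>For reference, the estimator implemented in the code cell of this section can be written as follows, where $A$ is the bounding-box area of the dataset, $n$ is the number of analysis points in the FOV, and $d_{ij}$ is the distance between points $i$ and $j$. The expression is read directly from the code, which applies no edge correction.

$$
\hat{K}(r) = \frac{A}{n^{2}} \sum_{i=1}^{n} \sum_{j \neq i} \mathbf{1}\left[d_{ij} < r\right], \qquad L(r) = \sqrt{\frac{\hat{K}(r)}{\pi}} - r
$$

<font size = 4>Under complete spatial randomness, $K(r) = \pi r^{2}$, so $L(r)$ is close to 0. Values above 0 indicate more neighbours within distance $r$ than expected (clustering), and values below 0 indicate fewer (dispersion). Because no edge correction is applied, $\hat{K}(r)$ tends to be underestimated at larger radii, which is one reason to interpret the observed curve against simulated patterns rather than against 0 alone.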

## User Input Options

1. **Analysis Option**: This option allows you to choose the point within each track that will be used for analysis. You can select one of the following options:
   - "beginning": Use the initial position of each track.
   - "end": Use the final position of each track.
   - "middle": Use the middle position of each track.
   - "average": Use the average position of all points within each track.
   - "median": Use the median position of all points within each track.

2. **r_values Range**: Ripley's L function is computed for a range of spatial distances denoted by "r." You can specify the range of r_values using the following parameters:
   - **Start Value**: The starting value of "r" (minimum distance). This number should be greater than 0.
   - **End Value**: The ending value of "r" (maximum distance).
   - **Number of Points**: The number of points or steps within the specified range. The analysis will be performed at equidistant intervals between the start and end values.
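
<font size = 4>The sketch below illustrates how L(r) behaves on synthetic data, using the same estimator as this notebook (no edge correction). The point patterns, the square size, and the function name `ripley_l_sketch` are assumptions made for the example and are not part of CellTracksColab.

```python
import numpy as np
from scipy.spatial import distance_matrix

def ripley_l_sketch(points, r_values, area):
    """L(r) without edge correction, using the same estimator as this notebook."""
    n = len(points)
    d = distance_matrix(points, points)
    # Count ordered pairs closer than r, excluding each point paired with itself
    counts = np.array([(d < r).sum() - n for r in r_values])
    k = area / n**2 * counts
    return np.sqrt(k / np.pi) - r_values

rng = np.random.default_rng(0)
side = 100.0
r_values = np.linspace(0.1, 20, 5)

# 200 points spread uniformly over the square
random_pts = rng.uniform(0, side, size=(200, 2))
# 200 points packed around 4 hypothetical cluster centres
centres = rng.uniform(20, 80, size=(4, 2))
clustered_pts = np.repeat(centres, 50, axis=0) + rng.normal(0, 2.0, size=(200, 2))

l_random = ripley_l_sketch(random_pts, r_values, side**2)
l_clustered = ripley_l_sketch(clustered_pts, r_values, side**2)
print("r        :", np.round(r_values, 2))
print("random   :", np.round(l_random, 2))
print("clustered:", np.round(l_clustered, 2))
```

<font size = 4>The uniform pattern should stay near 0, while the clustered pattern gives clearly positive values at radii comparable to the cluster size. The start value of r must be greater than 0 because the self-pair correction assumes that every point is counted once at distance 0.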



In [None]:
# @title ##Run to compute Ripley's L function for each FOV
import os
import numpy as np
import pandas as pd
from scipy.spatial import distance_matrix
import matplotlib.pyplot as plt
from tqdm.notebook import tqdm

# Define Ripley's K function
def ripley_k(points, r, area):
    n = len(points)
    d_matrix = distance_matrix(points, points)
    sum_indicator = np.sum(d_matrix < r) - n  # Subtract n to exclude self-pairs

    K_r = (area / (n ** 2)) * sum_indicator

    # Check if K_r is negative and print relevant information
    if K_r < 0:
        print("Negative K_r encountered!")
        print("Distance matrix:", d_matrix)
        print("Sum indicator:", sum_indicator)
        print("Area:", area, "Number of points:", n, "Distance threshold r:", r)

    return K_r


# Define Ripley's L function

def ripley_l(points, r, area):
    K_r = ripley_k(points, r, area)
    # Check if K_r has negative values
    if np.any(K_r < 0):
        print("Warning: Negative value encountered in K_r")

    L_r = np.sqrt(K_r / np.pi) - r
    return L_r



def select_analysis_point(track, analysis_option):
    if analysis_option == "beginning":
        point = track.iloc[0][['POSITION_X', 'POSITION_Y']]
    elif analysis_option == "end":
        point = track.iloc[-1][['POSITION_X', 'POSITION_Y']]
    elif analysis_option == "middle":
        middle_index = len(track) // 2
        point = track.iloc[middle_index][['POSITION_X', 'POSITION_Y']]
    elif analysis_option == "average":
        point = track[['POSITION_X', 'POSITION_Y']].mean()
    elif analysis_option == "median":
        point = track[['POSITION_X', 'POSITION_Y']].median()
    else:
        point = pd.Series([np.nan, np.nan], index=['POSITION_X', 'POSITION_Y'])

    return point

analysis_option = "beginning" # @param ["beginning", "end", "middle", "average", "median"]

# Prompt the user for the desired r_values range
r_values_start = 0.1 # @param {type: "number"}
r_values_end = 50 # @param {type: "number"}
r_values_count = 50 # @param {type: "number"}

# np.linspace requires an integer number of points
r_values = np.linspace(r_values_start, r_values_end, int(r_values_count))


# Check and create necessary directories
if not os.path.exists(f"{Results_Folder}/Track_Clustering/RipleyL"):
    os.makedirs(f"{Results_Folder}/Track_Clustering/RipleyL")

# Define area based on your dataset's extent
area = (merged_spots_df['POSITION_X'].max() - merged_spots_df['POSITION_X'].min()) * \
       (merged_spots_df['POSITION_Y'].max() - merged_spots_df['POSITION_Y'].min())

# Compute Ripley's L function for each FOV
l_values_per_fov_slow = {}
for file_name, group in tqdm(merged_spots_df.groupby('File_name'), desc="Processing FOVs"):
    # Sort each track by POSITION_T
    group = group.sort_values(by=['TRACK_ID', 'POSITION_T'])

    representative_points = group.groupby('TRACK_ID').apply(lambda track: select_analysis_point(track, analysis_option)).dropna()
    if not representative_points.empty:
        l_values = [ripley_l(representative_points.values, r, area) for r in r_values]
        l_values_per_fov_slow[file_name] = l_values






## **2.3. Compute Monte Carlo Simulations for Each FOV**

This code section performs Monte Carlo simulations to assess the significance of the observed spatial distribution patterns within each Field of View (FOV) in a dataset of tracked objects. The simulations help establish confidence envelopes for Ripley's L function, allowing for statistical testing.


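**Number of Simulations (Nb_simulation)**: The number of Monte Carlo simulations to run for each FOV. Each simulation places the same number of points as the FOV uniformly at random within the dataset's bounding box. The confidence envelope is taken from the 2.5th and 97.5th percentiles of the simulated L values, so more simulations give a more stable envelope at the cost of a longer computation time.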
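
The following sketch shows the envelope idea on synthetic data: simulate random patterns, take pointwise percentiles of L(r), and flag the radii where an "observed" pattern exceeds the upper bound. The number of points, the number of simulations, and the corner-packed "observed" pattern are illustrative assumptions, and `l_function` is a local helper written for this example.

```python
import numpy as np
from scipy.spatial import distance_matrix

def l_function(points, r_values, area):
    """Ripley's L without edge correction (same estimator as section 2.2)."""
    n = len(points)
    d = distance_matrix(points, points)
    counts = np.array([(d < r).sum() - n for r in r_values])
    return np.sqrt(area / n**2 * counts / np.pi) - r_values

rng = np.random.default_rng(1)
side, n_points, n_sim = 100.0, 100, 200   # illustrative values only
r_values = np.linspace(1, 20, 10)

# Simulate complete spatial randomness and compute L(r) for every simulation
simulated = np.array([
    l_function(rng.uniform(0, side, size=(n_points, 2)), r_values, side**2)
    for _ in range(n_sim)
])

# Pointwise 95% envelope: 2.5th and 97.5th percentiles at each radius
lower = np.percentile(simulated, 2.5, axis=0)
upper = np.percentile(simulated, 97.5, axis=0)

# "Observed" pattern for the sketch: points squeezed into one corner of the square
observed = l_function(rng.uniform(0, side / 4, size=(n_points, 2)), r_values, side**2)
above = observed > upper
print("Radii where observed L(r) exceeds the envelope:", np.round(r_values[above], 1))
```

With only a handful of simulations, the 2.5th and 97.5th percentiles are set almost entirely by the most extreme runs, so the envelope can change noticeably between executions. Increasing the number of simulations reduces this variability.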

In [None]:
# @title ##Run to compute Monte Carlo simulations for each FOV

from tqdm.notebook import tqdm

Nb_simulation = 10 # @param {type: "number"}

# Simulate random points for Monte Carlo simulations
def simulate_random_points(num_points, x_range, y_range):
    x_coords = np.random.uniform(x_range[0], x_range[1], num_points)
    y_coords = np.random.uniform(y_range[0], y_range[1], num_points)
    return np.column_stack((x_coords, y_coords))

# Initialize simulated_l_values as an empty dictionary
simulated_l_values_dict_slow = {}

# Perform Monte Carlo simulations for significance testing
confidence_envelopes_slow = {}
for file_name, group in tqdm(merged_spots_df.groupby('File_name'), desc='Processing FOVs'):

    group = group.sort_values(by=['TRACK_ID', 'POSITION_T'])
    representative_points = group.groupby('TRACK_ID').apply(lambda track: select_analysis_point(track, analysis_option)).dropna()
    if representative_points.empty:
        continue  # Skip FOVs without usable points, as in section 2.2

    simulations = [simulate_random_points(len(representative_points),
                                          (merged_spots_df['POSITION_X'].min(), merged_spots_df['POSITION_X'].max()),
                                          (merged_spots_df['POSITION_Y'].min(), merged_spots_df['POSITION_Y'].max()))
                   for _ in tqdm(range(Nb_simulation), desc=f'Simulating for {file_name}', leave=False)]

    simulated_l_values = [[ripley_l(points, r, area) for r in r_values] for points in simulations]
    simulated_l_values_dict_slow[file_name] = simulated_l_values  # Store the simulated values in the dictionary

    lower_bound = np.percentile(simulated_l_values, 2.5, axis=0)
    upper_bound = np.percentile(simulated_l_values, 97.5, axis=0)
    confidence_envelopes_slow[file_name] = (lower_bound, upper_bound)



## **2.4. Plot the results for each FOV**


In [None]:
# @title ##Plot the results for each FOV

import os
import matplotlib.pyplot as plt

# Visualization of Ripley's L function with confidence envelopes
for file_name, l_values in l_values_per_fov_slow.items():
    # Retrieve the confidence envelope for the current file
    lower_bound, upper_bound = confidence_envelopes_slow.get(file_name, (None, None))

    # Only proceed if the confidence envelope exists
    if lower_bound is not None and upper_bound is not None:
        plt.figure(figsize=(10, 6))
        plt.plot(r_values, l_values, label=f'L(r) for {file_name}')
        plt.fill_between(r_values, lower_bound, upper_bound, color='gray', alpha=0.5)
        plt.xlabel('Radius (r)')
        plt.ylabel("Ripley's L Function")
        plt.title(f"Ripley's L Function - {file_name}_{analysis_option}")
        plt.legend()
        plt.grid(True)

        # Save the plot as a PDF in the specified folder
        pdf_path = os.path.join(f"{Results_Folder}/Track_Clustering/RipleyL/{file_name}_{analysis_option}.pdf")
        plt.savefig(pdf_path,bbox_inches='tight')
        plt.show()
        plt.close()  # Close the plot to free memory
    else:
        print(f"No confidence envelope data available for {file_name}_{analysis_option}")


## **2.5. Choose a specific radius and plot the results**
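
<font size = 4>The cell below reads L at the radius in `r_values` that is closest to the requested one, which is not necessarily the requested value itself. The sketch below reproduces that lookup with the default settings of section 2.2 (50 radii between 0.1 and 50) and a requested radius of 25.

```python
import numpy as np

# Default grid from section 2.2: 50 radii between 0.1 and 50
r_values = np.linspace(0.1, 50, 50)
specific_radius = 25

# Index of the grid value closest to the requested radius
idx = int(np.argmin(np.abs(r_values - specific_radius)))
print(f"Requested radius: {specific_radius}, radius used: {r_values[idx]:.2f} (index {idx})")
```

<font size = 4>With these defaults the radius actually used is about 24.54 rather than 25. Choosing the start value, end value, and number of points so that the grid contains the radius of interest avoids this mismatch.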


In [None]:
# @title ##Define a specific radius and run

# Define the specific radius for comparison
specific_radius = 25 # @param {type: "number"}

# Extract L values at the specific radius
specific_radius_index = np.argmin(np.abs(r_values - specific_radius))  # Find the index of the closest radius value
l_values_at_specific_radius_slow = {fov: l_values[specific_radius_index] for fov, l_values in l_values_per_fov_slow.items()}

# Plotting
plt.figure(figsize=(12, 6))
plt.bar(l_values_at_specific_radius_slow.keys(), l_values_at_specific_radius_slow.values())
plt.xlabel('Field of View')
plt.ylabel(f"Ripley's L at radius {specific_radius}")
plt.title(f"Comparison of Ripley's L Function at Radius {specific_radius} Across Different FOVs")
plt.xticks(rotation=45)
# Save the plot as a PDF in the specified folder
pdf_path = os.path.join(f"{Results_Folder}/Track_Clustering/RipleyL/l_values_at_specific_radius_{specific_radius}_{analysis_option}.pdf")
plt.savefig(pdf_path, bbox_inches='tight')

plt.show()


# Create DataFrame with confidence envelopes, median, and L values at the specific radius
rows = []
for fov, (lower_bound, upper_bound) in confidence_envelopes_slow.items():
    l_value = l_values_per_fov_slow[fov][specific_radius_index]
    lower = lower_bound[specific_radius_index]
    upper = upper_bound[specific_radius_index]

    # Retrieve simulated L values for the FOV
    simulated_l_values_for_fov_slow = simulated_l_values_dict_slow.get(fov, [])

    # Calculate median if simulated L values are available for the FOV
    if simulated_l_values_for_fov_slow:
        median_vals = [l_vals[specific_radius_index] for l_vals in simulated_l_values_for_fov_slow]
        median = np.median(median_vals) if median_vals else np.nan
    else:
        median = np.nan

    rows.append([fov, l_value, lower, upper, median])

confidence_df = pd.DataFrame(rows, columns=['File_name', 'Ripley_L_at_Specific_Radius', 'Lower_Bound', 'Upper_Bound', 'Median'])

# Merge with additional information
additional_info_df = merged_tracks_df[['File_name', 'Condition', 'experiment_nb', 'Repeat']].drop_duplicates('File_name')
merged_df = pd.merge(confidence_df, additional_info_df, left_on='File_name', right_on='File_name')

# Save the merged DataFrame to a CSV file
merged_df.to_csv(f"{Results_Folder}/Track_Clustering/ripleys_l_values__{specific_radius}_{analysis_option}.csv", index=False)


## **2.6. Comparison of Ripley's L Values Across Conditions**



In [None]:
# @title ##Comparison of Ripley's L Values Across Conditions

import matplotlib.pyplot as plt
import seaborn as sns

# Convert 'Condition' to string if it's not already
merged_df['Condition'] = merged_df['Condition'].astype(str)

# Create the box plot
plt.figure(figsize=(12, 8))
sns.boxplot(data=merged_df, x='Condition', y='Ripley_L_at_Specific_Radius')

# Overlay the Monte Carlo simulation results
for condition in merged_df['Condition'].unique():
    condition_data = merged_df[merged_df['Condition'] == condition]

    # Plot median values
    medians = condition_data['Median']
    plt.scatter([condition] * len(medians), medians, color='red', alpha=0.5)  # Median

    # Handle NaN values and calculate mean and error only for non-NaN values
    valid_data = condition_data.dropna(subset=['Median', 'Lower_Bound', 'Upper_Bound'])
    if not valid_data.empty:
        median_mean = valid_data['Median'].mean()
        lower_mean = valid_data['Lower_Bound'].mean()
        upper_mean = valid_data['Upper_Bound'].mean()
        yerr = [[median_mean - lower_mean], [upper_mean - median_mean]]

        # Check if yerr contains valid data before plotting
        if not np.isnan(yerr).any():
            plt.errorbar(condition, median_mean, yerr=yerr, fmt='o', color='black', alpha=0.5)  # Confidence interval

# Add labels and title
plt.xlabel('Condition')
plt.ylabel(f"Ripley's L at radius {specific_radius}")
plt.title('Comparison of Ripley\'s L Values Across Conditions with Monte Carlo Simulation Results')
plt.xticks(rotation=45)
plt.grid(True)

# Save the figure before showing it
pdf_path = os.path.join(f"{Results_Folder}/l_values_Conditions_radius_{specific_radius}_{analysis_option}.pdf")
plt.savefig(pdf_path, bbox_inches='tight')

# Show the plot
plt.show()


## **2.7. Plot the analysis point for each FOV**


In [None]:
# @title ##Run the cell to plot the coordinates used for the spatial analysis


import matplotlib.pyplot as plt
import os

if not os.path.exists(Results_Folder + "/Track_Clustering/Coordinates"):
    os.makedirs(Results_Folder + "/Track_Clustering/Coordinates")  # Create the Coordinates subfolder if it doesn't exist


show_plots = False # @param {type:"boolean"}


# Extract unique filenames from the dataframe
filenames = merged_spots_df['File_name'].unique()

# Define the plotting function
def plot_points(filename, analysis_option, show_plots=True):
    if filename:
        # Filter the DataFrame based on the filename
        filtered_df = merged_spots_df[merged_spots_df['File_name'] == filename]

        plt.figure(figsize=(10, 8))
        for unique_id in filtered_df['Unique_ID'].unique():
            unique_df = filtered_df[filtered_df['Unique_ID'] == unique_id].sort_values(by='POSITION_T')

            # Find and mark the selected analysis point
            analysis_point = select_analysis_point(unique_df, analysis_option)
            if not analysis_point.isna().any():
                plt.scatter(analysis_point['POSITION_X'], analysis_point['POSITION_Y'], color='red', s=50, label=f'Track {unique_id}')

        plt.xlabel('POSITION_X')
        plt.ylabel('POSITION_Y')
        plt.title(f'Analysis points for {filename} ({analysis_option} point)')
        plt.savefig(f"{Results_Folder}/Track_Clustering/Coordinates/Analysis_points_{filename}_{analysis_option}.pdf")

        if show_plots:
            plt.show()
        else:
            plt.close()

    else:
        print("No valid filename selected")

# Loop through all filenames and plot the analysis point selected in section 2.2
for filename in filenames:
    plot_points(filename, analysis_option, show_plots)



# **Part 3: Version log**
---
<font size = 4>While we strive to provide accurate and helpful information, please be aware that:
  - This notebook may contain bugs.
  - Features are currently limited and will be expanded in future releases.

<font size = 4>We encourage users to report any issues or suggestions for improvement. Please check the [repository](https://github.com/guijacquemet/CellTracksColab) regularly for updates and the latest version of this notebook.

<font size = 4>**Version 0.9.1**
  - Improved documentation
  - Improved saving strategy

<font size = 4>**Version 0.8**
  - First release of this notebook

