# MEA-Flow Tutorial: Comprehensive Analysis of Multi-Electrode Array Data

This notebook demonstrates the complete workflow for analyzing MEA data using the MEA-Flow library, focusing on neural population dynamics and comparative analysis across experimental conditions.

## Overview

MEA-Flow provides a comprehensive pipeline for:
1. **Data Loading**: Support for various MEA data formats (.spk, .mat, .csv)
2. **Metrics Calculation**: Activity, regularity, and synchrony measures
3. **Manifold Analysis**: Population geometry and dimensionality reduction
4. **Visualization**: Publication-ready plots
5. **Comparative Analysis**: Cross-condition statistical comparisons

## 1. Setup and Imports

In [1]:
# Standard libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import warnings
warnings.filterwarnings('ignore', category=UserWarning)

# MEA-Flow imports
from mea_flow import (
    # Data loading and processing
    SpikeList, load_data, load_multiple_files,
    
    # Analysis modules
    MEAMetrics, ManifoldAnalysis,
    
    # Visualization
    MEAPlotter,
)

from mea_flow.analysis import AnalysisConfig
from mea_flow.manifold import ManifoldConfig
from mea_flow.utils import get_default_parameters, setup_logging

# Set up logging
logger = setup_logging('INFO')

print("MEA-Flow Tutorial - Ready to begin!")

ImportError: cannot import name 'load_multiple_files' from 'mea_flow' (/home/neuro/repos/mea-flow/src/mea_flow/__init__.py)

## 2. Data Loading and Initial Exploration

We'll demonstrate loading MEA data from different formats and explore the basic structure.

In [2]:
# Define data paths (adjust these to your actual data location)
data_dir = Path("../MEA-data")  # Adjust path as needed

# For this tutorial, we'll create synthetic data to demonstrate the workflow
# In practice, you would load your actual .spk or .mat files

def create_synthetic_mea_data(n_channels=64, duration=300.0, condition_name="synthetic"):
    """
    Create synthetic MEA data for demonstration purposes.
    
    This simulates per-channel Poisson background firing plus independent
    per-channel bursts. It does not model network-wide synchrony.
    """
    np.random.seed(42 if condition_name == "control" else 123)
    
    spike_data = []
    
    # Different activity levels for different conditions
    if condition_name == "control":
        base_rate = 2.0  # Hz
        burst_prob = 0.05
    elif condition_name == "treatment1":
        base_rate = 3.5  # Higher activity
        burst_prob = 0.08
    else:  # treatment2
        base_rate = 1.5  # Lower activity
        burst_prob = 0.03
    
    for ch in range(n_channels):
        # Channel-specific activity level
        channel_rate = base_rate * np.random.uniform(0.3, 2.0)
        
        # Generate background spikes
        n_background_spikes = int(np.random.poisson(channel_rate * duration))
        background_times = np.random.uniform(0, duration, n_background_spikes)
        
        # Add burst activity
        burst_times = []
        t = 0
        while t < duration:
            if np.random.random() < burst_prob:
                # Create a burst
                burst_start = t
                burst_duration = np.random.uniform(0.1, 0.5)
                n_burst_spikes = np.random.randint(5, 20)
                
                for _ in range(n_burst_spikes):
                    spike_time = burst_start + np.random.exponential(0.02)
                    if spike_time < burst_start + burst_duration:
                        burst_times.append(spike_time)
                
                t += burst_duration + np.random.uniform(1.0, 3.0)
            else:
                t += np.random.uniform(0.1, 1.0)
        
        # Combine background and burst spikes
        all_times = np.concatenate([background_times, burst_times])
        all_times = all_times[all_times < duration]
        
        # Add to spike data list
        for spike_time in all_times:
            spike_data.append((ch, spike_time))
    
    return spike_data

# Create synthetic data for three experimental conditions
conditions = ['control', 'treatment1', 'treatment2']
spike_lists = {}

for condition in conditions:
    print(f"Creating synthetic data for condition: {condition}")
    
    # Generate synthetic spike data
    spike_data = create_synthetic_mea_data(
        n_channels=64, 
        duration=300.0, 
        condition_name=condition
    )
    
    # Create SpikeList object
    spike_list = SpikeList(
        spike_data=spike_data,
        recording_length=300.0,
        sampling_rate=12500.0
    )
    
    spike_lists[condition] = spike_list
    
    print(f"  - Channels: {len(spike_list.channel_ids)}")
    print(f"  - Active channels: {len(spike_list.get_active_channels())}")
    print(f"  - Total spikes: {sum(train.n_spikes for train in spike_list.spike_trains.values())}")
    print(f"  - Recording length: {spike_list.recording_length:.1f} s")
    print()

print("Data loading completed!")

Creating synthetic data for condition: control
  - Channels: 64
  - Active channels: 64
  - Total spikes: 62718
  - Recording length: 300.0 s

Creating synthetic data for condition: treatment1


  - Channels: 64
  - Active channels: 64
  - Total spikes: 95075
  - Recording length: 300.0 s

Creating synthetic data for condition: treatment2
  - Channels: 64
  - Active channels: 64
  - Total spikes: 43189
  - Recording length: 300.0 s

Data loading completed!


## 3. Basic Data Exploration and Visualization

In [3]:
# Initialize plotter
plotter = MEAPlotter(figsize=(12, 8))

# Create raster plots for each condition
for condition, spike_list in spike_lists.items():
    print(f"Creating raster plot for {condition}...")
    
    # Plot first 30 seconds of activity
    fig = plotter.plot_raster(
        spike_list,
        time_range=(0, 30),
        color_by_well=True
    )
    
    plt.suptitle(f'Raster Plot - {condition.title()}')
    plt.show()
    
    # Display summary statistics
    summary_stats = spike_list.summary_statistics()
    print(f"\nSummary for {condition}:")
    print(f"  - Active channels: {len(spike_list.get_active_channels())}/{len(spike_list.channel_ids)}")
    print(f"  - Mean firing rate: {summary_stats['firing_rate'].mean():.2f} ± {summary_stats['firing_rate'].std():.2f} Hz")
    print(f"  - Total spikes: {summary_stats['n_spikes'].sum()}")
    print()

NameError: name 'MEAPlotter' is not defined

## 4. Comprehensive Metrics Analysis

Now we'll compute comprehensive metrics for all conditions including activity, regularity, and synchrony measures.

In [4]:
# Configure analysis parameters
config = AnalysisConfig(
    time_bin_size=1.0,
    min_spikes_for_rate=10,
    n_pairs_sync=200,  # Reduced for demo
    burst_detection=True,
    network_burst_detection=True
)

# Initialize metrics analyzer
metrics_analyzer = MEAMetrics(config=config)

# Compute metrics for all conditions
print("Computing comprehensive metrics for all conditions...")
all_metrics = metrics_analyzer.compare_conditions(
    spike_lists, 
    grouping='global'
)

print(f"\nMetrics computed for {len(all_metrics)} condition(s)")
print(f"Metrics calculated: {list(all_metrics.columns)}")

# Display key metrics
key_metrics = [
    'mean_firing_rate', 'network_firing_rate', 'active_channels_count',
    'cv_isi_mean', 'pearson_cc_mean', 'network_burst_rate'
]

available_metrics = [m for m in key_metrics if m in all_metrics.columns]
summary_table = all_metrics[['condition'] + available_metrics]

print("\n=== Key Metrics Summary ===")
print(summary_table.round(4))

NameError: name 'AnalysisConfig' is not defined
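
As a library-independent illustration of the three metric families named above (activity, regularity, synchrony), the following NumPy sketch computes a mean firing rate, the coefficient of variation of inter-spike intervals, and the mean pairwise Pearson correlation of binned spike counts. The helper names `firing_rate`, `cv_isi`, and `mean_pairwise_pearson` are hypothetical and are not part of MEA-Flow; the library's own definitions behind `mean_firing_rate`, `cv_isi_mean`, and `pearson_cc_mean` may differ in detail.

```python
import numpy as np

def firing_rate(spike_times, duration):
    """Activity: spikes per second over the whole recording."""
    return len(spike_times) / duration

def cv_isi(spike_times):
    """Regularity: coefficient of variation (std / mean) of inter-spike intervals."""
    isi = np.diff(np.sort(spike_times))
    if len(isi) < 2 or isi.mean() == 0:
        return np.nan
    return isi.std() / isi.mean()

def mean_pairwise_pearson(trains, duration, bin_size=1.0):
    """Synchrony: mean Pearson correlation between binned counts of all channel pairs."""
    edges = np.arange(0.0, duration + bin_size, bin_size)
    counts = np.array([np.histogram(t, bins=edges)[0] for t in trains])
    corr = np.corrcoef(counts)
    upper = corr[np.triu_indices(len(trains), k=1)]  # each pair once, no diagonal
    return np.nanmean(upper)

# Synthetic Poisson trains (illustrative data only, not a real recording)
rng = np.random.default_rng(0)
duration = 60.0
trains = [
    np.sort(rng.uniform(0, duration, rng.poisson(2.0 * duration)))
    for _ in range(8)
]

rates = [firing_rate(t, duration) for t in trains]
cvs = [cv_isi(t) for t in trains]
sync = mean_pairwise_pearson(trains, duration)

print(f"Mean firing rate: {np.mean(rates):.2f} Hz")
print(f"Mean CV of ISI:   {np.nanmean(cvs):.2f}")
print(f"Mean pairwise r:  {sync:.3f}")
```

For independent Poisson trains like these, the CV of the ISIs should sit near 1 and the mean pairwise correlation near 0, which makes the sketch a convenient sanity check before interpreting values from real recordings.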

## 5. Metrics Visualization and Statistical Comparison

In [5]:
# Create comprehensive metrics comparison plots
print("Creating metrics comparison plots...")

# Activity metrics comparison
activity_metrics = [
    col for col in all_metrics.columns 
    if any(keyword in col.lower() for keyword in ['firing', 'rate', 'activity', 'spike', 'active'])
    and col != 'condition'
][:6]  # Limit to 6 for clean visualization

if activity_metrics:
    fig = plotter.plot_metrics_comparison(
        all_metrics,
        grouping_col='condition',
        metrics_to_plot=activity_metrics,
        plot_type='box'
    )
    plt.suptitle('Activity Metrics Comparison', fontsize=16, y=1.02)
    plt.show()

# Regularity metrics comparison
regularity_metrics = [
    col for col in all_metrics.columns 
    if any(keyword in col.lower() for keyword in ['cv', 'lv', 'entropy', 'regularity'])
][:6]

if regularity_metrics:
    fig = plotter.plot_metrics_comparison(
        all_metrics,
        grouping_col='condition',
        metrics_to_plot=regularity_metrics,
        plot_type='violin'
    )
    plt.suptitle('Regularity Metrics Comparison', fontsize=16, y=1.02)
    plt.show()

# Synchrony metrics comparison
synchrony_metrics = [
    col for col in all_metrics.columns 
    if any(keyword in col.lower() for keyword in ['correlation', 'sync', 'distance', 'pearson'])
][:4]

if synchrony_metrics:
    fig = plotter.plot_metrics_comparison(
        all_metrics,
        grouping_col='condition',
        metrics_to_plot=synchrony_metrics,
        plot_type='box'
    )
    plt.suptitle('Synchrony Metrics Comparison', fontsize=16, y=1.02)
    plt.show()

Creating metrics comparison plots...


NameError: name 'all_metrics' is not defined

## 6. Population Dynamics and Manifold Analysis

Now we'll analyze the geometry of population dynamics using manifold learning techniques.

In [6]:
# Configure manifold analysis
manifold_config = ManifoldConfig(
    tau=0.02,  # Exponential filter time constant
    dt=0.001,  # Sampling interval
    max_components=10,  # Limit for demo
    methods=['PCA', 'UMAP', 'MDS']  # Subset of methods for speed
)

# Initialize manifold analyzer
manifold_analyzer = ManifoldAnalysis(config=manifold_config)

print("Performing manifold analysis...")
print("This may take a few minutes for the full analysis...")

# Perform comparative manifold analysis
manifold_results = manifold_analyzer.compare_conditions(
    spike_lists,
    time_range=(0, 60)  # Analyze first 60 seconds for speed
)

print(f"\nManifold analysis completed!")
print(f"Analyzed conditions: {list(manifold_results['individual_results'].keys())}")

# Display effective dimensionalities
print("\n=== Effective Dimensionalities ===")
for condition, results in manifold_results['individual_results'].items():
    eff_dim = results.get('effective_dimensionality', np.nan)
    print(f"{condition}: {eff_dim:.2f}")

NameError: name 'ManifoldConfig' is not defined
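
The `tau` and `dt` settings above describe an exponential filter applied to spike trains before embedding. The sketch below is a NumPy-only illustration of that idea, assuming a causal exponential kernel and using the participation ratio as one common definition of effective dimensionality. `exponential_filter` and `participation_ratio` are hypothetical helper names, not MEA-Flow functions, and the library's own computation may differ.

```python
import numpy as np

def exponential_filter(spike_times, duration, tau=0.02, dt=0.001):
    """Convolve a binned spike train with a causal kernel exp(-t / tau)."""
    n_steps = int(round(duration / dt))
    binned = np.zeros(n_steps)
    idx = np.minimum((np.asarray(spike_times) / dt).astype(int), n_steps - 1)
    np.add.at(binned, idx, 1.0)  # accumulate spikes that share a bin
    kernel = np.exp(-np.arange(0, 5 * tau, dt) / tau)  # truncated at 5 * tau
    return np.convolve(binned, kernel)[:n_steps]

def participation_ratio(states):
    """Effective dimensionality (sum(l))^2 / sum(l^2) of covariance eigenvalues l.

    states: array of shape (n_channels, n_timesteps).
    """
    eigvals = np.clip(np.linalg.eigvalsh(np.cov(states)), 0.0, None)
    return eigvals.sum() ** 2 / np.sum(eigvals ** 2)

# Synthetic example: 6 independent Poisson channels, 10 s each (illustrative only)
rng = np.random.default_rng(1)
duration = 10.0
states = np.vstack([
    exponential_filter(rng.uniform(0, duration, rng.poisson(5.0 * duration)), duration)
    for _ in range(6)
])

pr = participation_ratio(states)
print(f"State matrix shape: {states.shape}")
print(f"Participation ratio: {pr:.2f} (bounded by 1 and {states.shape[0]})")
```

The participation ratio equals 1 when all variance lies along a single direction and approaches the number of channels when variance is spread evenly, so it gives a rough scale for reading effective dimensionality values.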

## 7. Manifold Visualization

In [7]:
# Visualize embeddings for each condition
for condition, results in manifold_results['individual_results'].items():
    if 'embeddings' in results and len(results['embeddings']) > 0:
        print(f"\nVisualizing embeddings for {condition}...")
        
        embeddings = results['embeddings']
        time_vector = results.get('time_vector', None)
        
        # Create embedding comparison plot
        embedding_data = {}
        for method, emb_result in embeddings.items():
            if 'embedding' in emb_result:
                embedding_data[method] = emb_result['embedding']
        
        if len(embedding_data) > 0:
            fig = plotter.plot_manifold_comparison(
                embedding_data,
                labels=time_vector
            )
            plt.suptitle(f'Manifold Embeddings - {condition.title()}', fontsize=16)
            plt.show()
        
        # Show PCA variance explained if available
        if 'PCA' in embeddings and 'explained_variance_ratio' in embeddings['PCA']:
            fig = plotter.plot_dimensionality_analysis(
                embeddings['PCA']['explained_variance_ratio'],
                method_name='PCA'
            )
            plt.suptitle(f'PCA Dimensionality Analysis - {condition.title()}', fontsize=14)
            plt.show()

NameError: name 'manifold_results' is not defined

## 8. Cross-Condition Analysis and Classification

Let's examine how well we can distinguish between experimental conditions based on their neural dynamics.

In [8]:
# Examine cross-condition comparison results
if 'comparison' in manifold_results and len(manifold_results['comparison']) > 0:
    comparison = manifold_results['comparison']
    
    print("=== Cross-Condition Analysis Results ===")
    
    # Classification analysis
    if 'classification_analysis' in comparison:
        classification = comparison['classification_analysis']
        
        if 'classification_scores' in classification:
            print("\nClassification Accuracy (Cross-Validation):")
            
            for method, classifiers in classification['classification_scores'].items():
                print(f"\n{method}:")
                for clf_name, scores in classifiers.items():
                    accuracy = scores.get('mean_accuracy', np.nan)
                    std_acc = scores.get('std_accuracy', np.nan)
                    print(f"  {clf_name}: {accuracy:.3f} ± {std_acc:.3f}")
    
    # Manifold alignment analysis
    if 'manifold_alignment' in comparison:
        alignment = comparison['manifold_alignment']
        
        if 'alignment_scores' in alignment:
            print("\n\nManifold Alignment Scores:")
            
            for method, alignments in alignment['alignment_scores'].items():
                if alignments:
                    print(f"\n{method}:")
                    for comparison_name, alignment_data in alignments.items():
                        mse = alignment_data.get('procrustes_mse', np.nan)
                        corr = alignment_data.get('mean_correlation', np.nan)
                        print(f"  {comparison_name}: MSE={mse:.4f}, Corr={corr:.4f}")

else:
    print("Cross-condition comparison not available or failed.")

NameError: name 'manifold_results' is not defined

## 9. Feature Importance Analysis

Identify which metrics are most discriminative between conditions.

In [9]:
# Analyze feature importance for discriminating conditions
try:
    from mea_flow.manifold.comparison import identify_discriminative_features
    
    # Extract individual condition results for feature analysis
    individual_results = manifold_results.get('individual_results', {})
    
    if len(individual_results) >= 2:
        print("Analyzing discriminative features...")
        
        # Identify most important population statistics
        feature_importance = identify_discriminative_features(
            individual_results,
            feature_type='population_statistics'
        )
        
        if not feature_importance.empty:
            print("\n=== Most Discriminative Features ===")
            print(feature_importance.head(10))
            
            # Plot feature importance
            if len(feature_importance) > 0:
                fig = plotter.plot_feature_importance(
                    feature_importance.head(10),
                    top_n=10
                )
                plt.show()
        else:
            print("No discriminative features could be computed.")
    else:
        print("Need at least 2 conditions for feature importance analysis.")
        
except Exception as e:
    print(f"Feature importance analysis failed: {e}")

Feature importance analysis failed: name 'manifold_results' is not defined


## 10. Well-Based Analysis

Analyze activity patterns at the individual well level.

In [10]:
# Perform well-based analysis for one condition
example_condition = 'control'
if example_condition in spike_lists:
    print(f"Performing well-based analysis for {example_condition}...")
    
    spike_list = spike_lists[example_condition]
    
    # Compute metrics per well
    well_metrics = metrics_analyzer.compute_all_metrics(
        spike_list,
        grouping='well'
    )
    
    if not well_metrics.empty:
        print(f"\nWell-based metrics computed for {len(well_metrics)} wells")
        print(well_metrics[['group_id', 'n_channels', 'mean_firing_rate', 'cv_isi_mean']].round(3))
        
        # Create well activity visualization
        fig = plotter.plot_well_activity(
            spike_list,
            time_window=5.0
        )
        plt.show()
        
        # Create electrode map for the first well
        fig = plotter.plot_electrode_map(
            spike_list,
            metric='firing_rate',
            well_id=1
        )
        plt.show()
    
    else:
        print("No well-based metrics could be computed.")

Performing well-based analysis for control...


NameError: name 'metrics_analyzer' is not defined

## 11. Time-Resolved Analysis

Examine how metrics change over time within recordings.

In [11]:
# Perform time-resolved analysis
print("Performing time-resolved analysis...")

time_window_length = 30.0  # 30-second windows

for condition, spike_list in spike_lists.items():
    print(f"\nAnalyzing temporal dynamics for {condition}...")
    
    # Compute metrics over time windows
    time_metrics = metrics_analyzer.compute_all_metrics(
        spike_list,
        grouping='time',
        group_params={'window_length': time_window_length}
    )
    
    if not time_metrics.empty and len(time_metrics) > 2:
        print(f"  Computed metrics for {len(time_metrics)} time windows")
        
        # Plot temporal evolution of key metrics
        fig, axes = plt.subplots(2, 2, figsize=(15, 10))
        axes = axes.flatten()
        
        metrics_to_plot = ['mean_firing_rate', 'cv_isi_mean', 'pearson_cc_mean', 'active_channels_count']
        
        for i, metric in enumerate(metrics_to_plot):
            if metric in time_metrics.columns:
                ax = axes[i]
                
                time_points = time_metrics['window_start'] + time_window_length/2
                values = time_metrics[metric]
                
                ax.plot(time_points, values, 'o-', linewidth=2, markersize=6)
                ax.set_xlabel('Time (s)')
                ax.set_ylabel(metric.replace('_', ' ').title())
                ax.grid(True, alpha=0.3)
                ax.set_title(f'{metric.replace("_", " ").title()} Over Time')
        
        plt.suptitle(f'Temporal Dynamics - {condition.title()}', fontsize=16)
        plt.tight_layout()
        plt.show()
    
    else:
        print(f"  Insufficient data for temporal analysis in {condition}")

Performing time-resolved analysis...

Analyzing temporal dynamics for control...


NameError: name 'metrics_analyzer' is not defined
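
To make the windowing idea concrete without depending on MEA-Flow, here is a minimal NumPy sketch that splits a spike train into consecutive, non-overlapping windows and reports the firing rate in each. The function `windowed_rates` is a hypothetical helper; it assumes windows start at 0 s and that a trailing partial window is dropped, which may not match how `grouping='time'` behaves in the library.

```python
import numpy as np

def windowed_rates(spike_times, duration, window_length=30.0):
    """Return (window_start, rate_hz) for consecutive non-overlapping windows."""
    n_windows = int(duration // window_length)  # drop any trailing partial window
    edges = np.arange(n_windows + 1) * window_length
    counts, _ = np.histogram(spike_times, bins=edges)
    return edges[:-1], counts / window_length

# Small hand-made spike train (illustrative only)
spikes = np.array([1.0, 2.0, 31.0, 61.0, 62.0, 63.0])
starts, rates = windowed_rates(spikes, duration=90.0, window_length=30.0)

for start, rate in zip(starts, rates):
    # Report the window centre, as in the plotting code above
    print(f"window centre {start + 15.0:5.1f} s: {rate:.3f} Hz")
```

Shorter windows give finer temporal resolution but fewer spikes per window, so per-window estimates of measures such as the ISI CV become noisier.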

## 12. Summary and Results Export

Summarize the key findings and demonstrate how to export results.

In [12]:
# Create comprehensive summary
print("=== MEA-Flow Analysis Summary ===")
print(f"\nAnalyzed Conditions: {list(spike_lists.keys())}")
print(f"Recording Duration: {list(spike_lists.values())[0].recording_length} seconds")
print(f"Total Channels: {len(list(spike_lists.values())[0].channel_ids)}")

# Summary statistics table
summary_stats = []
for condition, spike_list in spike_lists.items():
    stats = {
        'Condition': condition,
        'Active_Channels': len(spike_list.get_active_channels()),
        'Total_Spikes': sum(train.n_spikes for train in spike_list.spike_trains.values()),
        'Mean_Rate_Hz': np.mean([train.firing_rate for train in spike_list.spike_trains.values() if train.n_spikes > 0])
    }
    summary_stats.append(stats)

summary_df = pd.DataFrame(summary_stats)
print("\n=== Condition Summary ===")
print(summary_df.round(2))

# Key findings from metrics
if not all_metrics.empty:
    print("\n=== Key Metric Differences ===")
    
    for metric in ['mean_firing_rate', 'cv_isi_mean', 'pearson_cc_mean']:
        if metric in all_metrics.columns:
            values = all_metrics.groupby('condition')[metric].mean()
            print(f"\n{metric.replace('_', ' ').title()}:")
            for condition, value in values.items():
                print(f"  {condition}: {value:.4f}")

print("\n=== Export Results ===")

# Demonstrate saving results
try:
    from mea_flow.utils.io import save_results, export_to_format
    
    # Create results directory
    results_dir = Path("./mea_flow_results")
    results_dir.mkdir(exist_ok=True)
    
    # Export metrics to CSV
    if not all_metrics.empty:
        export_to_format(
            all_metrics,
            results_dir / "metrics_summary.csv",
            format='csv'
        )
        print("✓ Metrics exported to metrics_summary.csv")
    
    # Save complete results
    complete_results = {
        'metrics': all_metrics,
        'manifold_results': manifold_results,
        'summary_statistics': summary_df,
        'analysis_config': config.__dict__,
        'conditions_analyzed': list(spike_lists.keys())
    }
    
    save_results(
        complete_results,
        results_dir / "complete_analysis.pkl"
    )
    print("✓ Complete results saved to complete_analysis.pkl")
    
    print(f"\nAll results saved to: {results_dir.absolute()}")
    
except Exception as e:
    print(f"Export failed: {e}")

print("\n🎉 MEA-Flow Tutorial Completed Successfully! 🎉")

=== MEA-Flow Analysis Summary ===

Analyzed Conditions: ['control', 'treatment1', 'treatment2']
Recording Duration: 300.0 seconds
Total Channels: 64

=== Condition Summary ===
    Condition  Active_Channels  Total_Spikes  Mean_Rate_Hz
0     control               64         62718          3.27
1  treatment1               64         95075          4.95
2  treatment2               64         43189          2.25


NameError: name 'all_metrics' is not defined

## 13. Next Steps and Advanced Usage

This tutorial covered the basic workflow of MEA-Flow. Here are some directions for further analysis:

### Advanced Analysis Options:

1. **Custom Parameter Sets**: Use `get_analysis_presets()` for specialized analysis types
2. **Detailed Manifold Analysis**: Explore additional dimensionality reduction methods
3. **Network Burst Analysis**: Examine burst dynamics in detail
4. **Cross-Temporal Analysis**: Compare dynamics across different time periods
5. **Multi-Scale Analysis**: Analyze at different temporal resolutions

### Loading Real Data:

```python
# Load Axion .spk files (after MATLAB conversion)
spike_list = load_data('path/to/data.mat', 
                     channels_key='Channels',
                     times_key='Times')

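# Load multiple files (requires a mea_flow version that exports load_multiple_files;
# the import in Section 1 fails if it does not)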
file_paths = ['condition1.mat', 'condition2.mat', 'condition3.mat']
condition_names = ['Control', 'Treatment1', 'Treatment2']
spike_lists = load_multiple_files(file_paths, condition_names)
```

### Statistical Analysis:

For rigorous statistical comparisons, consider:
- Multiple comparison corrections
- Non-parametric tests for non-normal distributions
- Effect size calculations
- Bootstrap confidence intervals
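
The points above can be sketched with NumPy alone. The example below is a minimal illustration, not part of MEA-Flow: it uses a permutation test on the difference of means as the non-parametric test, Cohen's d as the effect size, a percentile bootstrap for the confidence interval, and the Holm procedure for multiple comparisons. All function names are hypothetical, and the per-channel firing rates are synthetic stand-ins for values you would extract from your own metrics table.

```python
import numpy as np

def permutation_test(a, b, n_perm=999, seed=None):
    """Two-sided permutation test on the difference of group means."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, float), np.asarray(b, float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if abs(perm[:len(a)].mean() - perm[len(a):].mean()) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

def cohens_d(a, b):
    """Effect size: mean difference divided by the pooled standard deviation."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) \
        / (len(a) + len(b) - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

def bootstrap_ci(x, n_boot=2000, alpha=0.05, seed=None):
    """Percentile bootstrap confidence interval for the mean."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    means = rng.choice(x, size=(n_boot, len(x)), replace=True).mean(axis=1)
    return np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])

def holm_correction(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, float)
    m = len(p)
    adjusted = np.empty(m)
    running_max = 0.0
    for rank, i in enumerate(np.argsort(p)):
        running_max = max(running_max, (m - rank) * p[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

# Synthetic per-channel firing rates in Hz (illustrative only)
rng = np.random.default_rng(0)
groups = {
    "control": rng.normal(3.0, 1.0, 64),
    "treatment1": rng.normal(5.0, 1.5, 64),
    "treatment2": rng.normal(2.2, 0.8, 64),
}

comparisons = [("control", "treatment1"), ("control", "treatment2")]
raw_p = [permutation_test(groups[a], groups[b], seed=1) for a, b in comparisons]
adj_p = holm_correction(raw_p)

for (a, b), p_adj in zip(comparisons, adj_p):
    d = cohens_d(groups[a], groups[b])
    print(f"{a} vs {b}: Holm-adjusted p = {p_adj:.4f}, Cohen's d = {d:.2f}")

low, high = bootstrap_ci(groups["control"], seed=2)
print(f"control mean rate 95% CI: [{low:.2f}, {high:.2f}] Hz")
```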

### Performance Optimization:

For large datasets:
- Use time windowing to reduce computational load
- Limit the number of manifold learning methods
- Reduce the number of pairs for synchrony analysis
- Use parallel processing where available
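
As one illustration of reducing the number of pairs for synchrony analysis, the sketch below draws a random subset of channel pairs instead of evaluating all of them. `sample_channel_pairs` is a hypothetical helper and is not how MEA-Flow necessarily implements its `n_pairs_sync` setting; it only shows the general idea with NumPy and the standard library.

```python
import numpy as np
from itertools import combinations

def sample_channel_pairs(n_channels, n_pairs, seed=0):
    """Pick n_pairs distinct channel pairs (i < j) uniformly at random."""
    rng = np.random.default_rng(seed)
    all_pairs = np.array(list(combinations(range(n_channels), 2)))
    if n_pairs >= len(all_pairs):
        return all_pairs  # nothing to subsample
    chosen = rng.choice(len(all_pairs), size=n_pairs, replace=False)
    return all_pairs[chosen]

# 64 channels give 64 * 63 / 2 = 2016 possible pairs; evaluate only 200 of them
pairs = sample_channel_pairs(n_channels=64, n_pairs=200)
print(f"Using {len(pairs)} of {64 * 63 // 2} possible pairs")
```

Fixing the seed keeps the sampled pairs identical across conditions, so differences in a pair-averaged synchrony measure are not driven by which pairs happened to be drawn.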

Refer to the MEA-Flow documentation for detailed API reference and additional examples.