# Behavioral EDA Class Demo

This notebook demonstrates how to use the `BehavioralEDA` class to reproduce the plots and analyses from the original `behavioral_eda.ipynb` notebook.

The class provides a clean separation between:
1. **Data Processing Methods** - Generate DataFrames for analysis
2. **Plotting Methods** - Create visualizations
3. **Utility Methods** - Helper functions for data processing
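
A minimal sketch of this three-way split follows. The `MiniEDA` class, the boolean `success` column, and the `plot_success_rates` name are hypothetical and are not taken from `behavioral_eda_class`; only the `type` column and the `get_success_rates_data` method name appear in this notebook, and the real method returns more columns than this stand-in does.

```python
import pandas as pd


class MiniEDA:
    """Hypothetical, stripped-down stand-in for BehavioralEDA."""

    def __init__(self, df: pd.DataFrame):
        self.df = df

    # --- Utility method: small helper shared by the other methods ---
    @staticmethod
    def _to_percent(fraction: pd.Series) -> pd.Series:
        return (100 * fraction).round(1)

    # --- Data processing method: returns a DataFrame, does no plotting ---
    def get_success_rates_data(self) -> pd.DataFrame:
        grouped = self.df.groupby("type")["success"]
        out = grouped.agg(total_trials="count", success_fraction="mean").reset_index()
        out["success_rate"] = self._to_percent(out["success_fraction"])
        return out[["type", "total_trials", "success_rate"]]

    # --- Plotting method: only consumes the DataFrame produced above ---
    def plot_success_rates(self):
        import holoviews as hv  # imported lazily so data methods work without it

        return hv.Bars(self.get_success_rates_data(), kdims="type", vdims="success_rate")


# Toy trials table (assumed layout: one row per trial)
trials = pd.DataFrame({
    "type": ["GO", "GO", "GO", "GO", "STOP", "STOP"],
    "success": [True, True, True, False, True, False],
})
rates = MiniEDA(trials).get_success_rates_data()
print(rates)
```

Because the plotting method only reads what the data method returns, the same DataFrame can be inspected, tested, or exported without rendering anything.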

## Setup and Initialization

In [1]:
# Import the BehavioralEDA class
from behavioral_eda_class import BehavioralEDA
from pathlib import Path
import holoviews as hv
# Enable Jupyter notebook display
from bokeh.io import output_notebook
output_notebook()

In [2]:
# Set up data path
monkey = 'fiona'  # Change to 'yasmin' if needed
base_path = Path.cwd().parent / 'data' / f'{monkey}_sst'
filepath = base_path.parent / 'csst_trials_pkls' / f'all_{monkey}_CSST_trials_df.pkl'

print(f"Loading data from: {filepath}")
print(f"File exists: {filepath.exists()}")

Loading data from: /home/barak/Projects/population_analysis/data/csst_trials_pkls/all_fiona_CSST_trials_df.pkl
File exists: True


In [3]:
# Initialize the BehavioralEDA class
eda = BehavioralEDA(str(filepath))



Loaded data for fiona
Total trials: 110,358
Date range: fi210628 to fi211125
✓ Reaction time data available, will add derived columns as needed


## Comprehensive Data Summary

First, let's get a comprehensive overview of the behavioral data:

In [4]:
# Print comprehensive summary statistics
eda.print_summary_stats()

=== BEHAVIORAL DATA OVERVIEW ===

Trial Types:
  GO: 61,460
  CONT: 25,294
  STOP: 23,604

Directions:
  R: 55,453
  L: 54,905

Overall Success Rate: 84.6%

Success rates by trial type:
   type  success_rate  total_trials
0  CONT          84.5         25294
1    GO          95.3         61460
2  STOP          56.9         23604
Processing reaction times and adding to original DataFrame...
✓ Using existing reaction_time column
✓ Reaction time processing completed and added to original DataFrame

=== RT STATISTICS BY TYPE ===
               count   mean    std  min    max
rt_type                                       
Continue_RT    22902  250.3  109.0  1.0  703.0
Error_Stop_RT  10171  178.7   82.4  1.0  898.0
GO_RT          58576  208.3   74.2  1.0  518.0
Other          11444  445.1  207.0  1.0  912.0


## Data Processing Examples

Let's explore some of the data processing methods:

In [5]:
# Get basic summary as a dictionary
basic_summary = eda.get_basic_summary()
print("Basic Summary:")
for key, value in basic_summary.items():
    print(f"  {key}: {value}")

Basic Summary:
  total_trials: 110358
  trial_types: {'GO': 61460, 'CONT': 25294, 'STOP': 23604}
  directions: {'R': 55453, 'L': 54905}
  overall_success_rate: 84.59830732706284
  experimental_sets: {'CSST': 110358}


In [6]:
# Get success rates by trial type
success_rates = eda.get_success_rates_data()
print("Success Rates by Trial Type:")
display(success_rates)

Success Rates by Trial Type:


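|   | type | total_trials | failed_trials | failure_rate | success_rate |
|---|------|--------------|---------------|--------------|--------------|
| 0 | CONT | 25294 | 3933 | 15.5 | 84.5 |
| 1 | GO | 61460 | 2884 | 4.7 | 95.3 |
| 2 | STOP | 23604 | 10180 | 43.1 | 56.9 |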


In [7]:
# Get signal delay performance data
stop_performance, cont_performance = eda.get_signal_delay_performance_data()
print("Stop Performance by SSD:")
display(stop_performance)
print("\nContinue Performance by SSD:")
display(cont_performance)

Stop Performance by SSD:


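|   | ssd_number | total_trials | failed_trials | error_rate | error_percentage | ssd_len |
|---|------------|--------------|---------------|------------|------------------|---------|
| 0 | 1.0 | 5882 | 606 | 0.103 | 10.3 | 48 |
| 1 | 2.0 | 5890 | 1566 | 0.266 | 26.6 | 108 |
| 2 | 3.0 | 5927 | 3119 | 0.526 | 52.6 | 168 |
| 3 | 4.0 | 5905 | 4889 | 0.828 | 82.8 | 228 |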



Continue Performance by SSD:


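|   | ssd_number | total_trials | failed_trials | failure_rate | correct_percentage | ssd_len |
|---|------------|--------------|---------------|--------------|--------------------|---------|
| 0 | 1.0 | 6328 | 949 | 0.15 | 85.0 | 48 |
| 1 | 2.0 | 6189 | 688 | 0.111 | 88.9 | 108 |
| 2 | 3.0 | 6464 | 1289 | 0.199 | 80.1 | 168 |
| 3 | 4.0 | 6313 | 1007 | 0.16 | 84.0 | 228 |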
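
As a sanity check, the stop-trial rates can be recomputed from the raw counts. The sketch below copies `total_trials`, `failed_trials`, and `ssd_len` from the stop table above and assumes that `error_rate` is simply `failed_trials / total_trials` rounded to three decimals. That matches the displayed values, but it is not confirmed against the class source.

```python
import pandas as pd

# Counts copied from the stop-trial table in this notebook
stop = pd.DataFrame({
    "ssd_len": [48, 108, 168, 228],
    "total_trials": [5882, 5890, 5927, 5905],
    "failed_trials": [606, 1566, 3119, 4889],
})

# Assumed definition: fraction of stop trials that failed at each delay
fraction_failed = stop["failed_trials"] / stop["total_trials"]
stop["error_rate"] = fraction_failed.round(3)
stop["error_percentage"] = (100 * fraction_failed).round(1)

# The error rate should rise as the stop signal arrives later
rises_with_delay = bool(stop["error_rate"].is_monotonic_increasing)

print(stop)
print("Error rate increases with delay:", rises_with_delay)
```

The recomputed columns reproduce the table, and the error rate increases with every step in `ssd_len`, whereas the continue-trial failure rate shows no such trend.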


## Visualization Gallery

Now let's recreate the plots from the original notebook using the class methods:

### 1. Trial Distribution by Type and Outcome

In [8]:
# Create trial distribution plot (stacked bar chart)
plot1 = eda.plot_trial_distribution()
plot1

### 2. Success Rate by Trial Type (Percentage View)

In [9]:
# Create success rate percentage plot
plot2 = eda.plot_success_rates_percentage()
plot2



### 3. Successful Trials by Type and Direction

In [10]:
# Create direction analysis plot
plot3 = eda.plot_direction_analysis()
plot3

### 4. Trial Length Distribution by Type

In [11]:
# Create trial length distribution histogram
plot4 = eda.plot_trial_length_distribution()
plot4

### 5. Go Cue Timing Distribution by Type

In [12]:
# Create go cue timing distribution
plot5 = eda.plot_go_cue_timing()
plot5

### 6. Stop and Continue Performance (Figure 1b Replication)

In [13]:
# Create signal delay performance plot (replicates Figure 1b)
plot6 = eda.plot_signal_delay_performance()
plot6



In [14]:
hv.help(hv.Scatter)

Scatter

Online example: https://holoviews.org/reference/elements/bokeh/Scatter.html

-------------
Style Options
-------------

	alpha, angle, cmap, color, fill_alpha, fill_color, hit_dilation, hover_alpha, hover_color, hover_fill_alpha, hover_fill_color, hover_line_alpha, hover_line_cap, hover_line_color, hover_line_dash, hover_line_dash_offset, hover_line_join, hover_line_width, line_alpha, line_cap, line_color, line_dash, line_dash_offset, line_join, line_width, marker, muted, muted_alpha, muted_color, muted_fill_alpha, muted_fill_color, muted_line_alpha, muted_line_cap, muted_line_color, muted_line_dash, muted_line_dash_offset, muted_line_join, muted_line_width, nonselection_alpha, nonselection_color, nonselection_fill_alpha, nonselection_fill_color, nonselection_line_alpha, nonselection_line_cap, nonselection_line_color, nonselection_line_dash, nonselection_line_dash_offset, nonselection_line_join, nonselection_line_width, palette, radius, radius_dimension, selection_a

### 7. Session Mean RT Scatter Plots

In [15]:
# Create RT scatter plot comparing different RT types across sessions
plot7 = eda.plot_rt_scatter()
plot7

### 8. Continue and Error Stop RT Distributions (Figure 1d Replication)

In [16]:
# Create RT distributions plot (replicates Figure 1d)
plot8 = eda.plot_rt_distributions()
plot8

## Advanced Data Analysis

You can also access the processed data directly for custom analyses:

In [17]:
# Get RT scatter data for custom analysis
rt_scatter_data = eda.get_rt_scatter_data()
print("RT Types available:")
print(rt_scatter_data['rt_type'].value_counts())
print("\nSample RT data:")
display(rt_scatter_data.head(10))

RT Types available:
rt_type
Continue_RT      88
GO_RT            88
Error_Stop_RT    87
Name: count, dtype: int64

Sample RT data:


|   | rt_type | trial_session | mean_rt |
|---|---------|---------------|---------|
| 0 | Continue_RT | fi210628a | 277.521739 |
| 1 | Continue_RT | fi210629a | 139.666667 |
| 2 | Continue_RT | fi210630a | 272.177305 |
| 3 | Continue_RT | fi210701a | 304.880597 |
| 4 | Continue_RT | fi210704a | 221.576923 |
| 5 | Continue_RT | fi210705a | 332.065217 |
| 6 | Continue_RT | fi210706a | 314.934307 |
| 7 | Continue_RT | fi210707a | 315.878378 |
| 8 | Continue_RT | fi210708a | 272.841463 |
| 9 | Continue_RT | fi210711a | 266.904762 |
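
The scatter data is in long format, with one row per `rt_type` and `trial_session` pair. For per-session comparisons it is often easier to pivot it to one row per session. The sketch below uses the column names shown above but made-up session labels and RT values; the `cont_minus_go` column is a hypothetical example of a custom metric.

```python
import pandas as pd

# Toy long-format data with the same columns as rt_scatter_data
# (session names and mean_rt values are invented for illustration)
long_df = pd.DataFrame({
    "rt_type": ["Continue_RT", "GO_RT", "Error_Stop_RT", "Continue_RT", "GO_RT"],
    "trial_session": ["s1", "s1", "s1", "s2", "s2"],
    "mean_rt": [270.0, 210.0, 180.0, 300.0, 220.0],
})

# One row per session, one column per RT type
wide = long_df.pivot(index="trial_session", columns="rt_type", values="mean_rt")

# Hypothetical custom metric: how much slower continue responses are than go responses
wide["cont_minus_go"] = wide["Continue_RT"] - wide["GO_RT"]
print(wide)
```

Session `s2` has no `Error_Stop_RT` row, so the pivot leaves a `NaN` there. The real data has the same shape of gap: `Error_Stop_RT` covers 87 sessions while the other two types cover 88.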


In [18]:
# Get RT distribution data
cont_dist, stop_dist = eda.get_rt_distribution_data()
print("Continue RT Distribution Data:")
display(cont_dist.head())
print("\nStop RT Distribution Data:")
display(stop_dist.head())

Continue RT Distribution Data:


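|   | Reaction Time Bin | SSD Number | percentage |
|---|-------------------|------------|------------|
| 0 | 0.0 | CSD1 | 0.466514 |
| 1 | 0.0 | CSD2 | 0.375583 |
| 2 | 0.0 | CSD3 | 0.288606 |
| 3 | 0.0 | CSD4 | 0.407211 |
| 4 | 20.0 | CSD1 | 0.336048 |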
4,20.0,CSD1,0.336048



Stop RT Distribution Data:


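|   | Reaction Time Bin | SSD Number | percentage |
|---|-------------------|------------|------------|
| 0 | 0.0 | SSD1 | 0.063549 |
| 1 | 0.0 | SSD2 | 0.105914 |
| 2 | 0.0 | SSD3 | 0.321979 |
| 3 | 0.0 | SSD4 | 0.338926 |
| 4 | 20.0 | SSD1 | 0.088968 |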
4,20.0,SSD1,0.088968


## Class Benefits Summary

The `BehavioralEDA` class provides several advantages over the original notebook:

### ✅ **Organization**
- Clean separation between data processing and plotting
- Reusable methods for consistent analysis
- Well-documented and maintainable code

### ✅ **Efficiency**
- Automatic caching of processed data (RT processing)
- On-demand processing only when needed
- No redundant computations
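
This notebook does not show how `BehavioralEDA` implements its caching. One common way to get compute-once, on-demand behavior in Python is `functools.cached_property`. The `LazyRT` class below is a hypothetical illustration of that pattern, not the class's actual code.

```python
from functools import cached_property


class LazyRT:
    """Hypothetical illustration of on-demand, cached processing."""

    def __init__(self, reaction_times):
        self._reaction_times = list(reaction_times)
        self.process_calls = 0  # counts how often the expensive step runs

    @cached_property
    def processed(self):
        # Runs only on first access; the result is then stored on the instance
        self.process_calls += 1
        return [rt for rt in self._reaction_times if rt is not None]


lazy = LazyRT([208, None, 250])
first = lazy.processed   # triggers processing
second = lazy.processed  # served from the cache
print(first, lazy.process_calls)
```

Nothing is computed until `processed` is first read, and later reads return the stored list without re-running the filter.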

### ✅ **Flexibility**
- Easy to use with different datasets
- Data processing methods can be used independently
- Customizable plotting parameters

### ✅ **Robustness**
- Error handling for missing data
- Validation of data formats
- Automatic monkey name detection

### ✅ **Reproducibility**
- Same statistical analyses as original notebook
- Consistent plotting styles and parameters
- Documented methods and workflows

## Next Steps

You can now:

1. **Use with different datasets**: Change the `monkey` variable to analyze different subjects
2. **Extend functionality**: Add new methods to the class for additional analyses
3. **Integrate into pipelines**: Use the class methods in larger analysis workflows
4. **Customize plots**: Modify plotting parameters or create new visualization methods
5. **Export data**: Use the data processing methods to export data for other analyses
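
For the export step, any DataFrame returned by a `get_*` method can be written out with pandas. The sketch below rebuilds part of the success-rate table shown earlier instead of calling the class, and writes it to a temporary directory; the `success_rates.csv` file name is an arbitrary choice.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Values copied from the success-rate table earlier in this notebook
success_rates = pd.DataFrame({
    "type": ["CONT", "GO", "STOP"],
    "total_trials": [25294, 61460, 23604],
    "failed_trials": [3933, 2884, 10180],
    "success_rate": [84.5, 95.3, 56.9],
})

# Write without the integer index so the CSV holds only the data columns
out_file = Path(tempfile.mkdtemp()) / "success_rates.csv"
success_rates.to_csv(out_file, index=False)

# Read it back to confirm the export round-trips
round_trip = pd.read_csv(out_file)
print(round_trip)
```

Passing `index=False` avoids the extra `Unnamed: 0` column that appears when a CSV with a saved index is read back.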

The class keeps the functionality of the original notebook behind a cleaner, more maintainable interface.