# Understanding INTENSE Results

This notebook provides a deep dive into interpreting INTENSE analysis outputs. You'll learn:
- How to navigate the results data structures
- What each statistical measure means
- How to create custom visualizations
- Advanced analysis techniques

In [None]:
# Setup
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(os.getcwd()), 'src'))

import driada
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.gridspec import GridSpec

# Configure plotting
plt.rcParams['figure.dpi'] = 100
sns.set_style("whitegrid")

print("Setup complete! ✓")

## Generate Example Data with Known Selectivity

Let's create synthetic data with specific selectivity patterns to better understand the results:

In [None]:
# Generate experiment with more neurons for richer results
exp = driada.generate_synthetic_exp(
    n_dfeats=3,      # 3 discrete features
    n_cfeats=3,      # 3 continuous features
    nneurons=50,     # 50 neurons
    duration=600,    # 10 minutes
    seed=123         # different seed for variety
)

# Run comprehensive analysis
stats, significance, info, results = driada.compute_cell_feat_significance(
    exp,
    mode='two_stage',
    n_shuffles_stage1=100,
    n_shuffles_stage2=2000,
    find_optimal_delays=True,
    shift_window=2,  # ±2 second delay search
    verbose=True
)

print("\nAnalysis complete!")

## Understanding the Results Structure

INTENSE returns four key outputs:

In [None]:
# 1. Stats: Raw statistical measures for each neuron-feature pair
print("1. STATS STRUCTURE:")
print(f"   Type: {type(stats)}")
print(f"   Shape: {stats.shape if hasattr(stats, 'shape') else 'Dictionary'}")
print(f"   Keys: {list(stats.keys())[:3]} ...\n")

# 2. Significance: Binary significance matrix
print("2. SIGNIFICANCE STRUCTURE:")
print(f"   Type: {type(significance)}")
print(f"   Shape: {significance.shape}")
print(f"   Values: {np.unique(significance)}")
print(f"   Significant pairs: {np.sum(significance)}\n")

# 3. Info: Metadata about the analysis
print("3. INFO STRUCTURE:")
print(f"   Keys: {list(info.keys())}\n")

# 4. Results: Detailed statistics for significant pairs
print("4. RESULTS STRUCTURE:")
print(f"   Number of entries: {len(results)}")
if results:
    print(f"   Example entry keys: {list(results[0].keys())}")

## Accessing Results Through the Experiment Object

The Experiment object provides convenient methods to access results:

In [None]:
# Method 1: Get all significant neurons
significant_neurons = exp.get_significant_neurons()
print(f"Neurons with significant selectivity: {len(significant_neurons)}")

# Show which features each neuron is selective to
for neuron_id, features in list(significant_neurons.items())[:5]:
    print(f"  Neuron {neuron_id}: {features}")

# Method 2: Access the complete stats table
print(f"\nStats table features: {list(exp.stats_table.keys())}")

## Deep Dive: Understanding Statistical Measures

Let's examine what each statistical measure tells us:

In [None]:
# Get detailed statistics for a significant pair
if significant_neurons:
    # Pick first significant pair
    cell_id = list(significant_neurons.keys())[0]
    feat_name = significant_neurons[cell_id][0]
    
    # Get all statistics
    pair_stats = exp.get_neuron_feature_pair_stats(cell_id, feat_name)
    
    print(f"Statistics for Neuron {cell_id} ↔ Feature '{feat_name}':\n")
    
    # Core measures
    print("CORE MEASURES:")
    print(f"  • Mutual Information (MI): {pair_stats.get('me', 'N/A')}")
    print(f"  • Normalized MI: {pair_stats.get('pre_rval', 'N/A')}")
    print(f"  • P-value: {pair_stats.get('pval', 'N/A')}")
    print(f"  • Passed Stage 1: {pair_stats.get('passed_stg1', 'N/A')}")
    print(f"  • Passed Stage 2: {pair_stats.get('passed_stg2', 'N/A')}")
    
    # Temporal information
    print("\nTEMPORAL INFORMATION:")
    print(f"  • Optimal delay: {pair_stats.get('shift_used', 0):.3f}s")
    print(f"  • Delay search performed: {pair_stats.get('shift_found', False)}")
    
    # Null distribution parameters
    print("\nNULL DISTRIBUTION:")
    print(f"  • Distribution type: {pair_stats.get('distr_chosen', 'N/A')}")
    print(f"  • Mean of shuffles: {pair_stats.get('mean_bs', 'N/A')}")
    print(f"  • Std of shuffles: {pair_stats.get('std_bs', 'N/A')}")
    print(f"  • Number of shuffles: {pair_stats.get('nsh', 'N/A')}")
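
To make the null-distribution fields more concrete, here is a minimal, library-independent sketch of how a shuffle-based null can be built and summarized. It is not INTENSE's implementation: the histogram MI estimator `binned_mi`, the circular-shift shuffling, the synthetic signals, and the empirical p-value formula are all assumptions chosen for illustration. Its summary values play the same role as the `mean_bs`, `std_bs`, and `nsh` fields printed above.

```python
import numpy as np

def binned_mi(x, y, bins=8):
    """Histogram-based mutual information estimate in bits (illustrative only)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
n = 2000
feature = rng.normal(size=n)                              # hypothetical behavioral feature
activity = 0.8 * feature + rng.normal(scale=0.6, size=n)  # hypothetical neural signal

observed_mi = binned_mi(activity, feature)

# Build the null distribution by circularly shifting one signal,
# which breaks the pairing between the two time series
n_shuffles = 200
null_mi = np.empty(n_shuffles)
for k in range(n_shuffles):
    shift = rng.integers(1, n)
    null_mi[k] = binned_mi(np.roll(activity, shift), feature)

null_mean = null_mi.mean()
null_std = null_mi.std(ddof=1)

# Empirical p-value with the usual +1 correction so it is never exactly zero
p_emp = (np.sum(null_mi >= observed_mi) + 1) / (n_shuffles + 1)

print(f"Observed MI: {observed_mi:.3f} bits")
print(f"Null mean ± std: {null_mean:.4f} ± {null_std:.4f} bits ({n_shuffles} shuffles)")
print(f"Empirical p-value: {p_emp:.4f}")
```

With a purely empirical p-value, the smallest attainable value is 1 / (n_shuffles + 1), which is one reason a method may fit a parametric distribution to the shuffles instead (compare the `distr_chosen` field above).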

## Visualization 1: Selectivity Overview Heatmap

Let's create a heatmap showing all neuron-feature relationships:

In [None]:
# Create selectivity heatmap
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Plot 1: Binary significance
feature_names = list(exp.dynamic_features.keys())
sns.heatmap(significance.T, 
            xticklabels=range(exp.n_cells),
            yticklabels=feature_names,
            cmap='RdBu_r',
            center=0.5,
            cbar_kws={'label': 'Significant'},
            ax=ax1)
ax1.set_xlabel('Neuron ID')
ax1.set_ylabel('Feature')
ax1.set_title('Significant Neuron-Feature Pairs')

# Plot 2: MI values for significant pairs
mi_matrix = np.zeros_like(significance, dtype=float)
for i, feat in enumerate(feature_names):
    for j in range(exp.n_cells):
        if significance[j, i]:
            pair_stats = exp.get_neuron_feature_pair_stats(j, feat)
            mi_matrix[j, i] = pair_stats.get('me', 0)

# Mask non-significant pairs
masked_mi = np.ma.masked_where(significance == 0, mi_matrix)

im = ax2.imshow(masked_mi.T, aspect='auto', cmap='viridis', interpolation='nearest')
ax2.set_xlabel('Neuron ID')
ax2.set_ylabel('Feature')
ax2.set_yticks(range(len(feature_names)))
ax2.set_yticklabels(feature_names)
ax2.set_title('Mutual Information Values (Significant Pairs Only)')
plt.colorbar(im, ax=ax2, label='MI')

plt.tight_layout()
plt.show()

## Visualization 2: Temporal Delay Analysis

Understanding optimal delays helps reveal the temporal dynamics of neural encoding:

In [None]:
# Collect all optimal delays
delays = []
mi_values = []
feature_types = []

for cell_id, features in significant_neurons.items():
    for feat_name in features:
        pair_stats = exp.get_neuron_feature_pair_stats(cell_id, feat_name)
        delay = pair_stats.get('shift_used', 0)
        mi = pair_stats.get('me', pair_stats.get('pre_rval', 0))
        
        delays.append(delay)
        mi_values.append(mi)
        
        # Categorize feature type
        if feat_name.startswith('d_feat'):
            feature_types.append('Discrete')
        else:
            feature_types.append('Continuous')

if delays:
    # Create delay distribution plot
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
    
    # Histogram of delays
    ax1.hist(delays, bins=20, alpha=0.7, edgecolor='black')
    ax1.axvline(0, color='red', linestyle='--', label='Zero delay')
    ax1.set_xlabel('Optimal Delay (seconds)')
    ax1.set_ylabel('Count')
    ax1.set_title('Distribution of Optimal Delays')
    ax1.legend()
    
    # Scatter plot: MI vs delay
    for ftype in set(feature_types):
        mask = [ft == ftype for ft in feature_types]
        ax2.scatter([d for d, m in zip(delays, mask) if m],
                   [mi for mi, m in zip(mi_values, mask) if m],
                   alpha=0.6, label=ftype, s=50)
    
    ax2.axvline(0, color='red', linestyle='--', alpha=0.5)
    ax2.set_xlabel('Optimal Delay (seconds)')
    ax2.set_ylabel('Mutual Information')
    ax2.set_title('MI Strength vs Temporal Delay')
    ax2.legend()
    
    plt.tight_layout()
    plt.show()
    
    # Interpretation
    print("Delay Statistics:")
    print(f"  • Mean delay: {np.mean(delays):.3f}s")
    print(f"  • Median delay: {np.median(delays):.3f}s")
    print(f"  • Positive delays (neural follows behavior): {np.sum(np.array(delays) > 0)}")
    print(f"  • Negative delays (neural precedes behavior): {np.sum(np.array(delays) < 0)}")
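
For intuition about what a delay search does, the sketch below scans a window of lags and keeps the one that maximizes a similarity score. It is a simplified stand-in, not the library's algorithm: it scores each lag with absolute Pearson correlation rather than MI, and the sampling rate `fps`, the synthetic signals, and the sign convention (positive delay means activity follows the feature, matching the labels printed above) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
fps = 20          # hypothetical sampling rate (frames per second)
n = 6000
true_lag = 10     # activity follows the feature by 10 frames (0.5 s)

# Smooth the feature slightly so it has some temporal structure
feature = np.convolve(rng.normal(size=n), np.ones(5) / 5, mode="same")
# activity[t] = feature[t - true_lag] + noise
activity = np.roll(feature, true_lag) + 0.3 * rng.normal(size=n)

window_s = 2      # search ±2 seconds, as with shift_window=2 in this notebook
lags = np.arange(-window_s * fps, window_s * fps + 1)

# For each candidate lag, undo that lag on the activity and score the alignment
scores = np.array([
    abs(np.corrcoef(np.roll(activity, -lag), feature)[0, 1])
    for lag in lags
])

best_lag = lags[np.argmax(scores)]
best_delay_s = best_lag / fps
print(f"Best lag: {best_lag} frames ({best_delay_s:.2f} s)")
```

Because the best score is picked from many candidate lags, it is biased upward relative to a single fixed-lag score, so significance should be assessed against shuffles rather than read off the maximum directly.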

## Visualization 3: Statistical Confidence

Let's examine the statistical confidence of our findings:

In [None]:
# Collect p-values and effect sizes
pvals = []
effect_sizes = []
feature_names_list = []

for cell_id, features in significant_neurons.items():
    for feat_name in features:
        pair_stats = exp.get_neuron_feature_pair_stats(cell_id, feat_name)
        
        pval = pair_stats.get('pval', None)
        mi = pair_stats.get('me', pair_stats.get('pre_rval', 0))
        mean_shuffle = pair_stats.get('mean_bs', 1)
        
        # Keep a pair only if both its p-value and effect size are valid,
        # so pvals and effect_sizes stay the same length for the scatter plot
        if pval is not None and pval > 0 and mean_shuffle > 0:
            pvals.append(pval)
            # Effect size: MI relative to the mean of the shuffles
            effect_sizes.append(mi / mean_shuffle)
            feature_names_list.append(feat_name)

if pvals:
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
    
    # P-value distribution (log scale)
    ax1.hist(np.log10(pvals), bins=20, alpha=0.7, edgecolor='black')
    ax1.axvline(np.log10(0.01), color='red', linestyle='--', label='p=0.01')
    ax1.axvline(np.log10(0.001), color='orange', linestyle='--', label='p=0.001')
    ax1.set_xlabel('log10(p-value)')
    ax1.set_ylabel('Count')
    ax1.set_title('Distribution of P-values (Significant Pairs)')
    ax1.legend()
    
    # Effect size vs significance
    scatter = ax2.scatter(effect_sizes, -np.log10(pvals), 
                         c=range(len(pvals)), cmap='viridis',
                         alpha=0.6, s=50)
    ax2.axhline(-np.log10(0.01), color='red', linestyle='--', alpha=0.5)
    ax2.set_xlabel('Effect Size (MI / Mean Shuffle MI)')
    ax2.set_ylabel('-log10(p-value)')
    ax2.set_title('Effect Size vs Statistical Significance')
    
    plt.tight_layout()
    plt.show()
    
    print("Statistical Summary:")
    print(f"  • Median p-value: {np.median(pvals):.2e}")
    print(f"  • Highly significant (p<0.001): {np.sum(np.array(pvals) < 0.001)}")
    print(f"  • Mean effect size: {np.mean(effect_sizes):.2f}x baseline")

## Advanced: Feature-wise Summary

Let's summarize selectivity patterns by feature:

In [None]:
# Analyze selectivity by feature
feature_summary = {}

for feat in exp.dynamic_features.keys():
    selective_neurons = []
    mi_values = []
    
    for cell_id in range(exp.n_cells):
        if cell_id in significant_neurons and feat in significant_neurons[cell_id]:
            selective_neurons.append(cell_id)
            pair_stats = exp.get_neuron_feature_pair_stats(cell_id, feat)
            mi_values.append(pair_stats.get('me', pair_stats.get('pre_rval', 0)))
    
    feature_summary[feat] = {
        'n_selective': len(selective_neurons),
        'percent_selective': 100 * len(selective_neurons) / exp.n_cells,
        'mean_mi': np.mean(mi_values) if mi_values else 0,
        'max_mi': np.max(mi_values) if mi_values else 0,
        'neurons': selective_neurons
    }

# Display summary
print("FEATURE-WISE SELECTIVITY SUMMARY:\n")
print(f"{'Feature':<15} {'N Selective':<12} {'% Selective':<12} {'Mean MI':<10} {'Max MI':<10}")
print("-" * 65)

for feat, summary in sorted(feature_summary.items(), 
                           key=lambda x: x[1]['n_selective'], 
                           reverse=True):
    print(f"{feat:<15} {summary['n_selective']:<12} "
          f"{summary['percent_selective']:<12.1f} "
          f"{summary['mean_mi']:<10.3f} "
          f"{summary['max_mi']:<10.3f}")

# Visualize feature selectivity
if feature_summary:
    fig, ax = plt.subplots(figsize=(10, 6))
    
    features = list(feature_summary.keys())
    n_selective = [feature_summary[f]['n_selective'] for f in features]
    colors = ['skyblue' if f.startswith('d_') else 'lightcoral' for f in features]
    
    bars = ax.bar(features, n_selective, color=colors, edgecolor='black', alpha=0.7)
    ax.set_xlabel('Feature')
    ax.set_ylabel('Number of Selective Neurons')
    ax.set_title('Feature Selectivity Summary')
    threshold_line = ax.axhline(exp.n_cells * 0.05, color='red', linestyle='--', 
                                alpha=0.5, label='5% threshold')
    
    # Add legend; include the threshold line explicitly, since passing
    # custom handles would otherwise leave it out of the legend
    from matplotlib.patches import Patch
    legend_elements = [Patch(facecolor='skyblue', label='Discrete'),
                      Patch(facecolor='lightcoral', label='Continuous'),
                      threshold_line]
    ax.legend(handles=legend_elements, loc='upper right')
    
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

## Exporting Results for Further Analysis

Here's how to export your results for use in other tools:

In [None]:
import pandas as pd

# Create a summary DataFrame
results_list = []

for cell_id, features in significant_neurons.items():
    for feat_name in features:
        pair_stats = exp.get_neuron_feature_pair_stats(cell_id, feat_name)
        
        results_list.append({
            'neuron_id': cell_id,
            'feature': feat_name,
            'mi': pair_stats.get('me', np.nan),
            'normalized_mi': pair_stats.get('pre_rval', np.nan),
            'pvalue': pair_stats.get('pval', np.nan),
            'optimal_delay': pair_stats.get('shift_used', 0),
            'n_shuffles': pair_stats.get('nsh', np.nan),
            'null_mean': pair_stats.get('mean_bs', np.nan),
            'null_std': pair_stats.get('std_bs', np.nan)
        })

if results_list:
    results_df = pd.DataFrame(results_list)
    
    # Display summary
    print("Results DataFrame:")
    print(results_df.head(10))
    print(f"\nShape: {results_df.shape}")
    
    # Save to CSV (uncomment to save)
    # results_df.to_csv('intense_results.csv', index=False)
    # print("\nResults saved to 'intense_results.csv'")
else:
    print("No significant results to export.")

## Key Takeaways

### Understanding Your Results:

1. **Mutual Information (MI)** quantifies the strength of the relationship
   - Higher MI = stronger encoding
   - Captures both linear and nonlinear relationships

2. **P-values** indicate statistical confidence
   - Already corrected for multiple comparisons
   - Lower p-values = stronger evidence against the shuffled (null) distribution

3. **Optimal delays** reveal temporal dynamics
   - Positive delays: neural activity follows behavior (consistent with sensory processing)
   - Negative delays: neural activity precedes behavior (consistent with motor planning)

4. **Effect sizes** (MI / baseline) show practical significance
   - As a rough rule of thumb, values > 2 indicate strong selectivity
   - Consider both statistical and practical significance
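
To illustrate point 1, the sketch below compares Pearson correlation with a simple MI estimate on a purely nonlinear relationship (y ≈ x²). The histogram estimator `binned_mi` and the synthetic data are assumptions made for this example and are not the estimator INTENSE uses.

```python
import numpy as np

def binned_mi(x, y, bins=8):
    """Histogram-based mutual information estimate in bits (illustrative only)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=5000)
y = x**2 + 0.05 * rng.normal(size=5000)   # symmetric, nonlinear dependence

r = np.corrcoef(x, y)[0, 1]                       # linear measure
mi = binned_mi(x, y)                              # dependence measure
mi_shuffled = binned_mi(rng.permutation(x), y)    # baseline with pairing destroyed

print(f"Pearson r:     {r:+.3f}")
print(f"MI:            {mi:.3f} bits")
print(f"MI (shuffled): {mi_shuffled:.3f} bits")
```

The correlation is close to zero because the relationship is symmetric around x = 0, while the MI estimate sits well above its shuffled baseline. The shuffled value is small but not zero, which is why MI is compared against a null distribution rather than against zero.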

### Best Practices:

- Always check both p-values AND effect sizes
- Consider temporal delays when interpreting relationships
- Look for patterns across features (specialized vs mixed selectivity)
- Export results for cross-validation with other methods
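
As a sketch of the first practice, the snippet below filters a small table on both criteria at once. The rows are made-up values and the feature names are hypothetical; the column names mirror the DataFrame built in the export section, and the cutoffs (p < 0.001, effect size > 2) are the ones used elsewhere in this notebook, not fixed recommendations.

```python
import pandas as pd

# Toy table with made-up values, using the same column names as the export above
toy = pd.DataFrame({
    "neuron_id": [0, 1, 2, 3],
    "feature": ["c_feat_0", "d_feat_1", "c_feat_0", "d_feat_2"],  # hypothetical names
    "mi": [0.30, 0.05, 0.12, 0.02],
    "pvalue": [1e-6, 5e-4, 2e-2, 1e-5],
    "null_mean": [0.02, 0.04, 0.03, 0.015],
})

# Effect size as defined in this notebook: MI relative to the mean shuffled MI
toy["effect_size"] = toy["mi"] / toy["null_mean"]

# Keep only pairs that pass both the statistical and the practical criterion
strong = toy[(toy["pvalue"] < 0.001) & (toy["effect_size"] > 2)]
print(strong[["neuron_id", "feature", "pvalue", "effect_size"]])
```

In this toy table, neurons 1 and 3 have small p-values but MI barely above baseline, and neuron 2 has a large effect size but a weak p-value, so only neuron 0 passes both filters.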

### Next Steps:

- Try the mixed selectivity analysis (see examples/mixed_selectivity.py)
- Explore feature-feature relationships with compute_feat_feat_significance
- Apply to your own neural recordings (see Notebook 03)