# Gap Analysis Notebook

This notebook orchestrates the execution of multiple Gap analysis notebooks using Papermill, allowing for automated parameter injection and HTML report generation.

## 📋 Overview

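The notebook executes the following analysis pipeline:

1. **Gap ET Extraction** (`1_gap_et_extraction.ipynb`)
   - Extracts eye-tracking data and demographics
   - Parameters passed: `date`

2. **Gap Data Preprocessing** (`2_gap_et_preprocessing.ipynb`)
   - Comprehensive preprocessing and quality control
   - Parameters passed: `date`, `min_n_trials_per_condition`, `full_preprocessing`, `run_test`

3. **Gap Accuracy Analysis** (`3_gap_accuracy_analysis.ipynb`)
   - Runs the accuracy analysis for a chosen participant removal strategy and age group
   - Parameters passed: `date`, `participant_removal_type`, `age_group`

4. **Reaction Time Analysis**
   - Section placeholder; this notebook does not yet contain an execution cell for it

5. **Additional Analysis** (using preprocessing notebook with different parameters)
   - Runs specific analysis configurations
   - Allows for parameter variations and comparisons
   - Parameters passed: `date`, `files_date`, `derivative`, `outlier_rem_value`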

## 🔧 Key Features

- **Flexible Parameter Handling**: Each notebook gets only the parameters it needs
- **Error Handling**: Errors are printed and then re-raised, so the pipeline stops at the first failing notebook
- **Output Management**: Organized HTML reports in separate directories
- **Progress Tracking**: Clear status updates during execution
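
As written, `execute_notebook_with_papermill` re-raises any exception, so a failing notebook stops the cells that follow it. If continuation is wanted, one option is a small driver that records each failure and moves on to the next step. The sketch below is only an illustration: `run_steps`, `runner`, and `fake_runner` are hypothetical names that are not part of this notebook, and `fake_runner` stands in for the real Papermill call so the example runs without any notebooks.

```python
def run_steps(steps, runner):
    """Run each (name, params) step in order; record failures instead of stopping."""
    results = {}
    for name, params in steps:
        try:
            results[name] = ("ok", runner(name, **params))
        except Exception as exc:  # keep going, but remember what failed
            results[name] = ("failed", str(exc))
    return results


def fake_runner(name, **params):
    """Hypothetical stand-in for the real executor; fails for one step on purpose."""
    if name == "gap_et_preprocessing":
        raise ValueError("simulated failure")
    return f"{name}_papermill.html"


results = run_steps(
    [
        ("gap_et_extraction", {"date": "2025_06_29"}),
        ("gap_et_preprocessing", {"date": "2025_06_29"}),
        ("gap_accuracy_analysis", {"date": "2025_06_30"}),
    ],
    fake_runner,
)
for step, (status, detail) in results.items():
    print(f"{step}: {status} ({detail})")
```

Continuing after a failure is only safe when later steps do not depend on the failed step's outputs, which may not hold for this pipeline.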

## 📁 Output Structure

```
papermill_outputs/
├── gap_et_extraction/
│   ├── gap_et_extraction_papermill.html
│   └── gap_et_extraction_papermill_executed.ipynb
├── gap_et_preprocessing/
│   ├── gap_et_preprocessing_papermill.html
│   └── gap_et_preprocessing_papermill_executed.ipynb
├── gap_accuracy_analysis/
│   ├── gap_accuracy_analysis_papermill.html
│   └── gap_accuracy_analysis_papermill_executed.ipynb
└── gap_additional_analysis/
    ├── gap_analysis_outliers_removed_[date].html
    └── gap_analysis_outliers_removed_[date]_executed.ipynb
```
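
The layout above follows one pattern: a folder per step, an HTML report, and an executed copy of the notebook whose name ends in `_executed.ipynb`. A minimal sketch of deriving both paths from a step name is shown here. The helper `output_paths` and its `suffix` argument are hypothetical and not part of this notebook; `Path.with_name` is used so that only the file name is rewritten.

```python
from pathlib import Path


def output_paths(base_dir, step_name, suffix="papermill"):
    """Return (html_path, executed_notebook_path) for one pipeline step."""
    out_dir = Path(base_dir) / step_name
    html = out_dir / f"{step_name}_{suffix}.html"
    # Swap the file name only, leaving the directory part untouched
    executed = html.with_name(f"{html.stem}_executed.ipynb")
    return html, executed


html, executed = output_paths("papermill_outputs", "gap_et_extraction")
print(html.as_posix())      # papermill_outputs/gap_et_extraction/gap_et_extraction_papermill.html
print(executed.as_posix())  # papermill_outputs/gap_et_extraction/gap_et_extraction_papermill_executed.ipynb
```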

## 🚀 Usage

1. Update date parameters as needed
2. Modify notebook-specific parameters in their respective sections
3. Run all cells to execute the complete pipeline
4. Check generated HTML reports for results
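
The date parameters are plain strings such as `"2025_06_29"`. Assuming the intended format is year, month, day separated by underscores (an assumption; the last execution cell uses a month-first string instead), a tag can be generated rather than typed by hand. The helper name `date_tag` is hypothetical.

```python
from datetime import date


def date_tag(day):
    """Format a date as YYYY_MM_DD, the pattern assumed for the `date` parameter."""
    return day.strftime("%Y_%m_%d")


print(date_tag(date(2025, 6, 29)))  # 2025_06_29
print(date_tag(date.today()))       # tag for the current run
```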

In [5]:
import nbformat
from nbconvert import HTMLExporter
from nbclient import NotebookClient
from pathlib import Path
import papermill as pm


In [22]:
def execute_notebook_with_papermill(notebook_path, output_html, **parameters):
    """Execute a Jupyter notebook with Papermill and export the result to HTML.

    Args:
        notebook_path (str): Path to the notebook to execute.
        output_html (str): Path to save the HTML output of the notebook.
        **parameters: Keyword arguments to pass as parameters to the notebook.
            Parameters whose value is None are not passed.

    Returns:
        str: Path to the executed copy of the notebook.
    """
    print(f"Executing notebook: {Path(notebook_path).name}")
    print(f"Parameters: {parameters}")

    # Load the target notebook to read its kernel information
    with open(notebook_path, "r", encoding="utf-8") as f:
        notebook = nbformat.read(f, as_version=4)

    # Use the kernel recorded in the target notebook, falling back to python3
    kernel_name = notebook.metadata.get('kernelspec', {}).get('name', 'python3')

    # Execute the notebook with Papermill
    executed_notebook_path = output_html.replace(".html", "_executed.ipynb")
    
    try:
        # Execute with parameters - only pass non-None parameters
        filtered_parameters = {k: v for k, v in parameters.items() if v is not None}
        
        pm.execute_notebook(
            notebook_path,
            executed_notebook_path,
            parameters=filtered_parameters,
            kernel_name=kernel_name        
        )

        # Convert the executed notebook to HTML
        html_exporter = HTMLExporter()
        html_exporter.exclude_input = True  # Exclude input cells for cleaner output
        
        with open(executed_notebook_path, "r", encoding="utf-8") as f:
            executed_notebook = nbformat.read(f, as_version=4)
            html_content, _ = html_exporter.from_notebook_node(executed_notebook)
        
        # Save HTML output
        with open(output_html, "w", encoding="utf-8") as html_file:
            html_file.write(html_content)

        print(f"✓ Successfully executed and saved to: {output_html}")
        
    except Exception as e:
        print(f"✗ Error executing notebook: {str(e)}")
        raise
    
    return executed_notebook_path
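
The function drops parameters whose value is `None` before calling Papermill. The filter tests identity with `None` rather than truthiness, so `False` and `0` are still passed through. A small sketch isolating that step (the name `drop_none` is hypothetical, and the `None` entry is added only for the demonstration):

```python
def drop_none(parameters):
    """Return a copy of `parameters` without entries whose value is None."""
    return {key: value for key, value in parameters.items() if value is not None}


params = {
    "date": "2025_06_29",
    "full_preprocessing": False,      # falsy, but kept
    "min_n_trials_per_condition": 0,  # falsy, but kept
    "derivative": None,               # dropped
}
filtered = drop_none(params)
print(filtered)
# {'date': '2025_06_29', 'full_preprocessing': False, 'min_n_trials_per_condition': 0}
```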

# 📊 Notebook Execution Pipeline

## 1. Gap ET Extraction and Demographics

Extracts raw eye-tracking data and participant demographics from the Gap task files. This notebook handles the initial data extraction and basic quality checks.

In [23]:
# Define paths
notebook_to_run = "C:/Users/gabot/OneDrive - McGill University/Desktop/github_repos/q1k_neurosubs/code/et/1_gap_et_extraction.ipynb"
out_dir = Path("papermill_outputs/gap_et_extraction")
out_dir.mkdir(parents=True, exist_ok=True)  # Also creates papermill_outputs/ if it is missing

# Parameters specific to extraction notebook
extraction_params = {
    "date": "2025_06_29",
}

print("Run gap extraction notebook")

execute_notebook_with_papermill(
    notebook_path=notebook_to_run,
    output_html=str(out_dir / "gap_et_extraction_papermill.html"),
    **extraction_params,
)


Run gap extraction notebook
Executing notebook: 1_gap_et_extraction.ipynb
Parameters: {'date': '2025_06_29'}


Executing:   0%|          | 0/60 [00:00<?, ?cell/s]

✓ Successfully executed and saved to: papermill_outputs\gap_et_extraction\gap_et_extraction_papermill.html


'papermill_outputs\\gap_et_extraction\\gap_et_extraction_papermill_executed.ipynb'
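
The notebook path in the cell above is an absolute path on one machine. A possible way to make it portable is to resolve it against a repository root read from an environment variable. This is a sketch under stated assumptions: the variable name `Q1K_REPO_ROOT` and the helper `resolve_notebook` are hypothetical, and the relative path simply mirrors the tail of the path used above.

```python
import os
from pathlib import Path


def resolve_notebook(relative_path, env_var="Q1K_REPO_ROOT", default_root="."):
    """Join a repo-relative notebook path onto a root taken from the environment."""
    root = Path(os.environ.get(env_var, default_root))
    return root / relative_path


# Falls back to the current directory when the (hypothetical) variable is unset
notebook_to_run = resolve_notebook("code/et/1_gap_et_extraction.ipynb")
print(notebook_to_run)
```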

## 2. Gap Data Preprocessing

Performs comprehensive preprocessing of the extracted Gap task data, including:
- Trial validity assessment and exclusion criteria
- Participant-level quality control
- Eye-tracker calibration analysis
- Statistical summaries and visualizations

This step is crucial for ensuring data quality before statistical analysis.

In [24]:
# Define paths and parameters for preprocessing notebook
notebook_to_run = "C:/Users/gabot/OneDrive - McGill University/Desktop/github_repos/q1k_neurosubs/code/et/2_gap_et_preprocessing.ipynb"
out_dir = Path("papermill_outputs/gap_et_preprocessing")
out_dir.mkdir(parents=True, exist_ok=True)  # Also creates papermill_outputs/ if it is missing

# Parameters specific to preprocessing notebook
preprocessing_params = {
    "date": "2025_06_29",
    "min_n_trials_per_condition": 6,  # Minimum trials required per condition
    "full_preprocessing": False,  # Whether to process all EDF files from scratch (time-consuming)
    "run_test": False,  # Set to True to check for discrepancies in the methods used to export calibration
}

print("Run gap preprocessing notebook")

execute_notebook_with_papermill(
    notebook_path=notebook_to_run,
    output_html=str(out_dir / "gap_et_preprocessing_papermill.html"),
    **preprocessing_params,
)

Passed unknown parameter: date
Passed unknown parameter: min_n_trials_per_condition
Passed unknown parameter: full_preprocessing
Passed unknown parameter: run_test


Input notebook does not contain a cell with tag 'parameters'


Run gap preprocessing notebook
Executing notebook: 2_gap_et_preprocessing.ipynb
Parameters: {'date': '2025_06_29', 'min_n_trials_per_condition': 6, 'full_preprocessing': False, 'run_test': False}


Executing:   0%|          | 0/50 [00:00<?, ?cell/s]

✓ Successfully executed and saved to: papermill_outputs\gap_et_preprocessing\gap_et_preprocessing_papermill.html


'papermill_outputs\\gap_et_preprocessing\\gap_et_preprocessing_papermill_executed.ipynb'
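
The warnings in the output above (`Input notebook does not contain a cell with tag 'parameters'`) mean the target notebook has no cell tagged `parameters`. According to Papermill's documentation, the injected values are then placed at the top of the notebook, so any later assignment to the same name inside the notebook would override them. Tagging the cell that holds the defaults is the usual remedy. The sketch below edits the notebook's JSON structure directly, assuming the nbformat v4 layout in which tags live under `cell["metadata"]["tags"]`; `tag_parameters_cell` is a hypothetical helper, and the in-memory notebook is a toy example rather than one of the Gap notebooks.

```python
import json


def tag_parameters_cell(notebook, cell_index):
    """Add the 'parameters' tag to the code cell at `cell_index` (in place)."""
    cell = notebook["cells"][cell_index]
    if cell.get("cell_type") != "code":
        raise ValueError("only code cells can hold Papermill parameters")
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if "parameters" not in tags:  # idempotent: never add the tag twice
        tags.append("parameters")
    return notebook


# Toy notebook in nbformat v4 layout
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Title"},
        {"cell_type": "code", "metadata": {}, "source": "date = '2025_06_29'",
         "outputs": [], "execution_count": None},
    ],
}
tag_parameters_cell(nb, 1)
print(json.dumps(nb["cells"][1]["metadata"]))  # {"tags": ["parameters"]}
```

The same tag can also be set by hand in the Jupyter interface through the cell tag editor.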

## 3. Accuracy Analysis

Runs the accuracy analysis notebook (`3_gap_accuracy_analysis.ipynb`) with a specific parameter combination for targeted analyses. This allows for:
- Comparison of participant removal strategies (`remove_min_trials` vs. `remove_all`)
- Analysis of different data subsets (e.g., by age group: `all`, `child`, or `adult`)
- Parameter sensitivity testing

In [25]:
# Define paths and parameters for accuracy analysis notebook
notebook_to_run = "C:/Users/gabot/OneDrive - McGill University/Desktop/github_repos/q1k_neurosubs/code/et/3_gap_accuracy_analysis.ipynb"
out_dir = Path("papermill_outputs/gap_accuracy_analysis")
out_dir.mkdir(parents=True, exist_ok=True)  # Also creates papermill_outputs/ if it is missing

# Parameters specific to accuracy analysis notebook
accuracy_params = {
    "date": "2025_06_30",
    "participant_removal_type": "remove_all",  # Can be "remove_min_trials" or "remove_all"
    "age_group": "all",  # Can be "all", "child" or "adult"
}

print("Run gap accuracy analysis notebook")

execute_notebook_with_papermill(
    notebook_path=notebook_to_run,
    output_html=str(out_dir / "gap_accuracy_analysis_papermill.html"),
    **accuracy_params,
)

Passed unknown parameter: date
Passed unknown parameter: participant_removal_type
Passed unknown parameter: age_group


Input notebook does not contain a cell with tag 'parameters'


Run gap accuracy analysis notebook
Executing notebook: 3_gap_accuracy_analysis.ipynb
Parameters: {'date': '2025_06_30', 'participant_removal_type': 'remove_all', 'age_group': 'all'}


Executing:   0%|          | 0/77 [00:00<?, ?cell/s]

✓ Successfully executed and saved to: papermill_outputs\gap_accuracy_analysis\gap_accuracy_analysis_papermill.html


'papermill_outputs\\gap_accuracy_analysis\\gap_accuracy_analysis_papermill_executed.ipynb'
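
The comments in the cell above list the accepted values for `participant_removal_type` and `age_group`. Because a notebook run can take a while, it may be worth checking those values before launching it. A minimal sketch, assuming the allowed sets are exactly the ones named in the comments; `ALLOWED_VALUES` and `validate_params` are hypothetical names.

```python
ALLOWED_VALUES = {
    "participant_removal_type": {"remove_min_trials", "remove_all"},
    "age_group": {"all", "child", "adult"},
}


def validate_params(params, allowed=ALLOWED_VALUES):
    """Raise ValueError if a constrained parameter has an unexpected value."""
    for name, options in allowed.items():
        if name in params and params[name] not in options:
            raise ValueError(
                f"{name}={params[name]!r} is not one of {sorted(options)}"
            )
    return params


checked = validate_params({
    "date": "2025_06_30",  # unconstrained here, passed through unchanged
    "participant_removal_type": "remove_all",
    "age_group": "all",
})
print(checked)
```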

## 4. Reaction Time Analysis

Run the preprocessing notebook with specific parameter combinations for targeted analyses. This allows for:
- Comparison of different outlier removal strategies
- Analysis of different data subsets (e.g., pilot vs. main study)
- Parameter sensitivity testing

## 5. Additional Analysis Configurations

Run the preprocessing notebook with specific parameter combinations for targeted analyses. This allows for:
- Comparison of different outlier removal strategies
- Analysis of different data subsets (e.g., pilot vs. main study)
- Parameter sensitivity testing
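
Comparing strategies and testing sensitivity both amount to running the same notebook over a grid of parameter values. A sketch of building such a grid with `itertools.product` follows. The helper `parameter_grid` is hypothetical; the option values are the ones mentioned in this notebook (`"standard"` or `"no_pilot"` for `derivative`, `True` or `False` for the outlier flag).

```python
from itertools import product


def parameter_grid(**options):
    """Return one dict per combination of the given option lists."""
    names = list(options)
    return [dict(zip(names, values)) for values in product(*options.values())]


configs = parameter_grid(
    derivative=["standard", "no_pilot"],
    outlier_rem_value=[False, True],
)
for config in configs:
    print(config)
```

Each dictionary could then be passed as keyword arguments to `execute_notebook_with_papermill`, with an output file name that encodes the combination so reports do not overwrite each other.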


In [3]:
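# Run the Gap overlap notebook with the chosen outlier-removal setting

# Define paths (assumed from the section text: this cell reruns the preprocessing notebook)
notebook_to_run = "C:/Users/gabot/OneDrive - McGill University/Desktop/github_repos/q1k_neurosubs/code/et/2_gap_et_preprocessing.ipynb"
out_dir = Path("papermill_outputs/gap_additional_analysis")
out_dir.mkdir(parents=True, exist_ok=True)  # Defining out_dir here avoids the NameError on a fresh kernel

# Date
date = "01_24_2025"
files_date = "01_20_2025"
derivative = "standard"  # "standard" or "no_pilot"
outlier_rem = False

execute_notebook_with_papermill(
    notebook_path=notebook_to_run,
    output_html=str(out_dir / f"gap_analysis_outliers_removed_{date}.html"),
    outlier_rem_value=outlier_rem,
    date=date,
    files_date=files_date,
    derivative=derivative,
)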

In [None]:
# Execution summary and utilities

def list_output_files():
    """List all generated output files."""
    print("GENERATED OUTPUT FILES")
    
    output_base = Path("papermill_outputs")
    if output_base.exists():
        for subfolder in output_base.iterdir():
            if subfolder.is_dir():
                print(f"\n📁 {subfolder.name}/")
                for file in subfolder.iterdir():
                    if file.suffix in ['.html', '.ipynb']:
                        size = file.stat().st_size / 1024  # Size in KB
                        print(f"   📄 {file.name} ({size:.1f} KB)")
    else:
        print("No output directory found.")

def clean_output_files():
    """Clean up generated output files."""
    output_base = Path("papermill_outputs")
    if output_base.exists():
        import shutil
        shutil.rmtree(output_base)
        print("🗑️ Cleaned up all output files.")
    else:
        print("No output files to clean.")

# Show summary of outputs
list_output_files()

print("NOTEBOOK EXECUTION COMPLETE")
print("Use list_output_files() to see generated files")
print("Use clean_output_files() to remove all outputs")

## 6. Execution Summary and File Management

Utility functions for managing outputs and summarizing execution results.
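
`list_output_files()` prints its findings, which is convenient interactively but hard to reuse. A variant that returns the same information as data could look like the sketch below. `collect_output_files` is a hypothetical helper, not part of the notebook, and the demo builds a temporary directory so it does not touch `papermill_outputs/`.

```python
import tempfile
from pathlib import Path


def collect_output_files(output_base, suffixes=(".html", ".ipynb")):
    """Map each subfolder name to a sorted list of (file name, size in KB)."""
    base = Path(output_base)
    summary = {}
    if not base.exists():
        return summary
    for subfolder in sorted(p for p in base.iterdir() if p.is_dir()):
        summary[subfolder.name] = sorted(
            (f.name, f.stat().st_size / 1024)
            for f in subfolder.iterdir()
            if f.is_file() and f.suffix in suffixes
        )
    return summary


# Demo on a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    step_dir = Path(tmp) / "gap_et_extraction"
    step_dir.mkdir()
    (step_dir / "report.html").write_text("x" * 2048, encoding="utf-8")
    (step_dir / "notes.txt").write_text("ignored", encoding="utf-8")
    demo_summary = collect_output_files(tmp)

print(demo_summary)  # {'gap_et_extraction': [('report.html', 2.0)]}
```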