# Notebook Generator Demo

This notebook demonstrates how to use the `notebook_generator` module to dynamically create Jupyter notebooks from CSV or Excel files.

## Overview

The notebook generator reads a CSV or Excel file with three columns:
- **Stage**: Name of the folder to create
- **Step**: Notebook name (without the .ipynb extension)
- **Activities**: Text used as a markdown header inside the notebook

It then creates:
1. A folder for each unique Stage
2. A notebook for each unique Step within its Stage folder
3. For each Activity: a markdown cell (with activity as header) followed by an empty code cell
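
The mapping above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the module's actual implementation: the sample rows, the helper names `group_rows` and `plan_cells`, and the `##` header level are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample input; only the column names come from the description above.
SAMPLE_CSV = """Stage,Step,Activities
01_Setup,01_Initialize_Environment,Load configuration
01_Setup,01_Initialize_Environment,Check credentials
01_Setup,02_Validate_Config_File,Validate schema
02_Data_Extraction,01_Extract,Run SQL scripts
"""

def group_rows(csv_text):
    """Group rows into {stage: {step: [activity, ...]}}, preserving file order."""
    structure = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(csv_text)):
        stage = row["Stage"].strip()
        step = row["Step"].strip()
        structure[stage][step].append(row["Activities"].strip())
    return {stage: dict(steps) for stage, steps in structure.items()}

def plan_cells(activities):
    """Return (cell_type, source) pairs: a markdown header, then an empty code cell."""
    cells = []
    for activity in activities:
        cells.append(("markdown", f"## {activity}"))  # header level is an assumption
        cells.append(("code", ""))
    return cells

structure = group_rows(SAMPLE_CSV)
for stage, steps in structure.items():
    for step, activities in steps.items():
        print(f"{stage}/{step}.ipynb: {len(plan_cells(activities))} cells")
```

Each activity contributes two cells, so a step with four activities would yield an eight-cell notebook under this scheme.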

## Key Features

- **Automatic file type detection**: Based on file extension (.csv, .xlsx, .xls, .xlsm)
- **Excel sheet support**: Specify sheet name or use first sheet by default
- **Overwrite control**: Choose whether to overwrite existing notebooks
- **Clear function**: Safely clear generated folders with built-in safety checks
- **Statistics tracking**: Get detailed info on what was created or deleted
- **Preview mode**: Check structure before generating files
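
Extension-based detection might look roughly like the sketch below. The function name `detect_file_type`, the case-insensitive comparison, and the `ValueError` for unknown extensions are assumptions made for illustration; the module's real behavior may differ.

```python
from pathlib import Path

# Excel extensions listed in the feature description above.
EXCEL_EXTENSIONS = {".xlsx", ".xls", ".xlsm"}

def detect_file_type(file_path):
    """Return 'csv' or 'excel' based on the file extension (hypothetical helper)."""
    suffix = Path(file_path).suffix.lower()  # case-insensitive match is an assumption
    if suffix == ".csv":
        return "csv"
    if suffix in EXCEL_EXTENSIONS:
        return "excel"
    raise ValueError(f"Unsupported file type: {suffix!r}")

print(detect_file_type("stages_and_steps.csv"))
print(detect_file_type("stages_and_steps.xlsx"))
```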

## Dependencies

- **nbformat**: Required (for creating notebooks)
- **pandas**: Required for Excel files only (install with: `pip install pandas openpyxl`)
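
To check which of these packages are available before running the examples, something like the following works with the standard library only. The helper name `missing_packages` is hypothetical and not part of the module.

```python
import importlib.util

def missing_packages(package_names):
    """Return the names from package_names that cannot be found in this environment."""
    return [name for name in package_names if importlib.util.find_spec(name) is None]

# nbformat is always needed; pandas and openpyxl only matter for Excel input.
print("Missing required:", missing_packages(["nbformat"]))
print("Missing for Excel:", missing_packages(["pandas", "openpyxl"]))
```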

## Setup

In [7]:
%load_ext autoreload
%autoreload 2
import sys
import os
from pathlib import Path
sys.path.append('../../')

from helpers.notebook_generator import (
    generate_notebooks_from_file,
    generate_notebooks_from_csv,  # Backward compatibility
    preview_structure,
    clear_generated_folder
)


## Example 1: CSV Files

### Preview Structure

Before generating notebooks, let's preview what will be created from a CSV file:

In [8]:
csv_filename = "stages_and_steps.csv"
csv_file = str(Path.cwd() / "workspace" / "workflows" / "_Tools" / "files" / csv_filename)

# Preview the folder structure
print(preview_structure(csv_file))

Folder Structure Preview:
Source: CSV file
01_Setup/
  - 01_Initialize_Environment.ipynb (4 activities)
  - 02_Validate_Config_File.ipynb (4 activities)
02_Data_Extraction/
  - 01_Validate_And_Update_SQL_Scripts.ipynb (3 activities)
  - 02_Extract_Import_CSVs.ipynb (2 activities)
03_Data_Import/
  - 01_Import_Base_Data.ipynb (4 activities)
  - 02_Portfolio_Mapping.ipynb (4 activities)
04_Analysis_Execution/
  - 01_GeoHaz_Portfolios.ipynb (3 activities)
  - 02_Execute_Analysis.ipynb (2 activities)
05_Grouping_and_Export/
  - 01_Group_Analysis_Results.ipynb (2 activities)
  - 02_Export_to_RDM.ipynb (2 activities)


### Generate Notebooks from CSV

Now let's generate the actual notebooks and folder structure from the CSV file:

In [9]:
# Construct output directory path
output_dir = str(Path.cwd() / "workspace" / "workflows" / "_Template" / "generated")

# Generate notebooks from CSV
stats = generate_notebooks_from_file(
    file_path=csv_file,
    output_dir=output_dir,
    overwrite=True,  # Set to True to overwrite existing notebooks
    verbose=True
)

print("\nGeneration statistics:")
print(stats)

Parsing CSV file: /home/jovyan/workspace/workflows/_Tools/files/stages_and_steps.csv
Output directory: /home/jovyan/workspace/workflows/_Template/generated
Found 5 stages

Stage: 01_Setup
  - 01_Initialize_Environment.ipynb (4 activities)
  - 02_Validate_Config_File.ipynb (4 activities)

Stage: 02_Data_Extraction
  - 01_Validate_And_Update_SQL_Scripts.ipynb (3 activities)
  - 02_Extract_Import_CSVs.ipynb (2 activities)

Stage: 03_Data_Import
  - 01_Import_Base_Data.ipynb (4 activities)
  - 02_Portfolio_Mapping.ipynb (4 activities)

Stage: 04_Analysis_Execution
  - 01_GeoHaz_Portfolios.ipynb (3 activities)
  - 02_Execute_Analysis.ipynb (2 activities)

Stage: 05_Grouping_and_Export
  - 01_Group_Analysis_Results.ipynb (2 activities)
  - 02_Export_to_RDM.ipynb (2 activities)

GENERATION COMPLETE
Stages created: 5
Notebooks created: 10
Notebooks skipped: 0
Total activities: 30

Generation statistics:
{'stages_created': 5, 'notebooks_created': 10, 'notebooks_skipped': 0, 'total_activities': 

## Example 2: Excel Files

### Preview Excel Structure

Excel files work the same way. The module detects the file type from the file extension:

In [10]:
# Example with Excel file
excel_filename = "stages_and_steps.xlsx"
excel_file = str(Path.cwd() / "workspace" / "workflows" / "_Tools" / "files" / excel_filename)
print(preview_structure(excel_file))

# Or specify a specific sheet
print(preview_structure(excel_file, sheet_name="Sheet1"))

Folder Structure Preview:
Source: Excel file (first sheet)
01_Setup/
  - 01_Initialize_Environment.ipynb (4 activities)
  - 02_Validate_Config_File.ipynb (4 activities)
02_Data_Extraction/
  - 01_Validate_And_Update_SQL_Scripts.ipynb (3 activities)
  - 02_Extract_Import_CSVs.ipynb (2 activities)
03_Data_Import/
  - 01_Import_Base_Data.ipynb (4 activities)
  - 02_Portfolio_Mapping.ipynb (4 activities)
04_Analysis_Execution/
  - 01_GeoHaz_Portfolios.ipynb (3 activities)
  - 02_Execute_Analysis.ipynb (2 activities)
05_Grouping_and_Export/
  - 01_Group_Analysis_Results.ipynb (2 activities)
  - 02_Export_to_RDM.ipynb (2 activities)
Folder Structure Preview:
Source: Excel file (sheet: Sheet1)
01_Setup/
  - 01_Initialize_Environment.ipynb (4 activities)
  - 02_Validate_Config_File.ipynb (4 activities)
02_Data_Extraction/
  - 01_Validate_And_Update_SQL_Scripts.ipynb (3 activities)
  - 02_Extract_Import_CSVs.ipynb (2 activities)
03_Data_Import/
  - 01_Import_Base_Data.ipynb (4 activities)
  - 0

### Generate Notebooks from Excel

Generate notebooks from the Excel file; the file type is detected automatically:

In [11]:
output_dir = str(Path.cwd() / "workspace" / "workflows" / "_Template" / "generated")

stats = generate_notebooks_from_file(
    file_path=excel_file,
    output_dir=output_dir,
    sheet_name="Sheet1",  # Optional: specify sheet name (default: first sheet)
    overwrite=True,
    verbose=True
)

print("\nGeneration statistics:")
print(stats)

Parsing Excel file: /home/jovyan/workspace/workflows/_Tools/files/stages_and_steps.xlsx (sheet: Sheet1)
Output directory: /home/jovyan/workspace/workflows/_Template/generated
Found 5 stages

Stage: 01_Setup
  - 01_Initialize_Environment.ipynb (4 activities)
  - 02_Validate_Config_File.ipynb (4 activities)

Stage: 02_Data_Extraction
  - 01_Validate_And_Update_SQL_Scripts.ipynb (3 activities)
  - 02_Extract_Import_CSVs.ipynb (2 activities)

Stage: 03_Data_Import
  - 01_Import_Base_Data.ipynb (4 activities)
  - 02_Portfolio_Mapping.ipynb (4 activities)

Stage: 04_Analysis_Execution
  - 01_GeoHaz_Portfolios.ipynb (3 activities)
  - 02_Execute_Analysis.ipynb (2 activities)

Stage: 05_Grouping_and_Export
  - 01_Group_Analysis_Results.ipynb (2 activities)
  - 02_Export_to_RDM.ipynb (2 activities)

GENERATION COMPLETE
Stages created: 0
Notebooks created: 10
Notebooks skipped: 0
Total activities: 30

Generation statistics:
{'stages_created': 0, 'notebooks_created': 10, 'notebooks_skipped': 0, '

## Clearing the Generated Folder

Before regenerating notebooks, you may want to clear the existing generated folder. The module provides `clear_generated_folder` for this, with built-in safety checks.

### Safety Features

The `clear_generated_folder` function includes important safety features:

1. **Path Validation**: By default, the path must contain 'generated' to prevent accidental deletion of important folders
2. **Statistics**: Returns detailed info about what was deleted
3. **Confirmation**: The `confirm=True` parameter enforces the safety check

**WARNING**: Only use `confirm=False` if you're absolutely certain about the path!
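
The idea behind the path check can be illustrated with a small standalone sketch. It is not the module's code: the name `clear_folder_sketch` is hypothetical, the returned keys are modeled on the statistics printed in this notebook, and keeping the top-level folder while deleting only its contents is an assumption.

```python
import shutil
import tempfile
from pathlib import Path

def clear_folder_sketch(output_dir, confirm=True):
    """Delete everything inside output_dir and report what was removed."""
    path = Path(output_dir)
    # Safety check: with confirm=True, refuse paths that do not contain 'generated'.
    if confirm and "generated" not in str(path):
        raise ValueError(f"Refusing to clear {path}: path does not contain 'generated'")
    stats = {"deleted": False, "folders_removed": 0, "files_removed": 0, "path": str(path)}
    if not path.is_dir():
        return stats  # nothing to clear
    # Count entries before deleting so the statistics reflect what was there.
    entries = list(path.rglob("*"))
    stats["folders_removed"] = sum(1 for entry in entries if entry.is_dir())
    stats["files_removed"] = sum(1 for entry in entries if entry.is_file())
    for child in path.iterdir():
        if child.is_dir():
            shutil.rmtree(child)
        else:
            child.unlink()
    stats["deleted"] = True
    return stats

# Demonstrate on a throwaway directory rather than on real workflow folders.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "generated"
    (target / "01_Setup").mkdir(parents=True)
    (target / "01_Setup" / "01_Init.ipynb").write_text("{}")
    print(clear_folder_sketch(target))
```

A substring check like this is a coarse guard: it blocks obvious mistakes such as passing a home directory, but it would still accept any unrelated path that happens to contain the word.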

In [12]:
# Clear the generated folder
# NOTE: This has a built-in safety check - the path must contain 'generated'
output_dir = str(Path.cwd() / "workspace" / "workflows" / "_Template" / "generated")

clear_stats = clear_generated_folder(
    output_dir=output_dir,
    confirm=True,  # Safety check: path must contain 'generated'
    verbose=True
)

print("\nClear statistics:")
print(clear_stats)

Clearing directory: /home/jovyan/workspace/workflows/_Template/generated
Found 10 files and 5 folders
CLEAR COMPLETE
Removed 10 files and 5 folders

Clear statistics:
{'deleted': True, 'folders_removed': 5, 'files_removed': 10, 'path': '/home/jovyan/workspace/workflows/_Template/generated'}
