# Cradle Data Load Configuration - Interactive UI Widget Example

This notebook demonstrates how to use the Cradle Config Widget to replace manual configuration blocks with an interactive UI.

## 🎯 What This Notebook Does

1. **Automatically starts the Cradle UI server** (self-contained)
2. Sets up a basic pipeline configuration
3. Creates an interactive UI widget for Cradle data loading configuration
4. Generates a `CradleDataLoadConfig` object that can be used in your pipeline
5. Shows how to add the config to your `config_list`
6. Demonstrates creating multiple configurations **one at a time**

## 📋 Prerequisites

- Required packages: `ipywidgets`, `requests`, `uvicorn`
- **No manual server setup required** - this notebook handles it automatically!
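
A quick way to confirm the prerequisites before running the rest of the notebook is to check that each package can be imported. This is a minimal sketch using only the standard library; `find_missing_packages` is a hypothetical helper name, and the list covers only the three packages named above.

```python
import importlib.util

# Packages listed in the prerequisites above
REQUIRED_PACKAGES = ["ipywidgets", "requests", "uvicorn"]

def find_missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = find_missing_packages(REQUIRED_PACKAGES)
if missing:
    print(f"Missing packages: {', '.join(missing)}")
else:
    print("All required packages are available")
```

If anything is reported missing, install it into the kernel's environment before continuing.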

## Step 1: Setup and Imports

In [None]:
# Standard imports
import sys
from pathlib import Path
import json
from datetime import datetime
import subprocess
import time
import requests
import threading
import atexit

# Add project root to path
project_root = str(Path().absolute().parent.parent.parent)
if project_root not in sys.path:
    sys.path.insert(0, project_root)
    print(f"Added project root to path: {project_root}")

print("✅ Imports and path setup complete")

## Step 2: Start Cradle UI Server (Self-Contained)

This notebook automatically starts the server for you - no manual setup required!
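
The startup cell below polls the server's `/health` endpoint until it answers. The same pattern can be sketched with the standard library alone; `is_healthy` and `wait_until` are hypothetical helper names (not part of `cursus`), and the URL in the usage note assumes the default port 8001 used in this notebook.

```python
import http.client
import time
import urllib.error
import urllib.request

def is_healthy(url, timeout=2.0):
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not healthy"
        return False

def wait_until(check, attempts=10, delay=1.0):
    """Call `check()` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_until(lambda: is_healthy("http://localhost:8001/health"))` returns `True` as soon as the server responds and `False` after ten failed attempts.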

In [None]:
# Global server process variable
server_process = None

def start_cradle_server(port=8001):
    """Start the Cradle UI server automatically."""
    global server_process
    
    # Check if server is already running
    try:
        response = requests.get(f"http://localhost:{port}/health", timeout=2)
        if response.status_code == 200:
            print(f"✅ Server already running on port {port}")
            return True
    except requests.exceptions.RequestException:
        pass
    
    # Start the server
    try:
        print(f"🚀 Starting Cradle UI server on port {port}...")
        
        # Start uvicorn server as subprocess
        cmd = [
            sys.executable, "-m", "uvicorn",
            "cursus.api.cradle_ui.app:app",
            "--host", "0.0.0.0",
            "--port", str(port),
            "--reload"
        ]
        
        # Discard server output; an unread PIPE can fill its buffer and stall the server
        server_process = subprocess.Popen(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            cwd=project_root
        )
        
        # Wait for server to start
        print("⏳ Waiting for server to start...")
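        for _ in range(10):  # Poll up to 10 times, about one second apart
            time.sleep(1)
            # Stop waiting if the server process has already exited
            if server_process.poll() is not None:
                print(f"❌ Server process exited with code {server_process.returncode}")
                return False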
            try:
                response = requests.get(f"http://localhost:{port}/health", timeout=2)
                if response.status_code == 200:
                    print(f"✅ Server started successfully on http://localhost:{port}")
                    return True
            except requests.exceptions.RequestException:
                continue
        
        print("❌ Server failed to start within 10 seconds")
        return False
        
    except Exception as e:
        print(f"❌ Error starting server: {str(e)}")
        return False

def stop_cradle_server():
    """Stop the Cradle UI server."""
    global server_process
    if server_process:
        print("🛑 Stopping Cradle UI server...")
        server_process.terminate()
        server_process.wait()
        server_process = None
        print("✅ Server stopped")

# Register cleanup function
atexit.register(stop_cradle_server)

# Start the server
server_started = start_cradle_server()

if server_started:
    print("\n🎉 Cradle UI server is ready!")
    print("📱 You can also access the UI directly at: http://localhost:8001")
else:
    print("\n⚠️ Server failed to start. Please check the error messages above.")

## Step 3: Create Base Configuration

First, let's create a basic pipeline configuration that the widget will use as a foundation.
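
The cell below hard-codes `current_date`. If you would rather use today's date, it can be derived at run time. This is a small sketch that assumes `current_date` accepts a `YYYY-MM-DD` string, as the hard-coded value suggests.

```python
from datetime import date

# Assumption: `current_date` takes a "YYYY-MM-DD" string, matching the hard-coded value
current_date = date.today().isoformat()
print(current_date)
```

The resulting string could then be passed as `current_date=current_date` when building the base configuration.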

In [None]:
# Import base configuration
from cursus.core.base.config_base import BasePipelineConfig

# Create a sample base configuration
base_config = BasePipelineConfig(
    bucket="sandboxdependency-abuse-secureaisandboxteamshare-1l77v9am252um",
    current_date="2025-10-06",
    region="NA",
    aws_region="us-east-1",
    author="lukexie",
    role="arn:aws:iam::601857636239:role/SandboxRole-lukexie-us-east-1",
    service_name="AtoZ",
    pipeline_version="1.3.1",
    project_root_folder="cursus",
    framework_version="1.7-1",
    py_version="py3",
    source_dir="../../../../dockers/xgboost_atoz",
)

print("✅ Base configuration created:")
print(f"   Author: {base_config.author}")
print(f"   Region: {base_config.region}")
print(f"   Service: {base_config.service_name}")
print(f"   Bucket: {base_config.bucket}")
print(f"   Project Root: {base_config.project_root_folder}")

## Step 4: Initialize Config List

Create the config list that will store all our pipeline configurations.
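
Notebook cells are often re-run, and a plain `append` then adds the same configuration twice. One possible guard is to replace an existing entry instead of appending. In this sketch, `upsert_config` is a hypothetical helper, and matching entries by class name plus `job_type` is an assumption about what counts as "the same" configuration.

```python
from types import SimpleNamespace

def upsert_config(config_list, config):
    """Replace an entry with the same class and job_type, or append if none exists."""
    key = (type(config).__name__, getattr(config, "job_type", None))
    for index, existing in enumerate(config_list):
        existing_key = (type(existing).__name__, getattr(existing, "job_type", None))
        if existing_key == key:
            config_list[index] = config
            return config_list
    config_list.append(config)
    return config_list

# Stand-in objects for illustration; real entries would be cursus config instances
demo_list = []
upsert_config(demo_list, SimpleNamespace(job_type="training", version=1))
upsert_config(demo_list, SimpleNamespace(job_type="training", version=2))
upsert_config(demo_list, SimpleNamespace(job_type="calibration", version=1))
print(len(demo_list))
```

Re-running a load cell with `upsert_config(config_list, config)` in place of `config_list.append(config)` would then keep one entry per job type.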

In [None]:
# Initialize config list (same as in demo_config.ipynb)
config_list = []

# Add base config to the list
config_list.append(base_config)

print(f"✅ Config list initialized with {len(config_list)} item(s)")

## Step 5: Create Training Configuration Widget

Now let's create the interactive widget for training data configuration. This replaces the complex manual configuration block.

**⚠️ Important**: Create and complete **one configuration at a time** to avoid conflicts.

In [None]:
# Import the widget
from cursus.api.cradle_ui.jupyter_widget import create_cradle_config_widget

# Create the training configuration widget
print("🎯 Creating Training Data Configuration Widget")
print("=" * 50)

training_widget = create_cradle_config_widget(
    base_config=base_config,
    job_type="training",
    height="700px"
)

# Display the widget
training_widget.display()

print("\n📝 Instructions:")
print("1. Complete the 4-step configuration in the UI above")
print("2. Click 'Finish' in the UI to generate the configuration")
print("3. Click 'Get Configuration' button to save to JSON file")
print("4. Run the next cell to load the configuration")

## Step 6: Load Training Configuration

After completing the UI configuration above and saving the JSON file, run this cell to load the configuration.
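
If you are unsure where the JSON file was saved, it can help to look for the most recently modified JSON file in a directory and confirm that it parses. This sketch uses only the standard library; `find_latest_json` and `is_valid_json_file` are hypothetical helpers, and it assumes the file was saved somewhere the kernel can read.

```python
import json
from pathlib import Path

def find_latest_json(directory=".", pattern="*.json"):
    """Return the most recently modified file matching `pattern`, or None."""
    candidates = [p for p in Path(directory).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)

def is_valid_json_file(path):
    """Return True if the file exists and contains parseable JSON."""
    try:
        json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False
    return True

latest = find_latest_json(".")
if latest is None:
    print("No JSON files found in the current directory")
else:
    print(f"Most recent JSON file: {latest} (valid JSON: {is_valid_json_file(latest)})")
```

The path it reports is only a candidate for `config_file_path`; check that it is the file you saved from the UI.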

In [None]:
# Load the training configuration from the saved JSON file
from cursus.api.cradle_ui.utils.config_loader import load_cradle_config_from_json

try:
    # Update this path to match where you saved your configuration
    config_file_path = './cradle_data_load_config.json'  # Update this path!
    
    # Load the configuration (properly handles all nested objects)
    training_cradle_data_load_config = load_cradle_config_from_json(config_file_path)
    
    print("✅ Training configuration loaded successfully!")
    print("=" * 50)
    print(f"Job Type: {training_cradle_data_load_config.job_type}")
    print(f"Author: {training_cradle_data_load_config.author}")
    print(f"Region: {training_cradle_data_load_config.region}")
    print(f"Data Sources: {len(training_cradle_data_load_config.data_sources_spec.data_sources)}")
    print(f"Start Date: {training_cradle_data_load_config.data_sources_spec.start_date}")
    print(f"End Date: {training_cradle_data_load_config.data_sources_spec.end_date}")
    print(f"Output Format: {training_cradle_data_load_config.output_spec.output_format}")
    print(f"Cluster Type: {training_cradle_data_load_config.cradle_job_spec.cluster_type}")
    
    # Add to config list (same as manual configuration)
    config_list.append(training_cradle_data_load_config)
    print(f"\n✅ Added to config_list. Total configs: {len(config_list)}")
    
except FileNotFoundError:
    print("⚠️ Configuration file not found.")
    print("Please:")
    print("1. Complete the configuration in the UI above")
    print("2. Click 'Finish' in the UI")
    print("3. Click 'Get Configuration' button to save the JSON file")
    print("4. Update the config_file_path variable above with the correct path")
    
except Exception as e:
    print(f"❌ Error loading configuration: {str(e)}")
    print("Please ensure the JSON file was saved correctly from the UI.")

In [None]:
# Inspect the most recently added configuration (the training config if Step 6 succeeded)
print(config_list[-1])

## Step 7: Clear Training Widget (Important!)

**Clear the training widget before creating the next one** to avoid conflicts and duplicate error messages.
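
Because the same cleanup is needed for every widget, it can be wrapped in a small function. This is a sketch: `clear_widget` is a hypothetical helper, and it relies on the `_stop_config_checker()` method and `.widget.close()` call that the notebook's own cleanup cells use. The stand-in classes exist only to make the example runnable.

```python
def clear_widget(widget):
    """Stop the widget's background checker and close its display, if present."""
    stop = getattr(widget, "_stop_config_checker", None)
    if callable(stop):
        stop()
    inner = getattr(widget, "widget", None)
    if inner is not None and hasattr(inner, "close"):
        inner.close()

# Stand-in classes for illustration only; the real widget comes from
# create_cradle_config_widget
class FakeDisplay:
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

class FakeConfigWidget:
    def __init__(self):
        self.widget = FakeDisplay()
        self.checker_running = True

    def _stop_config_checker(self):
        self.checker_running = False

demo_widget = FakeConfigWidget()
clear_widget(demo_widget)
print(demo_widget.checker_running, demo_widget.widget.closed)
```

The `getattr` checks make the helper safe to call on an object that has already been partly cleaned up.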

In [None]:
# Clear the training widget to avoid conflicts
if 'training_widget' in locals():
    # Stop the background checker thread
    training_widget._stop_config_checker()
    # Clear the widget display
    training_widget.widget.close()
    del training_widget
    print("✅ Training widget cleared successfully")
else:
    print("⚠️ No training widget to clear")

print("\n🎯 Ready to create the next configuration widget!")

## Step 8: Create Calibration Configuration Widget

Now create a **new widget** for calibration configuration. This ensures only one widget is active at a time.

In [None]:
# Create calibration configuration widget (after clearing the previous one)
print("🎯 Creating Calibration Data Configuration Widget")
print("=" * 50)

calibration_widget = create_cradle_config_widget(
    base_config=base_config,
    job_type="calibration",
    height="700px"
)

# Display the widget
calibration_widget.display()

print("\n📝 Instructions:")
print("1. Complete the 4-step configuration in the UI above")
print("2. Click 'Finish' in the UI to generate the configuration")
print("3. Click 'Get Configuration' button to save to JSON file")
print("4. Run the next cell to load the configuration")

## Step 9: Load Calibration Configuration
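
The cell below reads from the same default file name as Step 6, so loading before the calibration file has been saved would silently pick up the training configuration again. A small check on `job_type` can catch that. In this sketch, `check_job_type` is a hypothetical helper, and it assumes `job_type` compares equal to a plain string such as `"calibration"`.

```python
from types import SimpleNamespace

def check_job_type(config, expected):
    """Raise ValueError if `config.job_type` differs from `expected`."""
    actual = getattr(config, "job_type", None)
    if actual != expected:
        raise ValueError(f"Expected job_type {expected!r}, got {actual!r}")
    return config

# Stand-in object for illustration; a real config would come from
# load_cradle_config_from_json
training_like = SimpleNamespace(job_type="training")
check_job_type(training_like, "training")  # passes silently
try:
    check_job_type(training_like, "calibration")
except ValueError as error:
    print(f"Rejected: {error}")
```

Calling `check_job_type(calibration_cradle_data_load_config, "calibration")` before appending to `config_list` would flag a stale file early.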

In [None]:
# Load the calibration configuration from the saved JSON file
try:
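    # Update this path to match where you saved your calibration configuration.
    # If it is the same file as in Step 6, save the calibration config first so it overwrites the training one.
    calibration_config_file_path = './cradle_data_load_config.json'  # Update this path!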
    
    calibration_cradle_data_load_config = load_cradle_config_from_json(calibration_config_file_path)
    
    print("✅ Calibration configuration loaded successfully!")
    print("=" * 50)
    print(f"Job Type: {calibration_cradle_data_load_config.job_type}")
    print(f"Author: {calibration_cradle_data_load_config.author}")
    print(f"Region: {calibration_cradle_data_load_config.region}")
    print(f"Data Sources: {len(calibration_cradle_data_load_config.data_sources_spec.data_sources)}")
    print(f"Start Date: {calibration_cradle_data_load_config.data_sources_spec.start_date}")
    print(f"End Date: {calibration_cradle_data_load_config.data_sources_spec.end_date}")
    print(f"Output Format: {calibration_cradle_data_load_config.output_spec.output_format}")
    print(f"Cluster Type: {calibration_cradle_data_load_config.cradle_job_spec.cluster_type}")
    
    # Add to config list
    config_list.append(calibration_cradle_data_load_config)
    print(f"\n✅ Added to config_list. Total configs: {len(config_list)}")
    
except FileNotFoundError:
    print("⚠️ Calibration configuration file not found.")
    print("Please complete the UI configuration and save the JSON file first.")
except Exception as e:
    print(f"❌ Error loading calibration configuration: {str(e)}")

## Step 10: Clean Up and Display Final Results

Clean up the final widget and show the complete configuration list.

In [None]:
# Clean up the calibration widget
if 'calibration_widget' in locals():
    calibration_widget._stop_config_checker()
    calibration_widget.widget.close()
    del calibration_widget
    print("✅ Calibration widget cleared")

# Display final results
print("\n🎉 Configuration Generation Complete!")
print("=" * 50)
print(f"Total configurations in config_list: {len(config_list)}")
print()

for i, config in enumerate(config_list):
    config_type = type(config).__name__
    print(f"{i+1}. {config_type}")
    
    if hasattr(config, 'job_type'):
        print(f"   Job Type: {config.job_type}")
    if hasattr(config, 'author'):
        print(f"   Author: {config.author}")
    if hasattr(config, 'region'):
        print(f"   Region: {config.region}")
    print()

print("✅ All configurations are ready for use in your pipeline!")

## Step 11: Server Management (Optional)

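The server stops automatically when the notebook kernel shuts down normally (through the `atexit` hook registered in Step 2), but you can also stop it manually if needed. If the kernel is killed rather than shut down, the hook does not run and the server may need to be stopped by hand.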
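
When stopping a server process by hand, a plain `wait()` can block if the child ignores the termination request. A minimal sketch of a stop routine that escalates from `terminate()` to `kill()` follows; `stop_process` is a hypothetical helper, and the sleeping child is a stand-in for the real server process.

```python
import subprocess
import sys

def stop_process(process, timeout=5.0):
    """Terminate `process`, escalating to kill() if it does not exit in time."""
    if process.poll() is not None:
        return process.returncode  # already exited
    process.terminate()
    try:
        process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child ignored the termination request; force it to stop
        process.kill()
        process.wait()
    return process.returncode

# Stand-in child process for illustration (sleeps instead of serving requests)
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_process(child)
print(f"Child exited with code {exit_code}")
```

The timeout bounds how long the notebook waits before forcing the process to exit.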

In [None]:
# Optional: Stop the server manually
# Uncomment the line below if you want to stop the server
# stop_cradle_server()

print("ℹ️ Server management:")
print("- Server will automatically stop when kernel shuts down")
print("- To stop manually, uncomment and run: stop_cradle_server()")
print("- Server URL: http://localhost:8001")

## 🎯 Summary

This **self-contained notebook** demonstrated the complete workflow:

1. ✅ **Automatic server startup** - no manual setup required
2. ✅ **Create one widget at a time** to avoid conflicts
3. ✅ **Complete the configuration** through the UI
4. ✅ **Save to JSON file** using the "Get Configuration" button
5. ✅ **Load the configuration** using `load_cradle_config_from_json()`
6. ✅ **Clear the widget** before creating the next one
7. ✅ **Automatic cleanup** when notebook closes

## 🔑 Key Benefits of This Self-Contained Approach

- **🚀 Zero Server Setup**: The server starts automatically from the notebook
- **✅ Fewer Manual Steps**: No hand-written configuration block; only the UI form and the JSON file path need your attention
- **✅ No Conflicts**: Only one widget is active at a time
- **✅ No Duplicate Errors**: Stopping the config checker and closing the widget before creating the next one avoids duplicate error messages
- **✅ Clean Workflow**: Clear separation between configurations
- **✅ Resource Efficient**: Background threads are stopped during cleanup
- **✅ Same Results**: Identical `CradleDataLoadConfig` objects as the manual approach
- **✅ Error Handling**: Missing files and load failures are reported with next steps

## 🔄 Self-Contained Workflow Pattern

```python
# Complete self-contained pattern:
start_cradle_server()  # Start the server (or reuse one that is already running)
widget = create_cradle_config_widget(base_config=base_config, job_type="training")
widget.display()
# Complete the UI, then save the JSON file
config = load_cradle_config_from_json('file.json')  # Path to the saved file
config_list.append(config)
# Clear the widget before creating the next one
widget._stop_config_checker()
widget.widget.close()
del widget
# The server stops on normal kernel shutdown
```

**This self-contained approach gives you a conflict-free workflow for creating Cradle configurations, with server startup and cleanup handled for you.**