# 🚀 DAPP Hybrid Workflow Template

This notebook demonstrates the **hybrid approach**: Jupyter notebook analysis and the high-performance web dashboard working against the same DAPP backend.

## Key Features:
- 📡 **Real-time API integration** with FastAPI backend
- 🎨 **Interactive widgets** for data exploration
- 🔄 **Synchronized state** with web dashboard
- 📊 **High-performance visualizations**
- 🔐 **Secure authentication** and data encryption

---

## 🔧 Setup & Authentication

First, let's connect to the DAPP backend and authenticate.
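
For the authentication step, a minimal sketch of collecting credentials without echoing the password into the notebook output. The environment variable names `DAPP_EMAIL` and `DAPP_PASSWORD` and the helper `read_credentials` are hypothetical; they are not part of the DAPP client.

```python
import os
from getpass import getpass


def read_credentials(env=None, prompt=True):
    """Return (email, password), preferring environment variables.

    DAPP_EMAIL and DAPP_PASSWORD are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    email = env.get("DAPP_EMAIL")
    password = env.get("DAPP_PASSWORD")

    # Fall back to interactive prompts only for the values that are missing
    if email is None and prompt:
        email = input("Enter your email: ")
    if password is None and prompt:
        password = getpass("Enter your password: ")  # typed text is not echoed
    return email, password
```

The returned pair could then be passed to `client.login(...)` in the authentication cell.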

In [None]:
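# Install required packages if not already installed
import importlib
import subprocess
import sys

def install_if_missing(package):
    """Install `package` with pip if it cannot be imported.

    Assumes the import name and the pip package name are the same.
    """
    try:
        importlib.import_module(package)
    except ImportError:
        subprocess.check_call([sys.executable, '-m', 'pip', 'install', package])

# Make sure the third-party packages imported below are available
for pkg in ('pandas', 'numpy', 'matplotlib', 'seaborn', 'plotly', 'ipywidgets'):
    install_if_missing(pkg)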

# Core imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.offline import iplot, init_notebook_mode

# DAPP Client imports
sys.path.append('../..')  # Make the `backend` package importable
from backend.client import (
    DAPPClient,
    FileUploadWidget,
    DataPreviewWidget,
    ModelTrainingWidget,
    ResultsVisualizationWidget,
)

# Jupyter widgets
import ipywidgets as widgets
from IPython.display import display, HTML, clear_output

# Initialize plotly for notebook
init_notebook_mode(connected=True)

# Configure pandas and plotting
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 1000)
plt.style.use('seaborn-v0_8')
sns.set_palette('husl')

print("✅ All packages imported successfully!")

In [None]:
# Initialize DAPP client
# Make sure your FastAPI server is running on localhost:8000
client = DAPPClient("http://localhost:8000")

# Check connection
try:
    health = client.health_check()
    print(f"🟢 Connected to DAPP backend: {health['service']} v{health['version']}")
except Exception as e:
    print(f"🔴 Connection failed: {e}")
    print("💡 Make sure to start the FastAPI server: python main.py")

In [None]:
# Authenticate with the platform
# Replace with your credentials or use the registration flow
from getpass import getpass

username = input("Enter your email: ")
password = getpass("Enter your password: ")  # getpass keeps the password out of the cell output

try:
    auth_result = client.login(username, password)
    print(f"Welcome, {auth_result['user']['email']}!")
except Exception as e:
    print(f"Authentication failed: {e}")
    print("💡 Register a new account or check your credentials")

## 📁 Interactive File Upload

Use the interactive widget to upload your datasets with drag-and-drop support.
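
Before uploading, it can help to catch obviously unsuitable files locally. The sketch below is only illustrative: the accepted extensions and the size limit are assumptions, not documented DAPP limits, and `upload_problems` is a hypothetical helper.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".csv", ".xlsx", ".json"}  # assumed formats, not confirmed by DAPP
MAX_BYTES = 100 * 1024 * 1024                  # assumed 100 MB limit


def upload_problems(filename, size_bytes, allowed=ALLOWED_SUFFIXES, max_bytes=MAX_BYTES):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in allowed:
        problems.append(f"unsupported extension {suffix!r}")
    if size_bytes == 0:
        problems.append("file is empty")
    elif size_bytes > max_bytes:
        problems.append(f"file is {size_bytes:,} bytes, over the {max_bytes:,}-byte limit")
    return problems
```

For a file on disk, the arguments would be `path.name` and `path.stat().st_size`.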

In [None]:
# Create interactive file upload widget
upload_widget = FileUploadWidget(client, auto_process=True)
upload_widget.display()

# Widget will show upload progress and return dataset IDs

In [None]:
# Get uploaded dataset IDs
dataset_ids = upload_widget.get_dataset_ids()
print(f"📊 Uploaded datasets: {dataset_ids}")

# Or list all your datasets
all_datasets = client.list_datasets()
print(f"\n📋 All your datasets ({len(all_datasets)}):")
for ds in all_datasets:
    print(f"  • {ds['filename']} ({ds['id'][:8]}...) - {ds['status']}")

## 👀 Interactive Data Preview

Explore your datasets with the interactive preview widget.
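
The same kind of summary the preview shows (data types and null counts) can be reproduced with plain pandas once a dataframe is in hand. This is a sketch on a made-up frame; `summarize_columns` is a hypothetical helper, not a DAPP function.

```python
import pandas as pd


def summarize_columns(df):
    """Return one row per column with its dtype, null count, and null percentage."""
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "nulls": df.isna().sum(),
    })
    # max(..., 1) avoids dividing by zero for an empty frame
    summary["null_pct"] = (summary["nulls"] / max(len(df), 1) * 100).round(1)
    return summary


# Tiny made-up frame for illustration
example = pd.DataFrame({
    "price": [10.0, None, 12.5],
    "city": ["Oslo", "Bergen", None],
})
print(summarize_columns(example))
```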

In [None]:
# Create interactive data preview widget
preview_widget = DataPreviewWidget(client)
preview_widget.display()

# Widget allows you to:
# - Select different datasets from dropdown
# - Adjust number of rows to display
# - View data types and null counts

In [None]:
# Get the current dataframe for advanced analysis
df = preview_widget.get_current_dataframe()

if df is not None:
    print(f"📊 Working with dataset: {df.shape[0]:,} rows × {df.shape[1]} columns")
    
    # Advanced EDA with plotly
    numeric_cols = df.select_dtypes(include=[np.number]).columns
    
    if len(numeric_cols) >= 2:
        # Correlation heatmap
        fig = px.imshow(
            df[numeric_cols].corr(),
            title="📈 Correlation Heatmap",
            color_continuous_scale="RdBu",
            aspect="auto"
        )
        fig.show()
        
        # Distribution plots
        fig = go.Figure()
        for col in numeric_cols[:4]:  # Show first 4 numeric columns
            fig.add_trace(go.Histogram(
                x=df[col],
                name=col,
                opacity=0.7,
                nbinsx=30
            ))
        
        fig.update_layout(
            title="📊 Distribution of Numeric Columns",
            xaxis_title="Value",
            yaxis_title="Frequency",
            barmode='overlay'
        )
        fig.show()
else:
    print("⚠️ Please select a dataset using the preview widget above.")

## 🤖 Interactive Model Training

Train linear regression models with an intuitive interface.
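
How the backend trains its models is not shown in this notebook. As a point of reference, here is a minimal sketch of a train/test split followed by an ordinary least squares fit with NumPy, on synthetic data. All function names are hypothetical, and the backend may well use a different library or procedure.

```python
import numpy as np


def train_test_split_indices(n_rows, test_size=0.2, seed=42):
    """Shuffle row indices and return (train_idx, test_idx)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    n_test = max(1, int(round(n_rows * test_size)))
    return order[n_test:], order[:n_test]


def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns (intercept, coefficients)."""
    design = np.column_stack([np.ones(len(X)), X])  # prepend a column of ones
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[0], beta[1:]


def predict(X, intercept, coefficients):
    return intercept + X @ coefficients


# Synthetic, noise-free data: y = 3 + 2*x1 - x2
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3 + 2 * X[:, 0] - X[:, 1]

train_idx, test_idx = train_test_split_indices(len(X), test_size=0.2)
intercept, coefs = fit_linear_regression(X[train_idx], y[train_idx])
y_pred = predict(X[test_idx], intercept, coefs)
```

Fitting on the training rows only and predicting on the held-out rows is what makes the later performance metrics an honest estimate.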

In [None]:
# Create interactive model training widget
training_widget = ModelTrainingWidget(client)
training_widget.display()

# Widget allows you to:
# - Select dataset for training
# - Choose target column
# - Select feature columns
# - Adjust train/test split ratio
# - Train model with one click

## 📊 Interactive Results Visualization

Visualize model performance with interactive plots.
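
The residual plots and performance metrics rest on a few simple formulas. A sketch in plain Python, assuming equal-length sequences of true and predicted values; `regression_metrics` is a hypothetical helper, and the `r2_score` key simply mirrors the metric name used elsewhere in this notebook.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute residuals, MAE, RMSE, and R^2 for paired true/predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    return {
        "residuals": residuals,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
        # R^2 is undefined when every true value is identical
        "r2_score": 1 - ss_res / ss_tot if ss_tot else float("nan"),
    }


metrics = regression_metrics([1, 2, 3, 4], [1, 2, 3, 6])
```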

In [None]:
# Create interactive results visualization widget
results_widget = ResultsVisualizationWidget(client)
results_widget.display()

# Widget provides:
# - Model selection dropdown
# - Multiple plot types (predictions, residuals, importance)
# - Interactive plotly visualizations
# - Performance metrics display

## 🔄 Web Dashboard Synchronization

Datasets you upload and models you train from this notebook go through the DAPP backend, so they also show up in the web dashboard.

In [None]:
# Display web dashboard integration info
display(HTML("""
<div style="border: 2px solid #4CAF50; border-radius: 10px; padding: 20px; background-color: #f0f8ff;">
    <h3>🌐 Web Dashboard Integration</h3>
    <p>Your notebook work is synchronized with the web dashboard:</p>
    <ul>
        <li>✅ <strong>Datasets uploaded here</strong> → Visible in web dashboard</li>
        <li>✅ <strong>Models trained here</strong> → Available in web dashboard</li>
        <li>✅ <strong>Results generated here</strong> → Shared across platforms</li>
        <li>✅ <strong>Real-time updates</strong> → WebSocket synchronization</li>
    </ul>
    <p><strong>🔗 Access your web dashboard at:</strong> 
       <a href="http://localhost:3000" target="_blank">http://localhost:3000</a>
    </p>
    <p><em>💡 Web dashboard provides faster performance for production users and stakeholders!</em></p>
</div>
"""))

# Show current session info
user_datasets = client.list_datasets()
user_models = client.list_models()

print(f"\n📊 Your DAPP Session Summary:")
print(f"   • Datasets: {len(user_datasets)}")
print(f"   • Models: {len(user_models)}")
print(f"   • Status: Connected to {client.base_url}")

## 🚀 Advanced Features & Integration

Explore advanced capabilities of the hybrid platform.
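
Activity monitoring that compares only counts misses the case where one item is added while another is removed. A sketch that compares IDs instead, assuming each listing entry is a dict with an `id` key (as the `ds['id']` usage earlier suggests); `diff_ids` is a hypothetical helper.

```python
def diff_ids(previous, current):
    """Return (added, removed) ID lists between two listings of dicts with an 'id' key."""
    prev_ids = {item["id"] for item in previous}
    curr_ids = {item["id"] for item in current}
    return sorted(curr_ids - prev_ids), sorted(prev_ids - curr_ids)


# One dataset deleted and one added: the count stays at 2, but the contents changed
before = [{"id": "a1"}, {"id": "b2"}]
after = [{"id": "b2"}, {"id": "c3"}]
added, removed = diff_ids(before, after)
```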

In [None]:
# Example: Real-time collaboration
# When other users upload data or train models, you'll see updates here

import threading

stop_monitoring = threading.Event()  # Set this to stop the background loop

def monitor_platform_activity(poll_seconds=10):
    """Poll the platform and report changes in dataset and model counts."""
    print("🔍 Monitoring platform activity...")
    
    last_dataset_count = len(client.list_datasets())
    last_model_count = len(client.list_models())
    
    while not stop_monitoring.is_set():
        try:
            current_datasets = len(client.list_datasets())
            current_models = len(client.list_models())
            
            if current_datasets != last_dataset_count:
                print(f"📊 Dataset activity detected! Total datasets: {current_datasets}")
                last_dataset_count = current_datasets
            
            if current_models != last_model_count:
                print(f"🤖 Model activity detected! Total models: {current_models}")
                last_model_count = current_models
                
        except Exception as e:
            print(f"❌ Monitoring error: {e}")
        
        stop_monitoring.wait(poll_seconds)  # Sleep between checks, but wake at once on stop

# Uncomment to start monitoring in a background thread
# monitor_thread = threading.Thread(target=monitor_platform_activity, daemon=True)
# monitor_thread.start()
# Kernel -> Interrupt does not reach a background thread; call stop_monitoring.set() to stop it

print("💡 Uncomment the lines above to enable real-time monitoring!")

In [None]:
# Export your results for sharing
def export_session_summary():
    """Export a summary of the current session."""
    datasets = client.list_datasets()
    models = client.list_models()
    
    summary = {
        'session_date': pd.Timestamp.now().isoformat(),
        'user': 'notebook_user',  # Would get from auth
        'datasets': [
            {
                'id': ds['id'],
                'filename': ds['filename'],
                'status': ds['status'],
                'created_at': ds['created_at']
            }
            for ds in datasets
        ],
        'models': [
            {
                'id': model['id'],
                'target_column': model['target_column'],
                'r2_score': model.get('metrics', {}).get('r2_score'),
                'status': model['status']
            }
            for model in models
        ]
    }
    
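    # Save to JSON (default=str covers values such as datetimes that json cannot serialize)
    import json
    filename = f'session_summary_{pd.Timestamp.now().strftime("%Y%m%d_%H%M%S")}.json'
    with open(filename, 'w', encoding='utf-8') as f:
        json.dump(summary, f, indent=2, default=str)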
    
    return summary

# Export current session
summary = export_session_summary()
print(f"📄 Session exported! {len(summary['datasets'])} datasets, {len(summary['models'])} models")

## 💡 Best Practices & Tips

### When to Use Jupyter vs Web Dashboard:

**Use Jupyter Notebooks for:**
- 🔬 **Exploratory Data Analysis** - Deep dive into data patterns
- 🧪 **Experimental modeling** - Trying different approaches
- 📊 **Custom visualizations** - Advanced plotting and analysis
- 📝 **Documentation** - Explaining methodology and insights
- 🤝 **Collaboration** - Sharing analysis with data scientists

**Use Web Dashboard for:**
- ⚡ **Quick data uploads** - Faster file processing
- 👥 **Stakeholder presentations** - Clean, professional interface
- 📈 **Production monitoring** - Real-time model performance
- 🚀 **Batch operations** - Processing multiple files
- 📱 **Mobile access** - View results on any device

### Performance Tips:
- Keep large datasets in the backend, visualize samples in notebooks
- Use the web dashboard for compute-intensive operations
- Cache frequently accessed data
- Close notebook client connections when not in use
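
For the caching tip, one possible approach is a small time-based cache wrapped around whatever function fetches data. This is a sketch only: `TTLCache` and `fake_fetch` are hypothetical names, and in practice the fetch function would be a call into the DAPP client.

```python
import time


class TTLCache:
    """Cache results of fetch(key) for ttl_seconds before fetching again."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh, no backend call
        value = self._fetch(key)
        self._store[key] = (now + self._ttl, value)
        return value

    def clear(self):
        self._store.clear()


# Stand-in for a backend call; records how often it is actually invoked
calls = []

def fake_fetch(dataset_id):
    calls.append(dataset_id)
    return {"id": dataset_id, "rows": 3}

cache = TTLCache(fake_fetch, ttl_seconds=60)
cache.get("abc")
cache.get("abc")  # served from the cache
```

A short TTL keeps the notebook responsive while still picking up changes made from the web dashboard after the entry expires.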


In [None]:
# Clean up resources when done
print("🧹 Cleaning up session...")

# Close client connection
if 'client' in locals():
    client.close()
    print("✅ Client connection closed")

# Clear large variables
if 'df' in locals():
    del df
    print("✅ Large dataframes cleared from memory")

print("\n🎉 Session complete! Your data and models are safely stored in the platform.")
print("🌐 Access your results anytime at: http://localhost:3000")