# 📋 Selector Architecture Demo - Phase 4

**Created by: Shohruh127**  
**Repository: Chrono_LLM_RAG**

This notebook demonstrates the Sheet Selector Architecture for domain-specific analysis.

## Overview

The Selector Architecture enables:
- 📊 **Multi-sheet Excel support** - List and preview sheets
- 🏷️ **Domain detection** - Automatic detection from Uzbek/English names
- 🔄 **Context propagation** - Share context across forecasting and LLM modules
- 🎯 **Focused analysis** - Analyze one economic domain at a time

## 1. Setup and Imports

In [None]:
import sys
import pandas as pd
from pathlib import Path

# Add src to path
sys.path.append('..')

from src.selector import SheetManager, ContextPropagator, create_selector

print("✅ Imports successful!")

## 2. Create Test Excel File with Multiple Domains

In [None]:
# Create a sample multi-sheet Excel file for testing
test_file = "Namangan_Macro_2024.xlsx"

with pd.ExcelWriter(test_file, engine='openpyxl') as writer:
    # Agriculture sheet
    df_agr = pd.DataFrame({
        'Region': ['Namangan', 'Uchqo\'rg\'on', 'Pop', 'Chust', 'Yangiqo\'rg\'on'],
        'Crop_Yield_2020': [1250, 890, 650, 780, 920],
        'Crop_Yield_2021': [1300, 920, 680, 810, 950],
        'Crop_Yield_2022': [1280, 910, 660, 795, 935],
        'Crop_Yield_2023': [1350, 950, 700, 830, 980],
        'Livestock_2020': [450, 320, 210, 280, 340],
        'Livestock_2021': [470, 335, 220, 295, 355],
        'Livestock_2022': [465, 330, 215, 290, 350],
        'Livestock_2023': [490, 350, 230, 305, 370]
    })
    df_agr.to_excel(writer, sheet_name='7-Qishloq xo\'jaligi', index=False)
    
    # Industry sheet
    df_ind = pd.DataFrame({
        'Region': ['Namangan', 'Uchqo\'rg\'on', 'Pop', 'Chust'],
        'Production_2020': [2500, 1800, 1400, 1600],
        'Production_2021': [2600, 1850, 1450, 1650],
        'Production_2022': [2700, 1900, 1500, 1700],
        'Production_2023': [2800, 1950, 1550, 1750],
        'Employment_2020': [5200, 3800, 2900, 3400],
        'Employment_2021': [5400, 3950, 3000, 3500],
        'Employment_2022': [5600, 4100, 3100, 3600],
        'Employment_2023': [5800, 4250, 3200, 3700]
    })
    df_ind.to_excel(writer, sheet_name='3-Sanoat', index=False)
    
    # Demography sheet
    df_demo = pd.DataFrame({
        'Region': ['Namangan', 'Uchqo\'rg\'on', 'Pop', 'Chust', 'Yangiqo\'rg\'on', 'Chortoq'],
        'Population_2020': [150000, 80000, 60000, 90000, 70000, 55000],
        'Population_2021': [152000, 81000, 61000, 91000, 71000, 56000],
        'Population_2022': [154000, 82000, 62000, 92000, 72000, 57000],
        'Population_2023': [156000, 83000, 63000, 93000, 73000, 58000],
        'Birth_Rate_2020': [18.5, 19.2, 17.8, 18.9, 19.5, 18.1],
        'Birth_Rate_2021': [18.3, 19.0, 17.6, 18.7, 19.3, 17.9],
        'Birth_Rate_2022': [18.1, 18.8, 17.4, 18.5, 19.1, 17.7],
        'Birth_Rate_2023': [17.9, 18.6, 17.2, 18.3, 18.9, 17.5]
    })
    df_demo.to_excel(writer, sheet_name='1-Demografiya', index=False)
    
    # Trade sheet (English name)
    df_trade = pd.DataFrame({
        'Region': ['Namangan', 'Uchqo\'rg\'on', 'Pop'],
        'Export_2020': [500, 350, 280],
        'Export_2021': [520, 365, 290],
        'Export_2022': [540, 380, 300],
        'Export_2023': [560, 395, 310],
        'Import_2020': [300, 210, 165],
        'Import_2021': [310, 220, 172],
        'Import_2022': [320, 230, 179],
        'Import_2023': [330, 240, 186]
    })
    df_trade.to_excel(writer, sheet_name='5-Trade', index=False)

print(f"✅ Created test file: {test_file}")
print(f"   - 7-Qishloq xo'jaligi (Agriculture): {len(df_agr)} rows")
print(f"   - 3-Sanoat (Industry): {len(df_ind)} rows")
print(f"   - 1-Demografiya (Demography): {len(df_demo)} rows")
print(f"   - 5-Trade: {len(df_trade)} rows")

## 3. Initialize Selector Components

In [None]:
# Create selector components
manager, context = create_selector(test_file)

print(f"✅ Selector initialized: {manager}")

## 4. List Available Sheets

In [None]:
# List all available sheets with metadata
sheets = manager.list_sheets()

print("📋 Available Sheets:\n")
for sheet in sheets:
    print(f"  • {sheet['name']:<30} | Domain: {sheet['domain']:<15} | {sheet['rows']} rows × {sheet['cols']} cols")

# Show available domains
domains = manager.get_available_domains()
print(f"\n🏷️ Available Domains: {', '.join(domains)}")
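
The notebook does not show how `SheetManager` maps sheet names to domains. One plausible approach, sketched here with only the standard library, strips the numeric prefix and looks for a known keyword. The `DOMAIN_KEYWORDS` table, both function names, and the returned labels are illustrative assumptions, not the project's implementation.

```python
import re

# Hypothetical keyword table: lowercase sheet-name fragment -> domain label.
# The mapping actually used by SheetManager is not shown in this notebook.
DOMAIN_KEYWORDS = {
    "qishloq": "Agriculture",
    "agri": "Agriculture",
    "sanoat": "Industry",
    "industry": "Industry",
    "demograf": "Demography",
    "demograph": "Demography",
    "trade": "Trade",
}


def detect_domain_sketch(sheet_name: str) -> str:
    """Return a domain label for a sheet name, or 'Unknown' if nothing matches."""
    # Drop a leading numeric prefix such as "7-" and normalise case.
    cleaned = re.sub(r"^\s*\d+\s*[-.]\s*", "", sheet_name).lower()
    for keyword, domain in DOMAIN_KEYWORDS.items():
        if keyword in cleaned:
            return domain
    return "Unknown"


def available_domains_sketch(sheet_names):
    """Return the sorted, de-duplicated domains found across sheet names."""
    return sorted({detect_domain_sketch(name) for name in sheet_names})


names = ["7-Qishloq xo'jaligi", "3-Sanoat", "1-Demografiya", "5-Trade"]
for name in names:
    print(f"{name:<25} -> {detect_domain_sketch(name)}")
print(available_domains_sketch(names))
```

Substring matching keeps the table short and tolerates suffixes such as "Demografiya", at the cost of possible false matches on longer sheet names.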

## 5. Preview a Sheet

In [None]:
# Preview Agriculture sheet
sheet_name = "7-Qishloq xo'jaligi"
preview = manager.get_sheet_preview(sheet_name, rows=3)

print(f"👁️ Preview of '{sheet_name}' (first 3 rows):\n")
display(preview)

## 6. Select and Load a Domain

In [None]:
# Select Agriculture domain
sheet_name = "7-Qishloq xo'jaligi"
df = manager.select_sheet(sheet_name)
domain = manager.detect_domain(sheet_name)

print(f"✅ Selected: {sheet_name}")
print(f"   Domain: {domain}")
print(f"   Shape: {df.shape}")
print(f"\nColumns: {', '.join(df.columns)}")
print(f"\nFirst 3 rows:")
display(df.head(3))

## 7. Set Context for Downstream Modules

In [None]:
# Set context
context.set_context(sheet_name, df, domain)

print("🔄 Context set successfully!")
print(f"   {context}")

# Get context information
ctx = context.get_context()
print("\n📋 Context Information:")
for key, value in ctx.items():
    if key != 'columns':
        print(f"   {key}: {value}")
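
To make the `set_context` / `get_context` / `clear_context` cycle concrete, here is a minimal sketch of a context holder. `ContextSketch` is a hypothetical stand-in for `ContextPropagator`: it stores rows as a plain list of dicts instead of a DataFrame so that it runs without pandas, and the keys returned by `get_context` are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ContextSketch:
    """Hypothetical stand-in for ContextPropagator; not the project's class."""

    sheet_name: Optional[str] = None
    domain: Optional[str] = None
    rows: List[Dict[str, Any]] = field(default_factory=list)

    def set_context(self, sheet_name, rows, domain):
        # Store the active sheet, its data, and its detected domain together.
        self.sheet_name = sheet_name
        self.rows = list(rows)
        self.domain = domain

    def get_context(self):
        # Summarise the active selection for downstream modules.
        columns = list(self.rows[0].keys()) if self.rows else []
        return {
            "sheet_name": self.sheet_name,
            "domain": self.domain,
            "n_rows": len(self.rows),
            "n_cols": len(columns),
            "columns": columns,
        }

    def clear_context(self):
        # Reset to the empty state before switching domains.
        self.sheet_name, self.domain, self.rows = None, None, []


ctx_holder = ContextSketch()
ctx_holder.set_context(
    "3-Sanoat",
    [
        {"Region": "Namangan", "Production_2020": 2500},
        {"Region": "Pop", "Production_2020": 1400},
    ],
    "Industry",
)
summary = ctx_holder.get_context()
print({k: v for k, v in summary.items() if k != "columns"})
ctx_holder.clear_context()
```

Keeping the sheet name, domain, and data in one object means the forecasting and LLM modules can read the same selection without each re-opening the workbook.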

## 8. Generate Domain-Specific Prompt

In [None]:
# Get domain-specific prompt for LLM
prompt = context.get_domain_prompt()

print("🤖 Domain-Specific Prompt for LLM:\n")
print("="*70)
print(prompt)
print("="*70)
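
The text returned by `get_domain_prompt` depends on the project's own templates, which this notebook does not show. As a rough illustration of the idea, a domain-scoped prompt could be assembled as follows. `DOMAIN_HINTS` and `build_domain_prompt` are hypothetical names, and the wording is invented for the example.

```python
# Hypothetical per-domain guidance; the project's actual prompt wording is not shown here.
DOMAIN_HINTS = {
    "Agriculture": "Focus on crop yields and livestock counts.",
    "Industry": "Focus on production volumes and employment.",
    "Demography": "Focus on population and birth rates.",
}


def build_domain_prompt(domain: str, sheet_name: str, columns: list) -> str:
    """Compose a prompt that scopes the LLM to a single domain and sheet."""
    # Fall back to a generic instruction when the domain has no dedicated hint.
    hint = DOMAIN_HINTS.get(domain, "Describe only what the columns support.")
    return (
        f"You are analysing the '{domain}' domain from sheet '{sheet_name}'.\n"
        f"Available columns: {', '.join(columns)}.\n"
        f"{hint}\n"
        "Answer only from this sheet's data."
    )


demo_prompt = build_domain_prompt(
    "Agriculture",
    "7-Qishloq xo'jaligi",
    ["Region", "Crop_Yield_2020", "Livestock_2020"],
)
print(demo_prompt)
```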

## 9. Switch to Different Domain

In [None]:
# Switch to Industry domain
sheet_name_ind = "3-Sanoat"
df_ind = manager.select_sheet(sheet_name_ind)
domain_ind = manager.detect_domain(sheet_name_ind)

# Clear old context and set new one
context.clear_context()
context.set_context(sheet_name_ind, df_ind, domain_ind)

print(f"🔄 Switched to: {domain_ind}")
print(f"   Sheet: {sheet_name_ind}")
print(f"   Shape: {df_ind.shape}")
print(f"\nNew prompt:")
print(context.get_domain_prompt()[:200] + "...")

## 10. Integration with Forecasting (Conceptual)

The selector can be integrated with the Chronos forecasting module. The cell below only prints the intended calls; it does not run them:

In [None]:
# Conceptual example (requires forecasting module setup)
print("📊 Forecasting Integration Example:\n")
print("# Get current domain data")
print("df = context.get_dataframe()")
print("")
print("# Initialize forecaster")
print("from src.chronos_forecaster import ChronosForecaster")
print("forecaster = ChronosForecaster()")
print("")
print("# Forecast only for selected domain")
print("forecaster.load_data(df)")
print("predictions = forecaster.predict(horizon=4)")
print("")
print("# Results are scoped to current domain (Industry)")
print("print(f'Generated forecasts for {context.get_domain()} domain')")
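
The test sheets store one column per metric and year (for example `Production_2020`), so a forecaster would probably need each metric as an ordered series first. The sketch below shows one way to do that reshaping for a single row using only the standard library. `wide_row_to_series` is a hypothetical helper, and whether `ChronosForecaster.load_data` expects this shape is not confirmed by the notebook.

```python
import re
from collections import defaultdict

# Matches columns such as "Production_2020" or "Birth_Rate_2023".
YEAR_COLUMN = re.compile(r"^(?P<metric>.+)_(?P<year>\d{4})$")


def wide_row_to_series(row: dict) -> dict:
    """Group a row's year-suffixed columns into {metric: [(year, value), ...]}."""
    series = defaultdict(list)
    for column, value in row.items():
        match = YEAR_COLUMN.match(column)
        if match:  # Columns without a year suffix (e.g. "Region") are skipped.
            series[match["metric"]].append((int(match["year"]), value))
    # Sort each metric chronologically so it can be used as a time series.
    return {metric: sorted(points) for metric, points in series.items()}


row = {
    "Region": "Namangan",
    "Production_2021": 2600,
    "Production_2020": 2500,
    "Employment_2020": 5200,
    "Employment_2021": 5400,
}
series = wide_row_to_series(row)
print(series["Production"])
```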

## 11. Summary

The Selector Architecture provides:

✅ **Multi-sheet support** - Handle Excel files with 24+ sheets  
✅ **Domain detection** - Automatic detection from Uzbek/English names  
✅ **Context management** - Centralized context for all modules  
✅ **Focused analysis** - Analyze one domain at a time  
✅ **Easy integration** - Works with existing forecasting and LLM modules

### Key Components:

1. **SheetManager** - Lists, previews, and loads sheets
2. **ContextPropagator** - Shares context across modules
3. **UI Components** - Gradio dropdowns and badges

### Next Steps:

- Integrate with Gradio app (`src/app.py`)
- Connect to forecasting module
- Use domain-specific prompts in LLM chat