# Agentic CFO Copilot: AIML Engine Demonstration

This notebook demonstrates the AIML engine end to end. We will walk through each core module in order, from ingesting a raw CSV file to generating a final, dashboard-ready JSON output.

### Step 0: Setup and Imports

First, let's import all the necessary modules from our `aiml_engine` and other libraries. Make sure you have installed all dependencies from `requirements.txt`.

In [None]:
import pandas as pd
import sys
import os
import json
import plotly.graph_objects as go

# Add the project root (assumed to be the parent of the current working directory)
# to the Python path so that `aiml_engine` can be imported
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))

from aiml_engine.core.data_ingestion import DataIngestion
from aiml_engine.core.data_validation import DataValidationQualityAssuranceEngine
from aiml_engine.core.feature_engineering import KPIAutoExtractionDynamicFeatureEngineering
from aiml_engine.core.forecasting import ForecastingModule
from aiml_engine.core.anomaly_detection import AnomalyDetectionModule
from aiml_engine.core.correlation import CrossMetricCorrelationTrendMiningEngine
from aiml_engine.core.simulation import ScenarioSimulationEngine
from aiml_engine.core.dashboard import BusinessDashboardOutputLayer
from aiml_engine.utils.helpers import serialize_to_json

# Define the path to our sample data
DATA_FILE_PATH = '../data/sample_financial_data.csv'

ModuleNotFoundError: No module named 'aiml_engine'
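
A `ModuleNotFoundError` like the one above usually means the notebook was started from a directory whose parent does not contain the `aiml_engine` package. As a minimal sketch (not part of the engine), we could search upward from the working directory instead of hard-coding `'..'`. The helper names `find_project_root` and `add_project_root_to_path` are hypothetical.

```python
import sys
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, package_name: str = "aiml_engine") -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `package_name`."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / package_name).is_dir():
            return candidate
    return None


def add_project_root_to_path(start: Path, package_name: str = "aiml_engine") -> Optional[Path]:
    """Put the detected project root at the front of sys.path (if one is found)."""
    root = find_project_root(start, package_name)
    if root is not None and str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root


# Typical use at the top of a notebook:
# root = add_project_root_to_path(Path.cwd())
# if root is None:
#     print("Could not locate the aiml_engine package above", Path.cwd())
```

If the helper returns `None`, the package is not on any ancestor path, and the imports will fail regardless of how `sys.path` is adjusted.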

### Step 1: Data Ingestion and Normalization

We start by feeding a raw CSV file to the `DataIngestion` module. This module will automatically detect the column meanings (e.g., mapping `Sales Revenue` to `revenue`) and create a standardized DataFrame.

In [4]:
ingestion_module = DataIngestion()
normalized_df, header_mappings = ingestion_module.ingest_and_normalize(DATA_FILE_PATH)

print("--- Header Mappings ---")
print(json.dumps(header_mappings, indent=2))
print("\n--- Normalized DataFrame Head ---")
display(normalized_df.head())

Error reading CSV file: [Errno 2] No such file or directory: '../data/sample_financial_data.csv'
--- Header Mappings ---
{}

--- Normalized DataFrame Head ---
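
The run above produced no mappings because the CSV file was not found. To illustrate what header mapping could look like, here is a minimal sketch based on a synonym table. It is not the `DataIngestion` implementation, which the notebook does not show; `CANONICAL_SYNONYMS`, `normalize_header`, and `map_headers` are hypothetical names, and the synonyms are illustrative.

```python
import re

# Hypothetical synonym table: canonical name -> accepted (normalized) spellings.
CANONICAL_SYNONYMS = {
    "date": {"date", "period", "month"},
    "revenue": {"revenue", "sales", "sales revenue", "turnover"},
    "expenses": {"expenses", "costs", "total expenses", "operating expenses"},
}


def normalize_header(raw: str) -> str:
    """Lower-case a header and collapse punctuation/whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()


def map_headers(raw_headers):
    """Map raw headers to canonical names; unrecognized headers are left out."""
    mappings = {}
    for raw in raw_headers:
        cleaned = normalize_header(raw)
        for canonical, synonyms in CANONICAL_SYNONYMS.items():
            if cleaned in synonyms:
                mappings[raw] = canonical
                break
    return mappings


example_mappings = map_headers(["Sales Revenue", "Total_Expenses", "Date", "Notes"])
print(example_mappings)
# {'Sales Revenue': 'revenue', 'Total_Expenses': 'expenses', 'Date': 'date'}
```

An exact-match table is easy to audit but misses misspellings. A fuzzy matcher would catch more headers at the cost of occasional wrong mappings.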


### Step 2: Data Validation and Quality Assurance

Next, we pass the normalized data through the validation engine. It will handle missing values, coerce data types, and produce a `validation_report` and a `corrections_log` detailing every change made.

In [None]:
validation_module = DataValidationQualityAssuranceEngine()
validated_df, validation_report, corrections_log = validation_module.run_pipeline(normalized_df, header_mappings)

print("--- Validation Report ---")
print(json.dumps(validation_report, indent=2))
print("\n--- Corrections Log (Sample) ---")
print(json.dumps(corrections_log[:3], indent=2))
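
To make the idea of a corrections log concrete, here is a small sketch in plain Python. It coerces one column to `float`, fills gaps with the column median, and records every change. The median strategy, the `validate_column` name, and the log fields are assumptions for illustration; the engine's actual rules are not shown in this notebook.

```python
import math
from statistics import median


def to_float(value):
    """Parse a value as float; return None for blanks, text, or NaN."""
    try:
        number = float(str(value).replace(",", ""))
    except ValueError:
        return None
    return None if math.isnan(number) else number


def validate_column(rows, column):
    """Coerce `column` to float and impute gaps with the median, logging each change."""
    cleaned = [dict(row) for row in rows]  # copy so the input rows are not mutated
    corrections = []

    # Pass 1: type coercion. Values that cannot be parsed become None.
    for index, row in enumerate(cleaned):
        original = row.get(column)
        row[column] = to_float(original)
        if row[column] is not None and not isinstance(original, float):
            corrections.append({"row": index, "column": column, "action": "coerced_to_float",
                                "original": original, "corrected": row[column]})

    # Pass 2: fill missing values with the median of the observed values.
    observed = [row[column] for row in cleaned if row[column] is not None]
    fill_value = median(observed) if observed else 0.0
    for index, row in enumerate(cleaned):
        if row[column] is None:
            row[column] = fill_value
            corrections.append({"row": index, "column": column, "action": "imputed_median",
                                "original": rows[index].get(column), "corrected": fill_value})

    report = {
        "rows": len(cleaned),
        "coerced": sum(1 for c in corrections if c["action"] == "coerced_to_float"),
        "imputed": sum(1 for c in corrections if c["action"] == "imputed_median"),
    }
    return cleaned, report, corrections


sample_rows = [{"revenue": "1,200"}, {"revenue": None}, {"revenue": 1000.0}, {"revenue": "n/a"}]
cleaned_rows, sample_report, sample_log = validate_column(sample_rows, "revenue")
print(sample_report)  # {'rows': 4, 'coerced': 1, 'imputed': 2}
```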

### Step 3: KPI Auto-Extraction & Feature Engineering

Now, we derive important financial KPIs and features such as `profit_margin` and month-over-month (MoM) growth rates. The `feature_schema` documents the origin and transformation of each new feature.

In [None]:
feature_module = KPIAutoExtractionDynamicFeatureEngineering()
featured_df, feature_schema = feature_module.extract_and_derive_features(validated_df)

print("--- Feature Schema ---")
print(json.dumps(feature_schema, indent=2))
print("\n--- DataFrame with New Features (Head) ---")
display(featured_df[['date', 'revenue', 'expenses', 'profit', 'profit_margin', 'revenue_mom_growth']].head())
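
For reference, the derived columns follow standard definitions: profit is revenue minus expenses, profit margin is profit divided by revenue, and MoM growth is the change in revenue relative to the previous month. The sketch below computes them in plain Python. Whether the engine stores these as fractions or percentages is not shown here, so the fraction convention and the `derive_features` name are assumptions.

```python
def derive_features(rows):
    """Add profit, profit_margin, and revenue_mom_growth to date-ordered rows."""
    featured = []
    previous_revenue = None
    for row in rows:
        revenue, expenses = row["revenue"], row["expenses"]
        profit = revenue - expenses
        featured.append({
            **row,
            "profit": profit,
            # Undefined when revenue is zero, so report None instead of dividing.
            "profit_margin": profit / revenue if revenue else None,
            # The first month has no predecessor, so its growth is None.
            "revenue_mom_growth": (
                (revenue - previous_revenue) / previous_revenue if previous_revenue else None
            ),
        })
        previous_revenue = revenue
    return featured


monthly_rows = [
    {"date": "2024-01-31", "revenue": 1000.0, "expenses": 800.0},
    {"date": "2024-02-29", "revenue": 1100.0, "expenses": 880.0},
]
for feature_row in derive_features(monthly_rows):
    print(feature_row)
```

In the second row, profit is 220, the margin is 220 / 1100 = 0.2, and MoM growth is (1100 - 1000) / 1000 = 0.1.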

### Step 4: Predictive Forecasting

The forecasting module automatically selects the better of two candidate models (AutoARIMA and Prophet) and generates a 3-month forecast for the specified metric, including confidence intervals.

In [None]:
forecasting_module = ForecastingModule(metric='revenue', date_col='date')
forecast, model_health = forecasting_module.generate_forecast(featured_df)

print("--- Model Health Report ---")
print(json.dumps(model_health, indent=2))
print("\n--- Forecast Results ---")
print(json.dumps(forecast, indent=2))
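
One common way to choose between candidate models is to score each on a held-out tail of the series and keep the one with the lowest error. The sketch below shows that selection pattern with two deliberately simple stand-ins (a naive and a drift forecaster) rather than AutoARIMA and Prophet, and with mean absolute error as the score. How `ForecastingModule` actually picks its model is not shown in this notebook, so all names and choices here are assumptions.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def drift_forecast(history, horizon):
    """Extend the average historical step (needs at least two observations)."""
    step = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + step * (h + 1) for h in range(horizon)]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def select_and_forecast(series, horizon=3, holdout=3):
    """Score each candidate on the last `holdout` points, then refit the winner on all data."""
    candidates = {"naive": naive_forecast, "drift": drift_forecast}
    train, test = series[:-holdout], series[-holdout:]
    scores = {
        name: mean_absolute_error(test, model(train, holdout))
        for name, model in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores, candidates[best](series, horizon)


revenue_series = [100, 110, 120, 130, 140, 150, 160, 170]
best_model, holdout_scores, next_three = select_and_forecast(revenue_series)
print(best_model, holdout_scores, next_three)
# drift {'naive': 20.0, 'drift': 0.0} [180.0, 190.0, 200.0]
```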

#### Visualizing the Forecast

Let's plot the historical data along with the forecast to visualize the results.

In [None]:
if forecast:
    forecast_df = pd.DataFrame(forecast)
    forecast_df['date'] = pd.to_datetime(forecast_df['date'])

    fig = go.Figure()
    # Historical Data
    fig.add_trace(go.Scatter(x=featured_df['date'], y=featured_df['revenue'], mode='lines', name='Historical Revenue'))
    # Forecast Data
    fig.add_trace(go.Scatter(x=forecast_df['date'], y=forecast_df['predicted'], mode='lines', name='Forecasted Revenue', line=dict(dash='dash')))
    # Confidence interval: the lower-bound trace fills up to the previous (upper-bound) trace
    fig.add_trace(go.Scatter(x=forecast_df['date'], y=forecast_df['upper'], mode='lines', line_color='rgba(0,0,0,0)', name='Upper CI', showlegend=False))
    fig.add_trace(go.Scatter(x=forecast_df['date'], y=forecast_df['lower'], fill='tonexty', fillcolor='rgba(0,100,80,0.2)', mode='lines', line_color='rgba(0,0,0,0)', name='Confidence Interval'))

    fig.update_layout(title='Revenue Forecast', xaxis_title='Date', yaxis_title='Revenue')
    fig.show()
else:
    print("No forecast generated. Skipping visualization.")

### Step 5: Anomaly Detection

The `AnomalyDetectionModule` scans the data for significant outliers and reports them with severity and a reason.

In [None]:
anomaly_module = AnomalyDetectionModule()
anomalies = anomaly_module.detect_anomalies(featured_df, metric='revenue')

print("--- Detected Anomalies ---")
print(json.dumps(anomalies, indent=2))
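
As an illustration of how outliers can be scored with a severity and a reason, here is a sketch using the modified z-score (median and median absolute deviation), which is less distorted by the outliers themselves than a mean-based z-score. This is a stand-in technique. The detection method used by `AnomalyDetectionModule` is not shown in this notebook, and `flag_outliers`, the 3.5 threshold, and the severity rule are assumptions.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Flag points whose modified z-score exceeds `threshold` in absolute value."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread around the median, so scores are undefined

    anomalies = []
    for index, value in enumerate(values):
        score = 0.6745 * (value - center) / mad
        if abs(score) > threshold:
            direction = "above" if score > 0 else "below"
            anomalies.append({
                "index": index,
                "value": value,
                "score": round(score, 2),
                # Hypothetical rule: more than twice the threshold counts as high severity.
                "severity": "high" if abs(score) > 2 * threshold else "medium",
                "reason": f"{abs(score):.1f} robust deviations {direction} the median",
            })
    return anomalies


monthly_revenue = [100, 102, 98, 101, 99, 250, 100]
flagged = flag_outliers(monthly_revenue)
print(flagged)
```

Here only the value 250 at index 5 is flagged, because the remaining points lie within two units of the median of 100.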

### Step 6: Cross-Metric Correlation & Trend Mining

Here, we compute correlations across the financial metrics to surface relationships that are not obvious from any single series.

In [None]:
correlation_module = CrossMetricCorrelationTrendMiningEngine()
correlation_report = correlation_module.generate_correlation_report(featured_df)

print("--- Correlation Report ---")
print(json.dumps(correlation_report, indent=2))
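
A correlation report of this kind can be built from pairwise Pearson coefficients. The sketch below computes them in plain Python and keeps only pairs above a chosen strength. The `build_correlation_report` name, the 0.7 threshold, and the report fields are assumptions, not the engine's actual output format.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; None if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def build_correlation_report(columns, threshold=0.7):
    """List metric pairs whose absolute correlation is at least `threshold`, strongest first."""
    report = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if r is not None and abs(r) >= threshold:
            report.append({"metrics": [name_a, name_b], "correlation": round(r, 3)})
    return sorted(report, key=lambda item: abs(item["correlation"]), reverse=True)


metric_columns = {
    "revenue": [100, 110, 120, 130],
    "expenses": [80, 88, 96, 104],   # always 80% of revenue, so perfectly correlated
    "headcount": [5, 5, 5, 5],       # constant, so correlation is undefined
}
sample_correlations = build_correlation_report(metric_columns)
print(sample_correlations)
# [{'metrics': ['revenue', 'expenses'], 'correlation': 1.0}]
```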

### Step 7: Scenario Simulation (What-If Engine)

Let's simulate the impact of a 15% increase in expenses on profit and cash flow.

In [None]:
simulation_module = ScenarioSimulationEngine()
simulation_results = simulation_module.simulate_scenario(df=featured_df, parameter='expenses', change_pct=15.0)

print("--- Simulation Results ---")
print(serialize_to_json(simulation_results))
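
At its core, a what-if scenario of this kind rescales one input and recomputes the dependent metrics. The sketch below does this for profit only, assuming profit is revenue minus expenses. The cash flow impact is left out because the notebook does not define how cash flow is derived, and `simulate_change` and its result fields are hypothetical.

```python
def simulate_change(rows, parameter, change_pct):
    """Scale `parameter` by `change_pct` percent and compare total profit before and after."""
    def total_profit(data):
        return sum(r["revenue"] - r["expenses"] for r in data)

    # Multiply before dividing to limit floating-point drift on round percentages.
    scenario_rows = [{**r, parameter: r[parameter] * (100 + change_pct) / 100} for r in rows]

    baseline = total_profit(rows)
    scenario = total_profit(scenario_rows)
    return {
        "parameter": parameter,
        "change_pct": change_pct,
        "baseline_profit": round(baseline, 2),
        "scenario_profit": round(scenario, 2),
        "profit_delta": round(scenario - baseline, 2),
    }


scenario_input = [
    {"revenue": 1000.0, "expenses": 800.0},
    {"revenue": 1200.0, "expenses": 900.0},
]
what_if = simulate_change(scenario_input, parameter="expenses", change_pct=15.0)
print(what_if)
```

With these inputs, expenses rise from 1,700 to 1,955 in total, so profit falls from 500 to 245.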

### Step 8: Final Dashboard Generation

Finally, we bring everything together into a single, cohesive JSON object ready to be consumed by a frontend dashboard. We will generate it in both 'Finance Guardian' and 'Financial Storyteller' modes.

In [None]:
dashboard_module = BusinessDashboardOutputLayer()

# --- Finance Guardian Mode --- #
guardian_output = dashboard_module.generate_dashboard(
    featured_df=featured_df,
    forecast=forecast,
    anomalies=anomalies,
    mode="finance_guardian",
    correlation_report=correlation_report,
    simulation_results=simulation_results
)

print("="*50)
print("          FINANCE GUARDIAN OUTPUT         ")
print("="*50)
print(serialize_to_json(guardian_output))

# --- Financial Storyteller Mode --- #
storyteller_output = dashboard_module.generate_dashboard(
    featured_df=featured_df,
    forecast=forecast,
    anomalies=anomalies,
    mode="financial_storyteller",
)

print("\n" + "="*50)
print("        FINANCIAL STORYTELLER OUTPUT      ")
print("="*50)
print(serialize_to_json(storyteller_output))
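
Dashboard payloads often contain values that `json.dumps` rejects by default, such as dates or `Decimal` amounts, which is one plausible reason for routing output through a helper like `serialize_to_json`. The notebook does not show that helper's implementation, so the following is only a sketch of the general pattern using the `default` hook of `json.dumps`. The `to_json` name and the set of handled types are assumptions.

```python
import json
from datetime import date, datetime
from decimal import Decimal


def _json_default(value):
    """Convert common non-JSON types; anything else still raises TypeError."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, Decimal):
        return float(value)
    if isinstance(value, (set, frozenset)):
        return sorted(value)
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


def to_json(payload, indent=2):
    """Serialize a nested payload to a JSON string."""
    return json.dumps(payload, indent=indent, default=_json_default)


sample_payload = {
    "mode": "finance_guardian",
    "generated_on": date(2024, 1, 31),
    "kpis": {"revenue": Decimal("1200.50")},
}
print(to_json(sample_payload))
```

Raising `TypeError` for unknown types, rather than falling back to `str(value)`, keeps unexpected objects from silently reaching the frontend as opaque strings.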