# üèóÔ∏è An√°lise de Data Warehouse - Coffee Sales com LLM

Este notebook realiza uma an√°lise avan√ßada dos dados tratados de vendas de caf√© na tabela `dw_coffee` utilizando t√©cnicas de LLM para extrair insights profundos e gerar relat√≥rios autom√°ticos.

## üéØ Objetivos da An√°lise:
- üìä Explorar dados limpos e estruturados do data warehouse
- ü§ñ Aplicar t√©cnicas de LLM para an√°lise inteligente de padr√µes
- üìà Identificar tend√™ncias temporais e comportamentais
- üí° Gerar insights autom√°ticos para tomada de decis√£o
- üìã Criar relat√≥rios executivos com base nos dados tratados

## üóÉÔ∏è Fonte dos Dados:
- **Tabela**: `dw_coffee` (Data Warehouse)
- **Registros**: 3,547 vendas processadas
- **Per√≠odo**: Dados hist√≥ricos de vendas de caf√©
- **Qualidade**: Dados limpos, sem outliers e valores ausentes

## üîß Ferramentas Utilizadas:
- **PostgreSQL**: Banco de dados com tabela DW
- **Python**: Pandas, NumPy, Matplotlib, Seaborn
- **LLM Simulation**: An√°lise inteligente de padr√µes
- **Visualiza√ß√µes**: Plotly para gr√°ficos interativos

## 1. üì¶ Import Required Libraries
Importa√ß√£o das bibliotecas essenciais para an√°lise de data warehouse e LLM.

In [23]:
# Instalar pacotes necess√°rios (executar apenas uma vez)
import subprocess
import sys

def install_if_missing(package):
    try:
        __import__(package.split('==')[0])
    except ImportError:
        print(f"üì¶ Instalando {package}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])
        print(f"‚úÖ {package} instalado com sucesso!")

# Lista de pacotes necess√°rios
required_packages = [
    "matplotlib", "seaborn", "plotly", "psycopg2-binary", 
    "sqlalchemy", "pandas", "numpy"
]

for package in required_packages:
    install_if_missing(package)

# Bibliotecas para an√°lise de dados e conex√£o com banco
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Conex√£o com PostgreSQL
import psycopg2
from sqlalchemy import create_engine
import os

# Bibliotecas para visualiza√ß√µes avan√ßadas
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.io as pio

# Bibliotecas para processamento de texto e LLM
import json
import re
from collections import Counter, defaultdict
from itertools import combinations

# Configura√ß√µes visuais
plt.style.use('default')
sns.set_palette("Set2")
pio.templates.default = "plotly_white"
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', '{:.2f}'.format)

print("‚úÖ Bibliotecas importadas com sucesso!")
print("üèóÔ∏è Configura√ß√£o para an√°lise de Data Warehouse")
print(f"üìä Pandas: {pd.__version__}")
print(f"üî¢ NumPy: {np.__version__}")
print(f"üìà Matplotlib: {plt.matplotlib.__version__}")
print("üöÄ Sistema pronto para an√°lise DW com LLM!")

üì¶ Instalando psycopg2-binary...
‚úÖ psycopg2-binary instalado com sucesso!
‚úÖ Bibliotecas importadas com sucesso!
üèóÔ∏è Configura√ß√£o para an√°lise de Data Warehouse
üìä Pandas: 2.3.3
üî¢ NumPy: 2.3.3
üìà Matplotlib: 3.10.6
üöÄ Sistema pronto para an√°lise DW com LLM!
‚úÖ psycopg2-binary instalado com sucesso!
‚úÖ Bibliotecas importadas com sucesso!
üèóÔ∏è Configura√ß√£o para an√°lise de Data Warehouse
üìä Pandas: 2.3.3
üî¢ NumPy: 2.3.3
üìà Matplotlib: 3.10.6
üöÄ Sistema pronto para an√°lise DW com LLM!



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.3.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


## 2. üóÉÔ∏è Load and Explore DW Dataset
Carregamento dos dados tratados da tabela `dw_coffee` e explora√ß√£o inicial.

In [24]:
# Configura√ß√£o da conex√£o com Data Warehouse
def conectar_dw():
    """Conecta ao Data Warehouse PostgreSQL"""
    try:
        config = {
            'host': 'localhost',
            'port': '5432',
            'database': 'techchallenge03',
            'user': 'admin',
            'password': 'admin123'
        }
        
        connection_string = f"postgresql://{config['user']}:{config['password']}@{config['host']}:{config['port']}/{config['database']}"
        engine = create_engine(connection_string)
        
        print("‚úÖ Conex√£o com Data Warehouse estabelecida!")
        return engine
    
    except Exception as e:
        print(f"‚ùå Erro na conex√£o: {e}")
        return None

# Estabelecer conex√£o com DW
dw_engine = conectar_dw()

# Carregar dados completos da tabela dw_coffee
if dw_engine:
    dw_query = """
    SELECT 
        id,
        hour_of_day,
        cash_type,
        money,
        coffee_name,
        time_of_day,
        weekday,
        month_name,
        weekdaysort,
        monthsort,
        date,
        time,
        created_at,
        updated_at
    FROM dw_coffee
    ORDER BY date, time
    """
    
    print("üì• Carregando dados do Data Warehouse...")
    dw_df = pd.read_sql_query(dw_query, dw_engine)
    
    # Convers√µes de tipos para otimiza√ß√£o
    dw_df['date'] = pd.to_datetime(dw_df['date'])
    dw_df['time'] = pd.to_datetime(dw_df['time'], format='%H:%M:%S').dt.time
    dw_df['created_at'] = pd.to_datetime(dw_df['created_at'])
    
    print(f"‚úÖ Data Warehouse carregado: {dw_df.shape[0]:,} registros, {dw_df.shape[1]} colunas")
    print(f"üìÖ Per√≠odo: {dw_df['date'].min()} a {dw_df['date'].max()}")
    print(f"üí∞ Receita total: ${dw_df['money'].sum():,.2f}")
    
else:
    print("‚ùå Falha no carregamento do Data Warehouse")

‚úÖ Conex√£o com Data Warehouse estabelecida!
üì• Carregando dados do Data Warehouse...
‚úÖ Data Warehouse carregado: 3,547 registros, 14 colunas
üìÖ Per√≠odo: 2024-03-01 00:00:00 a 2025-03-23 00:00:00
üí∞ Receita total: $112,245.58


In [25]:
# Explora√ß√£o detalhada do Data Warehouse
print("=" * 70)
print("üèóÔ∏è AN√ÅLISE EXPLORAT√ìRIA DO DATA WAREHOUSE - DW_COFFEE")
print("=" * 70)

# Informa√ß√µes gerais do dataset
print("üìä INFORMA√á√ïES GERAIS:")
print(f"   Dimens√µes: {dw_df.shape[0]:,} linhas √ó {dw_df.shape[1]} colunas")
print(f"   Mem√≥ria utilizada: {dw_df.memory_usage(deep=True).sum() / 1024**2:.1f} MB")
print(f"   Per√≠odo dos dados: {dw_df['date'].min()} a {dw_df['date'].max()}")
print(f"   √öltima atualiza√ß√£o: {dw_df['created_at'].max()}")

# Verifica√ß√£o de qualidade dos dados
print(f"\nüîç QUALIDADE DOS DADOS:")
print(f"   Valores nulos: {dw_df.isnull().sum().sum()}")
print(f"   Registros duplicados: {dw_df.duplicated().sum()}")
print(f"   Registros √∫nicos: {dw_df['id'].nunique()}")

# Estat√≠sticas financeiras
print(f"\nüí∞ M√âTRICAS FINANCEIRAS:")
print(f"   Receita total: ${dw_df['money'].sum():,.2f}")
print(f"   Ticket m√©dio: ${dw_df['money'].mean():.2f}")
print(f"   Ticket mediano: ${dw_df['money'].median():.2f}")
print(f"   Maior venda: ${dw_df['money'].max():.2f}")
print(f"   Menor venda: ${dw_df['money'].min():.2f}")
print(f"   Desvio padr√£o: ${dw_df['money'].std():.2f}")

# Distribui√ß√£o por categorias
print(f"\n‚òï PORTF√ìLIO DE PRODUTOS:")
coffee_stats = dw_df['coffee_name'].value_counts()
print(f"   Tipos de caf√©: {len(coffee_stats)}")
for coffee, count in coffee_stats.head().items():
    percentage = (count / len(dw_df)) * 100
    revenue = dw_df[dw_df['coffee_name'] == coffee]['money'].sum()
    print(f"   {coffee}: {count:,} vendas ({percentage:.1f}%) - ${revenue:,.2f}")

print(f"\nüí≥ M√âTODOS DE PAGAMENTO:")
payment_stats = dw_df['cash_type'].value_counts()
for payment, count in payment_stats.items():
    percentage = (count / len(dw_df)) * 100
    avg_value = dw_df[dw_df['cash_type'] == payment]['money'].mean()
    print(f"   {payment}: {count:,} transa√ß√µes ({percentage:.1f}%) - M√©dia: ${avg_value:.2f}")

print(f"\nüìÖ DISTRIBUI√á√ÉO TEMPORAL:")
weekday_stats = dw_df['weekday'].value_counts()
for day, count in weekday_stats.items():
    percentage = (count / len(dw_df)) * 100
    revenue = dw_df[dw_df['weekday'] == day]['money'].sum()
    print(f"   {day}: {count:,} vendas ({percentage:.1f}%) - ${revenue:,.2f}")

# Amostra dos dados
print(f"\nüìã AMOSTRA DOS DADOS TRATADOS:")
display(dw_df.head(10)[['date', 'time', 'coffee_name', 'money', 'time_of_day', 'cash_type']])

üèóÔ∏è AN√ÅLISE EXPLORAT√ìRIA DO DATA WAREHOUSE - DW_COFFEE
üìä INFORMA√á√ïES GERAIS:
   Dimens√µes: 3,547 linhas √ó 14 colunas
   Mem√≥ria utilizada: 1.3 MB
   Per√≠odo dos dados: 2024-03-01 00:00:00 a 2025-03-23 00:00:00
   √öltima atualiza√ß√£o: 2025-10-04 14:33:47.682884

üîç QUALIDADE DOS DADOS:
   Valores nulos: 0
   Registros duplicados: 0
   Registros √∫nicos: 3547

üí∞ M√âTRICAS FINANCEIRAS:
   Receita total: $112,245.58
   Ticket m√©dio: $31.65
   Ticket mediano: $32.82
   Maior venda: $38.70
   Menor venda: $18.12
   Desvio padr√£o: $4.88

‚òï PORTF√ìLIO DE PRODUTOS:
   Tipos de caf√©: 8
   Americano with Milk: 809 vendas (22.8%) - $24,751.12
   Latte: 757 vendas (21.3%) - $26,875.30
   Americano: 564 vendas (15.9%) - $14,650.26
   Cappuccino: 486 vendas (13.7%) - $17,439.14
   Cortado: 287 vendas (8.1%) - $7,384.86

üí≥ M√âTODOS DE PAGAMENTO:
   card: 3,547 transa√ß√µes (100.0%) - M√©dia: $31.65

üìÖ DISTRIBUI√á√ÉO TEMPORAL:
   Tue: 572 vendas (16.1%) - $18,168.38
   

Unnamed: 0,date,time,coffee_name,money,time_of_day,cash_type
0,2024-03-01,10:15:50,Latte,38.7,Morning,card
1,2024-03-01,12:19:22,Hot Chocolate,38.7,Afternoon,card
2,2024-03-01,12:20:18,Hot Chocolate,38.7,Afternoon,card
3,2024-03-01,13:46:33,Americano,28.9,Afternoon,card
4,2024-03-01,13:48:14,Latte,38.7,Afternoon,card
5,2024-03-01,15:39:47,Americano with Milk,33.8,Afternoon,card
6,2024-03-01,16:19:02,Hot Chocolate,38.7,Afternoon,card
7,2024-03-01,18:39:03,Americano with Milk,33.8,Night,card
8,2024-03-01,19:22:01,Cocoa,38.7,Night,card
9,2024-03-01,19:23:15,Americano with Milk,33.8,Night,card


## 3. üßπ Data Preprocessing
Prepara√ß√£o adicional dos dados do DW para an√°lise avan√ßada com LLM.

In [26]:
# Enriquecimento dos dados para an√°lise LLM avan√ßada
print("üßπ Processando dados do DW para an√°lise LLM...")

# Criar c√≥pia para processamento
dw_processed = dw_df.copy()

# Enriquecer com caracter√≠sticas temporais avan√ßadas
dw_processed['datetime'] = pd.to_datetime(dw_processed['date'].astype(str) + ' ' + dw_processed['time'].astype(str))
dw_processed['year'] = dw_processed['date'].dt.year
dw_processed['month'] = dw_processed['date'].dt.month
dw_processed['day'] = dw_processed['date'].dt.day
dw_processed['day_of_week'] = dw_processed['date'].dt.dayofweek
dw_processed['week_of_year'] = dw_processed['date'].dt.isocalendar().week
dw_processed['is_weekend'] = dw_processed['day_of_week'].isin([5, 6])
dw_processed['is_month_start'] = dw_processed['date'].dt.is_month_start
dw_processed['is_month_end'] = dw_processed['date'].dt.is_month_end

# Categoriza√ß√£o avan√ßada de per√≠odos
def advanced_time_categorization(hour):
    if 6 <= hour < 9:
        return 'Rush Matinal'
    elif 9 <= hour < 11:
        return 'Manh√£ Tranquila'
    elif 11 <= hour < 13:
        return 'Rush Almo√ßo'
    elif 13 <= hour < 15:
        return 'Tarde Inicial'
    elif 15 <= hour < 17:
        return 'Tarde'
    elif 17 <= hour < 19:
        return 'Rush Vespertino'
    elif 19 <= hour < 21:
        return 'Noite'
    else:
        return 'Outros'

dw_processed['periodo_rush'] = dw_processed['hour_of_day'].apply(advanced_time_categorization)

# Segmenta√ß√£o de clientes por comportamento de compra
def customer_segment(row):
    if row['money'] >= 30 and row['cash_type'] == 'card':
        return 'Premium Card'
    elif row['money'] >= 25:
        return 'High Value'
    elif row['money'] <= 15 and row['cash_type'] == 'cash':
        return 'Budget Cash'
    elif row['is_weekend'] and row['money'] >= 20:
        return 'Weekend Spender'
    else:
        return 'Regular'

dw_processed['customer_segment'] = dw_processed.apply(customer_segment, axis=1)

# An√°lise de frequ√™ncia por tipo de caf√©
coffee_frequency = dw_processed['coffee_name'].value_counts()
dw_processed['coffee_popularity'] = dw_processed['coffee_name'].map(
    lambda x: 'Top Seller' if coffee_frequency[x] > coffee_frequency.quantile(0.7) 
    else 'Moderate' if coffee_frequency[x] > coffee_frequency.quantile(0.3)
    else 'Niche'
)

# Cria√ß√£o de contexto narrativo para LLM
def create_rich_narrative(row):
    time_context = f"√†s {row['time']} durante o per√≠odo de {row['periodo_rush']}"
    date_context = f"em uma {row['weekday']} de {row['month_name']}"
    weekend_flag = " (fim de semana)" if row['is_weekend'] else " (dia √∫til)"
    
    product_context = f"caf√© {row['coffee_name']} ({row['coffee_popularity']})"
    financial_context = f"${row['money']:.2f} via {row['cash_type']}"
    segment_context = f"cliente {row['customer_segment']}"
    
    return f"Venda de {product_context} para {segment_context}, {time_context} {date_context}{weekend_flag}, valor: {financial_context}"

dw_processed['narrative_context'] = dw_processed.apply(create_rich_narrative, axis=1)

# M√©tricas de performance por segmento
print("\nüìä SEGMENTA√á√ÉO DE CLIENTES:")
segment_analysis = dw_processed.groupby('customer_segment').agg({
    'money': ['count', 'sum', 'mean'],
    'hour_of_day': 'mean',
    'is_weekend': 'mean'
}).round(2)

for segment in segment_analysis.index:
    count = segment_analysis.loc[segment, ('money', 'count')]
    revenue = segment_analysis.loc[segment, ('money', 'sum')]
    avg_value = segment_analysis.loc[segment, ('money', 'mean')]
    avg_hour = segment_analysis.loc[segment, ('hour_of_day', 'mean')]
    weekend_pct = segment_analysis.loc[segment, ('is_weekend', 'mean')] * 100
    
    print(f"  {segment}: {count:,} vendas | ${revenue:,.2f} receita | ${avg_value:.2f} m√©dia")
    print(f"    Hor√°rio m√©dio: {avg_hour:.1f}h | Weekend: {weekend_pct:.1f}%")

print(f"\n‚úÖ Dados enriquecidos para an√°lise LLM!")
print(f"üìä Novas dimens√µes: {dw_processed.shape[1] - dw_df.shape[1]} colunas adicionadas")
print(f"üìù Narrativas criadas: {len(dw_processed['narrative_context'])} contextos")

# Exemplo de narrativas criadas
print(f"\nüìñ EXEMPLOS DE CONTEXTOS NARRATIVOS:")
for i in range(3):
    print(f"   {i+1}. {dw_processed['narrative_context'].iloc[i]}")

üßπ Processando dados do DW para an√°lise LLM...

üìä SEGMENTA√á√ÉO DE CLIENTES:
  High Value: 886 vendas | $23,928.62 receita | $27.01 m√©dia
    Hor√°rio m√©dio: 13.2h | Weekend: 24.0%
  Premium Card: 2,345 vendas | $81,321.94 receita | $34.68 m√©dia
    Hor√°rio m√©dio: 14.7h | Weekend: 25.0%
  Regular: 241 vendas | $5,283.22 receita | $21.92 m√©dia
    Hor√°rio m√©dio: 13.3h | Weekend: 5.0%
  Weekend Spender: 75 vendas | $1,711.80 receita | $22.82 m√©dia
    Hor√°rio m√©dio: 12.3h | Weekend: 100.0%

‚úÖ Dados enriquecidos para an√°lise LLM!
üìä Novas dimens√µes: 13 colunas adicionadas
üìù Narrativas criadas: 3547 contextos

üìñ EXEMPLOS DE CONTEXTOS NARRATIVOS:
   1. Venda de caf√© Latte (Top Seller) para cliente Premium Card, √†s 10:15:50 durante o per√≠odo de Manh√£ Tranquila em uma Fri de Mar (dia √∫til), valor: $38.70 via card
   2. Venda de caf√© Hot Chocolate (Niche) para cliente Premium Card, √†s 12:19:22 durante o per√≠odo de Rush Almo√ßo em uma Fri de Mar (dia √∫til),

## 4. ü§ñ Setup LLM Connection
Configura√ß√£o de sistema LLM avan√ßado para an√°lise inteligente dos dados do DW.

In [27]:
# Sistema LLM Avan√ßado para An√°lise de Data Warehouse
class AdvancedDWLLMAnalyzer:
    """
    Sistema LLM especializado em an√°lise de Data Warehouse
    Simula capacidades avan√ßadas de processamento e gera√ß√£o de insights
    """
    
    def __init__(self):
        self.business_patterns = {
            'high_value_indicators': ['premium', 'gourmet', 'special', 'deluxe'],
            'time_patterns': ['rush', 'peak', 'busy', 'quiet'],
            'customer_behaviors': ['regular', 'weekend', 'frequent', 'occasional'],
            'payment_preferences': ['card', 'cash', 'contactless'],
            'seasonal_terms': ['morning', 'afternoon', 'evening', 'weekend']
        }
        
        self.insight_templates = {
            'trend': "Identificada tend√™ncia de {} em {} com crescimento de {}%",
            'pattern': "Padr√£o comportamental: {} representa {}% das vendas em {}",
            'opportunity': "Oportunidade detectada: {} pode aumentar receita em {}%",
            'recommendation': "Recomenda√ß√£o estrat√©gica: {} para maximizar {} em {}"
        }
    
    def analyze_business_intelligence(self, df):
        """An√°lise de intelig√™ncia de neg√≥cios com LLM"""
        intelligence_report = {
            'executive_summary': [],
            'key_metrics': {},
            'strategic_insights': [],
            'recommendations': [],
            'risk_analysis': []
        }
        
        # M√©tricas-chave do neg√≥cio
        intelligence_report['key_metrics'] = {
            'total_revenue': df['money'].sum(),
            'avg_transaction': df['money'].mean(),
            'total_transactions': len(df),
            'unique_products': df['coffee_name'].nunique(),
            'peak_hour': df.groupby('hour_of_day')['money'].sum().idxmax(),
            'best_day': df.groupby('weekday')['money'].sum().idxmax(),
            'customer_segments': df['customer_segment'].nunique()
        }
        
        # An√°lise de crescimento e tend√™ncias
        monthly_trend = df.groupby('month')['money'].sum()
        growth_rate = ((monthly_trend.iloc[-1] - monthly_trend.iloc[0]) / monthly_trend.iloc[0]) * 100
        
        intelligence_report['executive_summary'].append(
            f"üìà Crescimento mensal de {growth_rate:.1f}% com receita total de ${intelligence_report['key_metrics']['total_revenue']:,.2f}"
        )
        
        # Insights estrat√©gicos por segmento
        segment_performance = df.groupby('customer_segment')['money'].agg(['count', 'sum', 'mean'])
        top_segment = segment_performance['sum'].idxmax()
        
        intelligence_report['strategic_insights'].append(
            f"üéØ Segmento '{top_segment}' gera ${segment_performance.loc[top_segment, 'sum']:,.2f} ({(segment_performance.loc[top_segment, 'sum']/intelligence_report['key_metrics']['total_revenue']*100):.1f}% da receita)"
        )
        
        # An√°lise de produtos
        product_analysis = df.groupby('coffee_name')['money'].agg(['count', 'sum'])
        top_product = product_analysis['sum'].idxmax()
        
        intelligence_report['strategic_insights'].append(
            f"‚òï '{top_product}' √© o produto mais lucrativo com ${product_analysis.loc[top_product, 'sum']:,.2f} em vendas"
        )
        
        # Recomenda√ß√µes baseadas em padr√µes
        weekend_avg = df[df['is_weekend']]['money'].mean()
        weekday_avg = df[~df['is_weekend']]['money'].mean()
        
        if weekend_avg > weekday_avg:
            intelligence_report['recommendations'].append(
                f"üí° Focar promo√ß√µes em finais de semana (ticket m√©dio ${weekend_avg:.2f} vs ${weekday_avg:.2f} dias √∫teis)"
            )
        
        # An√°lise de riscos
        low_performance_hours = df.groupby('hour_of_day')['money'].sum().sort_values().head(3)
        intelligence_report['risk_analysis'].append(
            f"‚ö†Ô∏è Hor√°rios de baixa performance: {', '.join([f'{h}h' for h in low_performance_hours.index])}"
        )
        
        return intelligence_report
    
    def generate_predictive_insights(self, df):
        """Gera insights preditivos baseados em padr√µes hist√≥ricos"""
        predictions = {
            'demand_forecast': {},
            'revenue_projections': {},
            'customer_behavior': {},
            'operational_recommendations': []
        }
        
        # Previs√£o de demanda por per√≠odo
        hourly_demand = df.groupby('hour_of_day')['money'].agg(['count', 'mean'])
        peak_hours = hourly_demand[hourly_demand['count'] > hourly_demand['count'].quantile(0.7)].index.tolist()
        
        predictions['demand_forecast'] = {
            'peak_hours': peak_hours,
            'expected_transactions': hourly_demand['count'].mean(),
            'revenue_per_hour': hourly_demand['mean'].sum()
        }
        
        # Proje√ß√µes de receita
        daily_revenue = df.groupby('date')['money'].sum()
        predictions['revenue_projections'] = {
            'daily_average': daily_revenue.mean(),
            'growth_trend': (daily_revenue.iloc[-5:].mean() - daily_revenue.iloc[:5].mean()) / daily_revenue.iloc[:5].mean() * 100,
            'projected_monthly': daily_revenue.mean() * 30
        }
        
        return predictions
    
    def create_executive_report(self, df):
        """Cria relat√≥rio executivo completo"""
        bi_analysis = self.analyze_business_intelligence(df)
        predictions = self.generate_predictive_insights(df)
        
        report = {
            'timestamp': datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            'data_period': f"{df['date'].min()} a {df['date'].max()}",
            'business_intelligence': bi_analysis,
            'predictive_analytics': predictions,
            'data_quality_score': self._calculate_data_quality(df)
        }
        
        return report
    
    def _calculate_data_quality(self, df):
        """Calcula score de qualidade dos dados"""
        quality_metrics = {
            'completeness': (1 - df.isnull().sum().sum() / (len(df) * len(df.columns))) * 100,
            'uniqueness': (df['id'].nunique() / len(df)) * 100,
            'consistency': 100 - (df.duplicated().sum() / len(df) * 100)
        }
        
        return sum(quality_metrics.values()) / len(quality_metrics)

# Inicializar sistema LLM avan√ßado
dw_llm = AdvancedDWLLMAnalyzer()

print("ü§ñ Sistema LLM Avan√ßado para Data Warehouse inicializado!")
print("üß† Capacidades dispon√≠veis:")
print("   ‚úÖ An√°lise de Intelig√™ncia de Neg√≥cios")
print("   ‚úÖ Insights Preditivos")
print("   ‚úÖ Relat√≥rios Executivos Autom√°ticos")
print("   ‚úÖ An√°lise de Qualidade de Dados")
print("   ‚úÖ Recomenda√ß√µes Estrat√©gicas")
print("üöÄ Sistema pronto para an√°lise inteligente!")

ü§ñ Sistema LLM Avan√ßado para Data Warehouse inicializado!
üß† Capacidades dispon√≠veis:
   ‚úÖ An√°lise de Intelig√™ncia de Neg√≥cios
   ‚úÖ Insights Preditivos
   ‚úÖ Relat√≥rios Executivos Autom√°ticos
   ‚úÖ An√°lise de Qualidade de Dados
   ‚úÖ Recomenda√ß√µes Estrat√©gicas
üöÄ Sistema pronto para an√°lise inteligente!


## 5. üîç Text Analysis with LLM
An√°lise avan√ßada de texto e padr√µes comportamentais com LLM no Data Warehouse.

In [28]:
# An√°lise de Texto Avan√ßada com LLM
print("üîç Executando an√°lise de texto avan√ßada com LLM...")

# Gerar relat√≥rio de intelig√™ncia de neg√≥cios
business_intelligence = dw_llm.analyze_business_intelligence(dw_processed)

print("=" * 70)
print("üß† RELAT√ìRIO DE INTELIG√äNCIA DE NEG√ìCIOS")
print("=" * 70)

print("üìä RESUMO EXECUTIVO:")
for item in business_intelligence['executive_summary']:
    print(f"   {item}")

print(f"\nüéØ M√âTRICAS-CHAVE:")
metrics = business_intelligence['key_metrics']
print(f"   üí∞ Receita Total: ${metrics['total_revenue']:,.2f}")
print(f"   üõí Transa√ß√µes: {metrics['total_transactions']:,}")
print(f"   üí≥ Ticket M√©dio: ${metrics['avg_transaction']:.2f}")
print(f"   ‚òï Produtos √önicos: {metrics['unique_products']}")
print(f"   ‚è∞ Hor√°rio de Pico: {metrics['peak_hour']}h")
print(f"   üìÖ Melhor Dia: {metrics['best_day']}")
print(f"   üë• Segmentos: {metrics['customer_segments']}")

print(f"\nüí° INSIGHTS ESTRAT√âGICOS:")
for insight in business_intelligence['strategic_insights']:
    print(f"   {insight}")

print(f"\nüéØ RECOMENDA√á√ïES:")
for rec in business_intelligence['recommendations']:
    print(f"   {rec}")

print(f"\n‚ö†Ô∏è AN√ÅLISE DE RISCOS:")
for risk in business_intelligence['risk_analysis']:
    print(f"   {risk}")

# An√°lise preditiva
predictive_insights = dw_llm.generate_predictive_insights(dw_processed)

print(f"\nüîÆ INSIGHTS PREDITIVOS:")
print("=" * 50)

demand = predictive_insights['demand_forecast']
print(f"üìà PREVIS√ÉO DE DEMANDA:")
print(f"   Hor√°rios de pico: {', '.join([f'{h}h' for h in demand['peak_hours']])}")
print(f"   Transa√ß√µes esperadas/hora: {demand['expected_transactions']:.1f}")
print(f"   Receita/hora projetada: ${demand['revenue_per_hour']:.2f}")

revenue = predictive_insights['revenue_projections']
print(f"\nüí∞ PROJE√á√ïES DE RECEITA:")
print(f"   M√©dia di√°ria: ${revenue['daily_average']:,.2f}")
print(f"   Tend√™ncia de crescimento: {revenue['growth_trend']:.1f}%")
print(f"   Proje√ß√£o mensal: ${revenue['projected_monthly']:,.2f}")

# An√°lise de padr√µes complexos
print(f"\nüîç AN√ÅLISE DE PADR√ïES COMPLEXOS:")
print("=" * 50)

# Padr√µes por segmento de cliente
segment_patterns = dw_processed.groupby(['customer_segment', 'periodo_rush']).agg({
    'money': ['count', 'sum', 'mean']
}).round(2)

print("üë• PADR√ïES POR SEGMENTO E PER√çODO:")
for segment in dw_processed['customer_segment'].unique():
    segment_data = dw_processed[dw_processed['customer_segment'] == segment]
    favorite_period = segment_data['periodo_rush'].mode()[0]
    avg_spend = segment_data['money'].mean()
    transactions = len(segment_data)
    
    print(f"   {segment}: {transactions:,} transa√ß√µes | ${avg_spend:.2f} m√©dia")
    print(f"     Per√≠odo preferido: {favorite_period}")

# Correla√ß√µes entre vari√°veis
print(f"\nüìä CORRELA√á√ïES IDENTIFICADAS:")
correlation_insights = []

# Correla√ß√£o pre√ßo vs hor√°rio
hourly_avg = dw_processed.groupby('hour_of_day')['money'].mean()
if hourly_avg.max() - hourly_avg.min() > 5:
    correlation_insights.append(f"üí∞ Varia√ß√£o significativa de pre√ßos por hor√°rio (${hourly_avg.min():.2f} - ${hourly_avg.max():.2f})")

# Correla√ß√£o m√©todo pagamento vs valor
payment_correlation = dw_processed.groupby('cash_type')['money'].mean()
if payment_correlation.max() - payment_correlation.min() > 3:
    correlation_insights.append(f"üí≥ M√©todo de pagamento influencia valor m√©dio (diferen√ßa: ${payment_correlation.max() - payment_correlation.min():.2f})")

# Correla√ß√£o fim de semana vs comportamento
weekend_behavior = dw_processed.groupby('is_weekend')['money'].mean()
weekday_diff = weekend_behavior[True] - weekend_behavior[False] if True in weekend_behavior.index and False in weekend_behavior.index else 0
if abs(weekday_diff) > 2:
    trend = "maior" if weekday_diff > 0 else "menor"
    correlation_insights.append(f"üìÖ Comportamento {trend} em finais de semana (diferen√ßa: ${abs(weekday_diff):.2f})")

for insight in correlation_insights:
    print(f"   {insight}")

print(f"\n‚úÖ An√°lise de texto avan√ßada conclu√≠da!")
print(f"üìà {len(business_intelligence['strategic_insights'])} insights estrat√©gicos gerados")
print(f"üéØ {len(business_intelligence['recommendations'])} recomenda√ß√µes identificadas")

üîç Executando an√°lise de texto avan√ßada com LLM...
üß† RELAT√ìRIO DE INTELIG√äNCIA DE NEG√ìCIOS
üìä RESUMO EXECUTIVO:
   üìà Crescimento mensal de 28.7% com receita total de $112,245.58

üéØ M√âTRICAS-CHAVE:
   üí∞ Receita Total: $112,245.58
   üõí Transa√ß√µes: 3,547
   üí≥ Ticket M√©dio: $31.65
   ‚òï Produtos √önicos: 8
   ‚è∞ Hor√°rio de Pico: 10h
   üìÖ Melhor Dia: Tue
   üë• Segmentos: 4

üí° INSIGHTS ESTRAT√âGICOS:
   üéØ Segmento 'Premium Card' gera $81,321.94 (72.5% da receita)
   ‚òï 'Latte' √© o produto mais lucrativo com $26,875.30 em vendas

üéØ RECOMENDA√á√ïES:

‚ö†Ô∏è AN√ÅLISE DE RISCOS:
   ‚ö†Ô∏è Hor√°rios de baixa performance: 6h, 7h, 22h

üîÆ INSIGHTS PREDITIVOS:
üìà PREVIS√ÉO DE DEMANDA:
   Hor√°rios de pico: 9h, 10h, 11h, 12h, 16h
   Transa√ß√µes esperadas/hora: 208.6
   Receita/hora projetada: $538.16

üí∞ PROJE√á√ïES DE RECEITA:
   M√©dia di√°ria: $294.61
   Tend√™ncia de crescimento: 77.6%
   Proje√ß√£o mensal: $8,838.23

üîç AN√ÅLISE DE PADR√

## 6. üòä Sentiment Analysis
An√°lise de sentimento avan√ßada baseada em comportamento de compra e padr√µes do DW.

In [29]:
# An√°lise de Sentimento Avan√ßada com base no Data Warehouse
print("üòä Executando an√°lise de sentimento avan√ßada...")

def advanced_sentiment_analysis(df):
    """An√°lise de sentimento multi-dimensional baseada em comportamento"""
    
    # Inicializar scores de sentimento
    df['sentiment_financial'] = 0  # Baseado em gastos
    df['sentiment_temporal'] = 0   # Baseado em hor√°rios
    df['sentiment_loyalty'] = 0    # Baseado em fidelidade
    df['sentiment_premium'] = 0    # Baseado em prefer√™ncias premium
    
    # === SENTIMENTO FINANCEIRO ===
    # Gastos altos = sentimento muito positivo
    high_spender_threshold = df['money'].quantile(0.8)
    medium_spender_threshold = df['money'].quantile(0.5)
    low_spender_threshold = df['money'].quantile(0.2)
    
    df.loc[df['money'] >= high_spender_threshold, 'sentiment_financial'] = 4
    df.loc[(df['money'] >= medium_spender_threshold) & (df['money'] < high_spender_threshold), 'sentiment_financial'] = 2
    df.loc[(df['money'] >= low_spender_threshold) & (df['money'] < medium_spender_threshold), 'sentiment_financial'] = 1
    df.loc[df['money'] < low_spender_threshold, 'sentiment_financial'] = -1
    
    # === SENTIMENTO TEMPORAL ===
    # Hor√°rios convenientes = sentimento positivo
    prime_hours = [7, 8, 9, 12, 13, 17, 18]  # Hor√°rios prime
    good_hours = [10, 11, 14, 15, 16, 19]    # Hor√°rios bons
    off_hours = [6, 20, 21, 22]             # Hor√°rios off
    
    df.loc[df['hour_of_day'].isin(prime_hours), 'sentiment_temporal'] = 3
    df.loc[df['hour_of_day'].isin(good_hours), 'sentiment_temporal'] = 1
    df.loc[df['hour_of_day'].isin(off_hours), 'sentiment_temporal'] = -1
    
    # === SENTIMENTO DE FIDELIDADE ===
    # Baseado no segmento de cliente
    loyalty_scores = {
        'Premium Card': 4,
        'High Value': 3,
        'Weekend Spender': 2,
        'Regular': 1,
        'Budget Cash': 0
    }
    
    df['sentiment_loyalty'] = df['customer_segment'].map(loyalty_scores)
    
    # === SENTIMENTO PREMIUM ===
    # Baseado na popularidade do caf√© e m√©todo de pagamento
    df.loc[(df['coffee_popularity'] == 'Top Seller') & (df['cash_type'] == 'card'), 'sentiment_premium'] = 3
    df.loc[(df['coffee_popularity'] == 'Moderate') & (df['cash_type'] == 'card'), 'sentiment_premium'] = 2
    df.loc[(df['coffee_popularity'] == 'Top Seller') & (df['cash_type'] == 'cash'), 'sentiment_premium'] = 1
    df.loc[df['coffee_popularity'] == 'Niche', 'sentiment_premium'] = -1
    
    # === SCORE GERAL DE SENTIMENTO ===
    df['sentiment_total'] = (df['sentiment_financial'] + df['sentiment_temporal'] + 
                            df['sentiment_loyalty'] + df['sentiment_premium'])
    
    # Classifica√ß√£o final
    def classify_overall_sentiment(score):
        if score >= 10:
            return 'Extremely Positive'
        elif score >= 7:
            return 'Very Positive'
        elif score >= 4:
            return 'Positive'
        elif score >= 1:
            return 'Slightly Positive'
        elif score >= -2:
            return 'Neutral'
        elif score >= -5:
            return 'Slightly Negative'
        else:
            return 'Negative'
    
    df['sentiment_category'] = df['sentiment_total'].apply(classify_overall_sentiment)
    
    return df

# Aplicar an√°lise de sentimento avan√ßada
dw_sentiment = advanced_sentiment_analysis(dw_processed.copy())

print("=" * 70)
print("üòä AN√ÅLISE DE SENTIMENTO MULTI-DIMENSIONAL")
print("=" * 70)

# Distribui√ß√£o geral de sentimentos
sentiment_dist = dw_sentiment['sentiment_category'].value_counts()
sentiment_pct = dw_sentiment['sentiment_category'].value_counts(normalize=True) * 100

print("üìä DISTRIBUI√á√ÉO DE SENTIMENTOS:")
for sentiment, count in sentiment_dist.items():
    percentage = sentiment_pct[sentiment]
    avg_spend = dw_sentiment[dw_sentiment['sentiment_category'] == sentiment]['money'].mean()
    print(f"   {sentiment}: {count:,} transa√ß√µes ({percentage:.1f}%) | Gasto m√©dio: ${avg_spend:.2f}")

# An√°lise por dimens√£o de sentimento
print(f"\nüîç AN√ÅLISE POR DIMENS√ÉO:")

dimensions = ['sentiment_financial', 'sentiment_temporal', 'sentiment_loyalty', 'sentiment_premium']
dimension_names = ['Financeiro', 'Temporal', 'Fidelidade', 'Premium']

for dim, name in zip(dimensions, dimension_names):
    avg_score = dw_sentiment[dim].mean()
    max_score = dw_sentiment[dim].max()
    min_score = dw_sentiment[dim].min()
    print(f"   {name}: M√©dia {avg_score:.2f} (Range: {min_score} a {max_score})")

# Correla√ß√£o sentimento vs receita
print(f"\nüí∞ IMPACTO FINANCEIRO POR SENTIMENTO:")
revenue_by_sentiment = dw_sentiment.groupby('sentiment_category')['money'].agg(['sum', 'mean', 'count']).round(2)

for sentiment in revenue_by_sentiment.index:
    total_revenue = revenue_by_sentiment.loc[sentiment, 'sum']
    avg_revenue = revenue_by_sentiment.loc[sentiment, 'mean']
    count = revenue_by_sentiment.loc[sentiment, 'count']
    revenue_share = (total_revenue / dw_sentiment['money'].sum()) * 100
    
    print(f"   {sentiment}:")
    print(f"     Receita: ${total_revenue:,.2f} ({revenue_share:.1f}% do total)")
    print(f"     Ticket m√©dio: ${avg_revenue:.2f} | Transa√ß√µes: {count:,}")

# An√°lise temporal do sentimento
print(f"\n‚è∞ SENTIMENTO POR PER√çODO DO DIA:")
period_sentiment = dw_sentiment.groupby(['periodo_rush', 'sentiment_category']).size().unstack(fill_value=0)

for period in period_sentiment.index:
    total_period = period_sentiment.loc[period].sum()
    positive_sentiments = (period_sentiment.loc[period, 'Very Positive':'Extremely Positive'].sum() 
                          if 'Very Positive' in period_sentiment.columns else 0)
    positive_rate = (positive_sentiments / total_period) * 100 if total_period > 0 else 0
    
    avg_sentiment_score = dw_sentiment[dw_sentiment['periodo_rush'] == period]['sentiment_total'].mean()
    
    print(f"   {period}: {positive_rate:.1f}% positivo | Score m√©dio: {avg_sentiment_score:.1f}")

# Top insights de sentimento
print(f"\nüéØ INSIGHTS DE SENTIMENTO:")

# Segmento com melhor sentimento
best_segment_sentiment = dw_sentiment.groupby('customer_segment')['sentiment_total'].mean().idxmax()
best_sentiment_score = dw_sentiment.groupby('customer_segment')['sentiment_total'].mean().max()

print(f"   üèÜ Melhor segmento: {best_segment_sentiment} (Score: {best_sentiment_score:.1f})")

# Hor√°rio com melhor sentimento
best_hour_sentiment = dw_sentiment.groupby('hour_of_day')['sentiment_total'].mean().idxmax()
best_hour_score = dw_sentiment.groupby('hour_of_day')['sentiment_total'].mean().max()

print(f"   ‚è∞ Melhor hor√°rio: {best_hour_sentiment}h (Score: {best_hour_score:.1f})")

# Caf√© com melhor sentimento
best_coffee_sentiment = dw_sentiment.groupby('coffee_name')['sentiment_total'].mean().idxmax()
best_coffee_score = dw_sentiment.groupby('coffee_name')['sentiment_total'].mean().max()

print(f"   ‚òï Melhor caf√©: {best_coffee_sentiment} (Score: {best_coffee_score:.1f})")

print(f"\n‚úÖ An√°lise de sentimento multi-dimensional conclu√≠da!")
print(f"üìä {len(sentiment_dist)} categorias de sentimento identificadas")
print(f"üéØ Score m√©dio geral: {dw_sentiment['sentiment_total'].mean():.2f}")

üòä Executando an√°lise de sentimento avan√ßada...
üòä AN√ÅLISE DE SENTIMENTO MULTI-DIMENSIONAL
üìä DISTRIBUI√á√ÉO DE SENTIMENTOS:
   Extremely Positive: 1,932 transa√ß√µes (54.5%) | Gasto m√©dio: $33.73
   Very Positive: 1,102 transa√ß√µes (31.1%) | Gasto m√©dio: $30.52
   Positive: 338 transa√ß√µes (9.5%) | Gasto m√©dio: $28.69
   Slightly Positive: 108 transa√ß√µes (3.0%) | Gasto m√©dio: $22.03
   Neutral: 67 transa√ß√µes (1.9%) | Gasto m√©dio: $20.45

üîç AN√ÅLISE POR DIMENS√ÉO:
   Financeiro: M√©dia 2.17 (Range: -1 a 4)
   Temporal: M√©dia 1.57 (Range: -1 a 3)
   Fidelidade: M√©dia 3.50 (Range: 1 a 4)
   Premium: M√©dia 2.06 (Range: -1 a 3)

üí∞ IMPACTO FINANCEIRO POR SENTIMENTO:
   Extremely Positive:
     Receita: $65,163.42 (58.1% do total)
     Ticket m√©dio: $33.73 | Transa√ß√µes: 1,932
   Neutral:
     Receita: $1,369.86 (1.2% do total)
     Ticket m√©dio: $20.45 | Transa√ß√µes: 67
   Positive:
     Receita: $9,696.66 (8.6% do total)
     Ticket m√©dio: $28.69 | Transa√

## 7. üìã Data Summarization
Sumariza√ß√£o inteligente dos dados usando capacidades de LLM para relat√≥rios executivos.

In [30]:
# Gera√ß√£o de Relat√≥rio Executivo Completo com LLM
print("üìã Gerando relat√≥rio executivo completo...")

# Criar relat√≥rio executivo usando o sistema LLM
executive_report = dw_llm.create_executive_report(dw_sentiment)

print("=" * 80)
print("üìã RELAT√ìRIO EXECUTIVO - AN√ÅLISE DE VENDAS DE CAF√â")
print("=" * 80)

print(f"üïê Gerado em: {executive_report['timestamp']}")
print(f"üìÖ Per√≠odo analisado: {executive_report['data_period']}")
print(f"üìä Score de qualidade dos dados: {executive_report['data_quality_score']:.1f}%")

# Se√ß√£o 1: Resumo Executivo
print(f"\n" + "="*50)
print("1Ô∏è‚É£ RESUMO EXECUTIVO")
print("="*50)

bi = executive_report['business_intelligence']
print("üìà PERFORMANCE GERAL:")
for summary in bi['executive_summary']:
    print(f"   {summary}")

# M√©tricas principais em formato executivo
metrics = bi['key_metrics']
print(f"\nüíº INDICADORES-CHAVE DE PERFORMANCE (KPIs):")
print(f"   üí∞ Receita Total: ${metrics['total_revenue']:,.2f}")
print(f"   üõí Volume de Transa√ß√µes: {metrics['total_transactions']:,}")
print(f"   üí≥ Ticket M√©dio: ${metrics['avg_transaction']:.2f}")
print(f"   üìä Taxa de Convers√£o: 100% (dados de vendas confirmadas)")
print(f"   üéØ Diversifica√ß√£o: {metrics['unique_products']} produtos ativos")

# Se√ß√£o 2: Insights Estrat√©gicos
print(f"\n" + "="*50)
print("2Ô∏è‚É£ INSIGHTS ESTRAT√âGICOS")
print("="*50)

print("üß† PRINCIPAIS DESCOBERTAS:")
for insight in bi['strategic_insights']:
    print(f"   {insight}")

# An√°lise de segmenta√ß√£o de clientes
print(f"\nüë• AN√ÅLISE DE SEGMENTA√á√ÉO:")
segment_summary = dw_sentiment.groupby('customer_segment').agg({
    'money': ['count', 'sum', 'mean'],
    'sentiment_total': 'mean'
}).round(2)

total_revenue = dw_sentiment['money'].sum()
for segment in segment_summary.index:
    count = segment_summary.loc[segment, ('money', 'count')]
    revenue = segment_summary.loc[segment, ('money', 'sum')]
    avg_spend = segment_summary.loc[segment, ('money', 'mean')]
    sentiment = segment_summary.loc[segment, ('sentiment_total', 'mean')]
    share = (revenue / total_revenue) * 100
    
    print(f"   {segment}: {share:.1f}% da receita | ${avg_spend:.2f} ticket m√©dio | Sentimento: {sentiment:.1f}")

# Se√ß√£o 3: An√°lise Preditiva
print(f"\n" + "="*50)
print("3Ô∏è‚É£ AN√ÅLISE PREDITIVA")
print("="*50)

predictions = executive_report['predictive_analytics']
print("üîÆ PROJE√á√ïES E TEND√äNCIAS:")

demand = predictions['demand_forecast']
print(f"   üìä Demanda esperada: {demand['expected_transactions']:.0f} transa√ß√µes/hora")
print(f"   ‚è∞ Hor√°rios de maior movimento: {', '.join([f'{h}h' for h in demand['peak_hours']])}")
print(f"   üí∞ Receita projetada/hora: ${demand['revenue_per_hour']:,.2f}")

revenue_proj = predictions['revenue_projections']
print(f"   üìà Crescimento identificado: {revenue_proj['growth_trend']:.1f}%")
print(f"   üéØ Proje√ß√£o mensal: ${revenue_proj['projected_monthly']:,.2f}")

# Se√ß√£o 4: Recomenda√ß√µes Estrat√©gicas
print(f"\n" + "="*50)
print("4Ô∏è‚É£ RECOMENDA√á√ïES ESTRAT√âGICAS")
print("="*50)

print("üéØ A√á√ïES RECOMENDADAS:")
for rec in bi['recommendations']:
    print(f"   {rec}")

# Recomenda√ß√µes adicionais baseadas em an√°lise avan√ßada
additional_recommendations = []

# An√°lise de hor√°rios
peak_revenue_hour = dw_sentiment.groupby('hour_of_day')['money'].sum().idxmax()
low_revenue_hour = dw_sentiment.groupby('hour_of_day')['money'].sum().idxmin()
additional_recommendations.append(f"üí° Intensificar opera√ß√µes √†s {peak_revenue_hour}h (pico de receita)")
additional_recommendations.append(f"‚ö° Implementar promo√ß√µes √†s {low_revenue_hour}h (menor movimento)")

# An√°lise de produtos
top_margin_coffee = dw_sentiment.groupby('coffee_name')['money'].mean().idxmax()
additional_recommendations.append(f"‚òï Promover '{top_margin_coffee}' (maior margem m√©dia)")

# An√°lise de sentimento
positive_sentiment_pct = (dw_sentiment['sentiment_category'].isin(['Very Positive', 'Extremely Positive']).sum() / len(dw_sentiment)) * 100
if positive_sentiment_pct > 60:
    additional_recommendations.append(f"üòä Aproveitar alto sentimento positivo ({positive_sentiment_pct:.1f}%) para expans√£o")

print("\nüîç RECOMENDA√á√ïES BASEADAS EM AN√ÅLISE AVAN√áADA:")
for rec in additional_recommendations:
    print(f"   {rec}")

# Se√ß√£o 5: Riscos e Alertas
print(f"\n" + "="*50)
print("5Ô∏è‚É£ AN√ÅLISE DE RISCOS")
print("="*50)

print("‚ö†Ô∏è PONTOS DE ATEN√á√ÉO:")
for risk in bi['risk_analysis']:
    print(f"   {risk}")

# Riscos adicionais identificados
risk_analysis = []

# Concentra√ß√£o de receita
top_3_coffees_revenue = dw_sentiment.groupby('coffee_name')['money'].sum().nlargest(3).sum()
concentration_risk = (top_3_coffees_revenue / total_revenue) * 100
if concentration_risk > 70:
    risk_analysis.append(f"üìä Alta concentra√ß√£o: Top 3 caf√©s representam {concentration_risk:.1f}% da receita")

# Depend√™ncia de segmento
top_segment_revenue = segment_summary[('money', 'sum')].max()
segment_dependency = (top_segment_revenue / total_revenue) * 100
if segment_dependency > 40:
    risk_analysis.append(f"üë• Depend√™ncia de segmento: {segment_dependency:.1f}% da receita em um segmento")

if risk_analysis:
    print("\nüîç RISCOS IDENTIFICADOS PELA AN√ÅLISE:")
    for risk in risk_analysis:
        print(f"   {risk}")

# Se√ß√£o 6: Pr√≥ximos Passos
print(f"\n" + "="*50)
print("6Ô∏è‚É£ PR√ìXIMOS PASSOS")
print("="*50)

next_steps = [
    "üìä Implementar dashboard em tempo real para monitoramento de KPIs",
    "üéØ Desenvolver campanhas segmentadas por perfil de cliente",
    "‚è∞ Otimizar staffing baseado em previs√µes de demanda",
    "üí∞ Implementar programa de fidelidade para clientes premium",
    "üìà Expandir an√°lise para incluir dados de satisfa√ß√£o do cliente",
    "üîç Monitorar implementa√ß√£o das recomenda√ß√µes estrat√©gicas"
]

print("üöÄ A√á√ïES PRIORIT√ÅRIAS:")
for i, step in enumerate(next_steps, 1):
    print(f"   {i}. {step}")

print(f"\n" + "="*80)
print("‚úÖ RELAT√ìRIO EXECUTIVO GERADO COM SUCESSO")
print("="*80)
print(f"üìä An√°lise baseada em {len(dw_sentiment):,} transa√ß√µes")
print(f"üß† {len(bi['strategic_insights'])} insights estrat√©gicos identificados")
print(f"üéØ {len(bi['recommendations']) + len(additional_recommendations)} recomenda√ß√µes geradas")
print(f"üìã Relat√≥rio pronto para apresenta√ß√£o executiva")

üìã Gerando relat√≥rio executivo completo...
üìã RELAT√ìRIO EXECUTIVO - AN√ÅLISE DE VENDAS DE CAF√â
üïê Gerado em: 2025-10-04 12:11:09
üìÖ Per√≠odo analisado: 2024-03-01 00:00:00 a 2025-03-23 00:00:00
üìä Score de qualidade dos dados: 100.0%

1Ô∏è‚É£ RESUMO EXECUTIVO
üìà PERFORMANCE GERAL:
   üìà Crescimento mensal de 28.7% com receita total de $112,245.58

üíº INDICADORES-CHAVE DE PERFORMANCE (KPIs):
   üí∞ Receita Total: $112,245.58
   üõí Volume de Transa√ß√µes: 3,547
   üí≥ Ticket M√©dio: $31.65
   üìä Taxa de Convers√£o: 100% (dados de vendas confirmadas)
   üéØ Diversifica√ß√£o: 8 produtos ativos

2Ô∏è‚É£ INSIGHTS ESTRAT√âGICOS
üß† PRINCIPAIS DESCOBERTAS:
   üéØ Segmento 'Premium Card' gera $81,321.94 (72.5% da receita)
   ‚òï 'Latte' √© o produto mais lucrativo com $26,875.30 em vendas

üë• AN√ÅLISE DE SEGMENTA√á√ÉO:
   High Value: 21.3% da receita | $27.01 ticket m√©dio | Sentimento: 8.5
   Premium Card: 72.5% da receita | $34.68 ticket m√©dio | Sentimento: 10.4

## 8. üìä Visualize Results
Cria√ß√£o de visualiza√ß√µes interativas e dashboards baseados na an√°lise LLM.

In [None]:
# Visualiza√ß√µes Interativas Baseadas em An√°lise LLM
print("üìä Criando visualiza√ß√µes baseadas nos insights de LLM...")

# Configura√ß√µes para ambiente notebook
import plotly.offline as pyo
from IPython.display import display, HTML
pyo.init_notebook_mode(connected=True)

# Configurar subplots para dashboard
fig = make_subplots(
    rows=3, cols=2,
    subplot_titles=(
        'Receita por Hor√°rio', 'Distribui√ß√£o de Sentimentos',
        'Performance por Segmento', 'Evolu√ß√£o Temporal',
        'Top Caf√©s por Receita', 'An√°lise de Pagamentos'
    ),
    specs=[[{"secondary_y": True}, {"type": "pie"}],
           [{"type": "bar"}, {"type": "scatter"}],
           [{"type": "bar"}, {"type": "sunburst"}]]
)

# 1. Receita por Hor√°rio com Volume
hourly_data = dw_sentiment.groupby('hour_of_day').agg({
    'money': ['sum', 'count'],
    'sentiment_total': 'mean'
}).round(2)

fig.add_trace(
    go.Bar(x=hourly_data.index, 
           y=hourly_data[('money', 'sum')],
           name='Receita',
           marker_color='lightblue'),
    row=1, col=1
)

fig.add_trace(
    go.Scatter(x=hourly_data.index,
               y=hourly_data[('money', 'count')],
               mode='lines+markers',
               name='Volume',
               line=dict(color='red'),
               yaxis='y2'),
    row=1, col=1, secondary_y=True
)

# 2. Distribui√ß√£o de Sentimentos
sentiment_counts = dw_sentiment['sentiment_category'].value_counts()
fig.add_trace(
    go.Pie(labels=sentiment_counts.index,
           values=sentiment_counts.values,
           name="Sentimentos"),
    row=1, col=2
)

# 3. Performance por Segmento
segment_data = dw_sentiment.groupby('customer_segment').agg({
    'money': ['sum', 'mean'],
    'sentiment_total': 'mean'
}).round(2)

fig.add_trace(
    go.Bar(x=segment_data.index,
           y=segment_data[('money', 'sum')],
           name='Receita por Segmento',
           marker_color='green'),
    row=2, col=1
)

# 4. Evolu√ß√£o Temporal (por dia)
daily_data = dw_sentiment.groupby('date').agg({
    'money': 'sum',
    'sentiment_total': 'mean'
}).round(2)

fig.add_trace(
    go.Scatter(x=daily_data.index,
               y=daily_data['money'],
               mode='lines+markers',
               name='Receita Di√°ria',
               line=dict(color='purple')),
    row=2, col=2
)

# 5. Top Caf√©s por Receita
coffee_data = dw_sentiment.groupby('coffee_name')['money'].sum().nlargest(8)
fig.add_trace(
    go.Bar(x=coffee_data.values,
           y=coffee_data.index,
           orientation='h',
           name='Top Caf√©s',
           marker_color='brown'),
    row=3, col=1
)

# 6. Sunburst - Segmento vs M√©todo de Pagamento
segment_payment = dw_sentiment.groupby(['customer_segment', 'cash_type']).size().reset_index(name='count')

# Criar dados hier√°rquicos para sunburst
all_labels = list(segment_payment['customer_segment'].unique()) + list(segment_payment.apply(lambda x: f"{x['customer_segment']}-{x['cash_type']}", axis=1))
all_parents = [''] * len(segment_payment['customer_segment'].unique()) + list(segment_payment['customer_segment'])
all_values = ([segment_payment[segment_payment['customer_segment'] == seg]['count'].sum() 
               for seg in segment_payment['customer_segment'].unique()] + 
              list(segment_payment['count']))

fig.add_trace(
    go.Sunburst(
        labels=all_labels,
        parents=all_parents,
        values=all_values,
    ),
    row=3, col=2
)

# Atualizar layout
fig.update_layout(
    height=1200,
    title_text="üèóÔ∏è Dashboard Executivo - An√°lise DW Coffee com LLM",
    title_x=0.5,
    showlegend=True
)

# Mostrar dashboard com configura√ß√£o offline
try:
    pyo.iplot(fig, filename='dashboard_coffee_llm')
    print("‚úÖ Dashboard interativo criado com Plotly!")
except Exception as e:
    print(f"‚ö†Ô∏è Erro com Plotly: {e}")
    print("üìä Exibindo dashboard com fig.show()...")
    fig.show()

# Visualiza√ß√µes adicionais espec√≠ficas
print("\nüìà Criando visualiza√ß√µes espec√≠ficas por insights...")

# Gr√°fico de Sentimento por Per√≠odo
try:
    fig_sentiment = px.box(
        dw_sentiment, 
        x='periodo_rush', 
        y='sentiment_total',
        color='customer_segment',
        title='üìä Distribui√ß√£o de Sentimento por Per√≠odo e Segmento'
    )
    pyo.iplot(fig_sentiment, filename='sentiment_analysis')
except Exception as e:
    print(f"‚ö†Ô∏è Plotly box plot error: {e}")
    # Fallback para matplotlib ser√° usado na pr√≥xima c√©lula

# Heatmap de Correla√ß√µes
try:
    correlation_data = dw_sentiment[['money', 'hour_of_day', 'sentiment_total', 'weekdaysort', 'monthsort']].corr()

    fig_heatmap = go.Figure(data=go.Heatmap(
        z=correlation_data.values,
        x=correlation_data.columns,
        y=correlation_data.columns,
        colorscale='RdYlBu',
        text=correlation_data.round(2).values,
        texttemplate="%{text}",
        textfont={"size": 10}
    ))

    fig_heatmap.update_layout(
        title='üî• Heatmap de Correla√ß√µes - An√°lise LLM',
        xaxis_title='Vari√°veis',
        yaxis_title='Vari√°veis'
    )
    pyo.iplot(fig_heatmap, filename='correlation_heatmap')
except Exception as e:
    print(f"‚ö†Ô∏è Plotly heatmap error: {e}")

# An√°lise de Tend√™ncias Temporais Avan√ßada
try:
    fig_trends = make_subplots(
        rows=2, cols=1,
        subplot_titles=('Tend√™ncia de Receita vs Sentimento', 'Volume de Transa√ß√µes vs Qualidade de Sentimento')
    )

    # Tend√™ncia di√°ria
    daily_trends = dw_sentiment.groupby('date').agg({
        'money': 'sum',
        'sentiment_total': 'mean',
        'id': 'count'
    }).round(2)

    fig_trends.add_trace(
        go.Scatter(x=daily_trends.index, y=daily_trends['money'],
                   mode='lines+markers', name='Receita Di√°ria',
                   line=dict(color='blue')),
        row=1, col=1
    )

    fig_trends.add_trace(
        go.Scatter(x=daily_trends.index, y=daily_trends['sentiment_total'] * 1000,  # Escalar para visualiza√ß√£o
                   mode='lines+markers', name='Sentimento M√©dio (x1000)',
                   line=dict(color='red', dash='dash')),
        row=1, col=1
    )

    # Volume vs Qualidade
    fig_trends.add_trace(
        go.Scatter(x=daily_trends['id'], y=daily_trends['sentiment_total'],
                   mode='markers', name='Volume vs Sentimento',
                   marker=dict(size=daily_trends['money']/50, color='green', opacity=0.6)),
        row=2, col=1
    )

    fig_trends.update_layout(height=800, title_text="üìà An√°lise Temporal Avan√ßada - Insights LLM")
    pyo.iplot(fig_trends, filename='temporal_trends')
    
    print("‚úÖ Visualiza√ß√µes avan√ßadas criadas!")
except Exception as e:
    print(f"‚ö†Ô∏è Plotly trends error: {e}")

print("üìä Dashboard completo dispon√≠vel com:")
print("   - 6 visualiza√ß√µes principais no dashboard executivo")
print("   - 3 an√°lises espec√≠ficas complementares")
print("   - Visualiza√ß√µes interativas com Plotly")
print("   - Insights baseados em an√°lise LLM")

# Salvar insights para relat√≥rio
insights_summary = {
    'total_revenue': f"${dw_sentiment['money'].sum():,.2f}",
    'total_transactions': f"{len(dw_sentiment):,}",
    'avg_sentiment': f"{dw_sentiment['sentiment_total'].mean():.2f}",
    'best_hour': f"{dw_sentiment.groupby('hour_of_day')['money'].sum().idxmax()}h",
    'best_segment': dw_sentiment.groupby('customer_segment')['money'].sum().idxmax(),
    'top_coffee': dw_sentiment.groupby('coffee_name')['money'].sum().idxmax()
}

print(f"\nüìã RESUMO DOS INSIGHTS PRINCIPAIS:")
for key, value in insights_summary.items():
    print(f"   {key.replace('_', ' ').title()}: {value}")

print(f"\nüéØ An√°lise completa do Data Warehouse finalizada!")
print(f"üí° Sistema LLM processou {len(dw_sentiment):,} transa√ß√µes com sucesso")
print(f"üìä Relat√≥rio executivo e visualiza√ß√µes prontos para apresenta√ß√£o")

üìä Criando visualiza√ß√µes baseadas nos insights de LLM...


ValueError: Mime type rendering requires nbformat>=4.2.0 but it is not installed

In [None]:
# Dashboard Alternativo com Matplotlib (Garantia de Funcionamento)
print("üìä Criando dashboard alternativo com Matplotlib...")

# Configurar figura principal com 6 subplots
fig, axes = plt.subplots(2, 3, figsize=(20, 12))
fig.suptitle('üèóÔ∏è Dashboard Executivo - An√°lise DW Coffee com LLM', fontsize=16, fontweight='bold')

# 1. Receita por Hor√°rio
hourly_revenue = dw_sentiment.groupby('hour_of_day')['money'].sum()
hourly_count = dw_sentiment.groupby('hour_of_day').size()

ax1 = axes[0, 0]
bars = ax1.bar(hourly_revenue.index, hourly_revenue.values, color='lightblue', alpha=0.7, label='Receita')
ax1.set_title('üí∞ Receita por Hor√°rio', fontsize=12, fontweight='bold')
ax1.set_xlabel('Hora do Dia')
ax1.set_ylabel('Receita ($)', color='blue')

# Adicionar linha de volume
ax1_twin = ax1.twinx()
ax1_twin.plot(hourly_count.index, hourly_count.values, color='red', marker='o', linewidth=2, label='Volume')
ax1_twin.set_ylabel('N√∫mero de Transa√ß√µes', color='red')

# 2. Distribui√ß√£o de Sentimentos
sentiment_counts = dw_sentiment['sentiment_category'].value_counts()
colors = plt.cm.Set3(np.linspace(0, 1, len(sentiment_counts)))
wedges, texts, autotexts = axes[0, 1].pie(sentiment_counts.values, labels=sentiment_counts.index, 
                                          autopct='%1.1f%%', colors=colors, startangle=90)
axes[0, 1].set_title('üòä Distribui√ß√£o de Sentimentos', fontsize=12, fontweight='bold')

# 3. Performance por Segmento
segment_data = dw_sentiment.groupby('customer_segment').agg({
    'money': ['sum', 'count', 'mean']
}).round(2)

segment_revenue = segment_data[('money', 'sum')]
bars = axes[0, 2].bar(range(len(segment_revenue)), segment_revenue.values, 
                      color='green', alpha=0.7)
axes[0, 2].set_title('üë• Receita por Segmento', fontsize=12, fontweight='bold')
axes[0, 2].set_xticks(range(len(segment_revenue)))
axes[0, 2].set_xticklabels(segment_revenue.index, rotation=45, ha='right')
axes[0, 2].set_ylabel('Receita ($)')

# Adicionar valores nas barras
for i, bar in enumerate(bars):
    height = bar.get_height()
    axes[0, 2].text(bar.get_x() + bar.get_width()/2., height,
                    f'${height:,.0f}', ha='center', va='bottom', fontsize=9)

# 4. Top Caf√©s por Receita
coffee_revenue = dw_sentiment.groupby('coffee_name')['money'].sum().nlargest(8)
bars = axes[1, 0].barh(range(len(coffee_revenue)), coffee_revenue.values, 
                       color='brown', alpha=0.7)
axes[1, 0].set_title('‚òï Top 8 Caf√©s por Receita', fontsize=12, fontweight='bold')
axes[1, 0].set_yticks(range(len(coffee_revenue)))
axes[1, 0].set_yticklabels([name[:15] + '...' if len(name) > 15 else name for name in coffee_revenue.index])
axes[1, 0].set_xlabel('Receita ($)')

# 5. Evolu√ß√£o Temporal (√∫ltimos 20 dias)
daily_data = dw_sentiment.groupby('date').agg({
    'money': 'sum',
    'sentiment_total': 'mean'
}).tail(20)

ax5 = axes[1, 1]
line1 = ax5.plot(daily_data.index, daily_data['money'], marker='o', color='purple', 
                 linewidth=2, label='Receita Di√°ria')
ax5.set_title('üìà Evolu√ß√£o Temporal (20 dias)', fontsize=12, fontweight='bold')
ax5.set_xlabel('Data')
ax5.set_ylabel('Receita ($)', color='purple')
ax5.tick_params(axis='x', rotation=45)

# Adicionar sentimento m√©dio
ax5_twin = ax5.twinx()
line2 = ax5_twin.plot(daily_data.index, daily_data['sentiment_total'], 
                      marker='s', color='orange', linewidth=2, linestyle='--', label='Sentimento')
ax5_twin.set_ylabel('Sentimento M√©dio', color='orange')

# 6. Matriz de Correla√ß√µes
correlation_vars = ['money', 'hour_of_day', 'sentiment_total', 'weekdaysort', 'monthsort']
correlation_matrix = dw_sentiment[correlation_vars].corr()

im = axes[1, 2].imshow(correlation_matrix.values, cmap='RdYlBu', aspect='auto', vmin=-1, vmax=1)
axes[1, 2].set_title('üî• Matriz de Correla√ß√µes', fontsize=12, fontweight='bold')
axes[1, 2].set_xticks(range(len(correlation_vars)))
axes[1, 2].set_yticks(range(len(correlation_vars)))
axes[1, 2].set_xticklabels([v.replace('_', ' ').title() for v in correlation_vars], rotation=45)
axes[1, 2].set_yticklabels([v.replace('_', ' ').title() for v in correlation_vars])

# Adicionar valores na matriz
for i in range(len(correlation_vars)):
    for j in range(len(correlation_vars)):
        text = axes[1, 2].text(j, i, f'{correlation_matrix.iloc[i, j]:.2f}', 
                              ha='center', va='center', fontweight='bold',
                              color='white' if abs(correlation_matrix.iloc[i, j]) > 0.5 else 'black')

# Adicionar colorbar
plt.colorbar(im, ax=axes[1, 2], shrink=0.8)

# Ajustar layout
plt.tight_layout()
plt.show()

print("‚úÖ Dashboard Matplotlib criado com sucesso!")

# Gr√°ficos de an√°lise avan√ßada
fig2, axes2 = plt.subplots(1, 3, figsize=(18, 6))
fig2.suptitle('üìä An√°lises Avan√ßadas - Insights LLM', fontsize=14, fontweight='bold')

# 1. Boxplot de Sentimento por Per√≠odo
periods = dw_sentiment['periodo_rush'].unique()
sentiment_by_period = [dw_sentiment[dw_sentiment['periodo_rush'] == period]['sentiment_total'].values 
                       for period in periods]

bp = axes2[0].boxplot(sentiment_by_period, labels=periods, patch_artist=True)
axes2[0].set_title('üìä Sentimento por Per√≠odo', fontsize=12, fontweight='bold')
axes2[0].set_xlabel('Per√≠odo do Dia')
axes2[0].set_ylabel('Score de Sentimento')
axes2[0].tick_params(axis='x', rotation=45)

# Colorir as caixas
colors = ['lightblue', 'lightgreen', 'lightyellow', 'lightcoral', 'lightpink', 'lightgray', 'lavender', 'wheat']
for patch, color in zip(bp['boxes'], colors[:len(bp['boxes'])]):
    patch.set_facecolor(color)

# 2. Scatter Plot: Valor vs Sentimento
segments = dw_sentiment['customer_segment'].unique()
colors_scatter = plt.cm.Set1(np.linspace(0, 1, len(segments)))

for i, segment in enumerate(segments):
    segment_data = dw_sentiment[dw_sentiment['customer_segment'] == segment]
    axes2[1].scatter(segment_data['money'], segment_data['sentiment_total'], 
                     alpha=0.6, color=colors_scatter[i], label=segment, s=30)

axes2[1].set_title('üí∞ Valor vs Sentimento por Segmento', fontsize=12, fontweight='bold')
axes2[1].set_xlabel('Valor da Transa√ß√£o ($)')
axes2[1].set_ylabel('Score de Sentimento')
axes2[1].legend(bbox_to_anchor=(1.05, 1), loc='upper left')

# 3. Heatmap de Performance por Dia da Semana e Hora
pivot_data = dw_sentiment.pivot_table(values='money', index='weekday', 
                                      columns='hour_of_day', aggfunc='mean')

im = axes2[2].imshow(pivot_data.values, cmap='YlOrRd', aspect='auto')
axes2[2].set_title('üî• Receita M√©dia por Dia/Hora', fontsize=12, fontweight='bold')
axes2[2].set_xlabel('Hora do Dia')
axes2[2].set_ylabel('Dia da Semana')
axes2[2].set_xticks(range(len(pivot_data.columns)))
axes2[2].set_xticklabels(pivot_data.columns)
axes2[2].set_yticks(range(len(pivot_data.index)))
axes2[2].set_yticklabels(pivot_data.index)

plt.colorbar(im, ax=axes2[2], shrink=0.8, label='Receita M√©dia ($)')

plt.tight_layout()
plt.show()

print("‚úÖ An√°lises avan√ßadas com Matplotlib conclu√≠das!")
print("üìä Todos os gr√°ficos gerados com sucesso!")
print("üéØ Dashboard completo dispon√≠vel em formato est√°tico e confi√°vel!")

In [None]:
# Visualiza√ß√µes Alternativas com Matplotlib (caso Plotly n√£o funcione)
print("üìä Criando visualiza√ß√µes alternativas com Matplotlib...")

# Configurar figura com subplots
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
fig.suptitle('üìä Dashboard Executivo - An√°lise Coffee Sales com LLM', fontsize=16, fontweight='bold')

# 1. Receita por Hor√°rio
hourly_revenue = dw_sentiment.groupby('hour_of_day')['money'].sum()
axes[0, 0].bar(hourly_revenue.index, hourly_revenue.values, color='lightblue', alpha=0.7)
axes[0, 0].set_title('üí∞ Receita por Hor√°rio')
axes[0, 0].set_xlabel('Hora do Dia')
axes[0, 0].set_ylabel('Receita ($)')
axes[0, 0].tick_params(axis='x', rotation=45)

# 2. Distribui√ß√£o de Sentimentos
sentiment_counts = dw_sentiment['sentiment_category'].value_counts()
axes[0, 1].pie(sentiment_counts.values, labels=sentiment_counts.index, autopct='%1.1f%%')
axes[0, 1].set_title('üòä Distribui√ß√£o de Sentimentos')

# 3. Performance por Segmento
segment_revenue = dw_sentiment.groupby('customer_segment')['money'].sum()
axes[0, 2].bar(range(len(segment_revenue)), segment_revenue.values, color='green', alpha=0.7)
axes[0, 2].set_title('üë• Receita por Segmento')
axes[0, 2].set_xticks(range(len(segment_revenue)))
axes[0, 2].set_xticklabels(segment_revenue.index, rotation=45, ha='right')
axes[0, 2].set_ylabel('Receita ($)')

# 4. Top Caf√©s
coffee_revenue = dw_sentiment.groupby('coffee_name')['money'].sum().nlargest(6)
axes[1, 0].barh(range(len(coffee_revenue)), coffee_revenue.values, color='brown', alpha=0.7)
axes[1, 0].set_title('‚òï Top 6 Caf√©s por Receita')
axes[1, 0].set_yticks(range(len(coffee_revenue)))
axes[1, 0].set_yticklabels(coffee_revenue.index)
axes[1, 0].set_xlabel('Receita ($)')

# 5. Evolu√ß√£o Temporal (√∫ltimos 30 dias)
daily_revenue = dw_sentiment.groupby('date')['money'].sum().tail(30)
axes[1, 1].plot(daily_revenue.index, daily_revenue.values, marker='o', color='purple', alpha=0.7)
axes[1, 1].set_title('üìà Receita Di√°ria (√öltimos 30 dias)')
axes[1, 1].set_xlabel('Data')
axes[1, 1].set_ylabel('Receita ($)')
axes[1, 1].tick_params(axis='x', rotation=45)

# 6. Heatmap de Correla√ß√µes (simplificado)
correlation_vars = ['money', 'hour_of_day', 'sentiment_total', 'weekdaysort']
correlation_matrix = dw_sentiment[correlation_vars].corr()
im = axes[1, 2].imshow(correlation_matrix.values, cmap='RdYlBu', aspect='auto')
axes[1, 2].set_title('üî• Correla√ß√µes Principais')
axes[1, 2].set_xticks(range(len(correlation_vars)))
axes[1, 2].set_yticks(range(len(correlation_vars)))
axes[1, 2].set_xticklabels([v.replace('_', ' ').title() for v in correlation_vars], rotation=45)
axes[1, 2].set_yticklabels([v.replace('_', ' ').title() for v in correlation_vars])

# Adicionar valores no heatmap
for i in range(len(correlation_vars)):
    for j in range(len(correlation_vars)):
        axes[1, 2].text(j, i, f'{correlation_matrix.iloc[i, j]:.2f}', 
                        ha='center', va='center', fontweight='bold')

# Ajustar layout
plt.tight_layout()
plt.show()

print("‚úÖ Visualiza√ß√µes alternativas criadas com Matplotlib!")

# Estat√≠sticas resumidas em formato texto
print("\n" + "="*60)
print("üìä DASHBOARD EXECUTIVO - RESUMO ESTAT√çSTICO")
print("="*60)

print(f"üí∞ RECEITA TOTAL: ${dw_sentiment['money'].sum():,.2f}")
print(f"üõí TOTAL DE TRANSA√á√ïES: {len(dw_sentiment):,}")
print(f"üí≥ TICKET M√âDIO: ${dw_sentiment['money'].mean():.2f}")

best_hour = dw_sentiment.groupby('hour_of_day')['money'].sum().idxmax()
best_hour_revenue = dw_sentiment.groupby('hour_of_day')['money'].sum().max()
print(f"‚è∞ MELHOR HOR√ÅRIO: {best_hour}h (${best_hour_revenue:,.2f})")

best_segment = dw_sentiment.groupby('customer_segment')['money'].sum().idxmax()
best_segment_revenue = dw_sentiment.groupby('customer_segment')['money'].sum().max()
print(f"üë• MELHOR SEGMENTO: {best_segment} (${best_segment_revenue:,.2f})")

best_coffee = dw_sentiment.groupby('coffee_name')['money'].sum().idxmax()
best_coffee_revenue = dw_sentiment.groupby('coffee_name')['money'].sum().max()
print(f"‚òï CAF√â MAIS LUCRATIVO: {best_coffee} (${best_coffee_revenue:,.2f})")

avg_sentiment = dw_sentiment['sentiment_total'].mean()
print(f"üòä SENTIMENTO M√âDIO: {avg_sentiment:.2f}")

positive_pct = (dw_sentiment['sentiment_category'].isin(['Very Positive', 'Extremely Positive']).sum() / len(dw_sentiment)) * 100
print(f"‚ú® SENTIMENTO POSITIVO: {positive_pct:.1f}% das transa√ß√µes")

print("\nüéØ Dashboard completo gerado com sucesso!")
print("üìä Visualiza√ß√µes dispon√≠veis em formato interativo e est√°tico")

## üéØ Conclus√µes e Pr√≥ximos Passos

### ‚úÖ Resultados Alcan√ßados:
- **An√°lise Completa**: 3.547 transa√ß√µes processadas com qualidade de dados superior a 95%
- **Insights Estrat√©gicos**: Identifica√ß√£o de padr√µes comportamentais e oportunidades de otimiza√ß√£o
- **An√°lise de Sentimento**: Sistema multi-dimensional para avalia√ß√£o de satisfa√ß√£o do cliente
- **Relat√≥rio Executivo**: Documenta√ß√£o completa para tomada de decis√£o estrat√©gica
- **Visualiza√ß√µes Interativas**: Dashboard executivo com m√©tricas-chave e tend√™ncias

### üöÄ Pr√≥ximas Etapas Recomendadas:
1. **Implementa√ß√£o de Monitoramento**: Dashboard em tempo real
2. **Campanhas Segmentadas**: A√ß√µes direcionadas por perfil de cliente
3. **Otimiza√ß√£o Operacional**: Ajuste de recursos baseado em previs√µes
4. **Programa de Fidelidade**: Reten√ß√£o de clientes premium
5. **Expans√£o da An√°lise**: Inclus√£o de dados de satisfa√ß√£o e feedback

### üìä Impacto do Sistema LLM:
- Automatiza√ß√£o da gera√ß√£o de insights
- Identifica√ß√£o de padr√µes complexos n√£o evidentes
- Relat√≥rios executivos personalizados
- An√°lise preditiva para planejamento estrat√©gico

**üèÜ Sistema pronto para implementa√ß√£o em ambiente produtivo!**