# Notebook 05: Visualization and Pattern Analysis of the Series
**Project:** SARIMAX Analysis - Starbucks Corporation (SBUX)  
**Researcher:** Frankli Zeña Zeña (UNI)

---
## Introduction
In this notebook we carry out a statistical analysis of the graphical patterns in the Starbucks (SBUX) time series. The focus is on identifying the points at which the series tends to repeat the same pattern across the different periods considered as candidate seasonality: annual, monthly, and weekly.

## Data Loading and Split

In [1]:
import pandas as pd
import numpy as np
import datetime as dt
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import warnings
warnings.filterwarnings("ignore")

In [156]:
ruta_archivo = '../data/transformed/sbux_master_sarimax.csv'
data = pd.read_csv(ruta_archivo, index_col='Fecha', parse_dates=True).dropna()
data

Fecha,Date,Adj Close,Volume,Vol_Avg_20,Vol_Anomaly,Log_Return,Margen_Operativo_%,Revenue,choque_estructural,shock_extremo,earnings,riesgo_pais,shock_costos
2021-03-31,2021-03-31,97.421112,6478400,7244615.0,False,-0.009110,14.811039,6668.0,0,0,0,0,0
2021-04-01,2021-04-01,97.519173,5793000,7173675.0,False,0.001006,14.811039,6668.0,0,0,0,0,0
2021-04-05,2021-04-05,98.981331,6913100,7241335.0,False,0.014882,14.811039,6668.0,0,0,0,0,0
2021-04-06,2021-04-06,100.880363,6745200,7322620.0,False,0.019004,14.811039,6668.0,0,0,0,0,0
2021-04-07,2021-04-07,100.916023,5629600,7328555.0,False,0.000353,14.811039,6668.0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2026-02-09,2026-02-09,98.345779,7150600,11434955.0,False,-0.004737,8.984274,9915.1,1,0,0,0,0
2026-02-10,2026-02-10,96.905067,8543500,11497655.0,False,-0.014758,8.984274,9915.1,1,0,0,0,0
2026-02-11,2026-02-11,98.484879,6949100,11567015.0,False,0.016171,8.984274,9915.1,1,0,0,0,0
2026-02-12,2026-02-12,96.139999,9537300,11625220.0,False,-0.024098,8.984274,9915.1,1,0,0,0,0


In [157]:
# Print all the columns of the original data
data.columns

Index(['Date', 'Adj Close', 'Volume', 'Vol_Avg_20', 'Vol_Anomaly',
       'Log_Return', 'Margen_Operativo_%', 'Revenue', 'choque_estructural',
       'shock_extremo', 'earnings', 'riesgo_pais', 'shock_costos'],
      dtype='object')

In [158]:
# Keep only the two series used in this notebook and drop every other column.
# Note: 'Log Price' holds the daily log return (Log_Return), not the log of the price.
clean_data = data[['Adj Close', 'Log_Return']].rename(
    columns={'Adj Close': 'Price', 'Log_Return': 'Log Price'}
)

clean_data

Fecha,Price,Log Price
2021-03-31,97.421112,-0.009110
2021-04-01,97.519173,0.001006
2021-04-05,98.981331,0.014882
2021-04-06,100.880363,0.019004
2021-04-07,100.916023,0.000353
...,...,...
2026-02-09,98.345779,-0.004737
2026-02-10,96.905067,-0.014758
2026-02-11,98.484879,0.016171
2026-02-12,96.139999,-0.024098


In [159]:
df = clean_data.copy()

In [160]:
# Build an interactive chart of the adjusted closing price
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=df.index,
    y=df['Price'],
    mode="lines",
    name="SBUX Adj Close",
    line=dict(color="#00704A", width=2)  # Starbucks green
))

# Layout settings
fig.update_layout(
    title="Interactive Price History: Starbucks (SBUX)",
    xaxis_title="Date",
    yaxis_title="Adjusted Closing Price (USD)",
    hovermode="x unified",
    template="plotly_white",  # White background for academic reports
    width=1200,
    height=500,
    xaxis=dict(rangeslider=dict(visible=True))  # Adds a time range slider below the chart
)

fig.show()

In [161]:
df

Fecha,Price,Log Price
2021-03-31,97.421112,-0.009110
2021-04-01,97.519173,0.001006
2021-04-05,98.981331,0.014882
2021-04-06,100.880363,0.019004
2021-04-07,100.916023,0.000353
...,...,...
2026-02-09,98.345779,-0.004737
2026-02-10,96.905067,-0.014758
2026-02-11,98.484879,0.016171
2026-02-12,96.139999,-0.024098


## 5.1. Cycle Profile Visualization
Following pattern-visualization methods, the series is broken down into profiles overlaid by year, month, and week. This technique shows whether price behavior in specific periods (such as the fourth quarter) has a characteristic signature that repeats, which gives a visual check of the seasonal component ($S$) before SARIMA modeling.
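
The overlay idea described above can be sketched as a pivot table: one column per year, one row per day of the year, with each year rebased to 100 at its first observation so that profiles are compared by shape rather than by price level. The data below is synthetic, and the names `prices`, `profiles`, and `rebased` are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Synthetic business-day price series (illustrative only, not SBUX data)
rng = np.random.default_rng(0)
idx = pd.bdate_range("2021-01-01", "2023-12-31")
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(idx)))), index=idx)

# One row per day of the year, one column per calendar year
frame = pd.DataFrame({
    "Price": prices.to_numpy(),
    "Year": idx.year,
    "DayOfYear": idx.dayofyear,
})
profiles = frame.pivot(index="DayOfYear", columns="Year", values="Price")

# Rebase each year to 100 at its first available observation
first_values = profiles.apply(lambda col: col.dropna().iloc[0])
rebased = profiles / first_values * 100

print(rebased.head())
```

Plotting the columns of `rebased` against its index would give the same kind of overlay as the yearly chart, but with every year starting from a common base.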

In [162]:
import plotly.express as px

`Visualization of Seasonal Patterns Overlaid by Year`

In [163]:
df['Year'] = df.index.year
df['DayOfYear'] = df.index.dayofyear

In [164]:
df

Fecha,Price,Log Price,Year,DayOfYear
2021-03-31,97.421112,-0.009110,2021,90
2021-04-01,97.519173,0.001006,2021,91
2021-04-05,98.981331,0.014882,2021,95
2021-04-06,100.880363,0.019004,2021,96
2021-04-07,100.916023,0.000353,2021,97
...,...,...,...,...
2026-02-09,98.345779,-0.004737,2026,40
2026-02-10,96.905067,-0.014758,2026,41
2026-02-11,98.484879,0.016171,2026,42
2026-02-12,96.139999,-0.024098,2026,43


In [165]:
fig = px.line(df, x='DayOfYear', y='Price', color='Year',
              title="Overlaid Annual Patterns: SBUX Adjusted Price",
              labels={'DayOfYear': 'Day of the Year', 'Price': 'Price (USD)'},
              template="plotly_white")

fig.update_layout(width=1200, height=600)
fig.show()

In [175]:
seasonal_year_mean = (
    df.groupby('DayOfYear')['Price']
    .mean()
    .reset_index()
)

seasonal_year_median = (
    df.groupby('DayOfYear')['Price']
    .median()
    .reset_index()
)

seasonal_year_mode = (
    df.groupby('DayOfYear')['Price']
    .agg(lambda x: x.mode().iloc[0] if not x.mode().empty else np.nan)
    .reset_index()
)
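
One caveat on the mode aggregation above: adjusted prices are effectively continuous, so within a group most values occur exactly once. In that case `Series.mode()` returns every value in ascending order, and `.iloc[0]` is simply the group minimum. The sketch below, on made-up numbers, illustrates this and shows one possible workaround (rounding prices into bins before taking the mode). The bin width of 1 USD is an arbitrary assumption.

```python
import pandas as pd

# Made-up prices for a single group; every value is unique
prices = pd.Series([97.42, 101.13, 99.87, 101.48, 98.05, 100.91])

# With no repeated values, mode() returns all of them sorted ascending,
# so the first element coincides with the minimum
naive_mode = prices.mode().iloc[0]
print(naive_mode == prices.min())

# Possible workaround: round to whole dollars before taking the mode
binned_mode = prices.round(0).mode().iloc[0]
print(binned_mode)
```

If the "mode" panel in the comparison plot tracks the lower envelope of the prices, this is the likely reason.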

In [178]:
# Interactive subplots comparing the three measures of central tendency
fig = make_subplots(rows=3, cols=1, shared_xaxes=True, subplot_titles=("Mean by Day of the Year", "Median by Day of the Year", "Mode by Day of the Year"))

fig.add_trace(go.Scatter(x=seasonal_year_mean['DayOfYear'], y=seasonal_year_mean['Price'], mode='lines', name='Mean'), row=1, col=1)

fig.add_trace(go.Scatter(x=seasonal_year_median['DayOfYear'], y=seasonal_year_median['Price'], mode='lines', name='Median'), row=2, col=1)

fig.add_trace(go.Scatter(x=seasonal_year_mode['DayOfYear'], y=seasonal_year_mode['Price'], mode='lines', name='Mode'), row=3, col=1)

fig.update_layout(height=900, width=1200, title_text="Comparison of Central Tendency Measures by Day of the Year")
fig.show()

`Visualization of Seasonal Patterns Overlaid by Month`

In [179]:
df['Month'] = df.index.month
df['Day'] = df.index.day

In [180]:
df

Fecha,Price,Log Price,Year,DayOfYear,Month,Day
2021-03-31,97.421112,-0.009110,2021,90,3,31
2021-04-01,97.519173,0.001006,2021,91,4,1
2021-04-05,98.981331,0.014882,2021,95,4,5
2021-04-06,100.880363,0.019004,2021,96,4,6
2021-04-07,100.916023,0.000353,2021,97,4,7
...,...,...,...,...,...,...
2026-02-09,98.345779,-0.004737,2026,40,2,9
2026-02-10,96.905067,-0.014758,2026,41,2,10
2026-02-11,98.484879,0.016171,2026,42,2,11
2026-02-12,96.139999,-0.024098,2026,43,2,12


In [181]:
fig = px.line(df, x='Day', y='Price', color='Month',
              title="Overlaid Monthly Patterns: SBUX Adjusted Price",
              labels={'Day': 'Day of the Month', 'Price': 'Price (USD)'},
              template="plotly_white")

fig.update_layout(width=1200, height=600)
fig.show()

In [182]:
seasonal_month_mean = (
    df.groupby('Month')['Price']
    .mean()
    .reset_index()
)

seasonal_month_median = (
    df.groupby('Month')['Price']
    .median()
    .reset_index()
)

seasonal_month_mode = (
    df.groupby('Month')['Price']
    .agg(lambda x: x.mode().iloc[0] if not x.mode().empty else np.nan)
    .reset_index()
)

In [183]:
# Interactive subplots comparing the three measures of central tendency
fig = make_subplots(rows=3, cols=1, shared_xaxes=True, subplot_titles=("Monthly Mean", "Monthly Median", "Monthly Mode"))

fig.add_trace(go.Scatter(x=seasonal_month_mean['Month'], y=seasonal_month_mean['Price'], mode='lines', name='Mean'), row=1, col=1)

fig.add_trace(go.Scatter(x=seasonal_month_median['Month'], y=seasonal_month_median['Price'], mode='lines', name='Median'), row=2, col=1)

fig.add_trace(go.Scatter(x=seasonal_month_mode['Month'], y=seasonal_month_mode['Price'], mode='lines', name='Mode'), row=3, col=1)

# Title corrected: this figure aggregates by month, not by day of the year
fig.update_layout(height=900, width=1200, title_text="Comparison of Central Tendency Measures by Month")
fig.show()
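
Averaging price levels by month can mix any seasonal effect with the level and trend of the series. A complementary check, sketched below on synthetic data, is to aggregate daily log returns by calendar month instead. The series, the seed, and the names `log_ret` and `monthly` are assumptions for illustration. In this notebook the corresponding column would be `df['Log Price']`, which holds the log returns.

```python
import numpy as np
import pandas as pd

# Synthetic daily log returns on business days (illustrative only)
rng = np.random.default_rng(1)
idx = pd.bdate_range("2021-01-01", "2024-12-31")
log_ret = pd.Series(rng.normal(0.0, 0.015, len(idx)), index=idx, name="log_ret")

# Mean, median, spread and sample size of log returns per calendar month
monthly = log_ret.groupby(log_ret.index.month).agg(["mean", "median", "std", "count"])
monthly.index.name = "Month"

print(monthly)
```

Comparing each monthly mean against its standard deviation and count gives a rough sense of whether a difference between months is larger than sampling noise.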

`Visualization of Seasonal Patterns Overlaid by Week`

In [184]:
# Now plot by day of the week to check for weekly patterns
df['WeekOfYear'] = df.index.isocalendar().week
df['DayOfWeek'] = df.index.dayofweek

In [131]:
df

Fecha,Price,Log Price,Year,DayOfYear,Month,Day,WeekOfYear,DayOfWeek
2021-03-31,97.421112,-0.009110,2021,90,3,31,13,2
2021-04-01,97.519173,0.001006,2021,91,4,1,13,3
2021-04-05,98.981331,0.014882,2021,95,4,5,14,0
2021-04-06,100.880363,0.019004,2021,96,4,6,14,1
2021-04-07,100.916023,0.000353,2021,97,4,7,14,2
...,...,...,...,...,...,...,...,...
2026-02-09,98.345779,-0.004737,2026,40,2,9,7,0
2026-02-10,96.905067,-0.014758,2026,41,2,10,7,1
2026-02-11,98.484879,0.016171,2026,42,2,11,7,2
2026-02-12,96.139999,-0.024098,2026,43,2,12,7,3


In [185]:
# Title corrected: this figure overlays weeks, not months
fig = px.line(df, x='DayOfWeek', y='Price', color='WeekOfYear',
              title="Overlaid Weekly Patterns: SBUX Adjusted Price",
              labels={'DayOfWeek': 'Day of the Week', 'Price': 'Price (USD)'},
              template="plotly_white")

fig.update_layout(width=1200, height=600)
fig.show()
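
Because `WeekOfYear` repeats every year, colouring by it alone draws, say, week 14 of 2021 and week 14 of 2022 as a single trace. A possible refinement is to build a key that combines ISO year and ISO week so that each calendar week becomes its own line. The sketch below uses synthetic data. The column name `YearWeek` is a hypothetical addition, and passing it to `px.line` through its `line_group` argument is a suggestion rather than what the notebook does.

```python
import numpy as np
import pandas as pd

# Synthetic business-day prices (illustrative only)
rng = np.random.default_rng(2)
idx = pd.bdate_range("2021-03-29", "2022-04-08")
demo = pd.DataFrame({"Price": 100 + rng.normal(0, 1, len(idx)).cumsum()}, index=idx)

iso = demo.index.isocalendar()  # DataFrame with columns: year, week, day
demo["DayOfWeek"] = demo.index.dayofweek
demo["WeekOfYear"] = iso["week"].astype(int)
# Hypothetical key: ISO year plus zero-padded ISO week, e.g. "2021-W13"
demo["YearWeek"] = iso["year"].astype(str) + "-W" + iso["week"].astype(str).str.zfill(2)

# Each YearWeek group holds at most five trading days,
# whereas a bare WeekOfYear group can span several years
print(demo.groupby("YearWeek").size().max())
print(demo.groupby("WeekOfYear").size().max())
```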

In [186]:
seasonal_weekday_mean = (
    df.groupby('DayOfWeek')['Price']
    .mean()
    .reset_index()
)

seasonal_weekday_median = (
    df.groupby('DayOfWeek')['Price']
    .median()
    .reset_index()
)

seasonal_weekday_mode = (
    df.groupby('DayOfWeek')['Price']
    .agg(lambda x: x.mode().iloc[0] if not x.mode().empty else np.nan)
    .reset_index()
)

In [187]:
# Interactive subplots comparing the three measures of central tendency
fig = make_subplots(rows=3, cols=1, shared_xaxes=True, subplot_titles=("Mean by Day of the Week", "Median by Day of the Week", "Mode by Day of the Week"))

fig.add_trace(go.Scatter(x=seasonal_weekday_mean['DayOfWeek'], y=seasonal_weekday_mean['Price'], mode='lines', name='Mean'), row=1, col=1)

fig.add_trace(go.Scatter(x=seasonal_weekday_median['DayOfWeek'], y=seasonal_weekday_median['Price'], mode='lines', name='Median'), row=2, col=1)

fig.add_trace(go.Scatter(x=seasonal_weekday_mode['DayOfWeek'], y=seasonal_weekday_mode['Price'], mode='lines', name='Mode'), row=3, col=1)

# Title corrected: this figure aggregates by day of the week, not by day of the year
fig.update_layout(height=900, width=1200, title_text="Comparison of Central Tendency Measures by Day of the Week")
fig.show()

## 5.2. Rolling Volatility Analysis

In [136]:
# Load the news log (make sure it has been created in the data/external/ folder)
news = pd.read_csv('../data/external/news_history.csv', index_col=0, parse_dates=True)
news

Fecha,Evento,Categoria,Efecto,Fuente,Impacto_Neto
2021-03-16,Luckin reestructuración deuda,Competencia,negativo,https://investor.luckincoffee.com/news-release...,competencia_global
2021-03-17,50th Anniversary Blend Starbucks,Fundamental,positivo,https://www.gcrmag.com/starbucks-celebrates-it...,evento_marca
2021-03-18,Expansión pickup Dunkin,Competencia,negativo,https://news.dunkindonuts.com/news/dunkin-joy-...,competencia_qsr
2021-03-19,Costa Express inteligente,Competencia,negativo,https://caternewsdigital.com/internacional/cos...,competencia_qsr
2021-03-20,McCafe integrado en app,Competencia,negativo,https://www.infofranquicias.com/fd-1348/franqu...,competencia_precio
2021-04-28,Resultados Q2 FY21,Earnings,positivo,https://investor.starbucks.com/news/financial-...,earnings
2021-06-25,Escasez insumos,Demand,negativo,https://www.businessinsider.com/starbucks-shor...,shock_operativo
2021-09-15,Aumento dividendo,Dividendos,positivo,https://www.nasdaq.com/articles/starbucks-sbux...,politica_accionista
2021-10-29,Subida salarial partners,Costos,negativo,https://www.cnbc.com/2021/10/27/starbucks-to-r...,shock_costos
2021-11-01,CocaCola BodyArmor distribucion,Competencia,negativo,https://www.coca-colacompany.com/news/coca-col...,competencia_retail


In [139]:
window = 21  # Roughly one trading month
# 'Log Price' holds daily log returns, so this is a 21-day rolling volatility
df['Rolling_Std'] = df['Log Price'].rolling(window=window).std()

fig = go.Figure()

fig.add_trace(go.Scatter(
    x=df.index, y=df['Rolling_Std'],
    mode='lines', name='Volatility (21d)',
    line=dict(color='orange')
))

# Overlay the news events to check whether they coincide with volatility spikes.
# The inner merge keeps only events dated on a trading day present in df.
df_news_vol = pd.merge(df.reset_index(), news.reset_index(), on='Fecha', how='inner')

fig.add_trace(go.Scatter(
    x=df_news_vol['Fecha'], y=df_news_vol['Rolling_Std'],
    mode='markers', name='Shock event',
    marker=dict(color='red', size=8),
    text=df_news_vol['Evento']
))

fig.update_layout(
    title="Variance Stability Analysis (Rolling Std Dev)",
    xaxis_title="Date", yaxis_title="Standard Deviation",
    template="plotly_white"
)
fig.show()
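
An inner merge on exact dates silently drops any event dated on a weekend or market holiday (2021-03-20 in the news table, for instance, is a Saturday). One possible alternative is `pd.merge_asof` with `direction='forward'`, which attaches each event to the next available trading day. The sketch below uses small synthetic frames. The names `vol`, `events`, and `Trading_Day`, and the five-day `tolerance`, are illustrative assumptions.

```python
import pandas as pd

# Synthetic rolling-volatility series on business days (illustrative only)
vol = pd.DataFrame({
    "Fecha": pd.bdate_range("2021-06-21", "2021-07-02"),
})
vol["Rolling_Std"] = [0.010 + 0.001 * i for i in range(len(vol))]

# Synthetic events: one on a trading day (Friday), one on a Saturday
events = pd.DataFrame({
    "Fecha": pd.to_datetime(["2021-06-25", "2021-06-26"]),
    "Evento": ["weekday event", "weekend event"],
})

# Attach each event to the first trading day on or after its date.
# Both frames must be sorted by their key columns.
aligned = pd.merge_asof(
    events.sort_values("Fecha"),
    vol.rename(columns={"Fecha": "Trading_Day"}).sort_values("Trading_Day"),
    left_on="Fecha",
    right_on="Trading_Day",
    direction="forward",
    tolerance=pd.Timedelta(days=5),
)
print(aligned)
```

Here the Saturday event is matched to the following Monday, so it would still appear as a marker on the volatility chart instead of being discarded.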