# Real-World Financial Time-Series Analysis with Polars & yfinance

Practical examples that use real stock market data from yfinance to demonstrate Polars' time-series capabilities.

## Topics:
- Fetching real stock data with yfinance
- Converting to Polars DataFrames
- Multi-stock analysis with group_by_dynamic
- Technical analysis indicators
- Portfolio analysis
- Correlation and covariance analysis
- Risk metrics (volatility, Sharpe ratio, drawdowns)
- Market regime detection
- Trading signals and backtesting basics

## Installation
```bash
pip install yfinance polars
```

In [None]:
import polars as pl
import yfinance as yf
from datetime import datetime, timedelta
import numpy as np

# Set display options
pl.Config.set_tbl_rows(15)

## Part 1: Fetching Stock Data

### Single Stock Data

In [None]:
# Download Apple stock data for the last year
ticker = 'AAPL'
stock = yf.Ticker(ticker)

# Get historical data
hist = stock.history(period='1y')

print(f"Downloaded {len(hist)} days of data for {ticker}")
print(hist.head())

In [None]:
# Convert to Polars DataFrame
aapl_df = pl.DataFrame({
    'date': hist.index,
    'open': hist['Open'].values,
    'high': hist['High'].values,
    'low': hist['Low'].values,
    'close': hist['Close'].values,
    'volume': hist['Volume'].values,
}).with_columns([
    pl.lit(ticker).alias('symbol')
])

print("\nPolars DataFrame:")
print(aapl_df.head())
print(f"\nSchema: {aapl_df.schema}")

### Multiple Stocks Data

In [None]:
# Download data for multiple stocks
symbols = ['AAPL', 'GOOGL', 'MSFT', 'AMZN', 'TSLA']

# Download all at once (faster)
data = yf.download(symbols, period='2y', group_by='ticker', progress=False)

print(f"Downloaded data for {len(symbols)} stocks")
print(f"Data shape: {data.shape}")

In [None]:
# Convert to long format Polars DataFrame
stocks_list = []

for symbol in symbols:
    df = pl.DataFrame({
        'date': data[symbol].index,
        'open': data[symbol]['Open'].values,
        'high': data[symbol]['High'].values,
        'low': data[symbol]['Low'].values,
        'close': data[symbol]['Close'].values,
        'volume': data[symbol]['Volume'].values,
    }).with_columns([
        pl.lit(symbol).alias('symbol')
    ])
    stocks_list.append(df)

# Concatenate all stocks
stocks_df = pl.concat(stocks_list)

print(f"\nCombined DataFrame: {len(stocks_df)} records")
print(stocks_df.head(10))

## Part 2: Basic Technical Analysis

### Calculate Returns

In [None]:
# Calculate daily returns for each stock
stocks_with_returns = stocks_df.sort(['symbol', 'date']).with_columns([
    # Simple returns
    (pl.col('close') / pl.col('close').shift(1).over('symbol') - 1).alias('daily_return'),
    
    # Log returns (better for analysis)
    (pl.col('close') / pl.col('close').shift(1).over('symbol')).log().alias('log_return'),
    
    # Intraday range
    ((pl.col('high') - pl.col('low')) / pl.col('close')).alias('daily_range_pct')
])

print("Stocks with returns:")
print(stocks_with_returns.filter(pl.col('symbol') == 'AAPL').head(10))

### Moving Averages and Crossovers

In [None]:
# Calculate multiple moving averages
stocks_with_ma = stocks_with_returns.with_columns([
    # Simple Moving Averages
    pl.col('close').rolling_mean(window_size=20).over('symbol').alias('sma_20'),
    pl.col('close').rolling_mean(window_size=50).over('symbol').alias('sma_50'),
    pl.col('close').rolling_mean(window_size=200).over('symbol').alias('sma_200'),
    
    # Exponential Moving Averages
    pl.col('close').ewm_mean(span=12).over('symbol').alias('ema_12'),
    pl.col('close').ewm_mean(span=26).over('symbol').alias('ema_26'),
]).with_columns([
    # MACD (Moving Average Convergence Divergence)
    (pl.col('ema_12') - pl.col('ema_26')).alias('macd'),
]).with_columns([
    # MACD Signal line
    pl.col('macd').ewm_mean(span=9).over('symbol').alias('macd_signal'),
]).with_columns([
    # MACD Histogram
    (pl.col('macd') - pl.col('macd_signal')).alias('macd_hist')
])

# Detect golden cross (SMA 50 crosses above SMA 200)
stocks_with_signals = stocks_with_ma.with_columns([
    # Previous values
    pl.col('sma_50').shift(1).over('symbol').alias('sma_50_prev'),
    pl.col('sma_200').shift(1).over('symbol').alias('sma_200_prev'),
]).with_columns([
    # Golden cross: SMA50 crosses above SMA200
    ((pl.col('sma_50') > pl.col('sma_200')) & 
     (pl.col('sma_50_prev') <= pl.col('sma_200_prev'))).alias('golden_cross'),
    
    # Death cross: SMA50 crosses below SMA200
    ((pl.col('sma_50') < pl.col('sma_200')) & 
     (pl.col('sma_50_prev') >= pl.col('sma_200_prev'))).alias('death_cross'),
])

print("\nStocks with moving averages and signals:")
print(stocks_with_signals.filter(pl.col('symbol') == 'AAPL').tail(10).select([
    'date', 'symbol', 'close', 'sma_20', 'sma_50', 'sma_200', 'macd', 'macd_signal'
]))

In [None]:
# Find all golden crosses in the dataset
golden_crosses = stocks_with_signals.filter(pl.col('golden_cross'))

print(f"\nFound {len(golden_crosses)} golden cross signals:")
print(golden_crosses.select(['date', 'symbol', 'close', 'sma_50', 'sma_200']))

### Bollinger Bands

In [None]:
# Calculate Bollinger Bands
stocks_with_bb = stocks_with_ma.with_columns([
    pl.col('close').rolling_std(window_size=20).over('symbol').alias('bb_std')
]).with_columns([
    (pl.col('sma_20') + 2 * pl.col('bb_std')).alias('bb_upper'),
    (pl.col('sma_20') - 2 * pl.col('bb_std')).alias('bb_lower'),
    
    # Bollinger Band Width (volatility indicator)
    ((pl.col('bb_std') * 4) / pl.col('sma_20')).alias('bb_width'),
    
    # %B (position within bands)
    ((pl.col('close') - (pl.col('sma_20') - 2 * pl.col('bb_std'))) / 
     (4 * pl.col('bb_std'))).alias('bb_percent')
])

# Identify when price touches or breaks bands
bb_signals = stocks_with_bb.with_columns([
    (pl.col('close') > pl.col('bb_upper')).alias('above_upper_band'),
    (pl.col('close') < pl.col('bb_lower')).alias('below_lower_band')
])

print("\nBollinger Bands for AAPL:")
print(bb_signals.filter(pl.col('symbol') == 'AAPL').tail(10).select([
    'date', 'close', 'sma_20', 'bb_upper', 'bb_lower', 'bb_width', 'bb_percent'
]))

### RSI (Relative Strength Index)

In [None]:
# Calculate RSI (14-period)
def calculate_rsi(df, period=14):
    return df.with_columns([
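        # Price changes, computed per symbol so one stock's first row
        # is not compared with the previous stock's last close
        (pl.col('close') - pl.col('close').shift(1).over('symbol')).alias('price_change')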
    ]).with_columns([
        # Separate gains and losses
        pl.when(pl.col('price_change') > 0)
          .then(pl.col('price_change'))
          .otherwise(0)
          .alias('gain'),
        pl.when(pl.col('price_change') < 0)
          .then(-pl.col('price_change'))
          .otherwise(0)
          .alias('loss')
    ]).with_columns([
        # Average gains and losses using EWM
        pl.col('gain').ewm_mean(span=period).over('symbol').alias('avg_gain'),
        pl.col('loss').ewm_mean(span=period).over('symbol').alias('avg_loss')
    ]).with_columns([
        # RS and RSI
        (pl.col('avg_gain') / pl.col('avg_loss')).alias('rs')
    ]).with_columns([
        (100 - (100 / (1 + pl.col('rs')))).alias('rsi')
    ])

stocks_with_rsi = calculate_rsi(stocks_with_returns.sort(['symbol', 'date']))

# Identify overbought/oversold conditions
rsi_signals = stocks_with_rsi.with_columns([
    (pl.col('rsi') > 70).alias('overbought'),
    (pl.col('rsi') < 30).alias('oversold')
])

print("\nRSI for AAPL:")
print(rsi_signals.filter(pl.col('symbol') == 'AAPL').tail(10).select([
    'date', 'close', 'rsi', 'overbought', 'oversold'
]))

In [None]:
# Find recent oversold conditions (potential buy signals)
recent_oversold = rsi_signals.filter(
    (pl.col('oversold')) & 
    (pl.col('date') >= (datetime.now() - timedelta(days=30)))
)

print(f"\nOversold signals in the last 30 days:")
print(recent_oversold.select(['date', 'symbol', 'close', 'rsi']))

## Part 3: Volatility Analysis

In [None]:
# Calculate historical volatility at different timeframes
volatility_df = stocks_with_returns.with_columns([
    # 20-day volatility (annualized)
    (pl.col('daily_return').rolling_std(window_size=20).over('symbol') * np.sqrt(252)).alias('volatility_20d'),
    
    # 50-day volatility
    (pl.col('daily_return').rolling_std(window_size=50).over('symbol') * np.sqrt(252)).alias('volatility_50d'),
    
]).with_columns([
    # ATR (Average True Range, 14 periods): True Range components first
    (pl.col('high') - pl.col('low')).alias('hl'),
    (pl.col('high') - pl.col('close').shift(1).over('symbol')).abs().alias('hc'),
    (pl.col('low') - pl.col('close').shift(1).over('symbol')).abs().alias('lc')
]).with_columns([
    # True Range is the maximum of the three
    pl.max_horizontal(['hl', 'hc', 'lc']).alias('true_range')
]).with_columns([
    # ATR is the moving average of True Range
    pl.col('true_range').rolling_mean(window_size=14).over('symbol').alias('atr')
])

print("\nVolatility metrics:")
print(volatility_df.filter(pl.col('symbol') == 'TSLA').tail(10).select([
    'date', 'symbol', 'close', 'daily_return', 'volatility_20d', 'volatility_50d', 'atr'
]))

In [None]:
# Compare current volatility across stocks
latest_volatility = volatility_df.group_by('symbol').agg([
    pl.col('date').max().alias('latest_date'),
    pl.col('volatility_20d').last().alias('current_vol_20d'),
    pl.col('volatility_50d').last().alias('current_vol_50d')
]).sort('current_vol_20d', descending=True)

print("\nCurrent volatility ranking:")
print(latest_volatility)

## Part 4: Using group_by_dynamic for Time-Based Analysis

### Weekly and Monthly Aggregations

In [None]:
# Calculate weekly OHLC for each stock
weekly_ohlc = stocks_df.sort(['symbol', 'date']).group_by_dynamic(
    'date',
    every='1w',
    group_by='symbol'
).agg([
    pl.col('open').first().alias('open'),
    pl.col('high').max().alias('high'),
    pl.col('low').min().alias('low'),
    pl.col('close').last().alias('close'),
    pl.col('volume').sum().alias('volume'),
]).with_columns([
    # Weekly return (first open to last close within the week)
    ((pl.col('close') - pl.col('open')) / pl.col('open')).alias('weekly_return')
])

print("\nWeekly OHLC:")
print(weekly_ohlc.filter(pl.col('symbol') == 'AAPL').tail(10))

In [None]:
# Monthly aggregations
monthly_stats = stocks_df.sort(['symbol', 'date']).group_by_dynamic(
    'date',
    every='1mo',
    group_by='symbol'
).agg([
    pl.col('close').first().alias('month_open'),
    pl.col('close').last().alias('month_close'),
    pl.col('high').max().alias('month_high'),
    pl.col('low').min().alias('month_low'),
    pl.col('volume').mean().alias('avg_daily_volume'),
    pl.len().alias('trading_days')
]).with_columns([
    ((pl.col('month_close') - pl.col('month_open')) / pl.col('month_open')).alias('monthly_return')
])

print("\nMonthly statistics:")
print(monthly_stats.filter(pl.col('symbol') == 'AAPL').tail(12))

### Quarterly Performance

In [None]:
# Quarterly performance comparison
quarterly_perf = stocks_df.sort(['symbol', 'date']).group_by_dynamic(
    'date',
    every='1q',  # Quarterly
    group_by='symbol'
).agg([
    pl.col('close').first().alias('q_open'),
    pl.col('close').last().alias('q_close'),
    pl.col('high').max().alias('q_high'),
    pl.col('low').min().alias('q_low'),
]).with_columns([
    ((pl.col('q_close') - pl.col('q_open')) / pl.col('q_open') * 100).alias('q_return_pct')
])

# Pivot to compare stocks side by side
quarterly_comparison = quarterly_perf.pivot(
    values='q_return_pct',
    index='date',
    on='symbol'
).sort('date')

print("\nQuarterly returns comparison (%):")
print(quarterly_comparison.tail(8))

## Part 5: Portfolio Analysis

In [None]:
# Create an equal-weighted portfolio
portfolio_weights = {symbol: 1.0 / len(symbols) for symbol in symbols}

print("Portfolio weights:")
for symbol, weight in portfolio_weights.items():
    print(f"{symbol}: {weight:.2%}")

In [None]:
# Calculate portfolio daily returns
portfolio_df = stocks_with_returns.select([
    'date', 'symbol', 'close', 'daily_return'
]).with_columns([
    pl.col('symbol').replace_strict(
        old=list(portfolio_weights.keys()),
        new=list(portfolio_weights.values())
    ).alias('weight')
]).with_columns([
    (pl.col('daily_return') * pl.col('weight')).alias('weighted_return')
])

# Aggregate to get portfolio returns
portfolio_returns = portfolio_df.group_by('date').agg([
    pl.col('weighted_return').sum().alias('portfolio_return')
]).sort('date')

print("\nPortfolio daily returns:")
print(portfolio_returns.tail(10))

In [None]:
# Calculate cumulative portfolio value (starting with $10,000)
portfolio_value = portfolio_returns.with_columns([
    (pl.col('portfolio_return') + 1).alias('growth_factor')
]).with_columns([
    (pl.col('growth_factor').cum_prod() * 10000).alias('portfolio_value')
])

# Portfolio statistics
total_return = (portfolio_value['portfolio_value'][-1] / 10000 - 1) * 100
avg_daily_return = portfolio_returns['portfolio_return'].mean() * 100
volatility = portfolio_returns['portfolio_return'].std() * np.sqrt(252) * 100

print(f"\nPortfolio Performance:")
print(f"Initial Investment: $10,000")
print(f"Final Value: ${portfolio_value['portfolio_value'][-1]:,.2f}")
print(f"Total Return: {total_return:.2f}%")
print(f"Average Daily Return: {avg_daily_return:.4f}%")
print(f"Annualized Volatility: {volatility:.2f}%")

### Sharpe Ratio Calculation

In [None]:
# Calculate Sharpe Ratio (assuming 4% risk-free rate)
risk_free_rate = 0.04
daily_rf_rate = risk_free_rate / 252

portfolio_metrics = portfolio_returns.with_columns([
    (pl.col('portfolio_return') - daily_rf_rate).alias('excess_return')
])

# Calculate Sharpe Ratio
avg_excess_return = portfolio_metrics['excess_return'].mean()
std_excess_return = portfolio_metrics['excess_return'].std()
sharpe_ratio = (avg_excess_return / std_excess_return) * np.sqrt(252)

print(f"\nSharpe Ratio: {sharpe_ratio:.3f}")
print(f"Annualized Excess Return: {avg_excess_return * 252 * 100:.2f}%")
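
Written out, the calculation in the cell above is the usual annualized Sharpe ratio estimated from daily data. The symbols are introduced here for illustration: $r_{p,t}$ is the daily portfolio return, $r_f$ the annual risk-free rate (0.04 in the code), and 252 the assumed number of trading days per year.

$$
r_{e,t} = r_{p,t} - \frac{r_f}{252}, \qquad
S = \frac{\bar{r}_e}{\sigma_{r_e}} \sqrt{252}
$$

Here $\bar{r}_e$ and $\sigma_{r_e}$ are the sample mean and standard deviation of the daily excess returns. The $\sqrt{252}$ factor follows from scaling the mean by 252 and the standard deviation by $\sqrt{252}$, which assumes daily returns are roughly independent.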

### Maximum Drawdown

In [None]:
# Calculate drawdown
drawdown_df = portfolio_value.with_columns([
    pl.col('portfolio_value').cum_max().alias('running_max')
]).with_columns([
    ((pl.col('portfolio_value') - pl.col('running_max')) / pl.col('running_max')).alias('drawdown')
])

# Find maximum drawdown
max_drawdown = drawdown_df['drawdown'].min() * 100
max_dd_date = drawdown_df.filter(
    pl.col('drawdown') == drawdown_df['drawdown'].min()
)['date'][0]

print(f"\nMaximum Drawdown: {max_drawdown:.2f}%")
print(f"Occurred on: {max_dd_date}")

# Show worst drawdown periods
print("\nWorst 5 drawdown periods:")
print(drawdown_df.sort('drawdown').head(5).select(['date', 'portfolio_value', 'running_max', 'drawdown']))

## Part 6: Correlation Analysis

In [None]:
# Pivot returns to wide format for correlation calculation
returns_wide = stocks_with_returns.select([
    'date', 'symbol', 'daily_return'
]).filter(
    pl.col('daily_return').is_not_null()
).pivot(
    values='daily_return',
    index='date',
    on='symbol'
).sort('date')

print("Returns in wide format:")
print(returns_wide.head())

In [None]:
# Calculate correlation matrix
# Note: pairwise correlations are computed with pl.corr here;
# DataFrame.corr() on the numeric columns can also return the full matrix
print("\nCorrelation Analysis:")

for i, symbol1 in enumerate(symbols):
    for symbol2 in symbols[i+1:]:
        corr = returns_wide.select(
            pl.corr(symbol1, symbol2).alias('correlation')
        )['correlation'][0]
        print(f"{symbol1} vs {symbol2}: {corr:.3f}")

### Rolling Correlation

In [None]:
# Calculate 60-day rolling correlation between AAPL and MSFT
aapl_msft_corr = returns_wide.with_columns([
    pl.rolling_corr('AAPL', 'MSFT', window_size=60).alias('aapl_msft_corr_60d')
])

print("\n60-day rolling correlation between AAPL and MSFT:")
print(aapl_msft_corr.select(['date', 'AAPL', 'MSFT', 'aapl_msft_corr_60d']).tail(15))

## Part 7: Advanced Analysis - Intraday Data

In [None]:
# Download 1-minute intraday data (last 7 days)
ticker_intraday = 'AAPL'
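# Ticker.history returns flat column names (Open, High, ...), which the
# conversion cell below expects; yf.download may return MultiIndex columns
intraday_data = yf.Ticker(ticker_intraday).history(
    period='7d',
    interval='1m'
)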

print(f"Downloaded {len(intraday_data)} minutes of intraday data for {ticker_intraday}")

In [None]:
# Convert to Polars
intraday_df = pl.DataFrame({
    'timestamp': intraday_data.index,
    'open': intraday_data['Open'].values,
    'high': intraday_data['High'].values,
    'low': intraday_data['Low'].values,
    'close': intraday_data['Close'].values,
    'volume': intraday_data['Volume'].values,
})

print("\nIntraday data:")
print(intraday_df.head(10))

### Aggregate to 5-minute and 15-minute Bars

In [None]:
# Create 5-minute OHLC bars
bars_5m = intraday_df.group_by_dynamic(
    'timestamp',
    every='5m'
).agg([
    pl.col('open').first().alias('open'),
    pl.col('high').max().alias('high'),
    pl.col('low').min().alias('low'),
    pl.col('close').last().alias('close'),
    pl.col('volume').sum().alias('volume'),
    pl.len().alias('num_ticks')
])

print("\n5-minute OHLC bars:")
print(bars_5m.tail(20))

In [None]:
# Create 15-minute bars with VWAP
bars_15m = intraday_df.group_by_dynamic(
    'timestamp',
    every='15m'
).agg([
    pl.col('open').first().alias('open'),
    pl.col('high').max().alias('high'),
    pl.col('low').min().alias('low'),
    pl.col('close').last().alias('close'),
    pl.col('volume').sum().alias('volume'),
    # VWAP approximated from 1-minute closes, weighted by volume
    ((pl.col('close') * pl.col('volume')).sum() / pl.col('volume').sum()).alias('vwap')
])

print("\n15-minute bars with VWAP:")
print(bars_15m.tail(20))

### Intraday Volume Profile

In [None]:
# Analyze volume by hour of day
intraday_with_time = intraday_df.with_columns([
    pl.col('timestamp').dt.hour().alias('hour'),
    pl.col('timestamp').dt.date().alias('date')
])

hourly_volume = intraday_with_time.group_by('hour').agg([
    pl.col('volume').mean().alias('avg_volume'),
    pl.col('volume').sum().alias('total_volume'),
    pl.len().alias('num_minutes')
]).sort('hour')

print("\nAverage volume by hour of day:")
print(hourly_volume)

### Opening and Closing Price Movements

In [None]:
# Daily open-to-close change, derived from the intraday bars
daily_open_close = intraday_with_time.group_by_dynamic(
    'timestamp',
    every='1d'
).agg([
    pl.col('open').first().alias('day_open'),
    pl.col('close').last().alias('day_close'),
]).with_columns([
    ((pl.col('day_close') - pl.col('day_open')) / pl.col('day_open') * 100).alias('daily_change_pct')
])

print("\nDaily open to close changes:")
print(daily_open_close)

## Part 8: Market Regime Detection

In [None]:
# Detect market regimes based on volatility and trend
regime_df = stocks_with_returns.filter(
    pl.col('symbol') == 'AAPL'
).with_columns([
    # Calculate rolling metrics
    pl.col('daily_return').rolling_mean(window_size=20).alias('trend_20d'),
    pl.col('daily_return').rolling_std(window_size=20).alias('vol_20d'),
    pl.col('close').rolling_mean(window_size=50).alias('sma_50'),
    pl.col('close').rolling_mean(window_size=200).alias('sma_200'),
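]).drop_nulls(subset=['sma_200', 'vol_20d']).with_columns([
    # Classify regime. Warm-up rows without a 200-day SMA are dropped above;
    # otherwise their null comparisons would fall through to 'Bear - High Vol'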
    pl.when((pl.col('close') > pl.col('sma_200')) & (pl.col('vol_20d') < pl.col('vol_20d').median()))
      .then(pl.lit('Bull - Low Vol'))
      .when((pl.col('close') > pl.col('sma_200')) & (pl.col('vol_20d') >= pl.col('vol_20d').median()))
      .then(pl.lit('Bull - High Vol'))
      .when((pl.col('close') <= pl.col('sma_200')) & (pl.col('vol_20d') < pl.col('vol_20d').median()))
      .then(pl.lit('Bear - Low Vol'))
      .otherwise(pl.lit('Bear - High Vol'))
      .alias('regime')
])

# Count days in each regime
regime_counts = regime_df.group_by('regime').agg([
    pl.len().alias('num_days'),
    pl.col('daily_return').mean().alias('avg_daily_return')
])

print("\nMarket regime distribution for AAPL:")
print(regime_counts)

In [None]:
# Show recent regime
print("\nRecent market regime:")
print(regime_df.tail(10).select(['date', 'close', 'sma_200', 'vol_20d', 'regime']))

## Part 9: Simple Trading Signal Backtest

In [None]:
# Simple SMA crossover strategy
backtest_df = stocks_with_ma.filter(
    pl.col('symbol') == 'AAPL'
).with_columns([
    # Generate signals
    pl.when(pl.col('sma_20') > pl.col('sma_50'))
      .then(pl.lit(1))  # Long signal
      .otherwise(pl.lit(0))  # No position
      .alias('signal')
]).with_columns([
    # Strategy returns
    (pl.col('signal').shift(1) * pl.col('daily_return')).alias('strategy_return')
])

# Calculate cumulative returns
backtest_results = backtest_df.with_columns([
    ((pl.col('daily_return') + 1).cum_prod() * 100).alias('buy_hold_value'),
    ((pl.col('strategy_return').fill_null(0) + 1).cum_prod() * 100).alias('strategy_value')
])

# Performance metrics
final_bh = backtest_results['buy_hold_value'][-1]
final_strategy = backtest_results['strategy_value'][-1]

strategy_returns = backtest_results['strategy_return'].fill_null(0)
strategy_sharpe = (strategy_returns.mean() / strategy_returns.std()) * np.sqrt(252)

print("\nBacktest Results (SMA 20/50 Crossover):")
print(f"Buy & Hold Return: {final_bh - 100:.2f}%")
print(f"Strategy Return: {final_strategy - 100:.2f}%")
print(f"Strategy Sharpe Ratio: {strategy_sharpe:.3f}")

# Show recent signals
print("\nRecent signals:")
print(backtest_results.tail(10).select([
    'date', 'close', 'sma_20', 'sma_50', 'signal', 'daily_return', 'strategy_return'
]))

## Part 10: Comparative Performance Dashboard

In [None]:
# Create comprehensive performance summary for all stocks
performance_summary = stocks_with_returns.group_by('symbol').agg([
    # Return metrics
    pl.col('close').first().alias('start_price'),
    pl.col('close').last().alias('end_price'),
    ((pl.col('close').last() / pl.col('close').first()) - 1).alias('total_return'),
    pl.col('daily_return').mean().alias('avg_daily_return'),
    
    # Risk metrics
    pl.col('daily_return').std().alias('daily_volatility'),
    pl.col('daily_return').min().alias('worst_day'),
    pl.col('daily_return').max().alias('best_day'),
    
    # Volume
    pl.col('volume').mean().alias('avg_volume'),
]).with_columns([
    # Annualized metrics
    (pl.col('avg_daily_return') * 252).alias('annualized_return'),
    (pl.col('daily_volatility') * np.sqrt(252)).alias('annualized_volatility'),
]).with_columns([
    # Sharpe Ratio (assuming 4% risk-free rate)
    ((pl.col('annualized_return') - 0.04) / pl.col('annualized_volatility')).alias('sharpe_ratio')
]).sort('total_return', descending=True)

print("\nComprehensive Performance Summary:")
print(performance_summary)

In [None]:
# Risk-adjusted performance comparison
print("\nRisk-Adjusted Performance (Sharpe Ratio):")
print(performance_summary.select([
    'symbol', 'total_return', 'annualized_volatility', 'sharpe_ratio'
]).sort('sharpe_ratio', descending=True))

## Summary

### Key Polars Time-Series Techniques Demonstrated:

1. **Data Integration**: Converting yfinance data to Polars DataFrames
2. **group_by_dynamic**: Weekly, monthly, quarterly aggregations
3. **Rolling Windows**: Moving averages, RSI, Bollinger Bands, volatility
4. **Window Functions**: `over()` for grouped calculations
5. **Time Components**: Extracting hour, date for intraday analysis
6. **Pivoting**: Creating correlation matrices
7. **Complex Aggregations**: VWAP, OHLC bars, cumulative metrics
8. **Conditional Logic**: Signal generation, regime detection

### Financial Concepts Covered:

- **Technical Indicators**: SMA, EMA, MACD, RSI, Bollinger Bands, ATR
- **Risk Metrics**: Volatility, Sharpe Ratio, Maximum Drawdown
- **Portfolio Analysis**: Returns, diversification, correlation
- **Market Analysis**: Volume profiles, regime detection
- **Trading Strategies**: Crossover signals, backtesting

### Best Practices:

1. Always sort by date/timestamp before time-series operations
2. Use `over()` for grouped rolling calculations
3. Handle null values appropriately (especially for returns)
4. Annualize metrics for proper comparison (252 trading days)
5. Use `group_by_dynamic` for flexible time-based aggregations
6. Leverage lazy evaluation for large datasets

### Next Steps:

- Add more sophisticated indicators (Ichimoku, Fibonacci, etc.)
- Implement advanced strategies (mean reversion, momentum)
- Add transaction costs and slippage to backtests
- Create visualization layers with plotly or matplotlib
- Implement portfolio optimization (Markowitz, Black-Litterman)
- Add options and derivatives analysis