# 🪙 Cryptocurrency Time Series Analysis (Mini Project)

This notebook performs time series analysis and simple forecasting on **Bitcoin (BTC)** data using Python.

**Objectives:**
- Download BTC historical data from Yahoo Finance
- Explore and visualize price trends, returns, and volatility
- Decompose the time series into trend/seasonality/residuals
- Fit a simple ARIMA model and produce short-term forecasts

---
**Instructions:** Run the cells in order. The first cell installs the dependencies; run it once if they are missing.

In [None]:
### Install dependencies (run once)
!pip install -q yfinance pandas matplotlib seaborn statsmodels pmdarima requests

# Optional: for Prophet forecasting (not used in this notebook)
# !pip install prophet


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
import pmdarima as pm

# plotting style
sns.set(style='whitegrid')


In [None]:
# ----------------------
# 1) Fetch historical BTC data (Yahoo Finance)
# ----------------------
import yfinance as yf

# You can change period to '5y' or 'max' if you want more history
ticker = 'BTC-USD'
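# auto_adjust=False keeps a separate 'Adj Close' column (recent yfinance versions adjust prices by default)
df = yf.download(ticker, period='2y', interval='1d', auto_adjust=False, progress=False)
# Some yfinance versions return MultiIndex columns (field, ticker) even for a single ticker; flatten them
if isinstance(df.columns, pd.MultiIndex):
    df.columns = df.columns.get_level_values(0)
df = df[['Open','High','Low','Close','Adj Close','Volume']].copy()
df.index = pd.to_datetime(df.index)
df.rename(columns={'Adj Close':'Adj_Close'}, inplace=True)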
print('Data loaded: rows =', len(df))
display(df.tail())


In [None]:
# ----------------------
# 2) Preprocessing & Feature Engineering
# ----------------------
if 'Adj_Close' in df.columns:
    price = df['Adj_Close'].copy()
elif 'Close' in df.columns:
    price = df['Close'].copy()
else:
    price = df.iloc[:,0].copy()

# Ensure daily frequency and fill missing
price = price.asfreq('D').ffill()

# Returns and log returns
returns = price.pct_change().fillna(0)
log_returns = np.log(price).diff().fillna(0)

# Moving averages
ma_short = price.rolling(window=20).mean()
ma_long  = price.rolling(window=50).mean()

# Volatility (rolling std of log returns, annualized approx)
volatility_20 = log_returns.rolling(window=20).std() * np.sqrt(365)

print('Price series from', price.index.min().date(), 'to', price.index.max().date())
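
To make the quantities in this cell concrete, here is a minimal sketch on a made-up five-day price series (the numbers and the `toy_` names are illustrative assumptions, not BTC data). It shows that log returns add up over time, and how a standard deviation of daily log returns is annualized with `sqrt(365)` as in the cell above.

```python
import numpy as np
import pandas as pd

# Hypothetical prices, for illustration only (not real BTC data)
toy_price = pd.Series([100.0, 110.0, 99.0, 108.9, 108.9])

toy_simple = toy_price.pct_change().dropna()   # P_t / P_{t-1} - 1
toy_log = np.log(toy_price).diff().dropna()    # ln(P_t) - ln(P_{t-1})

# Log returns are additive: their sum equals the log of the total price ratio
total_log_return = toy_log.sum()
total_ratio_log = np.log(toy_price.iloc[-1] / toy_price.iloc[0])

# Annualized volatility from daily log returns (365 days, since BTC trades every day)
toy_annual_vol = toy_log.std() * np.sqrt(365)

print('Simple returns:', toy_simple.round(4).tolist())
print('Sum of log returns:', round(total_log_return, 4))
print('Annualized volatility:', round(toy_annual_vol, 4))
```

Simple returns do not share this additivity, which is one reason the later cells work with log prices and log returns.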


In [None]:
# ----------------------
# 3) Exploratory Plots
# ----------------------
plt.figure(figsize=(14,5))
plt.plot(price, label='Price')
plt.plot(ma_short, label='MA 20')
plt.plot(ma_long, label='MA 50')
plt.title('BTC Price with Moving Averages')
plt.legend()
plt.show()

plt.figure(figsize=(14,4))
plt.plot(log_returns, color='tab:orange')
plt.title('Log Returns (Daily)')
plt.show()

plt.figure(figsize=(10,4))
plt.plot(volatility_20, color='tab:red')
plt.title('Estimated Volatility (20-day Rolling)')
plt.show()


In [None]:
# ----------------------
# 4) Time Series Decomposition
# ----------------------
series_for_decomp = np.log(price.dropna())
# seasonal_decompose needs at least two full cycles, so use a yearly period only with 2+ years of data
period = 365 if len(series_for_decomp) >= 2 * 365 else 30

decomp = seasonal_decompose(series_for_decomp, model='additive', period=period, extrapolate_trend='freq')
fig = decomp.plot()
fig.set_size_inches(12,9)
plt.suptitle('Decomposition of log(price)')
plt.show()
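
As a rough illustration of what an additive decomposition computes, the sketch below splits a synthetic weekly-seasonal series into trend, seasonal, and residual parts with plain pandas. It is a simplified version of the classical moving-average approach, not the exact `seasonal_decompose` implementation (edge handling and even-length periods differ), and every name and number in it is an assumption made for the example.

```python
import numpy as np
import pandas as pd

toy_period = 7
# Hypothetical weekly pattern that sums to zero, repeated over 8 weeks on a linear trend
pattern = np.array([3.0, 1.0, -2.0, 0.0, -1.0, 2.0, -3.0])
n = 8 * toy_period
position = np.arange(n) % toy_period
toy = pd.Series(0.5 * np.arange(n) + np.tile(pattern, 8))

# Trend: centered moving average over one full cycle (NaN at both edges)
toy_trend = toy.rolling(window=toy_period, center=True).mean()

# Seasonal: mean detrended value at each position in the cycle, re-centered to sum to zero
detrended = toy - toy_trend
seasonal_means = detrended.groupby(position).mean()
seasonal_means = seasonal_means - seasonal_means.mean()
toy_seasonal = pd.Series(seasonal_means.to_numpy()[position], index=toy.index)

# Residual: whatever is left after removing trend and seasonality
toy_resid = toy - toy_trend - toy_seasonal

print(seasonal_means.round(3).tolist())
```

Because the moving average spans exactly one cycle, the seasonal pattern averages out of the trend estimate, which is why the series needs to cover at least two full cycles for the seasonal means to be meaningful.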


In [None]:
# ----------------------
# 5) ACF & PACF (for lag analysis)
# ----------------------
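# plot_acf / plot_pacf were imported above; pass an Axes so each plot is drawn on the sized figure
# (without ax=..., they create their own figure and leave the plt.figure one empty)
fig, ax = plt.subplots(figsize=(12,4))
plot_acf(log_returns.dropna(), lags=40, ax=ax)
ax.set_title('ACF - Log Returns')
plt.show()

fig, ax = plt.subplots(figsize=(12,4))
plot_pacf(log_returns.dropna(), lags=40, method='ywm', ax=ax)
ax.set_title('PACF - Log Returns')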
plt.show()
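
For readers who want to see what a single bar in the ACF plot represents, here is a minimal sketch of the sample autocorrelation at one lag, using a common estimator (lagged products of the centered series divided by its total sum of squares). The helper `sample_acf` and the toy series are hypothetical, introduced only for this example.

```python
import numpy as np

def sample_acf(x, lag):
    """Sample autocorrelation of a 1-D series at a non-negative lag (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    if lag == 0:
        return 1.0
    # Sum of products of the series with its lagged copy, scaled by the total sum of squares
    return float(np.sum(centered[lag:] * centered[:-lag]) / np.sum(centered ** 2))

# Illustrative series that flips sign every step
toy_series = np.array([1.0, -1.0] * 10)
acf_lag1 = sample_acf(toy_series, 1)
acf_lag2 = sample_acf(toy_series, 2)
print('lag 1:', acf_lag1, 'lag 2:', acf_lag2)
```

A series that alternates in sign gives a strongly negative value at lag 1 and a strongly positive one at lag 2. Daily log returns usually show much weaker structure, which is what the plots above help check.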


In [None]:
# ----------------------
# 6) Auto ARIMA Forecasting on log(price)
# ----------------------
y = series_for_decomp.dropna()
train_len = int(len(y) * 0.9)
y_train, y_test = y.iloc[:train_len], y.iloc[train_len:]

print('Training points:', len(y_train), 'Test points:', len(y_test))

print('Fitting auto_arima (may take a moment)...')
model = pm.auto_arima(y_train, seasonal=False, stepwise=True, suppress_warnings=True, max_order=10)
print(model.summary())

n_periods = len(y_test)
fc, confint = model.predict(n_periods=n_periods, return_conf_int=True)
fc_index = y_test.index

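# back-transform forecasts (np.asarray avoids index misalignment if predict returns pandas objects)
fc_series = pd.Series(np.exp(np.asarray(fc)), index=fc_index)
confint = np.asarray(confint)
lower_series = pd.Series(np.exp(confint[:,0]), index=fc_index)
upper_series = pd.Series(np.exp(confint[:,1]), index=fc_index)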

plt.figure(figsize=(12,6))
plt.plot(price, label='Actual Price (full)')
plt.plot(fc_series, label='ARIMA Forecast', color='red')
plt.fill_between(fc_index, lower_series, upper_series, color='pink', alpha=0.3)
plt.title('BTC Price Forecast (ARIMA on log-price)')
plt.legend()
plt.show()

# Evaluation: MAPE
actual = price.loc[y_test.index]
mape = np.mean(np.abs((actual - fc_series) / actual)) * 100
print(f'MAPE on test set: {mape:.2f}%')
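
A MAPE value is easier to interpret next to a baseline. The sketch below is not part of the original analysis: it computes the same MAPE formula for a naive forecast that repeats the last training value, on hypothetical numbers. The `mape` helper and the `toy_` arrays are assumptions made for illustration.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (same formula as the cell above)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

# Hypothetical prices, for illustration only
toy_train = np.array([100.0, 102.0, 101.0, 105.0])
toy_test = np.array([110.0, 100.0, 105.0])

# Naive forecast: repeat the last training value over the whole test horizon
naive_fc = np.full(len(toy_test), toy_train[-1])
naive_mape = mape(toy_test, naive_fc)
print(f'Naive baseline MAPE: {naive_mape:.2f}%')
```

In this notebook, the equivalent baseline would be `np.full(len(y_test), np.exp(y_train.iloc[-1]))` compared against `actual`. An ARIMA MAPE that is not clearly below the naive one suggests the model adds little over a random-walk assumption.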


---

## Next steps & extensions

- Add exogenous variables (on-chain volume, Google Trends, macro variables).
- Try SARIMA or Prophet for seasonal patterns.
- Use hourly data for short-term forecasting (requires more data and compute).
- Deploy the visualization as a Streamlit app to show live forecasts.

---

*Author: Akingbade Serifat Bukola*