# Time Series Modeling Walkthrough

In this notebook, we build a time series forecasting model in Python. The dataset is a series sampled at regular intervals, and we walk through the key steps: loading and plotting the data, checking stationarity, splitting into train and test sets, fitting an ARIMA model, and evaluating and saving the results.

## Step 1: Load the Dataset

In [None]:

# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset (replace 'your_data.csv' with your actual file path)
df = pd.read_csv('your_data.csv')

# Convert the date column to datetime and set as index
df['date_column'] = pd.to_datetime(df['date_column'])
df.set_index('date_column', inplace=True)

# Display the first few rows of the dataset
df.head()
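
The walkthrough assumes regularly spaced observations, so it can help to verify that before modeling. A minimal sketch, using a small hypothetical frame in place of `df` (the dates and values are made up): `pd.infer_freq` returns `None` when the spacing is irregular, and `asfreq` reindexes onto a regular grid. Filling the gap by interpolation is one possible choice, not a recommendation for every dataset.

```python
import pandas as pd

# Hypothetical stand-in for df: a daily series with one missing day
idx = pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-04', '2024-01-05'])
demo = pd.DataFrame({'value_column': [1.0, 2.0, 4.0, 5.0]}, index=idx)

# infer_freq returns None when the spacing is irregular
print(pd.infer_freq(demo.index))

# Reindex onto a daily grid; the missing day appears as NaN
regular = demo.asfreq('D')

# Fill the gap by linear interpolation (one possible choice)
regular['value_column'] = regular['value_column'].interpolate()
print(regular)
```

An index with an explicit frequency also lets statsmodels label forecasts with dates rather than integer positions.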


## Step 2: Data Exploration and Visualization

In [None]:

# Plot the time series data
plt.figure(figsize=(10, 6))
plt.plot(df['value_column'], label='Time Series')
plt.title('Time Series Plot')
plt.xlabel('Date')
plt.ylabel('Values')
plt.legend()
plt.show()


## Step 3: Stationarity Check and Transformation

In [None]:

# Import the Augmented Dickey-Fuller test from statsmodels
from statsmodels.tsa.stattools import adfuller

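# Perform the ADF test (drop missing values first, since adfuller cannot handle NaN)
result = adfuller(df['value_column'].dropna())

# Print the ADF statistic and p-value
print(f'ADF Statistic: {result[0]}')
print(f'p-value: {result[1]}')

# A p-value above 0.05 means the test cannot reject a unit root, so the series is likely non-stationary.
# In that case, difference the series to remove the trend.
# The first differenced value is always NaN; assigning back to df realigns on the index,
# so the NaN is dropped when plotting rather than on assignment.
df['value_diff'] = df['value_column'].diff()
df['value_diff'].dropna().plot(title='Differenced Time Series')
plt.show()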
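
To see what differencing does and how it can be undone, here is a small worked sketch on a hypothetical five-point series with a linear trend (the values are made up for illustration). The ARIMA model in Step 5 applies differencing internally through its `d` parameter, so this is only meant to show the mechanics.

```python
import pandas as pd

# Hypothetical series with a linear trend
s = pd.Series([3.0, 5.0, 7.0, 9.0, 11.0])

# The first value has no predecessor, so its difference is NaN and is dropped
diffed = s.diff().dropna()
print(diffed.tolist())

# Undo the differencing: cumulative sum plus the first original value
restored = diffed.cumsum() + s.iloc[0]
print(restored.tolist())
```

The differenced series is constant, which is what removing a linear trend looks like; the cumulative sum recovers every original value after the first.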


## Step 4: Train-Test Split

In [None]:

# Split the data into training and testing sets (80% train, 20% test)
train_size = int(len(df) * 0.8)
train, test = df.iloc[:train_size], df.iloc[train_size:]

# Plot the training and test sets
plt.figure(figsize=(10, 6))
plt.plot(train['value_column'], label='Train Set')
plt.plot(test['value_column'], label='Test Set', color='orange')
plt.title('Train-Test Split')
plt.legend()
plt.show()


## Step 5: Build an ARIMA Model

In [None]:

# Import ARIMA from statsmodels
from statsmodels.tsa.arima.model import ARIMA

# Build the ARIMA model (replace p, d, q with actual values after inspecting ACF/PACF plots)
model = ARIMA(train['value_column'], order=(5, 1, 0))  # Example: (p, d, q) = (5, 1, 0)
fitted_model = model.fit()

# Display the summary of the model
print(fitted_model.summary())
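
The order comment above points to ACF/PACF plots. As a rough illustration of what an ACF plot displays, the sketch below computes the sample autocorrelation with NumPy only. The helper `sample_acf` and the synthetic AR(1) series are hypothetical stand-ins, not part of the original notebook. For the plotted versions, statsmodels provides `plot_acf` and `plot_pacf` in `statsmodels.graphics.tsaplots`.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelation of x at lags 0..nlags (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(nlags + 1)])

# Synthetic AR(1) series as a stand-in for the (differenced) training data
rng = np.random.default_rng(0)
noise = rng.normal(size=500)
series = np.zeros(500)
for t in range(1, 500):
    series[t] = 0.7 * series[t - 1] + noise[t]

acf = sample_acf(series, nlags=10)
band = 1.96 / np.sqrt(len(series))  # approximate 95% band under white noise

# Mark lags whose autocorrelation falls outside the band
for lag, value in enumerate(acf):
    flag = '*' if lag > 0 and abs(value) > band else ''
    print(f'lag {lag:2d}: {value:+.3f} {flag}')
```

Lags flagged with `*` fall outside the approximate white-noise band and are candidates to consider when choosing the model order.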


## Step 6: Forecasting and Evaluation

In [None]:

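# Forecast on the test data
predictions = fitted_model.forecast(steps=len(test))

# Align the forecast with the test dates. If the date index has no set frequency,
# statsmodels may label forecasts with integer positions, which would break the plot and export below.
predictions.index = test.index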

# Import mean_squared_error to calculate RMSE
from sklearn.metrics import mean_squared_error
import numpy as np

# Calculate RMSE
rmse = np.sqrt(mean_squared_error(test['value_column'], predictions))
print(f'RMSE: {rmse}')

# Plot actual vs predicted values
plt.figure(figsize=(10, 6))
plt.plot(test['value_column'], label='Actual')
plt.plot(predictions, label='Predicted', color='red')
plt.title('Actual vs Predicted')
plt.legend()
plt.show()
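
An RMSE is easier to interpret next to a simple baseline. The sketch below compares a model forecast against a naive forecast that repeats the last training value. All arrays are hypothetical stand-ins for `train['value_column']`, `test['value_column']`, and `predictions`, and the `rmse` helper is defined here only for illustration.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Hypothetical stand-ins for the training values, test values, and model forecast
train_values = np.array([10.0, 12.0, 13.0, 15.0])
test_values = np.array([16.0, 18.0, 17.0])
model_forecast = np.array([16.5, 17.0, 17.5])

# Naive baseline: repeat the last training observation for every test step
naive_forecast = np.full(len(test_values), train_values[-1])

print(f'Model RMSE: {rmse(test_values, model_forecast):.3f}')
print(f'Naive RMSE: {rmse(test_values, naive_forecast):.3f}')
```

If the model does not beat the naive baseline on the test set, the chosen order is probably worth revisiting.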


## Step 7: Model Tuning and Saving

In [None]:

# Import joblib to serialize the fitted model
import joblib

# Save the fitted model
joblib.dump(fitted_model, 'arima_model.pkl')
print("Model saved as arima_model.pkl")


## Step 8: Export CSV

In [None]:

# Save the predictions to a CSV file
output = pd.DataFrame({'Actual': test['value_column'], 'Predicted': predictions})
output.to_csv('predictions_output.csv', index=True)

print("Predictions saved to predictions_output.csv")
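
When reading the exported file back, the date index has to be restored explicitly. A minimal sketch, using a hypothetical three-row frame and an in-memory buffer in place of `predictions_output.csv` so that nothing is written to disk:

```python
import io

import pandas as pd

# Hypothetical stand-in for the `output` frame
idx = pd.date_range('2024-01-01', periods=3, freq='D', name='date_column')
output = pd.DataFrame({'Actual': [1.0, 2.0, 3.0],
                       'Predicted': [1.1, 1.9, 3.2]}, index=idx)

# Write to an in-memory buffer instead of a file
buffer = io.StringIO()
output.to_csv(buffer, index=True)
buffer.seek(0)

# index_col=0 and parse_dates=True restore the datetime index
reloaded = pd.read_csv(buffer, index_col=0, parse_dates=True)
print(reloaded)
```

With a real file, the buffer would be replaced by the CSV path.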