# Forecasting Hospital Stock Requirements using SARIMA

This notebook demonstrates how to use a SARIMA (seasonal ARIMA) model to forecast the consumption of medical items at Hulhumale Hospital. The goal is to predict monthly stock requirements for the next year from historical consumption data.

In [1]:
import pandas as pd
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

## Load the Dataset

We start by loading the dataset, which contains three years of historical monthly consumption for each medical item. The dataset should be a CSV file with one row per item: columns for item details followed by columns for monthly usage.

In [2]:
# Define the file path to the dataset
file_path = 'ML_Group/data.csv'  # Forward slash avoids the invalid '\d' escape; update this path as needed

# Load the dataset into a DataFrame
data = pd.read_csv(file_path)

# Display the first few rows of the dataset to understand its structure
data.head()
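
Before forecasting, it can help to confirm that the file has the layout the later cells rely on: five item-detail columns, then the monthly usage columns, then four trailing columns that are skipped. The sketch below checks this on a toy one-row frame. The helper `monthly_block`, the trailing column names `T1` to `T4`, the month labels, and the sample item are all hypothetical and only stand in for the real file.

```python
import numpy as np
import pandas as pd

# Item-detail column names as used later in the notebook
META_COLS = ['ITEM CODE', 'ITEM DESCRIPTION', 'CATEGORY', 'SUB-CATEGORY', 'UNIT']
N_MONTHS = 36    # three years of monthly usage
N_TRAILING = 4   # assumption: trailing non-usage columns, as implied by iloc[5:-4]


def monthly_block(df, n_meta=len(META_COLS), n_trailing=N_TRAILING):
    """Return the monthly-usage columns as numbers; non-numeric cells become NaN."""
    block = df.iloc[:, n_meta:df.shape[1] - n_trailing]
    return block.apply(pd.to_numeric, errors='coerce')


# Toy stand-in for the real CSV (hypothetical item and column labels)
month_cols = list(pd.date_range('2021-01-01', periods=N_MONTHS, freq='MS').strftime('%Y-%m'))
usage_values = list(range(1, N_MONTHS + 1))
usage_values[2] = 'N/A'  # simulate a missing entry in the third month
toy = pd.DataFrame(
    [['A001', 'Gauze roll', 'Consumables', 'Dressing', 'ROLL'] + usage_values + [0, 0, 0, 0]],
    columns=META_COLS + month_cols + ['T1', 'T2', 'T3', 'T4'],
)

usage = monthly_block(toy)
print(usage.shape)               # rows x monthly columns
print(usage.isna().sum().sum())  # number of cells that could not be parsed
```

If the real file yields a monthly block that is not 36 columns wide, the date index used later will not line up with the data.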

## Prepare for Forecasting

We will prepare to forecast the next 12 months for each item. This involves iterating over each row of the dataset, where each row corresponds to a single item. For each item, we'll use its monthly consumption data to fit a SARIMA model.

In [3]:
# Initialize a list to store prediction results for each item
prediction_list = []

# Define the date range for the existing data
# We assume the data spans from January 2021 to December 2023 (3 years of monthly data)
dates = pd.date_range(start='2021-01-01', periods=36, freq='MS')
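
The forecast horizon follows directly from this index: the 12 forecast months start one month after the last historical date. A minimal sketch of deriving the forecast dates and column labels from `dates`, rather than typing them by hand, is shown below. The names `forecast_dates` and `labels` are illustrative and are not used elsewhere in the notebook.

```python
import pandas as pd

# Historical index, as defined in the notebook
dates = pd.date_range(start='2021-01-01', periods=36, freq='MS')

# The forecast starts on the first day of the month after the last observation
forecast_start = dates[-1] + pd.offsets.MonthBegin(1)
forecast_dates = pd.date_range(start=forecast_start, periods=12, freq='MS')

# Build labels such as 'JANUARY_2024' from the dates themselves
labels = [f"{d.month_name().upper()}_{d.year}" for d in forecast_dates]
print(labels[0], labels[-1])
```

Deriving the labels this way keeps them correct if the start date or the number of historical months changes.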

## Forecast with SARIMA

For each item in the dataset, we will:
- Extract the monthly consumption data.
- Fit a SARIMA model to this time series data.
- Forecast the consumption for the next 12 months.
- Store the forecast results.

In [4]:
# Iterate over each item to forecast the next 12 months
for index, row in data.iterrows():
    item_code = row['ITEM CODE']
    item_description = row['ITEM DESCRIPTION']
    category = row['CATEGORY']
    sub_category = row['SUB-CATEGORY']
    unit = row['UNIT']
    
    # Extract the monthly consumption columns (after the five item-detail columns,
    # skipping the last four columns) and convert them to floats, treating blanks as 0
    consumption_data = row.iloc[5:-4].replace(['', 'NA', 'N/A'], np.nan).astype(float).fillna(0)
    consumption_series = pd.Series(consumption_data.values[:36], index=dates)  # Align the first 36 months with the date index
    
    # Define and fit the SARIMA model: (1, 1, 1) non-seasonal terms and
    # (0, 1, 1) seasonal terms with a 12-month period
    try:
        model = SARIMAX(consumption_series, order=(1, 1, 1), seasonal_order=(0, 1, 1, 12), 
                       enforce_stationarity=False, enforce_invertibility=False)
        sarima_model = model.fit(disp=False, method='powell')  # Powell's method instead of the default L-BFGS optimizer
        
        # Forecast the next 12 months; clip at zero because consumption cannot be negative
        forecast = sarima_model.get_forecast(steps=12)
        forecast_values = forecast.predicted_mean.clip(lower=0).round().astype(int)
        
        # Add the forecast to the list
        prediction_list.append({
            'ITEM CODE': item_code,
            'ITEM DESCRIPTION': item_description,
            'CATEGORY': category,
            'SUB-CATEGORY': sub_category,
            'UNIT': unit,
            **{f'{month}_2024': value for month, value in zip(['JANUARY', 'FEBRUARY', 'MARCH', 'APRIL', 'MAY', 'JUNE', 
                                                              'JULY', 'AUGUST', 'SEPTEMBER', 'OCTOBER', 'NOVEMBER', 'DECEMBER'], 
                                                             forecast_values)}
        })
    except Exception as e:
        print(f"Error processing item {item_description}: {e}")

## Save Forecast Results

Once we have generated the forecasts for all items, we'll save the results into a CSV file for easy access and further analysis.

In [5]:
# Convert the list of predictions to a DataFrame
predictions_df = pd.DataFrame(prediction_list)

# Save the predictions to a CSV file
csv_path = 'forecasted_stock_2024.csv'
predictions_df.to_csv(csv_path, index=False)

print(f"CSV file saved to: {csv_path}")