# Part 2 - Warehouse Analytics

Building on the simulated product-level data generated in the [previous notebook](01_stock_simulation.ipynb), this section aggregates across all product IDs to provide a daily view of warehouse operations. This allows us to evaluate staffing needs, delivery patterns and operational costs at the warehouse level.

In this section, we will examine for each day:

- Total number of items entering the warehouse.
- Total number of items leaving the warehouse.
- Total inventory level across all product IDs.
- Total number of orders that went unfulfilled.
- Total inbound and outbound shipments.
- Truck and van utilization.
- Workers needed to meet demand.
- Simulated human errors. 

We will consider these metrics for both replenishment strategies, weekly scheduled and just-in-time (JIT) deliveries, to allow for a more robust comparison.


## Set-up

Import the required modules and the product-level dataset:

In [1]:
import pandas as pd
import numpy as np
import math

# fix the random seed so that results are reproducible
np.random.seed(16)

products_df = pd.read_csv("../data/warehouse_products.csv")

Select only the relevant columns:

In [2]:
selected_columns = ["date",
                    "warehouse_id",
                    "inbound_units_weekly",
                    "actual_outbound_weekly",
                    "inventory_level_weekly",
                    "unmet_demand_weekly",
                    "inbound_units_jit",
                    "actual_outbound_jit",
                    "inventory_level_jit",
                    "unmet_demand_jit"]

df = products_df[selected_columns]

Aggregate the columns by grouping by both date and warehouse ID:

In [3]:
group_df = df.groupby(["date", "warehouse_id"])

group_df = group_df.agg({
    "inbound_units_weekly": "sum",
    "actual_outbound_weekly": "sum",
    "inventory_level_weekly": "sum",
    "unmet_demand_weekly": "sum",
    "inbound_units_jit": "sum",
    "actual_outbound_jit": "sum",
    "inventory_level_jit": "sum",
    "unmet_demand_jit": "sum"})

group_df = group_df.reset_index()

Rename the aggregated columns to more appropriate names:

In [4]:
group_df = group_df.rename(columns={
    "actual_outbound_weekly": "orders_fulfilled_weekly",
    "unmet_demand_weekly": "missed_sales_weekly",
    "actual_outbound_jit": "orders_fulfilled_jit",
    "unmet_demand_jit": "missed_sales_jit"
})

We have aggregated the product-level data to give a high-level overview of warehouse operations.

The dataframe is structured as follows:

In [5]:
group_df.head()

| | date | warehouse_id | inbound_units_weekly | orders_fulfilled_weekly | inventory_level_weekly | missed_sales_weekly | inbound_units_jit | orders_fulfilled_jit | inventory_level_jit | missed_sales_jit |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2024-01-01 | WH1 | 0 | 223 | 1627 | 0 | 0 | 223 | 1627 | 0 |
| 1 | 2024-01-02 | WH1 | 0 | 201 | 1426 | 0 | 0 | 201 | 1426 | 0 |
| 2 | 2024-01-03 | WH1 | 0 | 222 | 1204 | 0 | 0 | 222 | 1204 | 0 |
| 3 | 2024-01-04 | WH1 | 0 | 216 | 988 | 0 | 0 | 216 | 988 | 0 |
| 4 | 2024-01-05 | WH1 | 0 | 258 | 730 | 0 | 0 | 258 | 730 | 0 |


## Warehouse Operations

Based on the aggregated data, we can model additional aspects of warehouse operations, including inbound and outbound shipments and total warehouse utilization. These metrics are highly informative when assessing the cost-to-serve and carbon footprint of each replenishment strategy, and they play a key role in overall decision-making.

### Outbound Shipments

Every day, customer orders are dispatched from the warehouse and delivered by vans, each with a maximum capacity of 100 products. In the model, there are always enough vans to meet demand. This represents a flexible contractor system, where vans are hired as needed and each contractor provides their own vehicle.

This part of the simulation makes two simplifying assumptions that depart from reality. First, the fleet is unlimited. In practice, businesses operate with a fixed fleet size, and exceeding capacity may incur additional costs, delayed dispatch or only partial fulfilment.

Second, orders are shipped the same day, no matter how poorly the vans are utilized. In reality, many operations would defer marginal excess orders to the next day (e.g. 101 units would otherwise require a second van) to maximize van utilization and minimize costs. A more refined model should incorporate queuing or consolidation logic to improve realism and strengthen any insights gained.
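
The consolidation logic mentioned above is not part of the current model. As a minimal sketch of one possible rule, a partially filled final van could be held back whenever it falls below a minimum fill level, with its units carried over to the next day. The function name `dispatch_with_deferral` and the `min_fill` threshold are assumptions introduced here for illustration only:

```python
def dispatch_with_deferral(daily_orders, van_capacity=100, min_fill=0.5):
    """Return (vans, shipped, carried_over) per day under a simple deferral rule.

    Hypothetical rule: full vans always leave; a partially filled last van
    only leaves if it is at least `min_fill` full, otherwise its units wait.
    """
    results = []
    backlog = 0
    for orders in daily_orders:
        pending = orders + backlog
        full_vans, remainder = divmod(pending, van_capacity)
        if remainder >= min_fill * van_capacity:
            # the last van is full enough to dispatch today
            vans, shipped, backlog = full_vans + 1, pending, 0
        else:
            # hold the remainder back and ship only the full vans
            vans, shipped, backlog = full_vans, pending - remainder, remainder
        results.append((vans, shipped, backlog))
    return results


schedule = dispatch_with_deferral([101, 260, 40])
print(schedule)  # [(1, 100, 1), (3, 261, 0), (0, 0, 40)]
```

Under this rule, the single excess unit on the first day travels with the next day's vans instead of triggering a second, nearly empty van. A realistic version would also need a maximum waiting time, since a sustained run of low-volume days would otherwise delay orders indefinitely.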

In [6]:
van_capacity = 100  # maximum number of units per van

# vans needed each day: fulfilled orders divided by capacity, rounded up
group_df["outbound_shipments_weekly"] = np.ceil(group_df["orders_fulfilled_weekly"] / van_capacity).astype("int")
group_df["outbound_shipments_jit"] = np.ceil(group_df["orders_fulfilled_jit"] / van_capacity).astype("int")

Calculate van utilization for each strategy:

In [7]:
group_df["van_utilization_weekly"] = group_df["orders_fulfilled_weekly"] / (group_df["outbound_shipments_weekly"] * van_capacity)

group_df["van_utilization_jit"] = group_df["orders_fulfilled_jit"] / (group_df["outbound_shipments_jit"] * van_capacity)
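
As a worked example of the two calculations above, the following sketch applies them to the 223 orders fulfilled on 2024-01-01, the first row of the aggregated dataframe. The variable names are chosen for illustration:

```python
import math

van_capacity = 100
orders_fulfilled = 223  # first row of the aggregated dataframe (2024-01-01)

# round up: a partially filled van still counts as a shipment
vans_needed = math.ceil(orders_fulfilled / van_capacity)

# share of the dispatched van capacity that was actually used
utilization = orders_fulfilled / (vans_needed * van_capacity)

print(vans_needed, round(utilization, 3))  # 3 0.743
```

Three vans are dispatched for 223 units, so roughly a quarter of the hired capacity goes unused on that day.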

### Inbound Shipments

We can also simulate the inbound shipments used for stock replenishment. The type of vehicle depends on the replenishment strategy: larger, more economical trucks are used for consolidated weekly deliveries, whereas smaller, more agile trucks are used to replenish stock under the JIT strategy.

This difference has important implications for both cost and environmental impact. The larger trucks are more fuel-efficient per unit when fully loaded, while the smaller trucks may result in higher costs and emissions per unit due to their reduced carrying capacity.

The same limitation applies to inbound shipments. In practice, logistics operations often defer low-volume orders and consolidate them into shared or later shipments, which greatly improves truck utilization at the expense of small delays. This strategy is not currently modeled in the simulation but may be explored in future versions to improve realism and accuracy.

Calculate the inbound shipments and truck utilization for weekly scheduled deliveries (1000 units per truck):

In [8]:
large_truck_capacity = 1000

group_df["inbound_shipments_weekly"] = group_df["inbound_units_weekly"] / large_truck_capacity
group_df["inbound_shipments_weekly"] = group_df["inbound_shipments_weekly"].apply(np.ceil).astype("int")
# days without a delivery have 0 units and 0 shipments, so utilization is NaN (0 / 0)
group_df["truck_utilization_weekly"] = group_df["inbound_units_weekly"] / (group_df["inbound_shipments_weekly"] * large_truck_capacity)

Calculate the inbound shipments and truck utilization for JIT deliveries (500 units per truck):

In [9]:
small_truck_capacity = 500

group_df["inbound_shipments_jit"] = group_df["inbound_units_jit"] / small_truck_capacity
group_df["inbound_shipments_jit"] = group_df["inbound_shipments_jit"].apply(np.ceil).astype("int")
group_df["truck_utilization_jit"] = group_df["inbound_units_jit"] / (group_df["inbound_shipments_jit"] * small_truck_capacity)
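
On days without a delivery, both the inbound units and the number of shipments are zero, so the utilization evaluates to 0 / 0, which pandas reports as `NaN`. The sketch below makes that behaviour explicit on a small, hypothetical dataframe (`demo_df` and its values are invented for illustration) by replacing a zero denominator with `NaN` before dividing:

```python
import numpy as np
import pandas as pd

small_truck_capacity = 500

# hypothetical inbound volumes, including a day with no delivery
demo_df = pd.DataFrame({"inbound_units_jit": [0, 480, 750]})

shipments = np.ceil(demo_df["inbound_units_jit"] / small_truck_capacity).astype("int")
capacity_used = shipments * small_truck_capacity

demo_df["inbound_shipments_jit"] = shipments
# a zero denominator becomes NaN, so "no truck ran" is never confused with "0% full"
demo_df["truck_utilization_jit"] = demo_df["inbound_units_jit"] / capacity_used.replace(0, np.nan)

print(demo_df)
```

Because `Series.mean()` skips `NaN` values by default, an average taken over this column reflects only the days on which a truck actually ran. Filling the gaps with 0 instead would pull the average utilization down.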

### Warehouse Utilization

Finally, we can model warehouse utilization, that is, how much of the storage space is currently occupied. In the simulation, we assume that every item is the same size, so utilization can be calculated by dividing the total inventory by the warehouse capacity, which is set at 2200 units.

In practice, optimal warehouse utilization normally lies somewhere between 70% and 85%, balancing efficient use of space against the need for safety stock that can absorb surges in demand and for enough free room to prevent congestion. Consistently running below this range may indicate excess storage, while running above it can lead to bottlenecks and reduced responsiveness, making this an important metric to track.

In [10]:
warehouse_capacity = 2200

group_df["warehouse_utilization_weekly"] = group_df["inventory_level_weekly"] / warehouse_capacity
group_df["warehouse_utilization_jit"] = group_df["inventory_level_jit"] / warehouse_capacity
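
To relate the computed utilization to the 70% to 85% target range, each day can be labeled as below, within or above the band. The following is a minimal sketch using the inventory levels from the first five rows of the aggregated dataframe. The `utilization_band` column and the `demo_df` name are assumptions made for this example:

```python
import pandas as pd

warehouse_capacity = 2200
lower_bound, upper_bound = 0.70, 0.85  # target band quoted in the text

# inventory levels from the first rows of the aggregated dataframe
demo_df = pd.DataFrame({"inventory_level_weekly": [1627, 1426, 1204, 988, 730]})
demo_df["warehouse_utilization_weekly"] = demo_df["inventory_level_weekly"] / warehouse_capacity

# pd.cut assigns each value to a bin; bins are closed on the right
demo_df["utilization_band"] = pd.cut(
    demo_df["warehouse_utilization_weekly"],
    bins=[0, lower_bound, upper_bound, float("inf")],
    labels=["below", "within", "above"],
    include_lowest=True,
)

print(demo_df["utilization_band"].tolist())
# ['within', 'below', 'below', 'below', 'below']
```

Only the first day (about 74% full) falls inside the band. Inventory then drains below 70% because no replenishment arrives during these five days.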

## Staffing and Human Factors

### Staff Count

In [11]:
def determine_staff_count(row, outbound_column):
    # one worker is required per 50 fulfilled orders, rounded up
    outbound_orders = row[outbound_column]
    workers_needed = math.ceil(outbound_orders / 50)
    return workers_needed

group_df["staff_count_weekly"] = group_df.apply(determine_staff_count, args=("orders_fulfilled_weekly",), axis=1)
group_df["staff_count_jit"] = group_df.apply(determine_staff_count, args=("orders_fulfilled_jit",), axis=1)
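
The row-wise `apply` above can also be written as a single vectorized expression, which should give the same counts for non-negative order volumes and scales better on large dataframes. This is a sketch rather than a replacement for the cell above. The constant name `orders_per_worker` is an assumption, with its value of 50 taken from `determine_staff_count`:

```python
import numpy as np
import pandas as pd

orders_per_worker = 50  # same ratio as in determine_staff_count

# order volumes from the first rows of the aggregated dataframe
demo_df = pd.DataFrame({"orders_fulfilled_weekly": [223, 201, 222, 216, 258]})

# divide, round up and convert to whole workers in one pass
demo_df["staff_count_weekly"] = np.ceil(
    demo_df["orders_fulfilled_weekly"] / orders_per_worker
).astype("int")

print(demo_df["staff_count_weekly"].tolist())  # [5, 5, 5, 5, 6]
```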

### Human Error

## Final Warehouse-level Dataset

In [12]:
group_df.to_csv("../data/warehouse_daily.csv", index=False)