## The Problem - Strategic Asset Production Plan
The new demand forecast and projected orders for the next 5 years have just arrived. To ensure our company's success, we need to assess whether our existing supply chain can meet the forecasted demand.

If we find any issues with the given production plan, we need to propose mitigating actions so that we can supply our end customers reliably over the entire time horizon.

To run this analysis, you are given these five data sets: asset_uptime.json, asset_rates.json, skus.json, orders.json, and allocation_plan.json


In [1]:
import os
import json
import pandas as pd
import altair as alt
import pulp
import numpy as np

##### Import data from JSON files and convert to pandas

In [2]:
# Checking to ensure json files are in current directory
obj = os.scandir('.')
files = [item.name for item in obj if '.json' in item.name]
print(files)

['allocation_plan.json', 'asset_rates.json', 'asset_uptime.json', 'orders.json', 'skus.json']


In [3]:
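def import_data(file_name: str) -> pd.DataFrame:
    """Load a JSON file and return its contents as a DataFrame."""
    try:
        with open(file_name, 'r') as file:
            data = json.load(file)
        return pd.DataFrame(data)

    except FileNotFoundError:
        # Report the missing file, then re-raise so later cells do not run on None
        print(f"Error: The file '{file_name}' was not found.")
        raise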

In [4]:
allocation_plan = import_data('allocation_plan.json')
asset_rates = import_data('asset_rates.json')
asset_uptime = import_data('asset_uptime.json')
orders = import_data('orders.json')
skus = import_data('skus.json')

#### Explore Data Set

In [5]:
asset_rates.head()

Unnamed: 0,asset_id,Product,run_rate,cleanup_time
0,L007,A,45,1.0
1,L007,B,45,1.0
2,L007,C,45,1.5
3,L007,D,45,1.5
4,L007,E,45,3.0


In [6]:
print(orders.shape[0])
orders.head()

891


Unnamed: 0,proj_id,Year,Month,SKU,Demand
0,PO_001,2027,11,D,1636
1,PO_002,2028,5,A,7778
2,PO_003,2029,10,B,1827
3,PO_004,2028,2,D,254
4,PO_005,2029,6,D,8600


In [7]:
orders['Date'] = pd.to_datetime(orders[['Year', 'Month']].assign(day=1))

In [8]:
chart = alt.Chart(orders).mark_line(point=True).encode(
    alt.X('Date', timeUnit='yearquarter', title = 'Year'),  # bin dates by year and quarter to reduce granularity
    alt.Y('sum(Demand):Q', title = 'Aggregate Demand in Units'), # sum demand within each quarter; one line per SKU via color
    color='SKU',
).properties(width=1200, height=500,  title = 'Projected Demand over time')

chart

#### Calculate Capacity

In [9]:
# Calculating asset working hours per year
asset_uptime['hours_per_year'] = (asset_uptime['days_per_week'] * asset_uptime['weeks_per_year'] * asset_uptime['hours_per_shift'] * asset_uptime['shifts_per_day'])
asset_uptime

Unnamed: 0,asset_id,days_per_week,weeks_per_year,hours_per_shift,shifts_per_day,hours_per_year
0,L007,7,48,8.0,3,8064.0
1,L042,5,48,6.5,3,4680.0
2,L451,5,50,8.0,3,6000.0
3,L673,7,52,8.0,2,5824.0


Calculate how long each asset takes to produce one batch. Assumption: products must be produced in whole batches, not in arbitrary numbers of units.\
run_rate = units/hour, lot_size = units/batch\
hr/unit * unit/batch = hr/batch\
**(lot_size / run_rate) + cleanup_time = hr/batch**
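
As a quick sanity check of the formula above, the sketch below recomputes the batch time for product A on asset L007 from values that appear in this notebook's outputs (run rate 45 units/hour, lot size 4000 units, cleanup time 1.0). The helper name `hours_per_batch` is introduced here for illustration only, and the sketch assumes `cleanup_time` is expressed in hours.

```python
def hours_per_batch(lot_size: float, run_rate: float, cleanup_time: float) -> float:
    """Run time for one full lot plus the cleanup that follows it, in hours."""
    if run_rate <= 0:
        raise ValueError("run_rate must be positive")
    return lot_size / run_rate + cleanup_time

# Product A on asset L007: 4000 units per lot at 45 units/hour, 1.0 hour cleanup
example = hours_per_batch(lot_size=4000, run_rate=45, cleanup_time=1.0)
print(round(example, 6))  # 89.888889
```

Note that the cleanup time is charged once per batch, so small lots on an asset with a long cleanup carry proportionally more overhead.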

In [10]:
asset_rates_merged = asset_rates.merge(skus, on='Product', how='left')
asset_rates_merged['hours_per_batch'] = ((asset_rates_merged['Lot Size'] / asset_rates_merged['run_rate']) + asset_rates_merged['cleanup_time'])
asset_rates_merged.head()

Unnamed: 0,asset_id,Product,run_rate,cleanup_time,Lot Size,hours_per_batch
0,L007,A,45,1.0,4000,89.888889
1,L007,B,45,1.0,6000,134.333333
2,L007,C,45,1.5,1000,23.722222
3,L007,D,45,1.5,6500,145.944444
4,L007,E,45,3.0,15000,336.333333


#### Calculate Demand

In [11]:
demand_merged = orders.merge(allocation_plan, on='proj_id', how='left')
demand_merged = demand_merged.rename(columns = {'Asset' : 'Asset Allocated'})
demand_merged.head()

Unnamed: 0,proj_id,Year,Month,SKU,Demand,Date,Asset Allocated
0,PO_001,2027,11,D,1636,2027-11-01,L451
1,PO_002,2028,5,A,7778,2028-05-01,L042
2,PO_003,2029,10,B,1827,2029-10-01,L673
3,PO_004,2028,2,D,254,2028-02-01,L042
4,PO_005,2029,6,D,8600,2029-06-01,L007


In [12]:
# Calculate annual demand per SKU and asset (the granularity requested)
demand_by_asset = demand_merged.groupby(['Year', 'SKU', 'Asset Allocated'])['Demand'].sum().reset_index()
demand_by_asset.head(6)

Unnamed: 0,Year,SKU,Asset Allocated,Demand
0,2025,A,L007,45295
1,2025,A,L042,45242
2,2025,A,L451,37759
3,2025,A,L673,41984
4,2025,B,L007,27258
5,2025,B,L042,35812


In [13]:
# Express demand in hours instead of units
# Merge so the asset information is in the same DataFrame for the calculations
demand_by_asset = demand_by_asset.merge(asset_rates_merged, left_on=['SKU', 'Asset Allocated'], right_on=['Product', 'asset_id'], how='left').drop(columns=['Product', 'asset_id'])

In [14]:
# Calculate batches
def calculate_batches(row):
    batches = (row['Demand']//row['Lot Size']) + (row['Demand'] % row['Lot Size'] > 0)
    return batches
demand_by_asset['Batches'] = demand_by_asset.apply(calculate_batches, axis=1)
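
The rounding rule in `calculate_batches` is a ceiling division: any partial lot still costs a full batch. Below is a minimal standalone sketch of the same rule using only integer arithmetic. The helper name `batches_needed` is hypothetical, and the first two sample values are 2025 demand figures taken from this notebook's outputs.

```python
def batches_needed(demand: int, lot_size: int) -> int:
    """Smallest whole number of lots that covers the demand (ceiling division)."""
    if lot_size <= 0:
        raise ValueError("lot_size must be positive")
    # -(-a // b) rounds a / b up for non-negative integers without using floats
    return -(-demand // lot_size)

print(batches_needed(45295, 4000))  # 12: eleven full lots plus one partial lot
print(batches_needed(27258, 6000))  # 5
print(batches_needed(8000, 4000))   # 2: an exact multiple needs no extra batch
```
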

In [15]:
# Calculate how many hours are needed for each SKU
def aggregate_hrs_per_sku(row):
    hours = row['Batches']* row['hours_per_batch']
    return hours
demand_by_asset['Hours Required'] = demand_by_asset.apply(aggregate_hrs_per_sku, axis=1)


In [16]:
demand_by_asset.head()

Unnamed: 0,Year,SKU,Asset Allocated,Demand,run_rate,cleanup_time,Lot Size,hours_per_batch,Batches,Hours Required
0,2025,A,L007,45295,45,1.0,4000,89.888889,12,1078.666667
1,2025,A,L042,45242,75,2.0,4000,55.333333,12,664.0
2,2025,A,L451,37759,65,3.0,4000,64.538462,10,645.384615
3,2025,A,L673,41984,70,3.0,4000,60.142857,11,661.571429
4,2025,B,L007,27258,45,1.0,6000,134.333333,5,671.666667


#### Assess Plan

In [17]:
asset_utilization = demand_by_asset.groupby([ 'Asset Allocated', 'Year'])['Hours Required'].sum().reset_index()
extracted = asset_uptime.loc[:, ['hours_per_year', 'asset_id']]
asset_utilization = asset_utilization.merge(extracted, left_on='Asset Allocated', right_on='asset_id').drop(columns='asset_id')
asset_utilization['Utilization Ratio'] = asset_utilization['Hours Required'] / asset_utilization['hours_per_year']


In [18]:
asset_utilization[asset_utilization['Utilization Ratio'] >=1] # asset-years at or above 100% utilization: the current plan overloads L042 and L451

Unnamed: 0,Asset Allocated,Year,Hours Required,hours_per_year,Utilization Ratio
6,L042,2025,5381.333333,4680.0,1.149858
7,L042,2026,4714.0,4680.0,1.007265
8,L042,2027,4798.333333,4680.0,1.025285
12,L451,2025,6336.940171,6000.0,1.056157
13,L451,2026,7492.508547,6000.0,1.248751
14,L451,2027,6903.91453,6000.0,1.150652
15,L451,2028,7093.918803,6000.0,1.18232
16,L451,2029,8610.089744,6000.0,1.435015
17,L451,2030,8416.474359,6000.0,1.402746


In [19]:

Utilization_chart =  alt.Chart(asset_utilization).mark_line().encode(
        alt.X('Year:O', title='Year' ),
        y = alt.Y('Utilization Ratio:Q', title = 'Utilization Ratio'),
        color = 'Asset Allocated').properties(width = 500, title = 'Asset Capacity Utilization by Asset and Year')

over_utilized = pd.DataFrame({'y': [1.0]})
horizontal_line = alt.Chart(over_utilized).mark_rule(color='black', strokeDash=[5, 5]).encode(
    y = alt.Y('y:Q'))

Utilization_chart + horizontal_line

# can multiply ratio by 100 to speak to percentages for business presentations
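
For the percentage view mentioned in the comment above, a small formatting helper is one option. This is only a sketch: the name `as_percent` is made up for illustration, and the sample ratios are copied from the over-capacity rows shown earlier.

```python
def as_percent(ratio: float, digits: int = 1) -> str:
    """Format a utilization ratio such as 1.149858 as a percentage string."""
    return f"{ratio * 100:.{digits}f}%"

# Ratios for L042 in 2025 and 2026 under the original allocation plan
print(as_percent(1.149858))  # 115.0%
print(as_percent(1.007265))  # 100.7%
```
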

#### Explore a Solution

In [20]:
# finding the most efficient assets for each SKU
efficient_indices = asset_rates_merged.groupby('Product')['hours_per_batch'].idxmin()
efficient_assets  = asset_rates_merged.loc[efficient_indices]
efficient_assets = efficient_assets.reset_index(drop=True)
efficient_assets

Unnamed: 0,asset_id,Product,run_rate,cleanup_time,Lot Size,hours_per_batch
0,L042,A,75,2.0,4000,55.333333
1,L042,B,120,4.0,6000,54.0
2,L007,C,45,1.5,1000,23.722222
3,L042,D,150,2.0,6500,45.333333
4,L007,E,45,3.0,15000,336.333333


In [21]:
# finding the most efficient product for each asset
efficient_indices = asset_rates_merged.groupby('asset_id')['hours_per_batch'].idxmin()
efficient_products = asset_rates_merged.loc[efficient_indices]
efficient_products = efficient_products.reset_index(drop=True)
efficient_products

Unnamed: 0,asset_id,Product,run_rate,cleanup_time,Lot Size,hours_per_batch
0,L007,C,45,1.5,1000,23.722222
1,L042,D,150,2.0,6500,45.333333
2,L451,C,45,2.5,1000,24.722222
3,L673,C,20,4.0,1000,54.0


Mixed Integer Linear Programming (MILP)\
Tutorial: https://www.datacamp.com/tutorial/linear-programming \
pulp documentation: https://coin-or.github.io/pulp/

We must define the decision variables, constraints, and objective function:\
**Decision Variables:**
1. X<sub>i,j,y</sub> : Number of units of product *i* produced on asset *j* in year *y*
2. B<sub>i,j,y</sub> : Number of batches of product *i* run on asset *j* in year *y*

**Constraints:**
1. Demand: demand must be met; the sum of X<sub>i,j,y</sub> over all assets *j* must equal the total demand for product *i* in year *y*
2. Capacity: total working hours per asset cannot be exceeded; the sum over all products *i* of Hours per Batch<sub>i,j</sub> * B<sub>i,j,y</sub> must be <= Annual Hours<sub>j</sub>
3. Batch Count: enough batches must be run to cover the units produced (rounding up); Lot Size<sub>i</sub> * B<sub>i,j,y</sub> >= X<sub>i,j,y</sub>

**Objective Function:**
Minimize the total number of production hours across all assets needed to produce the predicted demand\
Min sum(Hours per Batch<sub>i,j</sub> * B<sub>i,j,y</sub>) over all *i*, *j*, and *y*
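
For reference, the same model can be written compactly as below. The symbols X and B are the decision variables defined above; the remaining symbols are notation introduced here for convenience: $d_{i,y}$ is the annual demand for product $i$ in year $y$, $h_{i,j}$ is the hours per batch of product $i$ on asset $j$, $H_j$ is the annual working hours of asset $j$, and $L_i$ is the lot size of product $i$.

```latex
\begin{aligned}
\min_{X,\,B} \quad & \sum_{i}\sum_{j}\sum_{y} h_{i,j}\, B_{i,j,y} \\
\text{s.t.} \quad & \sum_{j} X_{i,j,y} = d_{i,y} && \forall\, i, y \\
& \sum_{i} h_{i,j}\, B_{i,j,y} \le H_{j} && \forall\, j, y \\
& L_{i}\, B_{i,j,y} \ge X_{i,j,y} && \forall\, i, j, y \\
& X_{i,j,y} \ge 0, \quad B_{i,j,y} \in \mathbb{Z}_{\ge 0} && \forall\, i, j, y
\end{aligned}
```

Because the objective minimizes hours and every $h_{i,j}$ is positive, the solver pushes each $B_{i,j,y}$ down to the smallest integer that satisfies the batch-link constraint, which is the rounded-up value of $X_{i,j,y} / L_i$.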

In [22]:
demand_by_product = orders.groupby(['Year', 'SKU'])['Demand'].sum().reset_index()  # only the Demand column is aggregated, so no columns need to be dropped
demand_by_product.head(6)

Unnamed: 0,Year,SKU,Demand
0,2025,A,170280
1,2025,B,111400
2,2025,C,111278
3,2025,D,102029
4,2025,E,108737
5,2026,A,136457


In [23]:
demand_map = demand_by_product.set_index(['SKU', 'Year'])['Demand'].to_dict()
capacity_map = asset_uptime.set_index('asset_id')['hours_per_year'].to_dict()


# Create Sets (Indexes)
products = skus['Product'].unique()
assets = asset_uptime['asset_id'].unique()
years = demand_by_product['Year'].unique() 

# Create mapping dictionaries for referencing lot size and hours per batch
lot_size_map = skus.set_index('Product')['Lot Size'].to_dict()
hours_per_batch_map = asset_rates_merged.set_index(['asset_id', 'Product'])['hours_per_batch'].to_dict()

In [24]:

# Set up the optimization problem - minimization (of total number of hours)
model = pulp.LpProblem("Asset_Allocation", pulp.LpMinimize)

# Define Decision Variables
X = pulp.LpVariable.dicts("Units", (products, assets, years), lowBound=0, cat='Continuous')  # keep as continuous so that it is easier to solve (not integer-integer)
B = pulp.LpVariable.dicts("Batches", (products, assets, years), lowBound=0, cat='Integer')

# Objective Function: Minimize total production time
model += (
    pulp.lpSum([
        B[i][j][y] * hours_per_batch_map.get((j, i), 0)
        for i in products for j in assets for y in years
    ]), "Total_Production_Hours"
)

# Constraints:
# Demand Constraint: Must meet demand exactly in a given year (sum of X over all assets == total demand)
for i in products:
    for y in years:
        demand = demand_map.get((i, y), 0)
        model += (
            pulp.lpSum([X[i][j][y] for j in assets]) == demand,
            f"Demand_Met_{i}_{y}"
        )

# Capacity Constraint: Cannot exceed annual capacity on any asset within the year
for j in assets:
    capacity = capacity_map.get(j, 0)
    for y in years:
        model += (
            pulp.lpSum([
                B[i][j][y] * hours_per_batch_map.get((j, i), 0)
                for i in products ]) <= capacity,
            f"Capacity_Limit_{j}_{y}"
        )

# Batch constraint: Must have sufficient batches for number of units (round up) Batches >= X/(lot size)
for i in products:
    lot_size = lot_size_map.get(i, np.inf) 
    if lot_size <= 0 or lot_size == np.inf:
        continue
        
    for j in assets:
        for y in years:
            model += (
                lot_size * B[i][j][y] >= X[i][j][y],
                f"Batch_Link_{i}_{j}_{y}"
            )


# Solver
model.solve()   
total_hours = pulp.value(model.objective)
print(f'Total Hours (minimized): {total_hours}')

Total Hours (minimized): 57145.27777777777


In [25]:
# Extract and format results into a DataFrame
results = []
for i in products:
    for j in assets:
        for y in years:
            batches = B[i][j][y].varValue
            if batches > 0:
                hours_required = batches * hours_per_batch_map.get((j, i))
                units = X[i][j][y].varValue
                results.append({'Asset': j, 'Product': i, 'Year': y, 'Units_Allocated': units,'Batches_Run': batches,'Hours_Required': hours_required })
                
df_results = pd.DataFrame(results)
df_results.head(6)

Unnamed: 0,Asset,Product,Year,Units_Allocated,Batches_Run,Hours_Required
0,L042,A,2025,170280.0,43.0,2379.333333
1,L042,A,2026,136457.0,35.0,1936.666667
2,L042,A,2027,106339.0,27.0,1494.0
3,L042,A,2028,96768.0,25.0,1383.333333
4,L042,A,2029,133424.0,34.0,1881.333333
5,L042,A,2030,181551.0,46.0,2545.333333


In [26]:
df_results['Asset'].unique() # L042 and L007 are not only the most efficient assets but also appear sufficient for all demand through 2030.
                             # Before acting on this, consider the operating costs of the other assets, asset locations and transportation costs,
                             # as well as expansion and reserve plans, and the cost of shutting assets down versus keeping them operational


array(['L042', 'L007'], dtype=object)

##### Validate Results

In [27]:
df_results[df_results['Year'] == 2027]

Unnamed: 0,Asset,Product,Year,Units_Allocated,Batches_Run,Hours_Required
2,L042,A,2027,106339.0,27.0,1494.0
8,L042,B,2027,131884.0,22.0,1188.0
14,L007,C,2027,102962.0,103.0,2443.388889
20,L042,D,2027,153408.0,24.0,1088.0
26,L007,E,2027,114242.0,8.0,2690.666667


In [28]:
demand_by_product[demand_by_product['Year'] == 2027]

Unnamed: 0,Year,SKU,Demand
10,2027,A,106339
11,2027,B,131884
12,2027,C,102962
13,2027,D,153408
14,2027,E,114242


In [29]:
# very rough checks for total demand and total hours across all assets
if (int(df_results['Units_Allocated'].sum())) != int(demand_by_product['Demand'].sum()): print('ERROR: demand not met')
else: print('Demand met')

capacity_check = True
for year in years:
    if (int(df_results[df_results['Year'] == year]['Hours_Required'].sum())) > int(asset_uptime['hours_per_year'].sum()): 
        print(f'ERROR: hours error for year {year}')
        capacity_check = False
if capacity_check: print('Capacity met')

Demand met
Capacity met
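
The capacity check above compares each year's total hours with the combined capacity of all assets, so it could pass even if a single asset were overloaded. A stricter per-asset version might look like the sketch below. It is written against plain dictionaries so that it runs on its own; the function name `overloaded_assets` and the dictionary layout are assumptions for illustration, and the sample figures are the 2027 hours and `hours_per_year` values for L007 and L042 that appear in this notebook's outputs.

```python
def overloaded_assets(hours_used, capacity, tolerance=1e-6):
    """Return (asset, year, hours, limit) for every asset-year that exceeds its capacity.

    hours_used: dict mapping (asset, year) -> hours scheduled
    capacity:   dict mapping asset -> hours available per year
    """
    violations = []
    for (asset, year), hours in sorted(hours_used.items()):
        limit = capacity[asset]
        # The tolerance absorbs floating-point noise in solver output
        if hours > limit + tolerance:
            violations.append((asset, year, hours, limit))
    return violations

capacity = {"L007": 8064.0, "L042": 4680.0}
hours_used = {("L007", 2027): 5134.055556, ("L042", 2027): 3770.0}

# Optimized plan for 2027: expected to report no violations
print(overloaded_assets(hours_used, capacity))

# The original allocation plan put 4798.333333 hours on L042 in 2027
print(overloaded_assets({("L042", 2027): 4798.333333}, capacity))
```

With the notebook's DataFrames, the same inputs could be built from `df_results` grouped by asset and year and from `capacity_map`.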


##### Visualize Results

In [30]:
# prepare new utilization ratios
asset_utilization = df_results.groupby([ 'Asset', 'Year'])['Hours_Required'].sum().reset_index()
extracted = asset_uptime.loc[:, ['hours_per_year', 'asset_id']]
asset_utilization = asset_utilization.merge(extracted, left_on='Asset', right_on='asset_id').drop(columns='asset_id')
asset_utilization['Utilization Ratio'] = asset_utilization['Hours_Required'] / asset_utilization['hours_per_year']
asset_utilization


Unnamed: 0,Asset,Year,Hours_Required,hours_per_year,Utilization Ratio
0,L007,2025,5347.555556,8064.0,0.663139
1,L007,2026,5039.166667,8064.0,0.624897
2,L007,2027,5134.055556,8064.0,0.636664
3,L007,2028,4711.277778,8064.0,0.584236
4,L007,2029,7068.222222,8064.0,0.876516
5,L007,2030,6443.0,8064.0,0.798983
6,L042,2025,4130.666667,4680.0,0.882621
7,L042,2026,4048.666667,4680.0,0.8651
8,L042,2027,3770.0,4680.0,0.805556
9,L042,2028,3404.666667,4680.0,0.727493


In [31]:

Utilization_chart =  alt.Chart(asset_utilization).mark_line().encode(
        alt.X('Year:O', title='Year' ),
        y = alt.Y('Utilization Ratio:Q', title = 'Utilization Ratio'),
        color = 'Asset').properties(width = 500, title = 'Suggested Asset Capacity Utilization by Asset and Year')

over_utilized = pd.DataFrame({'y': [1.0]})
horizontal_line = alt.Chart(over_utilized).mark_rule(color='black', strokeDash=[5, 5]).encode(
    y = alt.Y('y:Q',  scale=alt.Scale(domain=[0, 1.1])))

Utilization_chart + horizontal_line

#### Other Considerations
- If whole batches produce more units than demanded, will the excess be retained as stock?
- The demand projections show that demand varies dramatically from month to month, so we must ensure that the capacity of each asset is not exceeded in high-volume quarters
- Continue the optimization at monthly granularity
  - Consider building inventory in slow months to cover overloaded months
- Other parameters may be considered for optimization (other than hours), such as:
  - Downtime for machines and required maintenance
  - Transportation costs/location of assets
  - Operation costs per asset (labor, power, taxes, etc.)
  - Warehouse capacities
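
To illustrate the inventory-building idea from the list above, the sketch below checks whether one asset could cover a sequence of monthly workloads if surplus hours in slow months are used to build stock for later months. It is a deliberately simplified feasibility test under stated assumptions: demand is already expressed in production hours, stock can be carried forward without limit (no warehouse cap or shelf life), and nothing can be produced after it is due. The function name and all figures are hypothetical.

```python
from itertools import accumulate

def first_shortfall_month(monthly_demand_hours, monthly_capacity_hours):
    """Return the index of the first month in which cumulative demand exceeds
    cumulative capacity, or None if pre-building inventory can cover every month.
    """
    if len(monthly_demand_hours) != len(monthly_capacity_hours):
        raise ValueError("demand and capacity must cover the same months")
    cumulative_demand = accumulate(monthly_demand_hours)
    cumulative_capacity = accumulate(monthly_capacity_hours)
    for month, (need, available) in enumerate(zip(cumulative_demand, cumulative_capacity)):
        if need > available:
            return month
    return None

# Hypothetical figures: 400 hours available per month, with a demand spike in the last month
capacity = [400, 400, 400, 400]
demand = [250, 300, 350, 650]

print(first_shortfall_month(demand, capacity))  # None: 1550 hours needed, 1600 available
print(first_shortfall_month([500, 100, 100, 100], capacity))  # 0: no earlier months to build stock in
```

A full monthly model would instead add an inventory variable per product and month to the optimization, which this check does not attempt.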
