# **Machine Learning for Supply Chains**

## **Course 1: Fundamentals of Machine Learning for Supply Chains**

### **What is PuLP**

PuLP is a Python package that specializes in discrete optimization. For example, suppose we want to maximize $x \cdot y$ given that $2x + y = 3$. 

If we allow any number as a solution, then calculus tells us that we would get $x = \frac{3}{4}$ and $y = \frac{3}{2}$. 

However, suppose we are only allowed to have integers for solutions (i.e., no fractions). One might expect the answer to be integers close to these fractions (e.g., $x = 1$ and $y = 1$), but it's not always this simple. We would use PuLP in place of calculus for problems like this.

#### **Case Study: Simple Scheduling Application using Linear Programming (PuLP)**

With reference to [Medium article about PuLP](#Optimization-with-PuLP-in-Python---Getting-Started), I have applied the code mentioned in the article as follows:

##### **Problem Overview**

We run a 24-hour lemonade stand with two products:

- **Iced Lemonade**

- **Frozen Lemonade Slushies**

Each product has a **processing time** and a **forecasted hourly demand**.

Our goal is to find out:

- How many staff members are needed for each hour to **meet customer demand**.

- How to **minimize staffing costs**, while ensuring all customer orders can be fulfilled.

##### **Step 1: Import Libraries and Define Inputs**

In [26]:
# importing the necessary libraries
import pandas as pd 
import pulp
from pulp import LpProblem, LpVariable, lpSum, LpMinimize, LpStatus


In [6]:
# Define the 24 hours in a day
hours = range(24)

# Define expected hourly demand for each product
demand_iced = pd.Series(
    [7, 11, 8, 8, 5, 3, 8, 20, 52, 56, 85, 76, 102, 67, 82, 68, 65, 56, 50, 43, 47, 23, 29, 18]
)

demand_slushy = pd.Series(
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 38, 84, 93, 82, 93, 75, 70, 62, 22, 27, 17, 22, 0, 0, 0]
)

# Processing time in hours per product
processing_time_iced = 2 / 60   # 2 minutes per iced lemonade
processing_time_slushy = 5 / 60 # 5 minutes per slushy


##### **Step 2: Define the Optimization Problem**

We are going to use **PuLP** to minimize the total staff cost, assuming:

- Each staff member costs $15 per hour.

- We need to satisfy customer demand using available staff hours.

In [7]:
# Define the LP problem
prob = LpProblem("Simple_Scheduling_Application", LpMinimize)

# Define decision variables for number of staff needed each hour
staff_needed = LpVariable.dicts("staff_hour", hours, lowBound=0, cat='Continuous')


##### **Step 3: Set Objective Function**

Our objective is to **minimize the total hourly cost** of staff.

In [8]:
# Objective: Minimize total staffing cost
prob += lpSum([15 * staff_needed[i] for i in hours]), "Total_Cost"

##### **Step 4: Add Constraints**

We add two main constraints for each hour:

1. At least one person must be present at all times.

2. The total staffing must be enough to handle demand.

In [9]:
# Add constraints for each hour
for hour in hours:
    # Minimum 1 staff member at all hours
    prob += staff_needed[hour] >= 1, f"MinStaff_{hour}"
    
    # Staff must be enough to meet demand
    total_demand = processing_time_iced * demand_iced[hour] + processing_time_slushy * demand_slushy[hour]
    prob += staff_needed[hour] >= total_demand, f"DemandConstraint_{hour}"


##### **Step 5: Solve the Problem**

In [10]:
# Solve the LP problem
status = prob.solve()

# Print the solution status
print("Status:", LpStatus[status])

# Print staff needed for each hour
for v in prob.variables():
    print(v.name, "=", round(v.varValue, 2))


Status: Optimal
staff_hour_0 = 1.0
staff_hour_1 = 1.0
staff_hour_10 = 9.83
staff_hour_11 = 10.28
staff_hour_12 = 10.23
staff_hour_13 = 9.98
staff_hour_14 = 8.98
staff_hour_15 = 8.1
staff_hour_16 = 7.33
staff_hour_17 = 3.7
staff_hour_18 = 3.92
staff_hour_19 = 2.85
staff_hour_2 = 1.0
staff_hour_20 = 3.4
staff_hour_21 = 1.0
staff_hour_22 = 1.0
staff_hour_23 = 1.0
staff_hour_3 = 1.0
staff_hour_4 = 1.0
staff_hour_5 = 1.0
staff_hour_6 = 1.0
staff_hour_7 = 1.0
staff_hour_8 = 1.73
staff_hour_9 = 5.03


##### **Improved Version with Staffing Constraints**

##### **Problem Update**

We now introduce **limited staff availability**:

- We have only **5 employees**.

- Each can work **up to 8 hours**, totaling 40 hours/day.

We also account for **lost sales** when demand is not met.

##### Define New Decision Variables

In [11]:
# Re-initialize the problem with new constraints
prob = LpProblem("Scheduling_With_Staff_Limit", LpMinimize)

# Define integer decision variables for staff and lost sales
staff_needed = LpVariable.dicts("staff_hour", hours, lowBound=0, cat='Integer')
lost_iced = LpVariable.dicts("lost_iced", hours, lowBound=0, cat='Integer')
lost_slushy = LpVariable.dicts("lost_slushy", hours, lowBound=0, cat='Integer')

# Lost sales penalty cost
cost_iced = 3
cost_slushy = 5


##### Update the Objective Function

Now, we minimize both:

- The cost of staff, and

- The cost of lost sales.

In [12]:
# Updated objective: Minimize staff + lost sales cost
prob += lpSum([
    15 * staff_needed[i] + cost_iced * lost_iced[i] + cost_slushy * lost_slushy[i]
    for i in hours
]), "Total_Cost_with_Lost_Sales"


##### Update Constraints

In [13]:
# Total working hours for all staff should not exceed 40 (5 employees * 8 hours)
prob += lpSum([staff_needed[i] for i in hours]) <= 5 * 8, "Total_Work_Hours"

for i in hours:
    # Minimum 1 staff per hour
    prob += staff_needed[i] >= 1, f"MinStaff_{i}"
    
    # Max 5 staff at any hour
    prob += staff_needed[i] <= 5, f"MaxStaff_{i}"
    
    # Demand constraint with allowance for lost sales
    adjusted_demand = processing_time_iced * (demand_iced[i] - lost_iced[i]) + \
                      processing_time_slushy * (demand_slushy[i] - lost_slushy[i])
    
    prob += staff_needed[i] >= adjusted_demand, f"DemandConstraint_{i}"


#####  Solve the New Problem

In [14]:
# Solve the LP problem
status = prob.solve()

# Print status
print("Status:", LpStatus[status])

# Print results
for v in prob.variables():
    print(v.name, "=", round(v.varValue, 2))


Status: Optimal
lost_iced_0 = 0.0
lost_iced_1 = 0.0
lost_iced_10 = 0.0
lost_iced_11 = 1.0
lost_iced_12 = 0.0
lost_iced_13 = 0.0
lost_iced_14 = 0.0
lost_iced_15 = 1.0
lost_iced_16 = 0.0
lost_iced_17 = 1.0
lost_iced_18 = 0.0
lost_iced_19 = 1.0
lost_iced_2 = 0.0
lost_iced_20 = 0.0
lost_iced_21 = 0.0
lost_iced_22 = 0.0
lost_iced_23 = 0.0
lost_iced_3 = 0.0
lost_iced_4 = 0.0
lost_iced_5 = 0.0
lost_iced_6 = 0.0
lost_iced_7 = 0.0
lost_iced_8 = 0.0
lost_iced_9 = 1.0
lost_slushy_0 = 0.0
lost_slushy_1 = 0.0
lost_slushy_10 = 106.0
lost_slushy_11 = 111.0
lost_slushy_12 = 111.0
lost_slushy_13 = 60.0
lost_slushy_14 = 48.0
lost_slushy_15 = 85.0
lost_slushy_16 = 76.0
lost_slushy_17 = 8.0
lost_slushy_18 = 11.0
lost_slushy_19 = 22.0
lost_slushy_2 = 0.0
lost_slushy_20 = 5.0
lost_slushy_21 = 0.0
lost_slushy_22 = 0.0
lost_slushy_23 = 0.0
lost_slushy_3 = 0.0
lost_slushy_4 = 0.0
lost_slushy_5 = 0.0
lost_slushy_6 = 0.0
lost_slushy_7 = 0.0
lost_slushy_8 = 9.0
lost_slushy_9 = 24.0
staff_hour_0 = 1.0
staff_hour_1

#### Module 1 - Video: Linear Programming with PuLP I and II

##### Case Study Example: 

- **Objective**: Minimize cost while filling product demand

- **4 Plants (Supply)**: 
    - Riyadh
    - Jeddah
    - Dammam
    - Tabuk 

- **4 Stores (Demand)**: 
    - Kharj
    - Abha
    - Khubar
    - Qurayyat

- **Costs**: 
    - Fixed cost to build plant
    - Variable cost to shp products per route

- **Questions**:
    - Which plants should we build?
    - Which products should we send on which route?

--- 

**Coding with PuLP**:

- **Setting up Environment**

```Python
# LpVariable is used to create variables
# Need to define before constraints
x = LpVariable("x", 0, 2)
y = LpVariable("y", 0, 2)

# LpProblem is a vase class - we add objective and constraints
prob = LpProblem("My Problem", LpMinimize)

# Creating constraints example
prob += x + y <= 2
prob += (x/2) + y <= 2

# Solving Problem
prob.solve

# Get status
pulp.LpStatus[prob.status]

# Get optimized objective value
pulp.value(prob.objective)
```

Let's apply steps showed in the lecture

In [31]:
# Define plants location in a list of strings
plants = ["Riyadh", "Jeddah", "Dammam", "Tabuk"]

# Define supply values for each plant in a dictionary
supply = {
    "Riyadh": 900,
    "Jeddah": 2400,
    "Dammam": 1300,
    "Tabuk": 1800
}

# Define fixed costs for each plant in a dictionary
fixed_costs = {
    "Riyadh": 75000,
    "Jeddah": 72000,
    "Dammam": 100000,
    "Tabuk": 74000
}

# Define stores location in a list of strings
stores = ["Kharj", "Abha", "Khubar", "Qurayyat"]

# Define demand values for each store in a dictionary
demand = {
    "Kharj": 1700,
    "Abha": 1000,
    "Khubar": 1500,
    "Qurayyat": 1200
}

# Define cost matrix as a list of lists
cost_matrix = [
    [2, 7, 4, 6],  # Costs from Riyadh to each store
    [6, 3, 4, 5],  # Costs from Jeddah to each store
    [8, 4, 6, 5], # Costs from Dammam to each store
    [7, 6, 5, 1]  # Costs from Tabuk to each store
]


# looping through the plants and stores to create a dictionary for costs
costs = {}

# Copilot suggestion to create a dictionary of costs
# for i, plant in enumerate(plants):
#     for j, store in enumerate(stores):
#         costs[(plant, store)] = cost_matrix[i][j]

# code applied in the lecture
for i in range(len(plants)):
    temp_dict = {}
    for j in range(len(stores)):
        temp_dict[stores[j]] = cost_matrix[i][j]
    costs[plants[i]] = temp_dict

# Define loop that shows routes
routes = []
for plant in plants:
    for store in stores:
        routes.append((plant, store))

In [40]:
# Use pulp
route = pulp.LpVariable.dicts("Route", (plants, stores), 0, None, pulp.LpInteger)
route

{'Riyadh': {'Kharj': Route_Riyadh_Kharj,
  'Abha': Route_Riyadh_Abha,
  'Khubar': Route_Riyadh_Khubar,
  'Qurayyat': Route_Riyadh_Qurayyat},
 'Jeddah': {'Kharj': Route_Jeddah_Kharj,
  'Abha': Route_Jeddah_Abha,
  'Khubar': Route_Jeddah_Khubar,
  'Qurayyat': Route_Jeddah_Qurayyat},
 'Dammam': {'Kharj': Route_Dammam_Kharj,
  'Abha': Route_Dammam_Abha,
  'Khubar': Route_Dammam_Khubar,
  'Qurayyat': Route_Dammam_Qurayyat},
 'Tabuk': {'Kharj': Route_Tabuk_Kharj,
  'Abha': Route_Tabuk_Abha,
  'Khubar': Route_Tabuk_Khubar,
  'Qurayyat': Route_Tabuk_Qurayyat}}

In [41]:
# pulp.LpVariable.dicts("name", indices, lowBound, upperBound, category)
build = pulp.LpVariable.dicts("Build_Plant", plants, 0, 1, pulp.LpInteger)
build

{'Riyadh': Build_Plant_Riyadh,
 'Jeddah': Build_Plant_Jeddah,
 'Dammam': Build_Plant_Dammam,
 'Tabuk': Build_Plant_Tabuk}

In [42]:
obj = ""

for (p,s) in routes:
    obj += route[p][s] * costs[p][s]

obj

4*Route_Dammam_Abha + 8*Route_Dammam_Kharj + 6*Route_Dammam_Khubar + 5*Route_Dammam_Qurayyat + 3*Route_Jeddah_Abha + 6*Route_Jeddah_Kharj + 4*Route_Jeddah_Khubar + 5*Route_Jeddah_Qurayyat + 7*Route_Riyadh_Abha + 2*Route_Riyadh_Kharj + 4*Route_Riyadh_Khubar + 6*Route_Riyadh_Qurayyat + 6*Route_Tabuk_Abha + 7*Route_Tabuk_Kharj + 5*Route_Tabuk_Khubar + 1*Route_Tabuk_Qurayyat + 0

In [43]:
for p in plants:
    obj += fixed_costs[p] * build[p]

obj

100000*Build_Plant_Dammam + 72000*Build_Plant_Jeddah + 75000*Build_Plant_Riyadh + 74000*Build_Plant_Tabuk + 4*Route_Dammam_Abha + 8*Route_Dammam_Kharj + 6*Route_Dammam_Khubar + 5*Route_Dammam_Qurayyat + 3*Route_Jeddah_Abha + 6*Route_Jeddah_Kharj + 4*Route_Jeddah_Khubar + 5*Route_Jeddah_Qurayyat + 7*Route_Riyadh_Abha + 2*Route_Riyadh_Kharj + 4*Route_Riyadh_Khubar + 6*Route_Riyadh_Qurayyat + 6*Route_Tabuk_Abha + 7*Route_Tabuk_Kharj + 5*Route_Tabuk_Khubar + 1*Route_Tabuk_Qurayyat + 0

In [44]:
prob = pulp.LpProblem("Supply_Demand", pulp.LpMinimize)
prob += obj, "Total_Cost"
prob

Supply_Demand:
MINIMIZE
100000*Build_Plant_Dammam + 72000*Build_Plant_Jeddah + 75000*Build_Plant_Riyadh + 74000*Build_Plant_Tabuk + 4*Route_Dammam_Abha + 8*Route_Dammam_Kharj + 6*Route_Dammam_Khubar + 5*Route_Dammam_Qurayyat + 3*Route_Jeddah_Abha + 6*Route_Jeddah_Kharj + 4*Route_Jeddah_Khubar + 5*Route_Jeddah_Qurayyat + 7*Route_Riyadh_Abha + 2*Route_Riyadh_Kharj + 4*Route_Riyadh_Khubar + 6*Route_Riyadh_Qurayyat + 6*Route_Tabuk_Abha + 7*Route_Tabuk_Kharj + 5*Route_Tabuk_Khubar + 1*Route_Tabuk_Qurayyat + 0
VARIABLES
0 <= Build_Plant_Dammam <= 1 Integer
0 <= Build_Plant_Jeddah <= 1 Integer
0 <= Build_Plant_Riyadh <= 1 Integer
0 <= Build_Plant_Tabuk <= 1 Integer
0 <= Route_Dammam_Abha Integer
0 <= Route_Dammam_Kharj Integer
0 <= Route_Dammam_Khubar Integer
0 <= Route_Dammam_Qurayyat Integer
0 <= Route_Jeddah_Abha Integer
0 <= Route_Jeddah_Kharj Integer
0 <= Route_Jeddah_Khubar Integer
0 <= Route_Jeddah_Qurayyat Integer
0 <= Route_Riyadh_Abha Integer
0 <= Route_Riyadh_Kharj Integer
0 <= Rou

In [45]:
# Supply/demand constraints

for p in plants:
    product_out = ""
    for s in stores:
        product_out += route[p][s]
    prob += product_out <= supply[p] * build[p], 'Total product out of plant_' + str(p)

prob

Supply_Demand:
MINIMIZE
100000*Build_Plant_Dammam + 72000*Build_Plant_Jeddah + 75000*Build_Plant_Riyadh + 74000*Build_Plant_Tabuk + 4*Route_Dammam_Abha + 8*Route_Dammam_Kharj + 6*Route_Dammam_Khubar + 5*Route_Dammam_Qurayyat + 3*Route_Jeddah_Abha + 6*Route_Jeddah_Kharj + 4*Route_Jeddah_Khubar + 5*Route_Jeddah_Qurayyat + 7*Route_Riyadh_Abha + 2*Route_Riyadh_Kharj + 4*Route_Riyadh_Khubar + 6*Route_Riyadh_Qurayyat + 6*Route_Tabuk_Abha + 7*Route_Tabuk_Kharj + 5*Route_Tabuk_Khubar + 1*Route_Tabuk_Qurayyat + 0
SUBJECT TO
Total_product_out_of_plant_Riyadh: - 900 Build_Plant_Riyadh
 + Route_Riyadh_Abha + Route_Riyadh_Kharj + Route_Riyadh_Khubar
 + Route_Riyadh_Qurayyat <= 0

Total_product_out_of_plant_Jeddah: - 2400 Build_Plant_Jeddah
 + Route_Jeddah_Abha + Route_Jeddah_Kharj + Route_Jeddah_Khubar
 + Route_Jeddah_Qurayyat <= 0

Total_product_out_of_plant_Dammam: - 1300 Build_Plant_Dammam
 + Route_Dammam_Abha + Route_Dammam_Kharj + Route_Dammam_Khubar
 + Route_Dammam_Qurayyat <= 0

Total_produc

In [46]:
for s in stores:
    product_in = ""
    for p in plants:
        product_in += route[p][s]
    prob += product_in >= demand[s], 'Total product in store_' + str(s)

prob

Supply_Demand:
MINIMIZE
100000*Build_Plant_Dammam + 72000*Build_Plant_Jeddah + 75000*Build_Plant_Riyadh + 74000*Build_Plant_Tabuk + 4*Route_Dammam_Abha + 8*Route_Dammam_Kharj + 6*Route_Dammam_Khubar + 5*Route_Dammam_Qurayyat + 3*Route_Jeddah_Abha + 6*Route_Jeddah_Kharj + 4*Route_Jeddah_Khubar + 5*Route_Jeddah_Qurayyat + 7*Route_Riyadh_Abha + 2*Route_Riyadh_Kharj + 4*Route_Riyadh_Khubar + 6*Route_Riyadh_Qurayyat + 6*Route_Tabuk_Abha + 7*Route_Tabuk_Kharj + 5*Route_Tabuk_Khubar + 1*Route_Tabuk_Qurayyat + 0
SUBJECT TO
Total_product_out_of_plant_Riyadh: - 900 Build_Plant_Riyadh
 + Route_Riyadh_Abha + Route_Riyadh_Kharj + Route_Riyadh_Khubar
 + Route_Riyadh_Qurayyat <= 0

Total_product_out_of_plant_Jeddah: - 2400 Build_Plant_Jeddah
 + Route_Jeddah_Abha + Route_Jeddah_Kharj + Route_Jeddah_Khubar
 + Route_Jeddah_Qurayyat <= 0

Total_product_out_of_plant_Dammam: - 1300 Build_Plant_Dammam
 + Route_Dammam_Abha + Route_Dammam_Kharj + Route_Dammam_Khubar
 + Route_Dammam_Qurayyat <= 0

Total_produc

In [47]:
# solve!
prob.solve()

1

In [48]:
print("Status:", pulp.LpStatus[prob.status])

Status: Optimal


In [49]:
pulp.LpStatus

{0: 'Not Solved',
 1: 'Optimal',
 -1: 'Infeasible',
 -2: 'Unbounded',
 -3: 'Undefined'}

In [50]:
prob.status

1

In [52]:
for v in prob.variables():
    print(v.name, "=", v.varValue)

Build_Plant_Dammam = 1.0
Build_Plant_Jeddah = 1.0
Build_Plant_Riyadh = 0.0
Build_Plant_Tabuk = 1.0
Route_Dammam_Abha = 1000.0
Route_Dammam_Kharj = 200.0
Route_Dammam_Khubar = 0.0
Route_Dammam_Qurayyat = 0.0
Route_Jeddah_Abha = 0.0
Route_Jeddah_Kharj = 1500.0
Route_Jeddah_Khubar = 900.0
Route_Jeddah_Qurayyat = 0.0
Route_Riyadh_Abha = 0.0
Route_Riyadh_Kharj = 0.0
Route_Riyadh_Khubar = 0.0
Route_Riyadh_Qurayyat = 0.0
Route_Tabuk_Abha = 0.0
Route_Tabuk_Kharj = 0.0
Route_Tabuk_Khubar = 600.0
Route_Tabuk_Qurayyat = 1200.0


In [53]:
pulp.value(prob.objective)

268400.0

### Course 1 Project: Filling Demand while Optimizing Cost



**Goal:** For all orders on 12/31/2019, we need to decide which plant (warehouse) to produce and ship the product from in order to optimize cost. Note that all products on this day need to go to `PORT_09`. See the description of files below.

#### Part 1 (20 points):

**a)** Create a new column in the `orders` dataframe called `allowed_plants`. To do this, you'll need to apply the defined `get_plants` function using a lambda function. (10)

**b)** Set the index of the `orders` dataframe to be the `Order ID`. Make sure you set the index in place. (10)

#### Part 2 (60 points):

**a)** Return the production cost for a given `order_id` and plant (warehouse) name. From the order ID, you should first get the associated product ID, which can be used to get the cost per unit. From here, multiply the cost per unit by the unit quantity to get the total production cost. (25)

Let:

- $q$ be the unit quantity for the order
- $c_{pu}$ be the cost per unit for a product at a specific plant

Then:

$$
\text{Production Cost} = q \cdot c_{pu}
$$

**b)** Return the shipping cost for a given `order_id` and plant (warehouse) name. From the plant name, you should first get the associated port, which can be used to get the shipping cost per lb. From here, multiply the cost per lb by the weight to get the total shipping cost. (25)

Let:

- $w$ be the weight of the order (in lbs)
- $c_{lb}$ be the cost per lb of shipping from the plant's port to the destination

Then:

$$
\text{Shipping Cost} = w \cdot c_{lb}
$$

**c)** Return the total cost for a given `order_id` and plant (warehouse) name. You should add the results of the two functions above. (10)

Let:

- $\text{Prod}_\text{Cost}$ be the production cost
- $\text{Ship}_\text{Cost}$ be the shipping cost

Then:

$$
\text{Total Cost} = \text{Prod}_\text{Cost} + \text{Ship}_\text{Cost}
$$

#### Part 3 (20 points):

**a)** Solve the linear programming problem and store its status in a variable called `status`. (10)

**b)** Return the total cost for a given `order_id` and plant (warehouse) name. You should add the results of the two functions above. (10)

---

#### Description of Files:

1. **OrderList**: List of orders and required destination.
    - `Order ID`: Unique order ID.
    - `Order Date`: Date order was placed.
    - `Product ID`: Unique product ID.
    - `Destination Port`: Port location of final product destination.
    - `Unit Quantity`: Number of units ordered.
    - `Weight`: Total weight of product in order (lbs).

2. **FreightRates**: Shipping rates from various ports.
    - `orig_port_cd`: Port product is shipped from.
    - `dest_port_cd`: Port product arrives at.
    - `rate`: Shipping rate per lb.

3. **WhCapacities**: Production capacities per plant (warehouse).
    - `Plant ID`: Unique plant (warehouse) ID.
    - `Daily Capacity`: Max number of orders processed per day. Note that this capacity corresponds to number of orders, not units.

4. **ProductsPerPlant**: Production cost of each product per plant.
    - `Plant Code`: Unique plant (warehouse) ID.
    - `Product ID`: Unique product ID.
    - `Cost per unit`: Cost to produce 1 unit of product.

5. **PlantPorts**: Table linking plants (warehouses) to associated ports.
    - `Plant Code`: Unique plant (warehouse) ID.
    - `Port`: Unique Port ID


In [1]:
# --Full Python script--

import pandas as pd
import numpy as np
import pulp

orders = pd.read_excel('C1_Project.xlsx')
orders.dropna(axis = 1, how = 'all', inplace = True)
orders.dropna(axis = 0, how = 'all', inplace = True)

freight_rates = pd.read_excel('C1_Project.xlsx', sheet_name  = 1)
freight_rates.dropna(axis = 1, how = 'all', inplace = True)
freight_rates.dropna(axis = 0, how = 'all', inplace = True)

wh_capacities = pd.read_excel('C1_Project.xlsx', sheet_name  = 2)
wh_capacities.dropna(axis = 1, how = 'all', inplace = True)
wh_capacities.dropna(axis = 0, how = 'all', inplace = True)

products_per_plant = pd.read_excel('C1_Project.xlsx', sheet_name  = 3)
products_per_plant.dropna(axis = 1, how = 'all', inplace = True)
products_per_plant.dropna(axis = 0, how = 'all', inplace = True)

ports = pd.read_excel('C1_Project.xlsx', sheet_name  = 4)
ports.dropna(axis = 1, how = 'all', inplace = True)
ports.dropna(axis = 0, how = 'all', inplace = True)

# We need to turn the shipping costs into a dictionary for easy lookup. We use the 'dict(zip(column1, column2))' paradigm.
shipping_costs = dict(zip(freight_rates['orig_port_cd'], freight_rates['rate']))

# Next, we create a list of all unique products per plant. For now, you can treat the 'tuple' data type as a list.
def get_plants(product_id):
    
    temp = products_per_plant[products_per_plant['Product ID'] == product_id]
    return tuple(np.unique(temp['Plant Code']))

''' 
a) Create a new column in the 'orders' dataframe called 'allowed_plants'. 
To do this, you'll need to apply the defined get_plants function using a lambda function.
'''

# your code here
orders['allowed_plants'] = orders['Product ID'].apply(lambda x: get_plants(x))


'''
b) Set the index of the 'orders' dataframe to be the 'Order ID'. Make sure you set the index in place. 
'''
# your code here

orders.set_index('Order ID', inplace = True)

#Next, we create a dictionary to connect plants (warehouses) with the associated ports. Again, we use the 'dict(zip(column1, column2))' paradigm.
plant_ports = dict(zip(ports['Plant Code'], ports['Port']))


def production_cost(order_id, plant):
    '''
    a) Return the production cost for a given order_id and plant (wahrehouse) name. 
    From the order id, you should first get the associated product id, which can be used to get the cost per unit.
    From here, multiply the cost per unit by the unit quantity to get the total production cost.
    '''
    
    t = orders.loc[order_id]
    prod_id = t['Product ID']
    pt = products_per_plant[products_per_plant['Product ID'] == prod_id]
    pt = pt[pt['Plant Code'] == plant]
    cpu = pt['Cost per unit']
    production_cost = cpu * t['Unit quantity']
    return production_cost.iloc[0]

def shipping_cost(order_id, plant):
    '''
    b) Return the shipping cost for a given order_id and plant (warehouse) name. 
    From the plant name, you should first get the associated port, which can be used to get the shipping cost per lb.
    From here, multiply the cost per lb by the weight to get the total shipping cost.
    '''
    
    t = orders.loc[order_id]
    w = t['Weight']
    port = plant_ports[plant]
    cp = shipping_costs[port]
    ship_cost = cp * w
    return ship_cost

def total_cost(order_id, plant):
    '''
    c) Return the total cost for a given order_id and plant (warehouse) name. 
    You should add the results of the two functions above. 
    '''

    return shipping_cost(order_id, plant) + production_cost(order_id, plant)


# We create a dictionary with the key-value pair 'orderId_plantName': total_cost.
order_costs = {}
for name, row in orders.iterrows():
    order_id = name
    for plant in row['allowed_plants']:   
        order_costs[str(order_id) + '_' + str(plant)] = total_cost(order_id, plant)


# We create a dictionary with the key-value pair 'plantName': list_of_orders.
plants = np.unique(ports['Plant Code'])

plant_orders = {}
for plant in plants:
    temp_list = []
    for name, row in orders.iterrows():
        if plant in row['allowed_plants']:  
            temp_list.append(str(name) + '_' + plant)
    plant_orders[plant] = temp_list


# We create a dictionary with the key-value pair 'plantName': capacity.
plant_cap = dict(zip(wh_capacities['Plant ID'], wh_capacities['Daily Capacity'])) 


# We create a dictionary with the key-value pair 'orderID': orderID_plantName.
order_plants = {}
temp_dict = dict(zip(orders.index, orders['allowed_plants']))
for key in temp_dict:
    temp_list = []
    for pl in temp_dict[key]:
        temp_list.append(str(key) + '_' + pl)
    order_plants[key] = temp_list


'''
Creating linear programming constraints
In this section, we build the linear programming problem and solve.
'''

build = pulp.LpVariable.dicts("Route",order_costs.keys(),0,None, pulp.LpInteger)
prob = pulp.LpProblem("Problem",pulp.LpMinimize)
prob += pulp.lpSum([build[b] * order_costs[b] for b in order_costs.keys()]), "Total Cost"

for plant in plant_orders:
    if len(plant_orders[plant]) > 0:
        prob += pulp.lpSum(build[p] for p in plant_orders[plant]) <= plant_cap[plant], "Total orders out of plant %s"%plant

for o in order_plants:
    prob += pulp.lpSum(build[p] for p in order_plants[o]) == 1, "Order_" + str(o) + "_filled"


''' 
a) Solve the linear programming problem and store its status in a variable called 'status'.
'''

status = pulp.LpStatus[prob.status]
for i in range(3):
    if status == 'Optimal':
        break
    prob.solve()

print("Status:", status)


'''
b) Find the total cost to produce and ship all products and store the answer in a variable called 'total_cost'
Round the final answer to 2 decimal places (https://docs.python.org/3/library/functions.html#round).
'''
total_cost = round(pulp.value(prob.objective), 2)

print("Total Cost = ", str(total_cost))

Status: Not Solved
Total Cost =  27140675.14


## **Course2: Demand Forecasting Using Time Series**

### Implementing an ARIMA Model in Python

In this notebook, we'll use an arima model to make predict gdp levels in future years.


In [None]:
### Import libraries and load in data into a dataframe called 'df'.
### This dataset gives US GDP levels from 1947-2017. 
### In this assignment, our goal is to predict future levels (2018+)

import statsmodels.api as sm
import pandas as pd
import numpy as np

df = pd.read_csv('C2_M3_data.csv')
df.head()

Unnamed: 0,date,level-current,level-chained,change-current,change-chained
0,1947-04-01,246.3,1932.3,6.4,-0.4
1,1947-07-01,250.1,1930.3,17.3,6.4
2,1947-10-01,260.3,1960.7,9.3,6.0
3,1948-01-01,266.2,1989.5,10.5,6.7
4,1948-04-01,272.9,2021.9,10.0,2.3


In [None]:


## Part A (30 pts)

''' 
i) Convert the values in the 'date' column into datetime objects. Set the index of the dataframe to the 'date' column. 

ii) Delete all columns except for 'level-current', making sure the data is still in a DataFrame format instead of a series.

Both i-ii should be done in place to the 'df' dataframe.
'''

df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace = True)
df = pd.DataFrame(df['level-current'])


## Part B (50 pts)

''' 
i) Import the 'statsmodels.api' library. We'll be using the SARIMA model from here. 
Refer to the documentation here: https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html

ii) Create a SARIMAX model on the dataframe with the following parameters:
order=(0, 0, 1),=(1, 1, 1, 12), enforce_stationarity=False, enforce_invertibility=False.

iii) Fit the model on the data and store it in a variable called 'results'.
'''



y = df['level-current']

mod = sm.tsa.statespace.SARIMAX(y,
                                order=(0, 0, 1),
                                seasonal_order=(1, 1, 1, 12),
                                enforce_stationarity=False,
                                enforce_invertibility=False)
results = mod.fit()


## Part C (20 pts)

''' 
i) Generate a 95% confidence interval for predictions starting on 1/1/2018. 
Store it in a tuple variable called 'pred_ci'.
'pred_ci' should be of the form (lower_bound, upper_bound) where lower_bound and upper_bound are decimals.
'''

pred = results.get_prediction(start=pd.to_datetime('1/1/2018'))
pred_cit = pred.conf_int()
pred_ci = (pred_cit.iloc[0][0], pred_cit.iloc[0][1])

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  pred_ci = (pred_cit.iloc[0][0], pred_cit.iloc[0][1])


### Predicting Deliveries

In this final project, we'll implement a SARIMA model to predict the number of deliveries in South Africa

In [7]:
### Load in libraries and read in data.
### The data is stored in a dataframe called 'df'.

import pandas as pd
import numpy as np
import statsmodels.api as sm

df = pd.read_csv('C2_M4_data.csv')
df.head()

Unnamed: 0,ID,Project Code,PQ #,PO / SO #,ASN/DN #,Country,Managed By,Fulfill Via,Vendor INCO Term,Shipment Mode,...,Unit of Measure (Per Pack),Line Item Quantity,Line Item Value,Pack Price,Unit Price,Manufacturing Site,First Line Designation,Weight (Kilograms),Freight Cost (USD),Line Item Insurance (USD)
0,1,100-CI-T01,Pre-PQ Process,SCMS-4,ASN-8,C�te d'Ivoire,PMO - US,Direct Drop,EXW,Air,...,30,19,551.0,29.0,0.97,Ranbaxy Fine Chemicals LTD,Yes,13,780.34,
1,3,108-VN-T01,Pre-PQ Process,SCMS-13,ASN-85,Vietnam,PMO - US,Direct Drop,EXW,Air,...,240,1000,6200.0,6.2,0.03,"Aurobindo Unit III, India",Yes,358,4521.5,
2,4,100-CI-T01,Pre-PQ Process,SCMS-20,ASN-14,C�te d'Ivoire,PMO - US,Direct Drop,FCA,Air,...,100,500,40000.0,80.0,0.8,ABBVIE GmbH & Co.KG Wiesbaden,Yes,171,1653.78,
3,15,108-VN-T01,Pre-PQ Process,SCMS-78,ASN-50,Vietnam,PMO - US,Direct Drop,EXW,Air,...,60,31920,127360.8,3.99,0.07,"Ranbaxy, Paonta Shahib, India",Yes,1855,16007.06,
4,16,108-VN-T01,Pre-PQ Process,SCMS-81,ASN-55,Vietnam,PMO - US,Direct Drop,EXW,Air,...,60,38000,121600.0,3.2,0.05,"Aurobindo Unit III, India",Yes,7590,45450.08,


In [8]:
### Part A (20 Pts):


### Convert the values in the column 'Scheduled Delivery Date' into DateTime objects.
### Set the index of the dataframe to the 'Scheduled Delivery Date' column.

df['Scheduled Delivery Date'] = pd.to_datetime(df['Scheduled Delivery Date'])
df.set_index('Scheduled Delivery Date', inplace = True)

### Part B (20 Pts)

### Find the number of deliveries scheduled for South Africa in 2010.
### Store it in a variable called "sa".
### Create a varaible named temp_df to store the dataframe only containing values for South Africa.
### Hint: Countries are stored in a column called 'Country' and years are stored in a column called 'year'.

temp_df = df[df['Country'] == 'South Africa']
df['year'] = df.index.year
sa = df.groupby('year').count().loc[2010]['ID']

### Part C (20 pts)

### Create a new dataframe with the variable name 'sr' by using the corrsponding values in the 'Line Item Quantity' column.
### Then, add three new colums corresponding to different shift amounts: 2, 10, and 20.
### The index of the 'sr' dataframe should be 'Scheduled Delivery Date' 
### with the columns: ['Line Item Quantity', 'shift_2', 'shift_10', 'shift_20']

sr = pd.DataFrame(df['Line Item Quantity'])
sr['shift_2'] = sr['Line Item Quantity'].shift(2)
sr['shift_10'] = sr['Line Item Quantity'].shift(10)
sr['shift_20'] = sr['Line Item Quantity'].shift(20)

### Part D (20 pts)

### Find the autocorrelation value on the first column in sr. Use a lag of 20. 
### Store the value in variable called ac.

ac = sr['Line Item Quantity'].autocorr(10)


### Part E (20 pts)

### Use the SARIMA model to generate predictions from the sr dataframe. 
### Use the model to find the mean predicted value. Store this mean in a variable called 'pred'.

mod = sm.tsa.statespace.SARIMAX(sr['Line Item Quantity'],
                                order=(0, 0, 1),
                                seasonal_order=(1, 1, 1, 12),
                                enforce_stationarity=False,
                                enforce_invertibility=False)
results = mod.fit()

pred = results.get_prediction().predicted_mean.mean()

  df['Scheduled Delivery Date'] = pd.to_datetime(df['Scheduled Delivery Date'])
  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  self._init_dates(dates, freq)


## **Course 3: Advanced AI Techniques for Supply Chain**

- The course discussed basics of neural networks and deep learning techniques. 

- I don't prefer to apply deep learning and neural networks for tabular data problems. There many reasons:

    - Complexity of the algorithm
    - Usually, we will not have massive amount of numeric data. Such that deep learning techniques works better on massive amount of data. 
    - GPU, and processing resources. Whether working on local machines that do not have GPUs, or the cost of using cloud GPUs. 
    - Most importantly is the interpretation of the predicted estimators. However, if the business goal is to predict a numeric value as is, we might think about DL. But, if the goal is to interpret and apply hypotheses testing, then traditional ML algorithms are to be applied

- Finally, from my perspective, the course is a waste of time and does not belong to supply chains industry. Topics discussed are with reference to open sources data and documentation of the ML/DL packages like ScikitLearn and Keras. No custom datasets, topics, or technical techniques where discuss in this course compared to previous courses within the specialization. 

## **Course 4: Capstone Project - Predicting Safety Stock**

# References

##### [Optimization with PuLP in Python - Getting Started](https://medium.com/@goldkamp.j16/optimization-with-pulp-in-python-getting-started-f6c5b678bf15)
