### Optimization Algorithms

#### Scenario
Suppose I run a business with three job categories: engineer, scientist, and accountant. Each category has five promotion levels, 1 through 5, and employees in any category can work from one of six city locations (four in the United States and two abroad). The company lets employees change locations about every three years and pays for the move, but an opening in the same position and level must be available at the destination. How can I use Python with either SciPy's `optimize` module or PyTorch's optimizers to plan the moves each year, maximizing the number of moves made while keeping the total cost within a set budget?
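One way to make the question concrete is to write it as an integer linear program. The formulation below is a sketch under assumptions the scenario does not state: $x_{i,j,p,\ell}$ is the number of employees in position $p$ at level $\ell$ who move from city $i$ to city $j$, $e_{i,p,\ell}$ is a hypothetical count of employees eligible to leave city $i$ this year, $o_{j,p,\ell}$ is a hypothetical count of openings in city $j$, $c_{i,\ell}$ is the cost of one move (assumed here to depend on the origin city and the level), and $B$ is the annual budget.

$$
\begin{aligned}
\max_{x}\quad & \sum_{p,\ell}\ \sum_{i \neq j} x_{i,j,p,\ell} \\
\text{s.t.}\quad & \sum_{j \neq i} x_{i,j,p,\ell} \le e_{i,p,\ell} && \text{for all } i, p, \ell \\
& \sum_{i \neq j} x_{i,j,p,\ell} \le o_{j,p,\ell} && \text{for all } j, p, \ell \\
& \sum_{p,\ell}\ \sum_{i \neq j} c_{i,\ell}\, x_{i,j,p,\ell} \le B \\
& x_{i,j,p,\ell} \in \mathbb{Z}_{\ge 0}
\end{aligned}
$$

Under this reading, "minimizing the total cost" acts as a tie-breaker: among plans with the same number of moves, the cheaper one is preferred. One common way to express that is to subtract a small multiple $\varepsilon \sum c_{i,\ell}\, x_{i,j,p,\ell}$ from the objective, with $\varepsilon < 1/B$ so that cost savings never outweigh an extra move.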

In [1]:
import pandas as pd
import numpy as np
import itertools
from itertools import product
from scipy.optimize import milp, LinearConstraint, Bounds, linprog

In [5]:
# Categories
cities = ['Seattle', 'Los Angeles', 'Denver', 'Austin', 'Tokyo', 'London']
job_categories = ['Engineer', 'Scientist', 'Accountant']
levels = [1, 2, 3, 4, 5]

# Combining categories and adding people per category
combinations = list(itertools.product(cities, job_categories, levels))
df = pd.DataFrame(combinations, columns=['City', 'Position', 'Level'])
df['People'] = [100, 200, 200, 50, 5,
               25, 45, 55, 15, 3,
               3, 4, 5, 2, 1,
               200, 300, 350, 150, 15,
               50, 80, 95, 30, 6,
               9, 12, 9, 4, 2,
               80, 150, 150, 50, 5,
               30, 45, 55, 15, 3,
               3, 4, 5, 2, 1,
               120, 130, 140, 80, 10,
               60, 50, 50, 30, 5,
               6, 10, 9, 4, 2,
               12, 13, 14, 8, 1,
               6, 5, 5, 3, 1,
               3, 3, 3, 2, 1,
               10, 10, 10, 3, 1,
               10, 10, 10, 3, 1,
               2, 3, 2, 2, 1]

# Functions to apply additional categories 
def level_cat(level):
    '''Returns Senior if the level is 4 or above; else the level is junior'''
    if level >= 4:
        return 'Senior'
    else:
        return 'Junior'
        
def us_city(city):
    '''Returns True if the city is in the United States; else False'''
    if city in ['Seattle', 'Los Angeles', 'Denver', 'Austin']:
        return True
    else:
        return False

def cost_per_move(row):
    '''Returns the cost per move for one row, based on location and seniority'''
    if row.City_in_US and row.Level_Category == 'Junior':
        return 8000
    elif row.City_in_US and row.Level_Category == 'Senior':
        return 9500
    elif not row.City_in_US and row.Level_Category == 'Junior':
        return 10000
    else:
        return 12000

# Applying functions
df['Level_Category'] = df.Level.apply(level_cat)
df['City_in_US'] = df.City.apply(us_city)
df['Cost_per_Move'] = df.apply(cost_per_move, axis=1)

# Final DataFrame
df

# Expanded rows of dataframe by number of people
expanded_df = df.reindex(df.index.repeat(df['People']))

# Reset the index if desired
expanded_df = expanded_df.reset_index(drop=True)

# Drop people
expanded_df = expanded_df.drop(columns='People')

expanded_df

| | City | Position | Level | Level_Category | City_in_US | Cost_per_Move |
|---|---|---|---|---|---|---|
| 0 | Seattle | Engineer | 1 | Junior | True | 8000 |
| 1 | Seattle | Engineer | 1 | Junior | True | 8000 |
| 2 | Seattle | Engineer | 1 | Junior | True | 8000 |
| 3 | Seattle | Engineer | 1 | Junior | True | 8000 |
| 4 | Seattle | Engineer | 1 | Junior | True | 8000 |
| ... | ... | ... | ... | ... | ... | ... |
| 3482 | London | Accountant | 3 | Junior | False | 10000 |
| 3483 | London | Accountant | 3 | Junior | False | 10000 |
| 3484 | London | Accountant | 4 | Senior | False | 12000 |
| 3485 | London | Accountant | 4 | Senior | False | 12000 |
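
The row-wise `df.apply(cost_per_move, axis=1)` call in the cell above can also be written with vectorized column operations, which tends to scale better on large frames. The following is a minimal sketch, not the notebook's own method: `demo` is a hypothetical four-row stand-in for `df`, and `numpy.select` applies the same four cost tiers.

```python
import numpy as np
import pandas as pd

US_CITIES = ['Seattle', 'Los Angeles', 'Denver', 'Austin']

# Hypothetical four-row stand-in for df, one row per cost tier
demo = pd.DataFrame({
    'City': ['Seattle', 'Austin', 'Tokyo', 'London'],
    'Position': ['Engineer', 'Scientist', 'Accountant', 'Engineer'],
    'Level': [1, 4, 3, 5],
})

# Same derived columns as level_cat() and us_city(), computed column-wise
demo['City_in_US'] = demo['City'].isin(US_CITIES)
demo['Level_Category'] = np.where(demo['Level'] >= 4, 'Senior', 'Junior')

# Conditions are checked in order; rows matching none fall through to default
is_senior = demo['Level_Category'].eq('Senior')
conditions = [
    demo['City_in_US'] & ~is_senior,    # US, junior
    demo['City_in_US'] & is_senior,     # US, senior
    ~demo['City_in_US'] & ~is_senior,   # non-US, junior
]
demo['Cost_per_Move'] = np.select(conditions, [8000, 9500, 10000], default=12000)

print(demo)
```

The `default=12000` argument plays the role of the final `else` branch in `cost_per_move`.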


In [12]:
print(f'Total Number of Positions: {len(expanded_df):,}; Total Cost: ${expanded_df.Cost_per_Move.sum():,}') 

Total Number of Positions: 3,487; Total Cost: $29,001,000
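
The same totals can be reached without expanding to one row per person, by weighting each cost tier by its headcount. In the sketch below the tier headcounts were tallied by hand from the `People` column and are hardcoded purely for illustration; with the notebook's `df` in scope, `(df.People * df.Cost_per_Move).sum()` should give the same total cost.

```python
# Headcount and cost per tier; the headcounts are hand-tallied from the
# People column above and hardcoded here only for illustration
tiers = {
    ('US', 'Junior'): (2839, 8000),
    ('US', 'Senior'): (490, 9500),
    ('Non-US', 'Junior'): (131, 10000),
    ('Non-US', 'Senior'): (27, 12000),
}

total_positions = sum(count for count, _ in tiers.values())
total_cost = sum(count * cost for count, cost in tiers.values())

print(f'Total Number of Positions: {total_positions:,}; Total Cost: ${total_cost:,}')
```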


In [14]:
sample_df = expanded_df.sample(frac=0.25)
print(f'Total Number of Positions: {len(sample_df):,}; Total Cost: ${sample_df.Cost_per_Move.sum():,}') 

Total Number of Positions: 872; Total Cost: $7,242,000


In [30]:
# Total cost of a 25% sample, repeated over 1,000 iterations
cost_list = []
for _ in range(1000):
    sample_df = expanded_df.sample(frac=0.25)
    cost_list.append(sample_df.Cost_per_Move.sum())

cost_df = pd.DataFrame(cost_list, columns=['cost'])

print('Out of 1,000 iterations')
print(f'Min Cost: ${cost_df.cost.min():,}; Max Cost: ${cost_df.cost.max():,}; Avg Cost: ${cost_df.cost.mean():,.0f}; STDev Cost: ${cost_df.cost.std():,.0f}')

Out of 1,000 iterations
Min Cost: $7,194,000; Max Cost: $7,311,500; Avg Cost: $7,252,458; STDev Cost: $18,129
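
The simulated average and spread can be cross-checked analytically. For a simple random sample of n rows drawn without replacement from N rows, the sample total has mean n·μ and standard deviation σ·sqrt(n·(N−n)/(N−1)), where μ and σ are the population mean and standard deviation of `Cost_per_Move`. The sketch below again uses tier headcounts hand-tallied from the `People` column (hardcoded for illustration) and assumes the sample size is `round(0.25 * N)`, which matches the 872 rows reported earlier.

```python
import math

# (headcount, cost per move) per tier; headcounts are hand-tallied from the
# People column and hardcoded here only for illustration
tiers = [(2839, 8000), (490, 9500), (131, 10000), (27, 12000)]

N = sum(count for count, _ in tiers)                         # population size
mu = sum(count * cost for count, cost in tiers) / N          # mean cost per move
var = sum(count * (cost - mu) ** 2 for count, cost in tiers) / N  # population variance

n = round(0.25 * N)                                          # rows in a 25% sample

expected_total = n * mu
# The finite population correction (N - n) / (N - 1) accounts for
# sampling without replacement
std_total = math.sqrt(n * var * (N - n) / (N - 1))

print(f'Expected total: ${expected_total:,.0f}; Std of total: ${std_total:,.0f}')
```

Both values should land close to the simulated figures above (an average near $7.25 million and a standard deviation near $18 thousand), which suggests the simulation is behaving as expected.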


In [33]:
# Total cost of a random 20%-30% sample, repeated over 1,000 iterations
cost_list = []
for _ in range(1000):
    # Random percentage of movers
    random_mover_percentage = np.random.uniform(low=0.2, high=0.3)
    sample_df = expanded_df.sample(frac=random_mover_percentage)
    cost_list.append(sample_df.Cost_per_Move.sum()) 

cost_df = pd.DataFrame(cost_list, columns=['cost'])

print('Out of 1,000 iterations')
print(f'Min Cost: ${cost_df.cost.min():,}; Max Cost: ${cost_df.cost.max():,}; Avg Cost: ${cost_df.cost.mean():,.0f}; STDev Cost: ${cost_df.cost.std():,.0f}')

Out of 1,000 iterations
Min Cost: $5,796,500; Max Cost: $8,701,500; Avg Cost: $7,256,860; STDev Cost: $843,654
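
The simulations above show what randomly chosen moves cost. The scenario's question, how many moves fit within a budget, can instead be posed as a small integer program with `scipy.optimize.milp`. The following is a minimal sketch under stated assumptions, not a full model: it covers a single position and level group across three cities, and `eligible`, `openings`, and `budget` are hypothetical inputs. Move costs are assumed to depend on the origin city, in line with `Cost_per_Move`.

```python
import itertools
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

cities = ['Seattle', 'Denver', 'Tokyo']

# Hypothetical inputs for one position/level group (e.g. Engineer, level 2)
eligible = {'Seattle': 5, 'Denver': 3, 'Tokyo': 2}     # employees allowed to move
openings = {'Seattle': 2, 'Denver': 4, 'Tokyo': 3}     # vacant slots per city
cost = {'Seattle': 8000, 'Denver': 8000, 'Tokyo': 10000}  # cost by origin city
budget = 50_000

# One integer decision variable per ordered (origin, destination) pair
routes = list(itertools.permutations(cities, 2))
c_vec = np.array([cost[origin] for origin, _ in routes], dtype=float)

rows, upper = [], []
for city in cities:
    # Moves out of a city cannot exceed its eligible employees
    rows.append([1.0 if origin == city else 0.0 for origin, _ in routes])
    upper.append(eligible[city])
    # Moves into a city cannot exceed its openings
    rows.append([1.0 if dest == city else 0.0 for _, dest in routes])
    upper.append(openings[city])
# Total spend cannot exceed the budget
rows.append(c_vec)
upper.append(budget)

constraints = LinearConstraint(np.array(rows), ub=np.array(upper, dtype=float))

# milp minimizes, so each move contributes -1. The small cost term only breaks
# ties between plans with equal move counts (eps is kept below 1 / budget).
eps = 1.0 / (10 * budget)
objective = -np.ones(len(routes)) + eps * c_vec

res = milp(c=objective, constraints=constraints,
           integrality=np.ones(len(routes)), bounds=Bounds(lb=0, ub=np.inf))

moves = np.round(res.x).astype(int)
plan = {route: int(k) for route, k in zip(routes, moves) if k > 0}
print(f'Moves: {moves.sum()}; Cost: ${c_vec @ moves:,.0f}')
print(plan)
```

With these toy numbers the budget is the binding limit: seven moves would cost at least 7 × $8,000 = $56,000, so the plan is capped at six. Extending the sketch to every position and level would mean adding one block of variables and constraints per group while keeping a single shared budget row.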
