# AFCCP Parameters

This Jupyter notebook serves as temporary documentation for how the Python module "afccp" works. Another .ipynb file, "afccp_example", gives more of an overview of everything; this one outlines many of the parameter names I use. Let's get right to it. The code block below is what we need to get the "CadetCareerProblem" object working properly (we just have to change the working directory and import the class, that's all).

In [1]:
import os

# Obtain initial working directory
dir_path = os.getcwd() + '/'
print('initial working directory:', dir_path)

# Get main afccp folder path (everything up to the first 'afccp' in the path, plus the trailing '/')
index = dir_path.find('afccp')
dir_path = dir_path[:index + len('afccp/')]

# Update working directory
os.chdir(dir_path)
print('updated working directory:', dir_path)

# Import the problem class
from afccp.core.problem_class import CadetCareerProblem

initial working directory: /Users/griffenlaird/Desktop/Coding Projects/afccp/afccp/executables/examples/
updated working directory: /Users/griffenlaird/Desktop/Coding Projects/afccp/
Running on Griffen's Macbook
Sensitive data folder not found.
Pyomo module found.
SDV module found.
Sklearn Manifold module found.


Now let's play around with an example problem instance. I used "C" in the other notebook, but I could use any of them, including a real problem instance. This file should not be included on GitHub. Let's use the class of 2023 as an example since I just finished matching them.

In [2]:
instance = CadetCareerProblem("2023", printing=True)  # printing=True just prints out status updates

Importing 2023 problem instance...
Imported.


Hopefully when you run the above code it works! It should say that the data was imported. With the above line, you now have access to all the data you could possibly need in order to match a set of cadets to their AFSCs. Check out the other notebook, "afccp_example", for more on some of that stuff. This notebook is meant to go into detail on the parameters and "value parameters" I use for the various objectives. You could translate these to your own set of parameters if you want to (that's what I did with Rebecca's model's parameters), use them directly, or even add to them if you wish.

## "parameters" dictionary
The CadetCareerProblem attribute "parameters" is all the stuff that we can't really change. Stuff about the AFSCs and cadets that is "given", if you will. This is a dictionary of all sorts of information that is unique to the problem instance and won't change. Let's explore this dictionary.

### Direct parameters from Excel
The first parameters in the dictionary that I want to discuss are the ones that are essentially direct imports from Excel. These are pretty straightforward. I have another function that creates sets of cadets and AFSCs based on these parameters.

In [3]:
# vector of AFSCs for this instance 
print(instance.parameters["afsc_vector"])  # 1-d numpy array of length M

['13H' '13M' '13N' '14F' '14N' '15A' '15W' '17X' '21A' '21M' '21R' '31P'
 '32EXA' '32EXC' '32EXE' '32EXF' '32EXG' '32EXJ' '35P' '38F' '61C' '61D'
 '62EXA' '62EXB' '62EXC' '62EXE' '62EXG' '62EXH' '62EXI' '63A' '64P' '65F']


In [4]:
# Number of preferences each cadet gets (historically 6)
print("P:", instance.parameters["P"])

# Number of AFSCs
print("M:", instance.parameters["M"])

# Number of Cadets
print("N:", instance.parameters["N"])

P: 6
M: 32
N: 1575


In [10]:
# Qual Matrix
print("Qual Matrix:", instance.parameters["qual"])  # 2-d numpy array of size N x M

Qual Matrix: [['I' 'P' 'P' ... 'D' 'D' 'D']
 ['I' 'P' 'P' ... 'I' 'D' 'P']
 ['I' 'P' 'P' ... 'D' 'D' 'D']
 ...
 ['I' 'P' 'M' ... 'M' 'D' 'D']
 ['I' 'P' 'M' ... 'M' 'D' 'D']
 ['I' 'P' 'M' ... 'M' 'D' 'D']]


There are a few functions in the "preprocessing.py" script that take an array of CIP codes (unique academic degree codes) and create the cadets' qualifications based on the AFOCD (Air Force Officer Classification Directory) tiers: mandatory, desired, permitted, ineligible (M, D, P, I). I used this for 2023's stuff (and I had to update it -> latest version is April 2022). This is also used for generated class years. Just FYI!

I break this qual matrix up even further and create binary matrices representing the different pieces of information in it. The four new matrices are shown below:

In [37]:
# Cadets with degrees in the mandatory tier for each AFSC!  ("qual" == "M")
print("mandatory:", instance.parameters["mandatory"])  # 2-d numpy array of size N x M

# Cadets with degrees in the desired tier for each AFSC!  ("qual" == "D")
print("desired:", instance.parameters["desired"])  # 2-d numpy array of size N x M

# Cadets with degrees in the permitted tier for each AFSC!  ("qual" == "P")
print("permitted:", instance.parameters["permitted"])  # 2-d numpy array of size N x M

# Cadets with degrees that are ineligible for each AFSC!  ("qual" == "I")
print("ineligible:", instance.parameters["ineligible"])  # 2-d numpy array of size N x M

mandatory: [[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 1 ... 1 0 0]
 [0 0 1 ... 1 0 0]
 [0 0 1 ... 1 0 0]]
desired: [[0 0 0 ... 1 1 1]
 [0 0 0 ... 0 1 0]
 [0 0 0 ... 1 1 1]
 ...
 [0 0 0 ... 0 1 1]
 [0 0 0 ... 0 1 1]
 [0 0 0 ... 0 1 1]]
permitted: [[0 1 1 ... 0 0 0]
 [0 1 1 ... 0 0 1]
 [0 1 1 ... 0 0 0]
 ...
 [0 1 0 ... 0 0 0]
 [0 1 0 ... 0 0 0]
 [0 1 0 ... 0 0 0]]
ineligible: [[1 0 0 ... 0 0 0]
 [1 0 0 ... 1 0 0]
 [1 0 0 ... 0 0 0]
 ...
 [1 0 0 ... 0 0 0]
 [1 0 0 ... 0 0 0]
 [1 0 0 ... 0 0 0]]
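Here's a rough sketch of how the four binary matrices relate to the qual matrix. This is an illustration on a made-up 3 x 4 matrix, not the actual afccp preprocessing code, and the names `tiers` and `binary` are just for this example. Each matrix is an elementwise comparison against one tier letter:

```python
import numpy as np

# Toy qual matrix (3 cadets x 4 AFSCs); the values are made up for illustration
qual = np.array([["I", "P", "M", "D"],
                 ["M", "P", "I", "D"],
                 ["D", "D", "P", "I"]])

# The comparison gives a boolean array; multiplying by 1 converts it to 0/1 integers
tiers = {"mandatory": "M", "desired": "D", "permitted": "P", "ineligible": "I"}
binary = {name: (qual == letter) * 1 for name, letter in tiers.items()}

# Every cadet/AFSC pair falls in exactly one tier, so the four matrices sum to all ones
total = sum(binary.values())

print("mandatory:", binary["mandatory"])
print("sum of all four:", total)
```

Since each entry of the qual matrix holds exactly one letter, the four binary matrices never overlap, which makes the "sum to all ones" check a handy sanity test.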


In [11]:
# Utility Matrix
print("Utility Matrix:", instance.parameters["utility"])  # 2-d numpy array of size N x M

Utility Matrix: [[0.   0.   0.   ... 0.17 0.75 1.  ]
 [0.   0.75 0.   ... 0.   0.   0.  ]
 [0.   0.   0.   ... 0.   0.33 0.  ]
 ...
 [0.   0.   0.   ... 0.   0.   0.4 ]
 [0.   0.   0.   ... 0.25 0.   0.  ]
 [0.   0.   0.   ... 0.17 0.65 0.  ]]


The cadets' preferences and utility measures are converted to an N x M utility matrix (rows are cadets, columns are AFSCs).
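As a minimal sketch of that conversion, assume each cadet's preferences arrive as a list of AFSC names with a matching list of utilities (the `preferences` and `utilities` lists here are hypothetical, and so is the assumption that unlisted AFSCs get a utility of 0). This is not the afccp import code, just the idea:

```python
import numpy as np

afsc_vector = np.array(["13H", "13M", "13N", "14F"])  # toy subset of AFSCs

# Hypothetical preference data: each cadet lists AFSCs with a utility in [0, 1]
preferences = [["13N", "13H"], ["14F"], ["13M", "13N", "14F"]]
utilities = [[1.0, 0.5], [1.0], [1.0, 0.8, 0.2]]

N, M = len(preferences), len(afsc_vector)
utility = np.zeros((N, M))  # assumption: AFSCs a cadet did not list keep a utility of 0
for i in range(N):
    for afsc, u in zip(preferences[i], utilities[i]):
        j = np.where(afsc_vector == afsc)[0][0]  # column index of this AFSC
        utility[i, j] = u

print(utility)
```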

In [13]:
import numpy as np  # Needed to calculate some means and whatnot (np.mean below)

In [15]:
# Cadet unique ID array (not the same as "I")
print("Cadet:", instance.parameters["ID"])  # 1-d numpy array of length N

# USAFA binary indicator array
print("USAFA:", instance.parameters["usafa"])  # 1-d numpy array of length N

# Male  (Not all class years have this -> sometimes I didn't have that information)
print("Male:", instance.parameters["male"])  # 1-d numpy array of length N

# Merit  (or percentile) This is relative to just the Non-Rated cadets and therefore the average is right around 0.5
print("Merit (NonRated):", instance.parameters["merit"])  # 1-d numpy array of length N
print("Merit (NonRated) Average:", np.mean(instance.parameters["merit"]))

# Merit  (or percentile) This is the cadet's overall standing (when compared to Rated and Space Force Cadets)
print("Merit (All):", instance.parameters["merit_all"])  # 1-d numpy array of length N
print("Merit (All) Average:", np.mean(instance.parameters["merit_all"]))

Cadet: [   0    1    2 ... 2566 2567 2568]
USAFA: [0 0 0 ... 1 1 1]
Male: [1 1 1 ... 0 0 1]
Merit (NonRated): [0.26982759 0.93103448 0.42931034 ... 0.97029703 0.86386139 0.5990099 ]
Merit (NonRated) Average: 0.4996524356869184
Merit (All): [0.19291527 0.89516515 0.31833413 ... 0.9661191  0.85523614 0.5523614 ]
Merit (All) Average: 0.4357796160240567


Just for this year (2023), I wanted to distinguish "relative" merit from "absolute" merit. To appease the AFSCs, we use percentiles re-scaled to the cadets' standings among only the cadets we're matching (the Non-Rated cadets), so the average is right around 0.5. Since Rated and Space Force as a whole tend to take higher performers, the average merit of the Non-Rated cadets when compared to the entire class is much lower (about 0.436). Therefore, I use the relative merit in the "balancing merit" objective, but I use the cadets' "real" percentiles in the cadet weight function.
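To make the "relative" vs. "absolute" distinction concrete, here is one way such a re-scaling could be done. The rank-based formula below is my assumption for illustration (the notebook doesn't show the exact formula used), and the five percentiles are made up:

```python
import numpy as np

# Made-up overall ("absolute") percentiles for 5 Non-Rated cadets; higher is better
merit_all = np.array([0.20, 0.90, 0.35, 0.60, 0.10])

# Rank the cadets within this group only (0 = lowest), then rescale into (0, 1)
ranks = np.argsort(np.argsort(merit_all))
merit = (ranks + 0.5) / len(merit_all)

print("Merit (All):", merit_all, "average:", np.mean(merit_all))
print("Merit (NonRated):", merit, "average:", np.mean(merit))
```

The ordering of the cadets is unchanged; only the scale moves so that the group's average lands at 0.5, while the average of the absolute percentiles in this toy group stays below 0.5.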

In [18]:
# CIP codes  (Not all class years have this -> sometimes I didn't have that information)
print("CIP:", instance.parameters["cip1"])  # 1-d numpy array of length N

# Second Degree CIP codes (vast majority don't have 2 degrees)
print("CIP (Second Degree):", instance.parameters["cip2"])  # 1-d numpy array of length N

CIP: ['520801' '260202' '522101' ... '141001' '141001' '141001']
CIP (Second Degree): ['None' 'None' 'None' ... 'None' 'None' 'None']


In [19]:
# Already assigned AFSCs (This is new, I wanted a column of already assigned AFSCs so I could fix x_ij)
print("Already Assigned AFSCs:", instance.parameters["assigned"])  # 1-d numpy array of length N

Already Assigned AFSCs: [nan nan nan ... nan nan nan]


This already assigned AFSC column only has a few people in it. For whatever reason, I got a list of individuals that are in this year's "match" but have already been given an AFSC. I wanted to include them so the model knows about them and can factor that information into the objectives. They're fixed variables. I also used this column to fix certain people into certain AFSCs based on various factors (sometimes the model messed up and didn't put someone somewhere correctly).

In [35]:
indices = np.where(instance.parameters["assigned"].astype("str") != "nan")[0]

print("Cadet Indices with assigned AFSCs:", indices)
print("Assigned AFSCs:", instance.parameters["assigned"][indices])

Cadet Indices with assigned AFSCs: [  41  370  569  572  851  927  996 1166 1530]
Assigned AFSCs: ['62EXE' '32EXF' '13N' '62EXB' '13H' '61C' '62EXC' '15W' '17X']


We talked a lot about the cadets, but I also have some information on the AFSCs.

In [38]:
# AFSC Quotas 
print("AFSC Quota (PGL):", instance.parameters["pgl"])  # 1-d numpy array of length M
print("AFSC Target (Estimated):", instance.parameters["quota"])  # 1-d numpy array of length M
print("AFSC Min:", instance.parameters["quota_min"])  # 1-d numpy array of length M
print("AFSC Max:", instance.parameters["quota_max"])  # 1-d numpy array of length M

AFSC Quota (PGL): [  8  19 105   7 195  35  25 181  84  29  61  29   3   7   3   3  42   3
  18  84   1  13  15  12  20  63  34  12   2  69  50  34]
AFSC Target (Estimated): [ 14  27 166   8 210  70  36 193  91  38  67  32   5  10   3   5  60   3
  20  91   1  13  29  17  24  53  48  24   4  85  75  37]
AFSC Min: [ 10  19 169   7 195  60  25 181  84  29  61  29   3   7   3   3  42   3
  18  84   1  13  15  12  24  53  34  12   2  69  50  34]
AFSC Max: [ 14  27 210   8 210  72  50 193  92  38  67  35   6  10   6   5  60   5
  22  92   1  14  30  24  40 126  48  24   4  95  75  37]


The ranges for the number of cadets that can be assigned to each AFSC are admittedly a little vague. Every year, the "Program Guidance Letter" (PGL) outlines the target number of cadets to be "accessed" into each AFSC. For this year (CY2023), the sum of the PGL targets was 1266, while the total number of cadets is 1575. That's 309 extra cadets that we need to figure out where to put.

In [43]:
print("Sum of PGL targets:", np.sum(instance.parameters["pgl"]))
print("Total Number of cadets:", instance.parameters["N"])
print("Extra cadets (Surplus):", instance.parameters["N"] - np.sum(instance.parameters["pgl"]))

Sum of PGL targets: 1266
Total Number of cadets: 1575
Extra cadets (Surplus): 309


I had a lot of conversations with the assignments officers upstairs and even some career field managers to determine who could handle more cadets. Every career field is different, and that's why I wanted to break this thing down to the career field (AFSC) level. 13N always needs more people, which is why their "min" is higher than their PGL. Some other career fields had similar situations. Some can't take too many more; it all depends! The min and max vectors are therefore the "real" ranges on the numbers of cadets that could/should be accessed into the career fields. Those numbers are a little flexible though! The estimated target is basically how many cadets are probably going to be assigned to each of the AFSCs, and I use that number in my objective function as an approximation for the real number of cadets assigned.
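A quick sanity check on ranges like these is whether the cadets can fit at all: the minimums must sum to no more than N, and the maximums must sum to at least N. The sketch below uses the first three AFSCs (13H, 13M, 13N) from the output above with a made-up cadet count, so it is a toy problem rather than the real instance:

```python
import numpy as np

# First three AFSCs (13H, 13M, 13N), values copied from the printed output above
pgl = np.array([8, 19, 105])
quota_min = np.array([10, 19, 169])
quota_max = np.array([14, 27, 210])
N = 230  # made-up cadet count for this three-AFSC toy problem

surplus = N - pgl.sum()                              # cadets beyond the PGL targets
feasible = quota_min.sum() <= N <= quota_max.sum()   # can everyone be placed within the ranges?
above_pgl = np.where(quota_min > pgl)[0]             # AFSCs whose minimum exceeds their PGL

print("Surplus:", surplus)
print("Feasible:", feasible)
print("AFSC indices with min > PGL:", above_pgl)
```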

In [46]:
# USAFA Quota (From the PGL! This is used to create the "Combined Quota")
print("USAFA Quota (PGL):", instance.parameters["usafa_quota"])  # 1-d numpy array of length M

# ROTC Quota (From the PGL! This is used to create the "Combined Quota")
print("ROTC Quota (PGL):", instance.parameters["rotc_quota"])  # 1-d numpy array of length M

USAFA Quota (PGL): [ 2  3 26  2 71 14  9 52 15 17 12  9  1  2  1  1 10  1  2 14  0  4  3  4
  5 15  5  3  2 15  8  8]
ROTC Quota (PGL): [  6  16  79   5 124  21  16 129  69  12  49  20   2   5   2   2  32   2
  16  70   1   9  12   8  15  48  29   9   0  54  42  26]


The USAFA and ROTC quotas are provided by A1, and together they make up the PGL's target for each AFSC. Historically, the original model didn't really use this information, but it was necessary this year for some of the smaller AFSCs to make sure they got at least one or two USAFA cadets, since that's what they wanted. These were incorporated as additional objectives in my VFT model (the USAFA/ROTC quota objectives), separate from the proportion balancing objectives.
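As a small check of that relationship, the sketch below takes the first four AFSCs from the printed quotas above and confirms that the USAFA and ROTC quotas add up to the PGL target. The `usafa_share` line is just an illustrative extra, not something afccp necessarily computes:

```python
import numpy as np

# First four AFSCs (13H, 13M, 13N, 14F), values copied from the printed output above
usafa_quota = np.array([2, 3, 26, 2])
rotc_quota = np.array([6, 16, 79, 5])
pgl = np.array([8, 19, 105, 7])

combined = usafa_quota + rotc_quota
usafa_share = usafa_quota / pgl  # illustrative: fraction of each PGL target slated for USAFA

print("Combined quota:", combined)
print("Matches PGL:", np.array_equal(combined, pgl))
print("USAFA share:", np.round(usafa_share, 2))
```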

### Parameter "Set" Additions

I have all the information I need to create some sets of cadets and AFSCs to use in the optimization models. These sets follow my thesis formulation pretty closely.

In [48]:
# Set of all cadets (different from ID) these are the cadet indices! (0...N - 1 because python)
print("I:", instance.parameters["I"])  # 1-d numpy array of length N

# Set of all AFSCs (different from afsc vector) these are the AFSC indices! (0...M - 1 because python)
print("J:", instance.parameters["J"])  # 1-d numpy array of length M

I: [   0    1    2 ... 1572 1573 1574]
J: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31]


In [54]:
# Indexed set of all AFSCs that cadet 'i' is eligible for (indexed by I)
print("J^E [0]:", instance.parameters["J^E"][0])  # list (length N) of numpy arrays (length <= M)
# (showing set of AFSCs for cadet at index 0)

J^E [0]: [ 1  2  4  7  8  9 10 11 18 19 29 30 31]


I have a lot of indexed sets which I represent in python using a list of numpy arrays. If I were to print out all of J^E, it would be a big mess so I'm just printing out the set of AFSCs that the cadet at index 0 is eligible for. For all my other indexed sets, I will do something similar...

In [55]:
# Indexed set of all AFSCs that cadet 'i' is eligible for and has placed a preference for (indexed by I)
print("J^P [0]:", instance.parameters["J^P"][0])  # list (length N) of numpy arrays (length <= P)
# (showing set of AFSCs for cadet at index 0)

J^P [0]: [ 4 10 29 30 31]


The above two indexed sets are sets of AFSCs for each cadet. The remaining sets are going to be the reverse (sets of cadets for each AFSC).
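Here's a minimal sketch of how cadet-indexed sets like these could be built as lists of numpy arrays. It is a toy illustration, not the afccp function itself, and it assumes that "placed a preference" corresponds to a positive entry in the utility matrix:

```python
import numpy as np

# Toy data for 3 cadets and 4 AFSCs (made-up values)
ineligible = np.array([[1, 0, 0, 0],
                       [0, 0, 1, 0],
                       [0, 1, 1, 1]])
utility = np.array([[0.0, 1.0, 0.0, 0.5],
                    [0.3, 0.0, 1.0, 0.0],
                    [1.0, 0.0, 0.0, 0.0]])
N, M = ineligible.shape

# J^E[i]: indices of the AFSCs that cadet i is eligible for
J_E = [np.where(ineligible[i, :] == 0)[0] for i in range(N)]

# J^P[i]: eligible AFSCs that cadet i also placed a preference on
# (assumption: a positive utility means a preference was placed)
J_P = [np.where((ineligible[i, :] == 0) & (utility[i, :] > 0))[0] for i in range(N)]

print("J^E [0]:", J_E[0])
print("J^P [0]:", J_P[0])
```

Note that cadet 1 has a positive utility on AFSC 2 but is ineligible for it, so that AFSC is left out of both sets.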

In [56]:
# Indexed set of all cadets that are eligible for each AFSC (indexed by J)
print("I^E [0]:", instance.parameters["I^E"][0])  # list (length M) of numpy arrays (length <= N)
# (showing set of cadets for AFSC at index 0)

I^E [0]: [  29   47   54   58  201  256  317  362  378  394  421  456  495  513
  585  646  673  785  789  851  878  886  982 1050 1076 1130 1131 1138]


In [57]:
# Indexed set of all cadets that are eligible for each AFSC and have placed a preference for the AFSC (indexed by J)
print("I^P [0]:", instance.parameters["I^P"][0])  # list (length M) of numpy arrays (length <= N)
# (showing set of cadets for AFSC at index 0)

I^P [0]: [  58  256  317  362  378  394  421  456  513  585  646  851  878  886
  982 1050 1076 1130 1131]
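The AFSC-indexed sets carry the same information as the cadet-indexed ones, just flipped around. As a sketch on made-up data (not the afccp implementation), I^E can be recovered from J^E by collecting, for each AFSC, every cadet whose eligible set contains it:

```python
import numpy as np

# Toy J^E for 3 cadets and 4 AFSCs (made-up values)
J_E = [np.array([1, 2, 3]), np.array([0, 1, 3]), np.array([0])]
M = 4

# I^E[j]: every cadet i whose J^E[i] contains AFSC j
I_E = [np.array([i for i, afscs in enumerate(J_E) if j in afscs], dtype=int)
       for j in range(M)]

for j in range(M):
    print("I^E [" + str(j) + "]:", I_E[j])
```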


This next series of sets pertains to cadets with certain demographics. These demographics directly correspond to objectives in my set of "value parameters". I talk about this in the thesis, and it'll probably be clearer when you get to my value parameters further below.

In [58]:
print(instance.parameters["I^D"].keys())

dict_keys(['USAFA Proportion', 'Mandatory', 'Desired', 'Permitted', 'Male', 'Minority'])


"I^D" contains the cadets that are eligible for each AFSC and have the "demographic" associated with a particular objective (see above) for that AFSC. This is in my thesis formulation. The only difference with its implementation in python is that I^D is indexed as "I^D[objective][AFSC]" rather than "I^D[AFSC][objective]"

In [63]:
# Indexed set of all USAFA cadets that are eligible for each AFSC (indexed by J)
print("I^D [USAFA] [3]:", instance.parameters["I^D"]["USAFA Proportion"][3])  
# list (length M) of numpy arrays (length <= N)
# (showing set of cadets for AFSC at index 3)

I^D [USAFA] [3]: [1175 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197
 1198 1199 1200 1201 1202 1203 1204 1205 1206 1382 1388 1421 1422 1423
 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437
 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451
 1452]
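As a rough sketch of how a set like the one above could be put together, the eligible cadets for each AFSC can be intersected with the cadets that have the demographic. The data is made up, and using `np.intersect1d` is my choice for the illustration, not necessarily what afccp does:

```python
import numpy as np

usafa = np.array([0, 1, 1, 0])                  # made-up USAFA indicator for 4 cadets
I_E = [np.array([0, 1, 3]), np.array([1, 2])]   # made-up eligible cadets for 2 AFSCs

usafa_cadets = np.where(usafa == 1)[0]

# Indexed as I^D[objective][AFSC], matching the dictionary layout described above
I_D = {"USAFA Proportion": [np.intersect1d(I_E[j], usafa_cadets) for j in range(len(I_E))]}

print("I^D [USAFA] [0]:", I_D["USAFA Proportion"][0])
print("I^D [USAFA] [1]:", I_D["USAFA Proportion"][1])
```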


In [64]:
# Indexed set of all cadets with Mandatory-tiered degrees for each AFSC (indexed by J)
print("I^D [Mandatory] [3]:", instance.parameters["I^D"]["Mandatory"][3])  
# list (length M) of numpy arrays (length <= N)
# (showing set of cadets for AFSC at index 3)

I^D [Mandatory] [3]: [  37   59   75  155  170  189  201  207  228  233  258  313  363  372
  419  429  434  470  482  526  543  561  567  577  607  630  664  679
  684  690  700  733  739  748  751  755  775  792  808  828  840  842
  850  887  891  918  932  960  992  994 1011 1034 1039 1041 1059 1088
 1091 1098 1111 1122 1139 1141 1175 1382 1421 1422 1423 1424 1425 1426
 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440
 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452]


In [65]:
# Indexed set of all cadets with Desired-tiered degrees for each AFSC (indexed by J)
print("I^D [Desired] [3]:", instance.parameters["I^D"]["Desired"][3])  
# list (length M) of numpy arrays (length <= N)
# (showing set of cadets for AFSC at index 3)

I^D [Desired] [3]: [ 425  563  592  812  823  893  900  943  961 1092 1101 1107]


In [67]:
# Indexed set of all cadets with Permitted-tiered degrees for each AFSC (indexed by J)
print("I^D [Permitted] [3]:", instance.parameters["I^D"]["Permitted"][3])  
# list (length M) of numpy arrays (length <= N)
# (showing set of cadets for AFSC at index 3)

I^D [Permitted] [3]: [  25   35   49   76   78   79   84   85   94  119  121  124  137  154
  159  208  211  214  261  275  292  322  334  339  347  380  397  404
  427  435  451  475  518  575  584  589  601  605  636  654  656  669
  682  685  698  746  764  779  795  797  804  837  843  970 1006 1018
 1031 1035 1046 1081 1118 1146 1159 1185 1186 1187 1188 1189 1190 1191
 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205
 1206 1388]


In [68]:
# Indexed set of all male cadets that are eligible for each AFSC (indexed by J)
print("I^D [Male] [3]:", instance.parameters["I^D"]["Male"][3])  
# list (length M) of numpy arrays (length <= N)
# (showing set of cadets for AFSC at index 3)

I^D [Male] [3]: [  37   49   79   94  121  154  155  207  208  261  334  363  372  380
  419  425  427  434  435  475  518  543  563  567  575  592  605  607
  664  682  698  733  746  748  755  764  779  795  797  804  823  842
  843  887  893  900  961  970  992 1006 1011 1018 1034 1035 1046 1059
 1098 1122 1187 1188 1191 1192 1193 1195 1196 1198 1201 1202 1203 1382
 1388 1422 1425 1432 1438 1443 1450 1452]


The last set I use is just a set of cadets that are already assigned to an AFSC, as well as the AFSC that they are assigned to. This is a dictionary where the keys are the cadet indices and the values are the AFSC indices that they're assigned to.

In [69]:
print("J^Fixed:", instance.parameters["J^Fixed"]) 

J^Fixed: {41: 25, 370: 15, 569: 2, 572: 23, 851: 0, 927: 20, 996: 24, 1166: 6, 1530: 7}


The cadet at index 41 is already assigned to the AFSC at index 25, the cadet at index 370 is already assigned to the AFSC at index 15, the cadet...
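For illustration, a dictionary like this could be derived from the "assigned" column and the AFSC vector roughly as follows. This is a sketch with a made-up five-cadet column, not the afccp code:

```python
import numpy as np

afsc_vector = np.array(["13H", "13M", "13N", "14F"])  # toy subset of the AFSC vector

# Made-up "assigned" column: nan means the cadet has no AFSC yet
assigned = np.array([np.nan, "13N", np.nan, "13H", np.nan], dtype=object)

# Same trick as earlier in the notebook: compare the string form against "nan"
fixed_cadets = np.where(assigned.astype(str) != "nan")[0]

# Map each fixed cadet index to the index of their AFSC in afsc_vector
J_Fixed = {int(i): int(np.where(afsc_vector == assigned[i])[0][0]) for i in fixed_cadets}

print("J^Fixed:", J_Fixed)
```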

### Summary

This is the magic of dictionaries! The dictionary called "parameters" contains all of this information, so I don't need to keep track of all of the variables that I'm using across many different functions. I just use one variable, "parameters", that contains it all. Going further, this dictionary is an attribute of the CadetCareerProblem object "instance", so we have access to all of this information anywhere we can access the problem instance. One other cool thing about dictionaries is that you can name the keys whatever you want, since they're just Python strings. In my thesis, I use superscripts in my sets ("I^E" for example) and I can actually call the key that in Python, which I couldn't do with a regular variable name.

In [72]:
# Writing "instance.parameters["I^E"][j]" might be a bit cumbersome so I abbreviate it a lot in my functions
p = instance.parameters  # abbreviate it so you can read/write code better
j = 0
print(p["I^E"][j])  # Set of all cadets that are eligible for AFSC j (j=0 in this case)

[  29   47   54   58  201  256  317  362  378  394  421  456  495  513
  585  646  673  785  789  851  878  886  982 1050 1076 1130 1131 1138]
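To show why passing around one dictionary is convenient, here is a sketch of a function that only needs `p` as its argument. The helper `eligible_counts` is hypothetical (it is not part of afccp), and the dictionary below is a made-up stand-in for `instance.parameters` with just the two keys the function uses:

```python
import numpy as np

def eligible_counts(p):
    """Hypothetical helper: number of eligible cadets for each AFSC."""
    return np.array([len(p["I^E"][j]) for j in p["J"]])

# Made-up stand-in for instance.parameters containing only the keys used here
p = {"J": np.arange(3),
     "I^E": [np.array([0, 2]), np.array([1]), np.array([0, 1, 2])]}

counts = eligible_counts(p)
print("Eligible cadets per AFSC:", counts)
```

Any new function can pull whatever it needs out of `p` by key, so adding a parameter later doesn't change any function signatures.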


## "value_parameters" dictionary
Where "parameters" was the stuff we can't change (for the most part), the CadetCareerProblem attribute "value_parameters" is all the stuff that we can change! These are all the objectives, weights, value functions, constraints, etc. that the analyst has the flexibility to adjust. Let's explore this dictionary.

### Direct parameters from Excel
Just like before, the first parameters in the dictionary that I want to discuss are the ones that are essentially direct imports from Excel. These are pretty straightforward. From these value parameters, I have another function that creates sets of AFSCs and AFSC objectives.