## How to use this notebook

This notebook details a computer-matching algorithm that recommends who should be assigned the computers available through the Underdog Devs program. It covers the methodology, the logic behind the formula, and a run of the sorting engine on mock data, and it ends with a TODO list. You can expand and collapse sections using the arrow to the left of each markdown section heading.

## Imports

In [1]:
from dataclasses import dataclass, field
import random as r
import numpy as np
import pandas as p
import math as m
import itertools as iter

## Create mock data

#### Class and variables

In [2]:
@dataclass
class Mentee:
    Name : str 
    need_new_comp_rank: int
    need_help_aquiring_rank: int
    


##### Name variables (long)

In [3]:
first_names = (
    "Liam", "Noah", "Oliver", "Elijah", "William", "James", "Benjamin", "Lucas",
    "Henry", "Alexander", "Mason", "Michael", "Ethan", "Daniel", "Jacob",
    "Logan", "Jackson", "Levi", "Sebastian", "Mateo", "Jack", "Owen",
    "Theodore", "Aiden", "Samuel", "Joseph", "John", "David", "Wyatt",
    "Matthew", "Luke", "Asher", "Carter", "Julian", "Grayson", "Leo", "Jayden",
    "Gabriel", "Isaac", "Lincoln", "Anthony", "Hudson", "Dylan", "Ezra",
    "Thomas", "Charles", "Christopher", "Jaxon", "Maverick", "Josiah", "Isaiah",
    "Andrew", "Elias", "Joshua", "Nathan", "Caleb", "Ryan", "Adrian", "Miles",
    "Eli", "Nolan", "Christian", "Aaron", "Cameron", "Ezekiel", "Colton",
    "Luca", "Landon", "Hunter", "Jonathan", "Santiago", "Axel", "Easton",
    "Cooper", "Jeremiah", "Angel", "Roman", "Connor", "Jameson", "Robert",
    "Olivia", "Emma", "Ava", "Charlotte", "Sophia", "Amelia", "Isabella", "Mia",
    "Evelyn", "Harper", "Camila", "Gianna", "Abigail", "Luna", "Ella",
    "Elizabeth", "Sofia", "Emily", "Avery", "Mila", "Scarlett", "Eleanor",
    "Madison", "Layla", "Penelope", "Aria", "Chloe", "Grace", "Ellie", "Nora",
    "Hazel", "Zoey", "Riley", "Victoria", "Lily", "Aurora", "Violet", "Nova",
    "Hannah", "Emilia", "Zoe", "Stella", "Everly", "Isla", "Leah", "Lillian",
    "Addison", "Willow", "Lucy", "Paisley", "Natalie", "Naomi", "Eliana",
    "Brooklyn", "Elena", "Aubrey", "Claire", "Ivy", "Kinsley", "Audrey", "Maya",
    "Genesis", "Skylar", "Bella", "Aaliyah", "Madelyn", "Savannah", "Anna",
    "Delilah", "Serenity", "Caroline", "Kennedy", "Valentina", "Ruby", "Sophie",
    "Alice", "Gabriella", "Sadie", "Ariana", "Allison", "Hailey", "Autumn",
    "Nevaeh", "Natalia", "Quinn", "Josephine", "Sarah", "Cora", "Emery",
    "Samantha", "Piper", "Leilani", "Eva", "Everleigh", "Madeline", "Lydia",
    "Jade", "Peyton", "Brielle", "Adeline", "Vivian", "Rylee", "Clara",
    "Raelynn", "Melanie", "Melody", "Julia", "Athena", "Maria", "Liliana",
    "Hadley", "Arya", "Rose", "Reagan", "Eliza", "Adalynn", "Kaylee", "Lyla",
    "Mackenzie", "Alaia", "Isabelle", "Charlie", "Arianna", "Mary", "Remi",
    "Margaret", "Iris", "Parker", "Ximena", "Eden", "Ayla", "Kylie", "Elliana",
    "Josie", "Katherine", "Faith", "Alexandra", "Eloise", "Adalyn", "Amaya",
    "Jasmine", "Amara", "Daisy", "Reese", "Valerie", "Brianna", "Cecilia",
    "Andrea", "Summer", "Valeria", "Norah", "Ariella", "Esther", "Ashley",
    "Emerson", "Aubree", "Isabel", "Anastasia", "Ryleigh", "Khloe", "Taylor",
    "Londyn", "Lucia", "Emersyn", "Callie", "Sienna", "Blakely", "Kehlani",
)

last_names = (
    "Smith", "Johnson", "Williams", "Brown", "Jones", "Garcia", "Miller",
    "Davis", "Rodriguez", "Martinez", "Hernandez", "Lopez", "Gonzales",
    "Wilson", "Anderson", "Thomas", "Taylor", "Moore", "Jackson", "Martin",
    "Lee", "Perez", "Thompson", "White", "Harris", "Sanchez", "Clark",
    "Ramirez", "Lewis", "Robinson", "Walker", "Young", "Allen", "King",
    "Wright", "Scott", "Torres", "Nguyen", "Hill", "Flores", "Green", "Adams",
    "Nelson", "Baker", "Hall", "Rivera", "Campbell", "Mitchell", "Carter",
    "Roberts", "Gomez", "Phillips", "Evans", "Turner", "Diaz", "Parker", "Cruz",
    "Edwards", "Collins", "Reyes", "Stewart", "Morris", "Morales", "Murphy",
    "Cook", "Rogers", "Gutierrez", "Ortiz", "Morgan", "Cooper", "Peterson",
    "Bailey", "Reed", "Kelly", "Howard", "Ramos", "Kim", "Cox", "Ward",
    "Richardson", "Watson", "Brooks", "Chavez", "Wood", "James", "Bennet",
    "Gray", "Mendoza", "Ruiz", "Hughes", "Price", "Alvarez", "Castillo",
    "Sanders", "Patel", "Myers", "Long", "Ross", "Foster", "Jimenez",
)

In [4]:
def name():
    return f"{r.choice(first_names)} {r.choice(last_names)}"

#### Create and run mock data creation function

In [5]:
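def Create_Populate_Table(Amount):
    """Build a list of `Amount` mock mentees with random names and ranks."""
    mentees = []
    for _ in range(Amount):
        # randint is inclusive on both ends, so each rank falls in 0..3
        newment = Mentee(Name=name(),
                         need_new_comp_rank=r.randint(0, 3),
                         need_help_aquiring_rank=r.randint(0, 3))
        mentees.append(newment)
    return mentees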

In [6]:
menteelist = Create_Populate_Table(50)
menteelist

[Mentee(Name='Josie Gutierrez', need_new_comp_rank=3, need_help_aquiring_rank=0),
 Mentee(Name='Hudson Cox', need_new_comp_rank=0, need_help_aquiring_rank=0),
 Mentee(Name='Adeline Long', need_new_comp_rank=2, need_help_aquiring_rank=2),
 Mentee(Name='Mackenzie Howard', need_new_comp_rank=3, need_help_aquiring_rank=2),
 Mentee(Name='Anna Gomez', need_new_comp_rank=1, need_help_aquiring_rank=3),
 Mentee(Name='Blakely White', need_new_comp_rank=1, need_help_aquiring_rank=3),
 Mentee(Name='Asher Edwards', need_new_comp_rank=1, need_help_aquiring_rank=2),
 Mentee(Name='Kaylee Jimenez', need_new_comp_rank=3, need_help_aquiring_rank=0),
 Mentee(Name='Asher Gomez', need_new_comp_rank=0, need_help_aquiring_rank=2),
 Mentee(Name='Thomas Hernandez', need_new_comp_rank=3, need_help_aquiring_rank=3),
 Mentee(Name='Luke Cox', need_new_comp_rank=1, need_help_aquiring_rank=3),
 Mentee(Name='Cooper Allen', need_new_comp_rank=0, need_help_aquiring_rank=2),
 Mentee(Name='Emilia Parker', need_new_comp_ra

#### Convert to dataframe for easier manipulation

In [7]:
df = p.DataFrame(map(vars, menteelist))

In [8]:
df.head(10) 

|   | Name | need_new_comp_rank | need_help_aquiring_rank |
|---|---|---|---|
| 0 | Josie Gutierrez | 3 | 0 |
| 1 | Hudson Cox | 0 | 0 |
| 2 | Adeline Long | 2 | 2 |
| 3 | Mackenzie Howard | 3 | 2 |
| 4 | Anna Gomez | 1 | 3 |
| 5 | Blakely White | 1 | 3 |
| 6 | Asher Edwards | 1 | 2 |
| 7 | Kaylee Jimenez | 3 | 0 |
| 8 | Asher Gomez | 0 | 2 |
| 9 | Thomas Hernandez | 3 | 3 |


## Create sorting engine, short documentation of formula and methodology

#### Create and explain formula

##### Formula

need help = x, need comp = y

$\left(x^2 - \lfloor 0.7x \rfloor\right) \cdot \left|y^3 - y^2 - y\right|$
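
As a worked example (an illustration with arbitrarily chosen ranks), take need help $x = 2$ and need comp $y = 3$:

$$
\left(2^2 - \lfloor 0.7 \cdot 2 \rfloor\right) \cdot \left|3^3 - 3^2 - 3\right| = (4 - 1) \cdot |27 - 9 - 3| = 3 \cdot 15 = 45
$$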

##### Explanation 

__**Terms**__

**Need help**, or "**need help acquiring rank**" (the `need_help_aquiring_rank` column), represents how much help the person needs to acquire a computer:

- 0: "I could buy it today, no problem"
- 1: "it would take a little time (1 week to 1 month)"
- 2: "it would take a while (1-3 months)"
- 3: "I cannot acquire one on my own in any reasonable timeframe"

Ranks 1 and 2 are adjustable: they can represent different timeframes if the user wishes. Ranks 0 and 3 are fixed, since they represent the extremes. The rank is based on timeframe rather than income because it captures what the person can do rather than why. For example, consider someone who lives in Alaska (and gets the government stipend), is receiving aid (so has some income), and has a friend who builds computers and will sell a spare build cheaply (friend discount). That person may have less overall wealth than a homeowner facing a looming foreclosure, yet may be able to obtain a new computer sooner, because the homeowner must put ALL income toward debt repayment or face homelessness.

**Need comp**, or "**need new computer rank**" (the `need_new_comp_rank` column), represents how much the person needs a new computer, with 0 being "I do not need one at all" and 3 being "I signed up at the library and don't have a computer / I am coding on a phone"; 1 and 2 represent values in between. Note: 0 does not mean a "top of the line" computer, just one where upgrading would not improve the person's ability to complete the coding work. For example, a data scientist who has to train models may have different computing needs than someone developing apps for iOS/Android.

**Help sort value** is the value the formula generates from the need help and need comp values, and it is what the data is sorted by. Once this value is computed, sorting the dataframe by this column in descending order produces the priority list for available computers.
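
The two rank scales defined in the terms above could be captured as enumerations so that out-of-range values are rejected early. This is only a sketch: the class and member names below are hypothetical and are not used elsewhere in this notebook, and the timeframes in the comments are the default readings given above.

```python
from enum import IntEnum


class NeedHelpRank(IntEnum):
    """How long it would take the person to acquire a computer unaided."""

    CAN_BUY_TODAY = 0
    SHORT_WAIT = 1  # default reading: 1 week to 1 month
    LONG_WAIT = 2  # default reading: 1 to 3 months
    CANNOT_ACQUIRE = 3


class NeedCompRank(IntEnum):
    """How much the person's current setup limits their coding work."""

    NO_NEED = 0
    MINOR_NEED = 1  # hypothetical label for an in-between value
    MAJOR_NEED = 2  # hypothetical label for an in-between value
    NO_USABLE_COMPUTER = 3


# IntEnum members behave like ints, so they can be used directly in arithmetic.
rank = NeedHelpRank(2)
print(rank.name, int(rank))

# Values outside 0..3 raise ValueError instead of silently entering the data.
try:
    NeedCompRank(4)
except ValueError as err:
    print("rejected:", err)
```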

**Formula explanation**

The formula is built to favor those who need more help acquiring a computer (those who would take longer to get one on their own), since the sooner someone upgrades their equipment, the sooner they get used to their dev environment (for example, if someone is going to switch from PC to Mac/Linux, it is best for that to happen as soon as possible). The exception is when someone desperately needs a computer (need comp rank 3): provided they also need at least some help acquiring one, that case is prioritized over everything else, since not having a viable computer goes from "massively inefficient" to the extreme of "unable to do any learning". If either rank is 0 (they do not need a new computer, or they do not need help acquiring one), the formula evaluates to zero, so only someone who both needs a computer and needs help acquiring it gets a positive value.
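
To make the weighting concrete, the sketch below splits the formula into its two factors and prints each factor for ranks 0 to 3. The function names (`help_factor`, `comp_factor`, `help_sort_value`) are illustrative and do not appear elsewhere in this notebook.

```python
import math


def help_factor(x: int) -> int:
    """First factor of the formula: x^2 - floor(0.7 * x), for need help rank x."""
    return x**2 - math.floor(0.7 * x)


def comp_factor(y: int) -> int:
    """Second factor of the formula: |y^3 - y^2 - y|, for need comp rank y."""
    return abs(y**3 - y**2 - y)


def help_sort_value(x: int, y: int) -> int:
    """Product of the two factors; zero whenever either rank is 0."""
    return help_factor(x) * comp_factor(y)


ranks = range(4)
print([help_factor(x) for x in ranks])  # [0, 1, 3, 7]
print([comp_factor(y) for y in ranks])  # [0, 1, 2, 15]

# Lowest score with need comp = 3 (and at least some need for help) versus
# the highest score achievable with need comp below 3.
lowest_with_comp_3 = min(help_sort_value(x, 3) for x in (1, 2, 3))
highest_without_comp_3 = max(help_sort_value(x, y) for x in ranks for y in (0, 1, 2))
print(lowest_with_comp_3, highest_without_comp_3)  # 15 14
```

The jump in the need comp factor from 2 to 15 is what places every rank 3 case with a nonzero need help rank above all other combinations.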

**Important note**

Still return the bottom of the list, including the 0 values! It is imperative that no members are excluded. This lets the organization make sure every donated computer sees use, which helps secure continued donations in the future.
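
One way to honor this, sketched under the assumption that a fixed number of computers is available, is to sort without ever filtering and then flag the top rows. The mentee names, `available_computers`, and the `offered_computer` column below are hypothetical; only the rank and sorting value column names come from this notebook.

```python
import math

import pandas as pd

# Hypothetical mentees; the column names mirror the mock data in this notebook.
mentees = pd.DataFrame(
    {
        "Name": ["A", "B", "C", "D"],
        "need_new_comp_rank": [3, 0, 2, 1],
        "need_help_aquiring_rank": [3, 2, 0, 2],
    }
)

mentees["computer_assignment_sorting_value"] = [
    (x**2 - math.floor(0.7 * x)) * abs(y**3 - y**2 - y)
    for x, y in zip(mentees["need_help_aquiring_rank"], mentees["need_new_comp_rank"])
]

# Sort only; never filter. A stable sort keeps tied rows in their original order.
ranked = mentees.sort_values(
    "computer_assignment_sorting_value", ascending=False, kind="stable"
).reset_index(drop=True)

available_computers = 3  # assumed number of donated machines
ranked["offered_computer"] = ranked.index < available_computers

print(ranked[["Name", "computer_assignment_sorting_value", "offered_computer"]])
```

In this example only two mentees score above zero, so the third computer goes to a zero-valued row rather than sitting unused.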

#### Create test dataframe to ensure wanted results

In [9]:
a = [0,1,2,3]
b = [0,1,2,3]
c = list(iter.product(a,b))
print(c)

[(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2), (2, 3), (3, 0), (3, 1), (3, 2), (3, 3)]


In [10]:
dftest = p.DataFrame(data=c, columns=['Need help','Need comp'])
dftest

|   | Need help | Need comp |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 2 | 0 | 2 |
| 3 | 0 | 3 |
| 4 | 1 | 0 |
| 5 | 1 | 1 |
| 6 | 1 | 2 |
| 7 | 1 | 3 |
| 8 | 2 | 0 |
| 9 | 2 | 1 |


#### Test on test dataframe

This will show all possible combinations of values for 'need help' and 'need comp', and the 'help sort value' generated.

In [11]:
col1 = dftest['Need help']
col2 = dftest['Need comp']
dftest['help sort value'] = [(x**2-m.floor((.7*x))) * (abs(y**3-y**2-y)) for x, y in zip(col1, col2)]
dftest.sort_values('help sort value', ascending=False)

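|   | Need help | Need comp | help sort value |
|---|---|---|---|
| 15 | 3 | 3 | 105 |
| 11 | 2 | 3 | 45 |
| 7 | 1 | 3 | 15 |
| 14 | 3 | 2 | 14 |
| 13 | 3 | 1 | 7 |
| 10 | 2 | 2 | 6 |
| 9 | 2 | 1 | 3 |
| 6 | 1 | 2 | 2 |
| 5 | 1 | 1 | 1 |
| 0 | 0 | 0 | 0 |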


#### Test on mock data

In [12]:
col1 = df['need_help_aquiring_rank']
col2 = df['need_new_comp_rank']
df['computer_assignment_sorting_value'] = [(x**2-m.floor((.7*x))) * (abs(y**3-y**2-y)) for x, y in zip(col1, col2)]
df.sort_values('computer_assignment_sorting_value', ascending=False)

|   | Name | need_new_comp_rank | need_help_aquiring_rank | computer_assignment_sorting_value |
|---|---|---|---|---|
| 47 | Norah Scott | 3 | 3 | 105 |
| 39 | Paisley Long | 3 | 3 | 105 |
| 12 | Emilia Parker | 3 | 3 | 105 |
| 9 | Thomas Hernandez | 3 | 3 | 105 |
| 3 | Mackenzie Howard | 3 | 2 | 45 |
| 37 | Melanie Anderson | 3 | 2 | 45 |
| 29 | Addison Thomas | 3 | 2 | 45 |
| 41 | Dylan Morales | 3 | 2 | 45 |
| 43 | Cora Thomas | 3 | 1 | 15 |
| 27 | Jacob Morales | 3 | 1 | 15 |


## Change sort value to be presentable

Here the sort value is converted into something that is easy for the end user to read, rather than returning the raw sorting number. The function below collects the unique values in the computer assignment sorting value column, uses `numpy.interp` to map them onto a 1 to 10 star scale, and returns the matching string of stars. It is then applied to each of the dataframe's sorting values to create a new Computer Assistance Star Rating column.
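
As a small illustration of the interpolation step, the sketch below maps a few sorting values to star counts. It assumes all ten distinct values the formula can produce for ranks 0 to 3 are present; `np.interp` requires its two reference sequences to have the same length. The variable names are illustrative only.

```python
import numpy as np

# The ten distinct values the formula can produce for ranks 0 to 3.
possible_values = [0, 1, 2, 3, 6, 7, 14, 15, 45, 105]
star_counts = list(range(1, 11))

for value in (0, 15, 45, 105):
    # Each known value lands exactly on its position in the 1-10 scale.
    stars = int(np.interp(value, possible_values, star_counts))
    print(value, "⭐" * stars)
```

Because the mapping is by position among the unique values rather than by magnitude, a value of 15 earns 8 stars even though it is far below 105.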

In [13]:
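def convert_to_star(dataframe, x):
    '''Convert a computer assignment sorting value to a star rating string.'''
    # Sorted unique sorting values present in the dataframe
    sortvals = np.sort(dataframe.computer_assignment_sorting_value.unique())
    # np.interp needs both reference arrays to be the same length, so spread
    # the 1-10 scale evenly over however many unique values exist
    # (this equals [1, 2, ..., 10] when all ten possible values occur)
    StarRatings = np.linspace(1, 10, num=len(sortvals))
    numstars = int(round(np.interp(x, sortvals, StarRatings)))
    stars = '⭐️' * numstars
    return stars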

In [14]:
df['Computer Assistance Star Rating'] = [convert_to_star(df, x) for x in df['computer_assignment_sorting_value']]


In [15]:
df[['Name','Computer Assistance Star Rating']].sort_values('Computer Assistance Star Rating', ascending=False)

|   | Name | Computer Assistance Star Rating |
|---|---|---|
| 47 | Norah Scott | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |
| 39 | Paisley Long | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |
| 12 | Emilia Parker | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |
| 9 | Thomas Hernandez | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |
| 3 | Mackenzie Howard | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |
| 37 | Melanie Anderson | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |
| 29 | Addison Thomas | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |
| 41 | Dylan Morales | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |
| 43 | Cora Thomas | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |
| 27 | Jacob Morales | ⭐️⭐️⭐️⭐️⭐️⭐️⭐️⭐️ |


## TODO

- add a collection to the MongoDB database so that this data can be handled separately

- create API functions for the new collection

- connect to backend/frontend

Find more details on the Trello card.