# CSUEB Data Science Club Fall 2020 Project

This project is for undergraduate and graduate students looking for an extracurricular project to sharpen their data science skills. The problem is based on a real mentorship program offered by the San Francisco Professional Chapter of ALPFA in partnership with the CSUEB Student Chapter of ALPFA. The program launched in summer 2020 and will have recurring enrollment going forward. This project seeks to automate the matching of mentors and mentees, a process that is currently done manually. The work is broken into three parts, beginning with the creation of our mock survey results below. The first task, Mentee Ranking, will be solved at our club's second live event later this semester, and the last task, Stable Matching, will be solved at our club's final event at the end of the semester. Direct any questions to info.csueb.dsc@gmail.com. Happy problem solving!

#### We begin by importing some basic packages

In [1]:
import random
import pandas as pd

This is a function to generate a list of 10 random integers from 0 to a specified number "num", inclusive:

In [2]:
def surveyCol(num):
    return [random.randint(0, num) for _ in range(10)]
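Because "surveyCol" draws from Python's random module, the mock survey answers change every time the notebook is run. A minimal sketch, assuming you want everyone at an event to see the same numbers: seed the generator before generating data. The seed value 2020 is an arbitrary choice and is not part of the original notebook.

```python
import random

def surveyCol(num):
    # Same helper as above: 10 random integers between 0 and num, inclusive.
    return [random.randint(0, num) for _ in range(10)]

random.seed(2020)        # any fixed integer works; 2020 is an arbitrary choice
first = surveyCol(5)

random.seed(2020)        # re-seeding restarts the same pseudo-random sequence
second = surveyCol(5)

print(first == second)   # True
```

Without the seed call, the lists printed later in this notebook will not match the ones you generate yourself.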

This script defines a class called "Participant." Each object stores a name plus three lists of mock survey answers as attributes. The attributes resemble key-value pairs, but the object must be converted to a "dict" type before we can perform dictionary operations on it.

In [3]:
class Participant:
    def __init__(self, name):
        self.name = name
        self.primary = surveyCol(5)
        self.ideal_match = surveyCol(5)
        self.level_of_importance = surveyCol(3)

Here are our lists of participating mentors and mentees.

In [4]:
mentor_names = ['Jose', 'Amanda', 'Francisco', 'Megan', 'Phil', 'Carla']
mentee_names = ['Chris', 'Kevin', 'Rachel', 'Monica', 'Emily', 'William']

These are two functions. The first, "convert", takes a dictionary-structured object and returns it as a dictionary; it is called by the second function. "surveyGroup" takes a list of strings, creates a "Participant" object from each string, and calls "convert" on each object. It returns a new list of dictionaries.

In [5]:
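def convert(participant):
    # An object's attributes are stored in its __dict__ as key-value pairs.
    return participant.__dict__

def surveyGroup(names):
    # Build one Participant per name and convert each to a dictionary.
    # (Parameter names avoid shadowing the built-ins "dict" and "list".)
    return [convert(Participant(name)) for name in names]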

Here we pass our lists of mentor and mentee names to the functions above. Each returned list holds one dictionary per participant, with the keys "name", "primary", "ideal_match", and "level_of_importance" mapped to that participant's name and survey responses.

In [6]:
mentors = surveyGroup(mentor_names)
mentees = surveyGroup(mentee_names)

Here we print our newly created lists:

In [7]:
for mentor in mentors:
    print(mentor)

{'name': 'Jose', 'primary': [1, 0, 3, 3, 5, 5, 5, 0, 0, 5], 'ideal_match': [2, 0, 3, 1, 5, 1, 0, 4, 0, 0], 'level_of_importance': [3, 0, 0, 2, 3, 2, 1, 1, 2, 1]}
{'name': 'Amanda', 'primary': [2, 1, 4, 0, 1, 5, 3, 2, 0, 1], 'ideal_match': [2, 4, 4, 1, 5, 2, 2, 2, 0, 5], 'level_of_importance': [3, 1, 3, 0, 2, 0, 1, 1, 0, 1]}
{'name': 'Francisco', 'primary': [2, 1, 3, 1, 3, 5, 4, 5, 3, 4], 'ideal_match': [5, 5, 3, 0, 0, 4, 2, 4, 3, 5], 'level_of_importance': [2, 3, 0, 0, 2, 3, 1, 2, 2, 2]}
{'name': 'Megan', 'primary': [3, 3, 5, 3, 4, 4, 1, 2, 0, 4], 'ideal_match': [2, 5, 0, 2, 1, 4, 0, 1, 4, 4], 'level_of_importance': [3, 2, 3, 2, 2, 3, 1, 0, 3, 0]}
{'name': 'Phil', 'primary': [4, 3, 4, 4, 5, 3, 4, 3, 3, 5], 'ideal_match': [5, 2, 5, 3, 4, 5, 1, 5, 3, 4], 'level_of_importance': [3, 3, 2, 1, 1, 2, 0, 3, 2, 2]}
{'name': 'Carla', 'primary': [0, 4, 5, 3, 4, 1, 4, 0, 0, 0], 'ideal_match': [0, 5, 2, 2, 0, 0, 5, 3, 1, 4], 'level_of_importance': [2, 3, 1, 3, 1, 3, 1, 1, 1, 3]}


In [8]:
for mentee in mentees:
    print(mentee)

{'name': 'Chris', 'primary': [2, 0, 0, 2, 0, 1, 1, 2, 3, 4], 'ideal_match': [1, 5, 4, 2, 2, 3, 1, 4, 3, 0], 'level_of_importance': [2, 3, 0, 2, 0, 0, 3, 1, 1, 3]}
{'name': 'Kevin', 'primary': [4, 5, 3, 5, 3, 1, 2, 5, 4, 2], 'ideal_match': [4, 5, 3, 3, 0, 0, 4, 2, 3, 5], 'level_of_importance': [2, 1, 3, 1, 2, 3, 3, 2, 1, 1]}
{'name': 'Rachel', 'primary': [1, 5, 5, 1, 0, 5, 3, 3, 0, 0], 'ideal_match': [5, 0, 0, 0, 1, 4, 3, 3, 4, 3], 'level_of_importance': [3, 0, 2, 1, 1, 3, 2, 3, 1, 1]}
{'name': 'Monica', 'primary': [1, 4, 2, 3, 5, 2, 3, 1, 3, 4], 'ideal_match': [5, 0, 2, 3, 5, 3, 0, 5, 3, 2], 'level_of_importance': [1, 1, 0, 2, 1, 1, 3, 0, 1, 2]}
{'name': 'Emily', 'primary': [2, 4, 5, 0, 4, 3, 3, 3, 4, 4], 'ideal_match': [2, 5, 0, 2, 4, 5, 5, 5, 3, 5], 'level_of_importance': [3, 0, 2, 2, 0, 3, 2, 2, 0, 3]}
{'name': 'William', 'primary': [5, 0, 3, 5, 0, 3, 4, 3, 5, 0], 'ideal_match': [1, 3, 3, 5, 1, 1, 3, 1, 2, 5], 'level_of_importance': [1, 0, 3, 3, 0, 2, 3, 2, 2, 1]}


#### In this step we convert our list items (dictionaries) to data frames to make them easier to analyze. The "name" key gets in the way because its value is a string rather than a list of 10 integers, so we remove it with the pop() function.

In [9]:
for mentor in mentors:
    mentor.pop('name')

Now we convert each mentor's survey responses to a data frame and assign it to a variable named after that mentor:

In [10]:
Jose = pd.DataFrame.from_dict(mentors[0])
Amanda = pd.DataFrame.from_dict(mentors[1])
Francisco = pd.DataFrame.from_dict(mentors[2])
Megan = pd.DataFrame.from_dict(mentors[3])
Phil = pd.DataFrame.from_dict(mentors[4])
Carla = pd.DataFrame.from_dict(mentors[5])

We repeat this process for the mentees:

In [11]:
for mentee in mentees:
    mentee.pop('name')

In [12]:
# Indices follow the order of mentee_names: Rachel is at 2, Monica at 3.
Chris = pd.DataFrame.from_dict(mentees[0])
Kevin = pd.DataFrame.from_dict(mentees[1])
Rachel = pd.DataFrame.from_dict(mentees[2])
Monica = pd.DataFrame.from_dict(mentees[3])
Emily = pd.DataFrame.from_dict(mentees[4])
William = pd.DataFrame.from_dict(mentees[5])

Finally, we create lists of data frames to make iterating over them for analysis easier:

In [13]:
df_mentors = [Jose, Amanda, Francisco, Megan, Phil, Carla]
df_mentees = [Chris, Kevin, Rachel, Monica, Emily, William]
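The per-name variables above are tied to list positions by hand, which makes it easy to attach the wrong name to a data frame. A minimal sketch of one alternative, assuming you would rather key each data frame by the participant's name: "build_frames" is a hypothetical helper that is not part of the original notebook, and the sample records are made up and shortened to three questions. It reads the "name" key, so it would need to run on the lists before pop() removes that key.

```python
import pandas as pd

def build_frames(records):
    """Map each participant's name to a DataFrame of their survey answers.

    Hypothetical helper: `records` is a list of dicts shaped like the ones
    returned by surveyGroup (a 'name' key plus equal-length answer lists).
    """
    frames = {}
    for record in records:
        # Copy everything except the name so the original dicts stay intact.
        answers = {key: value for key, value in record.items() if key != 'name'}
        frames[record['name']] = pd.DataFrame(answers)
    return frames

# Tiny made-up records (3 questions instead of 10) to keep the example short.
sample = [
    {'name': 'A', 'primary': [1, 2, 3], 'ideal_match': [3, 2, 1], 'level_of_importance': [0, 1, 2]},
    {'name': 'B', 'primary': [5, 4, 3], 'ideal_match': [1, 1, 1], 'level_of_importance': [3, 3, 3]},
]
frames = build_frames(sample)
print(list(frames))          # ['A', 'B']
print(frames['A'].shape)     # (3, 3)
```

With this layout, a lookup such as frames['A'] cannot drift out of sync with the order of the name list.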

### Task 1: Create a compatibility ranking system for mentors & mentees and return a dictionary with each mentor's name as the key and, as the value, a list of mentees sorted from most compatible to least compatible.

In [None]:
#Your code here
#Tip: Use the geometric mean of the mentor/mentee survey scores to compute the compatibility score
# used for ranking potential matches.
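The tip does not say how the survey scores should be combined, so the following is only one possible sketch under stated assumptions, not the club's solution. It assumes answers are on a 0 to 5 scale, scores each question by how close one person's "ideal_match" answer is to the other person's "primary" answer, shifts every score by 1 so that no value is zero (a single zero would force the geometric mean to zero), and ignores "level_of_importance" entirely. The helpers "closeness", "compatibility", and "rank_all" are hypothetical names, and the two-question records are made up.

```python
import math

def geometric_mean(values):
    # nth root of the product, computed through logs; every value must be > 0.
    return math.exp(sum(math.log(v) for v in values) / len(values))

def closeness(ideal, primary, scale_max=5):
    # Assumed scoring: 1 (answers as far apart as possible) up to scale_max + 1 (identical).
    return [scale_max - abs(i - p) + 1 for i, p in zip(ideal, primary)]

def compatibility(mentor, mentee):
    # Combine both directions: what the mentor wants and what the mentee wants.
    mentor_view = closeness(mentor['ideal_match'], mentee['primary'])
    mentee_view = closeness(mentee['ideal_match'], mentor['primary'])
    return geometric_mean(mentor_view + mentee_view)

def rank_all(mentors, mentees):
    # Mentor name -> mentee names sorted from most to least compatible.
    return {
        m['name']: [e['name'] for e in
                    sorted(mentees, key=lambda e: compatibility(m, e), reverse=True)]
        for m in mentors
    }

# Made-up records with two questions each.
toy_mentors = [{'name': 'M', 'primary': [1, 1], 'ideal_match': [5, 0]}]
toy_mentees = [
    {'name': 'Y', 'primary': [0, 5], 'ideal_match': [5, 5]},
    {'name': 'X', 'primary': [5, 0], 'ideal_match': [1, 1]},
]
print(rank_all(toy_mentors, toy_mentees))   # {'M': ['X', 'Y']}
```

Here X matches M perfectly in both directions, so X ranks ahead of Y. A fuller solution might weight each question by "level_of_importance", which this sketch leaves out.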

### Task 2: Based on the sorted lists of potential matches, pair every mentor with their best available mentee match.

In [None]:
#Your code here
#Tip: Use the "Stable Matching" Algorithm.
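For orientation, here is a minimal sketch of the Gale-Shapley stable matching procedure on made-up data, not a finished answer to the task. It assumes both groups are the same size, that every participant ranks every member of the other group, and that mentors do the proposing. The function name "stable_matching" and the toy preference lists are hypothetical; in the project, the mentor lists would come from Task 1, and the mentee lists would need a ranking of their own.

```python
def stable_matching(proposer_prefs, receiver_prefs):
    """Gale-Shapley with proposers proposing; returns {proposer: receiver}."""
    # rank[r][p] = position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}   # next index each proposer will try
    engaged_to = {}                                # receiver -> current proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                      # receiver was unmatched: accept
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                      # receiver prefers the new proposer
            free.append(current)
        else:
            free.append(p)                         # rejected: try the next choice later

    return {p: r for r, p in engaged_to.items()}

# Made-up preferences: both mentors want 'a', but 'a' prefers M2.
mentor_prefs = {'M1': ['a', 'b'], 'M2': ['a', 'b']}
mentee_prefs = {'a': ['M2', 'M1'], 'b': ['M1', 'M2']}
matches = stable_matching(mentor_prefs, mentee_prefs)
print(matches == {'M1': 'b', 'M2': 'a'})   # True
```

The result is stable because no mentor and mentee both prefer each other over their assigned partners: M1 would rather have 'a', but 'a' prefers M2.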