# **AI Career Advisor**: A Predictive Model for Career Path Selection

## **Description**

This project will leverage machine learning to guide individuals in discovering career paths aligned with their personalities, interests, and work preferences. By analyzing a range of factors, the model aims to provide tailored career recommendations both for individuals just starting out and for professionals seeking a career transition.

## **Initial Road Map**

#### **Define the Problem:**

The aim of this project is to predict or recommend career paths from a respondent's profile. This requires considering factors that can influence an individual's career decisions, such as:

* Personality traits
* Interests or hobbies
* Skills or strengths
* Educational background
* Work-life balance preferences

**Data Collection:**

* Conduct a survey to gather my own data.

**Feature Selection:**

* Identify which features (e.g., personality type, education level, interest areas) will be used to train the model.
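
One possible way to shortlist features is to score each candidate by how much information it carries about the target. The sketch below uses scikit-learn's `mutual_info_classif` on a made-up table; the column names and values are invented for illustration, and this is only one option, not a method the project has settled on.

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

# Toy, integer-coded features invented for this sketch
X = pd.DataFrame({
    "work_style": [0, 1, 0, 1, 0, 1, 0, 1],  # mirrors the target exactly
    "noise":      [0, 0, 1, 1, 0, 0, 1, 1],  # unrelated to the target
})
y = [0, 1, 0, 1, 0, 1, 0, 1]  # hypothetical career classes

# discrete_features=True treats every column as categorical
scores = mutual_info_classif(X, y, discrete_features=True, random_state=0)

# Rank feature names from most to least informative
ranking = sorted(zip(X.columns, scores), key=lambda pair: pair[1], reverse=True)
```

Features with a score near zero are candidates for removal, although with a small survey the scores can be noisy.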

**Modeling:**

* Classification algorithms to predict a suitable career. For example: Random Forest or Decision Trees might be a good start, since they work well with categorical data once it has been encoded.

* Logistic Regression or even K-Nearest Neighbors.

* Recommendation systems or clustering techniques (for a broader range of careers).
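
As a rough illustration of the first option, the sketch below fits a Random Forest on a tiny made-up table. The column names, answers, and career labels are invented and are not taken from the survey. scikit-learn's tree models expect numeric input, so the categorical answers are one-hot encoded first.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy data: hypothetical answers and career labels, invented for this sketch
toy = pd.DataFrame({
    "work_style": ["solo", "team", "solo", "team", "solo", "team"],
    "education": ["bachelor", "master", "master", "bachelor", "bachelor", "master"],
    "career": ["Tech", "Management", "Tech", "Management", "Tech", "Management"],
})

# One-hot encode the categorical answers so the trees receive numeric input
X = pd.get_dummies(toy[["work_style", "education"]])
y = toy["career"]

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# Encode a new respondent and align its columns with the training matrix
new = pd.get_dummies(pd.DataFrame({"work_style": ["solo"], "education": ["master"]}))
new = new.reindex(columns=X.columns, fill_value=0)
prediction = model.predict(new)[0]
```

The `reindex` step matters in practice: a single respondent rarely produces every dummy column seen during training, so the missing ones are filled with zeros.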

**Evaluation:**

* Test the model to see how well it predicts a career on a held-out test set. Use metrics like accuracy, precision, and recall.
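
A minimal sketch of these metrics with scikit-learn, assuming a multi-class career target. The labels below are invented for illustration. Macro averaging is one reasonable choice here because it weights every career class equally, including rare ones.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical true and predicted career labels for a small test set
y_true = ["Tech", "Tech", "Academics", "Management", "Academics", "Tech"]
y_pred = ["Tech", "Academics", "Academics", "Management", "Tech", "Tech"]

accuracy = accuracy_score(y_true, y_pred)
# average="macro" computes the metric per class and then takes the unweighted mean
precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
recall = recall_score(y_true, y_pred, average="macro", zero_division=0)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```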

**Interface/User Interaction:**

* Create a simple web or app interface where users input their details, and the model returns career suggestions.
* Use Flask (Python) or other frameworks like Streamlit.

**Resources:**

Libraries:

* pandas, scikit-learn, and XGBoost for data processing and modeling.



In [2]:
import pandas as pd
from sklearn.preprocessing import StandardScaler, MultiLabelBinarizer

# Load dataset
df = pd.read_csv('Career Path Survey.csv')

In [3]:
df.head()

Unnamed: 0,Timestamp,What is your gender?,What is your current employment status?,How many years of professional work experience do you have?,What is your highest level of education?,What is your age?,What is the main reason you are looking to transition into a new career?,"Other, pls specify",What field(s) are you considering for your career transition?,What challenges do you anticipate in making this transition?,...,Do you have any professional certifications or licenses?,"If Yes, please specify",How do you usually spend your free time?,How do you prefer to recharge after a long day?,"In large social gatherings, do you typically feel",How would you describe yourself in conversations?,Which type of work environment do you prefer?,Do you prefer working in,How do you handle uncertainty or change in the workplace?,What is your ideal work-life balance?
0,9/29/2024 12:08:43,Female,"Unemployed, looking for work",4-7 years,Professional certifications,45-54,Seeking higher income or more security,The water,Tech,Difficulty finding opportunities,...,Yes,TRCN,Spending time alone or in quiet reflection,,Spending time alone or in quiet reflection,I mostly listen and speak when necessary,"Flexible, dynamic, and open-ended","Independent, solo work",I adapt quickly and enjoy new challenges,Balanced between work and personal time
1,9/29/2024 15:04:09,Female,Employed full-time,1-3 years,Master’s degree,25-34,,,,,...,No,,Socializing with friends or attending events,Spending time alone or in quiet reflection,Energized and excited by the group,I mostly listen and speak when necessary,"Flexible, dynamic, and open-ended",Team settings with collaboration,I adapt quickly and enjoy new challenges,Balanced between work and personal time
2,9/29/2024 15:16:10,Female,"Unemployed, looking for work",10+ years,Master’s degree,35-44,Seeking higher income or more security,Financial instability during the transition,Academics,Difficulty finding opportunities,...,,,,,,,,,,
3,9/29/2024 16:43:13,Female,Employed full-time,4-7 years,Bachelor’s degree,25-34,,,,,...,Yes,Yes,Spending time alone or with one or two close f...,Being with people or engaging in social activi...,Overwhelmed and prefer smaller groups,I often start conversations and talk a lot,"Structured, organized, and predictable",Team settings with collaboration,I adapt quickly and enjoy new challenges,Balanced between work and personal time
4,9/29/2024 19:19:23,Male,Employed full-time,10+ years,Master’s degree,45-54,Personal growth or a new challenge,New challenge and decision making,Executive Management,Difficulty finding opportunities,...,,,,,,,,,,


In [4]:
# Preprocess dataset
# Drop redundant columns
df.drop(columns=['Timestamp'], inplace=True)

In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 83 entries, 0 to 82
Data columns (total 29 columns):
 #   Column                                                                                                                    Non-Null Count  Dtype  
---  ------                                                                                                                    --------------  -----  
 0   What is your gender?                                                                                                      83 non-null     object 
 1   What is your current employment status?                                                                                   83 non-null     object 
 2   How many years of professional work experience do you have?                                                               65 non-null     object 
 3   What is your highest level of education?                                                                                  83 non-null    

In [73]:
df.duplicated().sum()

0

In [6]:
# Remove leading and trailing whitespace from the column names
df.columns = df.columns.str.strip()

In [7]:
# Dropping specific columns by name
df.drop(columns=['Other, please specify .1','Other, please specify .2', 'Other, pls specify'], inplace=True)


In [8]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 83 entries, 0 to 82
Data columns (total 25 columns):
 #   Column                                                                                                                    Non-Null Count  Dtype  
---  ------                                                                                                                    --------------  -----  
 0   What is your gender?                                                                                                      83 non-null     object 
 1   What is your current employment status?                                                                                   83 non-null     object 
 2   How many years of professional work experience do you have?                                                               65 non-null     object 
 3   What is your highest level of education?                                                                                  83 non-null    

In [9]:
#Create new column labels 

# Define a dictionary with old column names as keys and new column names as values
new_column_names = {
    'What is your gender?': 'gender',
    'What is your current employment status?': 'employment_status',
    'How many years of professional work experience do you have?': 'years_experience',
    'What is your highest level of education?': 'education_level',
    'What is your age?': 'age',
    'What is the main reason you are looking to transition into a new career?': 'transition_reason',
    'What field(s) are you considering for your career transition?': 'desired_field',
    'What challenges do you anticipate in making this transition?': 'transition_challenge',
    'Other, please specify': 'challenges_other',
    'What is your current job title or industry (if applicable)?': 'current_job_title',
    'How satisfied are you with your current career?\n(On a scale of 1 to 5, where 1 = not satisfied and 5 = highly satisfied)': 'career_satisfaction',
    'What is your primary motivation for changing careers or choosing a career path?': 'primary_motivation',
    'What type of career do you aspire to have in the future?': 'aspired_job',
    'Which technical skills do you have? (Select all that apply)': 'technical_skills',
    'Other, pls specify ': 'technical_skills_other',
    'Which soft skills do you possess? (Select all that apply)': 'soft_skills',
    'Do you have any professional certifications or licenses?': 'has_certifications',
    'If Yes, please specify': 'certifications',
    'How do you usually spend your free time?': 'free_time_activity',
    'How do you prefer to recharge after a long day?': 'recharge_preference',
    'In large social gatherings, do you typically feel': 'social_gathering_feelings',
    'How would you describe yourself in conversations?': 'conversation_style',
    'Which type of work environment do you prefer?': 'work_environment_preference',
    'Do you prefer working in': 'work_style_preference',
    'How do you handle uncertainty or change in the workplace?': 'uncertainty_handling',
    'What is your ideal work-life balance?': 'ideal_work_life_balance'
}

# Rename the columns
df.rename(columns=new_column_names, inplace=True)
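
`DataFrame.rename` silently skips mapping keys that do not match any column, so a stray space or a slightly different question wording leaves the original name in place without an error. A quick check like the one below can surface such keys. The toy frame and mapping are invented for this sketch.

```python
import pandas as pd

# Toy frame and mapping, invented for this sketch
toy = pd.DataFrame(columns=["What is your age?", "Other, pls specify"])
mapping = {
    "What is your age?": "age",
    "Other, pls specify ": "technical_skills_other",  # trailing space: no match
}

# Keys absent from the frame are ignored by rename without any error
unmatched = [key for key in mapping if key not in toy.columns]
renamed = toy.rename(columns=mapping)

print(unmatched)              # keys that will not be applied
print(list(renamed.columns))  # columns after renaming
```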

In [15]:
df.head(1)

Unnamed: 0,gender,employment_status,years_experience,education_level,age,transition_reason,desired_field,transition_challenge,challenges_other,current_job_title,...,has_certifications,certifications,free_time_activity,recharge_preference,social_gathering_feelings,conversation_style,work_environment_preference,work_style_preference,uncertainty_handling,ideal_work_life_balance
0,1,"Unemployed, looking for work",4-7 years,Professional certifications,3,Seeking higher income or more security,Tech,Difficulty finding opportunities,The water,Transcriptionist,...,Yes,TRCN,Spending time alone or in quiet reflection,,Spending time alone or in quiet reflection,I mostly listen and speak when necessary,"Flexible, dynamic, and open-ended","Independent, solo work",I adapt quickly and enjoy new challenges,Balanced between work and personal time


In [17]:
df.isnull().sum()

gender                          0
employment_status               0
years_experience               18
education_level                 0
age                             0
transition_reason              34
desired_field                  41
transition_challenge           36
challenges_other               74
current_job_title              20
career_satisfaction            18
primary_motivation             18
aspired_job                    18
technical_skills                8
soft_skills                     8
has_certifications              8
certifications                 52
free_time_activity             18
recharge_preference            20
social_gathering_feelings      18
conversation_style             18
work_environment_preference    18
work_style_preference          18
uncertainty_handling           18
ideal_work_life_balance        18
dtype: int64
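
Many columns above share the same count of 18 missing values, which suggests (but does not prove) that the same respondents skipped a whole section. A sketch of how that could be checked, using a made-up frame with invented values:

```python
import numpy as np
import pandas as pd

# Toy frame, invented for this sketch
toy = pd.DataFrame({
    "free_time_activity": ["alone", np.nan, "friends", np.nan],
    "conversation_style": ["listen", np.nan, "talk", np.nan],
    "gender": [1, 0, 1, 0],
})
section = ["free_time_activity", "conversation_style"]

# Share of missing values per column
missing_share = toy.isnull().mean()

# True for rows where every column of the section is empty
skipped_section = toy[section].isnull().all(axis=1)
```

If `skipped_section.sum()` matches the per-column counts, the gaps come from the same rows, and dropping or separately handling those rows becomes an option.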

In [12]:
# Encode age brackets as ordinal codes (youngest bracket = 0)
df['age'] = df['age'].map({'18-24': 0, '25-34': 1, '35-44': 2, '45-54': 3})

In [16]:
df['years_experience'].unique()

array(['4-7 years', '1-3 years', '10+ years', nan, '8-10 years',
       'Less than 1 year'], dtype=object)

In [14]:
#Encode columns
# Encoding gender
df['gender'] = df['gender'].map({'Male': 0, 'Female': 1})

In [11]:
# Drop the gender column (the frame is `df` and the column was renamed to 'gender')
df.drop(columns=['gender'], inplace=True)

In [6]:

# One-hot encoding employment status
df = pd.get_dummies(df, columns=['employment_status'])

# Normalizing years of professional work experience
# The answers are range labels such as '4-7 years'; passing them straight to
# StandardScaler raises "ValueError: could not convert string to float: '4-7 years'",
# so map them to ordinal codes before scaling
experience_order = {'Less than 1 year': 0, '1-3 years': 1, '4-7 years': 2,
                    '8-10 years': 3, '10+ years': 4}
df['years_experience'] = df['years_experience'].map(experience_order)
scaler = StandardScaler()
df['years_experience'] = scaler.fit_transform(df[['years_experience']]).ravel()

# Encoding highest level of education
df = pd.get_dummies(df, columns=['education_level'])

# Normalizing age (already mapped to ordinal codes in an earlier cell)
df['age'] = scaler.fit_transform(df[['age']]).ravel()

# Handling main reason for transition
df = pd.get_dummies(df, columns=['transition_reason'])

# Handling fields for career transition: one indicator column per field, joined to
# the frame because a multi-column result cannot be assigned to a single column
df = df.join(df['desired_field'].str.get_dummies(sep=', ').add_prefix('field_'))

# Encoding challenges
df = pd.get_dummies(df, columns=['transition_challenge'])

# Encoding current job title as integer codes (missing titles become -1)
df['current_job_title'] = df['current_job_title'].astype('category').cat.codes

# Normalizing satisfaction rating (assumes the 1 to 5 answers were read as numbers)
df['career_satisfaction'] = scaler.fit_transform(df[['career_satisfaction']]).ravel()

# Encoding primary motivation
df = pd.get_dummies(df, columns=['primary_motivation'])

# Keeping the aspirational job title as the prediction target
df['target'] = df['aspired_job']

In [None]:
# Handling technical skills: split the comma-separated answers into lists and
# treat missing answers as an empty list so MultiLabelBinarizer can iterate them
mlb = MultiLabelBinarizer()
tech_lists = df['technical_skills'].apply(lambda s: s.split(', ') if isinstance(s, str) else [])
technical_skills = mlb.fit_transform(tech_lists)
# Prefix the new columns so technical and soft skill names cannot collide in the join
df = df.join(pd.DataFrame(technical_skills, columns=mlb.classes_, index=df.index).add_prefix('tech_'))

# Encoding soft skills
soft_lists = df['soft_skills'].apply(lambda s: s.split(', ') if isinstance(s, str) else [])
soft_skills = mlb.fit_transform(soft_lists)
df = df.join(pd.DataFrame(soft_skills, columns=mlb.classes_, index=df.index).add_prefix('soft_'))

# Encoding professional certifications (missing answers count as 0)
df['has_certifications'] = (df['has_certifications'] == 'Yes').astype(int)

# Handling personality questions (use the earlier mapping logic)
def personality_type(row):
    answer = row['free_time_activity']
    if pd.isna(answer):
        return None  # no answer given, so no label is assigned
    if answer == 'Spending time alone or in quiet reflection':
        return 'introvert'
    return 'extrovert'

df['personality_type'] = df.apply(personality_type, axis=1)

# Final cleanup: drop the raw text columns that were encoded above or are no longer
# needed (columns passed to pd.get_dummies have already been removed by it)
df.drop(columns=['desired_field', 'aspired_job', 'technical_skills', 'soft_skills',
                 'free_time_activity', 'recharge_preference', 'social_gathering_feelings',
                 'conversation_style', 'work_environment_preference', 'work_style_preference',
                 'uncertainty_handling', 'ideal_work_life_balance'], inplace=True)

# Final data preparation
df = df.reset_index(drop=True)
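
Once the features are prepared, a held-out test set is needed before any model is evaluated. The sketch below shows a stratified split with scikit-learn on made-up data; the feature values, labels, and the 80/20 ratio are assumptions for illustration only.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy features and career labels, invented for this sketch
X = pd.DataFrame({"age": range(10)})
y = ["Tech"] * 5 + ["Academics"] * 5

# stratify=y keeps the class proportions the same in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```

Stratification requires at least two examples of every class, so rare aspired careers in a small survey may need to be grouped before splitting.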
