# Career Path Analysis Tool

## Overview
This notebook analyzes different career paths against multiple criteria to help you find the best fit for your skills, preferences, and priorities.

## How It Works
1. **Data Definition**: Defines roles with various attributes (salary, stability, work-life balance, etc.)
2. **Personal Inputs**: Enter your weighted preferences and current skill levels
3. **Scoring Algorithm**: Calculates fit scores based on weighted criteria and skill gaps
4. **Visualization**: Displays results through multiple interactive charts
5. **Ultimate Ranking**: Combines all factors into a final score based on your priorities

## Key Metrics
- **Fit Score**: How well the job matches your personality and preferences
- **Salary**: Gross annual salary in EUR (Portugal market)
- **Learning Gap**: Points representing how much you need to learn (lower is better)
- **Ultimate Score**: Weighted combination of fit, salary, and speed to hire (speed is derived from the learning gap)

## Setup and Imports

In [34]:
# Import required libraries for data manipulation and visualization
import pandas as pd  # Data manipulation and analysis
import plotly.express as px  # Interactive visualizations
import plotly.graph_objects as go  # Advanced plot customization

## Data Definition

This section defines career roles and their attributes, along with tools required for each role.

In [35]:
# Define career roles with their attributes (scale: 1-10, higher is better unless noted)
# Attributes:
# - Low_Syntax_Need: Higher is better (higher = less coding/technical complexity)
# - Stability: Job market stability and security
# - Visual_Output: Amount of visual/dashboard work
# - Work_Life_Balance: Balance between work and personal life
# - Innovation: Opportunity for creative and innovative work
# - Remote_Friendly: Ability to work remotely
# - Math_Intensity: Level of mathematical/analytical work
# - Entry_Difficulty: Difficulty of entering the field (lower is easier)
# - Customer_Facing: Level of interaction with customers/stakeholders
# - Avg_Junior_Salary_EUR: Average gross annual salary for junior positions
data = {
    'Role': [
        'Data Analyst', 'Data Engineer', 'BI Developer', 'Database Admin (DBA)', 
        'Crypto Developer', 'Product Manager', 'Analytics Engineer', 'Technical Writer'
    ],
    'Low_Syntax_Need': [8, 3, 7, 2, 1, 6, 7, 9],
    'Stability':       [8, 7, 8, 9, 1, 7, 6, 4],
    'Visual_Output':   [9, 1, 10, 1, 1, 8, 6, 2],
    'Work_Life_Balance':[7, 6, 7, 8, 3, 5, 6, 8],
    'Innovation':      [5, 7, 4, 2, 9, 6, 7, 5],
    'Remote_Friendly': [7, 6, 6, 3, 8, 5, 6, 9],
    'Math_Intensity':  [4, 5, 3, 4, 8, 2, 5, 2],
    'Entry_Difficulty':[3, 6, 4, 6, 9, 8, 6, 4],
    'Customer_Facing': [7, 3, 5, 2, 3, 9, 4, 1],
    'Avg_Junior_Salary_EUR': [26000, 32000, 27000, 30000, 40000, 38000, 31000, 22000]
}

# Create DataFrame from the data dictionary
df = pd.DataFrame(data)

# Define common tools/technologies required for each role
tools_data = {
    'Data Analyst': ['SQL', 'Excel', 'PowerBI', 'Python (Analysis)'],
    'Data Engineer': ['Python (Scripting)', 'SQL', 'AWS/Azure', 'Spark', 'Airflow'],
    'BI Developer': ['PowerBI', 'DAX', 'SQL', 'Tableau'],
    'Database Admin (DBA)': ['Oracle', 'SQL Server', 'PL/SQL'],
    'Crypto Developer': ['Solidity', 'React', 'Web3'],
    'Product Manager': ['Jira', 'Confluence', 'Metrics', 'SQL'],
    'Analytics Engineer': ['SQL', 'dbt', 'Python (Analysis)', 'Git'],
    'Technical Writer': ['English', 'Markdown', 'Git']
}

# Map tools to each role in the DataFrame
df['Common_Tools'] = df['Role'].map(tools_data)

# Define tool profiles with pain (difficulty) and reward (satisfaction) scores
# Pain: How difficult/complex the tool is to learn (scale: 0-10)
# Reward: How satisfying/rewarding it is to work with (scale: 0-10)
tool_profile = {
    'SQL': {'pain': 2, 'reward': 0}, 'Oracle': {'pain': 4, 'reward': 0}, 'SQL Server': {'pain': 4, 'reward': 0},
    'PL/SQL': {'pain': 5, 'reward': 0}, 'dbt': {'pain': 4, 'reward': 2}, 'Python (Analysis)': {'pain': 3, 'reward': 1},
    'Python (Scripting)': {'pain': 8, 'reward': 0}, 'Spark': {'pain': 7, 'reward': 0}, 'Solidity': {'pain': 9, 'reward': 0},
    'React': {'pain': 9, 'reward': 2}, 'Excel': {'pain': 1, 'reward': 8}, 'PowerBI': {'pain': 2, 'reward': 10},
    'Tableau': {'pain': 2, 'reward': 10}, 'DAX': {'pain': 5, 'reward': 8}, 'AWS/Azure': {'pain': 7, 'reward': 0},
    'Airflow': {'pain': 7, 'reward': 0}, 'Web3': {'pain': 9, 'reward': 0}, 'Jira': {'pain': 0, 'reward': 0},
    'Confluence': {'pain': 0, 'reward': 0}, 'Git': {'pain': 4, 'reward': 0}, 'English': {'pain': 0, 'reward': 0},
    'Markdown': {'pain': 0, 'reward': 0}, 'Metrics': {'pain': 2, 'reward': 2}
}
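
The scoring step falls back to a neutral profile for any tool that is missing from `tool_profile`, so a misspelled tool name would silently go unscored. Below is a minimal sketch of a consistency check. The helper `find_unprofiled_tools` is hypothetical (it is not part of the notebook), and the sample dictionaries are a small subset of the data above with one deliberate misspelling added for illustration.

```python
def find_unprofiled_tools(tools_by_role, profiles):
    """Return {role: [tools with no entry in profiles]} for roles that have gaps."""
    missing = {}
    for role, tools in tools_by_role.items():
        unknown = [tool for tool in tools if tool not in profiles]
        if unknown:
            missing[role] = unknown
    return missing


# Small sample copied from the notebook's dictionaries. 'Power BI' (with a
# space) is a deliberate, hypothetical misspelling to show what the check catches.
sample_tools = {
    'Data Analyst': ['SQL', 'Excel', 'Power BI'],
    'Technical Writer': ['English', 'Markdown', 'Git'],
}
sample_profiles = {
    'SQL': {'pain': 2, 'reward': 0},
    'Excel': {'pain': 1, 'reward': 8},
    'PowerBI': {'pain': 2, 'reward': 10},
    'English': {'pain': 0, 'reward': 0},
    'Markdown': {'pain': 0, 'reward': 0},
    'Git': {'pain': 4, 'reward': 0},
}

unprofiled = find_unprofiled_tools(sample_tools, sample_profiles)
print(unprofiled)
```

In the notebook itself, the same helper could be called as `find_unprofiled_tools(tools_data, tool_profile)`. An empty result would mean every tool has a pain/reward profile.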

## Personal Inputs

Customize these values to match your personal preferences and current skill levels.

In [36]:
# --- CRITERIA WEIGHTS (Scale: 1-10, higher = more important) ---
# Adjust these weights based on what matters most to you
my_weights = {
    'Low_Syntax_Need': 6,     # Preference for less coding/technical work
    'Stability': 8,           # Importance of job security
    'Visual_Output': 8,       # Desire for visual/dashboard work
    'Work_Life_Balance': 8,   # Importance of work-life balance
    'Innovation': 5,          # Desire for creative/innovative work
    'Remote_Friendly': 8,     # Importance of remote work options
    'Math_Intensity': 7,      # Comfort with mathematical/analytical work
    'Entry_Difficulty': 3,    # Scored positively: a higher weight favors harder-to-enter roles, so keep it low if you want easier entry
    'Customer_Facing': 5      # Comfort with customer interactions
}

# Minimum acceptable annual salary in EUR
min_acceptable_salary = 24000

# --- CURRENT SKILL LEVELS (Scale: 0-10, higher = more skilled) ---
# Rate your current proficiency in each tool/technology
my_current_skills = {
    'SQL': 5, 'Excel': 4, 'PowerBI': 0, 'Python (Analysis)': 4, 'Python (Scripting)': 3,
    'Tableau': 0, 'DAX': 0, 'AWS/Azure': 1, 'Spark': 0, 'Airflow': 2, 'Git': 4, 'English': 8, 
    'Oracle': 0, 'SQL Server': 2, 'PL/SQL': 0, 'Solidity': 0, 'React': 0, 'Web3': 0, 
    'Jira': 2, 'Confluence': 2, 'Markdown': 4, 'Metrics': 1
}

## Scoring Algorithm

This section calculates a fit score and a learning gap for each role, then filters out roles that fall below the minimum acceptable salary.

In [37]:
# Get list of criteria columns (excluding Role, Tools, and Salary)
criteria_cols = [col for col in df.columns if col not in ['Common_Tools', 'Role', 'Avg_Junior_Salary_EUR']]

# Initialize lists to store calculated scores
fit_scores = []      # Overall job compatibility score
salary_checks = []   # Whether salary meets minimum requirement
learning_gaps = []   # Total learning gap points for each role

# Calculate sensitivity multipliers based on personal weights
# These adjust how much tool pain/reward affects the final score
syntax_sens = my_weights['Low_Syntax_Need'] * 0.5  # Sensitivity to tool complexity
visual_sens = my_weights['Visual_Output'] * 0.2     # Sensitivity to visual rewards

# Required skill level (out of 10) for job readiness
required_level = 8

# Calculate scores for each role
for role in df.index:
    # Calculate base job score from weighted criteria
    job_score = sum(df.loc[role, col] * my_weights[col] for col in criteria_cols)
    
    # Get tools required for this role
    tools = df.loc[role, 'Common_Tools']
    
    # Initialize adjustment and gap scores
    adj_score = 0  # Adjustment based on tool pain/reward
    gap_score = 0  # Total learning gap points
    
    # Evaluate each tool required for the role
    for tool in tools:
        # Get tool profile (default to neutral if not found)
        prof = tool_profile.get(tool, {'pain': 0, 'reward': 0})
        
        # Adjust score based on tool pain (negative) and reward (positive)
        adj_score += (prof['reward'] * visual_sens) - (prof['pain'] * syntax_sens)
        
        # Calculate learning gap (how much you need to learn)
        current_level = my_current_skills.get(tool, 0)
        gap = max(0, required_level - current_level)
        gap_score += gap
            
    # Store calculated scores
    fit_scores.append(job_score + adj_score)
    salary_checks.append(df.loc[role, 'Avg_Junior_Salary_EUR'] >= min_acceptable_salary)
    learning_gaps.append(gap_score)

# Add calculated scores to the DataFrame
df['Fit_Score'] = fit_scores
df['Salary_Passes'] = salary_checks
df['Learning_Gap_Points'] = learning_gaps

# Filter to only include roles that meet the salary requirement
df_real = df[df['Salary_Passes']].copy()

# Sort by fit score (descending) to show best matches first
df_real = df_real.sort_values(by='Fit_Score', ascending=False)
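
As a worked example of the loop above, the sketch below recomputes the fit score for a single role by hand. The attribute values, weights, and tool profiles are copied from the notebook for 'Data Analyst'. The variable names `attributes`, `tool_scores`, `base_score`, and `adjustment` are illustrative and do not appear in the notebook. Note that all nine attributes add to the score, including `Math_Intensity` and `Entry_Difficulty`.

```python
# Values copied from the notebook for the 'Data Analyst' role.
attributes = {
    'Low_Syntax_Need': 8, 'Stability': 8, 'Visual_Output': 9,
    'Work_Life_Balance': 7, 'Innovation': 5, 'Remote_Friendly': 7,
    'Math_Intensity': 4, 'Entry_Difficulty': 3, 'Customer_Facing': 7,
}
weights = {
    'Low_Syntax_Need': 6, 'Stability': 8, 'Visual_Output': 8,
    'Work_Life_Balance': 8, 'Innovation': 5, 'Remote_Friendly': 8,
    'Math_Intensity': 7, 'Entry_Difficulty': 3, 'Customer_Facing': 5,
}
# (pain, reward) for SQL, Excel, PowerBI, Python (Analysis)
tool_scores = [(2, 0), (1, 8), (2, 10), (3, 1)]

syntax_sens = weights['Low_Syntax_Need'] * 0.5  # 3.0
visual_sens = weights['Visual_Output'] * 0.2    # about 1.6

# Base score: every attribute multiplied by its weight, then summed.
base_score = sum(attributes[name] * weights[name] for name in attributes)

# Tool adjustment: rewards add to the score, pain subtracts from it.
adjustment = sum(reward * visual_sens - pain * syntax_sens
                 for pain, reward in tool_scores)

fit_score = base_score + adjustment
print(base_score, round(adjustment, 1), round(fit_score, 1))  # 393 6.4 399.4
```

The result, 399.4, matches the Fit_Score reported for Data Analyst in the notebook's data table.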

## Visualization 1: The Money

Display gross annual salary for each qualifying role, with a reference line for your minimum acceptable salary.

In [38]:
# Create bar chart showing salaries for qualifying roles
fig_money = px.bar(
    df_real,
    x='Role',
    y='Avg_Junior_Salary_EUR',
    title="1. The Money: Gross Annual Salary (EUR)",
    text='Avg_Junior_Salary_EUR',
    color='Avg_Junior_Salary_EUR',
    color_continuous_scale='Blues'  # Blue color scale for professional look
)

# Add horizontal reference line for minimum acceptable salary
fig_money.add_hline(
    y=min_acceptable_salary, 
    line_dash="dash", 
    line_color="red", 
    annotation_text="Your Minimum", 
    annotation_position="top right"
)

# Format text labels and axis titles
fig_money.update_traces(texttemplate='%{text:,.0f}€', textposition='outside')
fig_money.update_layout(yaxis_title="Gross Salary (€)", xaxis_title="")
fig_money.show()

## Visualization 2: The Love

Display fit score for each role, indicating how well each job matches your personality and preferences.

In [39]:
# Create bar chart showing fit scores (job compatibility)
fig_love = px.bar(
    df_real,
    x='Role',
    y='Fit_Score',
    title="2. The Love: How well does this job fit your personality?",
    text='Fit_Score',
    color='Fit_Score',
    color_continuous_scale='RdYlGn'  # Red-Yellow-Green: Red=Low, Green=High
)

# Format text labels and axis titles
fig_love.update_traces(texttemplate='%{text:.0f}', textposition='outside')
fig_love.update_layout(yaxis_title="Compatibility Points", xaxis_title="")
fig_love.show()

## Visualization 3: The Effort

Display learning gap points for each role. Lower values indicate less learning required to become job-ready.

In [40]:
# Sort by learning gap (ascending) to show easiest paths first
df_effort = df_real.sort_values(by='Learning_Gap_Points', ascending=True)

# Create bar chart showing learning gap points
fig_effort = px.bar(
    df_effort,
    x='Role',
    y='Learning_Gap_Points',
    title="3. The Effort: How much do you need to learn? (Lower is Better)",
    text='Learning_Gap_Points',
    color='Learning_Gap_Points',
    color_continuous_scale='Viridis_r'  # Reversed Viridis: Yellow=Low, Purple=High
)

# Format text labels and axis titles
fig_effort.update_traces(texttemplate='%{text:.0f} pts', textposition='outside')
fig_effort.update_layout(yaxis_title="Study Effort Points", xaxis_title="")
fig_effort.show()

## Visualization 4: The Strategy Matrix

A scatter plot showing the relationship between learning effort (X-axis) and job compatibility (Y-axis). Bubble size represents salary.

In [41]:
# Create scatter plot: Effort (X) vs Reward (Y)
# - X-axis: Learning effort required (lower is better)
# - Y-axis: Job compatibility score (higher is better)
# - Bubble size: Salary (larger = higher salary)
# - Color: Role (for identification)
fig_strategy = px.scatter(
    df_real,
    x='Learning_Gap_Points', 
    y='Fit_Score',           
    size='Avg_Junior_Salary_EUR', 
    color='Role',
    title="4. The Strategy Matrix: Effort (X) vs Reward (Y)",
    labels={'Learning_Gap_Points': 'Learning Effort Required', 'Fit_Score': 'Job Compatibility'},
    hover_data={'Avg_Junior_Salary_EUR': ':.0f', 'Role': True},
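    trendline="ols",  # Trend line for reference (requires the statsmodels package)
    trendline_scope="overall"  # Fit one line across all roles; the default fits one per color group, i.e. per single point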
)

# Hide legend for cleaner look
fig_strategy.update_layout(showlegend=False)
fig_strategy.show()

# Display summary table
print("--- DATA TABLE ---")
print(df_real[['Role', 'Fit_Score', 'Avg_Junior_Salary_EUR', 'Learning_Gap_Points']].to_string(index=False))

--- DATA TABLE ---
                Role  Fit_Score  Avg_Junior_Salary_EUR  Learning_Gap_Points
        Data Analyst      399.4                  26000                   19
        BI Developer      379.8                  27000                   27
     Product Manager      340.2                  38000                   22
  Analytics Engineer      307.8                  31000                   19
Database Admin (DBA)      207.0                  30000                   22
       Data Engineer      188.0                  32000                   29
    Crypto Developer      175.2                  40000                   24


## Ultimate Score Calculator

Combine all factors into a single "Ultimate Score" based on your personal priorities. Adjust priority weights to see how different priorities affect the ranking.

In [44]:
# --- STEP 1: Define your LIFE PRIORITIES ---
# Adjust these 3 numbers. They must add up to 1.0 (100%).
# Example: If Fit is 0.5, that counts for 50% of your decision.

priority_fit = 0.60   # How much do you need to LIKE the job?
priority_money = 0.20 # How much do you need the MAX salary?
priority_speed = 0.20 # How much do you need to get hired FAST? 

# --- STEP 2: Normalize the Data (Scale 0-1) ---

def normalize(series, invert=False):
    """
    Normalize a pandas Series to the range [0, 1].
    
    Parameters:
    -----------
    series : pd.Series
        The series to normalize
    invert : bool, optional (default=False)
        If True, invert the normalization so lower values get higher scores.
        This is useful for metrics where lower is better (e.g., learning gap).
    
    Returns:
    --------
    pd.Series
        Normalized series with values in [0, 1]
    """
    min_val = series.min()
    max_val = series.max()
    range_val = max_val - min_val
    
    # Avoid division by zero if all values are the same
    if range_val == 0:
        return pd.Series([0.5] * len(series), index=series.index)
    
    if invert:
        # Invert means Lower Value = Higher Score (Good for Learning Gap)
        # Formula: (Max - Current) / Range
        return (max_val - series) / range_val
    else:
        # Standard: Higher Value = Higher Score (Good for Fit & Salary)
        # Formula: (Current - Min) / Range
        return (series - min_val) / range_val

# Normalize each metric to the 0-1 scale
df_real['Fit_Norm'] = normalize(df_real['Fit_Score'], invert=False)      # Higher is better
df_real['Salary_Norm'] = normalize(df_real['Avg_Junior_Salary_EUR'], invert=False)  # Higher is better
df_real['Speed_Norm'] = normalize(df_real['Learning_Gap_Points'], invert=True)  # Lower is better

# --- STEP 3: Calculate the Ultimate Score ---
# Weighted combination of normalized scores
df_real['Ultimate_Score'] = (
    (df_real['Fit_Norm'] * priority_fit) +
    (df_real['Salary_Norm'] * priority_money) +
    (df_real['Speed_Norm'] * priority_speed)
)

# Sort by Ultimate Score (descending)
df_final = df_real.sort_values(by='Ultimate_Score', ascending=False)

# Display the final ranking table
print(f"--- THE ULTIMATE RANKING (Priorities: Fit {priority_fit*100}%, Money {priority_money*100}%, Speed {priority_speed*100}%) ---")
cols = ['Role', 'Ultimate_Score', 'Fit_Score', 'Avg_Junior_Salary_EUR', 'Learning_Gap_Points']
print(df_final[cols].to_string(index=False))

--- THE ULTIMATE RANKING (Priorities: Fit 60.0%, Money 20.0%, Speed 20.0%) ---
                Role  Ultimate_Score  Fit_Score  Avg_Junior_Salary_EUR  Learning_Gap_Points
        Data Analyst        0.800000      399.4                  26000                   19
     Product Manager        0.752999      340.2                  38000                   22
  Analytics Engineer        0.626290      307.8                  31000                   19
        BI Developer        0.601833      379.8                  27000                   27
    Crypto Developer        0.300000      175.2                  40000                   24
Database Admin (DBA)        0.282245      207.0                  30000                   22
       Data Engineer        0.119969      188.0                  32000                   29
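
To see how sensitive the ranking is to the three priorities, the sketch below recomputes the Ultimate Score in plain Python from the values in the data table. It is an illustration under stated assumptions, not the notebook's own code: `min_max` and `rank_roles` are hypothetical helpers that mirror `normalize` and the weighted sum, and the check that the priorities add up to 1.0 is an addition that the notebook does not enforce.

```python
import math

# (role, fit score, junior salary in EUR, learning gap) from the data table.
ROLES = [
    ('Data Analyst', 399.4, 26000, 19),
    ('BI Developer', 379.8, 27000, 27),
    ('Product Manager', 340.2, 38000, 22),
    ('Analytics Engineer', 307.8, 31000, 19),
    ('Database Admin (DBA)', 207.0, 30000, 22),
    ('Data Engineer', 188.0, 32000, 29),
    ('Crypto Developer', 175.2, 40000, 24),
]


def min_max(values, invert=False):
    """Scale values to [0, 1]; with invert=True the smallest value scores 1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.5] * len(values)  # same fallback as the notebook's normalize()
    if invert:
        return [(high - v) / (high - low) for v in values]
    return [(v - low) / (high - low) for v in values]


def rank_roles(roles, w_fit, w_money, w_speed):
    """Return (role, ultimate score) pairs sorted best first."""
    if not math.isclose(w_fit + w_money + w_speed, 1.0):
        raise ValueError("priorities must add up to 1.0")
    fit = min_max([r[1] for r in roles])
    money = min_max([r[2] for r in roles])
    speed = min_max([r[3] for r in roles], invert=True)
    scored = [
        (r[0], w_fit * f + w_money * m + w_speed * s)
        for r, f, m, s in zip(roles, fit, money, speed)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


fit_first = rank_roles(ROLES, 0.60, 0.20, 0.20)    # the notebook's priorities
money_first = rank_roles(ROLES, 0.20, 0.60, 0.20)  # hypothetical alternative
print(fit_first[0])
print(money_first[0])
```

With the notebook's priorities, Data Analyst leads with a score of 0.8, as in the ranking above. Shifting the weight toward salary moves Product Manager to the top, which shows how strongly the outcome depends on the chosen priorities.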


## Final Decision

The final visualization shows the Ultimate Score for each role, sorted from best to worst, with the winner labeled above its bar.

In [45]:
# Create final bar chart showing Ultimate Scores
fig_winner = px.bar(
    df_final,
    x='Role',
    y='Ultimate_Score',
    title="FINAL DECISION: The Best Career Path For You",
    text='Ultimate_Score',
    color='Ultimate_Score',
    color_continuous_scale='Turbo',  # High contrast color scale
    labels={'Ultimate_Score': 'Ultimate Score'}  # A relative 0-1 score, not a probability
)

# Format text labels and axis titles
fig_winner.update_traces(texttemplate='%{text:.2f}', textposition='outside')
fig_winner.update_layout(yaxis_title="Ultimate Score (0-1)", xaxis_title="", showlegend=False)

# Add text annotation for the winner
winner_role = df_final.iloc[0]['Role']
winner_score = df_final.iloc[0]['Ultimate_Score']
fig_winner.add_annotation(
    x=winner_role,  # Anchor to the winner's category rather than a positional index
    y=winner_score + 0.05,
    text=f"Winner: {winner_role}",
    showarrow=False,
    font=dict(size=16, color="black")
)

fig_winner.show()

## Summary and Next Steps

### How to Use This Analysis

1. **Review the Winner**: The top role in the final chart is your best match based on current priorities

2. **Adjust Priorities**: Go back to the "Ultimate Score Calculator" section and change the `priority_*` values to see how different priorities affect the ranking

3. **Update Skills**: If you learn new skills, update `my_current_skills` in the "Personal Inputs" section and re-run the analysis

4. **Change Weights**: Modify `my_weights` in the "Personal Inputs" section to reflect different preferences

### Understanding the Metrics

- **Fit Score**: Combines all job attributes weighted by your preferences, plus an adjustment for the pain and reward of each required tool
- **Learning Gap**: Total points needed to reach proficiency (8/10) in all required tools
- **Ultimate Score**: Weighted average of normalized fit, salary, and speed scores
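
The learning gap can be recomputed by hand to see how a skill update would change it. The sketch below assumes the same proficiency threshold of 8 and uses the Data Analyst tools and current skill levels from the notebook. The function `learning_gap` and the `skills_later` scenario are hypothetical.

```python
REQUIRED_LEVEL = 8  # same job-readiness threshold as the notebook


def learning_gap(tools, skills, required_level=REQUIRED_LEVEL):
    """Sum the shortfall below required_level; tools missing from skills count as 0."""
    return sum(max(0, required_level - skills.get(tool, 0)) for tool in tools)


analyst_tools = ['SQL', 'Excel', 'PowerBI', 'Python (Analysis)']
skills_now = {'SQL': 5, 'Excel': 4, 'PowerBI': 0, 'Python (Analysis)': 4}

# Hypothetical future state: PowerBI raised from 0 to 5 after some study.
skills_later = {**skills_now, 'PowerBI': 5}

gap_now = learning_gap(analyst_tools, skills_now)
gap_later = learning_gap(analyst_tools, skills_later)
print(gap_now, gap_later)  # 19 14
```

Skill levels above the threshold do not reduce the gap further, because each tool's shortfall is clipped at zero.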

### Tips for Career Planning

- Focus on roles with high fit scores and low learning gaps for quick wins
- Consider roles with high salaries if financial goals are a priority
- Use the Strategy Matrix to find roles in the "sweet spot" (low effort, high reward)
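
One possible way to make the "sweet spot" concrete is to flag roles whose fit score is at or above the median and whose learning gap is at or below the median. The median cut-offs are an assumption for illustration (the notebook does not define a threshold), and the values are copied from the data table.

```python
from statistics import median

# (role, fit score, learning gap) copied from the notebook's data table.
ROLE_STATS = [
    ('Data Analyst', 399.4, 19),
    ('BI Developer', 379.8, 27),
    ('Product Manager', 340.2, 22),
    ('Analytics Engineer', 307.8, 19),
    ('Database Admin (DBA)', 207.0, 22),
    ('Data Engineer', 188.0, 29),
    ('Crypto Developer', 175.2, 24),
]

# Assumed cut-offs: the median of each metric (not defined in the notebook).
fit_cutoff = median(fit for _, fit, _ in ROLE_STATS)
gap_cutoff = median(gap for _, _, gap in ROLE_STATS)

# Keep roles with high fit (reward) and low learning gap (effort).
sweet_spot = [
    role for role, fit, gap in ROLE_STATS
    if fit >= fit_cutoff and gap <= gap_cutoff
]
print(sweet_spot)
```

With the current inputs this selects Data Analyst, Product Manager, and Analytics Engineer. A stricter cut-off, such as the top third of each metric, would narrow the list.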