# Salary Analysis

### In this notebook, we analyze which factors affect salaries in the United States. These factors include Age, Gender, Years of Experience, Education Level, and Race.

In [1]:
# Import Libraries
from dash import Dash, html, dcc
from dash.dependencies import Input, Output
import pandas as pd
import plotly.express as px

In [3]:
# Read in the updated CSV file
usa_salary_data = pd.read_csv("usa_salary_df.csv")
usa_salary_data.head()

Unnamed: 0,Age,Gender,Education Level,Job Title,Years of Experience,Salary,Country,Race,Senior
0,28,Female,Master's Degree,Data Analyst,3,65000,USA,Hispanic,0
1,36,Female,Bachelor's Degree,Sales Associate,7,60000,USA,Hispanic,0
2,52,Male,Master's Degree,Director,20,200000,USA,Asian,0
3,29,Male,Bachelor's Degree,Marketing Analyst,2,55000,USA,Hispanic,0
4,42,Female,Master's Degree,Product Manager,12,120000,USA,Asian,0


In [25]:
# Create a Dash app with bar and pie charts that compare each selected column to Salary
dash_df = usa_salary_data

app = Dash(__name__)

app.layout = html.Div([
    dcc.Dropdown(
        id='column-dropdown',
        options=[
            {'label': 'Age', 'value': 'Age'},
            {'label': 'Gender', 'value': 'Gender'},
            {'label': 'Education Level', 'value': 'Education Level'},
            {'label': 'Years of Experience', 'value': 'Years of Experience'},
            {'label': 'Race', 'value': 'Race'}
        ],
        value='Age'
    ),
    dcc.Graph(id='salary-bar-graph')
])

@app.callback(
    Output('salary-bar-graph', 'figure'),
    Input('column-dropdown', 'value')
)
def update_graph(selected_column):
    if selected_column == 'Age':
        # Mean salary for each (Age, Gender) pair, drawn as grouped bars
        age_comparison = dash_df.groupby(['Age', 'Gender'])['Salary'].mean().reset_index()
        fig = px.bar(age_comparison, x='Age', y='Salary', color='Gender', barmode='group',
                     title='Average Salary by Age')
                
    elif selected_column == 'Gender':
        # px.pie sums Salary per group, so each slice also reflects how many rows the group has
        fig = px.pie(dash_df, values='Salary', names='Gender', title='Share of Total Salary by Gender')

    elif selected_column == 'Race':
        fig = px.pie(dash_df, values='Salary', names='Race', title='Share of Total Salary by Race')

    elif selected_column == 'Education Level':
        # Mean salary for each (Education Level, Race) pair, drawn as grouped bars
        education_comparison = dash_df.groupby(['Education Level', 'Race'])['Salary'].mean().reset_index()
        fig = px.bar(education_comparison, x='Education Level', y='Salary', color='Race', barmode='group',
                     title='Average Salary by Education Level')
    
    else:
        # Fallback (Years of Experience): mean salary per value of the selected column
        salary_comparison = dash_df.groupby(selected_column)['Salary'].mean().reset_index()

        fig = px.bar(salary_comparison, x=selected_column, y='Salary',
                     title=f'Average Salary by {selected_column}')
    return fig

if __name__ == '__main__':
    # app.run is the current entry point; app.run_server is deprecated in recent Dash releases
    app.run(debug=True)

# Conclusion

# - Based on the data, Age is a major factor in salary. The difference between the average salary of a 25-year-old and a 50-year-old is almost 150,000 dollars.

# - Gender is also a major factor in salary. In this dataset, males are paid 15% more than females.

# - Education Level is another important factor in salary. Workers with PhDs are paid well over 100,000 dollars more than workers with high school diplomas.

# - Workers with more years of experience are paid far more than those with little experience. The jump from 0 to 15 years of experience is over 100,000 dollars.

# - Based on this data, Race is not a major factor in salary. However, this dataset contains only around 1,400 entries, so a larger dataset could produce different results.
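
The percentage and dollar gaps quoted in the conclusions can be checked directly from group means instead of being read off the charts. The sketch below is one possible way to do that. The helper name `mean_salary_gap` and the toy rows are hypothetical (the rows are invented for illustration and are not taken from `usa_salary_df.csv`); only the column names `Gender` and `Salary` come from the notebook. Note that a pie chart built with `values='Salary'` shows each group's share of the summed salary, which depends on how many rows each group has, so a comparison of means may give a different picture.

```python
import pandas as pd


def mean_salary_gap(df, group_col, high, low, salary_col="Salary"):
    """Percent by which the mean salary of group `high` exceeds that of group `low`."""
    means = df.groupby(group_col)[salary_col].mean()
    return (means[high] - means[low]) / means[low] * 100


# Toy rows invented for illustration only (assumption: not real dataset values).
toy = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female"],
    "Salary": [100000, 150000, 90000, 110000],
})

gap = mean_salary_gap(toy, "Gender", "Male", "Female")
print(f"Mean salary gap: {gap:.1f}%")  # Mean salary gap: 25.0%
```

On the notebook's own data, the same helper could be called as `mean_salary_gap(usa_salary_data, "Gender", "Male", "Female")`, and with `"Education Level"` or `"Race"` as the grouping column for the other comparisons, provided the group labels passed in match the values in that column.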