
# Salary Comparison Chart

This notebook creates a bar chart to compare the average salaries of Data Analysts and Business Analysts in the US.



## Data Preparation

We start by defining the data: two parallel lists, one holding the role names and one holding the corresponding average salaries.


In [None]:

import matplotlib.pyplot as plt

# Data preparation
roles = ['Data Analyst', 'Business Analyst']
salaries = [82000, 93043]
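
As a quick sanity check before plotting, the gap between the two averages can be computed directly from these lists. This is a minimal sketch; the names `salary_by_role`, `gap`, and `gap_pct` are illustrative and not part of the original notebook.

```python
roles = ['Data Analyst', 'Business Analyst']
salaries = [82000, 93043]

# Pair each role with its salary so the two parallel lists stay in sync
salary_by_role = dict(zip(roles, salaries))

# Absolute and relative gap, measured against the Data Analyst average
gap = salary_by_role['Business Analyst'] - salary_by_role['Data Analyst']
gap_pct = gap / salary_by_role['Data Analyst'] * 100

print(f'Gap: ${gap:,} ({gap_pct:.1f}% above the Data Analyst average)')
```

Knowing the size of the gap helps when judging whether the y-axis range chosen later exaggerates or flattens the difference.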



## Initial Plot Creation

Next, we create an initial bar chart to visualize the salaries.


In [None]:

# Creating the initial bar chart
plt.figure(figsize=(10, 6))
plt.bar(roles, salaries, color=['blue', 'green'])
plt.xlabel('Role')
plt.ylabel('Average Salary ($)')
plt.title('Average Salaries of Data and Business Analysts in the US')
plt.ylim(0, 100000)  # Setting y-axis limit
plt.grid(axis='y')
plt.show()



## Updated Plot

The updated plot hides the four axis spines and the grid, adds a data label above each bar, enlarges the axis labels, title, and x-axis tick labels, and switches the bar colors to light blue and pink.


In [None]:

# Colors for the bars
colors = ['lightblue', 'pink']

# Creating the bar chart with updated specifications
plt.figure(figsize=(10, 6))
bars = plt.bar(roles, salaries, color=colors)

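# Adding data labels, centered above each bar
for bar in bars:
    yval = bar.get_height()
    # Pass the label as a string, with a thousands separator for readability
    plt.text(bar.get_x() + bar.get_width() / 2, yval, f'{int(yval):,}',
             va='bottom', ha='center', fontsize=12)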

# Removing axis lines (spines) and grid
ax = plt.gca()
for side in ('top', 'right', 'left', 'bottom'):
    ax.spines[side].set_visible(False)
plt.grid(False)

# Increasing font size of axis labels and title
plt.xlabel('Role', fontsize=14)
plt.ylabel('Average Salary ($)', fontsize=14)
plt.title('Average Salaries of Data and Business Analysts in the US', fontsize=16)
plt.xticks(fontsize=14)
plt.ylim(0, 100000)
plt.show()
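
The manual `plt.text` loop above can also be written with `Axes.bar_label`, which is available in Matplotlib 3.4 and later. The following is a sketch of that alternative using the object-oriented API, assuming a recent Matplotlib version; the dollar-formatted labels and the `padding=3` value are illustrative choices, not requirements from the original notebook.

```python
import matplotlib.pyplot as plt

roles = ['Data Analyst', 'Business Analyst']
salaries = [82000, 93043]

fig, ax = plt.subplots(figsize=(10, 6))
bars = ax.bar(roles, salaries, color=['lightblue', 'pink'])

# bar_label places one label per bar; padding is the gap in points
labels = [f'${s:,}' for s in salaries]
texts = ax.bar_label(bars, labels=labels, padding=3, fontsize=12)

# Hide every spine in one loop instead of four separate calls
for spine in ax.spines.values():
    spine.set_visible(False)

ax.set_ylim(0, 100000)
```

In a notebook the figure renders when the cell finishes; in a script, call `plt.show()` afterwards.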



## Final Plot with Y-Axis Units in 'k'

In the final step, we format the y-axis tick labels in thousands with a 'k' suffix (for example, 80k) and increase the data label font size from 12 to 14.


In [None]:

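def format_salary(value, tick_number):
    """Format a y-axis tick value in thousands, e.g. 80000 -> '80k'.

    FuncFormatter passes the tick position as the second argument; it is unused here.
    """
    return f'{int(value / 1000)}k'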

plt.figure(figsize=(10, 6))
bars = plt.bar(roles, salaries, color=colors)

# Adding data labels with increased font size
for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width() / 2, yval, f'{yval/1000:.0f}k',
             va='bottom', ha='center', fontsize=14)

# Removing axis lines (spines) and grid
ax = plt.gca()
for side in ('top', 'right', 'left', 'bottom'):
    ax.spines[side].set_visible(False)
plt.grid(False)

# Setting labels, title, and y-axis format
plt.xlabel('Role', fontsize=14)
plt.ylabel('Average Salary ($)', fontsize=14)
plt.title('Average Salaries of Data and Business Analysts in the US', fontsize=16)
plt.xticks(fontsize=14)
plt.gca().yaxis.set_major_formatter(plt.FuncFormatter(format_salary))
plt.ylim(0, 100000)
plt.show()
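
Because the three cells repeat most of the same setup, the final version could be wrapped in a reusable function. The sketch below is one possible arrangement, not the notebook's original structure: `format_thousands` and `plot_salaries` are hypothetical names, and the formatter is imported from `matplotlib.ticker` rather than accessed through `plt`.

```python
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def format_thousands(value, pos=None):
    """Format a number in thousands with a 'k' suffix, rounded to an integer."""
    return f'{value / 1000:.0f}k'


def plot_salaries(roles, salaries, colors):
    """Draw the final salary chart and return (fig, ax) for further tweaks."""
    fig, ax = plt.subplots(figsize=(10, 6))
    bars = ax.bar(roles, salaries, color=colors)

    # Data labels, centered above each bar
    for bar in bars:
        height = bar.get_height()
        ax.text(bar.get_x() + bar.get_width() / 2, height,
                format_thousands(height),
                va='bottom', ha='center', fontsize=14)

    # Remove all spines and the grid
    for spine in ax.spines.values():
        spine.set_visible(False)
    ax.grid(False)

    ax.set_xlabel('Role', fontsize=14)
    ax.set_ylabel('Average Salary ($)', fontsize=14)
    ax.set_title('Average Salaries of Data and Business Analysts in the US',
                 fontsize=16)
    ax.tick_params(axis='x', labelsize=14)
    ax.yaxis.set_major_formatter(FuncFormatter(format_thousands))
    ax.set_ylim(0, 100000)
    return fig, ax


fig, ax = plot_salaries(['Data Analyst', 'Business Analyst'],
                        [82000, 93043],
                        ['lightblue', 'pink'])
```

Returning `fig` and `ax` lets a caller save the chart with `fig.savefig(...)` or adjust it further without editing the function.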
