skytska/Python_Data_Project

The Analysis

1. What are the most demanded skills for the TOP 3 most popular data roles?

To find the most demanded skills for the top 3 most popular data roles, I filtered the positions down to the most popular ones and got the top 5 skills for each of these roles. This query highlights the most popular job titles and their corresponding top skills, indicating which skills I should focus on depending on my target role.

View my notebook with detailed steps here:
2_Skill_Demand.ipynb
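
The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the sample rows are made up, and it assumes `job_skills` already holds a Python list of skills for each posting.

```python
import pandas as pd

# Hypothetical sample standing in for the real job postings data.
df = pd.DataFrame({
    "job_title_short": ["Data Analyst", "Data Analyst", "Data Engineer",
                        "Data Scientist", "Data Analyst"],
    "job_skills": [["sql", "excel"], ["sql", "python"], ["python", "aws"],
                   ["python", "sql"], ["excel"]],
})

# One row per (posting, skill) pair.
df_exploded = df.explode("job_skills")

# Number of postings per title and skill, most frequent first.
df_skills_count = (
    df_exploded.groupby(["job_title_short", "job_skills"])
    .size()
    .reset_index(name="skill_count")
    .sort_values("skill_count", ascending=False)
)

# The three titles with the most postings.
job_titles = df["job_title_short"].value_counts().head(3).index.tolist()
```

Sorting by `skill_count` in descending order is what lets a later `.head(5)` per title return that title's top 5 skills.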

Visualize Data

import matplotlib.pyplot as plt

# One subplot per job title, stacked vertically
fig, ax = plt.subplots(len(job_titles), 1)

for i, job_title in enumerate(job_titles):
    # Top 5 skills for this title (assumes df_skills_count is sorted by skill_count, descending)
    df_plot = df_skills_count[df_skills_count['job_title_short'] == job_title].head(5)
    df_plot.plot(kind='barh', x='job_skills', y='skill_count', ax=ax[i], title=job_title, legend=False)
    ax[i].invert_yaxis()  # put the highest count at the top
    ax[i].set_ylabel('')

fig.suptitle('Counts of Job Skills in Job Postings', fontsize=15, fontweight='semibold')
plt.tight_layout()
plt.show()

Results

Visualization of TOP Skills for Data Nerds

Insights

  • SQL is the most requested skill for Data Analysts and Data Scientists, appearing in over half the job postings for both roles. For Data Engineers, Python is the most sought-after skill, appearing in 68% of job postings.
  • Data Engineers require more specialized technical skills (AWS, Azure, Spark) compared to Data Analysts and Data Scientists who are expected to be proficient in more general data management and analysis tools (Excel, Tableau).
  • Python is a versatile skill, highly demanded across all three roles, but most prominently for Data Scientists (72%) and Data Engineers (65%).
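
The percentages quoted in these insights are shares of postings per role. A rough sketch of that calculation is shown below; the counts and totals are invented for illustration, and the names `df_job_title_count`, `jobs_total`, and `df_skills_perc` are assumptions rather than the notebook's actual names.

```python
import pandas as pd

# Hypothetical skill counts per job title.
df_skills_count = pd.DataFrame({
    "job_title_short": ["Data Analyst", "Data Analyst", "Data Engineer"],
    "job_skills": ["sql", "excel", "python"],
    "skill_count": [50, 40, 68],
})

# Hypothetical total number of postings per job title.
df_job_title_count = pd.DataFrame({
    "job_title_short": ["Data Analyst", "Data Engineer"],
    "jobs_total": [100, 100],
})

# Attach each title's total to its skill rows, then convert counts to percentages.
df_skills_perc = df_skills_count.merge(df_job_title_count, on="job_title_short", how="left")
df_skills_perc["skill_percent"] = (
    100 * df_skills_perc["skill_count"] / df_skills_perc["jobs_total"]
)
```

Dividing by the per-title total rather than the overall number of postings keeps the roles comparable even when one role has far more postings than another.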

2. How are in-demand skills trending for Data Analysts?

Visualize Data

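import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.ticker import PercentFormatter

# Keep only the first 5 skill columns of the monthly percentage table
df_plot = df_DA_US_percent.iloc[:, :5]

# Set the theme before plotting so it applies to this figure
sns.set_theme(style='ticks')
sns.lineplot(data=df_plot, dashes=False, legend=False)
sns.despine()

plt.title('Trending Top Skills for Data Analysts in the US')
plt.ylabel('Likelihood in Job Posting')
plt.xlabel('2023')

# Show y-axis values as whole-number percentages
plt.gca().yaxis.set_major_formatter(PercentFormatter(decimals=0))

# Label each line at its right-hand end instead of using a legend
for i in range(5):
    plt.text(11.3, df_plot.iloc[-1, i], df_plot.columns[i])

plt.show()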

Results

Trending Top Skills for Data Analysts in the US
Line chart visualizing the trending top skills for data analysts in the US in 2023

Insights

  • SQL remains the most consistently demanded skill throughout the year, although it shows a gradual decrease in demand.
  • Excel experienced a significant increase in demand starting around September, surpassing both Python and Tableau by the end of the year.
  • Both Python and Tableau show relatively stable demand throughout the year with some fluctuations but remain essential skills for data analysts. Power BI, while less demanded compared to the others, shows a slight upward trend towards the year's end.
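
The monthly likelihoods behind these trends can be derived roughly as sketched below. This is an illustration under assumptions: the sample rows are made up, `job_posted_date` is assumed to be a datetime column, and `monthly_totals` and `df_percent` are hypothetical names.

```python
import pandas as pd

# Hypothetical postings; job_skills holds a list of skills per posting.
df = pd.DataFrame({
    "job_posted_date": pd.to_datetime(
        ["2023-01-05", "2023-01-20", "2023-02-10", "2023-02-15"]
    ),
    "job_skills": [["sql", "excel"], ["sql"], ["python"], ["sql", "python"]],
})
df["month"] = df["job_posted_date"].dt.month

# Number of postings per month, used as the denominator.
monthly_totals = df.groupby("month").size()

# Rows are months, columns are skills, values are postings mentioning the skill.
df_exploded = df.explode("job_skills")
df_counts = pd.crosstab(df_exploded["month"], df_exploded["job_skills"])

# Share of each month's postings that mention each skill, in percent.
df_percent = df_counts.div(monthly_totals, axis=0) * 100
```

The totals are taken before exploding, so a posting that lists several skills still counts once in the denominator.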

3. How well do jobs and skills pay for Data Analysts?

Salary Analysis for Data Nerds

Visualize Data

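# Horizontal box plot of yearly salary for each of the top 6 job titles
sns.boxplot(data=df_US_top6, x='salary_year_avg', y='job_title_short', order=job_order)

# Format x-axis ticks as thousands of dollars, e.g. $100K
ticks_x = plt.FuncFormatter(lambda x, pos: f'${int(x/1000)}K')
plt.gca().xaxis.set_major_formatter(ticks_x)
plt.show()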

Results

Salary Distribution of Data Jobs in the US
Box plot visualizing the salary distribution for the top 6 data job titles

Insights

  • There's a significant variation in salary ranges across different job titles. Senior Data Scientist positions tend to have the highest salary potential, with up to $600K, indicating the high value placed on advanced data skills and experience in the industry.

  • Senior Data Engineer and Senior Data Scientist roles show a considerable number of outliers on the higher end of the salary spectrum, suggesting that exceptional skills or circumstances can lead to high pay in these roles. In contrast, Data Analyst roles demonstrate more consistency in salary, with fewer outliers.

  • The median salaries increase with the seniority and specialization of the roles. Senior roles (Senior Data Scientist, Senior Data Engineer) not only have higher median salaries but also larger differences in typical salaries, reflecting greater variance in compensation as responsibilities increase.
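
The medians and spreads read off the box plot can also be checked numerically. The sketch below is a hedged illustration with invented salaries; `summary` is a hypothetical name, and ordering titles by median is an assumption about how `job_order` might be built.

```python
import pandas as pd

# Hypothetical salaries for two job titles.
df_US = pd.DataFrame({
    "job_title_short": ["Data Analyst"] * 4 + ["Senior Data Scientist"] * 4,
    "salary_year_avg": [80000, 90000, 95000, 100000,
                        120000, 150000, 180000, 250000],
})

grouped = df_US.groupby("job_title_short")["salary_year_avg"]

# Median plus the quartiles that form the edges of each box.
summary = pd.DataFrame({
    "median": grouped.median(),
    "q1": grouped.quantile(0.25),
    "q3": grouped.quantile(0.75),
})

# Interquartile range: the width of the box, a measure of typical spread.
summary["iqr"] = summary["q3"] - summary["q1"]

# Titles ordered from highest to lowest median salary.
job_order = summary.sort_values("median", ascending=False).index.tolist()
```

A larger `iqr` corresponds to a wider box, which is the "larger differences in typical salaries" noted for senior roles.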

Highest Paid and Most Demanded skills for Data Analysts

Visualize Data

fig, ax = plt.subplots(2, 1)

# Top 10 Highest Paid Skills for Data Analysts
sns.barplot(data=df_DA_top_pay, x='median', y=df_DA_top_pay.index, hue='median', ax=ax[0], palette='dark:b_r')

# Top 10 Most In-Demand Skills for Data Analysts
sns.barplot(data=df_DA_skills, x='median', y=df_DA_skills.index, hue='median', ax=ax[1], palette='light:b')

plt.show()

Results

The Highest Paid & Most In-Demand Skills for Data Analysts in the US
Two separate bar graphs visualizing the highest paid skills and most in-demand skills for data analysts in the US.

Insights

  • The top graph shows that specialized technical skills like dplyr, Bitbucket, and GitLab are associated with higher salaries, some reaching up to $200K, suggesting that advanced technical proficiency can increase earning potential.

  • The bottom graph highlights that foundational skills like Excel, PowerPoint, and SQL are the most in-demand, even though they may not offer the highest salaries. This demonstrates the importance of these core skills for employability in data analysis roles.

  • There's a clear distinction between the skills that are highest paid and those that are most in-demand. Data analysts aiming to maximize their career potential should consider developing a diverse skill set that includes both high-paying specialized skills and widely demanded foundational skills.
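
The two tables plotted above, `df_DA_top_pay` and `df_DA_skills`, can plausibly be built from a single per-skill aggregation. The following is a minimal sketch with made-up data; `df_DA` and `df_skill_stats` are assumed names, and the notebook may differ in detail.

```python
import pandas as pd

# Hypothetical Data Analyst postings with a skills list and a yearly salary.
df_DA = pd.DataFrame({
    "job_skills": [["sql", "excel"], ["sql", "dplyr"], ["excel"], ["sql"]],
    "salary_year_avg": [90000, 150000, 80000, 100000],
})

# Per skill: how many postings mention it and the median salary of those postings.
df_skill_stats = (
    df_DA.explode("job_skills")
    .groupby("job_skills")["salary_year_avg"]
    .agg(["count", "median"])
)

# Same statistics, ranked two ways.
df_DA_top_pay = df_skill_stats.sort_values("median", ascending=False).head(10)
df_DA_skills = df_skill_stats.sort_values("count", ascending=False).head(10)
```

In this toy data `dplyr` tops the pay ranking with a single posting, which mirrors the caveat in the insights: the highest paid skills are not necessarily the ones most often requested.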

4. What are the most optimal skills to learn for Data Analysts?

Visualize Data

from adjustText import adjust_text
import matplotlib.pyplot as plt

plt.scatter(df_DA_skills_high_demand['skill_percent'], df_DA_skills_high_demand['median_salary'])
plt.show()

Results

Most Optimal Skills for Data Analysts
A scatter plot visualizing the most optimal skills (high paying & high demand) for data analysts in the US.

Insights

  • The scatter plot shows that most of the programming skills (colored blue) tend to cluster at higher salary levels compared to other categories, indicating that programming expertise might offer greater salary benefits within the data analytics field.

  • The database skills (colored orange), such as Oracle and SQL Server, are associated with some of the highest salaries among data analyst tools. This indicates a significant demand and valuation for data management and manipulation expertise in the industry.

  • Analyst tools (colored green), including Tableau and Power BI, are prevalent in job postings and offer competitive salaries, showing that visualization and data analysis software are crucial for current data roles. This category not only has good salaries but is also versatile across different types of data tasks.
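
The scatter plot's inputs, `skill_percent` and `median_salary`, combine demand and pay for each skill. One way they could be derived is sketched below; the data is invented, and the 30% cutoff in `skill_limit` is a hypothetical threshold chosen only to fit this tiny sample.

```python
import pandas as pd

# Hypothetical Data Analyst postings.
df_DA = pd.DataFrame({
    "job_skills": [["sql", "excel"], ["sql", "python"], ["sql"], ["excel", "dplyr"]],
    "salary_year_avg": [90000, 110000, 100000, 80000],
})
n_jobs = len(df_DA)

# Per skill: number of postings and median salary.
df_DA_skills = (
    df_DA.explode("job_skills")
    .groupby("job_skills")["salary_year_avg"]
    .agg(skill_count="count", median_salary="median")
)

# Share of all postings that mention the skill, in percent.
df_DA_skills["skill_percent"] = 100 * df_DA_skills["skill_count"] / n_jobs

# Keep only skills above a demand threshold (hypothetical value).
skill_limit = 30
df_DA_skills_high_demand = df_DA_skills[df_DA_skills["skill_percent"] > skill_limit]
```

Filtering on demand first keeps rarely requested skills, whose median salary rests on very few postings, from dominating the plot.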
