This was a look into the data job market. The focus was on data analyst/data scientist positions. This project focused on top-paying jobs, in demand skills, and where high demand meets high salary in data analytics.
SQL Queries located here: project_sql_folder
The motivation behind this project stemmed from my desire to understand the data analytics job market better. I aimed to discover which skills are paid the most and in demand, making my job search more targeted and effective.
The data for this analysis is from Luke Barousse’s SQL Course (here). This data includes details on job titles, salaries, locations, and required skills.
The questions I wanted to answer through my SQL queries were:
- What are the top-paying data analyst/data scientist jobs?
- What skills are required for these top-paying jobs?
- What skills are most in demand for data analysts/data scientists?
- Which skills are associated with higher salaries?
- What are the most optimal skills to learn for a data analyst looking to maximize job market value?
In this project, I utilized a variety of tools to conduct my analysis:
- SQL (Structured Query Language): Enabled me to interact with the database, extract insights, and answer my key questions through queries.
- PostgreSQL: As the database management system, PostgreSQL allowed me to store, query, and manipulate the job posting data.
- Visual Studio Code: This open-source administration and development platform helped me manage the database and execute SQL queries.
Each query for this project aimed at investigating specific aspects of the data analyst job market. Here’s how I approached each question:
To identify the highest-paying roles, I filtered data analyst/data scientist positions by average yearly salary and location, focusing on remote jobs. This query highlights the high paying opportunities in the field.
SELECT
job_id,
job_title,
job_title_short,
job_location,
job_schedule_type,
salary_year_avg,
job_posted_date,
company_dim.name AS company_name
FROM job_postings_fact
LEFT JOIN
company_dim ON job_postings_fact.company_id = company_dim.company_id
WHERE
(job_title_short = 'Data Analyst' OR job_title_short = 'Data Scientist')
AND job_location = 'Anywhere'
AND salary_year_avg IS NOT NULL
ORDER BY
salary_year_avg DESC
LIMIT 10;To understand what skills are required for the top-paying jobs, I joined the job postings with the skills data, providing insights into what employers value for high-compensation roles.
WITH top_paying_jobs AS (
SELECT
job_id,
job_title,
job_title_short,
job_location,
job_schedule_type,
salary_year_avg,
company_dim.name AS company_name
FROM job_postings_fact
LEFT JOIN
company_dim ON job_postings_fact.company_id = company_dim.company_id
WHERE
(job_title_short = 'Data Analyst' OR job_title_short = 'Data Scientist')
AND job_location = 'Anywhere'
AND salary_year_avg IS NOT NULL
ORDER BY
salary_year_avg DESC
LIMIT 10
)
SELECT
top_paying_jobs.*,
skills
FROM
top_paying_jobs
INNER JOIN skills_job_dim ON top_paying_jobs.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
ORDER BY
salary_year_avg DESC;This query helped identify the skills most frequently requested in job postings, directing focus to areas with high demand.
SELECT
skills,
COUNT(skills_job_dim.job_id) AS demand_count
FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE
(job_postings_fact.job_title_short = 'Data Analyst'
OR job_postings_fact.job_title_short = 'Data Scientist')
AND job_postings_fact.job_work_from_home = TRUE
GROUP BY
skills
ORDER BY
demand_count DESC
LIMIT 5;Exploring the average salaries associated with different skills revealed which skills are the highest paying.
SELECT
skills,
ROUND(AVG(salary_year_avg), 0) AS avg_salary
FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE
(job_postings_fact.job_title_short = 'Data Analyst'
OR job_postings_fact.job_title_short = 'Data Scientist')
AND salary_year_avg IS NOT NULL
AND job_postings_fact.job_work_from_home = TRUE
GROUP BY
skills
ORDER BY
avg_salary DESC
LIMIT 25;Combining insights from demand and salary data, this query aimed to pinpoint skills that are both in high demand and have high salaries, offering a strategic focus for skill development.
WITH skills_demand AS (
SELECT
skills_dim.skill_id,
skills_dim.skills,
COUNT(skills_job_dim.job_id) AS demand_count
FROM
job_postings_fact
INNER JOIN
skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
INNER JOIN
skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE
job_postings_fact.job_title_short = 'Data Analyst'
AND job_postings_fact.salary_year_avg IS NOT NULL
AND job_postings_fact.job_work_from_home = True
GROUP BY
skills_dim.skill_id
),
-- Skills with high average salaries for Data Analyst roles
average_salary AS (
SELECT
skills_job_dim.skill_id,
AVG(job_postings_fact.salary_year_avg) AS avg_salary
FROM
job_postings_fact
INNER JOIN
skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
WHERE
job_postings_fact.job_title_short = 'Data Analyst'
AND job_postings_fact.salary_year_avg IS NOT NULL
AND job_postings_fact.job_work_from_home = True
GROUP BY
skills_job_dim.skill_id
)
-- Return high demand and high salaries for 10 skills
SELECT
skills_demand.skills,
skills_demand.demand_count,
ROUND(average_salary.avg_salary, 2) AS avg_salary --ROUND to 2 decimals
FROM
skills_demand
INNER JOIN
average_salary ON skills_demand.skill_id = average_salary.skill_id
WHERE demand_count > 10
ORDER BY
avg_salary DESC,
demand_count DESC
LIMIT 25;Throughout this project, I learned and honed several key SQL techniques and skills:
- Complex Query Construction: Learning to build advanced SQL queries that combine multiple tables and employ functions like
WITHclauses for temporary tables. - Data Aggregation: Utilizing
GROUP BYand aggregate functions likeCOUNT()andAVG()to summarize data effectively. - Analytical Thinking: Developing the ability to translate real-world questions into actionable SQL queries that got insightful answers.
From the analysis, several general insights emerged:
- Top-Paying Data Analyst Jobs: The highest-paying jobs for data analysts that allow remote work offer a wide range of salaries, the highest at $650,000!
- Skills for Top-Paying Jobs: High-paying data analyst jobs require advanced proficiency in SQL, suggesting it’s a critical skill for earning a top salary.
- Most In-Demand Skills: SQL is also the most demanded skill in the data analyst/data scientist job market, thus making it essential for job seekers.
- Skills with Higher Salaries: Specialized skills, such as gdpr and bitbucket, are associated with the highest average salaries, indicating a premium on niche expertise.
- Optimal Skills for Job Market Value: SQL leads in demand and offers for a high average salary, positioning it as one of the most optimal skills for data analysts to learn to maximize their market value.
In this project, I delved into SQL, honing my skills and gaining valuable insights into the data analyst and data scientist job market. The analysis I conducted serves as a roadmap for prioritizing skill development and optimizing job search efforts. As a data analyst, I can strategically position myself in a competitive market by focusing on high-demand, high-salary skills.
This exploration underscores the significance of continuous learning and adaptability, especially in the ever-evolving field of data analytics. By staying apprised of emerging trends, I'll be able contribute effectively to data-driven decision-making.