📊 Dive into the data job market! Focusing on data analyst roles, this project explores 💰 top-paying jobs, 🔥 in-demand skills, and 📈 where high demand meets high salary in data analytics.
The questions I wanted to answer through my SQL queries were:
- What are the top-paying data analyst jobs?
- What skills are required for these top-paying jobs?
- What skills are most in demand for data analysts?
- Which skills are associated with higher salaries?
- What are the most optimal skills to learn?
- SQL: The backbone of my analysis, allowing me to query the database and unearth critical insights.
- PostgreSQL: The chosen database management system, ideal for handling the job posting data.
- Visual Studio Code: My go-to for database management and executing SQL queries.
- Git & GitHub: Essential for version control and sharing my SQL scripts and analysis, ensuring collaboration and project tracking.
Each query for this project aimed at investigating specific aspects of the data analyst job market. Here’s how I approached each question:
To identify the highest-paying roles, I filtered the postings to Data Engineer positions that are remote (job_location = 'Anywhere') and report an average yearly salary, then sorted them by that salary. This query highlights the highest-paying remote opportunities in the field.
SELECT
    job_id,
    job_title,
    job_location,
    job_schedule_type,
    salary_year_avg,
    job_posted_date,
    name AS company_name
FROM
    job_postings_fact
LEFT JOIN company_dim ON job_postings_fact.company_id = company_dim.company_id
WHERE
    job_title_short = 'Data Engineer' AND  -- role category column, as in the later queries
    job_location = 'Anywhere' AND          -- remote postings only
    salary_year_avg IS NOT NULL            -- skip postings without a salary
ORDER BY salary_year_avg DESC
LIMIT 10;
Here's the breakdown of the top data analyst jobs in 2023:
Wide Salary Range: Top 10 paying data analyst roles span from $184,000 to $650,000, indicating significant salary potential in the field.
Diverse Employers: Companies like SmartAsset, Meta, and AT&T are among those offering high salaries, showing a broad interest across different industries.
Job Title Variety: There's a high diversity in job titles, from Data Engineer to Director of Data Engineering, reflecting varied roles and specializations within data analytics.
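The filter-and-rank pattern used for this question can be tried on a tiny in-memory database. The sketch below uses Python's built-in sqlite3 module instead of PostgreSQL, and every row in it is invented for illustration; the table and column names follow the queries in this README, with job_title_short assumed to be the role column as in the later queries.

```python
import sqlite3

# Toy data only: none of these rows come from the real job postings dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company_dim (company_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE job_postings_fact (
    job_id INTEGER PRIMARY KEY,
    company_id INTEGER,
    job_title_short TEXT,
    job_location TEXT,
    salary_year_avg REAL
);
INSERT INTO company_dim VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO job_postings_fact VALUES
    (10, 1,    'Data Engineer', 'Anywhere',     150000),
    (11, 2,    'Data Engineer', 'Anywhere',     NULL),
    (12, 2,    'Data Engineer', 'New York, NY', 210000),
    (13, NULL, 'Data Engineer', 'Anywhere',     180000),
    (14, 1,    'Data Analyst',  'Anywhere',     120000);
""")

# Same shape as the README query: filter, LEFT JOIN for the company name,
# sort by salary, keep the top rows.
top_remote = conn.execute("""
    SELECT j.job_id, j.salary_year_avg, c.name AS company_name
    FROM job_postings_fact AS j
    LEFT JOIN company_dim AS c ON j.company_id = c.company_id
    WHERE j.job_title_short = 'Data Engineer'
      AND j.job_location = 'Anywhere'
      AND j.salary_year_avg IS NOT NULL
    ORDER BY j.salary_year_avg DESC
    LIMIT 10
""").fetchall()

for row in top_remote:
    print(row)
```

Posting 13 has no matching company, yet it still appears (with an empty company name) because the join is a LEFT JOIN; an INNER JOIN would silently drop it from the ranking.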
To understand what skills are required for the top-paying jobs, I joined the job postings with the skills data, providing insights into what employers value for high-compensation roles.
WITH top_paying_jobs AS (
    SELECT
        job_postings_fact.job_id,
        job_postings_fact.job_title,
        job_postings_fact.salary_year_avg,
        company_dim.name AS company_name
    FROM
        job_postings_fact
    LEFT JOIN company_dim ON job_postings_fact.company_id = company_dim.company_id
    WHERE
        job_title_short = 'Data Engineer' AND
        job_location = 'Anywhere' AND
        salary_year_avg IS NOT NULL
    ORDER BY salary_year_avg DESC
    LIMIT 10
)
-- One output row per (top-paying job, skill) pair
SELECT
    top_paying_jobs.*,
    skills
FROM
    top_paying_jobs
INNER JOIN skills_job_dim ON top_paying_jobs.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
ORDER BY salary_year_avg DESC;
The query covers only the top-paying remote Data Engineer roles. Below are the key skills associated with the highest average salaries among them:
- NumPy – $325,000/year
- PySpark – $325,000/year
- Kubernetes – $325,000/year
- Pandas – $283,333/year
- Spark – $259,375/year
- Hadoop – $259,375/year
- Python – $249,500/year
- Kafka – $248,700/year
- Scala – $232,000/year
- SQL – $222,875/year
- Data Engineering is a lucrative career path, and the top-paying roles ask for data-processing and big data skills such as NumPy, PySpark, Spark, Hadoop, Kafka, and SQL.
- Python, a versatile programming language, remains a crucial skill, highly sought after in Data Engineering roles.
- Distributed computing frameworks and data manipulation libraries (e.g., Spark, Pandas) appear in the highest-paying postings, which suggests that mastering them boosts earning potential.
Professionals skilled in big data tools and programming languages are well-positioned to secure some of the highest salaries in the tech industry.
SELECT
    skills,
    COUNT(skills_job_dim.job_id) AS demand_count
FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE job_title_short = 'Data Engineer' OR job_title_short = 'Data Analyst'
GROUP BY skills
ORDER BY demand_count DESC
LIMIT 10;
The following are the most in-demand skills across Data Engineer and Data Analyst postings, ranked by the number of postings that list each skill:
- SQL – 206,003 demand mentions
- Python – 165,591 demand mentions
- Excel – 75,550 demand mentions
- Azure – 71,765 demand mentions
- AWS – 71,237 demand mentions
- SQL is the most in-demand skill across both roles, essential for managing and querying databases.
- Python is crucial for data manipulation, automation, and building data pipelines.
- Skills in cloud platforms such as Azure and AWS are highly sought after as companies shift towards cloud-based infrastructures.
- Proficiency in Excel remains relevant, especially for quick data analysis and reporting.
These in-demand skills highlight the need for Data Engineers and Data Analysts to be versatile, combining strong programming expertise with cloud and data management tools.
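The demand count works because the bridge table holds one row per posting-and-skill pair, so counting rows per skill counts the postings that mention it. Here is a minimal sketch of that idea using Python's sqlite3 module and a handful of invented rows; the table names mirror the query above, but the data is purely illustrative.

```python
import sqlite3

# Invented toy rows; only the table layout follows the README's queries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_postings_fact (job_id INTEGER PRIMARY KEY, job_title_short TEXT);
CREATE TABLE skills_dim (skill_id INTEGER PRIMARY KEY, skills TEXT);
CREATE TABLE skills_job_dim (job_id INTEGER, skill_id INTEGER);
INSERT INTO job_postings_fact VALUES
    (1, 'Data Engineer'), (2, 'Data Analyst'), (3, 'Data Scientist');
INSERT INTO skills_dim VALUES (1, 'sql'), (2, 'python'), (3, 'excel');
INSERT INTO skills_job_dim VALUES
    (1, 1), (1, 2),   -- engineer posting lists sql and python
    (2, 1), (2, 3),   -- analyst posting lists sql and excel
    (3, 1), (3, 2);   -- scientist posting is excluded by the role filter
""")

demand = conn.execute("""
    SELECT s.skills, COUNT(sj.job_id) AS demand_count
    FROM job_postings_fact AS j
    INNER JOIN skills_job_dim AS sj ON j.job_id = sj.job_id
    INNER JOIN skills_dim AS s ON sj.skill_id = s.skill_id
    WHERE j.job_title_short IN ('Data Engineer', 'Data Analyst')
    GROUP BY s.skills
    ORDER BY demand_count DESC, s.skills
""").fetchall()

print(demand)  # [('sql', 2), ('excel', 1), ('python', 1)]
```

The IN list is equivalent to the two OR-ed comparisons in the query above; the secondary sort on the skill name is only there to make ties deterministic in this toy example.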
SELECT
    skills,
    ROUND(AVG(salary_year_avg), 0) AS avg_salary
FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE (job_title_short = 'Data Engineer' OR job_title_short = 'Data Analyst')
    AND salary_year_avg IS NOT NULL
GROUP BY skills
ORDER BY avg_salary DESC
LIMIT 50;
The following are the highest-paying technical skills, ranked by average salary:
- MongoDB – $173,623 average salary
- Node.js – $171,685 average salary
- Solidity – $170,500 average salary
Key Insights
- MongoDB leads the pack in compensation, reflecting high demand for NoSQL database expertise.
- Node.js demonstrates the premium value placed on full-stack JavaScript development skills.
- Solidity commands high salaries due to increasing blockchain and smart contract development needs.
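An average-salary ranking says nothing about how many postings sit behind each average, which is why the final query in this README also filters on demand_count. The sketch below, written with Python's sqlite3 module, uses hypothetical tables (postings, posting_skills) and made-up salaries to show how a skill with a single salaried posting can outrank a common one. It does not show whether that happens in the real results above.

```python
import sqlite3

# Hypothetical tables and salaries, chosen only to illustrate the effect.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE postings (job_id INTEGER PRIMARY KEY, salary_year_avg REAL);
CREATE TABLE posting_skills (job_id INTEGER, skill TEXT);
INSERT INTO postings VALUES
    (1, 100000), (2, 150000), (3, 90000), (4, 300000), (5, NULL);
INSERT INTO posting_skills VALUES
    (1, 'python'), (2, 'python'), (3, 'python'), (5, 'python'),
    (4, 'solidity');
""")

rows = conn.execute("""
    SELECT ps.skill,
           COUNT(*) AS postings_with_salary,
           ROUND(AVG(p.salary_year_avg), 0) AS avg_salary
    FROM postings AS p
    INNER JOIN posting_skills AS ps ON p.job_id = ps.job_id
    WHERE p.salary_year_avg IS NOT NULL   -- posting 5 has no salary and is skipped
    GROUP BY ps.skill
    ORDER BY avg_salary DESC
""").fetchall()

for skill, n, avg_salary in rows:
    print(skill, n, avg_salary)
```

Here the single 'solidity' posting ranks first with an average built from one row, while 'python' averages three salaried postings. Reporting the count next to the average makes that difference visible.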
WITH skills_demand AS (
    SELECT
        skills_dim.skills,
        skills_dim.skill_id,
        COUNT(skills_job_dim.job_id) AS demand_count
    FROM job_postings_fact
    INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
    INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
    -- Parentheses matter: AND binds tighter than OR, so without them
    -- Data Engineer postings with no salary would also be counted.
    WHERE (job_title_short = 'Data Engineer' OR job_title_short = 'Data Analyst')
        AND salary_year_avg IS NOT NULL
    GROUP BY skills_dim.skill_id
), average_salary AS (
    SELECT
        skills_job_dim.skill_id,
        ROUND(AVG(salary_year_avg), 0) AS avg_salary
    FROM job_postings_fact
    INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
    INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
    WHERE (job_title_short = 'Data Engineer' OR job_title_short = 'Data Analyst')
        AND salary_year_avg IS NOT NULL
    GROUP BY skills_job_dim.skill_id
)
-- Keep skills with more than 10 postings, then rank by salary and demand
SELECT
    skills_demand.skill_id,
    skills,
    demand_count,
    avg_salary
FROM skills_demand
INNER JOIN average_salary ON skills_demand.skill_id = average_salary.skill_id
WHERE demand_count > 10
ORDER BY
    avg_salary DESC,
    demand_count DESC
LIMIT 25;
- Kafka – Average Salary: $142,512; Demand Count: 29,203; Focus Area: Leading message streaming platform
- Scala – Average Salary: $141,246; Demand Count: 28,850; Focus Area: Popular for big data processing
- Airflow – Average Salary: $135,427; Demand Count: 25,627; Focus Area: Essential for data pipeline orchestration
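When a WHERE clause mixes OR and AND, as the role-and-salary filters in these queries do, parentheses decide which rows are kept, because AND is evaluated before OR. The following sketch uses Python's sqlite3 module with four invented postings to compare the two forms; the helper count_where is hypothetical and exists only for this illustration.

```python
import sqlite3

# Four invented postings: each role appears once with and once without a salary.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_postings_fact (
    job_id INTEGER PRIMARY KEY,
    job_title_short TEXT,
    salary_year_avg REAL
);
INSERT INTO job_postings_fact VALUES
    (1, 'Data Engineer', 140000),
    (2, 'Data Engineer', NULL),
    (3, 'Data Analyst',  95000),
    (4, 'Data Analyst',  NULL);
""")

def count_where(condition):
    """Hypothetical helper: count the rows matching a WHERE condition."""
    sql = "SELECT COUNT(*) FROM job_postings_fact WHERE " + condition
    return conn.execute(sql).fetchone()[0]

# Read as: engineer OR (analyst AND salary present)
unparenthesized = count_where(
    "job_title_short = 'Data Engineer' OR job_title_short = 'Data Analyst' "
    "AND salary_year_avg IS NOT NULL"
)
# Read as: (engineer OR analyst) AND salary present
parenthesized = count_where(
    "(job_title_short = 'Data Engineer' OR job_title_short = 'Data Analyst') "
    "AND salary_year_avg IS NOT NULL"
)

print(unparenthesized, parenthesized)  # 3 2
```

Without parentheses the Data Engineer posting that has no salary is still counted, so a demand count and an average salary computed with different forms of the filter would describe different sets of postings.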