# Tech Job Posting

In [None]:
# Import libraries
import pandas as pd
import sqlite3

In [None]:
# Import datasets
salaries = pd.read_csv('ds_salaries_2023.csv')
posting = pd.read_csv('job_postings.csv')
companies = pd.read_csv('companies.csv')
skills = pd.read_csv('job_skills.csv')
benefits = pd.read_csv('benefits.csv')
industry = pd.read_csv('company_industries.csv')

In [None]:
# Set up the SQLite database
## Open a connection (creates tech_jobmarket.db if it does not exist)
conn = sqlite3.connect('tech_jobmarket.db')

# Write each DataFrame to its own table, replacing any existing table
salaries.to_sql('salaries', conn, index=False, if_exists='replace')
posting.to_sql('posting', conn, index=False, if_exists='replace')
companies.to_sql('companies', conn, index=False, if_exists='replace')
skills.to_sql('skills', conn, index=False, if_exists='replace')
benefits.to_sql('benefits', conn, index=False, if_exists='replace')
industry.to_sql('industry', conn, index=False, if_exists='replace')

# Activate SQL extension
%load_ext sql

# Connect SQL magic (to use SQL commands) to our database
%sql sqlite:///tech_jobmarket.db

## **Business Questions**

### **1. Salary Analysis**  
- What is the average salary for data science roles across experience levels?
- Which job titles have the highest salaries, and how do they vary by company size?
- How do remote and on-site roles compare in salary ranges?

**SQL Skills:**  
- **INNER JOIN** to link job titles and company size.  
- **Aggregations** (`AVG`, `MAX`) for salary comparisons.  
- **Window Functions** to rank salaries by job title and experience level.
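
As a rough sketch of the ranking idea above, the following uses a tiny in-memory stand-in for the `salaries` table. The column names (`job_title`, `experience_level`, `salary_in_usd`) and all rows are illustrative assumptions rather than values taken from the real CSV, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Toy in-memory table; column names and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salaries "
    "(job_title TEXT, experience_level TEXT, salary_in_usd INTEGER)"
)
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?, ?)",
    [
        ("Data Scientist", "SE", 150000),
        ("Data Scientist", "SE", 170000),
        ("Data Engineer", "SE", 140000),
        ("Data Analyst", "EN", 60000),
        ("Data Scientist", "EN", 80000),
    ],
)

# Average salary per (experience level, job title), ranked within each level.
query = """
SELECT experience_level,
       job_title,
       AVG(salary_in_usd) AS avg_salary,
       RANK() OVER (
           PARTITION BY experience_level
           ORDER BY AVG(salary_in_usd) DESC
       ) AS salary_rank
FROM salaries
GROUP BY experience_level, job_title
ORDER BY experience_level, salary_rank;
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
conn.close()
```

In the notebook itself, the same `SELECT` could be run against `tech_jobmarket.db` from a `%%sql` cell once the real column names are confirmed.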

### **2. Regional Insights**  
- Which countries offer the highest salaries for data science jobs?
- How does an employee’s residence compare to the company’s location for remote roles?
- What is the proportion of fully remote roles by region?

**SQL Skills:**  
- **INNER JOIN** between `employee_residence` and `company_location` for regional analysis.  
- **LEFT JOIN** to include companies with incomplete remote data.  
- **CASE** to classify remote work levels (`0`, `50`, `100`).
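
One possible shape for the `CASE` classification and the fully remote proportion is sketched below. It assumes `remote_ratio` and `company_location` columns on the `salaries` table; the labels (`On-site`, `Hybrid`, `Fully remote`) and the sample rows are made up for illustration.

```python
import sqlite3

# Toy in-memory table; column names, labels, and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salaries "
    "(company_location TEXT, employee_residence TEXT, remote_ratio INTEGER)"
)
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?, ?)",
    [
        ("US", "US", 100),
        ("US", "US", 0),
        ("US", "IN", 100),
        ("DE", "DE", 50),
        ("DE", "DE", 100),
        ("DE", "DE", 0),
        ("DE", "DE", 0),
    ],
)

# Map the numeric remote_ratio onto readable work modes and count each one.
mode_query = """
SELECT CASE remote_ratio
           WHEN 0 THEN 'On-site'
           WHEN 50 THEN 'Hybrid'
           WHEN 100 THEN 'Fully remote'
           ELSE 'Unknown'
       END AS work_mode,
       COUNT(*) AS roles
FROM salaries
GROUP BY work_mode
ORDER BY work_mode;
"""
work_modes = conn.execute(mode_query).fetchall()

# Share of fully remote roles per company location.
share_query = """
SELECT company_location,
       COUNT(*) AS total_roles,
       ROUND(
           100.0 * SUM(CASE WHEN remote_ratio = 100 THEN 1 ELSE 0 END) / COUNT(*),
           1
       ) AS pct_fully_remote
FROM salaries
GROUP BY company_location
ORDER BY company_location;
"""
remote_share = conn.execute(share_query).fetchall()
print(work_modes)
print(remote_share)
conn.close()
```

Multiplying by `100.0` rather than `100` keeps the division in floating point, since SQLite would otherwise truncate an integer ratio.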

### **3. Skill Demand Trends**  
- What are the top 10 most frequently required technical skills?
- Which industries demand specific skills like AI or cloud computing?  
- How does skill demand vary by company size? 

**SQL Skills:**  
- **INNER JOIN** between the `posting` and `skills` tables (loaded from `job_postings.csv` and `job_skills.csv`).  
- **FULL OUTER JOIN** to compare skill demand across industries (requires SQLite 3.39.0 or later).  
- **CTEs** to extract and rank the most in-demand skills.
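
A minimal sketch of the CTE approach to the top-skills question, run on toy data. The `job_id` join key and the `skill_abr` column are assumptions about the schema of `posting` and `skills`, and the rows are invented.

```python
import sqlite3

# Toy in-memory tables; schema and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posting (job_id INTEGER, title TEXT);
    CREATE TABLE skills (job_id INTEGER, skill_abr TEXT);
    INSERT INTO posting VALUES
        (1, 'Data Scientist'), (2, 'Data Engineer'), (3, 'ML Engineer');
    INSERT INTO skills VALUES
        (1, 'IT'), (1, 'ANLS'), (2, 'IT'), (2, 'ENG'),
        (3, 'IT'), (3, 'ENG'), (99, 'SALE');
    """
)

# The CTE counts postings per skill; the outer query keeps the top 10.
query = """
WITH skill_counts AS (
    SELECT s.skill_abr, COUNT(*) AS postings
    FROM posting AS p
    INNER JOIN skills AS s ON s.job_id = p.job_id
    GROUP BY s.skill_abr
)
SELECT skill_abr, postings
FROM skill_counts
ORDER BY postings DESC, skill_abr
LIMIT 10;
"""
top_skills = conn.execute(query).fetchall()
print(top_skills)
conn.close()
```

The skill row with `job_id` 99 has no matching posting, so the `INNER JOIN` drops it. The secondary sort on `skill_abr` only makes ties deterministic.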

### **4. Benefits and Perks**  
- What are the most common benefits offered for remote roles? 
- Which company sizes are most likely to offer comprehensive perks?
- How do benefits vary across different industries?

**SQL Skills:**  
- **LEFT JOIN** between the `benefits` and `posting` tables (loaded from `benefits.csv` and `job_postings.csv`) to analyze job perks.  
- **Aggregations** to calculate the prevalence of benefits by job type.  
- **Subqueries** to identify roles with multiple benefits.
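
The `LEFT JOIN` and subquery ideas above might look like the sketch below. The `job_id` key, the `type` column on `benefits`, and the threshold of two benefits are all assumptions chosen for illustration; the join is written from `posting` to `benefits` so that postings without listed benefits still appear.

```python
import sqlite3

# Toy in-memory tables; schema and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posting (job_id INTEGER, title TEXT);
    CREATE TABLE benefits (job_id INTEGER, type TEXT);
    INSERT INTO posting VALUES
        (1, 'Data Scientist'), (2, 'Data Engineer'), (3, 'Data Analyst');
    INSERT INTO benefits VALUES
        (1, 'Medical insurance'), (1, '401(k)'), (1, 'Dental insurance'),
        (2, 'Medical insurance');
    """
)

# LEFT JOIN keeps postings with no benefits; COUNT(b.type) ignores the NULLs.
count_query = """
SELECT p.title, COUNT(b.type) AS n_benefits
FROM posting AS p
LEFT JOIN benefits AS b ON b.job_id = p.job_id
GROUP BY p.job_id, p.title
ORDER BY p.job_id;
"""
benefit_counts = conn.execute(count_query).fetchall()

# The subquery selects job_ids that list at least two benefits.
multi_query = """
SELECT title
FROM posting
WHERE job_id IN (
    SELECT job_id
    FROM benefits
    GROUP BY job_id
    HAVING COUNT(*) >= 2
)
ORDER BY title;
"""
multi_benefit_roles = conn.execute(multi_query).fetchall()
print(benefit_counts)
print(multi_benefit_roles)
conn.close()
```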

### **5. Emerging Industries and Future Trends**
- Which industries are hiring the most data professionals?
- What percentage of companies in emerging industries offer remote opportunities?
- How do salaries vary across industries for data-related roles?

**SQL Skills:**  
- **INNER JOIN** between the `posting` and `industry` tables (loaded from `job_postings.csv` and `company_industries.csv`) for analyzing industry-specific hiring trends.  
- **LEFT JOIN** to include companies missing specific job postings.  
- **Aggregations** to compute averages and percentages across industries.  
- **Window Functions** to rank industries by the number of remote opportunities and average salaries.
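
To illustrate how the joins, aggregations, and window functions listed here could combine, the sketch below ranks industries by posting volume and reports the share of remote postings. The `company_id` key and the `remote_allowed` flag (1 or NULL) are assumptions about the schema, the rows are invented, and the percentage is per posting rather than per company.

```python
import sqlite3

# Toy in-memory tables; schema and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posting (job_id INTEGER, company_id INTEGER, remote_allowed INTEGER);
    CREATE TABLE industry (company_id INTEGER, industry TEXT);
    INSERT INTO industry VALUES (10, 'Software'), (20, 'Finance'), (30, 'Biotech');
    INSERT INTO posting VALUES
        (1, 10, 1), (2, 10, NULL), (3, 10, 1), (4, 20, NULL), (5, 20, 1);
    """
)

# LEFT JOIN keeps industries with no postings; NULLIF avoids dividing by zero.
query = """
SELECT i.industry,
       COUNT(p.job_id) AS postings,
       ROUND(
           100.0 * SUM(COALESCE(p.remote_allowed, 0))
               / NULLIF(COUNT(p.job_id), 0),
           1
       ) AS pct_remote,
       RANK() OVER (ORDER BY COUNT(p.job_id) DESC) AS hiring_rank
FROM industry AS i
LEFT JOIN posting AS p ON p.company_id = i.company_id
GROUP BY i.industry
ORDER BY hiring_rank, i.industry;
"""
industry_rank = conn.execute(query).fetchall()
for row in industry_rank:
    print(row)
conn.close()
```

An industry with no postings gets a `NULL` percentage (`None` in Python) instead of a misleading 0, which keeps "no data" distinct from "no remote roles".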

