# Data Jobs Analysis

---------------

##### _Data Engineering, Analytics, Business Intelligence, and Data Visualization Jobs in Bulgaria -- Availability and Required Skillset._

-------------

### Table of Contents:

- Introduction
- Job Titles Analysed
- Recent Trends
- Key Players
  - Identify Major Companies
- Skillset and Toolset Requirements
  - Identify Requirements
  - Identify Most Used Technologies
- Remuneration Analysis
  - Identify Salary Trends

## Introduction

This presentation grew out of a personal project to better understand the job market in Bulgaria.

## Job Titles Analysed


### Selecting the Offers of Interest

Suppose we are interested in job offers in data analysis, data integration, data wrangling, and related fields. The first step is to select the offers that are relevant to the research. This was done with a targeted query on the offer titles, followed by searches for selected keywords in the offer contents to identify additional strings to look for in the titles. The patterns used in the final filter are listed below:

```
'bi( |$)'
'(data|business) intelligence'
'etl( |$)'
'data analy(st|tics|sis)'
'анализ.*данни'
'data (engineer|scientist|warehouse)'
'reporting (analyst|specialist)'
'tableau'
'clikview'
```
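As a rough illustration of how such patterns behave, the sketch below applies them to a few titles with Python's `re` module. The actual filtering appears to happen in the database, so this is only an approximation: the case-insensitive matching, the `is_data_offer` helper, and the sample titles are assumptions made for the example.

```python
import re

# Patterns copied verbatim from the list above.
TITLE_PATTERNS = [
    r'bi( |$)',
    r'(data|business) intelligence',
    r'etl( |$)',
    r'data analy(st|tics|sis)',
    r'анализ.*данни',
    r'data (engineer|scientist|warehouse)',
    r'reporting (analyst|specialist)',
    r'tableau',
    r'clikview',
]

# Combine the patterns into one alternation; case-insensitivity is an assumption.
TITLE_RE = re.compile('|'.join(f'(?:{p})' for p in TITLE_PATTERNS), re.IGNORECASE)


def is_data_offer(title: str) -> bool:
    """Return True if the title matches at least one of the patterns."""
    return TITLE_RE.search(title) is not None


# Hypothetical titles, invented for this example.
sample_titles = [
    'Senior BI Developer',
    'ETL Specialist',
    'Data Analyst with SQL',
    'Анализатор - анализ на данни',
    'Java Developer',
]

matches = [t for t in sample_titles if is_data_offer(t)]
print(matches)
```

Note that in this sketch `bi( |$)` has no boundary on its left side, so it also matches any title containing a word that merely ends in "bi"; a stricter filter could add a word boundary in front.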

In [42]:
import psycopg2
import pandas as pd

%matplotlib inline

In [43]:
# Connect to the jobsbg PostgreSQL database.
conn = psycopg2.connect("dbname=jobsbg")

# Load the matching offers, indexed by submission date.
datajobs_df = pd.read_sql_query(
    'SELECT job_id, subm_date FROM v_full_data_offers_history', conn, index_col='subm_date')

conn.close()

print(f'The total number of offers that match the selected criteria is {len(datajobs_df.index)}.')
print(f'The first matching record is from {min(datajobs_df.index)}, and the last matching record is from {max(datajobs_df.index)}.')

The total number of offers that match the selected criteria is 1601.
The first matching record is from 2017-09-27, and the last matching record is from 2018-12-21.


## Recent Trends



Having identified and verified our set of targeted offers, we can provide some general statistics and historical trends for the available time period. Let's start by showing how the number of selected offers accumulated over time.

### Offers over time bar chart

The chart shows a noticeable increase in the total number of data-related jobs available.

In [49]:
%%HTML 
<iframe width="100%" height="525px" seamless="seamless" src="./data_offers_over_time_bar_chart.html"></iframe>
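The full recipe for this chart lives in the linked notebook; the snippet below is only a minimal sketch of how offers indexed by submission date could be turned into monthly and cumulative counts with pandas. The `sample_df` frame is a hypothetical stand-in for `datajobs_df`, and monthly binning is an assumption, not necessarily what the chart uses.

```python
import pandas as pd

# Hypothetical stand-in for datajobs_df: one row per offer, indexed by submission date.
sample_df = pd.DataFrame(
    {'job_id': [1, 2, 3, 4, 5]},
    index=pd.to_datetime(
        ['2017-09-27', '2017-09-30', '2017-10-15', '2017-12-01', '2017-12-21']),
)
sample_df.index.name = 'subm_date'

# Number of offers submitted in each calendar month (empty months count as 0).
monthly_counts = sample_df['job_id'].resample('MS').count()

# Running total of offers over time.
cumulative_counts = monthly_counts.cumsum()

print(cumulative_counts)
```

Either series can then be passed to a plotting library of choice to draw the bars.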

### Offers over time heatmap

Heatmaps are another useful way to present historical data and visually explore the trends. This one confirms the increase in demand shown in the previous chart, but also reveals weekly dynamics.



In [50]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offers_subm_heatmap.html"></iframe>
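One way to prepare data for a heatmap like this is to count offers per calendar week and weekday. The sketch below assumes that layout (the actual recipe is in the linked notebook and may differ) and uses hypothetical sample dates in place of the real submission dates.

```python
import pandas as pd

# Hypothetical submission dates; 2018-01-01 is a Monday.
dates = pd.to_datetime(
    ['2018-01-01', '2018-01-01', '2018-01-03', '2018-01-08', '2018-01-12'])

frame = pd.DataFrame({
    # Monday that starts the week each offer was submitted in.
    'week_start': dates.to_period('W').start_time,
    # Day of the week, with Monday = 0 and Sunday = 6.
    'weekday': dates.dayofweek,
})

# Rows are weekdays, columns are weeks, cells are offer counts.
grid = frame.groupby(['weekday', 'week_start']).size().unstack(fill_value=0)

print(grid)
```

Each cell of `grid` maps to one heatmap tile, so weekly patterns such as quiet weekends show up as consistently pale rows.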

### Specializations 

- Compare selected job titles (BI, ETL, Data Engineer).


### Locations
- Identify location trends.
- What are the salaries for our targets?

## Key Players


## Skillset and Toolset Requirements

Use the requirements to produce a list of top technologies, then look in the offers' contents for them and provide a summary.
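A minimal sketch of the counting step described above, assuming the list of technologies has already been compiled. The technology names, the offer texts, and the `count_mentions` helper are all hypothetical and serve only to illustrate the idea.

```python
import re
from collections import Counter

# Hypothetical technology list and offer contents, invented for this example.
TECHNOLOGIES = ['sql', 'python', 'tableau', 'excel']
offers = [
    'Strong SQL and Python skills; Tableau is a plus.',
    'Advanced Excel, SQL reporting.',
    'Python, python, and more Python.',
]


def count_mentions(texts, technologies):
    """Count how many texts mention each technology at least once."""
    counts = Counter()
    for text in texts:
        for tech in technologies:
            # Whole-word, case-insensitive match; each offer counts once per technology.
            if re.search(rf'\b{re.escape(tech)}\b', text, re.IGNORECASE):
                counts[tech] += 1
    return counts


tech_counts = count_mentions(offers, TECHNOLOGIES)
print(tech_counts.most_common())
```

The word-boundary match works for plain alphanumeric names; names that end in punctuation, such as "C++", would need a different pattern.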

## Remuneration Analysis

In [51]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offers_scatter.html"></iframe>

## Appendix

### Visualization Recipes

Recipes for all charts used in this presentation are available at the links below:

-------------------

##### 1. Data Offers Over Time Bar Chart [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Over_Time.ipynb)
##### 2. Data Offers Over Time Heatmap [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Heatmap.ipynb)
##### 3. Data Offers Salary Scatterplot [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Salary_Scatter.ipynb.ipynb)

In [52]:
from IPython.display import HTML

# Load the presentation stylesheet and apply it to the notebook.
with open('../resources/styles/datum.css', 'r') as f:
    style = f.read()
HTML(style)