# Data Jobs Analysis




##### _Business Intelligence, Data Engineering, and Data Analytics Jobs in Bulgaria - Availability and Required Skillset_



## Abstract

This presentation attempts to provide a view of the job market for data engineering, data analytics, and business intelligence in Bulgaria. It does so by examining the job offers submitted to one of the largest Bulgarian job portals over a period of more than a year. It focuses on data presentation and tries to give an objective picture of the professional demand for what we refer to as **data jobs**.

While this page focuses on graphics, I also created a recipe page for each of the charts shown. They are all listed in the Appendix at the bottom of the article, and each illustrates the steps taken to produce the respective chart. Although this project was conceived purely out of curiosity and executed primarily as a learning exercise, I believe that the insights gathered are valuable and can be useful for both job seekers and employers looking to inform themselves about the market.


## Job Titles Analysed

### Defining Data Jobs

So what exactly do I consider a *data job*? The objective of this analysis was to select offers related to the fields of data engineering, business intelligence, data analytics, and reporting. I attempted to exclude basic data entry and digitization jobs, as well as false hits unrelated to the previously described categories. More on the offer selection criteria below.

### Data Filtering

I began with an exploration of the job offers I am most interested in: those in the fields of data analysis, data integration, data wrangling, etc. The first step in the process is to correctly filter the offers that are relevant to our research. That was done using a series of targeted queries against the offer titles and content. The final regular expressions used to select the analysed dataset are given below:

```
'(data|business) intelligence|(\W|^)bi(\W|$)'
'(data analy(st|tics|sis))|(анализ.*данни)'
'((\W|^)(etl|dwh)(\W|$))|data (engineer|warehouse)'
'reporting (analyst|specialist)'
'data scien.+'
```
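As an illustration only, the patterns above could be applied to offer titles along the following lines. This is a minimal sketch, not the query actually used for the analysis: the helper `is_data_job`, the sample titles, and the choice of case-insensitive matching are all assumptions.

```python
import re

# Patterns copied from the list above.
# Matching case-insensitively is an assumption made for this sketch.
DATA_JOB_PATTERNS = [
    r'(data|business) intelligence|(\W|^)bi(\W|$)',
    r'(data analy(st|tics|sis))|(анализ.*данни)',
    r'((\W|^)(etl|dwh)(\W|$))|data (engineer|warehouse)',
    r'reporting (analyst|specialist)',
    r'data scien.+',
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in DATA_JOB_PATTERNS]


def is_data_job(text):
    """Return True if any of the patterns matches the given title or body."""
    return any(p.search(text) for p in COMPILED)


# Hypothetical titles, invented for the example.
sample_titles = [
    "Senior BI Developer",
    "Data Entry Clerk",
    "ETL / DWH Engineer",
    "Анализатор - анализ на данни",
    "Mobile Developer",
]
matches = [t for t in sample_titles if is_data_job(t)]
print(matches)
```

The `(\W|^)` and `(\W|$)` guards around short tokens such as `bi` and `etl` keep words that merely contain those letters (for example "Mobile") from being picked up as false hits.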

### Filter Results Overview

The result of our initial filter is a small subset of job offers which we can explore further:

In [8]:
%%HTML 
<iframe width="100%" height="525px" seamless="seamless" src="./data_offers_pie_and_bar.html"></iframe>

## Temporal Analysis

Having identified and verified the set of targeted offers, we can now provide some historical perspective for the time period covered by the study. Let's start by showing how the number of submitted data job offers has changed over time, using two bin sizes: weekly and monthly.
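The binning itself is straightforward to sketch. The example below uses only the Python standard library and assumes each offer carries a submission date; the function names and the sample dates are hypothetical, and the linked recipe notebooks may well aggregate the data differently.

```python
from collections import Counter
from datetime import date


def count_by_month(dates):
    """Count offers per (year, month) bin."""
    return Counter((d.year, d.month) for d in dates)


def count_by_week(dates):
    """Count offers per ISO (year, week number) bin."""
    counts = Counter()
    for d in dates:
        iso = d.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return counts


# Hypothetical submission dates, invented for the example.
submitted = [date(2018, 1, 1), date(2018, 1, 3), date(2018, 1, 8), date(2018, 2, 1)]

monthly = count_by_month(submitted)
weekly = count_by_week(submitted)
print(sorted(monthly.items()))
print(sorted(weekly.items()))
```

Weekly bins show short-term fluctuations, while monthly bins smooth them out and make the overall trend easier to read.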




### Offers over time bar chart

The chart shows a significant increase in the total number of data-related jobs available.

In [10]:
%%HTML 
<iframe width="100%" height="525px" seamless="seamless" src="./data_offers_over_time_bar_chart.html"></iframe>

### Offers over time heatmap

Heatmaps are another useful way to present historical data and visually explore trends. This one confirms the increase in demand shown in the previous chart and could also reveal weekly dynamics. Notice how the heatmap gets darker as time progresses.

In [20]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offers_subm_heatmap.html"></iframe>

### Data Jobs Salary Transparency Chart

A percent stacked area graph illustrating the share of data jobs whose offer discloses salary information. In this type of chart the value of each group is normalized at each time stamp and presented as a percentage of the whole, allowing the reader to compare the groups that compose the whole. To hide the noise from daily fluctuations, the data has been aggregated into bins spanning one month.

In [24]:
%%HTML 
<iframe width="100%" height="525px" src="./data_jobs_salary_transparency.html"></iframe>
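The normalization behind a percent stacked chart can be sketched as follows. This is a simplified illustration under stated assumptions: each offer is reduced to a hypothetical `(submitted_on, has_salary)` pair, and the function name and sample data are invented for the example.

```python
from collections import defaultdict
from datetime import date


def salary_transparency_by_month(offers):
    """Return {(year, month): (pct_with_salary, pct_without_salary)}.

    `offers` is an iterable of (submitted_on, has_salary) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # [with salary, without salary]
    for submitted_on, has_salary in offers:
        key = (submitted_on.year, submitted_on.month)
        counts[key][0 if has_salary else 1] += 1

    shares = {}
    for key, (with_salary, without_salary) in sorted(counts.items()):
        total = with_salary + without_salary
        # Normalize each monthly bin so the two groups add up to 100%.
        shares[key] = (100.0 * with_salary / total, 100.0 * without_salary / total)
    return shares


# Hypothetical offers, invented for the example.
offers = [
    (date(2018, 1, 5), True),
    (date(2018, 1, 9), False),
    (date(2018, 1, 17), False),
    (date(2018, 1, 30), False),
    (date(2018, 2, 2), True),
    (date(2018, 2, 20), False),
]
shares = salary_transparency_by_month(offers)
print(shares)
```

Because every bin is rescaled to 100%, the chart shows how the proportion of offers with a disclosed salary changes, independently of how many offers were submitted in a given month.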

## Key Players & Locations


### Key Players by Total Number of Submitted Data Jobs



In [13]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offers_key_players_pareto.html"></iframe>

### Locations
- Identify location trends.
- What are the salaries for our targets?

## Skills and Tools Requirements

The requirements sections of the offers were used to produce a list of top technologies. The offers' contents were then searched for those technologies, and the results are summarised in the chart below.

In [14]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offers_tech_requirements_chord.html"></iframe>
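The counting step described above might look roughly like this. It is a sketch using only the standard library: the technology list, the helper names, and the sample offer texts are hypothetical, and the recipe notebooks linked in the Appendix use NLTK for the actual analysis. The pair counts are the kind of input a chord diagram needs.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical shortlist; the real list was derived from the offers themselves.
TECHNOLOGIES = ["sql", "python", "excel", "tableau"]


def technologies_in(text):
    """Return the set of listed technologies mentioned as whole words in text."""
    lowered = text.lower()
    return {
        tech
        for tech in TECHNOLOGIES
        if re.search(r'(?<!\w)' + re.escape(tech) + r'(?!\w)', lowered)
    }


def summarise(offer_texts):
    """Count mentions per technology and co-mentions per technology pair."""
    mentions = Counter()
    pairs = Counter()
    for text in offer_texts:
        found = sorted(technologies_in(text))
        mentions.update(found)
        pairs.update(combinations(found, 2))
    return mentions, pairs


# Hypothetical offer texts, invented for the example.
offer_texts = [
    "Strong SQL and Python skills required.",
    "Advanced Excel; SQL is a plus.",
    "Experience with Tableau, SQL and Excel.",
    "MySQL administration.",
]
mentions, pairs = summarise(offer_texts)
print(mentions.most_common())
print(pairs.most_common())
```

The whole-word lookarounds keep "MySQL" from being counted as a mention of "sql"; whether such cases should count is a judgement call that depends on the technology list.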

## Remuneration Analysis


### Data Offers Salary Statistics Boxplot


In [23]:
%%HTML 
<iframe width="100%" height="525px" seamless="seamless" src="./monthly_salary_statistics_data_offers.html"></iframe>

### Data Offers Salary Scatter Plot

The shaded area illustrates the interquartile range ([IQR](https://en.wikipedia.org/wiki/Interquartile_range)) calculated for all job offers with a disclosed salary. In other words, 50% of the offered salaries fall somewhere inside the dark band. While it was more or less expected for professions in the technology field to be paid above the average, it is also interesting to see how some positions remain unfilled for long periods of time even when a high salary is offered.

In [18]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offers_scatter.html"></iframe>
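For reference, the interquartile range can be computed with the standard library as shown below. The salary figures are made up for the example, and the choice of the `inclusive` quantile method is an assumption; other methods give slightly different quartiles on small samples.

```python
import statistics

# Hypothetical monthly salaries, invented for the example.
salaries = [1000, 2000, 3000, 4000, 8000]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, median, q3 = statistics.quantiles(salaries, n=4, method="inclusive")
iqr = q3 - q1

# Offers whose salary falls inside the shaded band [Q1, Q3].
inside_band = [s for s in salaries if q1 <= s <= q3]
print(q1, q3, iqr, inside_band)
```

The band is insensitive to extreme values: the single high salary in the sample moves the mean but leaves Q1 and Q3 unchanged.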

## Appendix



### Technologies Used

The following technologies were used to make this presentation:

- Plotly

<table>
    <tr>
        <td><img src='https://www.vectorlogo.zone/logos/python/python-vertical.svg' style="height: 60px;"></td>
        <td><img src='https://www.vectorlogo.zone/logos/postgresql/postgresql-vertical.svg' style="height: 60px;"></td>
        <td><img src='https://www.vectorlogo.zone/logos/jupyter/jupyter-icon.svg' style="height: 60px;"></td>
        <td><img src='https://www.vectorlogo.zone/logos/github/github-icon.svg' style="height: 60px;"></td>
        <td><img src='https://www.vectorlogo.zone/logos/javascript/javascript-vertical.svg' style="height: 60px;"></td>   
        <td><img src='https://www.vectorlogo.zone/logos/w3_html5/w3_html5-icon.svg' style="height: 60px;"></td>
    </tr>
</table>


### Visualization Recipes

Recipes for all charts used in this presentation are available at the links below:

1. Data Offers Share Pie and Bar [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Share_Pie_and_Bar_Chart.ipynb)
2. Data Offers Over Time Bar Chart [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Historical_Bar_Charts.ipynb)
3. Data Offers Over Time Heatmap [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Heatmap.ipynb)
4. Data Offers Salary Transparency [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Salary_Transparency.ipynb)
5. Key Players by Total Number of Submitted Data Jobs (Pareto Chart) [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Key_Players_Double_Bar_Chart.ipynb)
6. Deep Dive into Data Offers Requirements with NLTK [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Requirements_Deep_Dive.ipynb)
7. A Chord Diagram Revealing Key Data Jobs Technology Requirements [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Requirements_Relationships_Chord.ipynb)
8. Data Offers Salary Statistics Boxplot [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Monthly_Salary_Stats_Box.ipynb)
9. Data Offers Salary Scatterplot [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Salary_Scatter.ipynb)

In [4]:
# Load the custom stylesheet and inject it into the notebook output.
from IPython.display import HTML
with open('../resources/styles/datum.css', 'r') as f:
    style = f.read()
HTML(style)