# Data Job Trends in Bulgaria



##### _Business Intelligence, Data Engineering, and Data Analytics Jobs in Bulgaria - Availability and Required Skillset_

## Abstract

This presentation attempts to shed light on the employment demand for data engineering, data science, data analysis, and data presentation in Bulgaria by analyzing a dataset of job offers collected from one of the biggest online employment portals in the country. Referred to collectively as *data jobs*, these professional opportunities form a small but rapidly growing subset of the IT job market.

The Data Jobs Analysis article is part of a larger project that covered the full data analytics lifecycle, from data collection to data analysis and presentation. The project was launched primarily for learning purposes, but also to gather relevant, fresh, and not readily available information that can be used for data-driven decision making in the fast-moving landscape of IT jobs. As the analysis of the data and its presentation is the logical conclusion of a data intelligence project, this article's primary objective is to provide a view of the job market in Bulgaria. The focus is on the presentation of the data: choosing the right variables to track and picking the charts best suited to show the various trends.

To meet a secondary objective of the project and keep the research in a knowledge base of sorts, the process of creating each chart is captured in a separate workbook along with a brief explanation of the steps performed. These workbooks are linked in the Resources Used section at the end of this article.

## Job Offers Analyzed

### Defining Data Jobs

So what exactly is considered a *data job*? As a matter of the author's personal preference, this analysis selects offers related to the fields of Data Engineering, Business Intelligence (BI), Data Analytics, and Reporting. I attempted to exclude basic data entry and digitization jobs, as well as false hits unrelated to these categories. More on the offer selection criteria follows.

The short descriptions below are taken from [Simplilearn's "Top 12 Interesting Careers to Explore in Big Data Infographic"](https://www.slideshare.net/Simplilearn/top-12-interesting-careers-in-big-data-70520677).

#### Data Visualization Developers
They design, develop and provide production support of interactive data visualizations used across the enterprise. They possess an artistic mind that conceptualizes, designs, and develops reusable graphics/data visualizations applying strong technical knowledge to implement them using the latest technologies.

#### Data Engineers
They ensure uninterrupted flow of data between servers and applications and are also responsible for data architecture. They develop, maintain, test, and evaluate Big Data solutions within organizations.

#### Data Analysts
Subject-matter experts with deep domain knowledge who support various development initiatives, assist in testing and perform research in order to understand business issues and to develop practical cost-effective solutions to problems.

#### Data Scientists
Experts with quantitative background skilled in statistics who help companies make sense of data. They use their analytical and technical abilities to extract meaningful insights from data.

### Data Filtering

The process begins with the job offers deemed most promising: those in the fields of Data Analysis, Data Integration, Data Wrangling, etc. The first step is to correctly filter the offers relevant to our research, which was done using a series of targeted queries against the offer titles and content. All of the positions described above share traits, and it is common for an organization to look for skills spanning more than one of the job descriptions (e.g. a data visualization expert with Extract, Transform, Load (ETL) or Data Warehousing (DWH) knowledge).
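
The targeted queries themselves are not reproduced in this article. As a rough illustration only, a keyword filter over titles and content could look like the sketch below. The patterns, the `is_data_job` helper, and the sample offers are all hypothetical, and the original queries may have been written quite differently (for example in SQL).

```python
import re

# Hypothetical keyword patterns; the actual queries are not shown in the article.
DATA_JOB_PATTERN = re.compile(
    r"\b(?:data\s+(?:engineer|analyst|scientist|warehouse)"
    r"|business\s+intelligence|BI|ETL|DWH)\b",
    re.IGNORECASE,
)
# Titles that look data-related but fall outside the scope of the analysis.
EXCLUDE_PATTERN = re.compile(r"\bdata\s+entry\b|\bdigiti[sz]ation\b", re.IGNORECASE)


def is_data_job(title: str, content: str) -> bool:
    """Return True when the title or content matches a data-job keyword
    and the title does not match an excluded role."""
    if EXCLUDE_PATTERN.search(title):
        return False
    return bool(DATA_JOB_PATTERN.search(f"{title}\n{content}"))


# Made-up offers, for illustration only.
offers = [
    {"title": "Senior Data Engineer", "content": "Build ETL pipelines."},
    {"title": "Data Entry Clerk", "content": "Type invoices into the system."},
    {"title": "Java Developer", "content": "Spring, Hibernate."},
    {"title": "Reporting Specialist", "content": "Experience with Business Intelligence tools."},
]

data_offers = [o for o in offers if is_data_job(o["title"], o["content"])]
print([o["title"] for o in data_offers])
```

Matching on whole words keeps short abbreviations such as BI from firing inside unrelated words, which is one common source of false hits.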

The result of our initial filtering is a small subset of job offers which we can explore further:

In [1]:
%%HTML
<iframe width="100%" height="525px" seamless="seamless" src="./data_offers_pie_and_bar.html"></iframe>

## Temporal Analysis

Having identified and verified the set of offers targeted, we can now provide some historical perspective on the time period available to our study. We begin with an important visualization presenting how the quantity of submitted data jobs is changing over time.



### Offers over time bar chart

Presented as a standard bar chart with two aggregation levels (week and month), the data shows a significant increase in the total number of data-related job offer submissions. It is also interesting to note that this increase is almost unaffected by the annual cyclical pattern in employment market activity (to be addressed in a separate article), where distinct seasonal high and low periods are observed.
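
For readers who want to reproduce the two aggregation levels, a minimal sketch using only the standard library follows. The submission dates are made up, and the choice of ISO weeks as the weekly bin is an assumption rather than the method used in the linked workbook.

```python
from collections import Counter
from datetime import date

# Made-up submission dates, for illustration only.
submitted = [
    date(2018, 1, 2), date(2018, 1, 4), date(2018, 1, 9),
    date(2018, 2, 1), date(2018, 2, 1), date(2018, 2, 14),
]


def month_key(d: date) -> str:
    """Label a date with its calendar month, e.g. '2018-01'."""
    return f"{d.year}-{d.month:02d}"


def week_key(d: date) -> str:
    """Label a date with its ISO year and week, e.g. '2018-W01'."""
    iso_year, iso_week, _ = d.isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


offers_per_month = Counter(month_key(d) for d in submitted)
offers_per_week = Counter(week_key(d) for d in submitted)

print(sorted(offers_per_month.items()))
print(sorted(offers_per_week.items()))
```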

In [2]:
%%HTML 
<iframe width="100%" height="525px" seamless="seamless" src="./data_offers_over_time_bar_chart.html"></iframe>

### Offers over time by type

This stacked bar chart shows the number of submissions by data job type. Note that some offers can belong to multiple job types (thus the total monthly count will not align with the count in the Total Submitted Data Jobs chart).
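
The mismatch between the two totals follows directly from counting one offer once per job type. A small sketch with made-up offers and hypothetical type labels shows the effect:

```python
from collections import Counter

# Hypothetical offers for a single month; each may carry several job-type labels.
offers = [
    {"id": 1, "types": {"Data Engineer"}},
    {"id": 2, "types": {"Data Analyst", "Data Visualization"}},
    {"id": 3, "types": {"Data Engineer", "Data Analyst"}},
]

# One count per (offer, type) pair: this is what the stacked bars add up to.
per_type = Counter(t for offer in offers for t in offer["types"])

total_offers = len(offers)               # height of the bar in the totals chart
stacked_height = sum(per_type.values())  # height of the stacked bar

print(per_type, total_offers, stacked_height)
```

Here three offers produce a stacked bar of height five, because two of them are counted under two types each.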

In [3]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offer_profiles_stacked_bar.html"></iframe>

### Offers over time heatmap

Heatmaps are another useful way to present historical data and visually explore the trends. The one below confirms the increase in demand shown in the previous chart, but also reveals weekly dynamics. Notice the heatmap getting darker as time progresses.
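
A heatmap of this kind needs the submissions binned into a two-dimensional grid. The sketch below assumes a weekday-by-ISO-week layout, which may not match the axes of the actual chart; the dates and the `heatmap_grid` helper are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Made-up submission dates, for illustration only.
submitted = [
    date(2018, 1, 1), date(2018, 1, 1), date(2018, 1, 3),
    date(2018, 1, 8), date(2018, 1, 12),
]


def heatmap_grid(dates):
    """Count submissions per (ISO week, weekday) cell.

    Returns the sorted list of weeks (columns) and a 7-row grid,
    one row per weekday from Monday to Sunday.
    """
    counts = defaultdict(int)
    for d in dates:
        iso_year, iso_week, iso_weekday = d.isocalendar()
        counts[(iso_year, iso_week), iso_weekday] += 1
    weeks = sorted({week for week, _ in counts})
    grid = [
        [counts.get((week, weekday), 0) for week in weeks]
        for weekday in range(1, 8)  # ISO weekdays: 1 = Monday ... 7 = Sunday
    ]
    return weeks, grid


weeks, grid = heatmap_grid(submitted)
print(weeks)
for row in grid:
    print(row)
```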

In [4]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offers_subm_heatmap.html"></iframe>

## Key Players & Locations


### Key Players by Total Number of Submitted Data Jobs



In [5]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offers_key_players_pareto.html"></iframe>

### Job Listings by Location


In [6]:
%%HTML
<iframe width="100%" height="550px" src="./data_offers_locations.html"></iframe>

## Skills and Tool Requirements

The requirements listed in the offers were used to produce a list of the top technologies. The offers' contents were then searched for each of these technologies, and the results are summarized in the chord diagram below.
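
A chord diagram is driven by how often technologies appear together in the same offer. The sketch below counts single mentions and pairwise co-occurrences with plain regular expressions. The technology list and offer texts are made up, and the linked workbooks use NLTK, so this is a simplified stand-in and not the original method.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical technology list and offer texts, for illustration only.
TECHNOLOGIES = ["SQL", "Python", "Tableau", "Excel"]

offer_texts = [
    "Strong SQL and Python skills; Tableau is a plus.",
    "Advanced Excel, SQL reporting.",
    "Python developer with sql experience.",
]


def technologies_in(text: str) -> list[str]:
    """Return the technologies mentioned in the text as whole words,
    case-insensitively, in the order they appear in TECHNOLOGIES."""
    return [
        tech for tech in TECHNOLOGIES
        if re.search(rf"\b{re.escape(tech)}\b", text, re.IGNORECASE)
    ]


mentions = Counter()  # how many offers mention each technology
pairs = Counter()     # how many offers mention both technologies of a pair
for text in offer_texts:
    found = technologies_in(text)
    mentions.update(found)
    pairs.update(combinations(found, 2))

print(mentions.most_common())
print(pairs.most_common())
```

The `pairs` counter is effectively the upper triangle of the co-occurrence matrix that a chord diagram visualizes.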

In [7]:
%%HTML
<iframe width="100%" height="550px" src="./data_offers_tech_requirements_chord.html"></iframe>

## Remuneration Analysis


### Data Job Salary Transparency Chart

This percent stacked area graph illustrates the share of data jobs whose offers reveal salary information. In this type of chart, the value of each group is normalized at each time stamp and presented as a percentage of the whole, allowing the reader to compare groups. To filter out the noise from daily fluctuations, the data has been aggregated into one-week bins.
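
The normalization step can be sketched as follows: count offers with and without a disclosed salary per week, then divide each count by the weekly total. The offers are made up, and ISO weeks are an assumed choice of bin.

```python
from collections import defaultdict
from datetime import date

# (submission date, salary disclosed?) pairs; made-up data for illustration.
offers = [
    (date(2018, 1, 1), True),
    (date(2018, 1, 2), False),
    (date(2018, 1, 3), False),
    (date(2018, 1, 5), False),
    (date(2018, 1, 8), True),
    (date(2018, 1, 10), False),
]

# Step 1: bin the offers by ISO week and count both groups.
weekly = defaultdict(lambda: {"disclosed": 0, "hidden": 0})
for submitted, has_salary in offers:
    iso_year, iso_week, _ = submitted.isocalendar()
    weekly[(iso_year, iso_week)]["disclosed" if has_salary else "hidden"] += 1

# Step 2: normalize each week so that the two groups sum to 100 percent.
weekly_share = {}
for week, counts in sorted(weekly.items()):
    total = counts["disclosed"] + counts["hidden"]
    weekly_share[week] = {group: 100 * n / total for group, n in counts.items()}

print(weekly_share)
```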

It is interesting to notice that the share of data job offers with a published salary is, on average, much lower than the share among all job offers.

In [8]:
%%HTML 
<iframe width="100%" height="525px" src="./data_jobs_salary_transparency.html"></iframe>

### Data Offer Salary Statistics Boxplot

This chart should be taken with a grain of salt because very few job offers with a disclosed salary range are available to work with each month.
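
One way to keep the small sample sizes visible is to report the number of offers next to each month's statistics. The sketch below does this with made-up salaries; the `MIN_SAMPLE` threshold is an arbitrary value chosen for illustration, not one used in the analysis.

```python
from collections import defaultdict
from statistics import median, quantiles

# (month, disclosed salary) pairs; made-up data for illustration.
salaries = [
    ("2018-01", 2000), ("2018-01", 3000), ("2018-01", 4000),
    ("2018-01", 5000), ("2018-01", 6000),
    ("2018-02", 3500),
]

MIN_SAMPLE = 5  # arbitrary threshold below which a month is flagged

by_month = defaultdict(list)
for month, salary in salaries:
    by_month[month].append(salary)

summary = {}
for month, values in sorted(by_month.items()):
    row = {
        "n": len(values),
        "median": median(values),
        "reliable": len(values) >= MIN_SAMPLE,
    }
    # Quartiles need at least two data points.
    if len(values) >= 2:
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        row["q1"], row["q3"] = q1, q3
    summary[month] = row

for month, row in summary.items():
    print(month, row)
```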

In [9]:
%%HTML 
<iframe width="100%" height="525px" seamless="seamless" src="./monthly_salary_statistics_data_offers.html"></iframe>

### Data Offers Salary Scatter Plot

The shaded area illustrates the interquartile range ([IQR](https://en.wikipedia.org/wiki/Interquartile_range)) calculated for all job offers with a disclosed salary. In other words, half of the disclosed salaries offered in Bulgaria fall within the dark band. While it was more or less expected for positions in technology to be paid above the average, it is also interesting to see how some positions remain unfilled for long periods of time even when a high salary is offered.
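
As a sketch of how such a band can be derived, the snippet below computes the first and third quartiles over all disclosed salaries and places each data offer relative to the band. All salary figures are made up, and the `position` helper and the inclusive quartile method are assumptions.

```python
from statistics import quantiles

# Made-up salaries, for illustration only.
all_offer_salaries = [1000, 1200, 1500, 1800, 2000, 2500, 3000, 4000, 5000]
data_offer_salaries = [1400, 2800, 4500]

# The IQR band spans the first to the third quartile of all disclosed salaries.
q1, _, q3 = quantiles(all_offer_salaries, n=4, method="inclusive")


def position(salary: float) -> str:
    """Place a salary relative to the IQR band."""
    if salary < q1:
        return "below"
    if salary > q3:
        return "above"
    return "inside"


print(f"IQR band: {q1} to {q3}")
for salary in data_offer_salaries:
    print(salary, position(salary))
```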

In [10]:
%%HTML 
<iframe width="100%" height="525px" src="./data_offers_scatter.html"></iframe>

## Resources Used

### Visualization Recipes

Recipes for all charts used in this presentation: 

1. Data Offers Share Pie and Bar Chart [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Share_Pie_and_Bar_Chart.ipynb)
2. Data Offers Over Time Bar Chart [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Historical_Bar_Charts.ipynb)
3. Data Offers Over Time Heatmap [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Heatmap.ipynb)
4. Data Offers Salary Transparency [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Salary_Transparency.ipynb)
5. Key Players by Total Number of Submitted Data Jobs (Pareto Chart) [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Key_Players_Pareto_Chart.ipynb)
6. Deep Dive into Data Offers Requirements with NLTK [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Requirements_Deep_Dive.ipynb)
7. A Chord Diagram Revealing Key Data Jobs Technology Requirements [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Requirements_Relationships_Chord.ipynb)
8. Data Offers Locations Map [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_By_Location.ipynb)
9. Data Offers Salary Statistics Boxplot [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Monthly_Salary_Stats_Box.ipynb)
10. Data Offers Salary Scatterplot [open notebook](https://nbviewer.jupyter.org/github/nikolovdeyan/Job_Market_Trends_Bulgaria/blob/master/workbooks/Data_Offers_Salary_Scatter.ipynb)

### Technologies Used


This presentation is powered by:

<table>
    <tr>
        <td><img src='https://www.vectorlogo.zone/logos/python/python-ar21.svg' style="height: 60px;"></td>
        <td><img src='https://www.vectorlogo.zone/logos/postgresql/postgresql-ar21.svg' style="height: 60px;"></td>
        <td><img src='https://www.vectorlogo.zone/logos/jupyter/jupyter-ar21.svg' style="height: 60px;"></td>
        <td><img src='https://www.vectorlogo.zone/logos/plot_ly/plot_ly-ar21.svg' style="height: 60px;"></td>
    </tr>
    <tr>
        <td><img src='https://www.vectorlogo.zone/logos/github/github-ar21.svg' style="height: 60px;"></td>
        <td><img src='https://www.vectorlogo.zone/logos/javascript/javascript-ar21.svg' style="height: 60px;"></td>   
        <td><img src='https://www.vectorlogo.zone/logos/w3_html5/w3_html5-ar21.svg' style="height: 60px;"></td>
    </tr>
</table>

In [11]:
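# Load the custom stylesheet and apply it to the notebook.
# HTML is imported from the public IPython.display module.
from IPython.display import HTML

with open('../resources/styles/datum.css', 'r') as f:
    style = f.read()
HTML(style)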