Web Scraping-Google-jobs

This project scrapes job listings from Google Jobs using Python and the Selenium automation library.

Goals & Scope

The overall goal of this project is to understand and visualise the current job market in tech and AI, which is rather over-saturated at the moment, especially for entry-level positions. This project is only the first stage towards understanding this data; additional correlated information will be considered in the future.

The data comes from Google Jobs, which represents only a small portion of the jobs advertised across the popular job boards. This platform was chosen because Google allows web scraping (to a certain degree). Previously, I tried to accomplish this task via Indeed, which resulted in my access being temporarily blocked. Nevertheless, additional popular job boards can be considered in the future.

Example of a Typical Job Webpage, Advertised sites, Base salary, Description Panel

Below are the areas of the webpage that we are interested in. Since this is a dynamic webpage, some information only loads after interaction, e.g. scrolling or clicking.

Description Button and Typical Pay range

The process or Algorithm

In order to extract the most relevant information, here is the overall walkthrough of this script:

As the page is scrolled, it loads new job cards dynamically (Area 1).
Ten job cards are already preloaded, so we first count those and then scroll to load new cards.
As we traverse the job cards, each one is clicked to load its job information (Area 2).
- First, the advertised job board is extracted.
- Second, the base salary is checked (shown at the top, below the advertised job sites). If it is available, it is extracted; otherwise the typical salary range is extracted (shown at the bottom).
- If there is no salary information at all, 0 is recorded for this job entry.
- Next, the job description is extracted after expanding it by clicking the full description button.
These actions are repeated until there are no new cards left to load. (On average, each job search query loads about 150 jobs.)
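
The walkthrough above can be sketched without a browser. This is a minimal illustration, not the notebook's actual code: `load_all_cards`, `choose_salary`, and the fake page state are hypothetical names introduced here. In a real Selenium session, `count_cards` would presumably wrap a `driver.find_elements(...)` call and `scroll_once` a `driver.execute_script(...)` call, with selectors that depend on the current Google Jobs markup.

```python
def load_all_cards(count_cards, scroll_once, max_rounds=100):
    """Scroll until the number of job cards stops growing.

    count_cards and scroll_once are callables supplied by the caller
    (hypothetical wrappers around the browser driver).
    """
    seen = count_cards()  # the preloaded cards
    for _ in range(max_rounds):
        scroll_once()
        now = count_cards()
        if now == seen:  # no new cards appeared, so stop
            break
        seen = now
    return seen


def choose_salary(base_salary, typical_range):
    """Prefer the base salary, fall back to the typical range, else 0."""
    if base_salary:
        return base_salary
    if typical_range:
        return typical_range
    return 0


# Stand-in for a page with 10 preloaded cards and 35 cards in total.
state = {"loaded": 10, "total": 35}


def fake_count():
    return state["loaded"]


def fake_scroll():
    # Each scroll reveals up to 10 more cards (assumed step size).
    state["loaded"] = min(state["total"], state["loaded"] + 10)


total_cards = load_all_cards(fake_count, fake_scroll)
```

Passing the driver interactions in as callables keeps the stopping rule and the salary fallback testable on their own, separate from any page-specific selectors.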

Data to be extracted

[Job Title, Company Name, Salary (Base Salary or Typical Salary range), Job Advertised website (LinkedIn / Indeed), The URL, Description]
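
One way to hold these six fields is a small record type. The sketch below is an assumption about how a row could be represented; the class name `JobRecord`, its field names, and the sample values are hypothetical and not taken from the notebook.

```python
from dataclasses import dataclass, asdict
from typing import Union


@dataclass
class JobRecord:
    title: str
    company: str
    salary: Union[str, int]  # salary text, or 0 when none was found
    site: str                # advertised job board, e.g. LinkedIn or Indeed
    url: str
    description: str


# Illustrative values only.
record = JobRecord(
    title="Graduate Data Scientist",
    company="Example Ltd",
    salary=0,
    site="LinkedIn",
    url="https://example.com/job/1",
    description="Example description text.",
)
row = asdict(record)  # plain dict, convenient for building a table later
```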

Several job titles can be searched in one session. The following roles were used to scrape the data visualised below: ['machine+learning','graduate+data+science','graduate+software+engineering', 'data+science']
The roles are formatted into the shortened URL form required by Google: ["https://www.google.com/search?q={}&ibp=htl;jobs&=#htivrt=jobs"]
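
As a small illustration of this formatting step, the roles can be substituted into the URL template with `str.format`. The variable names `ROLES`, `URL_TEMPLATE`, and `search_urls` are assumptions made for this sketch; the role strings and the template are the ones listed above.

```python
# Search roles, already joined with "+" as required in the query string.
ROLES = [
    "machine+learning",
    "graduate+data+science",
    "graduate+software+engineering",
    "data+science",
]

# URL template from this README; "{}" is replaced by the role.
URL_TEMPLATE = "https://www.google.com/search?q={}&ibp=htl;jobs&=#htivrt=jobs"

search_urls = [URL_TEMPLATE.format(role) for role in ROLES]

for url in search_urls:
    print(url)
```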

Graphs

The most common Job Titles from Scraped data

Data Science proves to be the title most often used by employers and recruiters in job adverts. A varied range of graduate positions also appears, although that reflects the keywords that were used during the search!

Job-Boards with most jobs advertised

As expected, Indeed tops the list with more than 40 jobs, followed by LinkedIn and Jooble.

Job-Boards with most Entry Level Jobs

To create this bar chart, graduate keywords were used to search the dataframe: ['Graduate', 'Entry Level', 'Junior', 'Recent Graduate','Grad']. Surprisingly, Indeed ends up at the bottom of the list, with the fewest graduate-level positions advertised, at least judging by the job titles.
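
The keyword search can be illustrated in plain Python. This sketch swaps the dataframe for a list of dictionaries and uses a case-insensitive regular expression with `collections.Counter`; it is not the notebook's code, and the sample jobs are made up for the example. Only the keyword list comes from the text above.

```python
import re
from collections import Counter

GRADUATE_KEYWORDS = ["Graduate", "Entry Level", "Junior", "Recent Graduate", "Grad"]

# One case-insensitive pattern that matches any of the keywords.
pattern = re.compile(
    "|".join(re.escape(keyword) for keyword in GRADUATE_KEYWORDS),
    re.IGNORECASE,
)

# Hypothetical sample rows standing in for the scraped data.
jobs = [
    {"title": "Graduate Data Scientist", "site": "LinkedIn"},
    {"title": "Senior ML Engineer", "site": "Indeed"},
    {"title": "Junior Software Engineer", "site": "LinkedIn"},
    {"title": "Entry Level Data Analyst", "site": "Jooble"},
]

# Count entry-level adverts per job board, judged by the title only.
entry_counts = Counter(
    job["site"] for job in jobs if pattern.search(job["title"])
)
```

Because "Grad" is matched as a substring, it also covers "Graduate" and "Recent Graduate", and it could match unrelated words that happen to contain "grad".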

Salary Density for popular roles

A large portion of salaries sit just below £40k, representing the entry-level jobs, whilst the peak of the density distribution is at the higher end, likely above £80k, which indicates that senior-level positions dominate the job market. This implies that organisations are actively seeking experienced professionals in the current economic climate.

Ronnn007/Webscraping-Google-jobs

Folders and files

Latest commit

History

Repository files navigation

Web Scraping-Google-jobs

Goals & Scope

Example of a Typical Job Webpage, Advertised sites, Base salary, Description Panel

Description Button and Typical Pay range

The process or Algorithm

Data to be extracted

Graphs

The most common Job Ttiles from Scrapped data

Job-Boards with most jobs advertised

Job-Boards with most Entry Level Jobs

Salary Density for popular roles

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages