
ronoc2020/job-scraper

 
 


job-scraper

Scraping jobs from Indeed or CWjobs

This repository

The module job_scraper.py scrapes job postings from Indeed.co.uk or CWjobs.co.uk.

Both sites require the Beautiful Soup package. For CWjobs, the Selenium web driver is also required. These can be installed as follows:

$ pip install beautifulsoup4
$ pip install selenium
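
As a rough illustration of what Beautiful Soup does once installed, the sketch below pulls job fields out of an HTML string. It is not the code in job_scraper.py: the markup, the class names and the helper names (`SAMPLE_HTML`, `parse_jobs`, `_text`) are all made up for the example, and the real Indeed and CWjobs pages are structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: real Indeed/CWjobs pages use different tags and classes.
SAMPLE_HTML = """
<div class="job">
  <h2 class="title">Data Analyst</h2>
  <span class="company">Example Ltd</span>
  <span class="location">London</span>
</div>
<div class="job">
  <h2 class="title">Python Developer</h2>
  <span class="company">Sample plc</span>
</div>
"""


def _text(card, tag, css_class):
    """Return the stripped text of the first matching element, or None if absent."""
    node = card.find(tag, class_=css_class)
    return node.get_text(strip=True) if node else None


def parse_jobs(html):
    """Extract one dict per job card from an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {
            "title": _text(card, "h2", "title"),
            "company": _text(card, "span", "company"),
            "location": _text(card, "span", "location"),
        }
        for card in soup.find_all("div", class_="job")
    ]


jobs = parse_jobs(SAMPLE_HTML)
```

Missing fields come back as None rather than raising, which is convenient when postings omit details such as location or salary.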

To use this module, import job_scraper.py and call the function find_jobs_from(), which takes several arguments. For an explanation and demonstration of the required arguments, see Demo.ipynb.

Terms and conditions

I do not condone scraping data from Indeed or CWjobs in any way. Anyone who wishes to do so should first read their statements on scraping software here and here.

Using the Selenium web driver

At present, the default browser is set to Google Chrome. This can be modified within job_scraper.py.

To extract jobs from CWjobs using Selenium, the appropriate driver must be installed. The driver in this repository is for Google Chrome version 81. If you need a different version, see this link to download an appropriate driver for your Google Chrome browser, and place it in the same directory as job_scraper.py.
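
A minimal sketch of locating the driver next to the module and starting Chrome with it is shown below. This is an illustration, not the code in job_scraper.py: the names `DRIVER_NAME`, `driver_path` and `start_chrome` are hypothetical, the driver file name is assumed to be "chromedriver", and the `Service` call assumes Selenium 4 (older Selenium releases passed the path to `webdriver.Chrome` directly).

```python
from pathlib import Path

# Assumed file name; on Windows the executable is usually "chromedriver.exe".
DRIVER_NAME = "chromedriver"


def driver_path(module_dir):
    """Return the expected driver location inside module_dir."""
    return Path(module_dir) / DRIVER_NAME


def start_chrome(module_dir="."):
    """Start Chrome with the local driver, failing early if the file is missing."""
    path = driver_path(module_dir)
    if not path.is_file():
        raise FileNotFoundError(f"No Chrome driver found at {path}")
    # Imported here so the path check above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=str(path)))
```

Checking for the file before starting the browser gives a clearer error than the one Selenium raises when the driver is absent. The driver's major version must also match the installed Chrome version.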

Accompanying blog post

A full description of this code and the process I followed to write it is available here.
