ETL for finding most popular technologies in job offers on pracuj.pl.

rafaluk/skillsearch

skillsearch

An ETL tool for searching skills in job offers.

High level design

To be created.

Installation

  1. Clone this project.
  2. Install all dependencies: pip install -r requirements.txt
  3. Download ChromeDriver (if you don't have it already): https://chromedriver.chromium.org/downloads
  4. In Config.py, set:
    • the path to chromedriver.exe

Run

Use run.py to start the ETL:

python run.py

Parameters

  • -p, --phases - specify the phases to run, for example:
    • python run.py -p 1 - runs only the first phase
    • python run.py -p 3 - runs only the third phase
    • python run.py -p 2-4 - runs the second through fourth phases, inclusive
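
The phase syntax above can be sketched with argparse. This is only an illustration of how a value such as 3 or 2-4 might be expanded into a list of phase numbers; parse_phases and build_parser are hypothetical names, and the actual parsing in run.py may differ.

```python
import argparse


def parse_phases(spec: str) -> list[int]:
    """Expand '3' into [3] and '2-4' into [2, 3, 4] (hypothetical helper)."""
    if "-" in spec:
        start, end = (int(part) for part in spec.split("-", 1))
        if start > end:
            raise ValueError(f"invalid phase range: {spec}")
        return list(range(start, end + 1))
    return [int(spec)]


def build_parser() -> argparse.ArgumentParser:
    # Assumption: omitting -p means "run every phase", represented here as None.
    parser = argparse.ArgumentParser(description="Run the ETL phases.")
    parser.add_argument(
        "-p", "--phases",
        type=parse_phases,
        default=None,
        help="single phase (e.g. 3) or inclusive range (e.g. 2-4)",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for the demonstration.
args = build_parser().parse_args(["-p", "2-4"])
print(args.phases)
```

Passing the converter as type= lets argparse report a malformed value as a normal usage error instead of a traceback.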

Skill list

The list of skills is kept in skills.txt. You can add or remove entries freely. The ETL cross-checks this list against the job offers.
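
As a rough illustration of the cross-check, the sketch below counts how many offers mention each skill. It assumes skills.txt holds one skill per line and that offers are available as plain text; load_skills and count_skills are hypothetical names, not the project's actual functions.

```python
import re
from collections import Counter


def load_skills(lines):
    """Return non-empty, stripped entries (assumes one skill per line)."""
    return [line.strip() for line in lines if line.strip()]


def count_skills(skills, offers):
    """Count the number of offers that mention each skill at least once."""
    counts = Counter()
    for offer in offers:
        text = offer.lower()
        for skill in skills:
            # The lookarounds keep 'Java' from matching inside 'JavaScript'
            # and 'C' from matching inside 'C++' or 'C#'.
            pattern = r"(?<![\w+#])" + re.escape(skill.lower()) + r"(?![\w+#])"
            if re.search(pattern, text):
                counts[skill] += 1
    return counts


# In the real tool the lines would come from skills.txt, e.g. open("skills.txt").
skills = load_skills(["Python\n", "Java\n", "\n", "SQL\n"])
offers = [
    "Senior Python developer with SQL",
    "JavaScript and Java engineer",
    "python, docker",
]
print(count_skills(skills, offers).most_common())
```

Matching is case-insensitive here, which is a design choice for the sketch rather than documented behaviour.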

Utils

@calculate_time - decorator for measuring execution time
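
A timing decorator of this kind is commonly written as follows. This is a minimal sketch and not necessarily how the project implements calculate_time; the printed message format and the extract_offers function are assumptions made for the example.

```python
import functools
import time


def calculate_time(func):
    """Print how long the wrapped function took to run (illustrative version)."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too.
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.3f} s")
    return wrapper


@calculate_time
def extract_offers(n):
    # Hypothetical stand-in for an ETL phase.
    return [f"offer-{i}" for i in range(n)]


offers = extract_offers(3)
```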
