Very simple scrapy scraper to get stackoverflow jobs
data
notebooks
stackjobs
.gitignore
LICENSE
README.md
clean_data.py
create_technology_data.py
enhance_data_with_pandas.py
export_from_mongo.py
merge_exported_jobs.py
scrapy.cfg


stackjobs

A very simple Scrapy scraper that collects Stack Overflow job listings, stores them in MongoDB, and uses pandas to enhance the data.

Articles written related to this project:

Workflow steps

The steps should be run in the following order:

  • run the scraper
  • export from mongodb (export_from_mongo.py)
  • merge exported jobs (merge_exported_jobs.py)
  • enhance data (enhance_data_with_pandas.py)
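
The steps above could be chained with a small runner script. This is only a sketch under several assumptions: the spider name `stackjobs` is a guess (check the spiders in the `stackjobs` package for the real name), each script is assumed to run without arguments from the repository root, and `build_commands` and `run_workflow` are hypothetical helpers that are not part of this repository.

```python
import subprocess
import sys

# Assumption: the spider is named after the project. Replace with the real name.
SPIDER_NAME = "stackjobs"


def build_commands(spider_name=SPIDER_NAME, python=sys.executable):
    """Return the workflow commands in the order listed in this README."""
    return [
        ["scrapy", "crawl", spider_name],        # run the scraper
        [python, "export_from_mongo.py"],        # export from mongodb
        [python, "merge_exported_jobs.py"],      # merge exported jobs
        [python, "enhance_data_with_pandas.py"],  # enhance data
    ]


def run_workflow(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first one that fails."""
    for cmd in commands:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so later steps never see partial output.
        runner(cmd, check=True)
```

Calling `run_workflow(build_commands())` from the repository root would then execute the four steps in sequence, assuming MongoDB is reachable with whatever settings the scripts expect.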