
Project 3: Web Scraping Indeed Job Listings


I built a web scraper that uses BeautifulSoup to parse data science job listings from Indeed in twenty cities across the US. The scraper pulled 5,000 job postings per location. The data was used to identify which factors most directly increased salaries for data scientists. Job listings were categorized as either above or below the mean salary. Salary predictions were produced with a random forest model and, separately, with support vector machines. L1 regularization was employed. The web scraper and all models were built in Python (BeautifulSoup, scikit-learn, NLTK, pandas).
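
The above/below-mean labeling step can be illustrated with a minimal sketch. This is not the notebook's code: the salary string format, the midpoint rule for ranges, and the helper names `parse_salary` and `label_above_mean` are all assumptions made for illustration, and the sample strings are invented.

```python
import re
from statistics import mean


def parse_salary(text):
    """Return the midpoint of the dollar amounts in a salary string, or None.

    Assumes strings shaped like '$80,000 - $100,000 a year' (hypothetical format).
    """
    amounts = [
        float(match.replace(",", ""))
        for match in re.findall(r"\$(\d[\d,]*(?:\.\d+)?)", text)
    ]
    if not amounts:
        return None
    return sum(amounts) / len(amounts)


def label_above_mean(salaries):
    """Label each salary 1 if it is strictly above the mean of the list, else 0."""
    average = mean(salaries)
    return [1 if salary > average else 0 for salary in salaries]


# Invented example strings, not data from the project.
raw = [
    "$80,000 - $100,000 a year",
    "$120,000 a year",
    "$60,000 - $70,000 a year",
    "Not listed",
]

# Drop postings with no parseable salary before computing the mean.
salaries = [s for s in (parse_salary(r) for r in raw) if s is not None]
labels = label_above_mean(salaries)
```

Binary labels of this kind could then serve as the target for a classifier such as a random forest or a support vector machine; how the notebook actually builds its features and targets is not stated here.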


  • Jupyter Notebook: "Project 3 - Web Scraping Indeed Job Listings Jupyter Notebook.ipynb"
  • Presentation: "Presentation Web Scraping Indeed Data Science Positions.pdf"
  • Executive Summary: "Exec Summary - Web Scraping Indeed Job Listings.pdf"