The objective of this project was two-fold. First, I wanted to explore the current state of the data-related job market. Second, I wanted to gain first-hand experience in web scraping.
Glassdoor is an American company established in 2007. It started as a platform where employees could anonymously post reviews about salary and workplace environment, and it has grown into one of the most trusted websites for company research and job hunting. Glassdoor provides its own salary estimate for many job postings on the website, and also provides a rating for the company.
I wrote two Python scripts, "scrape.py" and "helper.py", to responsibly scrape the Glassdoor website. I sent queries for Data Scientist, Data Engineer, and Data Analyst jobs in select U.S. cities to the website and used the Selenium package to collect information about the job postings.
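To give a sense of what this kind of scraping loop can look like, here is a minimal sketch. It is not the code in "scrape.py" or "helper.py". The city list, the placeholder `BASE_URL`, the query parameter names, the `li.job-card` CSS selector, and the function names `build_queries` and `scrape` are all assumptions made for illustration. The real site uses different URLs and markup. The random pause between page loads is one simple way to keep the request rate low.

```python
import itertools
import random
import time
from urllib.parse import urlencode

# Job titles come from the project description; the cities are illustrative only.
JOB_TITLES = ["Data Scientist", "Data Engineer", "Data Analyst"]
CITIES = ["New York, NY", "San Francisco, CA"]

# Placeholder endpoint (assumption), not the real Glassdoor search URL.
BASE_URL = "https://example.com/jobs"


def build_queries(titles, cities):
    """Return one search URL per (title, city) pair."""
    return [
        f"{BASE_URL}?{urlencode({'keyword': title, 'location': city})}"
        for title, city in itertools.product(titles, cities)
    ]


def scrape(urls, card_selector="li.job-card", min_delay=2.0, max_delay=5.0):
    """Visit each URL with Selenium and collect the visible text of each job card.

    `card_selector` is a hypothetical CSS selector; adjust it to the page markup.
    """
    # Imported here so build_queries() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    rows = []
    try:
        for url in urls:
            driver.get(url)
            for card in driver.find_elements(By.CSS_SELECTOR, card_selector):
                rows.append({"url": url, "text": card.text})
            # Pause for a random interval between page loads.
            time.sleep(random.uniform(min_delay, max_delay))
    finally:
        # Always close the browser, even if a page load fails.
        driver.quit()
    return rows
```

Calling `build_queries(JOB_TITLES, CITIES)` yields one URL per title and city combination, and passing that list to `scrape` would open a Chrome window and walk through them one at a time.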
I ran my analysis on about 10,000 job postings. The analysis code is in the Jupyter notebook "analysis2.ipynb".