I scrape job sites for energy-modeling positions and try to answer the questions a young engineer might have about them.
The goal is to perform exploratory analysis, produce quality data through cleaning and preparation, and finally apply more in-depth analysis using the NLTK package and/or machine learning methods.
I consider myself a beginner, especially with Python and data science, so a lot of the work here is adapted from online bloggers. This project is a means for me to practice the principles I have learned and to dig into something I'm interested in.
- Features: I am having trouble with the "annual salary" feature because the scraped data contains very little salary information.
- Data Quality: I think the search results include jobs that are irrelevant to the building construction industry, so I'm still working on filtering them out.
- Apply Machine Learning Methods: I'm struggling to apply a machine learning method because I need a dataset to train, test, and evaluate on. This part is still fuzzy to me.
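
The first two challenges above can be sketched with plain Python as a starting point. This is a minimal sketch, not the project's actual code: the keyword list, the `min_hits` threshold, the 10,000 cutoff for "looks like an annual figure", and the function names `is_relevant` and `parse_annual_salary` are all assumptions made for illustration.

```python
import re

# Hypothetical keyword list; tune it against the real search results.
RELEVANT_KEYWORDS = ("energy model", "building", "hvac", "leed", "energyplus")

# Matches dollar amounts such as "$65,000", "$65000" or "$80,000.50".
SALARY_PATTERN = re.compile(r"\$\s?(\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?")


def is_relevant(description, keywords=RELEVANT_KEYWORDS, min_hits=2):
    """Keep a posting only if it mentions at least `min_hits` keywords."""
    text = description.lower()
    hits = sum(1 for kw in keywords if kw in text)
    return hits >= min_hits


def parse_annual_salary(text, min_annual=10000):
    """Return the mean of the annual-looking dollar amounts, or None."""
    amounts = [float(m.replace(",", "")) for m in SALARY_PATTERN.findall(text)]
    # Assumption: values below `min_annual` are hourly or monthly rates.
    amounts = [a for a in amounts if a >= min_annual]
    if not amounts:
        return None
    return sum(amounts) / len(amounts)


if __name__ == "__main__":
    posting = "Energy modeler for commercial building HVAC design."
    print(is_relevant(posting))
    print(parse_annual_salary("$65,000 - $80,000 a year"))
```

A keyword count is a crude filter and will misclassify some postings, but it is easy to inspect by hand. Postings that pass or fail the filter could later serve as rough labels for a train/test split, which is one possible way into the machine learning step.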