Scraping Data Science jobs from LinkedIn and storing them in a SQL Server database.
See the notebook Job-Mystery.ipynb
As I was looking for an internship in Data Science, I needed to gain more information about the skills required to get a job in Data Science. However, manually gathering this info would have been a tedious task. Inspired by some friends, I decided to create a bot to do it for me.
I use the following tools:
- Selenium: a perfect tool for interacting with a browser from the comfort of my PyCharm code.
- Python: elegant, beautiful, easy to use.
The bot was successfully launched and performed the following tasks:
- Scraped job postings from LinkedIn.
- Stored the scraped data into a Microsoft SQL Server database.
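The scraping step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the CSS class name is a placeholder (LinkedIn's real markup changes frequently), and the Selenium import is deferred so the helper functions stay usable on their own.

```python
from urllib.parse import urlencode


def build_search_url(keywords, location):
    """Build a LinkedIn job-search URL for the given query."""
    params = urlencode({"keywords": keywords, "location": location})
    return f"https://www.linkedin.com/jobs/search/?{params}"


def scrape_job_cards(driver, keywords, location):
    """Collect job titles from the results page.

    `driver` is a logged-in selenium.webdriver instance; the class
    name below is a placeholder, not LinkedIn's real markup.
    """
    # Imported here so the module loads even without Selenium installed
    from selenium.webdriver.common.by import By

    driver.get(build_search_url(keywords, location))
    cards = driver.find_elements(By.CLASS_NAME, "job-card-list__title")
    return [card.text for card in cards]
```

With a logged-in driver in hand, `scrape_job_cards(driver, "Data Scientist", "Montreal")` would return the job titles visible on the results page.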
TODO
- [x] Build a scraper bot
- [x] Scrape jobs
- [ ] Create a word cloud of skills
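Counting how often each skill appears in the scraped descriptions is the first half of the word-cloud TODO. Here is a standard-library sketch; the skill vocabulary is an assumption, and the rendering step (commented out) would use the third-party `wordcloud` package.

```python
from collections import Counter

# Example skill vocabulary -- an assumption, extend as needed
SKILLS = ["python", "sql", "machine learning", "pandas", "tableau"]


def count_skills(descriptions):
    """Count how many job descriptions mention each skill."""
    counts = Counter()
    for text in descriptions:
        lowered = text.lower()
        for skill in SKILLS:
            if skill in lowered:
                counts[skill] += 1
    return counts

# The frequencies could then be rendered with the `wordcloud` package:
# WordCloud().generate_from_frequencies(count_skills(descriptions))
```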
The project is organized into the following files:
- Scraper: the bot that interacts with LinkedIn to scrape jobs.
- Database: an object that interacts with the database.
- CSV.py: utility functions to save the data into a dataframe.
- Clone the repo
git clone https://github.com/BecayeSoft/Jobs-Mystery.git
cd Jobs-Mystery
- Install the requirements
pip install -r requirements.txt
- Set your credentials
The commands below write your LinkedIn credentials to a file named "credentials.env". Make sure to replace "youremail" and "yourpassword" with your actual LinkedIn credentials.
echo "EMAIL=youremail" > credentials.env
echo "PASSWORD=yourpassword" >> credentials.env
- Run the main file
Edit the main file to set the job title and location you want to search for. If you don't have a Microsoft SQL Server database, remove the database-related code.
python main.py