Vacancies parser from HH

This parser can:

Run a vacancy search on HeadHunter
Walk through all the result pages and find every vacancy
Collect data from each vacancy
Export the data to Excel
Incredible!
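
The "walk through all the result pages" step can be pictured with a minimal sketch. This is not the repository's actual code: iter_vacancies, fake_fetch and FAKE_PAGES are hypothetical names, and the stop condition (an empty page means the results are over) is an assumption.

```python
from typing import Callable, Iterator, List


def iter_vacancies(fetch_page: Callable[[int], List[str]],
                   max_pages: int = 100) -> Iterator[str]:
    """Yield items page by page until a page comes back empty."""
    for page_number in range(max_pages):
        items = fetch_page(page_number)
        if not items:  # assumption: an empty page marks the end of the results
            break
        yield from items


# Stand-in for a real HTTP request: three fake pages of vacancy links.
FAKE_PAGES = [["vacancy/1", "vacancy/2"], ["vacancy/3"], []]


def fake_fetch(page_number: int) -> List[str]:
    """Return a fake page, or an empty list past the last page."""
    return FAKE_PAGES[page_number] if page_number < len(FAKE_PAGES) else []


print(list(iter_vacancies(fake_fetch)))
```

Passing the fetch function in as a parameter keeps the loop testable without touching the real website.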

All of this comes with a web interface for your users to work with!

And there is an additional instruction on how to convert this into an exe file!!

Development launch

Tested on Windows!

Clone this shiny repository to wherever you like using git:

Press Win + R, type cmd or powershell, and press Enter
Use the cd command to change the current directory to the location where you want to clone the repository

cd path\to\desired\directory

Use the git clone command to clone the repository:

git clone https://github.com/aaskorohodov/hh_parser.git

Now it's time to install Python!

Visit the official Python website: https://www.python.org/downloads/.
Click on the "Downloads" tab.
Choose the latest version for your system (tested with Python 3.10.xx).
Scroll down to the Files section and download the installer for Windows (usually a .exe file)
Double-click the downloaded installer.
Check the box that says "Add Python to PATH" during installation.
Click "Install Now" to start the installation.
Open Command Prompt or PowerShell.
Type the following command to check the installed Python version:

python --version

You should see the Python version number.
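
If you prefer to check the version from Python itself, here is a small sketch. The minimum of 3.10 is an assumption based on the "tested with Python 3.10" note above, and is_supported is a hypothetical helper name.

```python
import sys

MIN_VERSION = (3, 10)  # assumption: the version this project was tested with


def is_supported(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Compare only the major and minor parts of the version."""
    return tuple(version_info[:2]) >= minimum


status = "OK" if is_supported() else "older than the tested version"
print(sys.version.split()[0], status)
```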

Now you will need a venv!

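Open Command Prompt or PowerShell. If you run into problems with one, try the other!
To open CMD:
  1. Press WIN
  2. Type CMD
  3. Open this thing!
Type the following command to install virtualenv using pip (this step is optional, because the 'python -m venv' command used later relies on the venv module that ships with Python):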

pip install virtualenv

Open Command Prompt or PowerShell and navigate to the downloaded repository (most likely a folder named hh_parser).

cd path\to\desired\directory\hh_parser

Create a virtual environment by running:

python -m venv venv

In the same Command Prompt or PowerShell window, activate the virtual environment:

.\venv\Scripts\activate

You should see the virtual environment's name in your command prompt.
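
If the prompt does not make it obvious, a short sketch like this can confirm whether Python is running inside a virtual environment. The helper name in_virtualenv is hypothetical; the comparison of sys.prefix with sys.base_prefix is the standard way to detect an environment created by venv.

```python
import sys


def in_virtualenv() -> bool:
    """Inside a venv, sys.prefix points at the environment folder,
    while sys.base_prefix still points at the base Python install."""
    return sys.prefix != sys.base_prefix


print("venv active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```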

Install all the required libraries! In the same Command Prompt or PowerShell window:

pip install -r requirements.txt

Make sure that your terminal is in the correct directory! You need the root folder of the repository, which should contain a file named 'requirements.txt'!

Now, you can launch this in development-mode! In the same Command Prompt or PowerShell window:

python main.py

Make sure that your terminal is in the correct directory! You need the root folder of the repository, which should contain a file named 'main.py'. This is the one you are launching!
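
A quick way to verify the directory is a sketch like the one below, which only checks for the two files mentioned in this section. missing_files and REQUIRED_FILES are hypothetical names used for illustration.

```python
from pathlib import Path

# The two files this section says must be in the repository root.
REQUIRED_FILES = ("main.py", "requirements.txt")


def missing_files(folder: Path, required=REQUIRED_FILES) -> list:
    """Return the required file names that are not present in `folder`."""
    return [name for name in required if not (folder / name).is_file()]


missing = missing_files(Path.cwd())
if missing:
    print("Wrong directory? Missing:", ", ".join(missing))
else:
    print("This looks like the repository root")
```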

Now you should see an address printed in the terminal.

Copy this address and open it in your browser! Now you should see the web interface.

Now use it!

Usage

Type the vacancy you need.

Wait a bit while the parser collects the results.

Great! Now you can export the results to Excel.

The Excel file will appear in the 'results' folder, which will be created in the downloaded repository.

Troubleshooting

Most likely you will run into outdated parser code, because websites change from time to time. To troubleshoot this, you will need some skills in Python and HTML!

Make sure the URL is correct! You can find it in Parser._base_url.
1. Open the actual page in your browser and check whether this URL still works
Check whether some elements on the page have changed, for example:
1. company = vacancy.find('div', {'class': 'vacancy-serp-item__meta-info-company'}).text
2. Is it still a div?
3. Is the class still 'vacancy-serp-item__meta-info-company'?

That should be it!
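
The element checks above can also be done on a saved copy of the page. The sketch below uses only the standard library rather than the parser's own code, so ClassCounter and count_matches are hypothetical helpers, and the sample HTML is made up for the demonstration.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count elements that have a given tag name and CSS class."""

    def __init__(self, tag: str, css_class: str):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.count += 1


def count_matches(html: str, tag: str, css_class: str) -> int:
    """Return how many <tag class="... css_class ..."> elements the HTML has."""
    counter = ClassCounter(tag, css_class)
    counter.feed(html)
    return counter.count


# Made-up sample; in practice, paste the HTML saved from your browser.
sample = ('<div class="vacancy-serp-item__meta-info-company">ACME</div>'
          '<span class="other">x</span>')
print(count_matches(sample, "div", "vacancy-serp-item__meta-info-company"))
print(count_matches(sample, "span", "vacancy-serp-item__meta-info-company"))
```

A count of zero for the tag and class the parser expects suggests that the page markup has changed and the selector needs updating.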

Convert into exe

To simplify usage, you may want to pack this script into an exe. This way you will be able to send it to a user, with no need for that user to install any additional software like Python.

Install this:

pip install auto-py-to-exe

Launch a beautiful GUI:

auto-py-to-exe

Select main.py (or another entry point, if you changed the name):
Select 2 additional folders (static and templates):
Push the big button to create your exe!
If all goes well, you will find a folder named 'output'. Find main.exe there and test it!
Zip the whole folder and send it to whoever you want!
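
Zipping can also be scripted. This is a minimal sketch using shutil.make_archive from the standard library; zip_folder is a hypothetical helper, the archive name hh_parser_exe is an arbitrary choice, and the 'output' folder name is taken from the step above.

```python
import shutil
from pathlib import Path


def zip_folder(folder: Path, archive_stem: str) -> Path:
    """Create <archive_stem>.zip with the contents of `folder`."""
    archive = shutil.make_archive(archive_stem, "zip", root_dir=str(folder))
    return Path(archive)


output_dir = Path("output")  # folder name taken from the step above
if output_dir.is_dir():
    print("Created", zip_folder(output_dir, "hh_parser_exe"))
else:
    print("No 'output' folder here yet")
```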
