A graphical representation of relations between programming languages, technologies and skills in demand, based on thousands of job postings.
Dynamic visualization: https://avk0.github.io/skills_graph/
Article on Habr.com (in Russian): https://habr.com/ru/post/500952/
~500 vacancies parsed by the keyword "Machine Learning" (headhunter.ru)
We can see that the essential skills for machine learning jobs include Python, SQL, Linux and others.
Backend: tags scraper and parser
Frontend: dynamic graph visualization
Simple way:
- run /ipython notebook/hh-ru_scraper.ipynb,
- set SEARCH_WORD to the desired keyword,
- run all cells.
For full dynamic visualization:
- run /scraper/hh-ru_scraper.py to scrape more data,
- run /scraper/preprocess.py to format the data,
- add the resulting file to the /data/for_visualization/ folder,
- add the new file name to the /data/for_visualization/index.json file,
- load index.html; a new button with the dynamic graph should appear.
Visualization is based on JavaScript and a few Observable notebooks.
Some additional Python visualization can be done using the IPython notebook: ./scrapers/hh-ru_scraper.ipynb
Any ideas are highly appreciated!
You can add data in JSON for visualization to the ./data/for_visualization folder and also insert the title and the name of the file in ./data/for_visualization/index.json. The data should have the following structure (example):
{
"items": {
"nodes": [{"id": "data science", "popularity": 28}, ...],
"links": [{"source": "data science", "target": "spark", "value": 8}, ...]
}
}
Popularity denotes the total number of occurrences of a term (for nodes). Value denotes the number of co-occurrences of source and target (for links).
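To make the structure concrete, here is a minimal Python sketch of how such a file could be produced from scraped tags. It is not the code in preprocess.py; the function name build_graph, the sample input, and the choices to lowercase tags and count each tag once per vacancy are assumptions made for illustration.

```python
import json
from collections import Counter
from itertools import combinations


def build_graph(vacancy_tags):
    """Build the nodes/links structure from a list of per-vacancy tag lists.

    Hypothetical helper: assumes each tag counts once per vacancy.
    """
    popularity = Counter()
    cooccurrence = Counter()
    for tags in vacancy_tags:
        # Normalize and deduplicate so a tag counts once per vacancy.
        unique = sorted({t.strip().lower() for t in tags if t.strip()})
        popularity.update(unique)
        # Each unordered pair of tags in the same vacancy is one co-occurrence.
        # Sorting above keeps the pair order stable across vacancies.
        cooccurrence.update(combinations(unique, 2))
    nodes = [{"id": tag, "popularity": n} for tag, n in popularity.most_common()]
    links = [
        {"source": a, "target": b, "value": n}
        for (a, b), n in cooccurrence.most_common()
    ]
    return {"items": {"nodes": nodes, "links": links}}


# Hypothetical sample input: three vacancies with their skill tags.
sample = [
    ["Python", "SQL", "Linux"],
    ["Python", "SQL"],
    ["Python", "Spark"],
]
graph = build_graph(sample)
print(json.dumps(graph, indent=2))
```

The printed JSON can be saved to a file in ./data/for_visualization once it has been checked against the example above.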
- Used:
  - hh.ru
- To try:
  - Stack Overflow Jobs
  - Hacker News "Who is hiring?" threads
  - Indeed
  - Glassdoor