Problem Statement: Find suitable candidates for a project based on the skills the project requires and the skills each candidate has.
Approach to solve the problem:
My approach to the problem consists of the following steps:
- Get data
- Data extraction
- Data cleansing
- Data manipulation
- Calculation of matching factor
- Get suitable candidates on the basis of matching factor
Algorithm:
- Start
- Get data: Create two dataframes, one for candidate details and one for project details.
- Data extraction: Extract the required columns (project id, location, required skills) from the main project dataframe.
- Data extraction: Extract the required columns (candidate id, location, skills) from the main candidate dataframe.
- Data manipulation: Convert the data in both dataframes to lower case so that comparisons are case-insensitive.
- Data cleansing: In both dataframes, replace variants of the same skill with a single canonical term. Example: java programming, java language, core java, programming in java and java se can all be replaced by the single word java.
- Calculation of matching factor: Calculate the matching factor of every candidate for every project. Matching factor = 90% of (matching skills) + 10% of (location match).
- Get the top 5 candidates with the highest matching factor for each project.
- Build the final dataframe (project id, location, required skills, top 5 suitable candidates, matching factors of the top 5 candidates).
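The steps above can be sketched in pandas as follows. This is a minimal illustration under stated assumptions, not the original notebook: the column names, the SKILL_ALIASES map, the helper function names and the sample rows are all hypothetical. It also assumes that "matching skills" means the share of a project's required skills that the candidate covers, and that "location match" is an exact, case-insensitive comparison.

```python
import pandas as pd

# Hypothetical synonym map: each variant collapses to one canonical skill name.
SKILL_ALIASES = {
    "java programming": "java",
    "java language": "java",
    "core java": "java",
    "programming in java": "java",
    "java se": "java",
}

def normalise_skills(raw):
    """Lower-case a comma-separated skill string and return a set of canonical skills."""
    skills = (s.strip().lower() for s in raw.split(","))
    return {SKILL_ALIASES.get(s, s) for s in skills if s}

def matching_factor(required, offered, same_location):
    """Return 90 * (share of required skills covered) plus 10 if the locations match."""
    # Assumption: "matching skills" is the fraction of required skills the candidate has.
    skill_share = len(required & offered) / len(required) if required else 0.0
    return 90.0 * skill_share + (10.0 if same_location else 0.0)

def top_candidates(projects, candidates, n=5):
    """Score every candidate against every project and keep the n best per project."""
    rows = []
    for p in projects.itertuples(index=False):
        required = normalise_skills(p.required_skills)
        scored = [
            (
                c.candidate_id,
                matching_factor(
                    required,
                    normalise_skills(c.skills),
                    p.location.strip().lower() == c.location.strip().lower(),
                ),
            )
            for c in candidates.itertuples(index=False)
        ]
        # Highest matching factor first; ties keep their original order.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        best = scored[:n]
        rows.append({
            "project_id": p.project_id,
            "location": p.location,
            "required_skills": p.required_skills,
            "top_candidates": [cid for cid, _ in best],
            "matching_factors": [round(score, 1) for _, score in best],
        })
    return pd.DataFrame(rows)

# Made-up sample data, for illustration only.
projects = pd.DataFrame({
    "project_id": ["P1"],
    "location": ["Pune"],
    "required_skills": ["Core Java, SQL"],
})
candidates = pd.DataFrame({
    "candidate_id": ["C1", "C2", "C3"],
    "location": ["Pune", "Mumbai", "Pune"],
    "skills": ["java programming, sql", "Java SE, Python", "python"],
})
result = top_candidates(projects, candidates)
```

With this sample data, candidate C1 covers both required skills and is in the same location, so its matching factor is 90 + 10 = 100. C2 covers one of two skills in a different location (45), and C3 matches only on location (10).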
Data visualization:
- Matching factor vs projects: This graph plots the matching factor for each project.
- Matching factor ranges: This graph shows the number of projects in each matching-factor range, one colour per range (for example, one colour for projects with a matching factor between 41% and 60%, and so on).
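The counts behind a ranges graph like this one could be computed roughly as shown here. It is only a sketch: the best_factor values, the project ids and the 20-point band edges are assumptions chosen for illustration, not figures from the project.

```python
import pandas as pd

# Hypothetical best matching factor per project, in percent.
best_factor = pd.Series(
    [35.0, 45.0, 58.0, 62.0, 90.0, 100.0],
    index=["P1", "P2", "P3", "P4", "P5", "P6"],
)

# Assumed 20-point bands. Each band includes its upper edge, e.g. "41-60%" is (40, 60];
# include_lowest=True also places a factor of exactly 0 in the first band.
edges = [0, 20, 40, 60, 80, 100]
labels = ["0-20%", "21-40%", "41-60%", "61-80%", "81-100%"]
bands = pd.cut(best_factor, bins=edges, labels=labels, include_lowest=True)

# Count projects per band and keep the bands in ascending order,
# filling bands that contain no project with 0.
range_counts = bands.astype(str).value_counts().reindex(labels, fill_value=0)
```

Calling range_counts.plot.bar() (which needs matplotlib installed) would then draw one bar per range.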
Requirements:
Programming language: Python 3
Python libraries: pandas, matplotlib, NumPy
Tool used: Google Colab
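Time complexity:
The approach above scores every candidate against every project. For P projects and C candidates this takes on the order of P × C comparisons, treating each skill comparison as constant time. Sorting the candidates of each project to pick the top 5 adds a logarithmic factor. With P and C both of order n:
T(n) = O(n² log n)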