Resume Processing

Description

A simple NLP project for resume processing, written in Python 3.0.

Python libraries and the dataset

Show categories of resumes present in the dataset
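As a rough illustration of this step, the sketch below lists the distinct categories using only the standard library. The inline CSV text and the column names `Category` and `Resume` are assumptions made for the example, not taken from the notebook.

```python
import csv
import io

# Hypothetical stand-in for the dataset file; the column names are assumed.
sample_csv = io.StringIO(
    "Category,Resume\n"
    "Data Science,Skills Python pandas\n"
    "Java Developer,Skills Java Spring\n"
    "Data Science,Machine learning projects\n"
)

rows = list(csv.DictReader(sample_csv))

# Collect the distinct category labels and sort them for stable display.
categories = sorted({row["Category"] for row in rows})
print(categories)
```

To run it against a real file, `io.StringIO(...)` could be swapped for `open(path, newline="", encoding="utf-8")`.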

Visualize the number of categories in the dataset

Visualize the distribution of categories in the dataset
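A minimal sketch of the numbers behind both charts: the count per category and each category's share of the dataset. The label list is made-up sample data, and the text bars only stand in for a real plot.

```python
from collections import Counter

# Hypothetical category labels, one per resume.
labels = [
    "Data Science", "Java Developer", "Data Science",
    "Testing", "Java Developer", "Data Science",
]

counts = Counter(labels)           # number of resumes per category
total = sum(counts.values())
shares = {category: n / total for category, n in counts.items()}

# Print a simple text bar chart, most common category first.
for category, n in counts.most_common():
    print(f"{category:<15} {n:>3} {100 * shares[category]:5.1f}% {'#' * n}")
```

The same `counts` and `shares` values could be passed to a plotting library to draw a bar chart and a pie chart.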

Remove badly formatted content from the resumes so the text is clean and readable

The badly formatted content includes:

URLs
hashtags
mentions
special characters
punctuation

The left column is the original dataset, which contains a lot of badly formatted content. The right column is the resume dataset after cleaning.
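The cleaning step could look roughly like the sketch below, which removes each kind of content listed above with a regular expression. The function name `clean_resume` and the exact patterns are assumptions for illustration and may differ from what the notebook does.

```python
import re
import string


def clean_resume(text: str) -> str:
    """Strip URLs, hashtags, mentions, special characters and punctuation."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # URLs
    text = re.sub(r"#\S+", " ", text)                    # hashtags
    text = re.sub(r"@\S+", " ", text)                    # mentions
    text = re.sub(r"[^\x00-\x7f]", " ", text)            # non-ASCII special characters
    text = re.sub(f"[{re.escape(string.punctuation)}]", " ", text)  # punctuation
    return re.sub(r"\s+", " ", text).strip()             # collapse extra whitespace


raw = "Skills: Python, SQL \u2022 see https://example.com/cv #hiring @recruiter"
cleaned = clean_resume(raw)
print(cleaned)
```

The order matters here: URLs, hashtags and mentions are removed before punctuation, because stripping `#`, `@` and `/` first would leave their text behind as ordinary words.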

Import NLTK and visualize word frequency in the resumes (the more often a word appears, the larger it is drawn, and vice versa)

It is easy to see that "Details" appears 484 times and "Experience" 446 times, followed by "company", "less", "year", "Machine Learning", and so on. These are the words that appear most often in the resume text.
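Word counts like these can be produced with a frequency table. The sketch below uses `collections.Counter` from the standard library on made-up sample text; NLTK's `FreqDist` is a `Counter` subclass and offers the same `most_common` method. The tiny stopword set is an assumption for the example, where a real run would more likely use a full stopword list.

```python
from collections import Counter

# Hypothetical cleaned resume texts.
resumes = [
    "Details of experience at company",
    "Experience with machine learning details",
]

# Tiny illustrative stopword set (assumed, not from the notebook).
stopwords = {"of", "at", "with"}

words = [
    word.lower()
    for text in resumes
    for word in text.split()
    if word.lower() not in stopwords
]

freq = Counter(words)
top_words = freq.most_common(2)   # the two most frequent words with their counts
print(top_words)
```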

Train a machine learning model for resume processing; here is the classification report for this dataset

Here I used the OneVsRest classifier and the KNN classifier. First, split the data into training and test sets.
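A minimal sketch of this step with scikit-learn, assuming TF-IDF features and a KNN classifier wrapped in a OneVsRest classifier. The toy texts, the split ratio, and the `n_neighbors` value are assumptions for the example, not the notebook's settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical cleaned resume texts and their categories.
resumes = [
    "python machine learning pandas model",
    "python statistics regression model",
    "python deep learning numpy",
    "python data analysis pandas",
    "java spring hibernate backend",
    "java microservices spring rest",
    "java maven backend services",
    "java spring boot services",
    "testing selenium automation scripts",
    "testing manual regressionsuite bugs",
    "testing automation selenium bugs",
    "testing scripts qa automation",
]
labels = ["Data Science"] * 4 + ["Java Developer"] * 4 + ["Testing"] * 4

# Split into training and test sets, keeping the class balance in both.
X_train, X_test, y_train, y_test = train_test_split(
    resumes, labels, test_size=0.25, random_state=0, stratify=labels
)

# Fit the vectorizer on the training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# One binary KNN classifier is trained per category.
model = OneVsRestClassifier(KNeighborsClassifier(n_neighbors=3))
model.fit(X_train_vec, y_train)

predicted = model.predict(X_test_vec)
report = classification_report(y_test, predicted, zero_division=0)
print(report)
```

Fitting the vectorizer on the training split alone keeps information from the test resumes out of the features, so the report reflects how the model handles unseen text.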

