The Resume Parser is a Python script that extracts relevant information such as educational background, work experience, and skills from a resume in PDF format.
The Resume Parser follows these steps to extract information from a resume:
- The PDF file is opened and processed using the `pdfminer` library, which extracts the text content from each page of the PDF.
- The extracted text is stored as a string.
- Regular expressions are used to locate and extract the educational background and work experience sections from the resume text. These regular expressions can be customized in the `extract_education` and `extract_experience` functions of the `newparser.py` file.
- If a `skills_list.csv` file is provided, the script reads it and builds a list of skills to search for in the resume. Each skill should be placed on a separate line in the CSV file.
- The script searches for each skill in the resume text using case-insensitive matching. If a skill is found, it is added to the list of extracted skills.
- The extracted educational background, work experience, and skills are displayed in the console output.
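
The text-extraction and section-matching steps above might look roughly like the sketch below. It assumes the `pdfminer.six` distribution, whose `pdfminer.high_level.extract_text` function returns a PDF's text as a single string. The heading keywords, the `extract_section` helper, and the regular expression are illustrative assumptions, not the patterns actually used in `newparser.py`; the bodies of `extract_education` and `extract_experience` are only a guess at how such functions could work.

```python
import re

# Hypothetical heading keywords; real resumes vary, so adjust these.
SECTION_HEADINGS = r"EDUCATION|(?:WORK[ \t]+)?EXPERIENCE|SKILLS|PROJECTS"


def extract_text_from_pdf(path):
    """Return the text of every page of the PDF as a single string."""
    # Imported here so the regex helpers below work without pdfminer installed.
    from pdfminer.high_level import extract_text
    return extract_text(path)


def extract_section(text, heading):
    """Return the text between a heading line and the next known heading."""
    pattern = re.compile(
        rf"^[ \t]*(?:{heading})[ \t]*:?[ \t]*$"  # the heading on its own line
        rf"(.*?)"                                # the section body (lazy)
        rf"(?=^[ \t]*(?:{SECTION_HEADINGS})[ \t]*:?[ \t]*$|\Z)",  # next heading or end
        re.IGNORECASE | re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(text)
    return match.group(1).strip() if match else ""


def extract_education(text):
    """Assumed shape of the education extractor: one heading pattern."""
    return extract_section(text, r"EDUCATION")


def extract_experience(text):
    """Assumed shape of the experience extractor: optional 'Work' prefix."""
    return extract_section(text, r"(?:WORK[ \t]+)?EXPERIENCE")
```

With a resume saved under a placeholder name such as `resume.pdf`, calling `extract_education(extract_text_from_pdf("resume.pdf"))` would return the block of text under the education heading, or an empty string if no such heading is found. Resumes that use other heading names need the patterns adjusted accordingly.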

To run the parser on your own resume:
- Clone the repository to your local machine, then change into its `Resume_parser` directory:
  `git clone https://github.com/mkswagger/Amazing-Python-Scripts.git`
- Place the resume PDF file you want to parse in the project directory.
- Modify the `file_name` variable in the `newparser.py` file to match the name of your resume file.
- Optionally, if you have a wide range of skills to extract, create a CSV file named `skills_list.csv` in the project directory. Each skill should be placed on a separate line.
- Run the script:
  `python newparser.py`
The script extracts the educational background, work experience, and skills from the resume and prints them to the console. You can customize the regular expressions and add further extraction logic to suit your requirements.
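
As one way to picture the optional skills step, the sketch below reads a one-skill-per-line `skills_list.csv` and checks each entry against the resume text without regard to case. The function names `load_skills` and `find_skills` are hypothetical, and the word-boundary matching is an assumption; the script itself may use a plain substring search instead.

```python
import csv
import re


def load_skills(csv_path):
    """Read one skill per line from the CSV, skipping blank lines."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [row[0].strip() for row in csv.reader(handle)
                if row and row[0].strip()]


def find_skills(text, skills):
    """Return the skills that occur in `text`, matched case-insensitively."""
    found = []
    for skill in skills:
        # re.escape keeps names such as "C++" from being read as regex syntax;
        # the lookarounds stop "Java" from matching inside "JavaScript".
        pattern = rf"(?<!\w){re.escape(skill)}(?!\w)"
        if re.search(pattern, text, re.IGNORECASE):
            found.append(skill)
    return found
```

The returned list keeps the spelling from the CSV file rather than the spelling found in the resume, so the console output stays consistent across resumes.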