# Simple resume analyzer

This notebook helps recruiters and hiring managers quickly rank resumes by how often they mention a custom list of keywords, which can speed up shortlisting matching candidates.

This notebook was developed by Melanee and is hosted in this repository: https://github.com/Melanee-Melanee/Resume_Analyzer

# Usage
1- Install PyPDF2: ``` !pip install PyPDF2 ```

2- Move all the resumes into the same directory and set its path in: ``` directory_path ```

3- Put your custom keywords in: ``` keywords ```

4- Run the cell
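
Before running the cell, it can help to confirm which files in the chosen directory will be treated as resumes. The following is a minimal sketch, not part of the original notebook: `list_pdf_files` is a hypothetical helper name, and it matches the `.pdf` extension case-insensitively, which is an assumption of this sketch. The analyzer cell checks `filename.endswith('.pdf')`, so a file named `CV.PDF` would be listed here but skipped there.

```python
from pathlib import Path


def list_pdf_files(directory):
    """Return the sorted names of the PDF files found directly inside `directory`."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        # Keep regular files only; sub-directories are ignored
        if entry.is_file() and entry.suffix.lower() == ".pdf"
    )


# Example (uses the same variable as step 2):
# print(list_pdf_files(directory_path))
```

An empty list means the path is wrong or the resumes are not PDF files.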

In [1]:
!pip install PyPDF2 

Defaulting to user installation because normal site-packages is not writeable

In [5]:
# Count keyword occurrences in every PDF in a directory and rank the PDFs by total count

import os
import PyPDF2

def count_keywords_in_pdf(pdf_path, keywords):
    """Return a dict mapping each keyword to its number of occurrences in one PDF."""
    keyword_counts = {keyword: 0 for keyword in keywords}

    with open(pdf_path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        for page in reader.pages:
            text = page.extract_text()
            if text:  # Skip pages where no text could be extracted
                text = text.lower()  # Lower-case for case-insensitive matching
                for keyword in keywords:
                    # Substring match: 'ai' is also counted inside words such as 'email'
                    keyword_counts[keyword] += text.count(keyword.lower())

    return keyword_counts

def count_keywords_in_directory(directory, keywords):
    """Count keywords in every .pdf file in `directory` and print the files ranked by total count."""
    pdf_keyword_counts = []

    for filename in os.listdir(directory):
        if filename.endswith('.pdf'):
            pdf_path = os.path.join(directory, filename)
            counts = count_keywords_in_pdf(pdf_path, keywords)
            total_count = sum(counts.values())
            pdf_keyword_counts.append((pdf_path, counts, total_count))

    # Rank PDFs from most to fewest keyword occurrences
    pdf_keyword_counts.sort(key=lambda x: x[2], reverse=True)

    # Print sorted results
    for pdf_path, counts, total_count in pdf_keyword_counts:
        print(f" PDF: {pdf_path} (Total Keywords: {total_count})")
        for keyword, count in counts.items():
            print(keyword, ":", count)


# Example usage
directory_path = '/home/melanee/' # Change this to your PDF directory

keywords = ['python', 'data', 'science', 'AI', 'machine learning'] # Replace with your custom keywords

count_keywords_in_directory(directory_path, keywords)

 PDF: /home/melanee/Resume_2.pdf (Total Keywords: 19)
python : 7
data : 2
science : 3
AI : 6
machine learning : 1
 PDF: /home/melanee/Resume_1.pdf (Total Keywords: 14)
python : 4
data : 1
science : 1
AI : 6
machine learning : 2
 PDF: /home/melanee/Resume_3.pdf (Total Keywords: 10)
python : 0
data : 2
science : 3
AI : 2
machine learning : 3
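
The counts above come from lower-casing the page text and calling `str.count`, so a short keyword such as `AI` is also counted inside longer words. The sketch below contrasts that behaviour with a whole-word alternative based on a regular expression. It is an illustration under stated assumptions, not the notebook's method: `count_substring` and `count_whole_word` are hypothetical helper names, and the sample sentence is made up.

```python
import re


def count_substring(text, keyword):
    """Case-insensitive substring count, mirroring the notebook's approach."""
    return text.lower().count(keyword.lower())


def count_whole_word(text, keyword):
    """Case-insensitive count of `keyword` where it is not embedded in a longer word."""
    # Lookarounds require that no word character touches the keyword on either side
    pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


sample = "AI engineer with training in AI; email on request."
print(count_substring(sample, "AI"))   # 4: also matches 'training' and 'email'
print(count_whole_word(sample, "AI"))  # 2: only the standalone 'AI'
```

Multi-word keywords such as `machine learning` work with both helpers. Text extracted from PDFs can contain broken words or odd spacing, so neither count should be read as exact.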


# Warning
This notebook is licensed, and publishing it without referencing my repository is prohibited!

Developer: [Melanee](https://github.com/Melanee-Melanee/Resume_Analyzer)