Ranks candidate resumes against a job description using TF-IDF and BERT semantic embeddings. Outputs a ranked table with match scores.
Manually reading hundreds of resumes is slow and inconsistent. ResumeRank automates this by:
- Reading your job description from a text file
- Extracting text from all candidate resumes (PDF / DOCX / TXT)
- Computing a similarity score between the JD and each resume
- Printing a ranked table of best-matched candidates
- Exporting results to a CSV file
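The extraction step in the list above could be sketched roughly as follows. This is a minimal illustration, not the contents of `extract_text.py`: the function names `extract_text` and `load_resumes` are hypothetical, and the sketch assumes PyPDF2's `PdfReader` and python-docx's `Document` are installed for the PDF and DOCX branches.

```python
from pathlib import Path


def extract_text(path):
    """Return the plain text of a .txt, .pdf, or .docx file (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        return path.read_text(encoding="utf-8", errors="ignore")
    if suffix == ".pdf":
        from PyPDF2 import PdfReader  # imported lazily; only needed for PDFs
        reader = PdfReader(str(path))
        # extract_text() can return None for image-only pages
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix == ".docx":
        from docx import Document  # provided by the python-docx package
        return "\n".join(p.text for p in Document(str(path)).paragraphs)
    raise ValueError(f"Unsupported resume format: {path.name}")


def load_resumes(folder):
    """Map each supported file name in `folder` to its extracted text."""
    supported = {".txt", ".pdf", ".docx"}
    return {
        p.name: extract_text(p)
        for p in sorted(Path(folder).iterdir())
        if p.is_file() and p.suffix.lower() in supported
    }
```

Filtering by extension means other files in the folder, such as a `README.md`, are skipped rather than raising an error.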
Job Description → TF-IDF Vector (keyword frequencies)
Each Resume → TF-IDF Vector
Cosine Similarity = how "parallel" these vectors are
= 1.0 means perfect match, 0.0 means no overlap
Best for: pure keyword matching — when skills listed in the JD must appear in the resume.
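To make the idea concrete, here is a dependency-free sketch of TF-IDF weighting followed by cosine similarity. It is an illustration only: the project itself relies on Scikit-learn, whose tokenization, n-gram handling, and normalization differ in detail, so the scores below will not match the tool's output. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_tfidf`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters, digits, '+' and '#'."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF, so a term present in every document keeps a nonzero weight
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_tfidf(jd_text, resumes):
    """Rank {name: text} resumes against the job description, best first."""
    names = list(resumes)
    vectors = tfidf_vectors([jd_text] + [resumes[n] for n in names])
    jd_vec, resume_vecs = vectors[0], vectors[1:]
    scores = [(name, cosine(jd_vec, vec)) for name, vec in zip(names, resume_vecs)]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up example data
resumes = {
    "alice.txt": "Python developer with machine learning and SQL experience",
    "bob.txt": "Graphic designer skilled in illustration and branding",
}
for name, score in rank_tfidf("Looking for a Python developer with SQL skills", resumes):
    print(f"{name}: {score:.3f}")
```

A resume that shares no terms with the job description scores exactly 0.0 here, which is the "no overlap" end of the scale described above.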
Job Description → 384-dimensional meaning vector (via MiniLM-L6 model)
Each Resume → 384-dimensional meaning vector
Cosine Similarity between the two meaning vectors
Best for: semantic matching — understands that "Python Engineer" ≈ "Software Developer".
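A rough sketch of the embedding-based variant is shown below. It assumes the sentence-transformers package and its `SentenceTransformer(...).encode` call, with the model name taken from the configuration; the helper names (`default_encoder`, `rank_semantic`) and the injectable `encode` parameter are hypothetical and are not taken from `ranker.py`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def default_encoder(model_name="all-MiniLM-L6-v2"):
    """Build an encode(texts) function backed by sentence-transformers."""
    from sentence_transformers import SentenceTransformer  # heavy import, done lazily
    model = SentenceTransformer(model_name)  # downloads the weights on first use
    return model.encode  # maps a list of strings to one vector per string


def rank_semantic(jd_text, resumes, encode=None):
    """Rank {name: text} resumes by embedding similarity to the job description."""
    encode = encode or default_encoder()
    names = list(resumes)
    vectors = encode([jd_text] + [resumes[n] for n in names])
    jd_vec = vectors[0]
    scores = [(name, cosine(jd_vec, vec)) for name, vec in zip(names, vectors[1:])]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Passing the encoder in as a parameter keeps the ranking logic testable without loading the model. Note that transformer models accept a limited input length, so very long resumes may be truncated before they are embedded.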
ResumeRank/
├── config.py # All settings (paths, method, thresholds)
├── extract_text.py # Extract text from PDF / DOCX / TXT files
├── ranker.py # TF-IDF and BERT ranking logic
├── main.py # CLI entry point ← Run this
├── job_description.txt # Paste your job description here
├── requirements.txt
├── resumes/
│   ├── README.md # Drop candidate resumes here
│   └── (your .pdf/.docx files)
└── results/
    └── ranked_resumes.csv ← Auto-generated output
git clone https://github.com/ramlasyaa/ResumeRank.git
cd ResumeRank
python3 -m venv venv
source venv/bin/activate # macOS/Linux
pip install -r requirements.txt
Edit job_description.txt with the role you're hiring for.
Drop .pdf, .docx, or .txt resumes into the resumes/ folder.
# Default (TF-IDF, fast)
python main.py
# BERT-based semantic matching (slower, but matches on meaning rather than exact keywords)
python main.py --method bert
# Custom JD + custom folder
python main.py --jd path/to/jd.txt --resumes path/to/folder/
# Show top 5 only
python main.py --top 5
╭──────┬──────────────────────┬─────────────┬────────────────╮
│ Rank │ Resume File │ Match Score │ Match Level │
├──────┼──────────────────────┼─────────────┼────────────────┤
│ 1 │ alice_cv.pdf │ 72.4% │ 🟢 Strong Match │
│ 2 │ bob_resume.docx │ 61.3% │ 🟢 Strong Match │
│ 3 │ carol_profile.pdf │ 48.7% │ 🟡 Moderate │
│ 4 │ dave_resume.pdf │ 31.2% │ 🔴 Weak Match │
╰──────┴──────────────────────┴─────────────┴────────────────╯
💾 Results saved to: results/ranked_resumes.csv
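The sample output above can be reproduced in spirit with the sketch below, which converts similarity scores into ranked rows and writes them to CSV. Several things here are assumptions: the 60% and 40% cut-offs are chosen only because they are consistent with the sample table (the real thresholds live in `config.py` and may differ), the CSV column names simply mirror the table headers, and the standard-library `csv` module stands in for the Pandas export the project uses. All function names are hypothetical.

```python
import csv
from pathlib import Path

# Assumed cut-offs, chosen only to be consistent with the sample table;
# the real values are defined in config.py and may differ.
STRONG_THRESHOLD = 60.0
MODERATE_THRESHOLD = 40.0

COLUMNS = ["Rank", "Resume File", "Match Score", "Match Level"]


def match_level(score_pct):
    """Map a percentage score to a label using the assumed thresholds."""
    if score_pct >= STRONG_THRESHOLD:
        return "Strong Match"
    if score_pct >= MODERATE_THRESHOLD:
        return "Moderate"
    return "Weak Match"


def build_rows(scores, top_n=None):
    """Turn (file name, similarity in [0, 1]) pairs into ranked table rows."""
    ordered = sorted(scores, key=lambda item: item[1], reverse=True)[:top_n]
    return [
        {
            "Rank": rank,
            "Resume File": name,
            "Match Score": f"{sim * 100:.1f}%",
            "Match Level": match_level(sim * 100),
        }
        for rank, (name, sim) in enumerate(ordered, start=1)
    ]


def save_csv(rows, out_path):
    """Write the rows to out_path, creating the parent folder if needed."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Leaving `top_n` as `None` keeps every resume; passing a number truncates the list after sorting, which is how a `--top 5` style option could be applied.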
All settings live in config.py:
RANKING_METHOD = "tfidf" # "tfidf" or "bert"
TFIDF_MAX_FEATURES = 5000
TFIDF_NGRAM_RANGE = (1, 2) # unigrams + bigrams
BERT_MODEL = "all-MiniLM-L6-v2"
TOP_N = 10

| Component | Tool / Library |
|---|---|
| Language | Python 3.9+ |
| TF-IDF Ranking | Scikit-learn |
| BERT Ranking | sentence-transformers (SBERT) |
| PDF Extraction | PyPDF2 |
| DOCX Extraction | python-docx |
| Data Output | Pandas (CSV) |
| CLI Table | tabulate |
- HR / Recruiting teams — automate first-round resume filtering
- Job portals — rank applicants by JD fit automatically
- University placement cells — shortlist students for company drives
- Freelancers — filter projects that match your own skill set
Built by Ram Lasya · Amrita Vishwa Vidyapeetham