# Resume JD Skill‑Match – Demo Notebook

This notebook walks through an end‑to‑end demo of the **Resume–Job‑Description Skill‑Matcher** project.

> **Repo:** <https://github.com/schrann/Resume-JD-SkillMatch.git>


## 0️⃣ Environment setup

Run the following once per new environment:

```bash
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```


In [6]:

import pathlib
import time
import pandas as pd
import spacy
from transformers import pipeline


In [11]:
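import sys
import pathlib

# Add the notebook's parent directory to sys.path so that the project's
# skill_matcher module can be imported from here.
sys.path.append(str(pathlib.Path().resolve().parent))

from skill_matcher import extract_skill_phrases, zero_shot_scores  # from project script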


In [14]:
RESUME_PATH = pathlib.Path('../sample/sample_resume.txt')
JD_PATH = pathlib.Path('../sample/sample_jd.txt')
resume_text = RESUME_PATH.read_text(encoding='utf-8')
jd_text = JD_PATH.read_text(encoding='utf-8')
print(f'Resume chars: {len(resume_text):,} | JD chars: {len(jd_text):,}')

Resume chars: 1,620 | JD chars: 1,998


In [17]:
# extract candidate skill phrases from the JD
nlp = spacy.load('en_core_web_sm', disable=['ner','lemmatizer'])
skills = extract_skill_phrases(jd_text, nlp)
print(f'Found {len(skills)} unique skill phrases (showing first 20):')
skills[:20]

Found 57 unique skill phrases (showing first 20):


['us',
 'we',
 'ml',
 'sql',
 'gpt',
 'xyz',
 'eda',
 'nlp',
 'aws',
 'dask',
 'face',
 'role',
 'date',
 'that',
 'bert',
 'what',
 'nltk',
 'train',
 'spacy',
 'track']
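
The list above mixes real skills (`sql`, `aws`, `bert`) with generic words (`us`, `we`, `that`, `what`). A minimal sketch of a post-filter is shown below. The helper name `drop_generic_phrases` and the contents of `GENERIC_WORDS` are illustrative assumptions, not part of the project's `skill_matcher` module.

```python
# Hypothetical post-filter; GENERIC_WORDS is a hand-picked, illustrative set.
GENERIC_WORDS = {"us", "we", "that", "what", "role", "date"}


def drop_generic_phrases(phrases, generic_words=GENERIC_WORDS):
    """Return phrases in their original order, minus generic words and duplicates."""
    seen = set()
    kept = []
    for phrase in phrases:
        key = phrase.strip().lower()
        # Skip empty strings, generic words, and anything already kept.
        if not key or key in generic_words or key in seen:
            continue
        seen.add(key)
        kept.append(key)
    return kept


sample = ["us", "we", "ml", "sql", "that", "bert", "what", "SQL"]
filtered = drop_generic_phrases(sample)
print(filtered)
```

The filter deliberately does not drop phrases by length, since short tokens such as `ml`, `sql`, and `aws` are legitimate skills.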

In [22]:
# run zero‑shot entailment scoring
zshot = pipeline('zero-shot-classification',
                 model='facebook/bart-large-mnli')

scored = zero_shot_scores(resume_text, skills, zshot)

df_scores = pd.DataFrame(scored, columns=['skill','score'])
df_scores.head()

Device set to use cpu
Scoring: 100%|██████████| 57/57 [00:53<00:00,  1.07it/s]


|   | skill | score |
|---|---|---|
| 0 | data manipulation | 0.994 |
| 1 | big data processing | 0.993 |
| 2 | data pipelines | 0.991 |
| 3 | cloud platforms | 0.991 |
| 4 | python programming | 0.991 |
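
The body of the project's `zero_shot_scores` is not shown in this notebook. As a rough sketch of how per-skill scoring of this kind might work, the hypothetical function below calls a classifier once per skill with `multi_label=True`, so each skill is scored independently, and returns the pairs sorted from highest to lowest score. The name `score_skills`, the rounding to three decimals, and the stand-in `fake_classifier` are assumptions made for illustration; the only thing taken from Hugging Face is the call shape of a zero-shot-classification pipeline.

```python
def score_skills(text, skills, classifier):
    """Score each skill against text and return (skill, score) pairs, best first.

    `classifier` is any callable with the call shape of a Hugging Face
    zero-shot-classification pipeline:
        classifier(text, candidate_labels=[...], multi_label=True)
        -> {"labels": [...], "scores": [...]}
    """
    results = []
    for skill in skills:
        out = classifier(text, candidate_labels=[skill], multi_label=True)
        # One label was passed in, so its score is the first (and only) entry.
        results.append((skill, round(float(out["scores"][0]), 3)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Stand-in classifier so the sketch runs without downloading a model.
def fake_classifier(text, candidate_labels, multi_label=True):
    scores = [1.0 if label in text.lower() else 0.1 for label in candidate_labels]
    return {"sequence": text, "labels": list(candidate_labels), "scores": scores}


pairs = score_skills(
    "Built SQL data pipelines on AWS.", ["sql", "docker", "aws"], fake_classifier
)
print(pairs)
```

In the notebook, the `zshot` pipeline object could be passed in place of `fake_classifier`.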


In [23]:
# Save to CSV and display top‑N
ts = time.strftime('%Y%m%d_%H%M%S')
OUT_CSV = f'results_{ts}.csv'
df_scores.to_csv(OUT_CSV, index=False)
print(f'Saved → {OUT_CSV}')

TOP_N = 20
display(df_scores.head(TOP_N))

# Plot the top-N scores as a horizontal bar chart (requires matplotlib)
df_scores.head(TOP_N).plot.barh(x='skill', y='score', figsize=(8,6), legend=False)


Saved → results_20250618_191445.csv


|   | skill | score |
|---|---|---|
| 0 | data manipulation | 0.994 |
| 1 | big data processing | 0.993 |
| 2 | data pipelines | 0.991 |
| 3 | cloud platforms | 0.991 |
| 4 | python programming | 0.991 |
| 5 | meaningful insights | 0.991 |
| 6 | workflows | 0.99 |
| 7 | strong nlp projects | 0.989 |
| 8 | models | 0.987 |
| 9 | analysis | 0.986 |


ImportError: matplotlib is required for plotting when the default backend "matplotlib" is selected.
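
The `ImportError` above means matplotlib was not installed in the environment; installing it with `pip install matplotlib` should let the plot render. A small guard like the sketch below would let the cell finish cleanly either way. The helper names `matplotlib_available` and `plot_top_scores` are hypothetical, and the plotting call simply mirrors the one in the cell above.

```python
import importlib.util


def matplotlib_available():
    """Return True if matplotlib can be imported in the current environment."""
    return importlib.util.find_spec("matplotlib") is not None


def plot_top_scores(df, top_n=20):
    """Draw a horizontal bar chart of the top rows, or return None if matplotlib is missing."""
    if not matplotlib_available():
        print("matplotlib is not installed; run `pip install matplotlib` to enable plotting.")
        return None
    # Same call as in the notebook cell: one bar per skill, no legend.
    return df.head(top_n).plot.barh(x="skill", y="score", figsize=(8, 6), legend=False)


print("matplotlib available:", matplotlib_available())
```

In the notebook, `plot_top_scores(df_scores, TOP_N)` would replace the direct `plot.barh` call.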

## Note
* If you hit an error when using `facebook/bart-large-mnli`, swap it for a lighter model such as one of the `valhalla/distilbart-mnli-*` checkpoints (for example, `valhalla/distilbart-mnli-12-1`).
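
One way to automate the model swap is to try a list of model IDs in order and keep the first one that loads. The sketch below is a generic helper under that assumption. `load_first_available` is a hypothetical name, and the loader is passed in as a callable so the helper can be tried without downloading anything.

```python
def load_first_available(model_ids, loader):
    """Try each model id in order; return (model_id, loaded_object) for the first that loads."""
    errors = {}
    for model_id in model_ids:
        try:
            return model_id, loader(model_id)
        except Exception as exc:  # e.g. a memory or download failure
            errors[model_id] = exc
    raise RuntimeError(f"No model could be loaded: {errors}")


# Stand-in loader that fails for the first id, to show the fallback path.
def fake_loader(model_id):
    if model_id == "big-model":
        raise MemoryError("not enough memory")
    return f"loaded:{model_id}"


chosen = load_first_available(["big-model", "small-model"], fake_loader)
print(chosen)

# In the notebook, the loader could wrap the transformers pipeline call
# (the second model id is an assumed example of a lighter checkpoint):
# from transformers import pipeline
# model_id, zshot = load_first_available(
#     ["facebook/bart-large-mnli", "valhalla/distilbart-mnli-12-1"],
#     lambda mid: pipeline("zero-shot-classification", model=mid),
# )
```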
