# AI Recruiting Agent v2 — Architecture & Demo

**Stack:** Elasticsearch · LangChain · OpenAI · sentence-transformers · FastAPI · Streamlit

## 1. New Data Model

Slimmer, focused schemas:

In [None]:
# Candidate
candidate_example = {
    'id': 'uuid',
    'name': 'Ivan Petrov',
    'email': 'ivan@example.com',
    'role': 'Senior Python Developer',        # job title
    'skills': ['python', 'fastapi', 'docker'],
    'education': 'Computer Science',          # specialty only
    'experience': '5 years backend at TechCorp, built FastAPI microservices...',
}

# Vacancy
vacancy_example = {
    'id': 'uuid',
    'title': 'Senior Python Developer',
    'role': 'Backend Developer',
    'required_skills': ['python', 'fastapi', 'docker'],
    'required_education': 'Computer Science',
    'description': 'We need a backend developer to build scalable REST APIs...',
}

print('Schemas defined')
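
The two dictionaries above can also be expressed as typed records. The following is a minimal sketch using standard-library dataclasses; the class names `Candidate` and `Vacancy` and the skill normalization (lowercase, trimmed, de-duplicated) are assumptions for illustration and are not taken from the project code.

```python
from dataclasses import dataclass, field, asdict
from typing import List


def _normalize_skills(skills: List[str]) -> List[str]:
    """Lowercase, trim, and de-duplicate skills while keeping first-seen order."""
    cleaned = (s.strip().lower() for s in skills)
    return list(dict.fromkeys(s for s in cleaned if s))


@dataclass
class Candidate:
    id: str
    name: str
    email: str
    role: str            # job title
    education: str       # specialty only
    experience: str      # free-text summary
    skills: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Assumption: skills are matched case-insensitively, so store them normalized.
        self.skills = _normalize_skills(self.skills)


@dataclass
class Vacancy:
    id: str
    title: str
    role: str
    required_education: str
    description: str
    required_skills: List[str] = field(default_factory=list)

    def __post_init__(self):
        self.required_skills = _normalize_skills(self.required_skills)


candidate = Candidate(
    id='uuid',
    name='Ivan Petrov',
    email='ivan@example.com',
    role='Senior Python Developer',
    education='Computer Science',
    experience='5 years backend at TechCorp',
    skills=['Python', ' FastAPI ', 'python', 'docker'],
)
print(asdict(candidate)['skills'])
```

`asdict` returns a plain dictionary, so a record like this can be passed wherever the example dictionaries are used.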

## 2. LangChain Resume Parser

Instead of regex, we use **LangChain + OpenAI** for structured JSON extraction.

In [None]:
PARSE_SYSTEM = '''You are an expert HR data extractor.
Extract fields from the resume and return ONLY valid JSON:
- name, email, role, skills (list), education (specialty), experience (summary)
Do NOT invent data not present in the resume.'''

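# Example LangChain chain (pseudocode, kept in a string and not executed):
from_chain = '''
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# ParsedCandidate: a Pydantic model with the fields listed in PARSE_SYSTEM
llm = ChatOpenAI(model='gpt-4o-mini', temperature=0)
parser = JsonOutputParser(pydantic_object=ParsedCandidate)
chain = ChatPromptTemplate.from_messages([...]) | llm | parser
result = chain.invoke({'resume': raw_text})
'''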

# Simulated result:
result = {
    'name': 'Ivan Petrov',
    'email': 'ivan@example.com',
    'role': 'Senior Backend Developer',
    'skills': ['python', 'fastapi', 'postgresql', 'docker', 'kubernetes', 'aws'],
    'education': 'Computer Science',
    'experience': '5 years backend. TechCorp 2020-2024: FastAPI microservices, PostgreSQL. StartupXYZ 2019-2020.'
}
import json
print(json.dumps(result, indent=2))
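
Even with a "return ONLY valid JSON" instruction, model output is worth checking before it is indexed. The sketch below shows one possible validation step using only the standard library; the function name `parse_candidate_json` and the specific checks are assumptions for illustration, not part of the parser described above.

```python
import json

REQUIRED_FIELDS = ('name', 'email', 'role', 'skills', 'education', 'experience')
FENCE = '`' * 3  # Markdown code fence that some models wrap around JSON


def parse_candidate_json(raw: str) -> dict:
    """Parse model output into a candidate dict, or raise ValueError."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the surrounding backticks and an optional "json" language tag.
        text = text.strip('`')
        if text.startswith('json'):
            text = text[len('json'):]
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f'model did not return valid JSON: {exc}') from exc
    if not isinstance(data, dict):
        raise ValueError('expected a JSON object')
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f'missing fields: {missing}')
    if not isinstance(data['skills'], list):
        raise ValueError('skills must be a list')
    return data


raw_output = json.dumps({
    'name': 'Ivan Petrov',
    'email': 'ivan@example.com',
    'role': 'Senior Backend Developer',
    'skills': ['python', 'fastapi'],
    'education': 'Computer Science',
    'experience': '5 years backend.',
})
print(parse_candidate_json(raw_output)['skills'])
```

Raising `ValueError` keeps the failure explicit, so the caller can decide whether to retry the LLM call or skip the resume.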

## 3. Elasticsearch Index Design

Candidates and vacancies are stored with a **dense_vector** field for semantic search.

In [None]:
CANDIDATE_MAPPING = {
    'mappings': {
        'properties': {
            'id':         {'type': 'keyword'},
            'name':       {'type': 'text'},
            'role':       {'type': 'text'},
            'skills':     {'type': 'keyword'},      # keyword: exact term match, not analyzed
            'education':  {'type': 'text'},
            'experience': {'type': 'text'},
            'embedding':  {
                'type': 'dense_vector',
                'dims': 384,                        # all-MiniLM-L6-v2
                'index': True,
                'similarity': 'cosine',
            },
        }
    }
}

LLM_CACHE_MAPPING = {
    'mappings': {
        'properties': {
            'vacancy_id':   {'type': 'keyword'},
            'candidate_id': {'type': 'keyword'},
            'score':        {'type': 'float'},
            'explanation':  {'type': 'text'},
            'updated_at':   {'type': 'date'},
        }
    }
}
print('Mappings defined')
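
The mappings only take effect once the indices are created. Below is a minimal sketch of an idempotent setup helper. It assumes an async Elasticsearch 8.x Python client (`AsyncElasticsearch`), where `indices.exists` and `indices.create` accept `index=` and `mappings=` keyword arguments. The helper name `ensure_index` and the index name `candidates` are hypothetical; `llm_cache` is the name used in the cache-lookup comments in this notebook.

```python
async def ensure_index(es, name: str, mapping: dict) -> bool:
    """Create index `name` with `mapping` if it is missing.

    `es` is assumed to be an AsyncElasticsearch-style client.
    Returns True if the index was created, False if it already existed.
    """
    if await es.indices.exists(index=name):
        return False
    await es.indices.create(index=name, mappings=mapping['mappings'])
    return True


# Usage sketch (requires a running cluster, so it is left commented out):
# from elasticsearch import AsyncElasticsearch
# es = AsyncElasticsearch('http://localhost:9200')
# await ensure_index(es, 'candidates', CANDIDATE_MAPPING)
# await ensure_index(es, 'llm_cache', LLM_CACHE_MAPPING)
```

Checking for existence first makes the helper safe to call on every application start.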

## 4. BM25 — Baseline

Elasticsearch `multi_match` with field boosting. This is the fastest method here and needs no ML model.

In [None]:
BM25_QUERY = {
    'query': {
        'bool': {
            'should': [
                {
                    'multi_match': {
                        'query': 'python fastapi docker postgresql microservices',
                        'fields': [
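                            'experience^2',     # free-text experience summary
                            'skills^3',         # highest boost; keyword field, so the query string is matched as a single term
                            'role^2',
                            'education',
                            'raw_text',         # not in CANDIDATE_MAPPING; matches only if the index has this field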
                        ],
                        'fuzziness': 'AUTO',
                    }
                },
                {
                    'terms': {
                        'skills': ['python', 'fastapi', 'docker'],
                        'boost': 4.0                               # exact skill match
                    }
                }
            ]
        }
    },
    'size': 20,
}
print('BM25 query built')
print('size=20: the top 20 hits become the pre-filter set for LLM scoring')
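
The query above is hard-coded. As a sketch of how it could be derived from a vacancy record, the helper below builds the same structure from the `description` and `required_skills` fields of the vacancy schema. The function name `build_bm25_query` and the choice to concatenate description and skills into the query text are assumptions; `raw_text` is left out here because it does not appear in the candidate mapping.

```python
def build_bm25_query(vacancy: dict, size: int = 20) -> dict:
    """Build a bool/should BM25 query from a vacancy dict."""
    skills = vacancy.get('required_skills', [])
    # Free-text part: description followed by the required skills.
    text = ' '.join([vacancy.get('description', ''), *skills]).strip()
    should = [{
        'multi_match': {
            'query': text,
            'fields': ['skills^3', 'experience^2', 'role^2', 'education'],
            'fuzziness': 'AUTO',
        }
    }]
    if skills:
        # Exact skill matches get an extra boost, as in the hand-written query.
        should.append({'terms': {'skills': skills, 'boost': 4.0}})
    return {'query': {'bool': {'should': should}}, 'size': size}


query = build_bm25_query({
    'description': 'We need a backend developer to build scalable REST APIs',
    'required_skills': ['python', 'fastapi', 'docker'],
})
print(query['query']['bool']['should'][1])
```

The `terms` clause is skipped when a vacancy lists no skills, which avoids sending an empty terms list.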

## 5. Semantic — Dense Vector KNN

Embedding model: `sentence-transformers/all-MiniLM-L6-v2` (384-dim)

In [None]:
import numpy as np

# Model encodes text to 384-dim normalized vector
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# ES KNN query
KNN_QUERY = {
    'knn': {
        'field': 'embedding',
        'query_vector': [0.12, -0.05, ...],  # vacancy embedding (truncated; 384 floats in practice)
        'k': 10,
        'num_candidates': 100,
    }
}

# Simulate cosine similarity ranking
candidates = ['Ivan Petrov', 'Maria Smirnova', 'Alex Kozlov']
# In reality: vacancy_emb @ candidate_emb (normalized = cosine sim)
fake_sims = [0.81, 0.42, 0.21]
for name, sim in sorted(zip(candidates, fake_sims), key=lambda x: -x[1]):
    print(f'{name:20s}: cosine_sim = {sim:.4f}')
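
The comment about normalized vectors can be checked directly: once two vectors are scaled to unit length, their dot product equals their cosine similarity. The sketch below uses pure Python and made-up 3-dimensional vectors in place of real 384-dimensional embeddings; the helper names are illustrative.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError('cannot normalize a zero vector')
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity computed from the definition."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return dot(a, b) / (norm_a * norm_b)


# Toy vectors (made up for illustration only).
vacancy_vec = [1.0, 2.0, 2.0]
candidate_vec = [2.0, 1.0, 2.0]

by_definition = cosine_similarity(vacancy_vec, candidate_vec)
by_dot_product = dot(l2_normalize(vacancy_vec), l2_normalize(candidate_vec))
print(f'{by_definition:.4f} {by_dot_product:.4f}')
```

This is why storing normalized embeddings lets a plain dot product stand in for cosine similarity.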

## 6. LLM Scoring — with Cache

BM25 top-20 → GPT-4o-mini → structured JSON with explanation. Results cached in ES.

In [None]:
import json

SYSTEM_PROMPT = '''You are a senior technical recruiter AI.
Score the candidate fit 0.0-1.0 and return JSON:
  {"score": float, "matched_skills": [], "missing_skills": [], "explanation": "..."}'''

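# Cache lookup before calling OpenAI (es.get raises NotFoundError on a miss):
# from elasticsearch import NotFoundError
# cache_id = f'{vacancy_id}_{candidate_id}'
# try:
#     return (await es.get(index='llm_cache', id=cache_id))['_source']
# except NotFoundError:
#     pass  # cache miss: fall through to the LLM call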

# Simulated LLM response
llm_response = {
    'score': 0.92,
    'matched_skills': ['python', 'fastapi', 'postgresql', 'docker'],
    'missing_skills': [],
    'explanation': 'The candidate has all required technical skills with 5 years of relevant experience. FastAPI and Docker expertise directly match the vacancy requirements.'
}
print(json.dumps(llm_response, indent=2))

# After scoring, write to cache:
# await es.index(index='llm_cache', id=cache_id, document={**llm_response, 'vacancy_id': ..., 'candidate_id': ..., 'updated_at': ...})
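
The lookup-then-write flow in the comments above is a cache-aside pattern. The sketch below shows the control flow with an in-memory dict standing in for the `llm_cache` index and a stub in place of the OpenAI call. `score_with_cache` and `fake_llm_score` are hypothetical names, and the real pipeline would use the async Elasticsearch client instead of a dict.

```python
from datetime import datetime, timezone


def score_with_cache(cache: dict, vacancy_id: str, candidate_id: str, score_fn):
    """Return (result, cache_hit). Calls score_fn only on a cache miss."""
    cache_id = f'{vacancy_id}_{candidate_id}'
    if cache_id in cache:
        return cache[cache_id], True
    result = score_fn(vacancy_id, candidate_id)
    # Store the same fields the llm_cache mapping defines, plus the LLM payload.
    cache[cache_id] = {
        **result,
        'vacancy_id': vacancy_id,
        'candidate_id': candidate_id,
        'updated_at': datetime.now(timezone.utc).isoformat(),
    }
    return cache[cache_id], False


def fake_llm_score(vacancy_id, candidate_id):
    # Stub standing in for the GPT-4o-mini call.
    return {'score': 0.92, 'explanation': 'stubbed response'}


cache = {}
_, first_hit = score_with_cache(cache, 'v1', 'c1', fake_llm_score)
_, second_hit = score_with_cache(cache, 'v1', 'c1', fake_llm_score)
print(first_hit, second_hit)
```

Because the key contains only the two ids, a cached score goes stale if the vacancy or resume text changes. The `updated_at` field is one place an expiry check could hook in.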

## 7. Hybrid — BM25 + Dense + RRF + Cosine Rerank + LLM

The full pipeline: fuse BM25 and dense KNN results with RRF, rerank by cosine similarity, then score the top candidates with the LLM.

In [None]:
# ── RRF (Reciprocal Rank Fusion) ──────────────────────────────────────────
def rrf_fusion(bm25_hits, dense_hits, k=60):
    """Fuse two ranked lists: each hit adds 1 / (k + rank) to its document's score."""
    scores = {}
    meta = {}
    for rank, hit in enumerate(bm25_hits, 1):
        scores[hit['id']] = scores.get(hit['id'], 0) + 1 / (k + rank)
        meta[hit['id']] = hit
    for rank, hit in enumerate(dense_hits, 1):
        scores[hit['id']] = scores.get(hit['id'], 0) + 1 / (k + rank)
        if hit['id'] not in meta:
            meta[hit['id']] = hit
    # Use doc_id rather than id to avoid shadowing the built-in; best score first.
    fused = [{**meta[doc_id], '_rrf': score} for doc_id, score in scores.items()]
    return sorted(fused, key=lambda x: -x['_rrf'])

# ── Simulate pipeline ─────────────────────────────────────────────────────
bm25 = [{'id':'c1','name':'Ivan','_bm25': 0.9}, {'id':'c2','name':'Maria','_bm25':0.4}, {'id':'c3','name':'Alex','_bm25':0.2}]
dense = [{'id':'c1','name':'Ivan','_sem': 0.81}, {'id':'c2','name':'Maria','_sem':0.45}, {'id':'c3','name':'Alex','_sem':0.18}]

fused = rrf_fusion(bm25, dense)
print('=== After BM25 + Dense + RRF ===')
for h in fused:
    print(f"{h['name']:20s}: RRF={h['_rrf']:.4f}")

print()
print('Next step: cosine rerank (re-score with stored embeddings)')
print('Then: LLM score top candidates (with cache)')
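
For reference, the score computed by `rrf_fusion` can be written out explicitly. Only ranks enter the formula; the raw `_bm25` and `_sem` scores are not used, and a document missing from one list simply receives no term from it.

$$
\mathrm{RRF}(d) = \sum_{r \in \{\mathrm{BM25},\ \mathrm{dense}\}} \frac{1}{k + \mathrm{rank}_r(d)}, \qquad k = 60
$$

In the simulated data each candidate holds the same rank in both lists, so the arithmetic is:

$$
\mathrm{RRF}(\mathrm{c1}) = \frac{1}{61} + \frac{1}{61} \approx 0.0328, \qquad
\mathrm{RRF}(\mathrm{c2}) = \frac{1}{62} + \frac{1}{62} \approx 0.0323, \qquad
\mathrm{RRF}(\mathrm{c3}) = \frac{1}{63} + \frac{1}{63} \approx 0.0317
$$

The gaps are small because k = 60 dominates the denominator at low ranks, which is the usual reason a separate rerank step follows the fusion.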

## 8. Pipeline Summary

```
Vacancy text
    │
    ├─► BM25 (ES multi_match) ──────────────────┐
    │                                             ├─► RRF Fusion
    ├─► Dense KNN (cosine sim, ES knn query) ────┘
    │                                             │
    │                              Cosine rerank (embedding dot product)
    │                                             │
    └─► LLM scoring (GPT-4o-mini) ◄──────────────┘
            │   ▲
            │   │ cache miss
            ▼   │
        ES llm_cache ──► cache hit (instant return)
```
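
The "Cosine rerank" box in the diagram can be sketched as a re-scoring pass over the fused hits. This is an illustration under stated assumptions, not the project's implementation: `cosine_rerank` is a hypothetical name, embeddings are assumed to be L2-normalized already (so the dot product equals cosine similarity), and hits without a stored embedding are dropped.

```python
def cosine_rerank(hits, vacancy_emb, candidate_embs, top_n=10):
    """Re-score hits by dot product with the vacancy embedding, best first.

    hits:           list of dicts with an 'id' key (e.g. RRF output)
    vacancy_emb:    normalized vacancy vector
    candidate_embs: dict mapping candidate id -> normalized vector
    """
    reranked = []
    for hit in hits:
        emb = candidate_embs.get(hit['id'])
        if emb is None:
            continue  # assumption: skip candidates with no stored embedding
        similarity = sum(v * c for v, c in zip(vacancy_emb, emb))
        reranked.append({**hit, '_cos': similarity})
    reranked.sort(key=lambda h: h['_cos'], reverse=True)
    return reranked[:top_n]


# Toy 2-dimensional unit vectors (made up for illustration).
fused_hits = [{'id': 'c1', 'name': 'Ivan'}, {'id': 'c2', 'name': 'Maria'}, {'id': 'c3', 'name': 'Alex'}]
embeddings = {'c1': [0.6, 0.8], 'c2': [1.0, 0.0]}
reranked = cosine_rerank(fused_hits, [1.0, 0.0], embeddings)
print([h['id'] for h in reranked])
```

Only the top few reranked candidates would then be sent to the LLM, which keeps the number of paid API calls bounded.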

**LLM Cache schema:**
```
llm_cache index:
  vacancy_id   | keyword
  candidate_id | keyword  
  score        | float
  explanation  | text
  updated_at   | date
```

## 9. Quick Start

```bash
# 1. Set API key
echo 'OPENAI_API_KEY=sk-...' > .env

# 2. Start everything
docker compose up --build

# 3. Seed sample data
docker compose exec api python scripts/seed_data.py

# 4. Open
# Swagger:   http://localhost:8000/docs
# Streamlit: http://localhost:8501
# ES:        http://localhost:9200

# 5. Test API
curl 'http://localhost:8000/api/v1/recommendations?job_id=<id>&method=hybrid&top_k=5'
```
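
The same request can be made from Python. The sketch below uses only the standard library and the endpoint and query parameters from the `curl` command above. The function names are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this document.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = 'http://localhost:8000/api/v1/recommendations'


def recommendations_url(job_id: str, method: str = 'hybrid', top_k: int = 5) -> str:
    """Build the recommendations URL with properly encoded query parameters."""
    query = urlencode({'job_id': job_id, 'method': method, 'top_k': top_k})
    return f'{BASE_URL}?{query}'


def fetch_recommendations(job_id: str, method: str = 'hybrid', top_k: int = 5, timeout: float = 30.0):
    """Call the API and decode the response (assumed to be JSON)."""
    with urlopen(recommendations_url(job_id, method, top_k), timeout=timeout) as resp:
        return json.load(resp)


# Building the URL needs no running server; fetching does.
print(recommendations_url('<id>'))
```

`urlencode` percent-encodes the placeholder, so a real vacancy id containing special characters is passed safely.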