# Recruiting API: Quick Start Guide

This notebook shows you how to:
1. Pull candidate data from PostgreSQL
2. Format it for the API
3. Send it to the API to match against a job vacancy
4. Retrieve and analyze the results


In [3]:
# Import required libraries
import pandas as pd
import psycopg2
import json
import requests
import dotenv

# Load environment variables (for API keys)
dotenv.load_dotenv()


True

## 1. Connect to the Database

The PostgreSQL database stores candidate profiles in three key tables:
- `person_data`: Basic profile info (name, skills, location)
- `education_data`: Education history (schools, degrees, dates)
- `position_data`: Work experience (companies, titles, responsibilities)


In [None]:
# Connect to your PostgreSQL database
conn = psycopg2.connect(
    dbname="recruiting",
    user="postgres",
    password="postgres",
    host="127.0.0.1",
    port=6543  # Replace with the local port you forward to PostgreSQL
)
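The cell above hard-codes the credentials, although the first cell already loads a `.env` file. As a minimal sketch, the connection settings could be read from environment variables instead. The `PG_*` variable names are hypothetical (this project does not define them), and the defaults simply mirror the values used above.

```python
import os

def db_settings(env=None):
    """Build psycopg2.connect() keyword arguments from environment variables.

    The PG_* variable names are hypothetical; the defaults mirror the cell above.
    """
    env = os.environ if env is None else env
    return {
        "dbname": env.get("PG_DBNAME", "recruiting"),
        "user": env.get("PG_USER", "postgres"),
        "password": env.get("PG_PASSWORD", "postgres"),
        "host": env.get("PG_HOST", "127.0.0.1"),
        "port": int(env.get("PG_PORT", "6543")),
    }

settings = db_settings()
# conn = psycopg2.connect(**settings)
```

Keeping the lookup in one function makes it easy to see which settings a teammate must provide in their own `.env` file.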


## 2. Pull Candidate Data

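This SQL query aggregates each candidate's education and position rows into one text field per table, then joins both to `person_data` on `username` so that every candidate ends up in a single row.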


In [None]:
query = """
WITH edu AS (
    SELECT
        username,
        COALESCE(
            string_agg(
                format(
                    'edu: id=%s, start_date=%s, end_date=%s, fieldOfStudy=%s, degree=%s, grade=%s, schoolName=%s, description=%s, activities=%s, schoolId=%s',
                    id, start_date, end_date, "fieldOfStudy", degree, grade, "schoolName", description, activities, "schoolId"
                ),
                ' | '
            ),
            ''
        ) AS edu_text
    FROM education_data
    GROUP BY username
),
pos AS (
    SELECT
        username,
        COALESCE(
            string_agg(
                format(
                    'pos: id=%s, companyId=%s, companyName=%s, companyUsername=%s, companyIndustry=%s, companyStaffCountRange=%s, title=%s, location=%s, description=%s, employmentType=%s, start_date=%s, end_date=%s',
                    id, "companyId", "companyName", "companyUsername", "companyIndustry", "companyStaffCountRange", title, location, description, "employmentType", start_date, end_date
                ),
                ' | '
            ),
            ''
        ) AS pos_text
    FROM position_data
    GROUP BY username
)
SELECT
    p.id,
    p."fullName",
    p.summary,
    p.skills,
    p.location,
    p.country,
    p.city,
    concat_ws(' | ', edu_text, pos_text) AS combined_text
FROM person_data p
LEFT JOIN edu ON p.username = edu.username
LEFT JOIN pos ON p.username = pos.username;
"""

# Run the query and load the result into a DataFrame
df_raw = pd.read_sql_query(query, conn)

# Look at the data structure
df_raw.head(2)
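To make the aggregation step easier to follow, here is a rough pandas equivalent of what `format(...)` plus `string_agg(..., ' | ')` with `GROUP BY username` does. It is an illustration on made-up toy rows with only two of the columns, not a replacement for the query.

```python
import pandas as pd

# Toy rows standing in for education_data (values are made up for illustration).
education = pd.DataFrame({
    "username": ["ana", "ana", "bo"],
    "degree": ["BSc", "MSc", "BA"],
    "schoolName": ["Uni A", "Uni B", "Uni C"],
})

# Equivalent of format('edu: degree=%s, schoolName=%s', ...) applied per row.
education["entry"] = (
    "edu: degree=" + education["degree"] + ", schoolName=" + education["schoolName"]
)

# Equivalent of string_agg(entry, ' | ') ... GROUP BY username.
edu_text = education.groupby("username")["entry"].agg(" | ".join)
print(edu_text["ana"])
```

Note that `string_agg` without an `ORDER BY` clause does not guarantee the order of the joined entries, whereas the pandas version keeps the row order of the input.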


## 3. Format for the API

The API requires JSON data in which each candidate has an `id` field and a `text` field.


In [None]:
# Create a copy of the raw data
df = df_raw.copy()

# Format candidate information into a structured text field
df["text"] = df.apply(lambda row: "\n".join([
    f"fullName: {row['fullName']}",
    f"summary: {row['summary']}",
    f"skills: {row['skills']}",
    f"location: {row['location']}",
    f"country: {row['country']}",
    f"city: {row['city']}",
    f"combined_text: {row['combined_text']}"
]), axis=1)

# Keep only the columns needed by the API
df = df[["id", "text"]]

# Show the formatted data
pd.set_option('display.max_colwidth', None)
print(df.iloc[0]["text"][:500] + "...")  # Show sample of first record
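With the f-strings above, missing values end up in the text as the literal strings `None` or `nan`. A possible variant, sketched here on a made-up sample row, skips empty fields and then converts the frame to a list of `{"id", "text"}` records. The helper name `row_to_text` is hypothetical, and whether the API prefers omitted fields over empty ones is an assumption.

```python
import pandas as pd

FIELDS = ["fullName", "summary", "skills", "location", "country", "city", "combined_text"]

def row_to_text(row):
    """Join 'field: value' lines, skipping fields that are null or blank."""
    lines = []
    for field in FIELDS:
        value = row.get(field)
        # Only test scalars with pd.isna; list-like values (e.g. arrays) are kept as-is.
        if value is None or (pd.api.types.is_scalar(value) and pd.isna(value)):
            continue
        if str(value).strip() == "":
            continue
        lines.append(f"{field}: {value}")
    return "\n".join(lines)

# Made-up sample row with a null, a blank, and a NaN field.
sample = pd.DataFrame([
    {"id": 1, "fullName": "Test Person", "summary": None, "skills": "Python, SQL",
     "location": "", "country": "Spain", "city": float("nan"),
     "combined_text": "edu: degree=BSc"},
])
sample["text"] = sample.apply(row_to_text, axis=1)

# One dict per candidate, ready to be serialized as JSON.
records = sample[["id", "text"]].to_dict(orient="records")
print(records[0]["text"])
```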


## 4. Create a Job Vacancy Description

Create a detailed job description with all requirements.


In [None]:
# Write your job vacancy description
vacancy_text = """
Job title: Product Manager
Alternative titles: Country Manager, Business Development Manager, Regional Manager
Years of experience: 3
Required skills: Product development, Advanced Spanish, Latam region, Native Russian
Additional notes: a Russian-speaking candidate is important
Nice-to-have skills: New markets, Sales management, market research, budgeting
Domains: must have iGaming, Gambling
Search locations: Cyprus, Latvia, Georgia, Malta, Poland, Serbia, Mexico, Spain, Estonia, Russia, Bulgaria, Lithuania, Kazakhstan, Portugal, Thailand, Brazil, Argentina, Chile, Turkey
Citizenship: not Armenia, not Georgia
"""


In [1]:
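import pandas as pd

# Alternatively, load vacancies from a spreadsheet.
# Note: this overwrites the vacancy_text defined in the previous cell.
vacancies = pd.read_excel("4 вакансии на 05.05.2025.xlsx")

# Use the first vacancy, serialized as the string form of a Python dict
vacancy_text = str(vacancies.iloc[0].to_dict())
vacancy_text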

"{'id': 6436, 'title': 'Node.js Developer', 'description': 'Company Overview:\\n\\nJoin an international team at a company that has pioneered the records and information management industry. With over 225,000 businesses in our client base, including 95% of the Fortune 1000, we focus on building secure and scalable solutions to manage and protect corporate information. Now we are looking for a Node.js Developer with expertise in AWS to help solve business and technology challenges through our engineering and IT consulting services.\\n\\n  \\nProject Overview:\\n\\nLearning and skills platform which contains content and collaboration tools in one place. This platform combines the compliance requirements of a Learning Management System with personalization, social learning, comments and built-in skills functionality. The platform also allows user to gain insights about their learning progress and skills development in real-time, providing them with the necessary feedback to improve.\\n
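As the output shows, `str()` on a dict produces a Python repr with single quotes and doubly escaped newlines. If plain JSON text turns out to work better for the matching service (an assumption, not something this notebook confirms), a sketch like the following serializes the row with `json.dumps` instead. The toy row here is shortened for illustration.

```python
import json
import pandas as pd

# Toy stand-in for one row of the vacancies sheet (values shortened for illustration).
vacancies_demo = pd.DataFrame([
    {"id": 6436, "title": "Node.js Developer",
     "description": "Company Overview:\n\nJoin an international team."},
])

row = vacancies_demo.iloc[0].to_dict()

# ensure_ascii=False keeps non-ASCII text readable; default=str converts values
# that json cannot encode natively (for example NumPy integers or Timestamps).
vacancy_json = json.dumps(row, ensure_ascii=False, default=str)
print(vacancy_json)
```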

## 5. Send Data to the API

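Submit the vacancy to the batch matching endpoint. The request below carries only the vacancy text, and the response returns a `batch_id` that identifies the matching job.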


In [5]:

# Build the request payload
payload = {
    "vacancy_text": vacancy_text,
}

# Send the data to the matching endpoint
response = requests.post(
    "http://localhost:8000/api/v1/matching/match_candidates_batch",
    json=payload
)
response.raise_for_status()  # Fail early on an HTTP error

print(f"API Response: {response.text}")

# Get the batch job ID from the response
batch_id = response.json()["batch_id"]
print(f"Job started with ID: {batch_id}")


API Response: {"batch_id":"batch_681a7a53ee508190995256e066066723","status":"validating"}
Job started with ID: batch_681a7a53ee508190995256e066066723


In [None]:
# Check the status of the batch job started above
response = requests.get(f"http://localhost:8000/api/v1/matching/batch_job/{batch_id}")
print(f"Current status: {response.text}")

# Note: The job may take some time to complete.
# Re-run this cell until the status is "completed".
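Instead of re-running the status cell by hand, the check can be wrapped in a small polling loop. The sketch below takes the status lookup as a callable so it stays independent of the HTTP details. The function name `wait_for_batch`, the `"failed"` status, and the interval and timeout defaults are assumptions; only `"completed"` comes from this notebook.

```python
import time

def wait_for_batch(fetch_status, poll_seconds=10.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll fetch_status() until it returns "completed" or the timeout elapses.

    fetch_status is any callable that returns the job's current status string.
    The "failed" status is an assumed terminal state, not confirmed by the API.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status == "completed":
            return status
        if status == "failed":  # assumed failure state
            raise RuntimeError(f"Batch job ended with status: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Batch job still '{status}' after {timeout_seconds} seconds")
        sleep(poll_seconds)

# Example wiring, assuming the status endpoint returns JSON with a "status" key
# (as the POST response does):
# status = wait_for_batch(
#     lambda: requests.get(
#         f"http://localhost:8000/api/v1/matching/batch_job/{batch_id}"
#     ).json()["status"]
# )
```

The explicit timeout keeps the notebook from hanging indefinitely if the job never reaches a terminal state.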


## 6. Process the Results

When the job completes, load the results file and analyze the matches.


In [9]:
# Load and analyze the results
import pandas as pd
# Replace the filename with your actual results file
results = pd.read_json("data/candidate_scores_20250506_191347.json")

# Convert to a flat DataFrame for easier analysis
df_candidates = pd.json_normalize(results["candidates"])

# Sort by score so the best matches come first
df_candidates = df_candidates.sort_values(by="info.score", ascending=False)
print(f"Found {len(df_candidates)} matching candidates")
df_candidates


Found 50 matching candidates


| Unnamed: 0 | name | sourceId | sourceUrl | sourceType | vacancyId | info.score | info.reasoning |
|---|---|---|---|---|---|---|---|
| 0 | Ulises Gascón | 1182 | https://www.linkedin.com/in/ulisesgascon | linkedin | 0 | 9.5 | Ulises Gascón has extensive experience with No... |
| 1 | Alasdair Butterworth-West | 518 | https://www.linkedin.com/in/alasdair-west | linkedin | 0 | 9.0 | The candidate has extensive experience with No... |
| 2 | Alexis Tondelier | 668 | https://www.linkedin.com/in/alexis-tondelier-3... | linkedin | 0 | 9.0 | The candidate has extensive experience with No... |
| 3 | Felipe Carvalho | 873 | https://www.linkedin.com/in/felipecarvalho07 | linkedin | 0 | 9.0 | Felipe Carvalho has extensive experience with ... |
| 4 | Mir Imad Ahmed | 960 | https://www.linkedin.com/in/mirimad | linkedin | 0 | 9.0 | The candidate has extensive experience with No... |
| 5 | Jaxsan Sivanesan | 964 | https://www.linkedin.com/in/jaxsan-sivanesan-9... | linkedin | 0 | 9.0 | The candidate has over 4 years of experience w... |
| 6 | Zakariya Mohummed | 968 | https://www.linkedin.com/in/zakariya-mohummed | linkedin | 0 | 9.0 | Zakariya Mohummed has extensive experience wit... |
| 7 | Viviane Dias | 971 | https://www.linkedin.com/in/viviane-p-dias | linkedin | 0 | 9.0 | Viviane Dias has over six years of experience ... |
| 8 | Adnan Gobeljic | 973 | https://www.linkedin.com/in/adnan-gobeljic | linkedin | 0 | 9.0 | Adnan Gobeljic has extensive experience with N... |
| 9 | Andrey Enavin | 1006 | https://www.linkedin.com/in/andreye | linkedin | 0 | 9.0 | Andrey Enavin has extensive experience with No... |
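The `info.score` and `info.reasoning` columns come from `pd.json_normalize`, which flattens the nested `info` object of each candidate into dotted column names. The sketch below reproduces that on made-up records and then keeps only candidates above a cutoff. The records and the `min_score` threshold are hypothetical.

```python
import pandas as pd

# Toy records shaped like the results above (values are made up).
candidates = [
    {"name": "Candidate A", "sourceId": 1,
     "info": {"score": 7.5, "reasoning": "Some Node.js experience"}},
    {"name": "Candidate B", "sourceId": 2,
     "info": {"score": 9.0, "reasoning": "Strong Node.js and AWS"}},
    {"name": "Candidate C", "sourceId": 3,
     "info": {"score": 4.0, "reasoning": "No backend experience"}},
]

# The nested "info" dict becomes the columns info.score and info.reasoning.
flat = pd.json_normalize(candidates)
flat = flat.sort_values(by="info.score", ascending=False)

# Keep only candidates at or above a hypothetical cutoff.
min_score = 7.0
shortlist = flat[flat["info.score"] >= min_score].reset_index(drop=True)
print(shortlist[["name", "info.score"]])
```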


In [10]:
df_candidates.to_csv("data/candidates_NodeJS_06.05.2025.csv", index=False)