## Setting Up the Environment

We begin by importing the necessary libraries:

- **os, json, toml, pathlib.Path**: For file system operations and reading configuration files.
- **pandas**: To load the candidate data in tabular form.
- **torch**: To detect whether a GPU is available.
- **transformers**: To load and run a language model from Hugging Face.
- **IPython.display**: To render output as Markdown for readability.


In [20]:
import os
import json
import toml
import pandas as pd
from pathlib import Path
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    pipeline,
)
from IPython.display import display, Markdown


This step ensures the environment has all the required tools and libraries loaded.

## Configuration and Credentials

Next, we define key file paths: the root, data, and configuration directories, all resolved relative to the parent of the current working directory. The script then reads Hugging Face credentials from a `credentials.json` file in the configuration directory; these are needed to access private model resources. Finally, it checks that `HUGGINGFACE_TOKEN` is available, either as an environment variable or as a key in the credentials file. This check only reports that the token is present; it does not pass the token to Hugging Face.

In [21]:
# %%
# Paths
paths = {
    'root': Path.cwd().parent,
    'data': Path.cwd().parent / "data",
    "config": Path.cwd().parent / "config"
}

# Load Hugging Face credentials
with open(paths["config"] / 'credentials.json') as f:
    credentials = json.load(f)

if "HUGGINGFACE_TOKEN" in os.environ or "HUGGINGFACE_TOKEN" in credentials:
    print("Environment variable HUGGINGFACE_TOKEN set.")

Environment variable HUGGINGFACE_TOKEN set.
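
The check above only reports whether a token is present. As a minimal sketch of how the token could be resolved to a value, the hypothetical helper `resolve_hf_token` below prefers the environment variable and falls back to the credentials dictionary. The function name and the precedence order are assumptions for illustration, not part of the original notebook.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(
    credentials: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
    key: str = "HUGGINGFACE_TOKEN",
) -> Optional[str]:
    """Return the token from the environment, else from credentials, else None."""
    env = os.environ if env is None else env
    # Empty strings are treated as "not set"
    return env.get(key) or credentials.get(key) or None


# Example with stand-in values (not real tokens)
token = resolve_hf_token({"HUGGINGFACE_TOKEN": "from-file"}, env={})
print("Token found." if token else "No token configured.")
```

The resolved value could then be handed to the model-loading calls, for example through the `token` argument that recent `transformers` releases accept in `from_pretrained`.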


## Load External Text Prompts

We load the system role description, the user prompt template, the response format, and the search criteria from Markdown files in the configuration directory.

In [22]:
# Load the prompt components (system description, user prompt template,
# response format, search criteria) from files in the config directory
with open(paths["config"] / 'role_system_description.md', 'r') as f:
    role_system_description = f.read()

with open(paths["config"] / 'user_prompt_template.md', 'r') as f:
    user_prompt_template = f.read()

with open(paths["config"] / 'response_format.md', 'r') as f:
    response_format = f.read()

with open(paths["config"] / 'search_criteria.md', 'r') as f:
    search_criteria = f.read()
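
The four reads above follow the same pattern, so they could be collapsed into a loop. The sketch below assumes a hypothetical helper `load_prompt_parts` and uses a temporary directory with placeholder file contents so that it runs on its own; in the notebook, `config_dir` would be `paths["config"]`.

```python
import tempfile
from pathlib import Path


def load_prompt_parts(config_dir: Path, names: list[str]) -> dict[str, str]:
    """Read `<name>.md` for each name and return a name -> text mapping."""
    return {
        name: (config_dir / f"{name}.md").read_text(encoding="utf-8")
        for name in names
    }


# Self-contained demo with placeholder contents
with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp)
    names = ["role_system_description", "user_prompt_template"]
    for name in names:
        (config_dir / f"{name}.md").write_text(
            f"placeholder for {name}", encoding="utf-8"
        )

    parts = load_prompt_parts(config_dir, names)
    print(sorted(parts))
```

Keeping the texts in one dictionary makes it easier to see at a glance which prompt components have been loaded.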


## Loading Instructions & Candidate Data

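This section first reads the task instructions from `instructions.md`, falling back to an empty string if the file is missing. It then loads the shortlisted candidates' job titles from a Parquet file. If reading the Parquet file fails (e.g., because the file is missing or malformed), it falls back to a CSV version, which is indexed by a `rank` column. Either way, the result is a plain list of job titles that is later inserted into the prompt.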

In [23]:
# Load instructions
try:
    with open(paths["config"] / "instructions.md", 'r') as file:
        instructions = file.read()
        print("Instructions successfully read!")
except FileNotFoundError:
    instructions = ""
    print("Instructions file not found.")

# Load shortlisted candidates from parquet or CSV
try:
    list_of_candidates = (
        pd.read_parquet(paths['data'] / "processed/filtered.parquet", columns=['job_title'])['job_title']
          .to_list()
    )
except Exception as e:
    print(f"Failed to load parquet file: {e}. Loading CSV instead.")
    list_of_candidates = (
        pd.read_csv(paths['data'] / "processed/filtered.csv", index_col=['rank'], usecols=['rank', 'job_title'])
          ['job_title']
          .to_list()
    )


Instructions successfully read!
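
The Parquet-then-CSV fallback can be generalized as "try each loader in order and return the first that succeeds". The following is a library-agnostic sketch: `load_with_fallback` and the stand-in loaders are hypothetical names introduced for illustration, and in the notebook the callables would wrap `pd.read_parquet` and `pd.read_csv`.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def load_with_fallback(loaders: Sequence[tuple[str, Callable[[], T]]]) -> T:
    """Call each loader in order; return the first result that does not raise."""
    errors = []
    for label, loader in loaders:
        try:
            return loader()
        except Exception as exc:  # broad on purpose, mirroring the cell above
            errors.append(f"{label}: {exc}")
            print(f"Failed to load {label}: {exc}. Trying next source.")
    raise RuntimeError("All loaders failed: " + "; ".join(errors))


def broken_parquet_loader() -> list[str]:
    # Stand-in for a Parquet read that fails
    raise FileNotFoundError("filtered.parquet not found")


def csv_loader() -> list[str]:
    # Stand-in for a CSV read that succeeds; the titles are made up
    return ["Data Analyst", "HR Coordinator"]


titles = load_with_fallback(
    [("parquet", broken_parquet_loader), ("csv", csv_loader)]
)
print(titles)
```

Raising an error once every source has failed avoids continuing with an undefined candidate list, which is what would happen in the cell above if the CSV read also failed silently.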


## Initializing the Language Model

Here, we initialize a language model pipeline for text generation:

1. **Select a Model**: We use `microsoft/Phi-3-mini-128k-instruct`, the compact, instruction-tuned Phi-3 variant from Microsoft with a 128k-token context window.
2. **Check Device**: The script checks if a GPU is available and uses it for faster computation; otherwise, it falls back to CPU.
3. **Load Model and Tokenizer**: The model and its tokenizer are loaded, and a text generation pipeline is set up for ease of use.

This setup allows us to generate human-like text based on specified prompts.

In [25]:
# %%
# Initialize the model and tokenizer
model_name = "microsoft/Phi-3-mini-128k-instruct"

device = "cuda" if torch.cuda.is_available() else "cpu"  # Use GPU if available

# Adjust model loading for GPU
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    # torch_dtype=torch.float16,  # Use half precision
    trust_remote_code=True,
    device_map=device,  # Use GPU if available
)

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Set up the pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

Loading checkpoint shards: 100%|██████████| 2/2 [01:24<00:00, 42.25s/it]
Device set to use cuda
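
The choice of `torch_dtype` largely determines how much memory the weights need. As a rough back-of-the-envelope sketch, the helper below multiplies the parameter count by the bytes per parameter. The figure of 3.8 billion parameters for Phi-3-mini is taken from Microsoft's published model description; the helper name is hypothetical, and the estimate ignores activations, the KV cache, and framework overhead.

```python
def estimate_weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


# Assumption: parameter count as published for Phi-3-mini
PHI3_MINI_PARAMS = 3.8e9

for dtype_name, width in [("float32", 4), ("float16/bfloat16", 2)]:
    gib = estimate_weight_memory_gib(PHI3_MINI_PARAMS, width)
    print(f"{dtype_name}: ~{gib:.1f} GiB")
```

Under these assumptions, half precision roughly halves the weight footprint, which is the motivation behind the commented-out `torch.float16` option in the cell above.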


## Prepare Dynamic Prompt Parameters

In [26]:
# Define dynamic parameters for the prompt
search_terms_path = paths["config"] / 'search_terms.toml'
search_terms = toml.load(search_terms_path)

list_similar_roles = ', '.join(search_terms['search_phrases'])
location = search_terms['location']

# Format the user prompt using the template and parameters
user_prompt = user_prompt_template.format(
    location=location,
    list_similar_roles=list_similar_roles,
    search_criteria=search_criteria,
    list_of_candidates="\n".join(list_of_candidates)  # Joining list elements for display
)

# Combine system description and user prompt for models that don't support system roles
combined_prompt = f"{role_system_description}\n\n{user_prompt}"
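
`str.format` fills named placeholders such as `{location}` in the template. The sketch below uses a made-up template and made-up values to show the mechanics; the real contents of `user_prompt_template.md` are not shown in this notebook. One thing to watch: any literal braces in the template (for example a JSON snippet) need to be doubled as `{{` and `}}`, otherwise `format` tries to interpret them as placeholders and fails.

```python
# Hypothetical template; the real one lives in user_prompt_template.md
template = (
    "Location: {location}\n"
    "Similar roles: {list_similar_roles}\n"
    "Candidates:\n{list_of_candidates}\n"
    'Reply as JSON, e.g. {{"rank": 1}}'
)

candidates = ["Data Analyst", "HR Coordinator"]  # made-up job titles

prompt = template.format(
    location="Toronto",  # made-up value
    list_similar_roles=", ".join(["analyst", "data scientist"]),
    list_of_candidates="\n".join(candidates),
)
print(prompt)
```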

## Generating and Displaying Output

In the final step:

1. **Prepare Prompt Messages**: Messages are structured for chat-style models:
   - A system message sets the context for the AI.
   - A user message defines the task: finding suitable candidates based on the search criteria, with the candidate job titles included.
2. **Generate Text**: The model generates text with greedy decoding (`do_sample=False`), producing up to 5,000 new tokens. The call passes `combined_prompt` rather than `messages`, so the system description and user prompt reach the model as a single string.
3. **Display Output**: The generated text is rendered as Markdown for readability.

This flow ties together data loading, prompt construction, model inference, and result display.

In [None]:
# Prepare messages in the format required by the model
messages = [
    {"role": "system", "content": role_system_description},
    {"role": "user", "content": user_prompt},
]

# Generate text
generation_args = {
    "max_new_tokens": 5_000,
    "return_full_text": False,
    "do_sample": False,
}

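# Pass the combined prompt; `messages` above is the chat-format alternative
output = pipe(combined_prompt, **generation_args)
# Markdown() takes a single content string, so embed the generated text in it
display(Markdown(f"Generated Output:\n\n{output[0]['generated_text']}"))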
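
For reference, the relationship between `messages` and `combined_prompt` can be sketched as a small helper that flattens role-tagged messages into one string, which is what the notebook does by hand for models without system-role support. `flatten_messages` is a hypothetical name, and joining with a blank line simply follows the convention used in the cell that builds `combined_prompt`.

```python
def flatten_messages(messages: list[dict[str, str]]) -> str:
    """Join message contents in order, separated by blank lines."""
    return "\n\n".join(m["content"] for m in messages)


# Placeholder contents, not the notebook's actual prompts
messages = [
    {"role": "system", "content": "You are a recruiting assistant."},
    {"role": "user", "content": "Rank the candidates below."},
]

combined_prompt = flatten_messages(messages)
print(combined_prompt)
```

Deriving the combined string from `messages` keeps the two representations in sync if the prompts change later.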