## 0. Install Required Dependencies

**Run this cell first if you encounter import errors**

In [1]:
import sys
import subprocess

# Install required dependencies
dependencies = ['accelerate>=0.26.0', 'datasets']

for dep in dependencies:
    print(f"Installing {dep}...")
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", dep])

print("✅ All dependencies installed! Please restart the kernel (Kernel > Restart Kernel) and then run all cells.")

Installing accelerate>=0.26.0...
Installing datasets...
✅ All dependencies installed! Please restart the kernel (Kernel > Restart Kernel) and then run all cells.


# Fine-tune SentenceTransformer Models for ITSM Tickets
This notebook fine-tunes the **all-mpnet-base-v2** embedding model (and can be adapted for others) using contrastive learning with pseudo-labeled training data from your ITSM tickets.
## Approach
- **Positive pairs**: Tickets from the same category (assumed similar)
- **Negative pairs**: Tickets from different categories (assumed dissimilar)
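- **Loss function**: Cosine similarity loss, which regresses each pair's cosine similarity toward its label (a contrastive-style objective, distinct from sentence-transformers' `ContrastiveLoss`)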
- **Base model**: sentence-transformers/all-mpnet-base-v2 (768-dim embeddings)
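
The pseudo-labeling rule above can be sketched on toy data. The ticket texts, the `label_pairs` helper, and the 1.0/0.0 similarity targets are illustrative assumptions, not values taken from this notebook's training cell.

```python
from itertools import combinations

# Hypothetical toy tickets: (text, category)
tickets = [
    ("VPN drops every few minutes", "Network"),
    ("Cannot reach file share over Wi-Fi", "Network"),
    ("Laptop screen flickers", "Hardware"),
]

def label_pairs(items):
    """Return (text1, text2, label) triples: 1.0 for same category, else 0.0."""
    pairs = []
    for (text_a, cat_a), (text_b, cat_b) in combinations(items, 2):
        label = 1.0 if cat_a == cat_b else 0.0  # assumed similarity targets
        pairs.append((text_a, text_b, label))
    return pairs

pairs = label_pairs(tickets)
for text_a, text_b, label in pairs:
    print(f"{label:.1f}  {text_a!r} <-> {text_b!r}")
```

Because the labels come only from the category field, two tickets in the same broad category are treated as similar even when they describe unrelated problems, so the labels should be read as noisy.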

## 1. Setup and Imports

In [2]:
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(os.getcwd()), '..'))

import json
import torch
from datetime import datetime
import logging
from datasets import DatasetDict

# Import sentence-transformers
from sentence_transformers import SentenceTransformer, InputExample, losses
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator
from torch.utils.data import DataLoader

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

print("✅ Imports successful")
print(f"PyTorch version: {torch.__version__}")
print(f"Device: {'CUDA' if torch.cuda.is_available() else 'CPU'}")


✅ Imports successful
PyTorch version: 2.9.1
Device: CPU


## 2. Configuration

In [3]:
# Training configuration
CONFIG = {
    'base_model': 'sentence-transformers/all-mpnet-base-v2',
    'source_data': '../data/servicenow_incidents_full.json',  # Source incidents
    'output_dir': 'models/all-mpnet-finetuned',
    'epochs': 100,  # Consider starting with fewer epochs and increasing only if needed
    'batch_size': 32,
    'learning_rate': 2e-5,
    'warmup_steps': 100,
    'eval_split': 0.1  # 10% for evaluation
}

print("Configuration:")
for key, value in CONFIG.items():
    print(f"  {key}: {value}")

Configuration:
  base_model: sentence-transformers/all-mpnet-base-v2
  source_data: ../data/servicenow_incidents_full.json
  output_dir: models/all-mpnet-finetuned
  epochs: 100
  batch_size: 32
  learning_rate: 2e-05
  warmup_steps: 100
  eval_split: 0.1
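
To see how these settings interact, the arithmetic below estimates the number of optimizer steps. It is a rough sketch: the total of 409 pairs is the count printed by the pair-generation cell in this notebook, while the holdout rounding and the one-step-per-batch rule are assumptions about the training loop, which may differ.

```python
import math

# From CONFIG above
epochs, batch_size, warmup_steps, eval_split = 100, 32, 100, 0.1
# Total pair count reported by the pair-generation cell in this notebook
total_pairs = 409

# Assumption: a simple holdout that rounds the evaluation share down
eval_pairs = int(total_pairs * eval_split)
train_pairs = total_pairs - eval_pairs

# Assumption: one optimizer step per batch, no gradient accumulation
steps_per_epoch = math.ceil(train_pairs / batch_size)
total_steps = steps_per_epoch * epochs
warmup_fraction = warmup_steps / total_steps

print(f"train pairs: {train_pairs}, eval pairs: {eval_pairs}")
print(f"steps/epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"warmup covers {warmup_fraction:.1%} of training")
```

Under these assumptions each epoch is only about a dozen steps, so 100 warmup steps span a little over eight epochs.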


## 3. Load Training Data

In [6]:
# Generate training pairs from ServiceNow incidents
import random
from collections import defaultdict

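# Load ServiceNow incidents
# Note: this hard-coded path is relative to the working directory and differs from CONFIG['source_data']
incidents_file = "data/servicenow_incidents_full.json"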
print(f"Loading incidents from: {incidents_file}")

with open(incidents_file, 'r') as f:
    incidents = json.load(f)

print(f"Loaded {len(incidents)} incidents")

# Group incidents by category
category_groups = defaultdict(list)
for incident in incidents:
    category = incident.get('category', 'Unknown')
    if category and category != '':
        # Create text representation combining short_description and description
        text = f"{incident.get('short_description', '')}. {incident.get('description', '')}"
        category_groups[category].append({
            'id': incident.get('incident_number', incident.get('sys_id', '')),
            'text': text.strip(),
            'category': category
        })

print(f"\nCategories found: {len(category_groups)}")
for cat, items in category_groups.items():
    print(f"  {cat}: {len(items)} incidents")

# Generate positive pairs (same category)
positive_pairs = []
for category, items in category_groups.items():
    if len(items) >= 2:
        # Create pairs within the same category
        for i in range(len(items)):
            for j in range(i + 1, min(i + 6, len(items))):  # Limit pairs per incident
                positive_pairs.append({
                    'ticket1_id': items[i]['id'],
                    'ticket2_id': items[j]['id'],
                    'text1': items[i]['text'],
                    'text2': items[j]['text'],
                    'category1': category,
                    'category2': category
                })

# Generate negative pairs (different categories)
negative_pairs = []
categories = list(category_groups.keys())
for i in range(len(categories)):
    for j in range(i + 1, len(categories)):
        cat1_items = category_groups[categories[i]]
        cat2_items = category_groups[categories[j]]

        # Sample random pairs between different categories
        num_pairs = min(len(cat1_items) * 2, len(cat2_items) * 2, 50)
        for _ in range(num_pairs):
            item1 = random.choice(cat1_items)
            item2 = random.choice(cat2_items)
            negative_pairs.append({
                'ticket1_id': item1['id'],
                'ticket2_id': item2['id'],
                'text1': item1['text'],
                'text2': item2['text'],
                'category1': item1['category'],
                'category2': item2['category']
            })

print(f"\n📊 Generated Training Pairs:")
print(f"  Positive pairs: {len(positive_pairs)}")
print(f"  Negative pairs: {len(negative_pairs)}")
print(f"  Total pairs: {len(positive_pairs) + len(negative_pairs)}")

# Save to training_pairs.json for future use
training_data = {
    'positive_pairs': positive_pairs,
    'negative_pairs': negative_pairs,
    'metadata': {
        'num_incidents': len(incidents),
        'num_categories': len(category_groups),
        'generated_on': datetime.now().isoformat()
    }
}

training_pairs_path = os.path.join(os.getcwd(), 'data', 'training_pairs.json')
os.makedirs(os.path.dirname(training_pairs_path), exist_ok=True)
with open(training_pairs_path, 'w') as f:
    json.dump(training_data, f, indent=2)

print(f"\n✅ Training pairs saved to: {training_pairs_path}")

Loading incidents from: data/servicenow_incidents_full.json
Loaded 76 incidents

Categories found: 5
  Inquiry / Help: 41 incidents
  Network: 6 incidents
  Hardware: 10 incidents
  Software: 13 incidents
  Database: 2 incidents

📊 Generated Training Pairs:
  Positive pairs: 291
  Negative pairs: 118
  Total pairs: 409

✅ Training pairs saved to: /Users/don/Documents/University/Current Classes/Capstone/let me try again/data/training_pairs.json
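
The pair counts above follow directly from the category sizes and the two limits in the cell (five following tickets per positive anchor, and at most `min(2 * n1, 2 * n2, 50)` draws per category pair). The sketch below recomputes them; the helper names `positive_count` and `negative_count` are introduced here for illustration.

```python
from itertools import combinations

# Category sizes as printed by the cell above
sizes = {"Inquiry / Help": 41, "Network": 6, "Hardware": 10, "Software": 13, "Database": 2}

def positive_count(n, window=5):
    # Ticket i is paired with up to `window` tickets that follow it in its category
    return sum(min(window, n - 1 - i) for i in range(n))

def negative_count(n1, n2, cap=50):
    # Number of random draws made for one pair of categories
    return min(2 * n1, 2 * n2, cap)

positives = sum(positive_count(n) for n in sizes.values())
negatives = sum(negative_count(a, b) for a, b in combinations(sizes.values(), 2))
largest_share = positive_count(sizes["Inquiry / Help"]) / positives

print(f"positive pairs: {positives}")
print(f"negative pairs: {negatives}")
print(f"share of positives from the largest category: {largest_share:.0%}")
```

Two properties are worth keeping in mind when reading the totals: the largest category contributes most of the positive pairs, and the negative pairs are drawn with `random.choice` without a seed, so they can repeat and will differ between runs.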


In [7]:
# The training pairs are already loaded from the previous cell
# Just display a summary
print(f"\n📊 Training Data Summary:")
print(f"  Positive pairs: {len(positive_pairs)}")
print(f"  Negative pairs: {len(negative_pairs)}")
print(f"  Total pairs: {len(positive_pairs) + len(negative_pairs)}")

# Show example pairs
if positive_pairs:
    print(f"\n📝 Example Positive Pair (same category):")
    example = positive_pairs[0]
    print(f"  Category: {example['category1']}")
    print(f"  Ticket 1 ({example['ticket1_id']}): {example['text1'][:100]}...")
    print(f"  Ticket 2 ({example['ticket2_id']}): {example['text2'][:100]}...")

if negative_pairs:
    print(f"\n📝 Example Negative Pair (different categories):")
    example = negative_pairs[0]
    print(f"  Category 1: {example['category1']}")
    print(f"  Category 2: {example['category2']}")
    print(f"  Ticket 1 ({example['ticket1_id']}): {example['text1'][:100]}...")
    print(f"  Ticket 2 ({example['ticket2_id']}): {example['text2'][:100]}...")


📊 Training Data Summary:
  Positive pairs: 291
  Negative pairs: 118
  Total pairs: 409

📝 Example Positive Pair (same category):
  Category: Inquiry / Help
  Ticket 1 (INC0010054): Equipment selection not saved for new location. During the onboarding process for an additional loca...
  Ticket 2 (INC0010053): Merchant unable to submit e-signed agreement. A Sales Agent's access to configure service fees for a...

📝 Example Negative Pair (different categories):
  Category 1: Inquiry / Help
  Category 2: Network
  Ticket 1 (INC0000053): The SAP HR application is not accessible. I've been trying to access the SAP HR application for the ...
  Ticket 2 (INC0010052): Equipment Configuration Freeze on Legacy Browser. An issue arose where fees were calculated inaccura...


In [13]:
import os
import json
import numpy as np
import pandas as pd
import time
import re # Import regex module
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import openai
from openai import OpenAI
from dotenv import load_dotenv

# Guard against missing `output_path` (cells may be executed out of order)
output_path = globals().get('output_path', None)
if output_path is None:
    default_dir = 'models/all-mpnet-finetuned'
    if 'CONFIG' in globals() and isinstance(CONFIG, dict):
        output_path = os.path.join(os.getcwd(), CONFIG.get('output_dir', default_dir))
    else:
        output_path = os.path.join(os.getcwd(), default_dir)
    print(f"Warning: `output_path` was not defined. Using fallback: {output_path}")

# ----------------------------
# Paths
# ----------------------------

RAW_JSON_PATH = os.path.join(os.getcwd(), "data", "servicenow_incidents_full.json")
REL_OUT_CSV   = "data/relationship_pairs.csv"
REL_OUT_JSON  = "data/relationship_pairs.json"

MAX_TICKETS = 400       # cap to reduce cost; adjust as needed
TOP_K_NEIGHBORS = 5     # candidate neighbors per ticket
SLEEP_BETWEEN_CALLS = 0.4

LLM_MODEL_NAME = "gpt-4o-mini"  # or whichever model you use

RELATION_PROMPT = """
You are an expert in IT Service Management (ITSM) and incident management.
Your task is to analyze two incident tickets and determine the relationship between them.
Based on the short descriptions and descriptions of the two tickets, classify their relationship into one of the following categories:

- **duplicate**: Ticket B is a duplicate of Ticket A. They describe the exact same underlying issue, and one ticket could be closed in favor of the other.
- **related**: Ticket A and Ticket B describe different but highly relevant issues. They might be part of the same larger problem, affect the same system, or require similar solutions, but neither is a direct duplicate of the other.
- **causal**: Ticket B is a direct consequence or cause of Ticket A. For example, Ticket A was created because of an event described in Ticket B, or vice-versa. There is a clear cause-and-effect link.
- **none**: There is no significant relationship between Ticket A and Ticket B based on the provided information.

If you classify a relationship as 'causal', you must also indicate the 'direction' of the causality:
- **A_causes_B**: Ticket A caused Ticket B.
- **B_causes_A**: Ticket B caused Ticket A.
- **mutually_causal**: A and B are mutually causative or part of a feedback loop.

Provide your output as a JSON object with the following keys:
- `label`: (string) One of "duplicate", "related", "causal", or "none".
- `explanation`: (string) A brief, clear explanation for your classification.
- `direction`: (string, required only if label is "causal") One of "A_causes_B", "B_causes_A", or "mutually_causal". If the label is not "causal", set this to "none".

Here are the two incident tickets:

---
**Ticket A (ID: {ticket_a_id})**
Created On: {ticket_a_created}
Affected Application: {ticket_a_app}
Short Description: {ticket_a_short}
Description: {ticket_a_desc}

---
**Ticket B (ID: {ticket_b_id})**
Created On: {ticket_b_created}
Affected Application: {ticket_b_app}
Short Description: {ticket_b_short}
Description: {ticket_b_desc}

---
Example Output for 'duplicate':
```json
{{
  "label": "duplicate",
  "explanation": "Both tickets describe the same login issue for the same application on the same day.",
  "direction": "none"
}}
```

Example Output for 'related':
```json
{{
  "label": "related",
  "explanation": "Ticket A reports a database connection error, and Ticket B reports an application outage. The application likely uses the database, suggesting they are related systems.",
  "direction": "none"
}}
```

Example Output for 'causal' (A causes B):
```json
{{
  "label": "causal",
  "explanation": "The network outage reported in Ticket A directly led to users being unable to access the application, as reported in Ticket B.",
  "direction": "A_causes_B"
}}
```

Example Output for 'none':
```json
{{
  "label": "none",
  "explanation": "The incidents describe unrelated issues affecting different systems and users.",
  "direction": "none"
}}
```
"""

# Initialize OpenAI client
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
client = OpenAI(api_key=api_key)
# Never print the key itself; only report whether it was found
print(f"DEBUG: OPENAI_API_KEY being used: {'<redacted>' if api_key else '<missing>'}")

# ----------------------------
# 1. Load raw incidents JSON
# ----------------------------

print("Loading:", RAW_JSON_PATH)
with open(RAW_JSON_PATH, "r") as f:
    raw_data = json.load(f)

df = pd.DataFrame(raw_data)
print("Loaded", len(df), "incidents")

# Trim
if len(df) > MAX_TICKETS:
    df = df.iloc[:MAX_TICKETS].copy()
    print(f"Trimmed to first {MAX_TICKETS} incidents.")

# Build unified text field
df["text"] = (
    df["short_description"].fillna("") + "\n\n" +
    df["description"].fillna("")
)

print(df[["incident_number", "short_description"]].head())


# ----------------------------
# 2. Encode ticket embeddings
# ----------------------------

print("\nLoading embedding model from:", output_path)
try:
    embedder = SentenceTransformer(output_path)
except FileNotFoundError:
    print(f"Fine-tuned model not found at {output_path}. Loading base model: {CONFIG['base_model']}")
    embedder = SentenceTransformer(CONFIG['base_model'])

texts = df["text"].astype(str).tolist()
print("Encoding all incidents...")
emb = embedder.encode(
    texts, batch_size=32,
    convert_to_numpy=True,
    normalize_embeddings=True,
    show_progress_bar=True
)

# ----------------------------
# 3. Build similarity matrix
# ----------------------------

print("\nComputing similarities...")
sim = cosine_similarity(emb)

candidate_pairs = []
N = len(df)

for i in range(N):
    sims = sim[i].copy()
    sims[i] = -1.0

    top_idx = np.argsort(sims)[-TOP_K_NEIGHBORS:]
    for j in top_idx:
        if j <= i:
            continue
        candidate_pairs.append((i, j, float(sims[j])))

print("Candidate pairs:", len(candidate_pairs))


# ----------------------------
# 4. Helper: build prompt
# ----------------------------

def build_prompt(a, b):
    return RELATION_PROMPT.format(
        ticket_a_id      = a.get("incident_number", ""),
        ticket_a_created = a.get("sys_created_on", ""),
        ticket_a_app      = a.get("cmdb_ci", ""),
        ticket_a_short   = a.get("short_description", ""),
        ticket_a_desc    = a.get("description", ""),

        ticket_b_id       = b.get("incident_number", ""),
        ticket_b_created = b.get("sys_created_on", ""),
        ticket_b_app     = b.get("cmdb_ci", ""),
        ticket_b_short   = b.get("short_description", ""),
        ticket_b_desc    = b.get("description", "")
    )


# ----------------------------
# 5. LLM call helper
# ----------------------------

def call_llm(prompt):
    try:
        r = client.chat.completions.create(
            model=LLM_MODEL_NAME,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.0,
        )
        raw_text = r.choices[0].message.content.strip()

        # Attempt to extract JSON from potentially markdown-formatted response
        match = re.search(r"```json\s*(\{.*\})\s*```", raw_text, re.DOTALL)
        if match:
            json_text = match.group(1)
        else:
            json_text = raw_text # Assume it's direct JSON if no markdown block

        return json.loads(json_text)
    except json.JSONDecodeError as e:
        # Log the raw response if JSON parsing fails for debugging
        print(f"JSONDecodeError: {e}. Raw LLM response: {raw_text}")
        return {"label": "none", "explanation": f"JSON parsing failed: {e}. Raw response: {raw_text[:200]}...", "direction": "none"}
    except Exception as e:
        return {"label": "none", "explanation": str(e), "direction": "none"}


# ----------------------------
# 6. Label all pairs with LLM
# ----------------------------

labeled = []

print("\nLabeling", len(candidate_pairs), "pairs using LLM…")
for idx, (i, j, sim_val) in enumerate(candidate_pairs, start=1):
    a = df.iloc[i].to_dict()
    b = df.iloc[j].to_dict()

    prompt = build_prompt(a, b)
    result = call_llm(prompt)

    labeled.append({
        "ticket_a_number": a["incident_number"],
        "ticket_b_number": b["incident_number"],
        "text_a": a["text"],
        "text_b": b["text"],
        "similarity": sim_val,
        "label": result.get("label", "none"),
        "direction": result.get("direction", "none"),
        "explanation": result.get("explanation", "")
    })

    if idx % 5 == 0:
        print(f"  → {idx}/{len(candidate_pairs)} pairs labeled")

    time.sleep(SLEEP_BETWEEN_CALLS)


# ----------------------------
# 7. Save results
# ----------------------------

df_rel = pd.DataFrame(labeled)

os.makedirs(os.path.dirname(REL_OUT_CSV), exist_ok=True)

df_rel.to_csv(REL_OUT_CSV, index=False)
with open(REL_OUT_JSON, "w") as f:
    json.dump(labeled, f, indent=2)

print("\nSaved relationship pairs:")
print("CSV :", REL_OUT_CSV)
print("JSON:", REL_OUT_JSON)

print("\nSample:")
display(df_rel.head())
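
The `call_llm` helper above accepts whatever JSON the model returns. A stricter variant could validate the label and direction against the values listed in the prompt before a row is stored. The sketch below is one way to do that; `parse_relation`, `ALLOWED_LABELS`, and `ALLOWED_DIRECTIONS` are hypothetical names introduced here, and the fallback-to-"none" policy is an assumption rather than the notebook's behavior.

```python
import json
import re

ALLOWED_LABELS = {"duplicate", "related", "causal", "none"}
ALLOWED_DIRECTIONS = {"A_causes_B", "B_causes_A", "mutually_causal", "none"}

# Matches a JSON object wrapped in a Markdown code fence, with or without a "json" tag
FENCE_RE = re.compile(r"`{3}(?:json)?\s*(\{.*\})\s*`{3}", re.DOTALL)

def fallback(reason):
    return {"label": "none", "direction": "none", "explanation": reason}

def parse_relation(raw_text):
    """Parse a model reply into a dict with a validated label and direction."""
    match = FENCE_RE.search(raw_text)
    json_text = match.group(1) if match else raw_text
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError as exc:
        return fallback(f"unparseable reply: {exc}")
    if not isinstance(data, dict):
        return fallback("reply was not a JSON object")

    label = data.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        label = "none"

    direction = data.get("direction", "none")
    # A direction is only meaningful for causal pairs
    if label != "causal" or direction not in ALLOWED_DIRECTIONS:
        direction = "none"

    return {"label": label, "direction": direction,
            "explanation": str(data.get("explanation", ""))}

FENCE = "`" * 3
reply = FENCE + 'json\n{"label": "causal", "explanation": "x", "direction": "A_causes_B"}\n' + FENCE
print(parse_relation(reply))
print(parse_relation("not json at all"))
```

Normalizing here keeps unexpected labels out of `relationship_pairs.csv`, at the cost of silently mapping them to "none"; logging the rejected value would make such cases easier to audit.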

2025-11-24 01:07:58,552 - INFO - Use pytorch device_name: mps
2025-11-24 01:07:58,553 - INFO - Load pretrained SentenceTransformer: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
2025-11-24 01:07:58,554 - INFO - Use pytorch device_name: mps
2025-11-24 01:07:58,555 - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2


DEBUG: OPENAI_API_KEY being used: <redacted>
Loading: /Users/don/Documents/University/Current Classes/Capstone/let me try again/data/servicenow_incidents_full.json
Loaded 76 incidents
  incident_number                                  short_description
0      INC0010054     Equipment selection not saved for new location
1      INC0010053       Merchant unable to submit e-signed agreement
2      INC0010052   Equipment Configuration Freeze on Legacy Browser
3      INC0010051                   Error in Equipment Configuration
4      INC0010050  Touchscreen malfunction on Merchant's device f...

Loading embedding model from: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
Fine-tuned model not found at /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-m

Batches: 100%|██████████| 3/3 [00:00<00:00,  4.67it/s]
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)




Computing similarities...
Candidate pairs: 163
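
The candidate count depends on how mirrored pairs are removed. The cell keeps a pair `(i, j)` only when `j > i` and `j` is in ticket `i`'s top-k list, so a pair is dropped when only the higher-indexed ticket lists the lower-indexed one. The pure-Python sketch below contrasts that rule with a symmetric alternative on a made-up 4x4 similarity matrix; the function names and matrix values are illustrative assumptions.

```python
def top_k(sim, i, k):
    """Indices of the k tickets most similar to ticket i, excluding i itself."""
    others = [j for j in range(len(sim)) if j != i]
    return sorted(others, key=lambda j: sim[i][j], reverse=True)[:k]

def pairs_forward_only(sim, k):
    """Mirror of the cell's rule: keep (i, j) only when j > i."""
    return {(i, j) for i in range(len(sim)) for j in top_k(sim, i, k) if j > i}

def pairs_symmetric(sim, k):
    """Keep a pair if either ticket lists the other among its top k."""
    return {(min(i, j), max(i, j)) for i in range(len(sim)) for j in top_k(sim, i, k)}

# Hypothetical symmetric similarity matrix for four tickets
sim = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]

forward = pairs_forward_only(sim, k=1)
symmetric = pairs_symmetric(sim, k=1)
print("forward-only:", sorted(forward))
print("symmetric:   ", sorted(symmetric))
```

The symmetric variant produces more candidates, which also means more LLM calls, so which rule is preferable depends on the labeling budget.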

Labeling 163 pairs using LLM…


2025-11-24 01:08:04,777 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:08,232 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:11,024 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:13,271 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:16,240 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 5/163 pairs labeled


2025-11-24 01:08:18,743 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:21,374 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:23,650 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:26,783 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:30,186 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 10/163 pairs labeled


2025-11-24 01:08:32,661 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:35,324 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:37,711 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:40,853 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:43,731 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 15/163 pairs labeled


2025-11-24 01:08:46,344 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:48,883 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:51,376 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:53,897 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:08:56,363 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 20/163 pairs labeled


2025-11-24 01:08:59,187 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:01,411 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:03,536 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:05,928 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:08,678 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 25/163 pairs labeled


2025-11-24 01:09:10,763 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:12,802 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:14,823 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:16,590 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:19,561 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 30/163 pairs labeled


2025-11-24 01:09:22,288 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:24,385 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:26,403 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:29,617 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:31,747 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 35/163 pairs labeled


2025-11-24 01:09:33,856 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:36,037 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:38,403 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:40,657 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:42,425 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 40/163 pairs labeled


2025-11-24 01:09:44,211 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:46,461 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:48,746 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:52,493 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:54,819 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 45/163 pairs labeled


2025-11-24 01:09:56,717 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:09:59,124 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:01,444 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:04,721 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:06,870 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 50/163 pairs labeled


2025-11-24 01:10:09,738 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:11,889 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:13,933 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:16,088 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:18,749 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 55/163 pairs labeled


2025-11-24 01:10:20,797 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:22,912 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:25,712 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:27,658 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:30,012 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  ‚Üí 60/163 pairs labeled


2025-11-24 01:10:32,469 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:34,728 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:36,747 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:39,125 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:41,073 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 65/163 pairs labeled

2025-11-24 01:10:43,121 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:46,294 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:48,780 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:51,008 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:10:54,704 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 70/163 pairs labeled

2025-11-24 01:10:57,969 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:00,118 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:02,798 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:05,547 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:07,492 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 75/163 pairs labeled

2025-11-24 01:11:09,857 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:12,306 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:15,478 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:17,937 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:20,906 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 80/163 pairs labeled

2025-11-24 01:11:23,261 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:25,719 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:28,389 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:30,445 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:34,322 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 85/163 pairs labeled

2025-11-24 01:11:36,573 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:39,750 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:43,026 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:45,247 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:47,944 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 90/163 pairs labeled

2025-11-24 01:11:50,500 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:53,162 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:55,519 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:11:58,422 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:00,642 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 95/163 pairs labeled

2025-11-24 01:12:03,504 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:05,349 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:07,601 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:10,065 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:12,709 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 100/163 pairs labeled

2025-11-24 01:12:15,089 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:17,024 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:19,173 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:21,120 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:23,926 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 105/163 pairs labeled

2025-11-24 01:12:25,975 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:28,799 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:30,947 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:32,971 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:35,557 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 110/163 pairs labeled

2025-11-24 01:12:38,836 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:40,779 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:42,758 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:44,877 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:47,274 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 115/163 pairs labeled

2025-11-24 01:12:49,570 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:52,144 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:54,295 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:56,549 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:12:58,705 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 120/163 pairs labeled

2025-11-24 01:13:01,157 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:13:03,616 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:13:05,560 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:13:07,813 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:32,828 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 125/163 pairs labeled

2025-11-24 01:17:34,879 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:37,538 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:39,587 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:41,638 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:43,479 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 130/163 pairs labeled

2025-11-24 01:17:45,631 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:47,615 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:49,716 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:51,796 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:54,846 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 135/163 pairs labeled

2025-11-24 01:17:57,457 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:17:59,361 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:01,809 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:04,373 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:06,852 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 140/163 pairs labeled

2025-11-24 01:18:09,286 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:11,612 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:15,085 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:18,192 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:20,855 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 145/163 pairs labeled

2025-11-24 01:18:23,210 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:25,661 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:29,026 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:31,299 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:33,963 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 150/163 pairs labeled

2025-11-24 01:18:36,012 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:37,957 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:40,034 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:44,613 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:46,763 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 155/163 pairs labeled

2025-11-24 01:18:50,653 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:52,703 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:54,956 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:57,109 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:18:59,153 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

  → 160/163 pairs labeled

2025-11-24 01:19:01,307 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:19:04,375 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-11-24 01:19:06,731 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



Saved relationship pairs:
CSV : data/relationship_pairs.csv
JSON: data/relationship_pairs.json

Sample:


Unnamed: 0,ticket_a_number,ticket_b_number,text_a,text_b,similarity,label,direction,explanation
0,INC0010054,INC0010048,Equipment selection not saved for new location...,Access Rights Restriction\n\nMerchant reported...,0.420325,related,none,Ticket A describes an issue with equipment sel...
1,INC0010054,INC0010046,Equipment selection not saved for new location...,Access Rights Restriction\n\nMerchant reported...,0.420325,related,none,Ticket A describes an issue with equipment sel...
2,INC0010054,INC0010053,Equipment selection not saved for new location...,Merchant unable to submit e-signed agreement\n...,0.501539,related,none,Ticket A describes an issue with equipment sel...
3,INC0010054,INC0010052,Equipment selection not saved for new location...,Equipment Configuration Freeze on Legacy Brows...,0.515306,related,none,Ticket A describes an issue with onboarding a ...
4,INC0010054,INC0010051,Equipment selection not saved for new location...,Error in Equipment Configuration\n\nSales Agen...,0.780646,related,none,Both tickets involve issues encountered by a S...
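
Every row in the sample above carries the label `related`, so it is worth checking how the labels are distributed before training a classifier on them. The following is a minimal sketch, assuming the JSON file is a list of records with a `label` field as in the sample; the helper name `label_distribution` and the toy records are hypothetical and only stand in for the real file.

```python
from collections import Counter

def label_distribution(records):
    """Count how often each relationship label occurs in a list of pair records."""
    return Counter(rec.get("label", "<missing>") for rec in records)

# Hypothetical toy records standing in for the contents of relationship_pairs.json
toy_records = [
    {"text_a": "a", "text_b": "b", "label": "related"},
    {"text_a": "a", "text_b": "c", "label": "related"},
    {"text_a": "a", "text_b": "d", "label": "none"},
    {"text_a": "a", "text_b": "e", "label": "duplicate"},
]

counts = label_distribution(toy_records)
total = sum(counts.values())

# Print each label with its count and share, most frequent first
for label, n in counts.most_common():
    print(f"{label:<10} {n:>3}  ({n / total:.0%})")
```

With the real data, pass the result of `json.load` in place of `toy_records`. Labels that occur only a handful of times will be hard to evaluate on a small validation split.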


In [14]:
# ============================================
# Relationship Classification from relationship_pairs.json
# ============================================

import os
import json
import numpy as np
import pandas as pd

from sentence_transformers import SentenceTransformer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.linear_model import LogisticRegression
import joblib

# -------------------------------
# 1. Load fine-tuned embedding model
# -------------------------------

# Guard: ensure `output_path` is defined (not all notebook runs execute previous cells)
output_path = globals().get('output_path', None)
if output_path is None:
    # Try to build from CONFIG if available, otherwise fall back to known default
    default_dir = 'models/all-mpnet-finetuned'
    if 'CONFIG' in globals() and isinstance(CONFIG, dict):
        output_path = os.path.join(os.getcwd(), CONFIG.get('output_dir', default_dir))
    else:
        output_path = os.path.join(os.getcwd(), default_dir)
    print(f"Warning: `output_path` was not defined. Using fallback: {output_path}")
else:
    print(f"Loading fine-tuned SentenceTransformer model from: {output_path}")

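# Resolve the base model name without assuming CONFIG exists (same guard as above)
base_model_name = 'sentence-transformers/all-mpnet-base-v2'
if 'CONFIG' in globals() and isinstance(CONFIG, dict):
    base_model_name = CONFIG.get('base_model', base_model_name)

try:
    relationship_embedder = SentenceTransformer(output_path)
except FileNotFoundError:
    # NOTE: in this case the classifier below is trained on base-model embeddings,
    # even though it is later saved under the fine-tuned model's directory.
    print(f"Fine-tuned model not found at {output_path}. Loading base model: {base_model_name}")
    relationship_embedder = SentenceTransformer(base_model_name)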

# -------------------------------
# 2. Load relationship_pairs.json
# -------------------------------

json_path = "data/relationship_pairs.json"   # TODO: update path if needed

print("Loading relationship pairs from:", json_path)

with open(json_path, "r", encoding="utf-8") as f:
    data = json.load(f)

# Expected JSON structure:
# [
#   {
#       "text_a": "...",
#       "text_b": "...",
#       "label": "duplicate" | "related" | "causal" | "none"
#   },
#   ...
# ]

df_pairs = pd.DataFrame(data)

print("Loaded relationship dataset:")
display(df_pairs.head())

# -------------------------------
# 3. Clean labels
# -------------------------------

valid_labels = ["duplicate", "related", "causal", "none"]
df_pairs = df_pairs[df_pairs["label"].isin(valid_labels)].reset_index(drop=True)

texts_a = df_pairs["text_a"].astype(str).tolist()
texts_b = df_pairs["text_b"].astype(str).tolist()
y_labels = df_pairs["label"].tolist()

print(f"Valid dataset size: {len(df_pairs)}")

# -------------------------------
# 4. Encode ticket texts using SentenceTransformer
# -------------------------------

print("Encoding text_a...")
emb_a = relationship_embedder.encode(
    texts_a,
    batch_size=32,
    convert_to_numpy=True,
    normalize_embeddings=True,
    show_progress_bar=True
)

print("Encoding text_b...")
emb_b = relationship_embedder.encode(
    texts_b,
    batch_size=32,
    convert_to_numpy=True,
    normalize_embeddings=True,
    show_progress_bar=True
)

# -------------------------------
# 5. Build pairwise features
# -------------------------------

def build_pair_features(emb_a, emb_b):
    diff = np.abs(emb_a - emb_b)
    prod = emb_a * emb_b
    return np.hstack([emb_a, emb_b, diff, prod])

X = build_pair_features(emb_a, emb_b)

label2id = {lbl: i for i, lbl in enumerate(valid_labels)}
id2label = {i: lbl for lbl, i in label2id.items()}

y = np.array([label2id[lbl] for lbl in y_labels])

print("Feature matrix:", X.shape)

# -------------------------------
# 6. Train/validation split
# -------------------------------

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("Training samples:", len(y_train))
print("Validation samples:", len(y_val))

# -------------------------------
# 7. Train classifier (Logistic Regression)
# -------------------------------

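# With the lbfgs solver, recent scikit-learn versions fit a multinomial model for
# multi-class targets by default, so the deprecated `multi_class` argument is omitted.
# `n_jobs` is dropped as well: it only parallelizes one-vs-rest fits.
clf = LogisticRegression(
    max_iter=200,
    solver="lbfgs"
)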

print("Training classifier...")
clf.fit(X_train, y_train)

# -------------------------------
# 8. Evaluate
# -------------------------------

y_pred = clf.predict(X_val)

print("\n=== Relationship Classifier Report ===")
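all_label_ids = list(range(len(valid_labels)))
print(classification_report(
    y_val,
    y_pred,
    labels=all_label_ids,        # report all four classes even if one is absent from the split
    target_names=valid_labels,
    zero_division=0              # classes that are never predicted score 0 instead of raising a warning
))

print("\nConfusion Matrix:")
print(confusion_matrix(y_val, y_pred, labels=all_label_ids))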

# -------------------------------
# 9. Save model + label mapping
# -------------------------------

relationship_model_dir = os.path.join(output_path, "relationship_classifier")
os.makedirs(relationship_model_dir, exist_ok=True)

clf_path = os.path.join(relationship_model_dir, "relationship_classifier.joblib")
label_path = os.path.join(relationship_model_dir, "label_mapping.json")

joblib.dump(clf, clf_path)

with open(label_path, "w") as f:
    json.dump({"label2id": label2id, "id2label": id2label}, f, indent=4)

print("Saved classifier to:", clf_path)
print("Saved label mapping to:", label_path)

# -------------------------------
# 10. Inference helper
# -------------------------------

def predict_relationship(text_a, text_b):
    """
    Predict relationship between two ticket texts.
    Returns (label, probability_dict)
    """
    embA = relationship_embedder.encode(
        [text_a],
        convert_to_numpy=True,
        normalize_embeddings=True,
        show_progress_bar=False
    )
    embB = relationship_embedder.encode(
        [text_b],
        convert_to_numpy=True,
        normalize_embeddings=True,
        show_progress_bar=False
    )

    feats = build_pair_features(embA, embB)
    probs = clf.predict_proba(feats)[0]
    pred_id = int(np.argmax(probs))
    pred_label = id2label[pred_id]

    return pred_label, {id2label[i]: float(p) for i, p in enumerate(probs)}

# -------------------------------
# 11. Quick test
# -------------------------------

example_a = "Unable to log in after SAP server restart."
example_b = "SAP authentication error following system reboot."

pred, proba = predict_relationship(example_a, example_b)

print("\nExample Prediction:")
print("Prediction:", pred)
print("Probabilities:", proba)

2025-11-24 01:19:07,333 - INFO - Use pytorch device_name: mps
2025-11-24 01:19:07,333 - INFO - Load pretrained SentenceTransformer: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
2025-11-24 01:19:07,334 - INFO - Use pytorch device_name: mps
2025-11-24 01:19:07,334 - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2


Loading fine-tuned SentenceTransformer model from: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
Fine-tuned model not found at /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned. Loading base model: sentence-transformers/all-mpnet-base-v2
Loading relationship pairs from: data/relationship_pairs.json
Loaded relationship dataset:


Unnamed: 0,ticket_a_number,ticket_b_number,text_a,text_b,similarity,label,direction,explanation
0,INC0010054,INC0010048,Equipment selection not saved for new location...,Access Rights Restriction\n\nMerchant reported...,0.420325,related,none,Ticket A describes an issue with equipment sel...
1,INC0010054,INC0010046,Equipment selection not saved for new location...,Access Rights Restriction\n\nMerchant reported...,0.420325,related,none,Ticket A describes an issue with equipment sel...
2,INC0010054,INC0010053,Equipment selection not saved for new location...,Merchant unable to submit e-signed agreement\n...,0.501539,related,none,Ticket A describes an issue with equipment sel...
3,INC0010054,INC0010052,Equipment selection not saved for new location...,Equipment Configuration Freeze on Legacy Brows...,0.515306,related,none,Ticket A describes an issue with onboarding a ...
4,INC0010054,INC0010051,Equipment selection not saved for new location...,Error in Equipment Configuration\n\nSales Agen...,0.780646,related,none,Both tickets involve issues encountered by a S...


Valid dataset size: 163
Encoding text_a...


Batches: 100%|██████████| 6/6 [00:00<00:00,  6.00it/s]


Encoding text_b...


Batches: 100%|██████████| 6/6 [00:00<00:00,  6.82it/s]

Feature matrix: (163, 3072)
Training samples: 130
Validation samples: 33
Training classifier...



huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


=== Relationship Classifier Report ===
              precision    recall  f1-score   support

   duplicate       0.00      0.00      0.00         1
     related       0.75      0.69      0.72        13
      causal       0.00      0.00      0.00         1
        none       0.76      0.89      0.82        18

    accuracy                           0.76        33
   macro avg       0.38      0.40      0.39        33
weighted avg       0.71      0.76      0.73        33


Confusion Matrix:
[[ 0  0  0  1]
 [ 0  9  0  4]
 [ 0  1  0  0]
 [ 0  2  0 16]]
Saved classifier to: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned/relationship_classifier/relationship_classifier.joblib
Saved label mapping to: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned/relationship_classifier/label_mapping.json


  _warn_prf(average, modifier, msg_start, len(result))



Example Prediction:
Prediction: related
Probabilities: {'duplicate': 0.027555055780948468, 'related': 0.7095281186062677, 'causal': 0.0202735859234817, 'none': 0.24264323968930207}


## 4. Create Training Examples

In [15]:
# Convert to InputExample objects
train_examples = []

# Add positive pairs (label=1.0 for similar)
for pair in positive_pairs:
    train_examples.append(InputExample(
        texts=[pair['text1'], pair['text2']],
        label=1.0
    ))

# Add negative pairs (label=0.0 for dissimilar)
for pair in negative_pairs:
    train_examples.append(InputExample(
        texts=[pair['text1'], pair['text2']],
        label=0.0
    ))

print(f"Created {len(train_examples)} training examples")

# Split into train/eval
import random
random.shuffle(train_examples)
split_idx = int(len(train_examples) * (1 - CONFIG['eval_split']))
eval_examples = train_examples[split_idx:]
train_examples = train_examples[:split_idx]

print(f"\n📊 Data Split:")
print(f"  Training: {len(train_examples)} examples")
print(f"  Evaluation: {len(eval_examples)} examples")

Created 409 training examples

📊 Data Split:
  Training: 368 examples
  Evaluation: 41 examples
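
The split above relies on an unseeded `random.shuffle`, so the 368/41 partition changes between runs. Below is a minimal sketch of a reproducible alternative. It assumes a hypothetical helper `split_examples` and plain `(text1, text2, label)` tuples in place of `InputExample` objects, and it splits each label group separately so both halves keep a similar positive/negative ratio:

```python
import random
from collections import defaultdict

def split_examples(examples, eval_fraction=0.1, seed=42):
    """Return (train, eval) lists, split per label with a fixed seed.

    `examples` is any sequence of (text1, text2, label) tuples. The helper
    name and the tuple layout are illustrative, not part of the notebook.
    """
    rng = random.Random(seed)  # local RNG, leaves the global random state alone
    by_label = defaultdict(list)
    for ex in examples:
        by_label[ex[2]].append(ex)

    train, held_out = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        cut = int(len(group) * (1 - eval_fraction))
        train.extend(group[:cut])
        held_out.extend(group[cut:])

    rng.shuffle(train)  # mix labels again so batches are not label-sorted
    return train, held_out

# Toy data: 20 positive and 20 negative pairs.
toy = [(f"a{i}", f"b{i}", 1.0) for i in range(20)] + \
      [(f"c{i}", f"d{i}", 0.0) for i in range(20)]
train, held_out = split_examples(toy, eval_fraction=0.1, seed=42)
print(len(train), len(held_out))
```

Calling the helper twice with the same seed returns the same partition, which makes evaluation scores comparable across runs.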


## 5. Load Base Model

In [16]:
print(f"Loading base model: {CONFIG['base_model']}")
print("This may take a minute...\n")

model = SentenceTransformer(CONFIG['base_model'])

print("✅ Model loaded successfully")
print(f"\nModel details:")
print(f"  Max sequence length: {model.max_seq_length}")
print(f"  Embedding dimension: {model.get_sentence_embedding_dimension()}")

2025-11-24 01:19:15,010 - INFO - Use pytorch device_name: mps
2025-11-24 01:19:15,010 - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2


Loading base model: sentence-transformers/all-mpnet-base-v2
This may take a minute...

✅ Model loaded successfully

Model details:
  Max sequence length: 384
  Embedding dimension: 768


## 6. Setup Training Components

In [17]:
# Create DataLoader
train_dataloader = DataLoader(
    train_examples,
    shuffle=True,
    batch_size=CONFIG['batch_size']
)

# Define loss function (Cosine Similarity Loss for contrastive learning)
train_loss = losses.CosineSimilarityLoss(model)

# Create evaluator
eval_sentences1 = [ex.texts[0] for ex in eval_examples]
eval_sentences2 = [ex.texts[1] for ex in eval_examples]
eval_scores = [ex.label for ex in eval_examples]

evaluator = EmbeddingSimilarityEvaluator(
    eval_sentences1,
    eval_sentences2,
    eval_scores,
    name='itsm-eval'
)

# Output directory
output_path = os.path.join(os.getcwd(), CONFIG['output_dir'])
os.makedirs(output_path, exist_ok=True)

print("✅ Training components ready")
print(f"\nTotal training batches: {len(train_dataloader)}")
print(f"Evaluation samples: {len(eval_examples)}")
print(f"Output path: {output_path}")

✅ Training components ready

Total training batches: 12
Evaluation samples: 41
Output path: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
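
`CosineSimilarityLoss` compares the cosine similarity of the two sentence embeddings with the pair's label and, with its default settings, penalizes the squared difference. The following dependency-free sketch reproduces that arithmetic for a single pair. The function names are illustrative, and the 3-dimensional vectors stand in for real 768-dimensional embeddings:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_pair_loss(u, v, label):
    """Squared error between cos(u, v) and the target label for one pair."""
    return (cosine_similarity(u, v) - label) ** 2

# Parallel vectors labelled similar: loss is (numerically) zero.
same_direction = cosine_pair_loss([1.0, 2.0, 0.0], [2.0, 4.0, 0.0], label=1.0)
# Orthogonal vectors labelled dissimilar: loss is zero.
orthogonal_neg = cosine_pair_loss([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], label=0.0)
# Orthogonal vectors labelled similar: loss is one.
orthogonal_pos = cosine_pair_loss([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], label=1.0)
print(same_direction, orthogonal_neg, orthogonal_pos)
```

Of these three cases, only the positive pair with orthogonal embeddings is penalized, which is the pressure that pulls same-category tickets together during training.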


## 7. Train the Model

⚠️ **Note**: Training on CPU will take 5-15 minutes per epoch. GPU is recommended for faster training.

In [18]:
import os
os.environ["WANDB_DISABLED"] = "true"
print("Wandb integration disabled.")

Wandb integration disabled.
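
The `huggingface/tokenizers` fork warnings in the classifier cell outputs name their own remedy: set `TOKENIZERS_PARALLELISM` explicitly. A small sketch, assuming it runs before the first tokenizer call in the kernel (ideally next to the imports); `setdefault` keeps any value already exported in the shell:

```python
import os

# Set before the tokenizer is first used; an existing value is left untouched.
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")
print("TOKENIZERS_PARALLELISM =", os.environ["TOKENIZERS_PARALLELISM"])
```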


In [19]:
print("🚀 Starting training...")
print("=" * 60)
print(f"Epochs: {CONFIG['epochs']}")
print(f"Batch size: {CONFIG['batch_size']}")
print(f"Learning rate: {CONFIG['learning_rate']}")
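# Note: this check only covers CUDA. On Apple Silicon the model may run on MPS
# (see the "Use pytorch device_name" log line) while this line still prints CPU.
print(f"Device: {'CUDA' if torch.cuda.is_available() else 'CPU'}")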
print("=" * 60)
print()

# Train
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=CONFIG['epochs'],
    evaluator=evaluator,
    evaluation_steps=len(train_dataloader) // 2,  # Evaluate twice per epoch
    warmup_steps=CONFIG['warmup_steps'],
    output_path=output_path,
    optimizer_params={'lr': CONFIG['learning_rate']},
    save_best_model=True,
    show_progress_bar=True
)

print("\n" + "=" * 60)
print("✅ Training complete!")
print("=" * 60)

Using the `WANDB_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).


🚀 Starting training...
Epochs: 100
Batch size: 32
Learning rate: 2e-05
Device: CPU





Step,Training Loss,Validation Loss,Itsm-eval Pearson Cosine,Itsm-eval Spearman Cosine
6,No log,No log,0.269369,0.311712
12,No log,No log,0.275847,0.334974
18,No log,No log,0.279694,0.325669
24,No log,No log,0.278821,0.30706
30,No log,No log,0.275388,0.297755
36,No log,No log,0.270684,0.358236
42,No log,No log,0.257359,0.381499
48,No log,No log,0.321464,0.432675
54,No log,No log,0.435469,0.455937
60,No log,No log,0.470844,0.465242


2025-11-24 01:19:26,686 - INFO - EmbeddingSimilarityEvaluator: Evaluating the model on the itsm-eval dataset in epoch 0.5 after 6 steps:
2025-11-24 01:19:27,221 - INFO - Cosine-Similarity:	Pearson: 0.2694	Spearman: 0.3117
2025-11-24 01:19:27,223 - INFO - Save model to /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
2025-11-24 01:19:36,120 - INFO - EmbeddingSimilarityEvaluator: Evaluating the model on the itsm-eval dataset in epoch 1.0 after 12 steps:
2025-11-24 01:19:36,895 - INFO - Cosine-Similarity:	Pearson: 0.2758	Spearman: 0.3350
2025-11-24 01:19:36,897 - INFO - Sav


✅ Training complete!
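
The step numbers in the table follow from the split and batch size reported earlier: 368 training examples at batch size 32 give 12 batches per epoch, and `evaluation_steps=len(train_dataloader) // 2` triggers an evaluation every 6 steps. A short sketch of that arithmetic, with the values copied from the outputs above:

```python
import math

num_train_examples = 368   # from the data split output
batch_size = 32            # from the training banner
epochs = 100               # from the training banner

# DataLoader keeps the final partial batch by default, hence the ceiling.
batches_per_epoch = math.ceil(num_train_examples / batch_size)
evaluation_steps = batches_per_epoch // 2
total_steps = batches_per_epoch * epochs
evaluations_per_epoch = batches_per_epoch // evaluation_steps

print(batches_per_epoch, evaluation_steps, total_steps, evaluations_per_epoch)
# 12 6 1200 2
```

The table above lists steps 6 through 60, which corresponds to the first 5 of those epochs.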


## 8. Save Training Metadata

In [20]:
# Save metadata
metadata = {
    "base_model": CONFIG['base_model'],
    "training_date": datetime.now().isoformat(),
    "epochs": CONFIG['epochs'],
    "batch_size": CONFIG['batch_size'],
    "learning_rate": CONFIG['learning_rate'],
    "num_train_examples": len(train_examples),
    "num_eval_examples": len(eval_examples),
    "num_positive_pairs": len(positive_pairs),
    "num_negative_pairs": len(negative_pairs),
    # Record the device the model actually sits on (cuda, mps, or cpu);
    # a CUDA-only check would report "cpu" on Apple Silicon running MPS.
    "device": str(model.device)
}

metadata_path = os.path.join(output_path, 'training_metadata.json')
with open(metadata_path, 'w') as f:
    json.dump(metadata, f, indent=2)

print(f"💾 Model saved to: {output_path}")
print(f"📝 Metadata saved to: {metadata_path}")

💾 Model saved to: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
📝 Metadata saved to: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned/training_metadata.json


In [21]:
# ============================================
# 8. Relationship Classification (Duplicate / Related / Causal / None)
# ============================================

import os
import numpy as np
import pandas as pd

from sentence_transformers import SentenceTransformer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.linear_model import LogisticRegression
import joblib

# --------------------------------------------
# 8.1 Load fine-tuned embedding model
# --------------------------------------------

# If not already loaded earlier in the notebook:
# Guard against missing `output_path` (cells may be executed out of order)
output_path = globals().get('output_path', None)
if output_path is None:
    default_dir = 'models/all-mpnet-finetuned'
    if 'CONFIG' in globals() and isinstance(CONFIG, dict):
        output_path = os.path.join(os.getcwd(), CONFIG.get('output_dir', default_dir))
    else:
        output_path = os.path.join(os.getcwd(), default_dir)
    print(f"Warning: `output_path` was not defined. Using fallback: {output_path}")
else:
    print(f"Loading fine-tuned SentenceTransformer model from: {output_path}")
relationship_embedder = SentenceTransformer(output_path)

# --------------------------------------------
# 8.2 Load labelled ticket-pair dataset
# --------------------------------------------

# EXPECTED COLUMNS in the CSV:
#   text_a : string - ticket A text (e.g., short_description + description)
#   text_b : string - ticket B text
#   label  : string - one of {"duplicate", "related", "causal", "none"}
pairs_csv_path = "data/relationship_pairs.csv"  # TODO: adjust path

print("Loading relationship training data from:", pairs_csv_path)
df_pairs = pd.read_csv(pairs_csv_path)

# Basic sanity check
print("Sample of relationship dataset:")
display(df_pairs.head())

# Filter to supported labels (in case there is noise)
valid_labels = ["duplicate", "related", "causal", "none"]
df_pairs = df_pairs[df_pairs["label"].isin(valid_labels)].reset_index(drop=True)

# --------------------------------------------
# 8.3 Encode ticket texts into embeddings
# --------------------------------------------

texts_a = df_pairs["text_a"].astype(str).tolist()
texts_b = df_pairs["text_b"].astype(str).tolist()
y_labels = df_pairs["label"].tolist()

print("Encoding ticket pairs with fine-tuned model...")
emb_a = relationship_embedder.encode(
    texts_a,
    batch_size=32,
    show_progress_bar=True,
    convert_to_numpy=True,
    normalize_embeddings=True,
)

emb_b = relationship_embedder.encode(
    texts_b,
    batch_size=32,
    show_progress_bar=True,
    convert_to_numpy=True,
    normalize_embeddings=True,
)

# --------------------------------------------
# 8.4 Build pairwise feature vectors
# --------------------------------------------
# Common trick: combine embeddings using multiple operations:
#   - [emb_a, emb_b, |emb_a - emb_b|, emb_a * emb_b]
# You can tune this later if needed.

def build_pair_features(emb_a: np.ndarray, emb_b: np.ndarray) -> np.ndarray:
    diff = np.abs(emb_a - emb_b)
    prod = emb_a * emb_b
    return np.hstack([emb_a, emb_b, diff, prod])

X = build_pair_features(emb_a, emb_b)

# Map string labels to integers
label2id = {label: idx for idx, label in enumerate(valid_labels)}
id2label = {idx: label for label, idx in label2id.items()}
y = np.array([label2id[label] for label in y_labels])

print("Feature matrix shape:", X.shape)
print("Number of samples:", len(y))

# --------------------------------------------
# 8.5 Train / validation split
# --------------------------------------------

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("Train size:", X_train.shape[0])
print("Validation size:", X_val.shape[0])

# --------------------------------------------
# 8.6 Train a simple classifier (Logistic Regression)
# --------------------------------------------

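# You can swap this for RandomForest, XGBoost, or MLPClassifier later if desired.
# With the lbfgs solver, LogisticRegression already fits a multinomial model for
# multiclass targets, so the deprecated `multi_class` argument is omitted.
clf = LogisticRegression(
    max_iter=200,
    solver="lbfgs",
    n_jobs=-1,
)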

print("Training relationship classifier...")
clf.fit(X_train, y_train)

# --------------------------------------------
# 8.7 Evaluation
# --------------------------------------------

y_pred = clf.predict(X_val)

print("\nClassification report (validation set):")
print(classification_report(y_val, y_pred, target_names=valid_labels))

print("Confusion matrix:")
print(confusion_matrix(y_val, y_pred))

# --------------------------------------------
# 8.8 Save classifier + label mapping
# --------------------------------------------

relationship_model_dir = os.path.join(output_path, "relationship_classifier")
os.makedirs(relationship_model_dir, exist_ok=True)

clf_path = os.path.join(relationship_model_dir, "relationship_classifier.joblib")
labels_path = os.path.join(relationship_model_dir, "label_mapping.json")

joblib.dump(clf, clf_path)

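import json
# JSON object keys are always strings, so the integer keys of id2label are
# written as "0", "1", ...; cast them back with int() when loading the mapping.
with open(labels_path, "w") as f:
    json.dump({"label2id": label2id, "id2label": id2label}, f)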

print("Saved relationship classifier to:", clf_path)
print("Saved label mapping to:", labels_path)

# --------------------------------------------
# 8.9 Inference helper: predict relationship for a single pair
# --------------------------------------------

def predict_relationship(ticket_a_text: str, ticket_b_text: str):
    """
    Predict relationship type between two ticket texts.
    Returns (label, probs_dict).
    """
    # Encode
    emb_a = relationship_embedder.encode(
        [ticket_a_text],
        convert_to_numpy=True,
        normalize_embeddings=True,
        show_progress_bar=False,
    )
    emb_b = relationship_embedder.encode(
        [ticket_b_text],
        convert_to_numpy=True,
        normalize_embeddings=True,
        show_progress_bar=False,
    )
    # Build features
    feats = build_pair_features(emb_a, emb_b)
    # Predict proba
    probs = clf.predict_proba(feats)[0]
    pred_id = int(np.argmax(probs))
    pred_label = id2label[pred_id]
    probs_dict = {id2label[i]: float(p) for i, p in enumerate(probs)}
    return pred_label, probs_dict

# Quick smoke test (replace with real ticket texts)
example_a = "User cannot log into SAP after the weekend maintenance."
example_b = "SAP login fails with authentication error since Sunday night."

pred_label, probs = predict_relationship(example_a, example_b)
print("\nExample prediction:")
print("Ticket A:", example_a)
print("Ticket B:", example_b)
print("Predicted relationship:", pred_label)
print("Class probabilities:", probs)

2025-11-24 02:59:39,887 - INFO - Use pytorch device_name: mps
2025-11-24 02:59:39,887 - INFO - Load pretrained SentenceTransformer: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned


Loading fine-tuned SentenceTransformer model from: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
Loading relationship training data from: data/relationship_pairs.csv
Sample of relationship dataset:


Unnamed: 0,ticket_a_number,ticket_b_number,text_a,text_b,similarity,label,direction,explanation
0,INC0010054,INC0010048,Equipment selection not saved for new location...,Access Rights Restriction\n\nMerchant reported...,0.420325,related,none,Ticket A describes an issue with equipment sel...
1,INC0010054,INC0010046,Equipment selection not saved for new location...,Access Rights Restriction\n\nMerchant reported...,0.420325,related,none,Ticket A describes an issue with equipment sel...
2,INC0010054,INC0010053,Equipment selection not saved for new location...,Merchant unable to submit e-signed agreement\n...,0.501539,related,none,Ticket A describes an issue with equipment sel...
3,INC0010054,INC0010052,Equipment selection not saved for new location...,Equipment Configuration Freeze on Legacy Brows...,0.515306,related,none,Ticket A describes an issue with onboarding a ...
4,INC0010054,INC0010051,Equipment selection not saved for new location...,Error in Equipment Configuration\n\nSales Agen...,0.780646,related,none,Both tickets involve issues encountered by a S...


Encoding ticket pairs with fine-tuned model...


Batches: 100%|██████████| 6/6 [00:01<00:00,  4.62it/s]
Batches: 100%|██████████| 6/6 [00:01<00:00,  4.70it/s]


Feature matrix shape: (163, 3072)
Number of samples: 163
Train size: 130
Validation size: 33
Training relationship classifier...


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Classification report (validation set):
              precision    recall  f1-score   support

   duplicate       0.00      0.00      0.00         1
     related       0.50      0.46      0.48        13
      causal       0.00      0.00      0.00         1
        none       0.57      0.67      0.62        18

    accuracy                           0.55        33
   macro avg       0.27      0.28      0.27        33
weighted avg       0.51      0.55      0.52        33

Confusion matrix:
[[ 0  0  0  1]
 [ 0  6  0  7]
 [ 0  0  0  1]
 [ 0  6  0 12]]
Saved relationship classifier to: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned/relationship_classifier/relationship_classifier.joblib
Saved label mapping to: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned/relationship_classifier/label_mapping.json


  _warn_prf(average, modifier, msg_start, len(result))



Example prediction:
Ticket A: User cannot log into SAP after the weekend maintenance.
Ticket B: SAP login fails with authentication error since Sunday night.
Predicted relationship: related
Class probabilities: {'duplicate': 0.020470539960257002, 'related': 0.5068420711849255, 'causal': 0.02002132224494434, 'none': 0.4526660666098731}
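
The per-class numbers in the classification report can be recomputed from the confusion matrix (rows are true labels, columns are predictions, in the order of `valid_labels`). A dependency-free sketch using the matrix printed above; the helper name is illustrative:

```python
labels = ["duplicate", "related", "causal", "none"]
# Validation confusion matrix copied from the cell output above.
confusion = [
    [0, 0, 0, 1],
    [0, 6, 0, 7],
    [0, 0, 0, 1],
    [0, 6, 0, 12],
]

def per_class_metrics(matrix):
    """Return {class_index: (precision, recall)}; undefined ratios become 0.0."""
    n = len(matrix)
    metrics = {}
    for k in range(n):
        true_pos = matrix[k][k]
        predicted = sum(matrix[row][k] for row in range(n))  # column sum
        actual = sum(matrix[k])                               # row sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        metrics[k] = (precision, recall)
    return metrics

metrics = per_class_metrics(confusion)
correct = sum(confusion[k][k] for k in range(len(confusion)))
accuracy = correct / sum(map(sum, confusion))

for k, (precision, recall) in metrics.items():
    print(f"{labels[k]:>10}  precision={precision:.2f}  recall={recall:.2f}")
print(f"accuracy={accuracy:.2f}")
```

`duplicate` and `causal` are never predicted, so their precision is undefined; scikit-learn reports 0.00 for them and emits the `_warn_prf` warning seen in the output. With one validation example each, those two rows say little about the classifier.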


## 9. Quick Evaluation

In [22]:
# Load the fine-tuned model
finetuned_model = SentenceTransformer(output_path)

# Test with example tickets
if positive_pairs:
    test_pair = positive_pairs[0]

    # Generate embeddings
    emb1 = finetuned_model.encode(test_pair['text1'])
    emb2 = finetuned_model.encode(test_pair['text2'])

    # Calculate similarity
    from sklearn.metrics.pairwise import cosine_similarity
    similarity = cosine_similarity([emb1], [emb2])[0][0]

    print("\n📊 Quick Test:")
    print(f"Category: {test_pair['category1']}")
    print(f"Ticket 1: {test_pair['ticket1_id']}")
    print(f"Ticket 2: {test_pair['ticket2_id']}")
    print(f"\nSimilarity Score: {similarity:.4f}")
    print(f"Expected: High (same category)")

    if similarity > 0.7:
        print("✅ Good! Model correctly identifies similar tickets")
    elif similarity > 0.5:
        print("⚠️  Moderate similarity - model needs more training")
    else:
        print("❌ Low similarity - model may need different approach")

2025-11-24 02:59:45,647 - INFO - Use pytorch device_name: mps
2025-11-24 02:59:45,648 - INFO - Load pretrained SentenceTransformer: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
Batches: 100%|██████████| 1/1 [00:00<00:00, 13.65it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 17.85it/s]


📊 Quick Test:
Category: Inquiry / Help
Ticket 1: INC0010054
Ticket 2: INC0010053

Similarity Score: 0.9537
Expected: High (same category)
✅ Good! Model correctly identifies similar tickets
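
A single pair that may have been part of the training data is a weak check. A slightly more informative sketch, assuming you have already computed cosine similarities for held-out positive and negative pairs, is to compare the two groups. The helper name is illustrative, and the score lists below are made-up placeholders, not results from this notebook:

```python
from statistics import mean

def similarity_gap(positive_scores, negative_scores):
    """Mean similarity of positive pairs minus mean similarity of negative pairs."""
    return mean(positive_scores) - mean(negative_scores)

# Placeholder scores for illustration only; replace with real cosine similarities.
positive_scores = [0.91, 0.84, 0.77]
negative_scores = [0.35, 0.52, 0.18]

gap = similarity_gap(positive_scores, negative_scores)
print(f"Mean positive: {mean(positive_scores):.2f}")
print(f"Mean negative: {mean(negative_scores):.2f}")
print(f"Gap: {gap:.2f}")
```

A gap near zero would suggest the model scores same-category and cross-category pairs alike, even if individual positive pairs score high.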





## 10. Next Steps

Now that you have fine-tuned the all-mpnet-base-v2 model, you can:

1. **Use the model locally**:
   ```python
   from sentence_transformers import SentenceTransformer
   model = SentenceTransformer('scripts/finetuning/models/all-mpnet-finetuned')
   embeddings = model.encode(["ticket text here"])
   ```

2. **Update your embedding service** (`app/services/embedding_service.py`) to use this fine-tuned model instead of LM Studio

3. **Run full evaluation** to compare fine-tuned model with LM Studio models:
   ```bash
   python scripts/performance_eval/compare_models.py
   ```

4. **Regenerate embeddings** for all tickets using the fine-tuned model:
   ```bash
   python scripts/populate_embeddings.py
   ```

## 11. Load and Test Fine-tuned Model

In [23]:
# You can reload the model anytime with:
print("Loading fine-tuned model...")
finetuned = SentenceTransformer(output_path)
print(f"✅ Fine-tuned model loaded from: {output_path}")
print(f"Embedding dimension: {finetuned.get_sentence_embedding_dimension()}")

2025-11-24 02:59:46,236 - INFO - Use pytorch device_name: mps
2025-11-24 02:59:46,236 - INFO - Load pretrained SentenceTransformer: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned


Loading fine-tuned model...
✅ Fine-tuned model loaded from: /Users/don/Documents/University/Current Classes/Capstone/let me try again/models/all-mpnet-finetuned
Embedding dimension: 768
