# LLM Reasoning Re-ranker (Gemini API, Schema-Safe)

This notebook demonstrates **constraint-aware re-ranking** using the **Gemini API**
with **native JSON schema enforcement**.

Key properties:
- Uses Gemini JSON mode + schema
- Near-deterministic output (`temperature=0`); repeated runs can still differ slightly
- No regex, no repair loops
- Designed to run **1–5 times on the free tier** (one API call per run)


## 1. Install dependencies

In [3]:
!pip -q install -U google-genai

## 2. Load API key from Colab Secrets

In [4]:
import os
from google.colab import userdata

# Validate the secret before exporting it: os.environ only accepts strings.
api_key = userdata.get("Gemini_API_key")
assert api_key, "Missing Gemini_API_key in Colab Secrets"
os.environ["Gemini_API_key"] = api_key
print("Gemini API key loaded")

Gemini API key loaded


## 3. Imports and schema definition

In [5]:
import json
from typing_extensions import TypedDict
from google import genai
from google.genai import types

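# One excluded candidate and the reason it was dropped.
class Excluded(TypedDict):
    id: int
    reason: str

# A short explanation for one ranked candidate.
class Rationale(TypedDict):
    id: int
    rationale: str

# Top-level response shape, passed to Gemini as `response_schema`.
class RerankResult(TypedDict):
    ranked_ids: list[int]        # candidate ids, best first
    excluded: list[Excluded]
    rationales: list[Rationale]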

## 4. User context and candidate set (mock retrieval output)

In [6]:
user_context = {
    "budget": 50,
    "diet": "vegetarian",
    "goal": "high protein",
    "avoid": ["spicy"],
    "preference_notes": "likes quick meals",
}

candidates = [
    {"id": 1, "name": "Veggie Burrito", "price": 12, "protein_g": 18, "spicy": True,  "prep_min": 10},
    {"id": 2, "name": "Tofu Salad",     "price": 14, "protein_g": 22, "spicy": False, "prep_min": 8},
    {"id": 3, "name": "Protein Shake",  "price": 8,  "protein_g": 30, "spicy": False, "prep_min": 2},
    {"id": 4, "name": "Cheese Pizza",   "price": 20, "protein_g": 15, "spicy": False, "prep_min": 15},
]

## 5. Prompt

In [7]:
prompt = f"""
Re-rank recommendation candidates under constraints.

User context:
{json.dumps(user_context)}

Candidates:
{json.dumps(candidates)}

Hard constraints:
- Must respect diet.
- Must avoid items matching 'avoid' list.
- Must be within budget.

Soft preferences:
- Prefer higher protein.
- Prefer quicker prep time if comparable.

Return ONLY valid JSON matching the required schema.
"""
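
The budget and `avoid` rules in the prompt can also be checked in plain code, which gives a baseline to compare the model's exclusions against. The sketch below is illustrative only: `violated_constraints`, `demo_context`, and `demo_candidates` are hypothetical names, and it assumes each `avoid` entry matches a boolean field on the candidate. The diet rule is left out because the mock candidates carry no diet field.

```python
# Hypothetical helper; the rules mirror two of the prompt's hard constraints.
def violated_constraints(item, ctx):
    """Return the hard constraints that `item` breaks (empty list if none)."""
    problems = []
    if item["price"] > ctx["budget"]:
        problems.append("over budget")
    # Assumption: each 'avoid' entry names a boolean field on the item.
    for flag in ctx["avoid"]:
        if item.get(flag, False):
            problems.append(f"matches avoid: {flag}")
    return problems

# Demo data shaped like the notebook's mock inputs.
demo_context = {"budget": 50, "avoid": ["spicy"]}
demo_candidates = [
    {"id": 1, "price": 12, "spicy": True},
    {"id": 2, "price": 14, "spicy": False},
    {"id": 3, "price": 8, "spicy": False},
]

# Map each violating candidate id to the rules it breaks.
flagged = {}
for c in demo_candidates:
    problems = violated_constraints(c, demo_context)
    if problems:
        flagged[c["id"]] = problems

print(flagged)  # {1: ['matches avoid: spicy']}
```

If the ids in `flagged` and the ids in the model's `excluded` list disagree, that is a signal to inspect the prompt or the candidate data.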

## 6. Call Gemini with JSON mode + schema

In [9]:
client = genai.Client(api_key=os.environ["Gemini_API_key"])

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=prompt,
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=RerankResult,
        temperature=0.0,
    ),
)

result = json.loads(response.text)
result

{'ranked_ids': [3, 2, 4],
 'excluded': [{'id': 1,
   'reason': "The item is marked as spicy, which is on the user's avoid list."}],
 'rationales': [{'id': 3,
   'rationale': "This item offers the highest protein content (30g) and the fastest preparation time (2 min), making it the best fit for the user's goals."},
  {'id': 2,
   'rationale': 'Provides a high protein content (22g) and a quick preparation time (8 min) while adhering to all dietary constraints.'},
  {'id': 4,
   'rationale': 'While it meets the dietary requirements, it has the lowest protein content and the longest preparation time among the valid candidates.'}]}

## 7. Apply ranking and print results

In [10]:
id_to_item = {c["id"]: c for c in candidates}
rationales = {r["id"]: r["rationale"] for r in result["rationales"]}

print("Excluded:")
for e in result["excluded"]:
    print(f"- {e['id']}: {e['reason']}")

print("\nRanked:")
for i, cid in enumerate(result["ranked_ids"], 1):
    item = id_to_item[cid]
    print(f"{i}. {item['name']} (id={cid}): {rationales.get(cid, '')}")

Excluded:
- 1: The item is marked as spicy, which is on the user's avoid list.

Ranked:
1. Protein Shake (id=3): This item offers the highest protein content (30g) and the fastest preparation time (2 min), making it the best fit for the user's goals.
2. Tofu Salad (id=2): Provides a high protein content (22g) and a quick preparation time (8 min) while adhering to all dietary constraints.
3. Cheese Pizza (id=4): While it meets the dietary requirements, it has the lowest protein content and the longest preparation time among the valid candidates.
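
Schema enforcement constrains the shape of the response, not its values: an id in `ranked_ids` is guaranteed to be an integer, but not guaranteed to belong to the candidate set. A minimal consistency check, assuming the `RerankResult` shape used above, might look like the following. The function name `validate_rerank` and the sample inputs are hypothetical.

```python
def validate_rerank(result, candidate_ids):
    """Return a list of problems in a re-rank result (empty if consistent)."""
    problems = []
    ranked = result["ranked_ids"]
    excluded = [e["id"] for e in result["excluded"]]
    known = set(candidate_ids)

    if len(set(ranked)) != len(ranked):
        problems.append("duplicate ids in ranked_ids")

    unknown = (set(ranked) | set(excluded)) - known
    if unknown:
        problems.append(f"unknown ids: {sorted(unknown)}")

    overlap = set(ranked) & set(excluded)
    if overlap:
        problems.append(f"ids both ranked and excluded: {sorted(overlap)}")

    missing = known - set(ranked) - set(excluded)
    if missing:
        problems.append(f"ids neither ranked nor excluded: {sorted(missing)}")

    return problems

# A consistent result, shaped like the output shown earlier.
good = {"ranked_ids": [3, 2, 4], "excluded": [{"id": 1, "reason": "spicy"}], "rationales": []}
print(validate_rerank(good, [1, 2, 3, 4]))  # []

# An inconsistent result: id 9 was never a candidate, and 1, 2, 4 are unaccounted for.
bad = {"ranked_ids": [3, 9], "excluded": [], "rationales": []}
print(validate_rerank(bad, [1, 2, 3, 4]))
```

Running a check like this before the `id_to_item[cid]` lookup would turn a `KeyError` on an unexpected id into a readable message.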


## Notes: where this fits in a personalization system

Typical pipeline:
- Retrieval produces ~100–1000 candidates
- Ranking model scores candidates
- LLM reasoning re-ranks a *small* subset (e.g., top 20–50) under constraints
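
The hand-off between the last two stages can be sketched as a top-k cut over the ranking model's scores. This is an illustration under assumptions: the `score` field, the `select_for_llm` name, and the demo values are hypothetical, and a real system would take its scores from the ranking model.

```python
import heapq

def select_for_llm(scored, k=20):
    """Keep the k highest-scoring candidates, best first."""
    return heapq.nlargest(k, scored, key=lambda c: c["score"])

# Demo scores standing in for a ranking model's output.
demo_scored = [{"id": i, "score": s} for i, s in enumerate([0.2, 0.9, 0.5, 0.7, 0.1])]

shortlist = select_for_llm(demo_scored, k=3)
print([c["id"] for c in shortlist])  # [1, 3, 2]
```

Only the shortlist would be serialized into the prompt, which keeps the prompt size and the LLM call's cost bounded regardless of how many candidates retrieval returns.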

Strengths:
- Constraint satisfaction
- Explanations
- Rapid rule injection

Limitations:
- Adds latency: one LLM call per re-rank
- Outputs must be constrained (JSON schema, tools, guardrails)
- Not suited to large-scale retrieval
