# Model Setup
Here we set up the language models we want to test.

In [None]:
# Use a pipeline as a high-level helper
from transformers import pipeline

print("Getting pipeline:")
liquid = pipeline("text-generation", model="LiquidAI/LFM2.5-1.2B-Instruct")
Qwen = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
Geilim = pipeline("text-generation", model="NoesisLab/Geilim-1B-Instruct", trust_remote_code=True)

Getting pipeline:



## Prompt engineering our model
Here are the example questions I'm going to use:
1. What mood are you in right now?
(Lighthearted, intense, emotional, thought-provoking, scary, relaxing)

2. Which genres do you usually enjoy most?
(Action, sci-fi, comedy, romance, thriller, horror, drama, animation, documentary)

3. Do you prefer fast-paced movies or slower, character-driven stories?

4. Open-ended:
Describe a movie you really loved and what specifically made you enjoy it.

## JSON schema
We can pass the input using a specific JSON schema. An example input comes first, followed by the schema itself:
{
  "mood": "lighthearted",
  "genres": ["comedy", "adventure"],
  "pace": "fast",
  "spectacle_vs_story": "balanced",
  "familiar_vs_new": "familiar",
  "open_ended": "I loved Spider-Man: Into the Spider-Verse because it was funny, stylish, and heartfelt."
}

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "mood": {
      "type": "string"
    },
    "genres": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "pace": {
      "type": "string"
    },
    "spectacle_vs_story": {
      "type": "string"
    },
    "familiar_vs_new": {
      "type": "string"
    },
    "open_ended": {
      "type": "string"
    }
  },
  "required": [
    "mood",
    "genres",
    "pace",
    "spectacle_vs_story",
    "familiar_vs_new",
    "open_ended"
  ],
  "additionalProperties": false
}
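
The schema above can be enforced before anything is sent to a model. Below is a minimal sketch that uses only the standard library and hand-codes the same rules (six required fields, `genres` as an array of strings, no extra properties). The names `REQUIRED_STRING_FIELDS` and `validate_answers` are hypothetical helpers introduced for illustration, and this is not a full JSON Schema validator.

```python
# Minimal stdlib-only check mirroring the schema above.
# Assumption: hand-written rules stand in for a real JSON Schema validator.
REQUIRED_STRING_FIELDS = [
    "mood", "pace", "spectacle_vs_story", "familiar_vs_new", "open_ended",
]


def validate_answers(answers) -> list[str]:
    """Return a list of problems; an empty list means the input conforms."""
    if not isinstance(answers, dict):
        return ["input must be a JSON object"]

    errors = []
    # Every plain string field is required and must be a string.
    for field in REQUIRED_STRING_FIELDS:
        if field not in answers:
            errors.append(f"missing required field: {field}")
        elif not isinstance(answers[field], str):
            errors.append(f"{field} must be a string")

    # "genres" is required and must be a list of strings.
    if "genres" not in answers:
        errors.append("missing required field: genres")
    elif not isinstance(answers["genres"], list) or not all(
        isinstance(g, str) for g in answers["genres"]
    ):
        errors.append("genres must be an array of strings")

    # "additionalProperties": false means unknown keys are rejected.
    allowed = set(REQUIRED_STRING_FIELDS) | {"genres"}
    for key in answers:
        if key not in allowed:
            errors.append(f"unexpected field: {key}")
    return errors
```

If the third-party `jsonschema` package is installed, `jsonschema.validate(instance, schema)` would check the schema directly and is likely the better choice outside a quick experiment.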


## System Prompt:
You are a movie recommender. Your job: recommend exactly ONE movie that matches the user's preferences.

Rules:
- Only output valid JSON, nothing else.
- Choose a movie that best matches mood, genres, pacing, and spectacle/story preference.
- Prefer widely-known, easy-to-find movies.
- Avoid recommending a movie the user explicitly mentions loving or disliking in the open_ended answer.
- If the input is unclear or conflicting, choose a broadly appealing movie that fits the stated mood and one genre.
- Do not ask questions. Do not explain your reasoning.
- Output schema:
  {"movie":"<title>","year":<year>,"why":"<1 short sentence>","confidence":0.00}

## User prompt guide:
Given the user's answers below, pick ONE movie recommendation.

User answers (JSON):
{{ANSWERS_JSON}}

Follow the rules from the system prompt. Output only the JSON object.
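
Filling the `{{ANSWERS_JSON}}` placeholder can be sketched as follows. `USER_PROMPT_GUIDE` and `render_user_prompt` are hypothetical names used only for this illustration; the template text is copied from the guide above.

```python
import json

# Hypothetical constant holding the user prompt guide text shown above.
USER_PROMPT_GUIDE = (
    "Given the user's answers below, pick ONE movie recommendation.\n\n"
    "User answers (JSON):\n"
    "{{ANSWERS_JSON}}\n\n"
    "Follow the rules from the system prompt. Output only the JSON object."
)


def render_user_prompt(answers: dict) -> str:
    """Replace the {{ANSWERS_JSON}} placeholder with the serialized answers."""
    # str.format would treat the doubled braces as escaped literal braces,
    # so a plain string replace is used for this placeholder style.
    return USER_PROMPT_GUIDE.replace(
        "{{ANSWERS_JSON}}", json.dumps(answers, ensure_ascii=False)
    )


print(render_user_prompt({"mood": "scary", "genres": ["horror"]}))
```

Serializing with `json.dumps` rather than pasting a Python `dict` keeps the quoting consistent with the JSON shown in the few-shot examples.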

## 2-4 Shot Examples

Example 1:
User answers (JSON):
{"mood":"scary","genres":["horror","thriller"],"pace":"fast","spectacle_vs_story":"story","familiar_vs_new":"new","open_ended":"I like tense movies with clever twists, not gore."}

Assistant:
{"movie":"A Quiet Place","year":2018,"why":"Fast, tense, clever suspense with minimal gore.","confidence":0.78}

Example 2:
User answers (JSON):
{"mood":"emotional","genres":["drama"],"pace":"slow","spectacle_vs_story":"story","familiar_vs_new":"familiar","open_ended":"I love character growth and bittersweet endings."}

Assistant:
{"movie":"Good Will Hunting","year":1997,"why":"Character-driven drama with emotional growth and warmth.","confidence":0.74}


In [43]:
import json
### CREATED USING THE HELP OF LLMS : CHATGPT
# ── 1. System Prompt ─────────────────────────────────────────────────
SYSTEM_PROMPT = (
    "DO NOT RECOMMEND A MOVIE IN user_mentioned_movies.\n"
    "You are a movie recommender. Your job: recommend exactly ONE movie.\n\n"
    "Output schema:\n"
    '{"why":"<1 short sentence>","user_mentioned_movies":["<title>","<title>"],"recommended_movie":"<title>","year":<year>,"confidence":0.00}'
)

# ── 2. Few-Shot Examples ─────────────────────────────────────────────
FEW_SHOT_EXAMPLES = [
    {
        "user": {
            "mood": "scary",
            "genres": ["horror", "thriller"],
            "pace": "fast",
            "spectacle_vs_story": "story",
            "familiar_vs_new": "new",
            "open_ended": "Recently I watched a cool movie called Skinamarink. I like tense movies with clever twists, not gore."
        },
        "assistant": {
            "why": "Fast, tense, clever suspense with minimal gore.",
            "user_mentioned_movies":["Skinamarink"],
            "recommended_movie":"A Quiet Place",
            "year": 2018,
            "confidence": 0.78
        }
    },
    {
        "user": {
            "mood": "emotional",
            "genres": ["drama"],
            "pace": "slow",
            "spectacle_vs_story": "story",
            "familiar_vs_new": "familiar",
            "open_ended": "I love character growth and bittersweet endings. Kinda like the Shawshank Redemption."
        },
        "assistant": {
            "why": "Character-driven drama with emotional growth and warmth.",
            "user_mentioned_movies": ["Shawshank Redemption"],
            "recommended_movie": "Good Will Hunting",
            "year": 1997,
            "confidence": 0.74
        }
    }
]

# ── 3. User Prompt Template ─────────────────────────────────────────
USER_PROMPT_TEMPLATE = (
    "Given the user's answers below, pick ONE movie recommendation.\n\n"
    "User answers (JSON):\n"
    "{answers_json}\n\n"
    "If the user mentions a movie, pick one similar to the movie they mention."
)

# ── 4. Build the message list ────────────────────────────────────────
def build_messages(user_answers: dict) -> list[dict]:
    """Assemble system prompt, few-shot examples, and user query into a
    chat-message list ready for the pipeline."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]

    for ex in FEW_SHOT_EXAMPLES:
        messages.append({
            "role": "user",
            "content": USER_PROMPT_TEMPLATE.format(
                answers_json=json.dumps(ex["user"])
            )
        })
        messages.append({
            "role": "assistant",
            "content": json.dumps(ex["assistant"])
        })

    messages.append({
        "role": "user",
        "content": USER_PROMPT_TEMPLATE.format(
            answers_json=json.dumps(user_answers)
        )
    })

    return messages


# ── 5. Run the model ────────────────────────────────────────────────
def recommend_movie(user_answers: dict, pipe):
    """Send the assembled prompt to the loaded pipeline and return the
    raw model output."""
    messages = build_messages(user_answers)
    result = pipe(messages, max_new_tokens=512, do_sample=True, temperature=0.01)
    return result

print("Model outline ready.")

Model outline ready.


In [44]:
# ── Sample call ──────────────────────────────────────────────────────
sample_answers = {
    "mood": "lighthearted",
    "genres": ["comedy", "adventure"],
    "pace": "fast",
    "spectacle_vs_story": "balanced",
    "familiar_vs_new": "familiar",
    "open_ended": "I loved Spider-Man: Into the Spider-Verse because it was funny, stylish, and heartfelt"
}

sample_answers2 = {
    "mood": "serious",
    "genres": [ "adventure"],
    "pace": "slow",
    "spectacle_vs_story": "story",
    "familiar_vs_new": "familiar",
    "open_ended": "Recently I watched this really cool Movie called Inglorious bastards. I really liked the ending."
}

sample_answers3 = {
    "mood": "lighthearted",
    "genres": ["comedy", "slice of life"],
    "pace": "fast",
    "spectacle_vs_story": "spectacle",
    "familiar_vs_new": "new",
    "open_ended": "I watched this anime Saiki K last week, I wanna see something that's just as funny!"
}

QwenOutput1 = recommend_movie(sample_answers, Qwen)
QwenOutput2 = recommend_movie(sample_answers2, Qwen)
QwenOutput3 = recommend_movie(sample_answers3, Qwen)

Both `max_new_tokens` (=512) and `max_length`(=20) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


In [None]:
import re

# Extract the assistant's final response, stripping the <think> block
def clean_output(output):
    raw_content = output[0]["generated_text"][-1]["content"]
    clean_out = re.sub(r"<think>.*?</think>\s*", "", raw_content, flags=re.DOTALL)
    return clean_out

print(clean_output(QwenOutput1))
print(clean_output(QwenOutput2))
print(clean_output(QwenOutput3))

{"why": "Funny adventure comedy that balances humor and heart.", "user_mentioned_movies": ["Spider-Man: Into the Spider-Verse"], "recommended_movie": "The Dark Knight Rises", "year": 2012, "confidence": 0.85}
{"why": "Adventure filled film with a twist ending that leaves you questioning your choices.", "user_mentioned_movies": ["Inglorious Bastards"], "recommended_movie": "The Dark Knight Rises", "year": 2008, "confidence": 0.85}
{"why": "Funny comedy about everyday life with unexpected twists.", "user_mentioned_movies": ["Saiki K"], "recommended_movie": "The Big Bang Theory", "year": 2005, "confidence": 0.83}
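
The cleaned replies above can also be checked programmatically. The sketch below parses one reply and reports structural problems: invalid JSON, missing keys from the output schema, a recommendation that repeats a title the user mentioned, or a confidence outside 0 to 1. `check_recommendation` is a hypothetical helper, and the 0 to 1 range is an assumption based on the `0.00` placeholder in the schema. It does not verify that the title is a real film or that the year is correct.

```python
import json

# Keys taken from the output schema in SYSTEM_PROMPT.
EXPECTED_KEYS = {
    "why", "user_mentioned_movies", "recommended_movie", "year", "confidence",
}


def check_recommendation(raw_text: str) -> list[str]:
    """Parse one cleaned model reply and return a list of problems found."""
    try:
        reply = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(reply, dict):
        return ["reply is not a JSON object"]

    problems = []
    missing = sorted(EXPECTED_KEYS - reply.keys())
    if missing:
        problems.append(f"missing keys: {', '.join(missing)}")

    # Compare titles case-insensitively so "inception" matches "Inception".
    recommended = str(reply.get("recommended_movie", "")).strip().lower()
    mentioned_raw = reply.get("user_mentioned_movies", [])
    if not isinstance(mentioned_raw, list):
        mentioned_raw = []
    mentioned = {str(title).strip().lower() for title in mentioned_raw}
    if recommended and recommended in mentioned:
        problems.append("recommended movie was already mentioned by the user")

    # Assumption: confidence is meant to be a number in [0, 1].
    confidence = reply.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0 <= confidence <= 1:
        problems.append("confidence is not a number between 0 and 1")
    return problems
```

In the notebook this could be called as `check_recommendation(clean_output(QwenOutput1))`; an empty list only means the reply is well-formed, not that the recommendation is a good one.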

