# Personalization Agent Demo

## Connect to the Weaviate Cloud instance

> Reminder: Weaviate Agents are only available for Weaviate Cloud instances.

Connect to your Weaviate instance, using credentials from the Weaviate Cloud console. Here, they are loaded from the `.env` file.

In [1]:
from dotenv import load_dotenv
import weaviate
import os

load_dotenv()

weaviate_url = os.getenv("WEAVIATE_URL")
weaviate_api_key = os.getenv("WEAVIATE_API_KEY")

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
)

assert client.is_ready()
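`os.getenv` returns `None` when a variable is not set, so a missing `.env` entry only shows up later as a connection error. A minimal sketch of a check you could run before connecting follows. The helper `missing_env_vars` is hypothetical and is not part of the Weaviate client.

```python
import os

# Variable names used by the connection cell above
REQUIRED_VARS = ("WEAVIATE_URL", "WEAVIATE_API_KEY")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment.

    Hypothetical helper for illustration; `environ` defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Passing a plain dict as `environ` makes the check easy to try out without touching the real environment.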

## Add data

We add a movies dataset here. It is loaded from the Hugging Face Hub and is pre-vectorized with `Snowflake/snowflake-arctic-embed-l-v2.0`.

### Load data & inspect it briefly

In [2]:
from datasets import load_dataset

movies_dataset = load_dataset("jphwang/weaviate-demos", "movies", split="train", streaming=True)

In [3]:
from itertools import islice

for d in [movies_dataset]:
    print(f"Dataset: {d.config_name}")
    # Print only the first five rows of the streamed dataset
    for o in islice(d, 5):
        print(o)

Dataset: movies
{'properties': {'release_date': Timestamp('2021-12-15 00:00:00'), 'title': 'Spider-Man: No Way Home', 'overview': 'Peter Parker is unmasked and no longer able to separate his normal life from the high-stakes of being a super-hero. When he asks for help from Doctor Strange the stakes become even more dangerous, forcing him to discover what it truly means to be Spider-Man.', 'popularity': 5083.9541015625, 'vote_count': 8940, 'vote_average': 8.300000190734863, 'original_language': 'en', 'genre': 'Action, Adventure, Science Fiction', 'poster_url': 'https://image.tmdb.org/t/p/original/1g0dhYtq4irTY1GPXvft6k4YLjm.jpg'}, 'vector': [0.006379283033311367, 0.0007750422228127718, -0.011245766654610634, -0.047450534999370575, 0.0037151696160435677, -0.008632986806333065, 0.06163017451763153, 0.028453655540943146, -0.04843633621931076, -0.02299284003674984, 0.001112703001126647, -0.012430387549102306, -0.010701630264520645, -0.05831466242671013, -0.05151097849011421, 0.0872640982270

### Prepare the Collections

Here we create the `Movie` collection and add the objects to it.

In [4]:
# ONLY run this if you want to delete the existing collection & data
client.collections.delete(["Movie"])

In [5]:
from weaviate.classes.config import Configure, Property, DataType

if not client.collections.exists("Movie"):
    client.collections.create(
        "Movie",
        description="A dataset that lists movies, their ratings, original language, etc.",
        properties=[
            Property(
                name="title",
                data_type=DataType.TEXT,
                description="The title of the movie",
            ),
            Property(
                name="release_year",
                data_type=DataType.INT,
                description="The release year of the movie",
            ),
            Property(
                name="overview",
                data_type=DataType.TEXT,
                description="Short description of the movie",
            ),
            Property(
                name="genres",
                data_type=DataType.TEXT_ARRAY,
                description="The genres of the movie, in an array format",
            ),
            Property(
                name="vote_average",
                data_type=DataType.NUMBER,
                description="The average user rating of the movie; range is 0-10",
            ),
            Property(
                name="vote_count",
                data_type=DataType.INT,
                description="The number of user votes for the movie",
            ),
            Property(
                name="popularity",
                data_type=DataType.NUMBER,
                description="Calculated popularity of the movie by weighing multiple factors",
            ),
            Property(
                name="poster_url",
                data_type=DataType.TEXT,
                description="A TMDB URL of the movie poster image",
            ),
            Property(
                name="original_language",
                data_type=DataType.TEXT,
                description="A two-letter code (e.g. 'en') representing the original language of the movie",
            ),
        ],
        vectorizer_config=[
            Configure.NamedVectors.text2vec_weaviate(
                name="default",
                source_properties=["title", "overview"],
                model="Snowflake/snowflake-arctic-embed-l-v2.0"
            )
        ],
    )

### Import data

In [6]:
from tqdm import tqdm
from weaviate.util import generate_uuid5

movies = client.collections.get("Movie")

with movies.batch.fixed_size(batch_size=100) as batch:
    for item in tqdm(movies_dataset):
        obj = item["properties"]

        # Convert release_date to release_year
        obj["release_year"] = obj.pop("release_date").year

        # Add object to batch for import, using the pre-computed vector
        batch.add_object(
            properties=obj,
            uuid=generate_uuid5(obj["title"]),
            vector={"default": item["vector"]},
        )

# Check for any failed objects during import
if movies.batch.failed_objects:
    print(f"{len(movies.batch.failed_objects)} objects failed during import:")
    for failed in movies.batch.failed_objects[:3]:
        print(failed.message)

9826it [00:13, 703.67it/s] 
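The sample row printed earlier carries a comma-separated `genre` string, while the collection defines a `genres` text array, so the import cell above never sets `genres`. That may be why the hybrid query near the end of this notebook prints `Genres: None`. A minimal sketch of reshaping a row before import follows. The helper `prepare_movie_properties` is hypothetical, and the example row uses a plain `datetime` in place of the dataset's `Timestamp`.

```python
from datetime import datetime


def prepare_movie_properties(raw):
    """Return a copy of a dataset row's properties shaped for the Movie collection.

    Hypothetical helper for illustration; it is not part of the original notebook.
    """
    props = dict(raw)  # copy so the caller's dict is not mutated

    # release_date (datetime-like) -> release_year (int)
    release_date = props.pop("release_date", None)
    if release_date is not None:
        props["release_year"] = release_date.year

    # "Action, Adventure" -> ["Action", "Adventure"]
    genre = props.pop("genre", None)
    if genre:
        props["genres"] = [g.strip() for g in genre.split(",") if g.strip()]

    return props


example = prepare_movie_properties({
    "title": "Spider-Man: No Way Home",
    "release_date": datetime(2021, 12, 15),
    "genre": "Action, Adventure, Science Fiction",
})
print(example)
```

In the import loop, the result of such a helper could be passed as `properties=` instead of the raw row. Re-importing would change the recorded outputs shown in this notebook.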


## Connect to the Personalization Agent

You can initialize the Personalization Agent, or connect to an existing one, as shown below.

In [7]:
from weaviate.agents.personalization import PersonalizationAgent
from weaviate.classes.config import DataType

collection_name = "Movie"

if PersonalizationAgent.exists(client, collection_name):
    pa = PersonalizationAgent.connect(
        client=client,
        reference_collection=collection_name,
        vector_name="default",
    )
else:
    pa = PersonalizationAgent.create(
        client=client,
        reference_collection=collection_name,
        vector_name="default",
        user_properties={
            "age": DataType.NUMBER,
            "favorite_genres": DataType.TEXT_ARRAY,
            "favorite_years": DataType.NUMBER_ARRAY,
            "language": DataType.TEXT,
        },
    )

## Create a persona

A "persona" is where the agent stores its knowledge about a user. You can add a persona or use an existing one. 

In [8]:
from weaviate.agents.classes import Persona
from weaviate.util import generate_uuid5
from uuid import uuid4  # If you want to generate a random UUID

persona_id = generate_uuid5("jphwang")  # To generate a deterministic UUID
# persona_id = uuid4()  # To generate a random UUID

# Delete any existing persona with this ID so that the demo starts from a clean state.
# Skip this call if you want to keep an existing persona.
pa.delete_persona(persona_id)

if pa.has_persona(persona_id):
    print(f"Persona with ID {persona_id} already exists.")
else:
    print(f"Creating new persona with ID {persona_id}.")
    pa.add_persona(
        Persona(
            persona_id=persona_id,
            properties={
                "age": 18,
                "favorite_genres": ["Sci-Fi", "Fantasy", "Action"],
                "favorite_years": [1999, 2008, 2018, 2019],
                "language": "English",
            },
        )
    )

Creating new persona with ID 2fb2f796-8f16-5dc9-8e5d-c285a92a9dac.


## Add interactions

Interactions are how the agent learns each persona's preferences. Each one links a persona to an item in the collection and carries a weight.

In [9]:
from weaviate.agents.classes import PersonaInteraction
from helpers import get_movie_uuid  # Helper to get the UUID of a movie

pa.add_interactions(interactions=[
    PersonaInteraction(
        persona_id=persona_id, item_id=get_movie_uuid(client, "Independence Day"), weight=0.8
    ),
])

Fetched movie 'Independence Day' from the collection


## Queries

With a persona and at least one interaction in place, we can already perform personalized queries.

### Basic queries

This is the fastest and most basic kind of personalized query.

- Uses vectors of interaction history only

In [10]:
response = pa.get_objects(persona_id, limit=50, use_agent_ranking=False)

In [11]:
from helpers import print_movie_response_details

print_movie_response_details(response, 5)

*****0*****
Independence Daysaster
None
original rank: 0, personalized rank: None
*****1*****
Independence Day
None
original rank: 1, personalized rank: None
*****2*****
Alien Resurrection
None
original rank: 2, personalized rank: None
*****3*****
Star Trek: Insurrection
None
original rank: 3, personalized rank: None
*****4*****
Redemption Day
None
original rank: 4, personalized rank: None


### Agent reranking

The agent can rerank the results using what it knows about the persona as well as the persona's interactions.

In [12]:
response = pa.get_objects(persona_id, limit=50, use_agent_ranking=True)

print_movie_response_details(response, 5)

Ranking rationale: Since you love sci-fi, action, and fantasy movies from around 1999, 2008, 2018, and 2019, I've prioritized thrilling alien invasion and space-themed action films that match your favorite genres and years. Movies with strong sci-fi elements, great action sequences, and connections to your positive interest in alien defense themes have been boosted, while less relevant or lower rated films are ranked lower.
*****0*****
Occupation
None
original rank: 9, personalized rank: 1
*****1*****
Pacific Rim: Uprising
None
original rank: 19, personalized rank: 2
*****2*****
The Matrix Resurrections
None
original rank: 41, personalized rank: 3
*****3*****
Serenity
None
original rank: 15, personalized rank: 4
*****4*****
Prometheus
None
original rank: 37, personalized rank: 5
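To see how far the agent moved each item, the two ranks can be compared directly. The sketch below hard-codes the five rows printed above rather than reading the response object, whose field names are not shown in this notebook. The two rank columns may use different index bases (the earlier output shows original ranks starting at 0), so treat the shifts as approximate to within one position.

```python
# (title, original rank, personalized rank), copied from the output above
rows = [
    ("Occupation", 9, 1),
    ("Pacific Rim: Uprising", 19, 2),
    ("The Matrix Resurrections", 41, 3),
    ("Serenity", 15, 4),
    ("Prometheus", 37, 5),
]


def rank_shift(original, personalized):
    """Positive values mean the item moved up after agent reranking."""
    return original - personalized


shifts = {title: rank_shift(orig, pers) for title, orig, pers in rows}
biggest = max(shifts, key=shifts.get)

for title, shift in shifts.items():
    print(f"{title}: moved up {shift} places")
print(f"Largest move: {biggest}")
```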


### With Reranker + Instruction

- Uses vectors of interaction history and AI-based reranker
- Instructions used to guide the reranker

In [13]:
response = pa.get_objects(
    persona_id,
    limit=50,
    use_agent_ranking=True,
    instruction="I'm looking for something for the whole family, maybe a fun, light action film."
)

print_movie_response_details(response, 5)

Ranking rationale: Since you are looking for a fun, light action film for the whole family and enjoy sci-fi, fantasy, and action genres, we prioritized movies that blend action with family-friendly or adventurous fantasy elements. Titles like 'We Can Be Heroes' and 'Rise of the Guardians' are excellent choices for family-friendly fun, along with sci-fi classics and newer engaging action films that match your genre preferences. We've also boosted films that are popular and well-rated, especially those related to alien invasions and action-packed sci-fi adventures, ensuring a fun experience for you and your family.
*****0*****
Rise of the Guardians
None
original rank: 49, personalized rank: 1
*****1*****
Independence Day
None
original rank: 1, personalized rank: 2
*****2*****
Independence Daysaster
None
original rank: 0, personalized rank: 3
*****3*****
Pacific Rim: Uprising
None
original rank: 19, personalized rank: 4
*****4*****
Serenity
None
original rank: 15, personalized rank: 5


### Add more interactions

Over time, you will add more interactions, which help the agent learn more about the persona's preferences.

Each interaction can be positive or negative. Its weight ranges from -1 (most negative) through 0 (neutral) to 1 (most positive).

In [14]:
interactions = [
    PersonaInteraction(
        persona_id=persona_id, item_id=get_movie_uuid(client, "Iron Man"), weight=0.9  # very positive
    ),
    PersonaInteraction(
        persona_id=persona_id, item_id=get_movie_uuid(client, "The Grand Budapest Hotel"), weight=0.9
    ),
    PersonaInteraction(
        persona_id=persona_id, item_id=get_movie_uuid(client, "Sleepless in Seattle"), weight=0.8
    ),
    PersonaInteraction(
        persona_id=persona_id, item_id=get_movie_uuid(client, "The Mummy"), weight=0.0  # neutral
    ),
    PersonaInteraction(
        persona_id=persona_id, item_id=get_movie_uuid(client, "A Nightmare on Elm Street"), weight=-0.3  # slightly negative
    ),
    PersonaInteraction(
        persona_id=persona_id, item_id=get_movie_uuid(client, "The Cloverfield Paradox"), weight=-0.9  # very negative
    ),
]

pa.add_interactions(interactions=interactions)

Fetched movie 'Iron Man' from the collection
Fetched movie 'The Grand Budapest Hotel' from the collection
Fetched movie 'Sleepless in Seattle' from the collection
Fetched movie 'The Mummy' from the collection
Fetched movie 'A Nightmare on Elm Street' from the collection
Fetched movie 'The Cloverfield Paradox' from the collection
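In practice, interaction weights often have to be derived from some other signal, such as a star rating. A minimal sketch of a linear mapping onto the -1 to 1 weight scale described above follows. The helper `rating_to_weight` and the 0 to 10 rating scale are assumptions made for illustration and are not part of the agent's API.

```python
def rating_to_weight(rating, low=0.0, high=10.0):
    """Map a rating on [low, high] linearly onto an interaction weight in [-1, 1].

    Hypothetical helper: the scale midpoint maps to 0 (neutral), and
    out-of-range ratings are clamped to the nearest bound.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    clamped = min(max(rating, low), high)
    midpoint = (low + high) / 2
    return (clamped - midpoint) / (high - midpoint)


for rating in (0, 5, 7.5, 10):
    print(rating, rating_to_weight(rating))
```

The resulting value could then be passed as `weight=` when building a `PersonaInteraction`. A linear mapping is only one option; a real application might weigh explicit ratings differently from implicit signals such as clicks.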


### Retry with the updated knowledge

In [15]:
response = pa.get_objects(persona_id, limit=50, use_agent_ranking=True)

print_movie_response_details(response, 5)

Ranking rationale: As you love sci-fi, action, and fantasy movies especially those around 1999, 2008, and 2018-2019, we've prioritized blockbuster superhero and sci-fi action films like Iron Man, Iron Man 3, Avengers: Infinity War, and Man of Steel. We also boosted well-known sci-fi franchises such as Independence Day and Serenity, while lowering the ranks of drama-heavy and less relevant romantic or horror movies. Your favorite genres and years strongly influenced this tailored order.
*****0*****
Iron Man 3
None
original rank: 6, personalized rank: 1
*****1*****
Iron Man
None
original rank: 4, personalized rank: 2
*****2*****
Man of Steel
None
original rank: 26, personalized rank: 3
*****3*****
Avengers: Age of Ultron
None
original rank: 38, personalized rank: 4
*****4*****
Independence Day
None
original rank: 8, personalized rank: 5


### With Reranker + Instruction + Filter

- The most complex personalized queries
- Uses vectors of interaction history and AI-based reranker
- Instructions used to guide the reranker
- Filters out items that are not relevant to the user

In [17]:
from weaviate.classes.query import Filter

# With Reranker + Instruction + Filter
response = pa.get_objects(
    persona_id,
    limit=50,
    use_agent_ranking=True,
    instruction="The user is looking for a classic drama, that is suitable for a date night.",
    filters=Filter.by_property("release_year").less_or_equal(2000)
)

In [18]:
print_movie_response_details(response, 10)

Ranking rationale: Considering your preference for classic dramas suitable for a date night, I prioritized films that blend timeless romance and drama with high acclaim. I gave extra weight to older titles that offer a warm, romantic atmosphere perfect for a cozy evening, while also balancing your interest in popular, engaging stories.
*****0*****
The Apartment
None
original rank: 41, personalized rank: 1
*****1*****
Forrest Gump
None
original rank: 19, personalized rank: 2
*****2*****
American Beauty
None
original rank: 49, personalized rank: 3
*****3*****
Roman Holiday
None
original rank: 39, personalized rank: 4
*****4*****
The Graduate
None
original rank: 9, personalized rank: 5
*****5*****
Great Expectations
None
original rank: 33, personalized rank: 6
*****6*****
Breakfast at Tiffany's
None
original rank: 7, personalized rank: 0
*****7*****
Groundhog Day
None
original rank: 13, personalized rank: 7
*****8*****
Tess
None
original rank: 25, personalized rank: 8
*****9*****
The Grea

## Combine personalization with other queries

Through `pa.query`, you can run the common Weaviate searches, such as `near_text`, `bm25` and `hybrid`, with personalization applied for the given persona.

In [19]:
response = pa.query(persona_id=persona_id, strength=0.95).hybrid(
    query="historical adventure",
    limit=10
)

for o in response.objects:
    print(f"Title: {o.properties['title']}")
    print(f"Genres: {o.properties['genres']}")

Title: The Adventures of Baron Munchausen
Genres: None
Title: The Adventures of Robin Hood
Genres: None
Title: Timeline
Genres: None
Title: Bill & Ted's Excellent Adventure
Genres: None
Title: Firewalker
Genres: None
Title: The Extraordinary Adventures of Adèle Blanc-Sec
Genres: None
Title: Jack Hunter and the Quest for Akhenaten's Tomb
Genres: None
Title: Pilgrimage
Genres: None
Title: Hidalgo
Genres: None
Title: King Arthur
Genres: None


In [20]:
client.close()