<a href="https://colab.research.google.com/github/muffafa/advent-of-haystack-2024-2025-solutions/blob/main/SOLUTION_Haystack_Advent_Weaviate_Day.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advent of Haystack: Day 2

_Make a copy of this Colab to start_

In this challenge, your mission is to help a couple of fictional elves.
1. Find out what's happening in the North Pole in the film "A Very Weaviate Christmas"
2. This will lead you to a clue that will let you discover which Weaviate Collection to peek into.
3. While submitting the challenge, tell us what you find there!


### Components to use:
1. [`OpenAITextEmbedder`](https://docs.haystack.deepset.ai/docs/openaitextembedder)
2. [`OpenAIGenerator`](https://docs.haystack.deepset.ai/docs/openaigenerator)
3. [`PromptBuilder`](https://docs.haystack.deepset.ai/docs/promptbuilder)
4. [`WeaviateDocumentStore`](https://docs.haystack.deepset.ai/docs/weaviatedocumentstore)
5. [`WeaviateEmbeddingRetriever`](https://docs.haystack.deepset.ai/reference/integrations-weaviate#weaviateembeddingretriever)


🎄 **Your task is to complete steps 3 and 4**. Run the earlier code cells first, and make sure you understand what each prior step does.

## 1) Setup and Installation

In [None]:
!pip install haystack-ai weaviate-haystack

To get started, first provide your API keys below. We're providing you with a read-only API Key for Weaviate. This one will help you first find out what's going on in the North Pole.

In [None]:
import os
from getpass import getpass

os.environ["WEAVIATE_API_KEY"] = "b3jhGwa4NkLGjaq3v1V1vh1pTrlKjePZSt91"

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter OpenAI API key:")

Enter OpenAI API key:··········


## 2) Weaviate Setup

Next, you can connect to the right `WeaviateDocumentStore` (we've already added the right code for you below with the client URL in place).

In [None]:
from haystack_integrations.document_stores.weaviate import WeaviateDocumentStore, AuthApiKey
from haystack import Document
import os


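# With no arguments, AuthApiKey() picks up the key from the
# WEAVIATE_API_KEY environment variable set in the previous cell.
auth_client_secret = AuthApiKey()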

document_store = WeaviateDocumentStore(url="https://zgvjwlycsr6p5j1ziuyea.c0.europe-west3.gcp.weaviate.cloud",
                                       auth_client_secret=auth_client_secret)

## 3) The RAG Pipeline

Now, you're on your own. Complete the code blocks below.

First, create a RAG pipeline that can answer questions based on the overviews of the movies in your `document_store`.

⭐️ You should then be able to run the pipeline and answer the question "What happens in the film 'A Very Weaviate Christmas'?"

**💚 Hint 1:** The embedding model that was used to populate the vectors was `text-embedding-3-small` by OpenAI.

**💙 Hint 2:** We've added an import for the `OpenAIGenerator`, but feel free to use something else!

In [None]:
from haystack import Pipeline
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack_integrations.components.retrievers.weaviate import WeaviateEmbeddingRetriever


prompt = """
Answer the question based on the movie overviews below.

{% for movie in movies %}
  Title: {{movie.meta["title"]}}
  Overview: {{movie.content}}
{% endfor %}

Query: {{query}}
Answer:
"""

rag = Pipeline()
rag.add_component("query_embedder", OpenAITextEmbedder(model="text-embedding-3-small"))
rag.add_component("retriever", WeaviateEmbeddingRetriever(document_store=document_store, top_k=3))
rag.add_component("prompt", PromptBuilder(prompt))
rag.add_component("generator", OpenAIGenerator(model="gpt-4o-mini"))

rag.connect("query_embedder.embedding", "retriever.query_embedding")
rag.connect("retriever.documents", "prompt.movies")
rag.connect("prompt", "generator")

<haystack.core.pipeline.pipeline.Pipeline object at 0x79bcea229b10>
🚅 Components
  - query_embedder: OpenAITextEmbedder
  - retriever: WeaviateEmbeddingRetriever
  - prompt: PromptBuilder
  - generator: OpenAIGenerator
🛤️ Connections
  - query_embedder.embedding -> retriever.query_embedding (List[float])
  - retriever.documents -> prompt.movies (List[Document])
  - prompt.prompt -> generator.prompt (str)
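
To build intuition for the `query_embedder` -> `retriever` connection, here is a minimal sketch of embedding retrieval in plain Python. It is not how Weaviate is implemented: a real vector database uses an index rather than scanning every vector, and the distance metric depends on how the collection is configured. The function names, the toy documents, and the three-dimensional vectors are all hypothetical; real `text-embedding-3-small` vectors are much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_embedding, documents, k=3):
    """Brute-force search: rank every document by similarity to the query."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_embedding, doc["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data, invented for illustration only.
toy_documents = [
    {"title": "Movie A", "embedding": [1.0, 0.0, 0.0]},
    {"title": "Movie B", "embedding": [0.0, 1.0, 0.0]},
    {"title": "Movie C", "embedding": [0.7, 0.7, 0.0]},
    {"title": "Movie D", "embedding": [0.0, 0.0, 1.0]},
]
toy_query = [0.9, 0.1, 0.0]

top_titles = [doc["title"] for doc in top_k(toy_query, toy_documents, k=2)]
print(top_titles)
```

This is also why Hint 1 matters: the query has to be embedded with the same model that produced the stored vectors, otherwise the similarity scores are not meaningful.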

In [None]:
query = "What happens in the film 'A Very Weaviate Christmas'?"
reply = rag.run({"query_embedder": {"text": query}, "prompt": {"query": query}})

print(reply['generator']["replies"][0])

In the film 'A Very Weaviate Christmas,' we follow the adventures of two of Santa's elves, Daniel and Philip, as they embark on a mission to recover stolen vectors that have been hidden in an unknown Collection. With Christmas Day approaching, they urgently search for these vectors, which end up being located in "Santas_Grotto." The film features the Weaviate DevRel and Growth teams in this thrilling Christmas drama.
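
To see roughly what text the generator received, the sketch below approximates the prompt template from step 3 in plain Python. It is only an approximation of what `PromptBuilder` renders with Jinja (whitespace will differ), and `render_prompt`, the dictionary layout, and the placeholder movie are hypothetical stand-ins for the retrieved `Document` objects.

```python
def render_prompt(movies, query):
    """Plain-Python approximation of the Jinja template used in step 3."""
    lines = ["Answer the question based on the movie overviews below.", ""]
    for movie in movies:
        # Mirrors {{movie.meta["title"]}} and {{movie.content}} in the template.
        lines.append(f"  Title: {movie['meta']['title']}")
        lines.append(f"  Overview: {movie['content']}")
    lines += ["", f"Query: {query}", "Answer:"]
    return "\n".join(lines)


# Hypothetical placeholder document, not taken from the real collection.
toy_movies = [
    {"meta": {"title": "Example Movie"}, "content": "A placeholder overview."},
]

rendered = render_prompt(toy_movies, "What happens in 'Example Movie'?")
print(rendered)
```

Each retrieved document contributes one title and one overview to the prompt, so `top_k` on the retriever directly controls how much context the model sees.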


## 4) Solve the Mystery

By this point, you should know what's happening. There is a Collection where everything has been hidden.

Complete the code cell below by providing the right Collection name, and tell us the following:

1. Who is the culprit? Watch out, because there may be `decoys`.
2. What have they stolen?

**💚 Hint:** Once you've connected to the right collection, take a look at all the Objects in there. Then, you may be able to use filters to avoid the decoys!

- [Weaviate Documentation: Read all Objects](https://weaviate.io/developers/weaviate/manage-data/read-all-objects)
- [Weaviate Documentation: Filters](https://weaviate.io/developers/weaviate/search/filters)

In [None]:
import weaviate

from weaviate.classes.init import Auth
from weaviate.classes.query import Filter


headers = {"X-OpenAI-Api-Key": os.getenv("OPENAI_API_KEY")}
client = weaviate.connect_to_weaviate_cloud(cluster_url="https://zgvjwlycsr6p5j1ziuyea.c0.europe-west3.gcp.weaviate.cloud",
                                            auth_credentials=Auth.api_key(os.getenv("WEAVIATE_API_KEY")),
                                            headers=headers)

# Provide the name of the collection in client.collections.get() below 👇
plot = client.collections.get("Santas_Grotto")

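# Read every object in the collection, decoys included.
for item in plot.iterator():
    print(item.properties)

# Ask Weaviate to return only objects whose "decoy" property is not True.
plot.query.fetch_objects(filters=Filter.by_property("decoy").not_equal(True)).objects[0].properties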


{'plot': 'Tuana is here with not just all the vectors but also all the presents that are supposed to be delivered around the World!', 'decoy': False}
{'plot': "Sebastian is here, but he seems unsure what's going on", 'decoy': True}
{'plot': "JP is here, looks like he's feasting on cookies", 'decoy': True}


{'plot': 'Tuana is here with not just all the vectors but also all the presents that are supposed to be delivered around the World!',
 'decoy': False}
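
The filter in the cell above is evaluated by Weaviate on the server. As a sketch of the same logic on the client side, the snippet below applies an equivalent condition in plain Python to the three objects printed above. The variable names are illustrative only.

```python
# The three objects printed by the iterator in the cell above.
objects = [
    {
        "plot": "Tuana is here with not just all the vectors but also all the "
                "presents that are supposed to be delivered around the World!",
        "decoy": False,
    },
    {"plot": "Sebastian is here, but he seems unsure what's going on", "decoy": True},
    {"plot": "JP is here, looks like he's feasting on cookies", "decoy": True},
]

# Keep only the objects that are not flagged as decoys.
non_decoys = [obj for obj in objects if obj["decoy"] is not True]

for obj in non_decoys:
    print(obj["plot"])
```

On this data, `Filter.by_property("decoy").equal(False)` would be another way to express the same condition. Filtering on the server is usually preferable for larger collections, because only the matching objects are sent back.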