# Generator

## Generator for the Retrieval Process

In [26]:
from haystack import Pipeline, component
from haystack.components.builders import PromptBuilder
from haystack_integrations.components.retrievers.mongodb_atlas import MongoDBAtlasEmbeddingRetriever
from haystack.components.embedders import SentenceTransformersTextEmbedder

from haystack_integrations.document_stores.mongodb_atlas import MongoDBAtlasDocumentStore
import os
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret
from typing import List
from getpass import getpass



Define the connection environment (MongoDB connection string and OpenAI API key)

In [None]:
os.environ['MONGO_CONNECTION_STRING'] = getpass("Enter your MongoDB connection string: ")

In [None]:
os.environ['OPENAI_API_KEY'] = getpass("Enter your OpenAI API key: ")
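Re-running the two cells above prompts for the secrets every time. A minimal sketch of a helper that prompts only when the variable is not already set; `ensure_env` and its `reader` parameter are hypothetical names introduced here for illustration, not part of the notebook or of Haystack.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_env(name: str, prompt: str, reader: Callable[[str], str] = getpass) -> str:
    """Return os.environ[name], asking for it only when it is missing or empty."""
    value = os.environ.get(name)
    if not value:
        # `reader` defaults to getpass so the secret is not echoed to the screen.
        value = reader(prompt)
        os.environ[name] = value
    return value


# Usage in the notebook (left commented out so the block never prompts by itself):
# ensure_env("MONGO_CONNECTION_STRING", "Enter your MongoDB connection string: ")
# ensure_env("OPENAI_API_KEY", "Enter your OpenAI API key: ")
```

Passing the reader as a parameter keeps the helper testable without an interactive terminal.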

Build the pipeline

In [None]:
document_store = MongoDBAtlasDocumentStore(
    database_name="depato_store",
    collection_name="products",
    vector_search_index="vector_index",
    full_text_search_index="search_index",
)

In [32]:
TEMPLATE = """
You are a shop assistant that helps users find the best products in a shopping mall.
You will be given a query and a list of products. Your task is to generate a list of products that best match the query.

The output should be a list of products in the following format:

<summary_of_query>
<index>. <product_name> 
Price: <product_price>
Material: <product_material>
Category: <product_category>
Brand: <product_brand>
Recommendation: <product_recommendation>

From the format above, you should pay attention to the following:
1. <summary_of_query> should be a short summary of the query.
2. <index> should be a number starting from 1.
3. <product_name> should be the name of the product, taken from the product_name field.
4. <product_price> should be the price of the product, taken from the product_price field.
5. <product_material> should be the material of the product, taken from the product_material field.
6. <product_category> should be the category of the product, taken from the product_category field.
7. <product_brand> should be the brand of the product, taken from the product_brand field.
8. <product_recommendation> should explain why this product is recommended, paying attention to the product_content field.


You should only return the list of products that best match the query, do not return any other information.

The query is: {{query}}
The products are:
{% for product in documents %}
===========================================================
{{ loop.index }}. product_name: {{ product.meta.title }}
product_price: {{ product.meta.price }}
product_material: {{ product.meta.material }}
product_category: {{ product.meta.category }}
product_brand: {{ product.meta.brand }}
product_content: {{ product.content }}
{% endfor %}

===========================================================

Answer:

"""

In [33]:
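pipeline = Pipeline()
# Embed the query text; the model should match the one used to embed the stored documents.
pipeline.add_component("embedder", SentenceTransformersTextEmbedder())
# Fetch the 5 nearest products from MongoDB Atlas by vector similarity.
pipeline.add_component("retriever", MongoDBAtlasEmbeddingRetriever(document_store=document_store, top_k=5))
# Render the query and the retrieved products into the prompt.
pipeline.add_component("prompt_builder", PromptBuilder(template=TEMPLATE))
# Generate the answer; the API key is read from the OPENAI_API_KEY environment variable.
pipeline.add_component("generator", OpenAIGenerator(model="gpt-4.1", api_key=Secret.from_env_var("OPENAI_API_KEY")))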

PromptBuilder has 2 prompt variables, but `required_variables` is not set. By default, all prompt variables are treated as optional, which may lead to unintended behavior in multi-branch pipelines. To avoid unexpected execution, ensure that variables intended to be required are explicitly set in `required_variables`.


In [34]:
pipeline.connect("embedder", "retriever")
pipeline.connect("retriever", "prompt_builder")
pipeline.connect("prompt_builder", "generator")

<haystack.core.pipeline.pipeline.Pipeline object at 0x000001AFD898C830>
🚅 Components
  - embedder: SentenceTransformersTextEmbedder
  - retriever: MongoDBAtlasEmbeddingRetriever
  - prompt_builder: PromptBuilder
  - generator: OpenAIGenerator
🛤️ Connections
  - embedder.embedding -> retriever.query_embedding (List[float])
  - retriever.documents -> prompt_builder.documents (List[Document])
  - prompt_builder.prompt -> generator.prompt (str)

In [35]:
query = "I want to find an Outerwear that is not make me hot"
response = pipeline.run(
    {
        "embedder": {"text": query},
        "prompt_builder": {"query": query},
    }
)

Batches: 100%|██████████| 1/1 [00:00<00:00,  3.13it/s]


In [37]:
response['generator']['replies'][0]

"Lightweight and breathable outerwear options that won't make you feel hot.\n\n1. Under Armour Heatgear Alpha Muscle Crop Tee - Women's  \nPrice: 35.01  \nMaterial: Polyester  \nCategory: Tops  \nBrand: Under Armour  \nRecommendation: This crop tee features super-breathable mesh construction and Under Armour's HeatGear technology, designed specifically to wick sweat and keep you dry and cool. Its lightweight, stretchy material and cropped silhouette make it ideal for staying comfortable in warm conditions.\n\n2. INGEAR Ladies Rash Guard Long Sleeve Shirt Swimwear  \nPrice: 16.99  \nMaterial: Unknown  \nCategory: Tops  \nBrand: In Gear  \nRecommendation: This rash guard is intended for active wear in and out of water, providing coverage while remaining lightweight. It's suitable for hot weather activities like swimming and beach outings, ensuring you don’t overheat while being protected from the sun."