# Use Gradient Models for Notion RAG

In this Colab, we will:
- Use a custom Haystack component called `NotionExporter` to export Notion pages
- Build an indexing pipeline that writes our Notion pages into an `InMemoryDocumentStore` with embeddings
- Build a custom RAG pipeline to do question answering on our Notion pages

In [None]:
!pip install haystack-ai gradient-haystack notion-haystack
!pip install nest-asyncio

import nest_asyncio

nest_asyncio.apply()

In [None]:
import getpass
import os

notion_api_key = getpass.getpass("Enter Notion API key:")
gradient_access_token = getpass.getpass("Gradient Access token:")

## Test the NotionExporter

- Follow the steps in the Notion [documentation](https://developers.notion.com/docs/create-a-notion-integration#create-your-integration-in-notion) to create a new Notion integration, connect it to your pages, and obtain your API token.
- A Notion page ID is the trailing 32-character hexadecimal string at the end of the page URL. Write it with hyphens in the 8-4-4-4-12 pattern, for example `6f98e9a6-a880-40e9-b191-1c4f41efec87`.
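
The page ID formatting described above can be sketched in plain Python. This is only an illustration: the helper name `notion_page_id_from_url` and the example URL are hypothetical, and the sketch assumes the URL ends with the 32 hexadecimal characters of the ID (with or without hyphens).

```python
import re


def notion_page_id_from_url(url: str) -> str:
    """Return the page ID at the end of a Notion URL in 8-4-4-4-12 form.

    Hypothetical helper for illustration; not part of notion-haystack.
    """
    # Drop any query string or fragment, then remove hyphens so that
    # both hyphenated and unhyphenated IDs are handled the same way.
    path = url.split("?", 1)[0].split("#", 1)[0].replace("-", "")
    match = re.search(r"[0-9a-fA-F]{32}$", path)
    if match is None:
        raise ValueError(f"No 32-character page ID found at the end of: {url}")
    raw = match.group(0).lower()
    # Re-insert hyphens after 8, 12, 16 and 20 characters.
    return "-".join([raw[0:8], raw[8:12], raw[12:16], raw[16:20], raw[20:32]])


# Hypothetical URL built from the page ID used in this notebook.
example_url = "https://www.notion.so/My-Page-6f98e9a6a88040e9b1911c4f41efec87"
page_id = notion_page_id_from_url(example_url)
```

The resulting string can then be passed in the `page_ids` list given to `exporter.run`.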

In [None]:
from notion_haystack import NotionExporter

exporter = NotionExporter(api_token=notion_api_key)

In [None]:
exporter.run(page_ids=["6f98e9a6-a880-40e9-b191-1c4f41efec87"])

## Build an Indexing Pipeline to Write Notion Pages to a Document Store

- Documentation on [`GradientDocumentEmbedder`](https://haystack.deepset.ai/integrations/gradient#usage)
- Documentation on [`DocumentSplitter`](https://docs.haystack.deepset.ai/v2.0/docs/documentsplitter)
- Documentation on [`DocumentWriter`](https://docs.haystack.deepset.ai/v2.0/docs/documentwriter)
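
To build intuition for what the splitting step does before embedding, here is a conceptual sketch of word-window splitting in plain Python. It is not the `DocumentSplitter` implementation; the function name `split_by_words` and the parameter values are assumptions chosen for illustration.

```python
from typing import List


def split_by_words(text: str, split_length: int = 200, split_overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `split_length` words.

    Illustrative only; the real splitter may handle whitespace,
    overlap, and metadata differently.
    """
    step = split_length - split_overlap
    if step <= 0:
        raise ValueError("split_overlap must be smaller than split_length")
    words = text.split()
    # Slide a fixed-size window over the word list.
    return [" ".join(words[i:i + split_length]) for i in range(0, len(words), step)]


chunks = split_by_words("a b c d e", split_length=2)
```

Each chunk would then be embedded separately, so shorter chunks give more precise retrieval at the cost of more embedding calls.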

In [None]:
from haystack.components.preprocessors import DocumentSplitter
from haystack_integrations.components.embedders.gradient import GradientDocumentEmbedder
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore


document_store = InMemoryDocumentStore()
exporter = NotionExporter(api_token=notion_api_key)
splitter = DocumentSplitter()
document_embedder = GradientDocumentEmbedder(access_token=gradient_access_token, workspace_id="9ee7071c-2fa9-4155-8edd-94ed371f1750_workspace")
writer = DocumentWriter(document_store=document_store)


In [None]:
from haystack import Pipeline

indexing_pipeline = Pipeline()
indexing_pipeline.add_component(instance=exporter, name="exporter")
indexing_pipeline.add_component(instance=splitter, name="splitter")
indexing_pipeline.add_component(instance=document_embedder, name="document_embedder")
indexing_pipeline.add_component(instance=writer, name="writer")

In [None]:
indexing_pipeline.connect("exporter.documents", "splitter.documents")
indexing_pipeline.connect("splitter.documents", "document_embedder.documents")
indexing_pipeline.connect("document_embedder.documents", "writer.documents")

In [None]:
indexing_pipeline.run(data={"exporter":{"page_ids": ["6f98e9a6-a880-40e9-b191-1c4f41efec87"]}})

## Build a RAG Pipeline with Gradient

- Documentation on [`GradientTextEmbedder`](https://haystack.deepset.ai/integrations/gradient#usage)
- Documentation on [`PromptBuilder`](https://docs.haystack.deepset.ai/v2.0/docs/promptbuilder)
- Documentation on [`GradientGenerator`](https://haystack.deepset.ai/integrations/gradient#usage)
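
Conceptually, the RAG pipeline embeds the question, ranks the stored chunks by similarity to that embedding, and places the best matches in the prompt. The ranking step can be sketched as follows. The vectors, document texts, and helper names are hypothetical, and cosine similarity is used here only for illustration; the actual scoring is done by `InMemoryEmbeddingRetriever` and may use a different similarity function.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_documents(
    query_embedding: List[float],
    documents: List[Tuple[str, List[float]]],
    k: int = 2,
) -> List[str]:
    """Return the contents of the k documents most similar to the query."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_embedding, doc[1]),
        reverse=True,
    )
    return [content for content, _ in ranked[:k]]


# Toy three-dimensional embeddings; real embeddings have many more dimensions.
toy_documents = [
    ("How to create a custom component", [0.9, 0.1, 0.0]),
    ("Release notes", [0.0, 0.2, 0.9]),
    ("Component lifecycle", [0.7, 0.3, 0.1]),
]
best = top_k_documents([1.0, 0.0, 0.0], toy_documents, k=2)
```

In the pipeline below, the retrieved documents play the role of `best` and are rendered into the prompt template by the `PromptBuilder`.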

In [None]:
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack_integrations.components.embedders.gradient import GradientTextEmbedder
from haystack_integrations.components.generators.gradient import GradientGenerator

prompt = """ Answer the query, based on the
content in the documents.

Documents:
{% for doc in documents %}
  {{doc.content}}
{% endfor %}

Query: {{query}}
"""
text_embedder = GradientTextEmbedder(access_token=gradient_access_token, workspace_id="9ee7071c-2fa9-4155-8edd-94ed371f1750_workspace")
retriever = InMemoryEmbeddingRetriever(document_store=document_store)
prompt_builder = PromptBuilder(template=prompt)
generator = GradientGenerator(access_token=gradient_access_token,
                              workspace_id="9ee7071c-2fa9-4155-8edd-94ed371f1750_workspace",
                              model_adapter_id="905db818-d031-4378-bd67-ac9804cb0961_model_adapter",
                              max_generated_token_count=350)


In [None]:
rag_pipeline = Pipeline()

rag_pipeline.add_component(instance=text_embedder, name="text_embedder")
rag_pipeline.add_component(instance=retriever, name="retriever")
rag_pipeline.add_component(instance=prompt_builder, name="prompt_builder")
rag_pipeline.add_component(instance=generator, name="generator")

rag_pipeline.connect("text_embedder", "retriever")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "generator")

In [None]:
question = "What are the steps for creating a custom component?"
result = rag_pipeline.run(data={"text_embedder":{"text": question},
                                "prompt_builder":{"query": question}})

In [None]:
print(result['generator']['replies'][0])