## Initialize your OpenSearchDocumentStore

You can use the `opensearch_indexing_pipeline.ipynb` notebook to index the example files into your own `OpenSearchDocumentStore`, running either locally or on AWS. Once the DocumentStore is running, connect to it in the cell below by setting `host`, `port`, `username`, and `password` to match your setup.

In [None]:
from haystack.document_stores import OpenSearchDocumentStore

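# embedding_dim must match the output size of the retriever's embedding model
# (384 for sentence-transformers/all-MiniLM-L12-v2, used below).
doc_store = OpenSearchDocumentStore(host="localhost", port=9200, username="admin", password="admin", embedding_dim=384)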
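
If you would rather not hard-code credentials in the notebook, one option is to read them from environment variables. The sketch below is only an illustration: the helper `opensearch_settings` and the `OPENSEARCH_*` variable names are hypothetical and not something Haystack or OpenSearch defines. The fallbacks are the local defaults used in the cell above.

```python
import os


def opensearch_settings(env=os.environ):
    """Collect connection settings, falling back to the local defaults from this notebook.

    The OPENSEARCH_* variable names are an assumption made for this example.
    """
    return {
        "host": env.get("OPENSEARCH_HOST", "localhost"),
        "port": int(env.get("OPENSEARCH_PORT", "9200")),  # env values are strings
        "username": env.get("OPENSEARCH_USERNAME", "admin"),
        "password": env.get("OPENSEARCH_PASSWORD", "admin"),
    }


settings = opensearch_settings()
# The resulting dict could then be unpacked into the constructor, for example:
# doc_store = OpenSearchDocumentStore(**settings, embedding_dim=384)
```

This keeps secrets out of the saved notebook, which matters most when the DocumentStore runs on AWS rather than on `localhost`.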

## Initialize a PromptNode with your SageMaker Endpoint Credentials

Once you've deployed your model on SageMaker, provide the endpoint name in `model_name_or_path` and your AWS settings in `profile_name` and `region_name`.

In [66]:
from haystack.nodes import AnswerParser, EmbeddingRetriever, PromptNode, PromptTemplate

question_answering_with_references = PromptTemplate("deepset/question-answering-with-references", output_parser=AnswerParser(reference_pattern=r"Document\[(\d+)\]"))

# Replace the placeholders with your SageMaker endpoint name, AWS profile, and region.
gen_qa_with_references = PromptNode(
    default_prompt_template=question_answering_with_references,
    model_name_or_path="YOUR_FALCON_40B_INSTRUCT_ENDPOINT",
    model_kwargs={"profile_name": "YOUR_PROFILE", "region_name": "YOUR_REGION"},
)
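
To see what the `reference_pattern` passed to `AnswerParser` matches, you can try the same regular expression on its own with Python's `re` module. This is a minimal sketch of the regex only, not of how `AnswerParser` uses the matches internally. The helper `cited_document_numbers` and the sample sentence are made up for illustration.

```python
import re

# Same pattern as the one given to AnswerParser above.
REFERENCE_PATTERN = r"Document\[(\d+)\]"


def cited_document_numbers(answer_text):
    """Return the distinct document numbers cited in the text, in order of first appearance."""
    seen = []
    for match in re.findall(REFERENCE_PATTERN, answer_text):
        number = int(match)
        if number not in seen:
            seen.append(number)
    return seen


# Hypothetical model output that follows the Document[number] citation style.
sample = "OpenSearch builds on Apache Lucene, as stated in Document[1] and Document[3]. See also Document[1]."
print(cited_document_numbers(sample))  # [1, 3]
```

The capture group `(\d+)` keeps only the number inside the brackets, so text that does not follow the `Document[number]` notation yields no matches.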


In [16]:
retriever = EmbeddingRetriever(document_store=doc_store, embedding_model="sentence-transformers/all-MiniLM-L12-v2", devices=["mps"], top_k=5)



## Create a retrieval-augmented QA pipeline

In [67]:
from haystack import Pipeline

pipe = Pipeline()
pipe.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipe.add_node(component=gen_qa_with_references, name="GenQAWithRefPromptNode", inputs=["Retriever"])
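
Conceptually, the Retriever node ranks the indexed documents by how similar their embeddings are to the query embedding and hands the `top_k` best ones to the PromptNode. The toy sketch below shows that ranking step with plain Python lists and cosine similarity. It is not Haystack's implementation: the similarity function your document store uses may differ, and the two-dimensional vectors and the `cosine` and `top_k` helpers are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=5):
    """Return the ids of the k documents most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy 2-dimensional "embeddings"; the real ones in this notebook have 384 dimensions.
toy_docs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
print(top_k([1.0, 0.1], toy_docs, k=2))  # ['a', 'b']
```

A larger `top_k` gives the PromptNode more context to cite from, at the cost of a longer prompt.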

In [75]:
from haystack.utils import print_answers

result = pipe.run("What is opensearch?", params={"Retriever":{"top_k": 5}})

Batches: 100%|██████████| 1/1 [00:00<00:00,  2.78it/s]


In [76]:
print_answers(results=result)

'Query: What is opensearch?'
'Answers:'
[   <Answer {'answer': '\nOpenSearch is an open-source project built on top of Apache Lucene, a powerful indexing and search library. It is used for distributed search and analytics, and it can scale up and down as the needs of the application grow or shrink. OpenSearch also serves as a user interface for many of the OpenSearch plugins, including security, alerting, Index State Management, SQL, and more.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_ids': [], 'meta': {'prompt': 'Create a concise and informative answer (no more than 50 words) for a given question \nbased solely on the given documents. You must only use information from the given documents. \nUse an unbiased and journalistic tone. Do not repeat text. Cite the documents using Document[number] notation. \nIf multiple documents contain the answer, cite those documents like ‘as stated in Document[number], Docu