# Multilingual RAG: Using Gemini for question answering on private data

In this notebook, we build a retrieval-augmented generation (RAG) system around [Google's Gemini](https://gemini.google.com/app) model. We generate vectors with the [E5](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-e5.html) model and store them in Elasticsearch. We then retrieve passages with semantic search and pass the top results to Gemini as the context for its answer.

## Setup

**Elastic Credentials** - Create an [Elastic Cloud deployment](https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud) to obtain your Elastic credentials (`ELASTIC_CLOUD_ID`, `ELASTIC_API_KEY`).

**Google Credentials** - To use the Gemini API, you need to [create an API key in Google AI Studio](https://ai.google.dev/tutorials/setup) (`GOOGLE_API_KEY`).

## Install packages

In [None]:
!pip install -q -U elasticsearch langchain langchain-elasticsearch langchain_community langchain-google-genai langchain-text-splitters

## Import packages

In [None]:
import json
import os
from getpass import getpass
from urllib.request import urlopen

# Only the imports this notebook actually uses; the unused Hugging Face and
# transformers imports (and a duplicated AutoTokenizer import) are dropped.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_elasticsearch import ElasticsearchStore
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_text_splitters import RecursiveCharacterTextSplitter

## Get Credentials

In [None]:
os.environ["GOOGLE_API_KEY"] = getpass("Google API Key :")
ELASTIC_API_KEY = getpass("Elastic API Key :")
ELASTIC_CLOUD_ID = getpass("Elastic Cloud ID :")
ELASTIC_INDEX_NAME = "multi-lang-rag"
ELASTIC_DEPLOYED_MODEL_ID = ".multilingual-e5-small_linux-x86_64"

## Add documents

### Download the sample dataset and deserialize the documents

In [None]:
url = "https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/datasets/workplace-documents.json"

response = urlopen(url)

workplace_docs = json.loads(response.read())

### Split Documents into Passages

In [None]:
metadata = []
content = []

for doc in workplace_docs:
    content.append(doc["content"])
    metadata.append(
        {
            "name": doc["name"],
            "summary": doc["summary"],
            "rolePermissions": doc["rolePermissions"],
        }
    )

# Split each document into passages of roughly 500 characters with no overlap.
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=0, separators=[" ", ",", "\n"]
)
docs = text_splitter.create_documents(content, metadatas=metadata)
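
To make the effect of `chunk_size` more concrete, here is a minimal pure-Python sketch of greedy chunk packing on whitespace. It is an illustration only, not LangChain's actual algorithm (which tries several separators recursively and supports overlap); the helper name `pack_words` and the sample sentence are hypothetical.

```python
def pack_words(text: str, chunk_size: int) -> list[str]:
    """Greedily pack whitespace-separated words into chunks.

    Hypothetical helper, not LangChain's implementation. Each chunk holds at
    most chunk_size characters, except when a single word is longer than
    chunk_size, in which case that word becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for word in text.split():
        candidate = word if not current else f"{current} {word}"
        if len(candidate) <= chunk_size or not current:
            # The word still fits, or it starts a new (possibly oversized) chunk.
            current = candidate
        else:
            # The word does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


sample = "Employees may work remotely up to three days per week"
chunks = pack_words(sample, chunk_size=20)
for chunk in chunks:
    print(len(chunk), repr(chunk))
```

With `chunk_overlap=0`, as in the cell above, joining the chunks with a space reproduces the original text; a non-zero overlap would repeat trailing text at the start of the next chunk.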

## Index Documents into Elasticsearch using E5

Before indexing, make sure you have [downloaded and deployed the E5 model](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-e5.html) in your deployment and that it is running on an ML node.

In [None]:
db = ElasticsearchStore(
    es_cloud_id=ELASTIC_CLOUD_ID,
    es_api_key=ELASTIC_API_KEY,
    index_name=ELASTIC_INDEX_NAME,
    query_field="text_field",
    vector_query_field="vector_query_field.predicted_value",
    strategy=ElasticsearchStore.ApproxRetrievalStrategy(
        query_model_id=ELASTIC_DEPLOYED_MODEL_ID
    ),
)

db

### Setup Ingest Pipeline

In [None]:
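db.client.ingest.put_pipeline(
    id="multi-lang-pipeline",
    processors=[
        {
            "inference": {
                "model_id": ELASTIC_DEPLOYED_MODEL_ID,
                # field_map maps a document field (key) to the input name the
                # model expects (value). The passages are stored in "text_field".
                "field_map": {"text_field": "text_field"},
                # The embedding is written to vector_query_field.predicted_value.
                "target_field": "vector_query_field",
            }
        }
    ],
)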

### Create an Index

In [None]:
db.client.indices.create(
    index=ELASTIC_INDEX_NAME,
    mappings={
        "properties": {
            "text_field": {"type": "text"},
            "vector_query_field": {
                "properties": {
                    "predicted_value": {
                        "type": "dense_vector",
                        "dims": 384,
                        "index": True,
                        "similarity": "l2_norm",
                    }
                }
            },
        }
    },
    settings={"index": {"default_pipeline": "multi-lang-pipeline"}},
)
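
The mapping above sets `"similarity": "l2_norm"`. As a rough illustration of what that choice means for ranking, the sketch below computes the relevance score as described in the Elasticsearch `dense_vector` documentation, `1 / (1 + l2_norm(query, vector)^2)`, next to the `cosine` alternative, `(1 + cosine(query, vector)) / 2`. The function names and the 2-dimensional toy vectors are assumptions for readability; real E5 vectors here have 384 dimensions.

```python
import math


def l2_norm_score(query, vector):
    """Score for similarity='l2_norm': 1 / (1 + squared Euclidean distance)."""
    squared_distance = sum((q - v) ** 2 for q, v in zip(query, vector))
    return 1.0 / (1.0 + squared_distance)


def cosine_score(query, vector):
    """Score for similarity='cosine': (1 + cosine similarity) / 2."""
    dot = sum(q * v for q, v in zip(query, vector))
    norms = math.sqrt(sum(q * q for q in query)) * math.sqrt(sum(v * v for v in vector))
    return (1.0 + dot / norms) / 2.0


# Toy vectors (assumed values, for illustration only).
query = [1.0, 0.0]
identical = [1.0, 0.0]
orthogonal = [0.0, 1.0]

print("l2_norm:", l2_norm_score(query, identical), l2_norm_score(query, orthogonal))
print("cosine: ", cosine_score(query, identical), cosine_score(query, orthogonal))
```

Both functions give identical vectors the highest score; they differ in how quickly the score falls off as vectors move apart.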

### Insert documents

In [None]:
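# from_documents is a classmethod: call it on the class and keep the returned store.
db = ElasticsearchStore.from_documents(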
    docs,
    es_cloud_id=ELASTIC_CLOUD_ID,
    es_api_key=ELASTIC_API_KEY,
    index_name=ELASTIC_INDEX_NAME,
    query_field="text_field",
    vector_query_field="vector_query_field.predicted_value",
    strategy=ElasticsearchStore.ApproxRetrievalStrategy(
        query_model_id=ELASTIC_DEPLOYED_MODEL_ID
    ),
)

db

## Multilingual Search

In [None]:
db.similarity_search(
    "हमारी कंपनी की बिक्री संरचना कैसी है?", k=5
)  # Asking in Hindi - How is the sales structure of our company?
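
The raw result list can be hard to scan. Below is a small sketch of a helper that reduces each hit to its source document name and a short preview. It assumes only that each hit exposes `page_content` and a `metadata` dict containing the `name` key set during splitting; the helper name `summarize_hits` and the stand-in hits are hypothetical, used here so the example runs without an Elasticsearch connection.

```python
from types import SimpleNamespace


def summarize_hits(hits, preview_chars=40):
    """Return (document name, one-line preview) pairs for a list of hits."""
    rows = []
    for hit in hits:
        name = hit.metadata.get("name", "<unnamed>")
        preview = hit.page_content[:preview_chars].replace("\n", " ")
        rows.append((name, preview))
    return rows


# Stand-in objects shaped like LangChain documents (made-up content).
fake_hits = [
    SimpleNamespace(
        page_content="Sales teams are\norganized by region.",
        metadata={"name": "Sales Organization Overview"},
    ),
    SimpleNamespace(page_content="Untitled passage.", metadata={}),
]

rows = summarize_hits(fake_hits)
for name, preview in rows:
    print(f"{name}: {preview}")
```

In the notebook, the same helper could be applied to the list returned by `db.similarity_search(...)`.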

## Format Docs

In [None]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
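
As a quick check of what `format_docs` produces, the sketch below runs it on two stand-in documents. `FakeDoc` is a hypothetical placeholder that carries only the `page_content` attribute the function reads; the passages are made up.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str


def format_docs(docs):
    # Same logic as the cell above: join passages with a blank line between them.
    return "\n\n".join(doc.page_content for doc in docs)


context = format_docs([FakeDoc("First passage."), FakeDoc("Second passage.")])
print(context)
```

The blank line between passages keeps them visually separate inside the prompt's `{context}` slot.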

## Create a Chain using Prompt Template + `gemini-pro` model

In [None]:
retriever = db.as_retriever(search_kwargs={"k": 5})

template = """Answer the question based only on the following context. Detect the language of the question and answer in detail in the same language.

{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)


chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.2)
    | StrOutputParser()
)
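
The chain above can be read as four steps: retrieve passages, format them, fill the prompt, and call the model. The sketch below mirrors that data flow with plain functions so it runs without Elasticsearch or a Google API key. `fake_retriever`, `fake_llm`, `run_chain`, and the passages are all hypothetical stand-ins, not the LangChain or Gemini APIs.

```python
TEMPLATE = """Answer the question based only on the following context.

{context}

Question: {question}
"""


def fake_retriever(question):
    """Stand-in for the retriever: always returns the same made-up passages."""
    return ["Office days are Tuesday and Thursday.", "Remote work needs manager approval."]


def format_passages(passages):
    """Join passages with a blank line, like format_docs does for documents."""
    return "\n\n".join(passages)


def fake_llm(prompt_text):
    """Stand-in for the model call: reports how much text it received."""
    return f"[model received {len(prompt_text)} characters]"


def run_chain(question):
    # The question is used twice: once to retrieve context, once in the prompt.
    context = format_passages(fake_retriever(question))
    prompt_text = TEMPLATE.format(context=context, question=question)
    return prompt_text, fake_llm(prompt_text)


prompt_text, answer = run_chain("When do I need to be in the office?")
print(prompt_text)
print(answer)
```

In the real chain, the dictionary `{"context": retriever | format_docs, "question": RunnablePassthrough()}` plays the role of the first two lines of `run_chain`: the same input question feeds both the retrieval branch and the `question` slot.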

In [None]:
# Other sample questions to try:
# What are the sales goals for 2023?
# When do I have to come to the office, and why?
# How will leave be calculated?
# मैं कब से ऑफिस जा सकता हूं कब से जा सकता हूं (Hindi: roughly, from when can I go to the office?)
# বিক্রয় কৌশল কি (Bengali: what is the sales strategy?)
# Explain detailed onboarding steps in Hindi
# jak funguje kompenzace? Řekni mi v angličtině (Czech: how does compensation work? Tell me in English)


chain.invoke("When I have to come to the office and why?")