# Building RAG with Elastic and [Mistral](https://docs.mistral.ai/getting-started/quickstart/)

This notebook is a hands-on demonstration of how to build a multilingual RAG (retrieval-augmented generation) system. The steps are taken from the article ["Building RAG with Elastic and Mistral"](https://www.elastic.co/search-labs/blog/building-multilingual-rag-with-elastic-and-mistral).


## Install Packages and Import Necessary Modules


In [None]:
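# install packages
# mistralai is pinned to the 0.4.x line because MistralClient and ChatMessage
# (imported below) are not available in the 1.x client
!python3 -m pip install elasticsearch==8.14 mistralai==0.4.2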

# import modules
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage
from elasticsearch import Elasticsearch, exceptions
from elasticsearch.helpers import bulk
from getpass import getpass
import json

Collecting elasticsearch==8.14
  Downloading elasticsearch-8.14.0-py3-none-any.whl (480 kB)
Collecting mistralai
  Downloading mistralai-0.4.2-py3-none-any.whl (20 kB)
Collecting elastic-transport<9,>=8.13 (from elasticsearch==8.14)
  Downloading elastic_transport-8.13.1-py3-none-any.whl (64 kB)
Collecting httpx<1,>=0.25 (from mistralai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting orjson<3.11,>=3.9.10 (from mistralai)
  Downloading orjson-3.10.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (141 kB)

## Declaring Variables

The cell below prompts for your credentials without echoing them to the screen.
To find your Elasticsearch credentials, see [Finding Your Cloud ID](https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id).
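
For non-interactive runs, one common pattern is to read each credential from an environment variable and only fall back to a prompt when it is missing. The sketch below is an assumption and not part of the original article; the helper name `read_secret` and any environment variable names you pass to it are hypothetical.

```python
import os
from getpass import getpass


def read_secret(env_var, prompt):
    """Return the value of env_var if it is set and non-empty, otherwise prompt for it."""
    value = os.environ.get(env_var)
    if value:
        return value
    # Fall back to an interactive prompt that does not echo the input
    return getpass(prompt)
```

With this helper, a line such as `ELASTIC_API_KEY = read_secret("ELASTIC_API_KEY", "Elastic Api Key: ")` could replace the direct `getpass` call.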


In [None]:
ELASTIC_CLUSTER_ID = getpass("Elastic Cloud ID: ")
ELASTIC_API_KEY = getpass("Elastic Api Key: ")
MISTRAL_API_KEY = getpass("MISTRAL Api key: ")

Elastic Cloud ID: ··········
Elastic Api Key: ··········
MISTRAL Api key: ··········


## Instantiate an Elasticsearch client


In [None]:
# Create the client instance
es_client = Elasticsearch(
    cloud_id=ELASTIC_CLUSTER_ID,
    api_key=ELASTIC_API_KEY,
)
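
Creating the client does not contact the cluster, so a wrong Cloud ID or API key only surfaces on the first request. A minimal sketch of an early check, assuming a helper named `cluster_version` (hypothetical, not from the article) that wraps the client's `info()` call:

```python
def cluster_version(client):
    """Return the Elasticsearch version string reported by the cluster."""
    # client.info() requests the cluster's root endpoint; it raises an
    # exception if the cluster is unreachable or the credentials are rejected.
    info = client.info()
    return info["version"]["number"]
```

Calling `cluster_version(es_client)` right after creating the client should return the version of the deployment, which is a quick way to confirm the credentials before running the remaining cells.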

## Creating embeddings endpoint


In [None]:
try:
    es_client.options(
        request_timeout=60, max_retries=3, retry_on_timeout=True
    ).inference.put_model(
        task_type="text_embedding",
        inference_id="multilingual_embeddings",
        body={
            "service": "elasticsearch",
            "service_settings": {
                "model_id": ".multilingual-e5-small",
                "num_allocations": 1,
                "num_threads": 1,
            },
        },
    )

    print("Embedding endpoint created successfully.")
except exceptions.BadRequestError as e:
    if e.error == "resource_already_exists_exception":
        print("Embedding endpoint already created.")
    else:
        raise e

Embedding endpoint created successfully.


## Creating Mappings


In [None]:
try:
    es_client.indices.create(
        index="multilingual-mistral",
        body={
            "mappings": {
                "properties": {
                    "super_body": {
                        "type": "semantic_text",
                        "inference_id": "multilingual_embeddings",
                    }
                }
            }
        },
    )
except exceptions.RequestError as e:
    if e.error == "resource_already_exists_exception":
        print("Index already exists.")
    else:
        raise e

## Indexing documents


In [None]:
# Support tickets to add to the index
support_tickets = [
    """
        _Support Ticket #EN1234_
        **Subject**: Calendar sync not working with Google Calendar

        **Description**:
        I'm having trouble syncing my project deadlines with Google Calendar. Whenever I try to sync, I get an error message saying "Unable to connect to external calendar service."

        **Resolution**:
        The issue was resolved by following these steps:
        1. Go to Settings > Integrations
        2. Disconnect the Google Calendar integration
        3. Clear browser cache and cookies
        4. Reconnect the Google Calendar integration
        5. Authorize the app again in Google's security settings

        The sync should now work correctly. If problems persist, ensure that third-party cookies are enabled in your browser settings.
    """,
    """
        _Support-Ticket #DE5678_
        **Betreff**: Datei-Upload funktioniert nicht

        **Beschreibung**:
        Ich kann keine Dateien mehr in meine Projekte hochladen. Jedes Mal, wenn ich es versuche, bleibt der Ladebalken bei 99% stehen und dann erscheint eine Fehlermeldung.

        **Lösung**:
        Das Problem wurde durch folgende Schritte gelöst:
        1. Überprüfen Sie die Dateigröße. Die maximale Uploadgröße beträgt 100 MB.
        2. Deaktivieren Sie vorübergehend den Virenschutz oder die Firewall.
        3. Versuchen Sie, die Datei im Inkognito-Modus hochzuladen.
        4. Wenn das nicht funktioniert, leeren Sie den Browser-Cache und die Cookies.
        5. Als letzten Ausweg, versuchen Sie einen anderen Browser zu verwenden.

        In den meisten Fällen lag das Problem an zu großen Dateien oder an Interferenzen durch Sicherheitssoftware. Nach Anwendung dieser Schritte sollte der Upload funktionieren.
    """,
    """
        _Q3 Marketing Campaign Ideas_

        1. Social media contest: "Share Your Productivity Hack"
        - Users share tips using our software, best entry wins a premium subscription

        2. Webinar series: "Mastering Project Management"
        - Invite industry experts to share insights using our tool

        3. Email campaign: "Unlock Hidden Features"
        - Series of emails highlighting lesser-known but powerful features

        4. Partner with a productivity podcast for sponsored content

        5. Create a "Project Management Memes" social media account for lighter, shareable content
    """,
    """
        _Mitarbeiter des Monats: Juli 2023_

        Wir freuen uns, bekannt zu geben, dass Sarah Schmidt zur Mitarbeiterin des Monats Juli gewählt wurde!

        Sarah hat außergewöhnliche Leistungen in folgenden Bereichen gezeigt:
        - Kundenbetreuung: Sarah hat durchschnittlich 95% positive Bewertungen erhalten.
        - Teamarbeit: Sie hat maßgeblich zur Verbesserung unseres internen Wissensmanagementsystems beigetragen.
        - Innovation: Sarah hat eine neue Methode zur Priorisierung von Support-Tickets vorgeschlagen, die unsere Reaktionszeiten um 20% verbessert hat.

        Bitte gratulieren Sie Sarah zu dieser wohlverdienten Anerkennung!
    """,
]

In [None]:
# Build a bulk action for the given document id and body
# (the parameter is named doc_id to avoid shadowing the built-in id())
def build_bulk_obj(doc_id, body):
    return {
        "_index": "multilingual-mistral",
        "_id": doc_id,
        "_source": {"super_body": body},
    }

In [None]:
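from elasticsearch.helpers import BulkIndexError

data = []

# Constructing a bulk action for each document
for i, details in enumerate(support_tickets):
    data.append(build_bulk_obj(i + 1, details))

try:
    # Using the bulk API to index the data
    bulk(es_client, data)
    print("Data indexed successfully.")
except BulkIndexError as e:
    # bulk() raises BulkIndexError when individual documents are rejected
    print("Error indexing data.")
    print(e.errors)
except exceptions.ApiError as e:
    # Any other error returned by the Elasticsearch API
    print("Error indexing data.")
    print(e)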

Data indexed successfully.


## Retrieving documents
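
The search response in this section reports `"similarity": "cosine"` and 384 dimensions for the embeddings. Elasticsearch performs the vector comparison internally; the toy sketch below only illustrates the idea of ranking documents by cosine similarity. The three-dimensional vectors, the document ids, and the function names are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs, size=2):
    """Return the ids of the `size` documents most similar to the query."""
    scored = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:size]]


# Made-up 3-dimensional vectors; the real model produces 384 dimensions.
toy_docs = {
    "calendar_ticket": [0.9, 0.1, 0.0],
    "upload_ticket": [0.7, 0.6, 0.1],
    "marketing_ideas": [0.0, 0.2, 0.9],
}
toy_query = [0.8, 0.4, 0.0]

top_ids = rank_by_similarity(toy_query, toy_docs)
```

In this toy setup the two ticket vectors point in nearly the same direction as the query, so they are returned ahead of the unrelated marketing document, which mirrors what the `size: 2` semantic query is expected to do.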


In [None]:
response = es_client.search(
    index="multilingual-mistral",
    body={
        "size": 2,
        "_source": {"excludes": ["*embeddings", "*chunks"]},
        "query": {
            "semantic": {
                "field": "super_body",
                "query": "Hola, estoy teniendo problemas para ocupar su aplicación, estoy teniendo problemas para sincronizar mi calendario, y encima al intentar subir un archivo me da error.",
            }
        },
    },
)

# Print results
formatted_json = json.dumps(response.body, indent=4)

print(formatted_json)

{
    "took": 48,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 3,
            "relation": "eq"
        },
        "max_score": 0.9155389,
        "hits": [
            {
                "_index": "multilingual-mistral",
                "_id": "1",
                "_score": 0.9155389,
                "_source": {
                    "super_body": {
                        "inference": {
                            "inference_id": "multilingual-embeddings",
                            "model_settings": {
                                "similarity": "cosine",
                                "element_type": "float",
                                "task_type": "text_embedding",
                                "dimensions": 384
                            }
                        },
                        "text": "\n        _Support Ticket #EN123

## Answering the question

Now we will use Mistral to answer the question.


In [None]:
# Collect the text of each document retrieved from Elasticsearch to use as context
elastic_context = []

for r in response.body["hits"]["hits"]:
    elastic_context.append(r["_source"]["super_body"]["text"])

context_str = "\n".join(elastic_context)
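
The tickets were indexed as triple-quoted strings, so each retrieved text carries the leading indentation and blank lines of the source code. A possible variation, sketched below under the assumption that the response has the shape shown in the search output, removes that whitespace before the context goes into the prompt. The helper name `build_context` is hypothetical.

```python
import textwrap


def build_context(search_response):
    """Join the text of every hit into one block, without the source indentation."""
    texts = []
    for hit in search_response["hits"]["hits"]:
        raw = hit["_source"]["super_body"]["text"]
        # dedent removes the common leading whitespace; strip drops
        # the blank lines at both ends of the triple-quoted string.
        texts.append(textwrap.dedent(raw).strip())
    # Separate documents with a blank line so they stay distinguishable
    return "\n\n".join(texts)
```

With this helper, `context_str = build_context(response.body)` would produce the same content as the cell above with less whitespace noise in the prompt.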

In [None]:
client = MistralClient(api_key=MISTRAL_API_KEY)

system_message = "You are a helpful multilingual agent that helps users with their problems. You have access to a knowledge base in different languages, and you must answer in the same language the question was asked in."
user_message = f"""
    ## Question:

    Hola, estoy teniendo problemas para ocupar su aplicación, estoy teniendo problemas para sincronizar mi calendario, y encima al intentar subir un archivo me da error.

    ## Related knowledge:

    {context_str}
"""

messages = [
    ChatMessage(role="system", content=system_message),
    ChatMessage(role="user", content=user_message),
]

model = "open-mixtral-8x22b"

chat_response = client.chat(
    model=model,
    messages=messages,
)


    ## Question:

    Hola, estoy teniendo problemas para ocupar su aplicación, estoy teniendo problemas para sincronizar mi calendario, y encima al intentar subir un archivo me da error.

    ## Related knowledge:

    
        _Support Ticket #EN1234_
        **Subject**: Calendar sync not working with Google Calendar

        **Description**:
        I'm having trouble syncing my project deadlines with Google Calendar. Whenever I try to sync, I get an error message saying "Unable to connect to external calendar service."

        **Resolution**:
        The issue was resolved by following these steps:
        1. Go to Settings > Integrations
        2. Disconnect the Google Calendar integration
        3. Clear browser cache and cookies
        4. Reconnect the Google Calendar integration
        5. Authorize the app again in Google's security settings

        The sync should now work correctly. If problems persist, ensure that third-party cookies are enabled in your browser setti

The answer is on point!


In [None]:
print(chat_response.choices[0].message.content)

Hola, me alegra que te hayas comunicado con nosotros. Parece que hay dos problemas distintos.

En cuanto a la sincronización del calendario, puedes seguir estos pasos para resolver el problema:

1. Ve a Configuración > Integraciones
2. Desconecta la integración del Calendario de Google
3. Borra la caché y las cookies del navegador
4. Vuelve a conectar la integración del Calendario de Google
5. Autoriza de nuevo la aplicación en la configuración de seguridad de Google

Si sigues teniendo problemas, asegúrate de que las cookies de terceros están habilitadas en la configuración de tu navegador.

En cuanto al problema de subir un archivo, hay varias cosas que puedes probar:

1. Comprueba el tamaño del archivo. El tamaño máximo de carga es de 100 MB.
2. Desactiva temporalmente el antivirus o el cortafuegos.
3. Intenta cargar el archivo en modo incógnito.
4. Si eso no funciona, borra la caché y las cookies del navegador.
5. Como último recurso, prueba a usar un navegador diferente.

En la ma

## Deleting

Finally, we can delete the index and the embeddings endpoint so they stop consuming cluster resources.


In [None]:
# Cleanup - Delete Index
# options(ignore_status=...) replaces the deprecated per-request ignore parameter
es_client.options(ignore_status=[400, 404]).indices.delete(index="multilingual-mistral")

# Cleanup - Delete Embeddings Endpoint
es_client.options(ignore_status=[400, 404]).inference.delete_model(
    inference_id="multilingual_embeddings"
)


ObjectApiResponse({'acknowledged': True, 'pipelines': [], 'indexes': []})