# Query documents (RAG)

In [8]:
! pip install -qU wget openai requests

In [9]:
import os
import requests
from openai import OpenAI
import wget

In [17]:
# OpenAI client configuration
base_url = "https://albert.api.etalab.gouv.fr/v1"
api_key = os.getenv("ALBERT_API_KEY")


client = OpenAI(base_url=base_url, api_key=api_key)

# HTTP session for the endpoints that the OpenAI client does not cover
session = requests.Session()
session.headers.update({"Authorization": f"Bearer {api_key}"})

Let's start by uploading the document we want to query. This document can be a PDF, an HTML file, or a JSON file.

In [18]:
# Download the file if it is not already present locally
file_path = "/Users/acor/marker_v2/albert-api/docs/tutorials/IA107.pdf"
if not os.path.exists(file_path):
    doc_url = "https://lafrenchtech.gouv.fr/app/uploads/2024/05/20240521-MFT_CP_FTNext40_120_ENG_V3.pdf"
    wget.download(doc_url, out=file_path)


100% [..........................................................................] 1114051 / 1114051

To begin, we create a collection named tutorial. A collection is tied to an embedding model, so we first make a GET request to the /v1/models endpoint to retrieve the list of available models and choose the embedding model to use.

We will also need a language model to generate the answer. In the list returned by /v1/models, language models have the type text-generation, and embedding models have the type text-embeddings-inference.

In [19]:
language_model, embeddings_model = None, None

for model in client.models.list().data:
    if model.type == "text-generation" and language_model is None:
        language_model = model.id
    if model.type == "text-embeddings-inference" and embeddings_model is None:
        embeddings_model = model.id

print(f"language model: {language_model}\nembeddings model: {embeddings_model}")

language model: albert-small
embeddings model: embeddings-small


In [20]:
collection = "tutorial"

response = session.post(f"{base_url}/collections", json={"name": collection, "model": embeddings_model})
response = response.json()
collection_id = response["id"]
print(f"Collection ID: {collection_id}")

Collection ID: 849


Next, we import the document into the collection of our vector database using the POST /v1/files endpoint.

In [21]:
# Open the file in a context manager so the handle is closed after the upload
with open(file_path, "rb") as f:
    files = {"file": (os.path.basename(file_path), f, "application/pdf")}
    data = {"request": '{"collection": "%s"}' % collection_id}
    response = session.post(f"{base_url}/files", data=data, files=files)
assert response.status_code == 201
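
The `request` form field in the cell above is a JSON string assembled with `%` formatting. As a minimal sketch, the same payload can be built with `json.dumps`, which takes care of quoting and escaping. The helper name `build_upload_form` is hypothetical and not part of the API; the field names are the ones used above.

```python
import json


def build_upload_form(collection_id):
    """Return the form fields sent alongside the file (hypothetical helper)."""
    # The collection ID is sent as a string, as in the cell above.
    return {"request": json.dumps({"collection": str(collection_id)})}


form = build_upload_form(849)
print(form["request"])
```

The resulting dictionary can be passed as `data=` to `session.post`, exactly like the hand-written string.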

We can check that the file we imported is indeed in the collection using the GET /v1/collections/{collection_id} endpoint.

In [24]:
response = session.get(f"{base_url}/collections/{collection_id}")


In [27]:
print(f"Number of files in collection: {response.json()['documents']}")

Number of files in collection: 1


Now that we have our collection and our file, we can perform a vector search using the POST /v1/search endpoint. These vector search results will be used to generate a response using the language model.

In [28]:
prompt = "What is The French Tech Mission ?"

## Semantic (default method)
The semantic method is based on vector similarity (cosine similarity) between the question and the vector representation of the documents.
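
To make the idea of cosine similarity concrete, here is a minimal, self-contained sketch that ranks a few chunks against a question. The three-dimensional vectors are made up for illustration; real embeddings come from the embedding model and have many more dimensions, and the API computes the ranking server-side.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (made up for illustration, not produced by a model).
toy_question = [1.0, 0.0, 1.0]
toy_chunks = {
    "chunk_a": [1.0, 0.0, 1.0],  # same direction as the question
    "chunk_b": [0.0, 1.0, 0.0],  # orthogonal to the question
    "chunk_c": [1.0, 1.0, 0.0],  # partially aligned
}

# Rank chunk names from most to least similar to the question.
ranked = sorted(
    toy_chunks,
    key=lambda name: cosine_similarity(toy_question, toy_chunks[name]),
    reverse=True,
)
print(ranked)
```

The `k` parameter of the search request plays the role of keeping only the first `k` entries of such a ranking.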

In [34]:
prompt = "What is The French Tech Mission ?"
data = {"collections": [collection_id], "k": 6, "prompt": prompt, "method": "semantic"}
response = session.post(url=f"{base_url}/search", json=data)

prompt_template = "Answer following question using available documents: {prompt}\n\nDocuments :\n\n{chunks}"
chunks = "\n\n\n".join([result["chunk"]["content"] for result in response.json()["data"]])
sources = set([result["chunk"]["metadata"]["document_name"] for result in response.json()["data"]])
# Keep the original question in `prompt`: it is reused by the searches below
rag_prompt = prompt_template.format(prompt=prompt, chunks=chunks)

response = client.chat.completions.create(
    messages=[{"role": "user", "content": rag_prompt}],
    model=language_model,
    stream=False,
    n=1,
)

response = response.choices[0].message.content
print(response)

The French Tech Mission is a government initiative that supports the French startup ecosystem. It assists the most mature startups through various programs, such as the French Tech Next40/120 program and French Tech 2030 program. 

The mission is part of the Ministry for the Economy, Finances and Industrial and Digital Sovereignty. It aims to promote French Tech ecosystems nationwide, reduce funding gaps, and support the development of a competitive and efficient market.
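
The prompt-assembly step in the cell above can be factored into a small function, which makes it easy to test without calling the API. This is only a sketch: `build_rag_prompt` is a hypothetical helper, and the fake results merely mimic the `chunk` / `content` / `metadata` shape read in the cell above.

```python
PROMPT_TEMPLATE = (
    "Answer following question using available documents: {prompt}"
    "\n\nDocuments :\n\n{chunks}"
)


def build_rag_prompt(question, results):
    """Format search results into one prompt and list the distinct sources."""
    chunks = "\n\n\n".join(r["chunk"]["content"] for r in results)
    sources = sorted({r["chunk"]["metadata"]["document_name"] for r in results})
    return PROMPT_TEMPLATE.format(prompt=question, chunks=chunks), sources


# Fake search results (illustrative only, not real API output).
fake_results = [
    {"chunk": {"content": "First passage.", "metadata": {"document_name": "a.pdf"}}},
    {"chunk": {"content": "Second passage.", "metadata": {"document_name": "a.pdf"}}},
]

demo_prompt, demo_sources = build_rag_prompt("What is X?", fake_results)
print(demo_prompt)
print(demo_sources)
```

Because the function returns a new string, the original question stays available for later searches.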


## Internet Search

You can also add an internet search by setting *web_search* to true in the request body. The web results are then returned alongside the chunks retrieved from the collection.


In [None]:
data = {"collections": [collection_id], "web_search": True, "k": 6, "prompt": prompt}
response = session.post(url=f"{base_url}/search", json=data)

chunks = "\n\n\n".join([result["chunk"]["content"] for result in response.json()["data"]])
sources = set([result["chunk"]["metadata"]["document_name"] for result in response.json()["data"]])
rag_prompt = prompt_template.format(prompt=prompt, chunks=chunks)

response = client.chat.completions.create(
    messages=[{"role": "user", "content": rag_prompt}],
    model=language_model,
    stream=False,
    n=1,
)

response = response.choices[0].message.content
print(response)

The web pages used to generate the answer are listed in the sources variable.

In [7]:
for source in sources:
    print(source)

https://www.lefigaro.fr/conjoncture/ulrich-tan-cet-ingenieur-qui-introduit-l-ia-dans-les-administrations-pour-les-rendre-plus-efficaces-20240422
https://www.etalab.gouv.fr/datalab/equipe/
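
When a search mixes web results and uploaded documents, it can help to separate the two kinds of sources. The sketch below assumes that web results carry their URL in `document_name` (as in the output above) and that uploaded documents keep their file name; the second point is an assumption, and `is_web_source` is a hypothetical helper.

```python
from urllib.parse import urlparse


def is_web_source(name):
    """True when a source name looks like an http(s) URL."""
    parsed = urlparse(name)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Illustrative source names: one URL from the output above, one file name.
demo_sources = {"https://www.etalab.gouv.fr/datalab/equipe/", "IA107.pdf"}

web_pages = sorted(s for s in demo_sources if is_web_source(s))
uploaded_files = sorted(s for s in demo_sources if not is_web_source(s))
print(web_pages)
print(uploaded_files)
```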


## Search with `/chat/completions`

The `/chat/completions` endpoint also provides a RAG feature. To use it, set `search` to `True` and pass the search parameters in `search_args`, as shown below:

In [47]:
response = session.post(
    url=f"{base_url}/chat/completions",
    json={
        "messages": [{"role": "user", "content": prompt}],
        "model": language_model,
        "stream": False,
        "n": 1,
        "search": True,
        "search_args": {"collections": [collection_id], "k": 6, "method": "semantic"},
    },
)
response = response.json()

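sources = [result["chunk"]["content"] for result in response["search_results"]]
# Join outside the f-string: backslashes in f-string expressions fail before Python 3.12
sources_text = "\n".join(sources)

print(f"""- Model answer: {response['choices'][0]['message']['content']}

- Sources used:

{sources_text}""")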

- Model answer: The French Tech Mission est une mission du gouvernement français, rattachée au Ministère de l'Économie, des Finances et de l'Industrie et de la Souveraineté numérique. Son objectif est de soutenir l'écosystème entrepreneurial français, en particulier les startups les plus matures.

La mission offre plusieurs programmes pour accompagner les startups, notamment :

* Le programme French Tech Next40/120, qui sélectionne les 120 startups les plus avancées en France, en fonction de critères objectifs de performance économique.
* Le programme French Tech 2030, qui vise à soutenir les startups opérant dans des secteurs identifiés comme stratégiques dans le plan "France 2030".
* Les initiatives French Tech Tremplin et French Tech Rise, qui visent à permettre à des individus sans antécédents entrepreneuriaux de créer leur propre startup dans toute la France.

La mission vise également à promouvoir la reconnaissance internationale de l'écosystème français, en particulier des start
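
The response handling above can be sketched as a small function that works on the decoded JSON body. The keys `choices` and `search_results` are the ones read in the cell above; treating `search_results` as optional is an assumption, and both `summarize_rag_response` and the payload are illustrative rather than real API output.

```python
def summarize_rag_response(payload):
    """Return (answer, source_chunks) from a decoded /chat/completions body."""
    answer = payload["choices"][0]["message"]["content"]
    # Assumption: fall back to an empty list if no search results are attached.
    chunks = [r["chunk"]["content"] for r in payload.get("search_results", [])]
    return answer, chunks


# Fake payload that mimics the fields used above (not real API output).
fake_payload = {
    "choices": [{"message": {"content": "Example answer."}}],
    "search_results": [
        {"chunk": {"content": "Passage one."}},
        {"chunk": {"content": "Passage two."}},
    ],
}

answer, source_chunks = summarize_rag_response(fake_payload)
print(f"- Model answer: {answer}")
print("- Sources used:")
print("\n".join(source_chunks))
```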