# Using the API

First, **run the following command in a terminal** to start the API on a local server:

`uvicorn main:api --reload`

### Testing on the first URL: the Wikipedia page about Brazil

In [1]:
url_to_index = 'https://en.wikipedia.org/wiki/Brazil'
query1 = "What is the population of Brazil?"
query2 = "when was the Treaty of Tordesillas?"
query3 = "When did Pedro Álvares Cabral land in Brazil?"
query4 = "Who was Pedro Alvares Cabral?"
query5 = "How many states does Brazil have?"
query6 = "What is the capital of Brazil?"
followup_query11 = "and how big is its territory?"
followup_query12 = "and when was the first settlement established?"

url_to_index2 = "https://en.wikipedia.org/wiki/France"
query21 = "What is capital of France?"
followup_query21 = "and how many people live there?"
followup_query22 = "and who is the its current president?"
followup_query23 = "What have I just asked?"

In [2]:
import requests

BASE_URL = "http://127.0.0.1:8000"  # Make sure FastAPI is running
user_id = "defaultuser" 

In [3]:
# Test Indexing
index_response = requests.post(f"{BASE_URL}/index_url/", params={"url": url_to_index})

print(index_response.json())  # Should return "URL indexed successfully"

{'message': 'URL indexed successfully'}


In [4]:
# Test Asking
ask_response = requests.get(f"{BASE_URL}/ask/", params={"url": url_to_index, "question": query1})

print(ask_response.json())  # Should return the answer

{'answer': 'The population of Brazil was approximately 210.86 million on July 1, 2022.'}


In [5]:
# Test retrieval 
retrieval_response = requests.get(f"{BASE_URL}/get_retrieval_text_and_similarity/", params={"url": url_to_index, "question": query1})

retrieval = retrieval_response.json()  # Parse the JSON body once
print(retrieval['context'])  # Should return the best-matching paragraph
print(retrieval['cossine_similarity'])  # Its cosine similarity (key spelled as the API returns it)
print(retrieval['rerank_scores'])  # Its rerank score

According to the latest official projection, it is estimated that Brazil’s population was 210,862,983 on July 1, 2022—an adjustment of 3.9% from the initial figure of 203 million reported by the 2022 census.[354] The population of Brazil, as recorded by the 2008 PNAD, was approximately 190 million[355] (22.31 inhabitants per square kilometer or 57.8/sq mi), with a ratio of men to women of 0.95:1[356] and 83.75% of the population defined as urban.[357] The population is heavily concentrated in the Southeastern (79.8 million inhabitants) and Northeastern (53.5 million inhabitants) regions, while the two most extensive regions, the Center-West and the North, which together make up 64.12% of the Brazilian territory, have a total of only 29.1 million inhabitants.

[0.7869417]
[8.908005]
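For readers unfamiliar with the first score, the following is a minimal sketch of how a cosine similarity between two embedding vectors is computed. It is not the API's actual code: the embedding model and the reranker behind `rerank_scores` are not shown in this notebook, and the function name and toy vectors are made up for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 2-dimensional "embeddings" (hypothetical values, not real model output).
question_vec = [1.0, 0.0]
paragraph_vec = [1.0, 1.0]
print(cosine_similarity(question_vec, paragraph_vec))  # roughly 0.707
```

The value ranges from -1 to 1, and a higher value means the paragraph's embedding points in a direction closer to the question's.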


In [6]:
# Test chat
chat_response = requests.get(f"{BASE_URL}/chat/", params={"url": url_to_index, "question": query1,
                                                           "user_id": user_id})

print(chat_response.json())  # Should return the answer

{'answer': 'The population of Brazil was approximately 211 million in 2022.'}


In [7]:
# Test followup questions
followup11_response = requests.get(f"{BASE_URL}/chat/", params={"url": url_to_index, "question": followup_query11,
                                                           "user_id": user_id})
print(followup11_response.json())  # Should return the answer

{'answer': 'The territory of Brazil covers approximately 8.5 million square kilometers.'}


In [8]:
# Test followup questions
followup12_response = requests.get(f"{BASE_URL}/chat/", params={"url": url_to_index, "question": followup_query12,
                                                           "user_id": user_id})
print(followup12_response.json())  # Should return the answer

{'answer': 'The first settlement in Brazil was established in 1532.'}


In [9]:
# Test get chat history
chat_history_response = requests.get(f"{BASE_URL}/get_chat_history/", params={"user_id": user_id, "url": url_to_index})

print(chat_history_response.json()['chat_history'])  # Should return the chat history (only the last 10 messages are kept)

User:What is the population of Brazil?
Chatbot:The population of Brazil was approximately 211 million in 2022.
User:and how big is its territory?
Chatbot:The territory of Brazil covers approximately 8.5 million square kilometers.
User:and when was the first settlement established?
Chatbot:The first settlement in Brazil was established in 1532.
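The comment in the cell above notes that only the last 10 messages are kept. One common way to get that behaviour is a bounded `collections.deque`. The sketch below is an assumption about how such a window could work, not the server's actual implementation; `record_turn` and `MAX_MESSAGES` are hypothetical names, and whether "10 messages" counts lines or question/answer pairs is not stated in the notebook (lines are assumed here).

```python
from collections import deque

MAX_MESSAGES = 10  # assumed to count individual lines, not question/answer pairs

# A deque with maxlen silently drops the oldest entry when a new one is appended.
history = deque(maxlen=MAX_MESSAGES)


def record_turn(history: deque, question: str, answer: str) -> None:
    """Append one exchange using the same 'User:' / 'Chatbot:' prefixes as the output above."""
    history.append(f"User:{question}")
    history.append(f"Chatbot:{answer}")


# Seven exchanges produce 14 lines, so the 4 oldest lines are discarded.
for i in range(7):
    record_turn(history, f"question {i}", f"answer {i}")

print(len(history))  # 10
print(history[0])    # User:question 2
```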


### Testing the API on another URL with other questions

In [10]:
# Test Indexing
index_response = requests.post(f"{BASE_URL}/index_url/", params={"url": url_to_index2})

print(index_response.json())  # Should return "URL indexed successfully"

{'message': 'URL indexed successfully'}


In [11]:
# Test Asking
ask_response = requests.get(f"{BASE_URL}/ask/", params={"url": url_to_index2, "question": query21})

print(ask_response.json())  # Should return the answer

{'answer': 'Paris is the capital of France.'}


In [12]:
# Test retrieval 
retrieval_response = requests.get(f"{BASE_URL}/get_retrieval_text_and_similarity/", params={"url": url_to_index2, "question": query21})

retrieval = retrieval_response.json()  # Parse the JSON body once
print(retrieval['context'])  # Should return the best-matching paragraph
print(retrieval['cossine_similarity'])  # Its cosine similarity (key spelled as the API returns it)

France,[IX] officially the French Republic,[X] is a country located primarily in Western Europe. Its overseas regions and territories include French Guiana in South America, Saint Pierre and Miquelon in the North Atlantic, the French West Indies, and many islands in Oceania and the Indian Ocean, giving it one of the largest discontiguous exclusive economic zones in the world. Metropolitan France shares borders with Belgium and Luxembourg to the north, Germany to the northeast, Switzerland to the east, Italy and Monaco to the southeast, Andorra and Spain to the south, and a maritime border with the United Kingdom to the northwest. Its metropolitan area extends from the Rhine to the Atlantic Ocean and from the Mediterranean Sea to the English Channel and the North Sea. Its eighteen integral regions—five of which are overseas—span a combined area of 632,702 km2 (244,288 sq mi) and have an estimated total population of over 68.6 million as of January 2025[update]. France is a semi-presiden

In [13]:
# Test chat
chat_response = requests.get(f"{BASE_URL}/chat/", params={"url": url_to_index2, "question": query21,
                                                           "user_id": user_id})

print(chat_response.json())  # Should return the answer

{'answer': 'Paris is the capital of France.'}


In [14]:
# Test followup questions
followup21_response = requests.get(f"{BASE_URL}/chat/", params={"url": url_to_index2, "question": followup_query21,
                                                           "user_id": user_id})
print(followup21_response.json())  # Should return the answer

{'answer': 'The population of France is approximately 68.6 million people.'}


Because we are using an LLM and intentionally did not set a seed (so as not to constrain the model’s responses), the output varies between runs. Sometimes the model interprets the question above as referring to Paris and responds with “I don’t have enough information.” Other times it takes the question to be about France, finds the figure in the context, and responds as shown above. My conclusion is that the question is genuinely ambiguous, and to me, both of the model’s reactions make sense!
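To see how often each interpretation occurs, the same question can be sent several times and the distinct answers counted. The helper below is a small sketch of that idea; `tally_answers` is a hypothetical name, and the stand-in `fake_ask` replaces the real HTTP call so the snippet runs without the server. Note that every real `/chat/` call also adds to the chat history, so repeated runs against the API are not fully independent.

```python
from collections import Counter
from typing import Callable


def tally_answers(ask: Callable[[], str], runs: int = 5) -> Counter:
    """Call `ask` `runs` times and count how often each distinct answer comes back."""
    return Counter(ask() for _ in range(runs))


# Stand-in for the real request, cycling through two canned answers.
canned = iter([
    "The population of France is approximately 68.6 million people.",
    "I don't have enough information.",
    "The population of France is approximately 68.6 million people.",
])


def fake_ask() -> str:
    return next(canned)


counts = tally_answers(fake_ask, runs=3)
for answer, count in counts.most_common():
    print(count, answer)
```

Against the running API, `ask` could instead be a function that issues the `requests.get(f"{BASE_URL}/chat/", ...)` call from the cell above and returns `response.json()['answer']`.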

In [15]:
# Test followup questions
followup22_response = requests.get(f"{BASE_URL}/chat/", params={"url": url_to_index2, "question": followup_query22,
                                                           "user_id": user_id})
print(followup22_response.json())  # Should return the answer

{'answer': 'The current president of France is Emmanuel Macron.'}


In [16]:
# Test followup questions
followup23_response = requests.get(f"{BASE_URL}/chat/", params={"url": url_to_index2, "question": followup_query23,
                                                           "user_id": user_id})
print(followup23_response.json())  # Should return the answer

{'answer': "What have I just asked for?      I've asked for information about the current president of France."}


Once again, we see a limitation of an LLM (especially a smaller, quantized one!): even when the prompt explicitly instructs the model not to repeat the question in its response, it still does so, which clutters the output even when the answer is correct. For the same reason, it’s also worth mentioning that restricting the model’s knowledge to the given context doesn’t always work as expected.
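One possible mitigation is to clean the answer on the client side. The sketch below is a heuristic and not part of the API: it drops a leading sentence that ends in a question mark when more text follows it. `strip_leading_question` is a hypothetical helper, and it would also remove a legitimate rhetorical question at the start of an answer.

```python
import re

# A leading run of text with no sentence-ending punctuation, then "?",
# then whitespace, and at least one more non-space character after it.
_LEADING_QUESTION = re.compile(r"\s*[^.!?]*\?\s+(?=\S)")


def strip_leading_question(answer: str) -> str:
    """Remove an echoed question from the start of `answer`, if one is present."""
    match = _LEADING_QUESTION.match(answer)
    if match:
        return answer[match.end():].strip()
    return answer.strip()


raw = "What have I just asked for?      I've asked for information about the current president of France."
print(strip_leading_question(raw))
# I've asked for information about the current president of France.
```

Answers that do not start with a question, such as `'Paris is the capital of France.'`, pass through unchanged.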

In [17]:
# Test get chat history
chat_history_response = requests.get(f"{BASE_URL}/get_chat_history/", params={"user_id": user_id, "url": url_to_index2})

print(chat_history_response.json()['chat_history'])  # Should return the chat history (only the last 10 messages are kept)

User:What is capital of France?
Chatbot:Paris is the capital of France.
User:and how many people live there?
Chatbot:The population of France is approximately 68.6 million people.
User:and who is the its current president?
Chatbot:The current president of France is Emmanuel Macron.
User:What have I just asked?
Chatbot:What have I just asked for?      I've asked for information about the current president of France.


Overall, I believe the results are quite satisfactory. With more powerful hardware, the latency added by the LLM calls would be reduced. There are several areas for improvement, including how the model’s responses are post-processed and the fact that the model can still hallucinate and make mistakes, even with the temperature set to 0.0. Still, I think all functionalities have been implemented in a simple and direct manner. Moreover, they work well, and the results are interesting!