# Implementing a simple Retrieval QA system in Spanish with the Vicuna 7B LLM running locally

By Rodrigo Gonzalez

## Project Description

The objective of this notebook is to create a simple example of a Retrieval Question-Answering (QA) system in Spanish using the Vicuna 7B large language model (LLM) running locally. The system aims to answer user queries accurately by supplying Vicuna 7B with relevant text retrieved from a document.

## The key components of this project include:

1. Vicuna 7B runs locally, launched through the Text Generation Web UI framework, on an NVIDIA GeForce RTX 3060 12GB.

2. I want to test how well Vicuna 7B handles questions in Spanish.

3. The source text is one that I assume the model has little or no prior knowledge of. I chose the Wikipedia article on Babasónicos, an Argentine rock band that the LLM may be unfamiliar with.

4. Chroma DB is used as the vector store DB.

5. SBERT is used to create embeddings that will represent each text chunk in the vector DB.

6. No "automagical" LangChain functions. LangChain provides great classes to perform complex NLP tasks easily. However, the drawback of this approach is that a lot happens "under the hood," and developers often lose some control.

## Installing required packages

In [None]:
#!pip install -r ../../requirements.txt

## Libraries

In [2]:
from langchain.vectorstores import Chroma
from langchain.document_loaders import UnstructuredFileLoader
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.text_splitter import NLTKTextSplitter
import langchain
import yaml
import os
import sys

## Defining the path to the project folder

In [3]:
PROJECT_PATH = os.path.join(os.getenv('HOME'), 'JurisGPT')

# Change to the project folder
os.chdir(PROJECT_PATH)

## Adding local libraries

In [4]:
sys.path.append(os.path.join(PROJECT_PATH, 'code', 'python', 'libraries'))

import text_generator_api as tg
import jurisgpt_functions as jur

## LangChain configuration

In [5]:
with open("config/config.yaml", "r") as f:
    # safe_load only builds plain Python objects, which is all a config file needs
    config_data = yaml.safe_load(f)

langchain_api_key = config_data['langchain']['api_key']

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.langchain.plus"
os.environ["LANGCHAIN_API_KEY"] = langchain_api_key

## Embedding function

"all-MiniLM-L6-v2" is a sentence-transformers model that maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search. More information at https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

In [None]:
embedding_fnc = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
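
To make "semantic search over dense vectors" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The 3-dimensional vectors are made-up stand-ins for real 384-dimensional embeddings, and the metric a particular vector store uses may differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

# Toy vectors (assumption): stand-ins for real 384-dimensional embeddings.
query = [1.0, 0.0, 1.0]
close_chunk = [0.9, 0.1, 0.8]
far_chunk = [0.0, 1.0, 0.0]

print(cosine_similarity(query, close_chunk))  # close to 1: similar direction
print(cosine_similarity(query, far_chunk))    # 0.0: orthogonal vectors
```

A chunk whose embedding points in nearly the same direction as the question's embedding scores near 1 and is a good retrieval candidate.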

## Data loading

The data was previously downloaded from the Babasónicos Wikipedia page using the `jur.get_webpage()` function and saved to `babasonicos.txt`.

In [7]:
loader = UnstructuredFileLoader("./data/babasonicos.txt")
docs = loader.load()

In [25]:
print(docs[0].page_content[:300])

Período de actividad

1991–presente

Discográfica(s)

Sony Music (1992



1999, 2012



presente)

Bultaco Discos

PopArt Discos (2001



2005)

Universal Music (2005



2012)

WebSitio web

Babasonicos.com

Miembros Adrián «Dárgelos» Rodríguez, Diego «Uma» Rodríguez, Diego «Uma-T» Tuñón, Diego «Pan


## Splitting the document into chunks

The NLTK splitter is a "content-aware" chunking method. It provides a sentence tokenizer that can split the text into sentences, helping to create more meaningful chunks. More information about different splitting strategies at https://www.pinecone.io/learn/chunking-strategies/


In [9]:
text_splitter = NLTKTextSplitter(chunk_size=1000, chunk_overlap=0)
docs_split = text_splitter.split_documents(docs)
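
The idea behind sentence-aware chunking can be sketched without any library: split the text into sentences, then greedily pack whole sentences into chunks up to a size limit. This is only an illustration. The regex splitter below is a naive assumption and is neither NLTK's tokenizer nor LangChain's actual merge logic.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def pack_sentences(sentences, chunk_size):
    """Greedily merge whole sentences into chunks of at most chunk_size characters.

    A single sentence longer than chunk_size becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > chunk_size:
            chunks.append(current)   # close the chunk before it overflows
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

text = "Babasónicos es una banda de rock. Se formó en Argentina. Editó varios discos."
chunks = pack_sentences(split_sentences(text), chunk_size=60)
for chunk in chunks:
    print(repr(chunk))
```

Because chunks always end on a sentence boundary, no sentence is cut in half, which is the property that makes this kind of splitting "content-aware".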

## Creating a vector database with Chroma

Chroma is an AI-native open-source vector database focused on developer productivity and happiness.

In [10]:
db = Chroma.from_documents(docs_split, embedding_fnc)
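
Conceptually, a vector store keeps one vector per chunk and returns the chunks nearest to the query vector. The sketch below is a brute-force toy version of that idea and not Chroma's implementation. `ToyVectorStore`, `toy_embed`, and the fixed vocabulary are hypothetical names invented for illustration, and squared Euclidean distance is just one possible metric.

```python
# Hypothetical fixed vocabulary; a real embedding model learns its own features.
VOCABULARY = ["banda", "formó", "año", "1991", "disco", "premio", "gira", "méxico"]

def toy_embed(text):
    """Map text to a dense vector of word counts over VOCABULARY."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCABULARY]

def squared_distance(a, b):
    """Squared Euclidean distance; smaller means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class ToyVectorStore:
    """Stores (vector, text) pairs and searches them by brute force."""

    def __init__(self, texts):
        self.entries = [(toy_embed(text), text) for text in texts]

    def similarity_search(self, query, k=3):
        query_vec = toy_embed(query)
        ranked = sorted(self.entries, key=lambda entry: squared_distance(query_vec, entry[0]))
        return [text for _, text in ranked[:k]]

store = ToyVectorStore([
    "la banda se formó en 1991",
    "el disco fue nominado a un premio",
    "la gira pasó por México",
])
print(store.similarity_search("en qué año se formó la banda", k=1))
```

A real vector database replaces the brute-force sort with an approximate nearest-neighbor index so that search stays fast as the number of chunks grows.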

## Prompt

Next, I will assess what the LLM already knows about Babasónicos by posing a question without providing any context or prior information. The prompt will be in Spanish, as it is the appropriate language for this topic.

In [14]:
# Adjacent string literals are concatenated, so no indentation leaks into the prompt
instruction = (
    "Una conversación entre un usuario curioso y un asistente de inteligencia artificial. "
    "El asistente brinda respuestas útiles, detalladas y educadas a las preguntas del usuario."
)
question = '¿Conoces a la banda de rock Babasónicos?'
prompt = jur.simple_prompt(instruction, question)
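
The body of `jur.simple_prompt` is not shown in this notebook. As a rough idea of what such a helper might do, here is a hypothetical stand-in that joins the system instruction and the question using Vicuna v1.1-style `USER:` / `ASSISTANT:` turn markers. Both the layout and the whitespace normalization are assumptions.

```python
def simple_prompt(instruction, question):
    """Hypothetical stand-in for jur.simple_prompt: system text plus one user turn."""
    # Collapse runs of whitespace so stray indentation does not reach the model.
    system = " ".join(instruction.split())
    return f"{system}\n\nUSER: {question.strip()}\nASSISTANT:"

prompt = simple_prompt(
    "Una conversación entre un usuario curioso y un asistente de inteligencia artificial.",
    "¿Conoces a la banda de rock Babasónicos?",
)
print(prompt)
```

Ending the prompt with the assistant marker cues the model to continue with its answer instead of extending the question.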

## LLM query

In [15]:
print ("Asking a question to the LLM...")

start_time = jur.tic()
response = tg.query_llm(prompt)
end_time = jur.toc()

elapsed_time = end_time - start_time

print ("Human question: ", question)
print ("LLM response: ", response)
print(f"LLM response time: {elapsed_time:.3f} seconds.")

Asking a question to the LLM...
Human question:  ¿Conoces a la banda de rock Babasónicos?
LLM response:   Sí, conozco a Babasónicos. Son una banda argentina de rock formada en 1985 en Buenos Aires. Han publicado varios álbumes y han sido muy populares en su país natal y en América Latina. Su música se caracteriza por tener influencias tanto del rock clásico como del pop y el funk.
LLM response time: 17.619 seconds.
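
The `jur.tic()` and `jur.toc()` helpers are not shown here. An equivalent measurement can be sketched with the standard library. `timed_call` and `fake_llm` are hypothetical names, and `fake_llm` only stands in for `tg.query_llm` so that the example runs without a model.

```python
import time

def timed_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

def fake_llm(prompt):
    """Placeholder for tg.query_llm (assumption) so the sketch runs offline."""
    time.sleep(0.05)
    return f"echo: {prompt}"

response, elapsed_time = timed_call(fake_llm, "hola")
print(f"LLM response time: {elapsed_time:.3f} seconds.")
```

`time.perf_counter()` is preferable to `time.time()` for measuring durations because it is monotonic and has higher resolution.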


Well, it seems that the LLM knows something about Babasónicos. Nevertheless, it hallucinates a little bit since Babasónicos was formed in 1991, not 1985.

## Setting the context

Now, let's use the information I have already retrieved from Wikipedia to help the LLM provide a more accurate answer.

I'll query the vector database for the chunks most similar to the question.

I'll ask a new, more specific question: what year was Babasónicos formed?

In [22]:
question = '¿En qué año se fundó la banda Babasónicos?'

docs_query = db.similarity_search(question, k=3)
context = '\n'.join([doc.page_content for doc in docs_query])

In [23]:
for doc in docs_query:
    print(doc.page_content[:500])
    print("----------")

En el 2002, Babasónicos se embarca en una gira por EE. UU.

y México.

A continuación, editaron varios simples con nuevos remixes de los cortes de difusión.

El álbum también fue nominado como «Mejor Álbum de Rock» para los Grammy Latinos.

Etapa Post-Jessico (2003-2007) Babasónicos en 2006.

En el 2003 cerraron la quinta jornada del Quilmes Rock, un festival organizado en la cancha auxiliar de River que convocó a las bandas más importantes del momento.

Además, participaron del Cosquín Rock y d
----------
Babasónicos aparece en 3 temas: «La muerte es mujer», «Arenas Movedizas» y «DJ Beverly Hills».

Ese año también editaron su primer disco de lados B, Vórtice Marxista (compuesto por lados B de sus primeros tres discos), el cual se vendía en sus recitales.

Ese mismo año, telonearon a la banda irlandesa U2 en sus espectáculos en River, como paso previo al lanzamiento de Miami, lanzado en 1999 y que fue el último disco en el que participa DJ Peggyn quién se desvincula del proyecto por d

The Chroma DB has retrieved 3 chunks containing dates, and the last chunk indicates the year when Babasónicos was formed.

## LLM query with context

In [24]:
instruction = "Usa el siguiente contexto para responder la pregunta en español. Si no sabes la respuesta, solo di que no sabes la respuesta, no trataré de inventar una respuesta."
prompt = jur.format_prompt(instruction, question, context)

print ("Asking a question to the LLM...")

start_time = jur.tic()
response = tg.query_llm(prompt)
end_time = jur.toc()

elapsed_time = end_time - start_time

print ("Human question: ", question)
print ("LLM response: ", response)
print(f"LLM response time: {elapsed_time:.3f} seconds.")

Asking a question to the LLM...
Human question:  ¿En qué año se fundó la banda Babasónicos?
LLM response:   En 1991.
LLM response time: 2.795 seconds.
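
The whole retrieve-then-ask flow used above can be condensed into one function. This is a minimal sketch under stated assumptions: `answer_with_context`, `fake_retrieve`, and `fake_generate` are hypothetical names, the prompt layout is a guess at what `jur.format_prompt` produces, and the stubs replace Chroma and the local model so the example runs on its own.

```python
def answer_with_context(question, retrieve, generate, instruction, k=3):
    """Retrieve k chunks, build a context-grounded prompt, and query the model.

    retrieve(question, k) -> list of strings and generate(prompt) -> string are
    injected, so any vector store or LLM client can be plugged in.
    """
    chunks = retrieve(question, k)
    context = "\n".join(chunks)
    prompt = (
        f"{instruction}\n\n"
        f"Contexto:\n{context}\n\n"
        f"USER: {question}\nASSISTANT:"
    )
    return generate(prompt)

# Stubs (assumptions) so the sketch runs without Chroma or a local model.
def fake_retrieve(question, k):
    return ["La banda se formó en 1991."][:k]

def fake_generate(prompt):
    return "En 1991." if "1991" in prompt else "No sé la respuesta."

answer = answer_with_context(
    "¿En qué año se fundó la banda Babasónicos?",
    fake_retrieve,
    fake_generate,
    instruction="Usa el siguiente contexto para responder la pregunta en español.",
)
print(answer)
```

In the notebook's terms, `retrieve` would wrap `db.similarity_search` and `generate` would wrap `tg.query_llm`. Keeping every step explicit like this is what the project means by avoiding "automagical" helpers.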


## Conclusions

In this notebook, I have demonstrated the process of creating a basic Spanish Retrieval QA system using the Vicuna 7B LLM, Chroma DB, and SBERT for embeddings, without relying on any "automagical" LangChain features.