# RAG System


### Moises Bustillo
In this project, the complete RAG system is built. It takes the passages retrieved during the corpus construction phase and decides which of them are injected into the LLM input, and how, to improve the quality of the answer.

## Installing the required libraries

This cell installs the basic libraries used throughout the notebook for **collecting, processing and analyzing text**, as well as for generating the **semantic embeddings** that will be used with the language model.

The installed libraries are:

- `requests` : To make HTTP requests and consume APIs or web resources programmatically.
- `pandas` : To manipulate and analyze structured data in tabular form.
- `tqdm` : To show progress bars during iterative processes or long-running tasks.
- `sentence-transformers` : To generate text embeddings with pretrained transformer-based models, which are essential for semantic similarity and information retrieval tasks.


In [66]:
! pip install requests
! pip install pandas
! pip install tqdm
! pip install sentence-transformers



## Importing libraries
All the imports used throughout the project are gathered here.

In [None]:
from sentence_transformers import SentenceTransformer, util
from tqdm import tqdm
import pandas as pd
import requests
import random
import torch
import json
import time
import re
import os

We can check that Ollama is running:

In [68]:
try:
    # A short timeout keeps the cell from hanging if the server does not answer
    response = requests.get('http://localhost:11434', timeout=5)
    if response.status_code == 200 and 'Ollama is running' in response.text:
        print("Ollama is running and reachable")
    else:
        print(f"Ollama responded with status {response.status_code}.")
except requests.exceptions.RequestException as e:
    print(f"Connection error: {e}")

Ollama is running and reachable


## General configuration of the RAG experiment

This section defines the **global execution parameters** of the experiment: the input and output files, the generation variant used (RAG with an optional Few-Shot mode), the language model served by Ollama, and the **base prompt** that sets the behavior rules and the answer format for the model.

In [None]:
# Values for RAG (Retrieval-Augmented Generation) mode

# Input/output files
ESTADO = "test"     # which split supplies the questions to answer: "train" or "test"
VARIANTE = "rag"    # name of the variant
VARIANTE_FS = True  # whether few-shot examples are added to the prompt

# Files and configuration
ARCHIVO_TRAIN_JSONL = "train.jsonl"
ARCHIVO_TEST = "test.jsonl"
ARCHIVO_CORPUS = "corpus.json"          
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "gemma3:12b"
ARCHIVO_CSV_SALIDA = VARIANTE + ".csv"
NUM_FS_EXAMPLES = 3

PROMPT_BASE = """You are an AI model highly specialized in factual information retrieval. You are operating within the temporal context of 2018. All answers provided must strictly reflect the knowledge, events, and state of affairs valid up to the end of that year.
    Your sole mission is to address the user's question with the absolute minimum number of words possible, delivering only the essential and requested information. If ambiguity arises, assume the user's intended question and prioritize the most probable correct answer without acknowledging the error.
    When an error in the question is identified (e.g., misspelling, wrong movie number) but points to a primary, well-known entity, provide the correct, assumed information directly.
    When the required answer is a date, you must follow the following examples (18 march 2015, 20 december 2008) if you don't know the exact day or month, you can omit it
    Your response must consist solely of the requested facts. Prohibit all greetings, introductions, explanations, notes, or any extraneous text.
    No use of abbreviations, acronyms, or initialisms. Provide full names and complete terms.
    No use of punctuation marks like . , or ;
    If the answer requires multiple components (e.g., names, locations, dates), you must provide all essential components in their complete form.
    """

## Dynamic prompt construction for RAG and Few-Shot

This section defines the functions responsible for **building the final prompt** that is sent to the language model.  
The prompt is composed of the **base instructions**, the **retrieved context passages** (RAG) and, optionally, **Few-Shot examples** used as a style guide.  
This design gives explicit control over the structure of the prompt and lets its content adapt to the generation variant in use.


The following function assembles the prompt.
It receives a list of passages, a question, a base prompt and an optional list of examples, and returns the complete prompt.

In [None]:
def build_prompt_rag(passages, question, prompt_base, few_shot_examples=None):
    """Assemble the final prompt: instructions, few-shot examples, passages, question."""
    # Optional few-shot block, used only as a style guide
    fs_block = ""
    if few_shot_examples:
        fs_block = "\n\n**FEW-SHOT EXAMPLES (Style Guide):**\n"
        for i, ex in enumerate(few_shot_examples, start=1):
            # Gold answers may be lists; flatten them into a single string
            answer_text = ex['answer']
            if isinstance(answer_text, list):
                answer_text = ", ".join(map(str, answer_text))

            fs_block += f"Example {i}:\n"
            fs_block += f"  - Question: {ex['question']}\n"
            fs_block += f"  - Answer: {answer_text}\n---\n"
        fs_block += "\n"

    # Retrieved passages; the corpus may store the text under "text" or "texto"
    context_lines = ["\n\n**EXTERNAL KNOWLEDGE:**\n"]
    for p in passages:
        text_content = p.get("text", p.get("texto", "Passage without content"))
        context_lines.append(f"Knowledge: {text_content}\n---")
    context_block = '\n'.join(context_lines)

    # Final order: instructions, examples, knowledge, task
    prompt = prompt_base.strip() + fs_block + context_block
    prompt += f"\n\nFINAL TASK: Based on your internal knowledge and the external information provided above, answer the next question: {question}"
    return prompt
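
To make the resulting layout easier to see, here is a minimal sketch of the same block order (instructions, few-shot examples, external knowledge, final task). `sketch_prompt` is a hypothetical, simplified stand-in written for illustration; its separators differ slightly from `build_prompt_rag`, and the instruction text and passage are toy values.

```python
def sketch_prompt(instructions, examples, passages, question):
    """Hypothetical stand-in that mirrors the block order of the notebook's prompt."""
    parts = [instructions.strip()]

    if examples:
        shots = []
        for i, ex in enumerate(examples, start=1):
            answer = ex["answer"]
            if isinstance(answer, list):  # gold answers may be lists
                answer = ", ".join(map(str, answer))
            shots.append(f"Example {i}:\n  - Question: {ex['question']}\n  - Answer: {answer}")
        parts.append("**FEW-SHOT EXAMPLES (Style Guide):**\n" + "\n---\n".join(shots))

    # Passages may store their text under "text" or "texto"
    knowledge = [f"Knowledge: {p.get('text', p.get('texto', ''))}" for p in passages]
    parts.append("**EXTERNAL KNOWLEDGE:**\n" + "\n---\n".join(knowledge))

    parts.append(f"FINAL TASK: answer the next question: {question}")
    return "\n\n".join(parts)


demo = sketch_prompt(
    "Answer with as few words as possible.",  # toy instructions
    [{"question": "where did the vietnam war mainly take place",
      "answer": ["Cambodia", "Vietnam", "Laos"]}],
    [{"id": "064_001", "texto": "The Battle of Waterloo was fought in 1815."}],  # toy passage
    "when was the battle of waterloo",
)
print(demo)
```

Placing the question last keeps it adjacent to where generation starts, after all the supporting material.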

## Sending queries to the model through Ollama

This section defines the function in charge of **communicating with the language model through Ollama**.  
The function builds the prompt dynamically according to the selected variant (**base**, **Few-Shot** or **RAG**), sends the query to the model with controlled generation parameters, and handles the response in *streaming* mode, including basic handling of connection and execution errors.

Next we define a function `ask_ollama` that, given a question, some context passages and some few-shot examples, calls the `build_prompt_rag` function defined earlier and sends the request to the model served by Ollama.

In [None]:
# Generation options passed to the model
options = {
    "temperature": 0.05
}

def ask_ollama(question, context_data=None, few_shot_examples=None):
    if VARIANTE == "rag" and context_data:
        prompt = build_prompt_rag(context_data, question, PROMPT_BASE, few_shot_examples=few_shot_examples)
    else:
        # No context available: send the base instructions followed by the question.
        # PROMPT_BASE has no {question} placeholder, so str.format() would silently drop the question.
        prompt = f"{PROMPT_BASE.strip()}\n\nQuestion: {question}"

    # print(prompt)
    full_response = ""

    try:
        # stream=True lets requests hand over each line as soon as Ollama produces it
        with requests.post(OLLAMA_URL, json={
            "model": MODEL,
            "prompt": prompt,
            "stream": True,
            "options": options
        }, stream=True, timeout=120) as response:

            response.raise_for_status()

            # Each non-empty line is a JSON object carrying a fragment of the answer
            for line in response.iter_lines():
                if line:
                    part = json.loads(line)
                    if "response" in part:
                        full_response += part["response"]
                    if part.get("done"):
                        break

        return full_response.strip()

    except requests.exceptions.ConnectionError:
        return f"[ERROR: Connection to Ollama failed at {OLLAMA_URL}]"
    except requests.exceptions.HTTPError as e:
        return f"[ERROR: HTTP {e}. Is the model '{MODEL}' installed?]"
    except Exception as e:
        return f"[ERROR: Unexpected: {e}]"
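
The streaming loop above can be exercised without a running server. The sketch below isolates the parsing step, assuming each line is a JSON object with a `response` fragment and a `done` flag, as the function above expects. `collect_stream` and the fake lines are hypothetical names and toy data.

```python
import json


def collect_stream(lines):
    """Concatenate the 'response' fragments of a line-delimited JSON stream."""
    chunks = []
    for raw in lines:
        if not raw:  # blank keep-alive lines carry no data
            continue
        part = json.loads(raw)
        chunks.append(part.get("response", ""))
        if part.get("done"):  # stop at the final object
            break
    return "".join(chunks).strip()


# Toy stream: the last line comes after "done" and must be ignored
fake_stream = [
    b'{"response": "November", "done": false}',
    b"",
    b'{"response": " 1983", "done": false}',
    b'{"response": "", "done": true}',
    b'{"response": "ignored", "done": false}',
]
answer = collect_stream(fake_stream)
print(answer)
```

Keeping the parsing in a pure function like this makes it possible to test the handling of blank lines and the `done` marker separately from the network call.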

## Loading and validating the datasets

This section loads the **training data** used for the **Few-Shot** approach, the **evaluation questions** from the test set, and the **passage corpus** that serves as external context in RAG mode.  
In addition, a **basic check** confirms that the files were read correctly and that the data has the expected structure before continuing with the pipeline.

In [None]:
# Variables used by the rest of the notebook
preguntas_cargadas = [] # Set that will be iterated (test or train)
train_qa_data = []      # (Q, A) pairs used for few-shot (always from train)
train_data_rag = []     # Passage corpus for RAG

try:
    # 1. Load the full QA data (source of the few-shot examples)
    with open(ARCHIVO_TRAIN_JSONL, "r", encoding="utf-8") as f:
        for i, line in enumerate(f):
            data = json.loads(line)
            train_qa_data.append({
                "número_pregunta": i + 1,
                # "pregunta" mirrors the test items so that ESTADO = "train" also works downstream
                "pregunta": data.get("question", ""),
                "question": data.get("question", ""),
                "answer": data.get("answer", "")
            })

    print(f"✅ Loaded {len(train_qa_data)} QA pairs from the training file (for few-shot).")

    # 2. Load the test questions
    test_questions = []
    with open(ARCHIVO_TEST, "r", encoding="utf-8") as f:
        for i, line in enumerate(f):
            data = json.loads(line)
            test_questions.append({
                "número_pregunta": i + 1,
                "pregunta": data.get("question", "")
            })

    # Choose the set to evaluate
    if ESTADO == "test":
        preguntas_cargadas = test_questions
    else:
        preguntas_cargadas = train_qa_data

    print(f"Loaded {len(preguntas_cargadas)} questions from the file to evaluate.")

    # 3. Load the RAG context data (ARCHIVO_CORPUS: passages)
    with open(ARCHIVO_CORPUS, "r", encoding="utf-8") as f:
        train_data_rag = json.load(f)

    if isinstance(train_data_rag, list):
        print(f"Loaded {len(train_data_rag)} passages (RAG context) from \"{ARCHIVO_CORPUS}\".")
    else:
        print(f"WARNING: The file '{ARCHIVO_CORPUS}' is not a list ({type(train_data_rag)}).")

except FileNotFoundError as e:
    print(f"ERROR: The file '{e.filename}' was not found. Check the path!")
    preguntas_cargadas = None
except json.JSONDecodeError as e:
    print(f"ERROR: Failed to decode JSON in one of the files. Detail: {e}")
    preguntas_cargadas = None


# --- Load check ---
if preguntas_cargadas and train_data_rag and isinstance(train_data_rag, list) and len(train_data_rag) > 0:
    print("\n--- Data check ---")

    # Set to iterate (test or train, depending on ESTADO)
    print(f"First 3 {ESTADO.upper()} questions loaded:")
    for item in preguntas_cargadas[:3]:
        print(f"  {item['número_pregunta']}: {item['pregunta'][:50]}...")

    # Set used for few-shot (train QA)
    print("\nFirst 3 few-shot examples (Q, A) loaded (from train):")
    for item in train_qa_data[:3]:
        print(f"  Q: {item['question'][:50]}... | A: {item['answer']}")

    # RAG corpus
    ejemplo_completo = train_data_rag[0]
    texto_pasaje = ejemplo_completo.get('text', ejemplo_completo.get('texto', 'N/A'))
    print("\nCheck of one RAG CONTEXT example (passage):")
    print(f"  Passage ID: {ejemplo_completo.get('id', 'N/A')}")
    print(f"  Content (start): {texto_pasaje[:100]}...")

✅ Loaded 100 QA pairs from the training file (for few-shot).
✅ Loaded 50 questions from the test file (to evaluate).
✅ Loaded 6928 passages (RAG context) from "corpus.json".

--- Data check ---
First 3 TEST questions loaded:
  1: who played rum tum tugger in the movie cats...
  2: how many states play in the mega millions lottery...
  3: when did america switch from tea to coffee...

First 3 few-shot examples (Q, A) loaded (from train):
  Q: where did the vietnam war mainly take place... | A: ['Cambodia', 'Vietnam', 'Laos']
  Q: when was krakauer considered a success as a writer... | A: ['November 1983']
  Q: who played mrs garrett's son on facts of life... | A: ['Tom Fitzsimmons', 'Joel Brooks']

Check of one RAG CONTEXT example (passage):
  Passage ID: 064_001
  Content (start): 1815. The Battle of Waterloowas a very importantbattle fought in 1815. It was between the French arm...
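
The line-by-line JSONL reading used above can be factored into a small helper. This is a minimal sketch, assuming one JSON object per line; `read_jsonl` is a hypothetical name. Unlike the cell above, it skips blank lines and reports the offending line number, and the sample records reuse the two training examples printed in the output.

```python
import io
import json


def read_jsonl(handle):
    """Yield one dict per non-empty line, reporting the line number on bad JSON."""
    for line_no, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:  # tolerate blank lines, e.g. a trailing newline
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_no}: {exc.msg}") from exc


# In-memory stand-in for a file such as train.jsonl
sample = io.StringIO(
    '{"question": "when was krakauer considered a success as a writer", "answer": ["November 1983"]}\n'
    "\n"
    '{"question": "where did the vietnam war mainly take place", "answer": ["Cambodia", "Vietnam", "Laos"]}\n'
)
records = list(read_jsonl(sample))
print(len(records), records[0]["answer"])
```

Knowing which line failed to parse is usually more helpful than a generic decoding error when a dataset file has been edited by hand.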


## Generating and loading the corpus embeddings

This section prepares the **semantic embeddings of the passage corpus** used in RAG mode.  
The process automatically detects whether a **GPU is available** to speed up the computation and reuses an **embeddings cache** if one has already been generated, avoiding unnecessary recomputation.  
If the cache does not exist, the embeddings are computed from the corpus text and stored for later runs.


In [None]:
# Select the device ('cuda' if a GPU is available, otherwise 'cpu')
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Processing device selected: {DEVICE}")

# Path of the embeddings cache
EMBEDDINGS_CACHE_FILE = "corpus_embeddings_mpnet.pt"

# Load the embedding model explicitly on the detected device
model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2", device=DEVICE)
print(f"Embedding model loaded on device: {DEVICE}")

train_data = train_data_rag  # The passage corpus

# --- 1. RAG EMBEDDINGS PREPARATION (WITH CACHE) ---
train_embeddings = None
num_pasajes = 0

if VARIANTE == "rag":
    print("starting RAG embeddings preparation")

    try:
        if os.path.exists(EMBEDDINGS_CACHE_FILE):
            print(f"Loading embeddings from cache: {EMBEDDINGS_CACHE_FILE}")
            # Load the tensor on the right device
            train_embeddings = torch.load(EMBEDDINGS_CACHE_FILE, map_location=DEVICE)
            num_pasajes = len(train_data)
            if train_embeddings.shape[0] != num_pasajes:
                # A stale cache would map similarity scores to the wrong passages
                print(f"ERROR: the cache holds {train_embeddings.shape[0]} embeddings but the corpus has {num_pasajes} passages. Delete the cache file and rerun.")
                train_embeddings = None
            else:
                print("Embeddings loaded quickly.")
        else:
            # No cache yet: encode the corpus
            print("Cache not found. Encoding the passage corpus embeddings (this may take a while)...")
            # The corpus may store the passage text under "texto" or "text"
            first = train_data[0] if isinstance(train_data, list) and train_data else {}
            text_key = next((k for k in ("texto", "text") if k in first), None)
            if text_key:
                train_texts = [ex.get(text_key, "") for ex in train_data]
                train_embeddings = model.encode(train_texts, convert_to_tensor=True, show_progress_bar=True)
                num_pasajes = len(train_data)

                torch.save(train_embeddings, EMBEDDINGS_CACHE_FILE)
                print(f"Embeddings saved to cache: {EMBEDDINGS_CACHE_FILE}")
            else:
                print("CRITICAL ERROR: no 'texto' or 'text' key was found in the RAG corpus.")
                train_embeddings = None
    except Exception as e:
        print(f"ERROR while loading/saving the embeddings cache: {e}")
        train_embeddings = None

Processing device selected: cpu
Embedding model loaded on device: cpu
starting RAG embeddings preparation
🔄 Loading embeddings from cache: corpus_embeddings_mpnet.pt
✅ Embeddings loaded quickly.
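
A cache identified only by its file name cannot tell whether the corpus texts have changed since the embeddings were computed. One possible safeguard, which is not part of this notebook, is to store a fingerprint of the corpus next to the cache. The sketch below uses only the standard library; the sidecar file name and the helper functions are hypothetical.

```python
import hashlib
import json
import os
import tempfile


def corpus_fingerprint(texts):
    """Stable SHA-256 digest over the ordered passage texts."""
    digest = hashlib.sha256()
    for text in texts:
        digest.update(text.encode("utf-8"))
        digest.update(b"\x00")  # separator so ["ab", "c"] differs from ["a", "bc"]
    return digest.hexdigest()


def write_meta(meta_path, texts):
    """Record the passage count and fingerprint next to the embeddings cache."""
    with open(meta_path, "w", encoding="utf-8") as f:
        json.dump({"count": len(texts), "sha256": corpus_fingerprint(texts)}, f)


def cache_is_fresh(meta_path, texts):
    """True only if the sidecar exists and matches the current corpus."""
    if not os.path.exists(meta_path):
        return False
    with open(meta_path, "r", encoding="utf-8") as f:
        meta = json.load(f)
    return meta.get("count") == len(texts) and meta.get("sha256") == corpus_fingerprint(texts)


with tempfile.TemporaryDirectory() as tmp:
    meta_path = os.path.join(tmp, "corpus_embeddings_mpnet.meta.json")  # hypothetical sidecar
    corpus = ["first passage", "second passage"]  # toy corpus
    fresh_before = cache_is_fresh(meta_path, corpus)
    write_meta(meta_path, corpus)
    fresh_after = cache_is_fresh(meta_path, corpus)
    fresh_changed = cache_is_fresh(meta_path, corpus + ["new passage"])

print(fresh_before, fresh_after, fresh_changed)
```

With such a check, the embeddings would be recomputed whenever `cache_is_fresh` returns `False`, instead of being loaded from a file that no longer matches the corpus.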


## Running the RAG pipeline with hybrid retrieval

This section runs the **main evaluation loop**. For each test question, a set of context passages is retrieved by combining **semantic similarity search** with **random sampling**, which produces a hybrid context.  
Optionally, **Few-Shot examples** from the training set are added.  
Each question, together with its context, is sent to the language model through Ollama, the generated output is cleaned, and the results are saved incrementally to a CSV file for later analysis.
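
The hybrid selection described above can be sketched in pure Python on toy vectors. This is only an illustration under stated assumptions: the notebook relies on `sentence-transformers` tensors, whereas `cosine` and `hybrid_indices` here are hypothetical helpers operating on plain lists.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_indices(query_vec, corpus_vecs, k_similar, k_random, rng):
    """Return (top-k most similar indices, random indices drawn from the rest)."""
    scores = [cosine(query_vec, v) for v in corpus_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    similar = ranked[:k_similar]
    remaining = ranked[k_similar:]  # never re-sample an already selected passage
    extra = rng.sample(remaining, min(k_random, len(remaining)))
    return similar, extra


# Toy 2-dimensional "embeddings"
corpus_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]
similar, extra = hybrid_indices(
    [1.0, 0.0], corpus_vecs, k_similar=2, k_random=2, rng=random.Random(0)
)
print("similar:", similar, "random:", extra)
```

Passing an explicit `random.Random` instance with a fixed seed makes the random part of the context reproducible between runs.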


In [None]:
# --- HYBRID RETRIEVAL CONFIGURATION ---
# Number of most similar passages and number of random passages to add
K_SIMILAR = 25
K_RANDOM = 3
HEADERS = ["número_pregunta", "pregunta", "salida_llm", "nombre_variante"]

# Loop control: the CSV header is written only on the first save
es_primera_escritura = True

if preguntas_cargadas is None:
    print("Cannot run the loop because the questions were not loaded correctly.")
elif train_embeddings is None:
    print("Cannot run the RAG loop because the embeddings were not loaded correctly.")
else:
    print(f"\nStarting LLM queries with the RAG variant (Hybrid: {K_SIMILAR} similar + {K_RANDOM} random) on {DEVICE}...")

    # --- 2. PROCESSING LOOP ---
    for item in tqdm(preguntas_cargadas, desc="Processing questions"):
        numero_pregunta = item["número_pregunta"]
        question = item["pregunta"]

        context_data = []
        few_shot_examples = None  # stays None when few-shot is disabled or no example is available

        if VARIANTE == "rag":
            # --- A. SEMANTIC RETRIEVAL (K_SIMILAR) ---
            # Encode the question and keep it on the same device as the corpus embeddings
            test_embedding = model.encode(question, convert_to_tensor=True).to(train_embeddings.device)
            cos_scores = util.pytorch_cos_sim(test_embedding, train_embeddings)[0]

            # topk raises an error if k exceeds the number of passages
            k_similar = min(K_SIMILAR, cos_scores.shape[0])
            similar_indices = cos_scores.topk(k_similar).indices.tolist()
            context_data.extend(train_data[idx] for idx in similar_indices)

            # --- B. RANDOM RETRIEVAL (K_RANDOM) ---
            # Sample only among passages that were not already selected
            available_indices = list(set(range(num_pasajes)) - set(similar_indices))
            k_to_sample = min(K_RANDOM, len(available_indices))
            random_indices = random.sample(available_indices, k_to_sample)
            context_data.extend(train_data[idx] for idx in random_indices)

        # If VARIANTE_FS is enabled, add question-answer examples from train
        if VARIANTE_FS:
            # Skip the question under evaluation so its gold answer cannot leak (matters when ESTADO = "train")
            candidates = [ex for ex in train_qa_data if ex["question"] != question]
            k_fs_sample = min(NUM_FS_EXAMPLES, len(candidates))
            if k_fs_sample > 0:
                few_shot_examples = random.sample(candidates, k_fs_sample)

        # Send the question, the retrieved passages and the examples to the LLM
        salida_llm = ask_ollama(question, context_data=context_data, few_shot_examples=few_shot_examples)

        # Extra cleanup: collapse whitespace and drop punctuation
        salida_llm = re.sub(r'\s+', ' ', salida_llm).strip()
        salida_llm = re.sub(r'[.,;:]', '', salida_llm).strip()

        # One-row DataFrame for this question
        df_fila = pd.DataFrame([{
            "número_pregunta": numero_pregunta,
            "pregunta": question,
            "salida_llm": salida_llm,
            "nombre_variante": VARIANTE
        }], columns=HEADERS)

        # Incremental save: the first write creates the file with its header, later writes append
        try:
            df_fila.to_csv(
                ARCHIVO_CSV_SALIDA,
                mode='w' if es_primera_escritura else 'a',
                header=es_primera_escritura,
                index=False,
                encoding='utf-8',
                lineterminator='\n'
            )
            es_primera_escritura = False
        except Exception as e:
            print(f"\nERROR while saving question {numero_pregunta} to CSV: {e}")

        time.sleep(0.1)

    print(f"\nEvaluation finished. Check the file '{ARCHIVO_CSV_SALIDA}'.")


Starting LLM queries with the RAG variant (Hybrid: 25 similar + 3 random) on cpu...


Processing questions:   0%|          | 0/50 [00:00<?, ?it/s]

You are an AI model highly specialized in factual information retrieval. You are operating within the temporal context of 2018. All answers provided must strictly reflect the knowledge, events, and state of affairs valid up to the end of that year.
    Your sole mission is to address the user's question with the absolute minimum number of words possible, delivering only the essential and requested information. If ambiguity arises, assume the user's intended question and prioritize the most probable correct answer without acknowledging the error.
    When an error in the question is identified (e.g., misspelling, wrong movie number) but points to a primary, well-known entity, provide the correct, assumed information directly.
    When the required answer is a date, you must follow the following examples (18 march 2015, 20 december 2008) if you don't know the exact day or month, you can omit it
    Your response must consist solely of the requested facts. Prohibit all greetings, introduc

Processing questions:   2%|▏         | 1/50 [00:03<02:51,  3.51s/it]
Processing questions:   4%|▍         | 2/50 [00:07<02:58,  3.72s/it]
Processing questions:   6%|▌         | 3/50 [00:10<02:50,  3.63s/it]
Processing questions:   8%|▊         | 4/50 [00:14<02:41,  3.50s/it]
Processing questions:  10%|█         | 5/50 [00:17<02:34,  3.44s/it]
Processing questions:  12%|█▏        | 6/50 [00:21<02:32,  3.47s/it]
Processing questions:  14%|█▍        | 7/50 [00:24<02:30,  3.51s/it]
Processing questions:  16%|█▌        | 8/50 [00:28<02:27,  3.51s/it]
Processing questions:  18%|█▊        | 9/50 [00:31<02:21,  3.44s/it]
Processing questions:  18%|█▊        | 9/50 [00:33<02:32,  3.72s/it]


KeyboardInterrupt: 
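
Because a run can be interrupted, as happened above, it may help to make the save step resumable. The sketch below combines the same cleanup rules with an append-only CSV writer built on the standard `csv` module. `clean_answer` and `append_row` are hypothetical helpers, and unlike the loop above this version never overwrites an existing file: it writes the header only when the file is missing or empty.

```python
import csv
import os
import re
import tempfile

HEADERS = ["número_pregunta", "pregunta", "salida_llm", "nombre_variante"]


def clean_answer(text):
    """Collapse whitespace, then drop the punctuation the prompt forbids."""
    text = re.sub(r"\s+", " ", text).strip()
    return re.sub(r"[.,;:]", "", text).strip()


def append_row(path, row):
    """Append one row; write the header only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=HEADERS)
        if needs_header:
            writer.writeheader()
        writer.writerow(row)


with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "rag.csv")
    toy_outputs = ["  November   1983. ", "Cambodia, Vietnam, Laos"]  # toy model outputs
    for n, raw in enumerate(toy_outputs, start=1):
        append_row(out_path, {
            "número_pregunta": n,
            "pregunta": f"toy question {n}",
            "salida_llm": clean_answer(raw),
            "nombre_variante": "rag",
        })
    with open(out_path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))

print(rows)
```

With this approach, a resumed run would also need to skip the question numbers already present in the file. That step is left out of the sketch.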