## RAG app TFM
This use case shows how to extract information from a PDF file by querying an LLM with RAG (Retrieval-Augmented Generation). It requires a vector database (FAISS in this case), embeddings, and calls to an OpenAI model. To present the final result, the model is exposed through a Gradio UI.

In [1]:
from dotenv import load_dotenv
import os

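# Load API keys from a local env file
load_dotenv("apis.env")
hf_api_key = os.environ['HF_API_KEY']
# The OpenAI classes used later read OPENAI_API_KEY from the environment,
# so apis.env is assumed to define that variable as well.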


In [2]:
# PDF to extract information from
pdf_path = "BOE-A-1978-31229-consolidado.pdf"

# Load the PDF, which holds external information the LLM did not see during training
from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader(pdf_path)
pages = loader.load_and_split()
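
`load_and_split()` returns chunks of text rather than one string for the whole file, because retrieval works on small passages. The sketch below illustrates the general idea of chunking with overlap using a plain fixed-size character window. It is not LangChain's splitter, and the function name and the sizes are assumptions chosen for illustration.

```python
def split_with_overlap(text, chunk_size=200, overlap=40):
    """Split text into fixed-size character windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks

# Hypothetical sample text: 500 digits, so chunk boundaries are easy to inspect
sample = "".join(str(i % 10) for i in range(500))
chunks = split_with_overlap(sample, chunk_size=200, overlap=40)
print([len(c) for c in chunks])
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.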

In [3]:
pages[0]

Document(page_content='Constitución Española.\nCortes Generales\n«BOE» núm. 311, de 29 de diciembre de 1978\nReferencia: BOE-A-1978-31229\nÍNDICE\n   \nPreámbulo ................................................................ 3\nTÍTULO PRELIMINAR ........................................................... 3\nTÍTULO I. De los derechos y deberes fundamentales ........................................ 4\nCAPÍTULO PRIMERO. De los españoles y los extranjeros ................................... 5\nCAPÍTULO SEGUNDO. Derechos y libertades .......................................... 5\nSección 1.ª De los derechos fundamentales y de las libertades públicas ........................ 5\nSección 2.ª De los derechos y deberes de los ciudadanos ................................. 8\nCAPÍTULO TERCERO. De los principios rectores de la política social y económica ................... 10\nCAPÍTULO CUARTO. De las garantías de las libertades y derechos fundamentales .................. 12\nCAPÍTULO QUINTO. De la

In [4]:
# Embedding model that maps text from the external data to vectors
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
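
To make the notion of an embedding concrete, here is a minimal sketch that swaps the OpenAI model for a toy bag-of-words vector over a fixed vocabulary. `build_vocab`, `toy_embed` and `cosine` are hypothetical helpers, not part of LangChain, and a learned embedding captures meaning far better than word counts do. The point is only that texts sharing content end up with nearby vectors.

```python
import math

def build_vocab(texts):
    """Assign one vector position to every distinct word in the corpus."""
    words = sorted({w for t in texts for w in t.lower().split()})
    return {w: i for i, w in enumerate(words)}

def toy_embed(text, vocab):
    """Word-count vector scaled to unit length (stand-in for a learned embedding)."""
    vec = [0.0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def cosine(a, b):
    """Cosine similarity; for unit vectors this is the dot product."""
    return sum(x * y for x, y in zip(a, b))

texts = ["derechos y libertades", "derechos y libertades fundamentales", "modo turbo"]
vocab = build_vocab(texts)
a, b, c = (toy_embed(t, vocab) for t in texts)
print(cosine(a, b), cosine(a, c))
```

The first two texts share three words and score close to 1, while the third shares none and scores 0.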

In [5]:
# Embed every chunk and index the resulting vectors in a FAISS vector store
from langchain.vectorstores import FAISS
db = FAISS.from_documents(pages, embeddings)
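
Conceptually, the vector store keeps one vector per chunk and returns the chunks whose vectors lie closest to the query vector. The following sketch shows that idea with an exhaustive search over squared Euclidean distance. `FlatL2Index` is a hypothetical class written for illustration, not the FAISS API, and the two-dimensional vectors and payload strings are made up.

```python
class FlatL2Index:
    """Brute-force nearest-neighbour index using squared Euclidean distance."""

    def __init__(self):
        self._vectors = []
        self._payloads = []

    def add(self, vector, payload):
        """Store a vector together with the chunk (payload) it represents."""
        self._vectors.append(list(vector))
        self._payloads.append(payload)

    def search(self, query, k=2):
        """Return the k (distance, payload) pairs closest to the query vector."""
        scored = [
            (sum((q - v) ** 2 for q, v in zip(query, vec)), payload)
            for vec, payload in zip(self._vectors, self._payloads)
        ]
        scored.sort(key=lambda pair: pair[0])
        return scored[:k]

index = FlatL2Index()
index.add([1.0, 0.0], "page about rights")
index.add([0.0, 1.0], "page about the Crown")
index.add([0.9, 0.1], "page about freedoms")
hits = index.search([1.0, 0.0], k=2)
print([payload for _, payload in hits])
```

A library such as FAISS performs the same kind of lookup with optimised data structures, which matters once the collection grows beyond a few thousand vectors.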

In [50]:
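# `chain` is assumed to be a RetrievalQA chain built in a cell not shown here
# (for example with RetrievalQA.from_llm, as in the Gradio cell).
query = "Existe algún modo rápido para triturar alimentos?"  # "Is there a quick way to blend food?"
chain(query, return_only_outputs=True)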

{'result': ' Sí, se puede utilizar el modo "Turbo" para triturar alimentos'}
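
The call above hides three steps: retrieve the most relevant chunks, place them in a prompt together with the question, and send that prompt to the LLM. The sketch below mimics that flow and the `{'result': ...}` return shape with stand-in functions so it runs without an API key. `make_qa_chain`, `fake_retrieve` and `fake_llm` are hypothetical, and the prompt wording is an assumption rather than the template LangChain actually uses.

```python
def make_qa_chain(retrieve, llm, k=2):
    """Return a callable with the shape chain(question) -> {'result': answer}."""
    def chain(question):
        docs = retrieve(question, k)          # 1. fetch the k most relevant chunks
        context = "\n\n".join(docs)           # 2. concatenate them into one context block
        prompt = (
            "Use the following context to answer the question.\n\n"
            f"{context}\n\nQuestion: {question}\nAnswer:"
        )
        return {"result": llm(prompt)}        # 3. let the model answer from the context
    return chain

# Hypothetical stand-ins for the retriever and the LLM
def fake_retrieve(question, k):
    corpus = ["Chunk A about topic one.", "Chunk B about topic two.", "Chunk C about topic three."]
    return corpus[:k]

def fake_llm(prompt):
    return f"(answer based on {prompt.count('Chunk')} retrieved chunks)"

toy_chain = make_qa_chain(fake_retrieve, fake_llm)
answer = toy_chain("What is topic one?")
print(answer)
```

Because the model only sees the retrieved chunks, the quality of the answer depends directly on what the retriever returns for the question.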

### Advanced Gradio app

In [8]:
import gradio as gr
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

def extract_sentence(dictionary):
    # Extract the value associated with the key 'result'
    sentence = dictionary.get('result', '')
    return sentence

def model_hyperparameters(temperature, max_tokens):
    llm = OpenAI(
        model_name="gpt-3.5-turbo-instruct",
        temperature=temperature,
        max_tokens=max_tokens,
        streaming=True
        )
    chain = RetrievalQA.from_llm(llm=llm, retriever=db.as_retriever())
    return chain

def respond(message, temperature=0.7, max_tokens=32):
    # Rebuild the chain with the requested settings, then run the query
    chain = model_hyperparameters(temperature, max_tokens)
    completion = chain(message, return_only_outputs=True)
    return extract_sentence(completion)

with gr.Blocks() as demo:
    completion = gr.Textbox(label="Completion")
    msg = gr.Textbox(label="Prompt")
    with gr.Accordion(label="Advanced options", open=False):
        temperature = gr.Slider(label="Temperature", minimum=0.1, maximum=1.0, value=0.7, step=0.1)
        max_tokens = gr.Slider(label="Max tokens", minimum=8, maximum=64, value=32, step=1)
    btn = gr.Button("Submit")
    clear = gr.ClearButton(components=[msg, completion], value="Clear console")

    btn.click(respond, inputs=[msg, temperature, max_tokens], outputs=[completion])
    msg.submit(respond, inputs=[msg, temperature, max_tokens], outputs=[completion])

gr.close_all()
demo.queue().launch(share=True)

Running on local URL:  http://127.0.0.1:7862
Running on public URL: https://bb0f2ed2dc35ceb312.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
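
In the app above, `respond` builds a new chain on every request. One possible refinement, sketched here under assumptions, is to cache one chain per combination of slider values with `functools.lru_cache`. `get_chain` and `respond_cached` are hypothetical names, and the lambda is a placeholder for the real call to `model_hyperparameters`. Whether a chain object can safely be shared between concurrent requests has not been verified for this setup.

```python
from functools import lru_cache

build_count = 0  # counts how many times a chain is actually constructed

@lru_cache(maxsize=16)
def get_chain(temperature, max_tokens):
    """Build (or reuse) a chain for one hyperparameter combination."""
    global build_count
    build_count += 1
    # Placeholder: in the notebook this would return
    # model_hyperparameters(temperature, max_tokens)
    return lambda prompt: {"result": f"[T={temperature}, max={max_tokens}] {prompt}"}

def respond_cached(message, temperature=0.7, max_tokens=32):
    # Normalise slider values so equal settings map to the same cache key
    chain = get_chain(round(float(temperature), 1), int(max_tokens))
    return chain(message).get("result", "")

first = respond_cached("hola", 0.7, 32)
second = respond_cached("adiós", 0.7, 32)   # reuses the chain built for (0.7, 32)
third = respond_cached("hola", 0.2, 32)     # new settings, so a new chain is built
print(build_count)
```

Rounding the temperature before the lookup avoids cache misses caused by floating-point noise in the slider value.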


