## Customize LLMs using RAG 

A great use of LLMs is to help digest new bills, policies, and court verdicts efficiently. Journalists frequently work under embargoes and receive a lot of material to digest in a short period of time.

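The problem with pretrained LLMs is that their training data has a cutoff date. If a new policy or law is passed after that date, the model may know little or nothing about it, and its answers can be uneven. Retrieval-augmented generation (RAG) works around this by handing the model relevant passages from documents we supply. By saving PDFs of the bill itself and of the current analysis, we can get a pretty good idea of not just the summary, but also some of the surprising details.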
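
To make the retrieval step concrete, here is a toy sketch in plain Python. It is not the embedding-based pipeline used later in this post: it ranks chunks by cosine similarity over simple word counts, and the chunk texts and function names (`retrieve`, `build_prompt`) are invented placeholders, not quotes from the bill or part of any library.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:top_k]

def build_prompt(query, chunks, top_k=1):
    """Paste the retrieved chunks into the prompt that would be sent to the LLM."""
    context = "\n".join(retrieve(query, chunks, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical placeholder chunks, standing in for passages split out of the PDFs
chunks = [
    "Section A changes Medicaid eligibility rules.",
    "Section B adjusts tax incentives for energy projects.",
    "Section C funds border security measures.",
]

print(build_prompt("What does the bill say about Medicaid?", chunks))
```

A real pipeline swaps the word counts for embeddings from a model and keeps them in a vector store, but the shape is the same: find the closest passages, then ask the LLM to answer from them.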

Below is a demonstration of how we can directly query both the bill and some of the accompanying analysis.

My first query was: "What is in the 'Big Beautiful Bill'"

Response: "The One Big Beautiful Bill Act includes significant changes to federal healthcare programs such as Medicaid/Children's Health Insurance Program (CHIP), Medicare, and the Affordable Care Act (ACA). It also addresses tax reforms and incentives, energy policy, defense and national security provisions, education funding, agricultural supports, transportation and infrastructure investments, as well as immigration and border security measures".




 

In [10]:
# Install any missing packages from a terminal with pip:
#   pip install python-dotenv duckdb openai gradio llama-index-core
#   pip install llama-index-vector-stores-duckdb llama-index-embeddings-openai llama-index-llms-openai
import os        # standard library
import textwrap  # standard library, no install needed

import duckdb
import gradio as gr
import openai
from dotenv import load_dotenv
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
import llama_index.vector_stores.duckdb
from llama_index.embeddings.openai import OpenAIEmbedding

In [11]:

from llama_index.llms.openai import OpenAI




In [14]:
# Path to the persisted DuckDB vector store
file_path = 'persist/my_vector_store.duckdb'

# Delete any store left over from a previous run so the index is rebuilt from scratch
if os.path.exists(file_path):
    os.remove(file_path)
    print("File deleted successfully")
else:
    print("File doesn't exist - first run - it's all good")

File doesn't exist - first run - it's all good


In [15]:
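from dotenv import load_dotenv

# Read OPENAI_API_KEY from the secrets/.env file into the environment
load_dotenv(dotenv_path="secrets/.env")

api_key = os.getenv('OPENAI_API_KEY')

# Alias the SDK client so it does not shadow the OpenAI class
# imported earlier from llama_index.llms.openai
from openai import OpenAI as OpenAIClient
client = OpenAIClient(api_key=api_key)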
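
If the `.env` file is missing or the variable name is misspelled, `os.getenv` quietly returns `None`, and the problem may only show up later with a less obvious message. A small guard, sketched here with a hypothetical helper named `require_env` (not part of python-dotenv or the OpenAI SDK), fails early with a clear hint instead:

```python
import os

def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; check that secrets/.env exists and defines it"
        )
    return value

# Usage, after load_dotenv(dotenv_path="secrets/.env") has run:
# api_key = require_env("OPENAI_API_KEY")
```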

In [17]:
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.duckdb import DuckDBVectorStore

# Keep the embeddings in a DuckDB file at persist/my_vector_store.duckdb
vector_store = DuckDBVectorStore("my_vector_store.duckdb", persist_dir="persist/")
# Load every file in the folder that holds the bill and the analysis PDFs
documents = SimpleDirectoryReader("/Users/Eileen/Desktop/GoData/Blog/posts/LLM_Demo/BBB/").load_data()

In [18]:
# Embed the documents and write the resulting vectors to the DuckDB store
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

In [19]:
import gradio as gr

# Build the query engine once instead of on every request
query_engine = index.as_query_engine()

def greet(query):
    """Send the user's question to the index and return the answer as plain text."""
    response = query_engine.query(query)
    return str(response)

gr.Interface(
    fn=greet,
    inputs=gr.Textbox(lines=1, placeholder="Enter your query here...", label="Your Query"),
    outputs=gr.Textbox(label="Response"),
).launch(share=False)

* Running on local URL:  http://127.0.0.1:7861
* To create a public link, set `share=True` in `launch()`.
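
Because the function handed to `gr.Interface` is just text in, text out, it can be tried without launching the UI. Here is a rough sketch, assuming only that the engine exposes a `query` method as in the cell above. The names `make_handler` and `FakeEngine` are hypothetical stand-ins, not part of Gradio or LlamaIndex. It skips empty queries and wraps long answers with `textwrap`:

```python
import textwrap

def make_handler(query_engine, width=80):
    """Build a Gradio-style handler around any object with a .query(text) method."""
    def handler(query):
        query = (query or "").strip()
        if not query:
            # Avoid spending an API call on an empty question
            return "Please enter a question."
        response = query_engine.query(query)
        # Wrap long answers so they are easier to read in the output box
        return textwrap.fill(str(response), width=width)
    return handler

class FakeEngine:
    """Stand-in for the real query engine; it only echoes the question back."""
    def query(self, text):
        return f"echo: {text}"

handler = make_handler(FakeEngine())
print(handler("What is in the bill?"))
```

In the notebook, `make_handler(index.as_query_engine())` would take the place of `greet`.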


