# LVD Usage With RAG Architecture

In [1]:
!pip install -q openai
!pip install -q langchain
!pip install -q datasets==2.14.0

In [2]:
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from openai import OpenAI
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
import chromadb

## Initialize OpenAI API client

For the purpose of this demo I will use the OpenAI API, since it does not require setting up a local LLM.

In [3]:
API_KEY = "your-api-key-here"

In [4]:
client = OpenAI(api_key=API_KEY)
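
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the key is stored in an environment variable named `OPENAI_API_KEY` (the variable the OpenAI client also reads by default). The helper `load_api_key` is hypothetical and not part of the original notebook:

```python
import os

def load_api_key(env_var="OPENAI_API_KEY", placeholder="your-api-key-here"):
    """Return the API key from the environment, or the placeholder if unset."""
    return os.environ.get(env_var, placeholder)

# Drop-in replacement for the hard-coded assignment above.
API_KEY = load_api_key()
```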

## LVD Setup

This demo uses a dataset generated through OpenAI: 10 documents representing news from different domains after 2021.

In [5]:
documents = [
  "The 'Right to Repair' movement had successfully influenced legislation in the EU and USA, mandating manufacturers to make electronic devices easier to repair.",
  "Pakistan faced a catastrophic natural disaster in the summer of 2022, with unprecedented flooding that affected millions of people and caused extensive economic damage",
  "Taylor Swift's most recent album is called 'Midnights'. It explores themes of introspection, insecurity, and personal growth during sleepless nights.",
  "OpenAI's GPT-4 significantly advances reasoning capabilities, offering enhanced fine-tuning options and robust multilingual support.",
  "Google's 53-qubit quantum processor marks a milestone in quantum computing, showcasing practical quantum supremacy.",
  "The Apple Vision Pro Headset, unveiled in 2023 by Apple, is a mixed reality device that combines augmented and virtual reality technologies. It features advanced spatial audio, 8K displays for each eye, and seamless integration with other Apple products.",
  "Meta's Quest 3 virtual reality headset features improved hand tracking technology, higher resolution displays, and AI-driven interactive environments for an immersive experience.",
  "Samsung's latest generation of foldable smartphones boasts ultra-thin glass technology, enhancing both durability and display quality.",
  "Sony's PlayStation 6 incorporates AI-driven gameplay enhancement features, real-time ray tracing graphics, and a novel virtual reality component.",
  "Global inflation rates surged in 2022 due to supply chain disruptions and increased energy prices, significantly impacting the cost of living worldwide."
]

In [6]:
embedding_function = OpenAIEmbeddingFunction(api_key=API_KEY, model_name='text-embedding-3-small')

In [7]:
chroma_client = chromadb.Client()
collection = chroma_client.create_collection(
  name='news', 
  embedding_function=embedding_function,
  metadata={
    "lmi:n_categories": "[2]",
  }
)

In [8]:
collection.add(
    ids=[f"id{i}" for i in range(len(documents))],
    documents=documents
)


            LMI Build Config:
            {
                clustering_algorithms: [<function cluster at 0x000001A25E1F55E0>],
                epochs: [200],
                model_types: ['MLP'],
                learning_rate: [0.01],
                n_categories: [2],
            }
             


In [9]:
collection.build_index()

FAISS Kmeans parameters {'verbose': False, 'seed': 2023}
LMI built with n_buckets_in_index: 2
Time taken to build: 0.7161502838134766; Time taken to cluster: 0.17160582542419434


{'id5': [1],
 'id4': [1],
 'id7': [0],
 'id6': [1],
 'id9': [1],
 'id8': [1],
 'id2': [0],
 'id1': [0],
 'id3': [1],
 'id0': [0]}
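
The dictionary printed by `build_index()` appears to map each document ID to the bucket it was assigned to (this is my reading of the output; the value is a list, presumably one entry per index level). A small sketch that inverts the mapping makes the two buckets easier to inspect. The helper `group_by_bucket` is hypothetical, and the data is copied from the output above:

```python
from collections import defaultdict

# Bucket assignment copied from the build_index() output above.
bucket_assignment = {
    "id5": [1], "id4": [1], "id7": [0], "id6": [1], "id9": [1],
    "id8": [1], "id2": [0], "id1": [0], "id3": [1], "id0": [0],
}

def group_by_bucket(assignment):
    """Invert {doc_id: [bucket, ...]} into {bucket_path: sorted doc_ids}."""
    buckets = defaultdict(list)
    for doc_id, path in assignment.items():
        buckets[tuple(path)].append(doc_id)
    return {bucket: sorted(ids) for bucket, ids in buckets.items()}

buckets = group_by_bucket(bucket_assignment)
for bucket, ids in sorted(buckets.items()):
    print(bucket, ids)
```

In this run bucket 0 holds four documents and bucket 1 holds six, including `id5`, the Apple Vision Pro document used in the test at the end.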

## RAG Pipeline

In this demo I use `gpt-3.5-turbo` from [OpenAI](https://platform.openai.com/docs/models/gpt-3-5-turbo), which has training data up to September 2021.
The `llm_pipeline` function is a bare-bones call to the OpenAI API with the user prompt.

In [17]:
def llm_pipeline(prompt, context=""):
    # Mention the context in the system prompt only when one was supplied.
    additional_context = f"Answer user prompt based on the following context: {context}." if context else ""
    system_prompt = f"You are a generic chatbot assistant. {additional_context}"

    completion = client.chat.completions.create(
      model="gpt-3.5-turbo",
      messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt}
      ],
      max_tokens=100,
      temperature=0.0,
    )
    return completion.choices[0].message.content
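
The prompt assembly can be checked without calling the API. The sketch below pulls it into a hypothetical helper, `build_messages`, that follows the same logic as `llm_pipeline` and returns the `messages` list that would be sent:

```python
def build_messages(prompt, context=""):
    """Build the chat messages; the context is added only when non-empty."""
    additional_context = (
        f"Answer user prompt based on the following context: {context}."
        if context
        else ""
    )
    system_prompt = f"You are a generic chatbot assistant. {additional_context}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]

# Without context the system prompt carries only the role description.
print(build_messages("What is a vector database?")[0]["content"])
# With context the retrieved text is embedded in the system prompt.
print(build_messages("What is LVD?", context="LVD is a vector database.")[0]["content"])
```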

Below is the definition of `rag_pipeline`, which represents a simple RAG architecture. It takes the user prompt and uses it to perform a search query in the LVD.
The result of the search query is the context that the LLM receives. Thanks to this context, the LLM is able to generate an up-to-date answer.

In [18]:
def rag_pipeline(prompt):
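    # Use the prompt itself as the search query and keep only the top hit.
    results = collection.query(
        query_texts=[prompt],
        include=["documents"],
        n_results=1,
        n_buckets=1,
    )
    # results['documents'] holds one list of hits per query text.
    context = results['documents'][0][0]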
    answer = llm_pipeline(prompt, context)
    return answer
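
The retrieve-then-generate flow can also be tried offline. The sketch below is a toy stand-in, not how the LVD searches: `retrieve` ranks documents by shared words instead of embeddings, and `fake_llm` only echoes the context it was given. All three function names are hypothetical:

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents):
    """Toy retriever: pick the document sharing the most words with the query."""
    query_words = tokenize(query)
    return max(documents, key=lambda doc: len(query_words & tokenize(doc)))

def fake_llm(prompt, context=""):
    """Stand-in for llm_pipeline(): reports which context it received."""
    return f"[context: {context}]" if context else "[no context]"

def toy_rag_pipeline(prompt, documents):
    context = retrieve(prompt, documents)  # 1. search with the user prompt
    return fake_llm(prompt, context)       # 2. hand the hit to the model

docs = [
    "Global inflation rates surged in 2022 due to supply chain disruptions.",
    "The Apple Vision Pro Headset, unveiled in 2023 by Apple, is a mixed reality device.",
]
print(toy_rag_pipeline("Describe the product called Apple Vision Pro.", docs))
```

Swapping `retrieve` for `collection.query` and `fake_llm` for `llm_pipeline` gives back the pipeline defined above.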

## RAG Test

In [19]:
user_prompt = "Describe the product called Apple Vision Pro that was recently released by Apple."

llm_answer = llm_pipeline(user_prompt)
print("LLM Answer: \n", llm_answer)

[2024-04-18 21:02:15,986][INFO ][httpx] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


LLM Answer: 
 I'm sorry, but as of my last update, there is no information available about a product called Apple Vision Pro being released by Apple. It's possible that it may be a new product that has not been announced yet or it could be a fictional product. I recommend checking Apple's official website or news sources for the most up-to-date information on their product releases. Let me know if you have any other questions or need assistance with anything else.


In [20]:
rag_answer = rag_pipeline(user_prompt)
print("RAG Answer: \n", rag_answer)

[2024-04-18 21:02:18,884][INFO ][httpx] HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
[2024-04-18 21:02:21,398][INFO ][httpx] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


RAG Answer: 
 The Apple Vision Pro is a cutting-edge mixed reality headset recently unveiled by Apple. This innovative device combines augmented and virtual reality technologies to provide users with an immersive experience. It boasts advanced spatial audio capabilities, 8K displays for each eye, and seamless integration with other Apple products. The Apple Vision Pro is designed to revolutionize the way users interact with digital content and the world around them.
