# SingleStore and Groq RAG Quickstart

This notebook shows how to use SingleStore as a vector database together with Groq, a provider of low-latency LLM inference.

In [7]:
!pip install -q -U langchain langchain-groq singlestoredb langchain-openai

In [12]:
import os
from getpass import getpass

# Prompt for the key without echoing it, then expose it to langchain-groq
GROQ_API_KEY = getpass("Groq API key: ")
os.environ["GROQ_API_KEY"] = GROQ_API_KEY

In [30]:
# The OpenAI key is used for the comparison model and for the embeddings
OPENAI_API_KEY = getpass("OpenAI API key: ")
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

In [21]:
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate

## Initialize ChatGroq

In [55]:
groq = ChatGroq(temperature=0, model_name="llama3-8b-8192")

In [48]:
from langchain_openai import ChatOpenAI

openai = ChatOpenAI(model="gpt-3.5-turbo-0125")

## Test Groq

In [66]:
system = "You are a helpful assistant."
human = "{text}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

chain = prompt | groq
chain.invoke({"text": "Explain the importance of low latency LLMs."})

AIMessage(content="Large Language Models (LLMs) have revolutionized the field of natural language processing (NLP) by enabling applications such as language translation, text summarization, and chatbots. However, traditional LLMs often suffer from high latency, which can be a significant limitation in many applications. Low latency LLMs, on the other hand, offer several advantages that make them crucial for various use cases. Here are some reasons why low latency LLMs are important:\n\n1. **Real-time applications**: In applications like chatbots, virtual assistants, and real-time language translation, low latency is essential to provide a seamless user experience. Low latency LLMs enable faster response times, reducing the delay between user input and the AI's response.\n2. **Interactive systems**: Interactive systems like language-based games, quizzes, or educational platforms require low latency to ensure a smooth user experience. Low latency LLMs can process user input quickly, prov

## OpenAI GPT-3.5 Turbo for comparison

In [65]:
system = "You are a helpful assistant."
human = "{text}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

chain = prompt | openai
chain.invoke({"text": "Explain the importance of low latency LLMs."})

AIMessage(content='Low latency LLMs, or Low-Latency Memory Modules, are important in high-performance computing environments where fast data access is critical. Here are some reasons why low latency LLMs are important:\n\n1. Reduced response time: Low latency LLMs provide faster access to data, reducing the time it takes for the CPU to retrieve information from memory. This results in quicker response times for applications, leading to improved overall system performance.\n\n2. Enhances system efficiency: By minimizing latency, low latency LLMs help in reducing data access bottlenecks and increasing the efficiency of data processing. This is particularly important in real-time applications where timely data retrieval is crucial.\n\n3. Increased throughput: Low latency LLMs can improve the overall throughput of a system by allowing data to be accessed and processed more quickly. This can be beneficial for applications that require high data transfer rates, such as in financial trading o
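
The two cells above compare the models' answers but not their response times. A minimal sketch of a timing helper follows. The names `time_call` and `fake_invoke` are hypothetical and are not part of LangChain; `fake_invoke` stands in for `chain.invoke` so the sketch runs without API keys.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_invoke(inputs):
    """Hypothetical stand-in for chain.invoke; sleeps briefly, then echoes."""
    time.sleep(0.01)
    return f"echo: {inputs['text']}"


result, elapsed = time_call(fake_invoke, {"text": "hello"})
print(f"{elapsed:.3f}s -> {result}")
```

In the notebook, passing `chain.invoke` in place of `fake_invoke` for each chain would give a rough wall-clock comparison. A single call is noisy, so several runs per model would be more informative.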

## RAG using SingleStoreDB

In [71]:
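import os

from langchain.vectorstores import SingleStoreDB
from langchain_openai import OpenAIEmbeddings

# connection_user, connection_password, connection_host, connection_port and
# connection_default_database are assumed to be defined already (for example,
# by the SingleStore notebook environment); set them manually otherwise.
os.environ["SINGLESTOREDB_URL"] = f'{connection_user}:{connection_password}@{connection_host}:{connection_port}/{connection_default_database}'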

In [70]:
from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://python.langchain.com/docs/integrations/chat/groq/")
data = loader.load()

In [72]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
all_splits = text_splitter.split_documents(data)
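
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a simplified sketch of fixed-size chunking with overlap. It is not how `RecursiveCharacterTextSplitter` is implemented (that class first tries to split on separators such as paragraph and line breaks); `split_with_overlap` is a hypothetical helper that only illustrates how consecutive chunks share characters.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size chunks where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.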

In [74]:
vectorstore = SingleStoreDB.from_documents(documents=all_splits, embedding=OpenAIEmbeddings(), table_name="test9")
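
Conceptually, the vector store keeps one embedding per chunk and, at query time, returns the chunks whose embeddings are closest to the query embedding. The sketch below shows that idea in plain Python with made-up three-dimensional vectors. The notebook does not state which distance metric the store uses, so cosine similarity is an assumption here, and `cosine_similarity` and `top_k` are hypothetical helpers rather than SingleStore or LangChain functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=2):
    """Return the texts of the k rows most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy (text, embedding) rows; real embeddings have far more dimensions.
rows = [
    ("Groq chat setup", [1.0, 0.0, 0.0]),
    ("SingleStore connection", [0.0, 1.0, 0.0]),
    ("Groq streaming", [0.9, 0.1, 0.0]),
]
nearest = top_k([1.0, 0.05, 0.0], rows, k=2)
print(nearest)
```

A database performs this ranking inside the engine instead of in Python, which is what makes it practical for large tables.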

In [75]:
from langchain.chains import RetrievalQA

# Use the Groq chat model defined above as the answering LLM
qa_chain = RetrievalQA.from_chain_type(llm=groq, retriever=vectorstore.as_retriever())
qa_chain.invoke({"query": "Please show a simple example of how to chat with Groq with Langchain in python."})

{'query': 'Please show a simple example of how to chat with Groq with Langchain in python.',
 'result': 'Here is a simple example of how to chat with Groq using Langchain in Python:\n\n```\nfrom langchain.groq import ChatGroq\nfrom langchain.prompts import ChatPromptTemplate\n\n# Initialize the ChatGroq class\nchat = ChatGroq(temperature=0, model_name="mixtral-8x7b-32768")\n\n# Define the system message\nsystem = "You are a helpful assistant."\n\n# Define the human message\nhuman = "What is the definition of Groq?"\n\n# Create a prompt template\nprompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])\n\n# Invoke ChatGroq to create completions\nchain = prompt | chat.invoke({"text": "Explain the definition of Groq."})\n\n# Print the response\nprint(chain)\n```\n\nThis code initializes the `ChatGroq` class with a temperature of 0 and a model name of "mixtral-8x7b-32768". It then defines a system message and a human message, creates a prompt template using these m
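
By default, `RetrievalQA.from_chain_type` uses the "stuff" chain type, which places all retrieved chunks into a single prompt alongside the question. The sketch below illustrates that assembly step only. The template wording and the `build_stuffed_prompt` helper are hypothetical and do not reproduce LangChain's actual prompt.

```python
def build_stuffed_prompt(question, retrieved_chunks):
    """Join retrieved chunks into one context block and append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


retrieved = [
    "ChatGroq is imported from langchain_groq.",
    "A chain is built with prompt | chat.",
]
stuffed = build_stuffed_prompt("How do I chat with Groq?", retrieved)
print(stuffed)
```

Because every retrieved chunk goes into one prompt, the number and size of chunks must fit within the model's context window.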