# Building a Generative AI Application with LlamaIndex and SingleStore

This guide walks through building a generative AI application with LlamaIndex and SingleStoreDB, step by step, with code explanations and best practices along the way.

## Overview
LlamaIndex is a library for ingesting, indexing, and querying contextual information for Retrieval Augmented Generation (RAG). Paired with SingleStoreDB, a scalable, SQL-compliant relational database, it provides a foundation for generative AI applications that need real-time data processing and retrieval to answer user queries efficiently. LlamaIndex is also compatible with LangChain, another popular library for composing LLM inputs and outputs. We'll use both with SingleStore to build an end-to-end GenAI app.

## What You'll Learn
- Setting up the environment with the required packages and credentials.
- Ingesting and indexing data using LlamaIndex for efficient retrieval.
- Storing and managing data in SingleStoreDB.
- Building a retrieval-based generative AI system to respond to user queries.

## Prerequisites
- Basic knowledge of Python programming.
- Understanding of SQL databases.
- Familiarity with generative AI concepts would be beneficial.


Let's first install the necessary packages.

In [1]:
!pip install llama-index --quiet
!pip install langchain --quiet
!pip install llama-hub --quiet
!pip install singlestoredb --quiet

Then, let's set our OpenAI API key. Note: the API keys used in this notebook are invalid placeholders; substitute your own.

In [2]:
import os
os.environ["OPENAI_API_KEY"] = "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"

Next, we'll import the SingleStore vectorstore from LangChain.

In [3]:
from langchain.vectorstores import SingleStoreDB

After importing SingleStore, we can ingest the LlamaIndex documentation into a new table. This takes three steps:

1. Load raw HTML data using `WebBaseLoader`.
2. Chunk the text.
3. Embed (vectorize) the chunked text, then ingest it into SingleStore.

In [4]:
from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://gpt-index.readthedocs.io/en/latest/")
data = loader.load()

In [5]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 0)
all_splits = text_splitter.split_documents(data)
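
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sketch of fixed-size chunking in plain Python. It is a simplified illustration only: `RecursiveCharacterTextSplitter` additionally tries to split on natural separators such as paragraphs and line breaks, which this sketch ignores. The function name `chunk_text` is hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 0) -> List[str]:
    """Split text into fixed-size character windows (simplified illustration)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each new window starts (chunk_size - chunk_overlap) characters after the last.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "abcdefghij" * 3  # 30 characters
print(chunk_text(sample, chunk_size=12, chunk_overlap=0))
print(chunk_text(sample, chunk_size=12, chunk_overlap=4))
```

With an overlap of 0, as in the cell above, chunks never share text. A non-zero overlap repeats the tail of each chunk at the head of the next, which can help when an answer straddles a chunk boundary.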

In [7]:
from langchain.embeddings import OpenAIEmbeddings
os.environ["SINGLESTOREDB_URL"] = "admin:Naopassara0@svc-b0c3cf30-3e0a-4933-be43-ab1e7f5e2117-dml.aws-saopaulo-1.svc.singlestore.com:3306/minhaDB"

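# First run only: uncomment the next line to embed the chunks and ingest them into SingleStore.
# vectorstore = SingleStoreDB.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
# Afterwards, connect to the existing table of embeddings instead.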
vectorstore = SingleStoreDB(embedding=OpenAIEmbeddings())
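
Rather than hard-coding credentials in the notebook, the connection string can be assembled from environment variables. The sketch below assumes the `user:password@host:port/database` format shown above; the helper name `singlestore_url_from_env` and the `S2_*` variable names are hypothetical, and passwords containing characters such as `@` or `/` may need escaping depending on the driver.

```python
import os


def singlestore_url_from_env(prefix: str = "S2_") -> str:
    """Assemble user:password@host:port/database from environment variables."""
    required = ["USER", "PASSWORD", "HOST", "DATABASE"]
    missing = [prefix + name for name in required if not os.environ.get(prefix + name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    port = os.environ.get(prefix + "PORT", "3306")  # default port used in this guide
    return "{user}:{password}@{host}:{port}/{database}".format(
        user=os.environ[prefix + "USER"],
        password=os.environ[prefix + "PASSWORD"],
        host=os.environ[prefix + "HOST"],
        port=port,
        database=os.environ[prefix + "DATABASE"],
    )


# Demo values so the sketch runs on its own; in practice, set these outside the notebook.
os.environ.setdefault("S2_USER", "admin")
os.environ.setdefault("S2_PASSWORD", "change-me")
os.environ.setdefault("S2_HOST", "localhost")
os.environ.setdefault("S2_DATABASE", "demo")

os.environ["SINGLESTOREDB_URL"] = singlestore_url_from_env()
```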

Now we'll use LlamaIndex to retrieve and query from SingleStore with the SingleStoreReader, a lightweight embedding lookup tool for SingleStore tables that hold content alongside vector data.

Note that the full SingleStore vectorstore integration with LlamaIndex for ingesting and indexing is coming soon!
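
Conceptually, an embedding lookup scores every stored vector against the query vector and returns the best matches. The following is a minimal in-memory sketch of that idea, assuming dot-product similarity; the actual reader performs the lookup inside the database, and the metric it uses is an assumption here. The names `dot` and `top_k` and the toy vectors are hypothetical.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(
    query_vec: Sequence[float],
    rows: Sequence[Tuple[str, Sequence[float]]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the k (content, score) pairs with the highest similarity."""
    scored = [(content, dot(vec, query_vec)) for content, vec in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy table of (content, vector) rows standing in for the embeddings table.
rows = [
    ("LlamaIndex overview", [0.9, 0.1, 0.0]),
    ("SingleStore pricing", [0.0, 0.2, 0.9]),
    ("Query engines", [0.7, 0.6, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], rows, k=2))
```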

In [15]:
from llama_index import download_loader

SingleStoreReader = download_loader("SingleStoreReader")

reader = SingleStoreReader(
    scheme="mysql",
    host="svc-b0c3cf30-3e0a-4933-be43-ab1e7f5e2117-dml.aws-saopaulo-1.svc.singlestore.com",
    port=3306,
    user="admin",
    password="Naopassara0",
    dbname="minhaDB",
    table_name="embeddings",
    content_field="content",
    vector_field="vector"
)



Let's test it out. The function below takes a natural language query as input, then does the following:

1. Embeds the query using the OpenAI embedding model, `text-embedding-ada-002` by default, and uses the reader to fetch the matching documents from SingleStore.
2. Loads the documents into a LlamaIndex list index, a data structure that passes all of its documents into the context.
3. Initializes the index as a LlamaIndex query engine, which uses the `gpt-3.5-turbo` OpenAI LLM by default to interpret the query and the provided context, then generate a response.
4. Returns the response.

In [10]:
import json

from llama_index import ListIndex

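def ask_llamaindex_docs(query):
    # Embed the query with the same embedding model used at ingestion time.
    embeddings = OpenAIEmbeddings()
    search_embedding = embeddings.embed_query(query)

    # Look up the stored chunks that best match the query embedding.
    documents = reader.load_data(search_embedding=json.dumps(str(search_embedding)))

    # Put every retrieved document into the context and query over it.
    index = ListIndex(documents)
    query_engine = index.as_query_engine()

    response = query_engine.query(query)
    return response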

In [49]:
print(ask_llamaindex_docs("What is Llama Index?"))

Llama Index is a data framework for LLM applications to ingest, structure, and access private or domain-specific data.


In [50]:
print(ask_llamaindex_docs("What are data indexes in Llama Index?"))

Data indexes in Llama Index are modules that allow users to organize and retrieve their data efficiently. These indexes can be customized and extended to fit the specific needs of the users.


In [53]:
print(ask_llamaindex_docs("What are query engines in Llama Index?"))

Query engines in Llama Index are components that handle different types of queries on the data stored in the index. They include the Graph Query Engine, Multistep Query Engine, Retriever Query Engine, Transform Query Engine, Router Query Engine, Retriever Router Query Engine, Sub Question Query Engine, SQL Join Query Engine, Flare Query Engine, Citation Query Engine, Knowledge Graph Query Engine, SQL Query Engine, and Pandas Query Engine. Each query engine is designed to handle specific types of queries and provide efficient and accurate results.


# Tips and Tricks

## Chat engines

Chat with your data conversationally using LlamaIndex chat engines, which support follow-ups and further questions.
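
The idea behind follow-up questions can be sketched without any library: each turn retrieves context for the new message and sends it to the model together with the earlier turns. This is a toy illustration under stated assumptions, not LlamaIndex's implementation; `ToyContextChat`, `fake_retrieve`, and `fake_llm` are hypothetical stand-ins for the chat engine, the retriever, and the LLM.

```python
from typing import Callable, List, Tuple


class ToyContextChat:
    """Keeps chat history and adds retrieved context to every prompt."""

    def __init__(self, retrieve: Callable[[str], List[str]], llm: Callable[[str], str]):
        self.retrieve = retrieve
        self.llm = llm
        self.history: List[Tuple[str, str]] = []

    def chat(self, message: str) -> str:
        # Retrieve context for the new message only.
        context = "\n".join(self.retrieve(message))
        lines = ["Context:\n" + context]
        # Replay earlier turns so the model can resolve follow-ups like "it".
        for role, text in self.history:
            lines.append(f"{role}: {text}")
        lines.append(f"Human: {message}")
        reply = self.llm("\n".join(lines))
        self.history.append(("Human", message))
        self.history.append(("Assistant", reply))
        return reply


def fake_retrieve(message: str) -> List[str]:
    # Stand-in for a vector lookup.
    return ["LlamaIndex is a data framework for LLM applications."]


def fake_llm(prompt: str) -> str:
    # Stand-in for an LLM call; reports how many human turns it was shown.
    return f"(reply based on {prompt.count('Human:')} human turn(s))"


bot = ToyContextChat(fake_retrieve, fake_llm)
print(bot.chat("What is Llama Index?"))
print(bot.chat("What can it do?"))
```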




In [40]:
query = "What is Llama Index?"

embeddings = OpenAIEmbeddings()
search_embedding = embeddings.embed_query(query)
documents = reader.load_data(search_embedding=json.dumps(str(search_embedding)))

index = ListIndex(documents)

chat_engine = index.as_chat_engine(chat_mode='context')

chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.

Human: What is Llama Index?
Assistant: LlamaIndex is an open-source library that provides tools and APIs for building and deploying large-scale search and retrieval systems. It allows users to ingest, index, and query large amounts of data efficiently. LlamaIndex supports various types of data stores, including document stores, key-value stores, and graph stores. It also provides different indexing techniques and retrieval models to optimize search performance. LlamaIndex is designed to be flexible and customizable, allowing users to tailor the system to their specific needs. It is suitable for both beginners and advanced users, offering a high-level API for quick and easy usage, as well as lower-level APIs for customization and extension.

Human: What can Llama Index do?
Assistant: LlamaIndex offers a range of capabilities for building and deploying search and retrieval systems. Here are some key features and functionalities of Llam

KeyboardInterrupt: ignored

## Finetune Embeddings

Improve your retrieval performance by 5-10% using a finetuned embedding model. Though a full implementation is outside the scope of this guide, at a high level, you will:

1. Split your data into train and validation datasets.
2. Generate synthetic QA embedding pairs.
3. Finetune your model using LlamaIndex.
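
Step 1 can be sketched in plain Python as a seeded shuffle followed by a slice. The helper name `train_val_split`, the 80/20 ratio, and the string placeholders standing in for nodes are assumptions for illustration; any split that keeps the two sets disjoint would serve.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_val_split(
    items: Sequence[T], val_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Shuffle deterministically, then hold out a fraction for validation."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder strings standing in for parsed document nodes.
nodes = [f"chunk-{i}" for i in range(10)]
train_nodes, val_nodes = train_val_split(nodes)
print(len(train_nodes), len(val_nodes))
```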


In [None]:
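from llama_index.finetuning import generate_qa_embedding_pairs

# Your data goes here: train_nodes and val_nodes come from your own train/validation split.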
train_dataset = generate_qa_embedding_pairs(train_nodes)
val_dataset = generate_qa_embedding_pairs(val_nodes)

In [46]:
from llama_index.finetuning import SentenceTransformersFinetuneEngine

In [None]:
finetune_engine = SentenceTransformersFinetuneEngine(
    train_dataset,
    model_id="BAAI/bge-small-en",
    model_output_path="test_model",
    val_dataset=val_dataset,
)

In [None]:
finetune_engine.finetune()

In [None]:
embed_model = finetune_engine.get_finetuned_model()

****

## Data Agents

Data Agents are LlamaIndex agents that can reason over your data and carry out predefined tasks, with the ability to both read and modify that data. They can:

- Perform automated search and retrieval over different types of data: unstructured, semi-structured, and structured.

- Call any external service API in a structured fashion, then process the response and store it for later.
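
At its core, an agent picks a tool, calls it, and turns the result into a response. The sketch below shows one such step with a hard-coded keyword router in place of an LLM; it is a toy illustration, not how `ReActAgent` is implemented, and `run_toy_agent`, `docs_tool`, and `keyword_router` are hypothetical names.

```python
from typing import Callable, Dict


def run_toy_agent(
    question: str,
    tools: Dict[str, Callable[[str], str]],
    choose_tool: Callable[[str], str],
) -> str:
    """One thought/action/observation step with a pluggable tool chooser."""
    tool_name = choose_tool(question)  # "Thought" and "Action"
    if tool_name not in tools:
        return "Response: I cannot answer that with the tools I have."
    observation = tools[tool_name](question)  # "Observation"
    return f"Response: {observation}"


def docs_tool(question: str) -> str:
    # Stand-in for a query engine over the documentation.
    return "LlamaIndex is a data framework for LLM applications."


def keyword_router(question: str) -> str:
    # A real agent would ask an LLM which tool to use.
    return "llamaindex_docs" if "llama" in question.lower() else "unknown"


tools = {"llamaindex_docs": docs_tool}
print(run_toy_agent("What is Llama Index?", tools, keyword_router))
```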

We'll create a simple agent and give it a tool that wraps a query engine over the `index` built in the chat engine example, the same kind of query engine that `ask_llamaindex_docs` uses.

In [70]:
from llama_index.llms import OpenAI
from llama_index.agent import ReActAgent
from llama_index.tools import QueryEngineTool, ToolMetadata

In [71]:
llamaindex_docs_tool = QueryEngineTool(
    query_engine=index.as_query_engine(),
    metadata=ToolMetadata(
        name="llamaindex_docs",
        description="Provides access to the docs for Llama Index, a library for ingesting, indexing, and querying data for LLMs."
    )
)

In [72]:
agent = ReActAgent.from_tools([llamaindex_docs_tool], verbose=True)

In [73]:
agent.reset()

In [74]:
agent.chat("What is Llama Index?")

[38;5;200m[1;3mThought: I need to use a tool to help me answer the question.
Action: llamaindex_docs
Action Input: {'input': 'Llama Index'}
[0m[36;1m[1;3mObservation: LlamaIndex is a tool that provides APIs for both beginner and advanced users. It allows beginners to easily ingest and query their data with just a few lines of code. Advanced users can customize and extend various modules, such as data connectors, indices, retrievers, query engines, and reranking modules, to suit their specific needs. LlamaIndex also supports different types of stores, including document stores, index stores, key-value stores, and graph stores. It offers various tutorials and guides to help users understand and utilize its features effectively.
[0m[38;5;200m[1;3mResponse: Llama Index is a tool that provides APIs for both beginner and advanced users. It allows beginners to easily ingest and query their data with just a few lines of code. Advanced users can customize and extend various modules, suc

AgentChatResponse(response='Llama Index is a tool that provides APIs for both beginner and advanced users. It allows beginners to easily ingest and query their data with just a few lines of code. Advanced users can customize and extend various modules, such as data connectors, indices, retrievers, query engines, and reranking modules, to suit their specific needs. Llama Index also supports different types of stores, including document stores, index stores, key-value stores, and graph stores. It offers various tutorials and guides to help users understand and utilize its features effectively.', sources=[], source_nodes=[])

In [None]:
agent.chat("Tell me about its capabilities")