# Building a Generative AI Application with Vertex AI, LangChain, and SingleStoreDB

This guide walks through building a generative AI question-answering application with Google Cloud Vertex AI, LangChain, and SingleStoreDB. It provides step-by-step instructions, code explanations, and best practices.

## Overview
Vertex AI, a Google Cloud product, offers an integrated suite of machine learning tools for building, deploying, and scaling AI models. SingleStoreDB is a fast, scalable, SQL-compliant relational database that can also store and search vector embeddings. By combining Vertex AI's language model and embeddings with SingleStoreDB's storage and retrieval, and using LangChain to connect the two, we can create AI applications that answer user queries in real time.

## What You'll Learn
- Setting up your environment with the necessary packages and credentials.
- Fetching and processing data to be used in our AI models.
- Storing and managing data efficiently using SingleStoreDB.
- Using Vertex AI's language model and embeddings to process the data.
- Building a retrieval-based QA system to answer user queries.

## Prerequisites
- Basic knowledge of Python programming.
- Familiarity with Google Cloud services and SQL databases.
- An active Google Cloud account.
- A SingleStoreDB hosted or self-managed instance.

Let's dive in and start building!



**Setting up the environment**: Before we begin, make sure all the necessary packages are installed. Run the cell below to install gcloud, langchain, google-cloud-aiplatform, and singlestoredb, plus shapely pinned to version 1.8.5.

In [None]:
!pip install gcloud
!pip install langchain
!pip install google-cloud-aiplatform
!pip install singlestoredb
!pip install shapely==1.8.5

Collecting gcloud
  Downloading gcloud-0.18.3.tar.gz (454 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: gcloud
  Building wheel for gcloud (setup.py) ... done
  Created wheel for gcloud: filename=gcloud-0.18.3-py3-none-any.whl size=602927 sha256=fcacb897159bfd7010fad334694b9f170c99b3f0250d342f3678370e2edb217d
  Stored in directory: /root/.cache/pip/wheels/7c/30/88/5017af921da3a33af785f0d0fd3e944b845bc62a445a2c2f69
Successfully built gcloud
Installing collected packages: gcloud
Successfully installed gcloud-0.18.3
Collecting langchain
  Downloading 
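To confirm that the installs succeeded, a small check like the one below can help. It is a minimal sketch that uses only the standard library; the `installed_versions` helper is a hypothetical name introduced here for illustration, and the package list is simply copied from the pip commands above.

```python
from importlib import metadata

# Package names copied from the pip commands in the install cell.
REQUIRED = ["gcloud", "langchain", "google-cloud-aiplatform", "singlestoredb", "shapely"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so a missing install is easy to spot.
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` should be installed again before moving on.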

**Authentication**: The next step involves authenticating our session with Google Cloud. By running the following cell, you'll be prompted to log in using your Google Cloud credentials. Follow the instructions to complete the login process.

In [None]:
!gcloud auth application-default login


You are running on a Google Compute Engine virtual machine.
The service credentials associated with this virtual machine
will automatically be used by Application Default
Credentials, so it is not necessary to use this command.

If you decide to proceed anyway, your user credentials may be visible
to others with access to this virtual machine. Are you sure you want
to authenticate with your personal account?

Do you want to continue (Y/n)?  y

Go to the following link in your browser:

    https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=764086051850-6qr4p6gpi6hn506pt8ejuq83di341hur.apps.googleusercontent.com&redirect_uri=https%3A%2F%2Fsdk.cloud.google.com%2Fapplicationdefaultauthcode.html&scope=openid+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fsqlservice.login&state=FGRDa4a34RzM3AQBDUuL1USPeljKYb&prompt=consent&access_type=offline&code_challenge=s34gV

**Setting the Quota Project**: After authentication, we need to set our quota project. Replace my-project-1516239077425 with your project ID if different, then run the cell to set the quota project for this session.

In [None]:
!gcloud auth application-default set-quota-project my-project-1516239077425


Credentials saved to file: [/content/.config/application_default_credentials.json]

These credentials will be used by any library that requests Application Default Credentials (ADC).

Quota project "my-project-1516239077425" was added to ADC which can be used by Google client libraries for billing and quota. Note that some services may still bill the project owning the resource.


In [None]:
!gcloud config set project my-project-1516239077425

Updated property [core/project].


**Importing Necessary Modules**: With the initial setup complete, let's import the classes we'll use throughout this project. The following cell imports the VertexAI LLM wrapper, the RetrievalQA chain, and the SingleStoreDB vector store, all from langchain.

In [None]:
from langchain.llms import VertexAI
from langchain.chains import RetrievalQA
from langchain.vectorstores import SingleStoreDB

**Initializing Vertex AI**: To interact with Google Cloud's Vertex AI services, we first need to instantiate the VertexAI class. Running the cell below will create this instance and store it in the variable llm.

In [None]:
llm = VertexAI()

**Loading Data from the Web**: Our application requires data to process and generate insights. In this step, we'll fetch content from a URL using the WebBaseLoader class. The loaded data will be stored in the data variable. You can replace the URL with any other source if needed.


In [None]:
from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://cloud.google.com/vertex-ai/docs/generative-ai/learn/generative-ai-studio")
data = loader.load()
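To get a feel for what loading a web page involves, the sketch below reduces an HTML string to its visible text. This is only an illustration built on the standard library, not the implementation of WebBaseLoader; the `TextExtractor` class, the `html_to_text` function, and the sample HTML are all hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text and skip the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def html_to_text(html):
    """Return the visible text fragments of an HTML string, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


sample = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b>.</p>"
    "<script>var x=1;</script></body></html>"
)
print(html_to_text(sample))
```

In the notebook itself, the loader takes care of both fetching the page and extracting its text, so no such helper is needed.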

**Splitting the Data**: To process the data more efficiently, we'll split the loaded content into smaller chunks. The RecursiveCharacterTextSplitter class helps in achieving this by dividing the data based on specified character limits.

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
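The effect of `chunk_size` and `chunk_overlap` can be illustrated with a simplified fixed-window splitter. This is a sketch under stated assumptions: the hypothetical `chunk_text` function cuts purely by character position, whereas RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and sentences, so its chunks will differ.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=0):
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


text = "abcdefghij" * 120  # 1200 characters of dummy content
chunks = chunk_text(text, chunk_size=500, chunk_overlap=0)
print([len(c) for c in chunks])  # prints [500, 500, 200]
```

With `chunk_overlap=0`, as in the cell above, the chunks do not share any text. A non-zero overlap repeats the tail of each chunk at the start of the next one, which can help when an answer straddles a chunk boundary.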

**Setting Up SingleStoreDB with Vertex AI Embeddings**: For efficient storage and retrieval of our data, we use SingleStoreDB in conjunction with Vertex AI embeddings. The following cell shows how to set the SINGLESTOREDB_URL environment variable and initializes the SingleStoreDB vector store with Vertex AI embeddings. The lines that set the URL and load the documents are commented out; on a first run, uncomment them and supply your own SingleStoreDB URL and credentials.


In [None]:
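from langchain.embeddings import VertexAIEmbeddings
from langchain.vectorstores import SingleStoreDB
import os

# Connection string format: user:password@host:port/database.
# Uncomment and fill in your own values; avoid committing real passwords.
# os.environ["SINGLESTOREDB_URL"] = "<user>:<password>@<host>:3306/<database>"

# First run only: embed the chunks and write them to a table.
# vectorstore = SingleStoreDB.from_documents(documents=all_splits, embedding=VertexAIEmbeddings(), table_name="test")

# Later runs: connect to an already populated table without re-embedding.
# Pass table_name here if the data was stored under a non-default name.
vectorstore = SingleStoreDB(embedding=VertexAIEmbeddings())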
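Conceptually, a vector store keeps each chunk next to its embedding and answers a query by returning the chunks whose embeddings are most similar to the query's. The toy version below sketches that idea in memory. Everything in it is an assumption made for illustration: `embed` is a word-count stand-in for a real embedding model, `ToyVectorStore` is a hypothetical class, and cosine similarity is used only as an example metric, which may not match how SingleStoreDB is configured.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory list of (text, vector) rows with similarity search."""

    def __init__(self):
        self._rows = []

    def add_texts(self, texts):
        for text in texts:
            self._rows.append((text, embed(text)))

    def similarity_search(self, query, k=2):
        query_vector = embed(query)
        ranked = sorted(
            self._rows,
            key=lambda row: cosine(query_vector, row[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add_texts([
    "vertex ai is a machine learning platform",
    "singlestoredb is a relational database",
    "prompts can be tested in a studio",
])
print(store.similarity_search("what is vertex ai", k=1))
```

In the real setup, Vertex AI produces the embeddings and SingleStoreDB stores and ranks them, so the application code only has to call the vector store.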

**Setting Up and Testing the QA Chain**: Once our data is processed and stored, we can use it to answer queries. The following cell initializes the RetrievalQA chain using the previously set up llm and vectorstore. After initializing, it tests the setup with a sample question about Vertex AI.



In [None]:
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
qa_chain({"query": "What is Vertex AI?"})

{'query': 'What is Vertex AI?',
 'result': ' Vertex AI is a unified platform for machine learning models and generative AI. It allows you to build, tune, and deploy foundation models on Vertex AI, as well as use generative AI apps for search and conversational AI.'}
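The flow behind a retrieval-based QA call can be sketched in a few lines: retrieve the relevant chunks, place them in a prompt as context, and pass that prompt to the model. The code below is a simplified illustration and not LangChain's implementation. The `build_prompt` and `answer` functions, the prompt wording, and both stubs are hypothetical, so it runs without any cloud credentials.

```python
def build_prompt(question, chunks):
    """Combine the retrieved chunks and the question into a single prompt."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def answer(question, retrieve, llm):
    """Retrieve context for the question, then ask the model."""
    chunks = retrieve(question)
    prompt = build_prompt(question, chunks)
    return {"query": question, "result": llm(prompt)}


# Stubs standing in for the vector store retriever and the Vertex AI model.
def fake_retrieve(question):
    return ["Vertex AI is a unified platform for machine learning models and generative AI."]


def fake_llm(prompt):
    return f"(stub answer based on a prompt of {len(prompt)} characters)"


print(answer("What is Vertex AI?", fake_retrieve, fake_llm))
```

Because the model only sees the retrieved chunks, the quality of the answers depends heavily on how the data was split and embedded in the earlier steps.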

In [None]:
qa_chain({"query": "What is the primary purpose of Generative AI Studio?"})

{'query': 'What is the primary purpose of Generative AI Studio?',
 'result': ' The primary purpose of Generative AI Studio is to rapidly prototype and test generative AI models.'}

In [None]:
qa_chain({"query": "What are some of the tasks you can perform in Generative AI Studio?"})

{'query': 'What are some of the tasks you can perform in Generative AI Studio?',
 'result': " Some of the tasks you can perform in Generative AI Studio include:\n\n- Testing models using prompt samples\n- Exploring generative AI models in Model Garden\n- Designing your own prompts\n- Customizing foundation models to handle tasks that meet your application's needs"}

In [None]:
qa_chain({"query": "Where can you find sample prompts to test models in Generative AI Studio?"})

{'query': 'Where can you find sample prompts to test models in Generative AI Studio?',
 'result': ' In the Prompt Gallery, in the Language section of Generative AI Studio.'}

In [None]:
qa_chain({"query": "How can you ensure that a designed prompt elicits the desired response from a language model?"})

{'query': 'How can you ensure that a designed prompt elicits the desired response from a language model?',
 'result': ' The only way to ensure that a designed prompt elicits the desired response from a language model is to test the prompt and see if it generates the desired result. '}

In [None]:
qa_chain({"query": "What are some code examples from vertex ai?"})

{'query': 'What are some code examples from vertex ai?',
 'result': ' Code examples from Vertex AI include: \n\nTraining an AutoML model\nTraining a custom model\nGetting predictions from a custom model\nUsing Vertex AI and the Python SDK to train a model\nUsing image models with Imagen on Vertex AI\nUsing the Vertex AI SDK for Python\nUsing Vertex AI in notebooks\nSetting up a project and a development environment for Vertex AI\nExploring AI models and APIs with Vertex AI\nDeveloping your own models with Vertex AI'}
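When testing several questions, a small helper avoids repeating the same call in every cell. The sketch below assumes only what the outputs above show: the chain is called with a dict containing a `query` key and returns a dict containing a `result` key. The `ask_all` function is a hypothetical helper, and `stub_chain` stands in for `qa_chain` so the example runs on its own.

```python
def ask_all(chain, questions):
    """Call chain({"query": q}) for each question and collect the answers."""
    answers = {}
    for question in questions:
        response = chain({"query": question})
        # Results in this notebook start with a space, so trim whitespace.
        answers[question] = response["result"].strip()
    return answers


# Stub standing in for qa_chain; it echoes the question back.
def stub_chain(inputs):
    return {"query": inputs["query"], "result": " stub answer for: " + inputs["query"]}


questions = [
    "What is Vertex AI?",
    "What is the primary purpose of Generative AI Studio?",
]
for question, result in ask_all(stub_chain, questions).items():
    print(f"Q: {question}\nA: {result}\n")
```

In the notebook, passing `qa_chain` instead of `stub_chain` should run the same questions against the real chain.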