<div id="singlestore-header" style="display: flex; background-color: rgba(255, 167, 103, 0.25); padding: 5px;">
    <div id="icon-image" style="width: 90px; height: 90px;">
        <img width="100%" height="100%" src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/header-icons/crystal-ball.png" />
    </div>
    <div id="text" style="padding: 5px; margin-left: 10px;">
        <div id="badge" style="display: inline-block; background-color: rgba(0, 0, 0, 0.15); border-radius: 4px; padding: 4px 8px; align-items: center; margin-top: 6px; margin-bottom: -2px; font-size: 80%">SingleStore Notebooks</div>
        <h1 style="font-weight: 500; margin: 8px 0 0 4px;">Build a GenAI app with SingleStoreDB and Google Vertex AI</h1>
    </div>
</div>

# Building a Generative AI Application with Vertex AI and SingleStore

Welcome to this guide on building a generative AI application using Google Cloud's Vertex AI and SingleStoreDB. It walks through each step with instructions and code explanations.

## Overview
Vertex AI, a Google Cloud product, offers an integrated suite of machine learning tools for building, deploying, and scaling AI models. SingleStoreDB is a fast, scalable, SQL-compliant relational database that can also store and search vector embeddings. By combining Vertex AI's language models and embeddings with the storage and retrieval mechanisms of SingleStoreDB, we can build AI applications that answer user queries in real time.

## What You'll Learn
- Setting up your environment with the necessary packages and credentials.
- Fetching and processing data to be used in our AI models.
- Storing and managing data efficiently using SingleStoreDB.
- Using Vertex AI language models and embeddings through LangChain.
- Building a retrieval-based QA system to answer user queries.

## Prerequisites
- Basic knowledge of Python programming.
- Familiarity with Google Cloud services and SQL databases.
- An active Google Cloud account.
- A hosted or self-managed SingleStoreDB instance.

Let's dive in and start building!

**Setting up the environment**: Before we begin, make sure all the necessary packages are installed. Run the cell below to install the required libraries for our project: gcloud, langchain, google-cloud-aiplatform, and singlestoredb, plus shapely pinned to version 1.8.5.

In [1]:
!pip install gcloud
!pip install langchain
!pip install google-cloud-aiplatform
!pip install singlestoredb
!pip install shapely==1.8.5
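
After the install cell finishes, it can help to confirm which versions were actually installed. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is our own, and the package list simply mirrors the pip commands above.

```python
from importlib import metadata

# Distribution names taken from the pip commands above.
REQUIRED = ["gcloud", "langchain", "google-cloud-aiplatform", "singlestoredb", "shapely"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as missing should be reinstalled before continuing.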

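**Authentication**: The next step authenticates our session with Google Cloud. It requires the gcloud CLI to be available in the notebook environment. Running the following cell prompts you to log in with your Google Cloud credentials and stores Application Default Credentials for the client libraries to use. Follow the instructions to complete the login process.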

In [2]:
!gcloud auth application-default login

**Setting the Quota Project**: After authentication, we need to set our quota project. Replace my-project-1516239077425 with your own project ID, then run the two cells below. The first sets the quota project for Application Default Credentials; the second sets the default project for the gcloud CLI.

In [3]:
!gcloud auth application-default set-quota-project my-project-1516239077425

In [4]:
!gcloud config set project my-project-1516239077425
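
Because the project ID appears in two separate cells, it is easy to update one and forget the other. The sketch below keeps the ID in one place and builds both commands from it. The helper name `gcloud_project_commands` is hypothetical, and the format check reflects our understanding of Google Cloud's project ID rules (6 to 30 characters, lowercase letters, digits, and hyphens), so treat it as an assumption rather than an authoritative validator.

```python
import re

# Assumed rule: 6-30 characters, lowercase letters, digits, and hyphens,
# starting with a letter and not ending with a hyphen.
PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def gcloud_project_commands(project_id):
    """Return the two gcloud commands used above for the given project ID."""
    if not PROJECT_ID_PATTERN.fullmatch(project_id):
        raise ValueError(f"{project_id!r} does not look like a Google Cloud project ID")
    return [
        f"gcloud auth application-default set-quota-project {project_id}",
        f"gcloud config set project {project_id}",
    ]


for command in gcloud_project_commands("my-project-1516239077425"):
    print(command)
```

The printed commands can be pasted into notebook cells with a leading `!`.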

**Importing Necessary Modules**: With the initial setup complete, let's import the classes we'll use throughout this project. The following cell imports three classes from langchain: the VertexAI LLM wrapper, the RetrievalQA chain, and the SingleStoreDB vector store.

In [5]:
from langchain.llms import VertexAI
from langchain.chains import RetrievalQA
from langchain.vectorstores import SingleStoreDB

**Initializing Vertex AI**: To interact with Google Cloud's Vertex AI services, we first need to instantiate the VertexAI class. Running the cell below will create this instance and store it in the variable llm.

In [6]:
llm = VertexAI()

**Loading Data from the Web**: Our application requires data to process and generate insights. In this step, we'll fetch content from a URL using the WebBaseLoader class. The loaded data will be stored in the data variable. You can replace the URL with any other source if needed.

In [7]:
from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://cloud.google.com/vertex-ai/docs/generative-ai/learn/generative-ai-studio")
data = loader.load()
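
Before moving on, it is worth checking what was actually loaded. The loader returns a list of document objects, each with a `page_content` string and a `metadata` dictionary. The sketch below summarizes such a list; `FakeDocument` is a hypothetical stand-in used only so the example runs without network access, and the `source` metadata key is an assumption (the helper falls back to a placeholder if it is absent).

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in with the same two attributes as a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize(docs, preview_chars=80):
    """Return one summary line per document: source, length, and a short preview."""
    lines = []
    for doc in docs:
        source = doc.metadata.get("source", "<unknown>")
        # Collapse runs of whitespace so the preview fits on one line.
        preview = " ".join(doc.page_content.split())[:preview_chars]
        lines.append(f"{source} ({len(doc.page_content)} chars): {preview}")
    return lines


demo = [FakeDocument("Generative AI Studio\n\nis a console tool.", {"source": "https://example.com/page"})]
for line in summarize(demo):
    print(line)
```

In the notebook, calling `summarize(data)` should give a quick view of the page that was fetched.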

**Splitting the Data**: To process the data more efficiently, we'll split the loaded content into smaller chunks. The RecursiveCharacterTextSplitter class does this by dividing the text according to the specified character limits. Here each chunk holds at most 500 characters, and consecutive chunks do not overlap.

In [8]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
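
To build intuition for the chunk_size and chunk_overlap parameters, here is a simplified sketch that cuts text at fixed character offsets. It is not how RecursiveCharacterTextSplitter works internally (that class prefers to break on separators such as paragraph and line boundaries); the function name `split_fixed` is our own.

```python
def split_fixed(text, chunk_size=500, chunk_overlap=0):
    """Cut text into chunks of at most chunk_size characters.

    Each chunk after the first repeats the last chunk_overlap characters
    of the previous chunk.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    chunks = []
    start = 0
    step = chunk_size - chunk_overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


text = "x" * 1200
print([len(c) for c in split_fixed(text)])
print([len(c) for c in split_fixed(text, chunk_overlap=100)])
```

A non-zero overlap means text near a chunk boundary appears in both neighbouring chunks, at the cost of storing some characters twice.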

**Setting Up SingleStoreDB with Vertex AI Embeddings**: For efficient storage and retrieval of our data, we use SingleStoreDB in conjunction with Vertex AI embeddings. The following cell sets the SINGLESTOREDB_URL environment variable and calls SingleStoreDB.from_documents, which embeds each chunk with Vertex AI and stores the result in the database. Replace the connection string with the user, password, host, and database of your own SingleStoreDB instance.

In [9]:
from langchain.embeddings import VertexAIEmbeddings
from langchain.vectorstores import SingleStoreDB
import os

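# Connection string format: user:password@host:port/database
os.environ["SINGLESTOREDB_URL"] = "admin:password@svc-6d131c3b-9665-46c0-992c-c3450073c3e7-dml.gcp-virginia-1.svc.singlestore.com:3306/demo"

# Embed every chunk with Vertex AI and store the vectors in SingleStoreDB.
vectorstore = SingleStoreDB.from_documents(documents=all_splits, embedding=VertexAIEmbeddings())
# On later runs, connect to the existing table instead of inserting the documents again:
# vectorstore = SingleStoreDB(embedding=VertexAIEmbeddings())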
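
Hard-coding a password in a notebook makes it easy to leak when the notebook is shared. One alternative, sketched below, assembles the connection string from separate environment variables. The variable names `S2_USER`, `S2_PASSWORD`, and `S2_HOST` and the helper `build_singlestore_url` are hypothetical, and the fallback values are placeholders. The sketch does no escaping, so a password containing characters such as `@`, `:`, or `/` may need extra handling; check the client documentation for that case.

```python
import os


def build_singlestore_url(user, password, host, port=3306, database="demo"):
    """Assemble a user:password@host:port/database connection string."""
    return f"{user}:{password}@{host}:{port}/{database}"


# Hypothetical variable names; the fallbacks are placeholders, not real credentials.
user = os.environ.get("S2_USER", "admin")
password = os.environ.get("S2_PASSWORD", "change-me")
host = os.environ.get("S2_HOST", "localhost")

os.environ["SINGLESTOREDB_URL"] = build_singlestore_url(user, password, host)
print("SINGLESTOREDB_URL set for host", host)
```

The password is deliberately left out of the printed message.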

**Setting Up and Testing the QA Chain**: Once our data is processed and stored, we can use it to answer queries. The following cell initializes the RetrievalQA chain using the previously set up llm and vectorstore, then tests the setup with a sample question about Vertex AI. The remaining cells ask further questions about the content of the loaded page.

In [10]:
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
qa_chain({"query": "What is Vertex AI?"})
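
Conceptually, a retrieval QA chain does two things: it finds the stored chunks most similar to the question, then passes them to the language model as context. The toy sketch below illustrates that flow with the standard library only. The word-count "embedding", the sample chunks, and the prompt wording are all our own simplifications, not what Vertex AI or LangChain actually use.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (real systems use a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, context_chunks):
    """Place the retrieved chunks ahead of the question in a single prompt."""
    context = "\n\n".join(context_chunks)
    return f"Use the context to answer the question.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"


chunks = [
    "Vertex AI is a machine learning platform on Google Cloud.",
    "SingleStoreDB is a distributed SQL database.",
    "Generative AI Studio lets you test prompts in the console.",
]
top = retrieve("What is Vertex AI?", chunks, k=1)
print(build_prompt("What is Vertex AI?", top))
```

In the real chain, the similarity search runs inside SingleStoreDB over the Vertex AI embeddings, and the assembled prompt is sent to the llm.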

In [11]:
qa_chain({"query": "What is the primary purpose of Generative AI Studio?"})

In [12]:
qa_chain({"query": "What are some of the tasks you can perform in Generative AI Studio?"})

In [13]:
qa_chain({"query": "Where can you find sample prompts to test models in Generative AI Studio?"})

In [14]:
qa_chain({"query": "How can you ensure that a designed prompt elicits the desired response from a language model?"})

In [15]:
qa_chain({"query": "What are some code examples from vertex ai?"})
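
Rather than running one cell per question, the questions can be collected in a list and asked in a loop. The sketch below assumes the chain is callable with a `{"query": ...}` dictionary and returns a dictionary containing a `result` key, which matches how the cells above call it. `fake_chain` is a hypothetical stand-in so the example runs without cloud access.

```python
def ask_all(chain, questions):
    """Call chain once per question and collect (question, answer) pairs."""
    answers = []
    for question in questions:
        response = chain({"query": question})
        # Assumes the answer is stored under "result"; falls back to an empty string.
        answers.append((question, response.get("result", "").strip()))
    return answers


def fake_chain(inputs):
    """Hypothetical stand-in for qa_chain that returns a canned answer."""
    return {"query": inputs["query"], "result": f" (stub answer to: {inputs['query']}) "}


questions = [
    "What is Vertex AI?",
    "What is the primary purpose of Generative AI Studio?",
]
for question, answer in ask_all(fake_chain, questions):
    print(f"Q: {question}\nA: {answer}\n")
```

In the notebook, replacing `fake_chain` with `qa_chain` should run every question against the real chain.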

<div id="singlestore-footer" style="background-color: rgba(194, 193, 199, 0.25); height:2px; margin-bottom:10px"></div>
<div><img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/singlestore-logo-grey.png" style="padding: 0px; margin: 0px; height: 24px"/></div>