# Building AI Agent Bot With RAG, Langchain, and Reasoning Engine From Scratch

## Setup

* This notebook walks you through the setup required before starting the workshop materials.

* It is highly recommended to use a new virtual environment when running Jupyter Notebook for this workshop.

## Required Software Installed Locally

* Python version 3.9, 3.10, or 3.11. **Python 3.12 will not work**.

* If you are using VS Code, please install the Jupyter extension.

* Jupyter Notebook. Please follow this [installation guide](https://docs.jupyter.org/en/stable/install.html). You may choose to install either the classic Jupyter Notebook or JupyterLab (the next-generation web UI for Jupyter).

    * [Classic jupyter notebook installation guide](https://docs.jupyter.org/en/stable/install/notebook-classic.html)

    * [Jupyterlab installation guide](https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html)

* Google Cloud CLI. Please follow this [installation guide](https://cloud.google.com/sdk/docs/install-sdk)
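
As a quick sanity check on the Python requirement above, the following is a minimal sketch that warns when the interpreter falls outside the supported range. The helper name `is_supported_python` is hypothetical and not part of the workshop code.

```python
import sys

# Supported minor versions, taken from the requirement list above.
SUPPORTED_MINORS = (9, 10, 11)


def is_supported_python(version_info=None):
    """Return True when the version is Python 3.9, 3.10, or 3.11."""
    major, minor = (version_info or sys.version_info)[:2]
    return major == 3 and minor in SUPPORTED_MINORS


if not is_supported_python():
    # Only warn; the notebook may still partially work on other versions.
    print(
        f"Warning: Python {sys.version_info.major}.{sys.version_info.minor} "
        "is outside the range this workshop was prepared for."
    )
```

Running this in the first cell of a fresh virtual environment should make a version mismatch visible before any dependency is installed.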

### Installing dependencies

In [1]:
%%writefile requirements.txt

google-cloud-aiplatform
google-cloud-aiplatform[langchain]
google-cloud-aiplatform[reasoningengine]
langchain
langchain_core
langchain_community
langchain-google-vertexai==2.0.8
cloudpickle
pydantic==2.9.2
langchain-google-community
google-cloud-discoveryengine
nest-asyncio
asyncio==3.4.3
asyncpg==0.29.0
cloud-sql-python-connector[asyncpg]
langchain-google-cloud-sql-pg
numpy
pandas
pgvector
psycopg2-binary
langchain-openai
langgraph
traceloop-sdk
opentelemetry-instrumentation-google-generativeai
opentelemetry-instrumentation-langchain
opentelemetry-instrumentation-vertexai
python-dotenv

Overwriting requirements.txt


In [2]:
!pip install --upgrade -r requirements.txt

Collecting langchain (from -r requirements.txt (line 5))
  Downloading langchain-0.3.18-py3-none-any.whl.metadata (7.8 kB)
Collecting langchain_core (from -r requirements.txt (line 6))
  Downloading langchain_core-0.3.34-py3-none-any.whl.metadata (5.9 kB)
Collecting langchain_community (from -r requirements.txt (line 7))
  Downloading langchain_community-0.3.17-py3-none-any.whl.metadata (2.4 kB)
Collecting numpy (from -r requirements.txt (line 18))
  Using cached numpy-2.2.2-cp310-cp310-macosx_14_0_x86_64.whl.metadata (62 kB)
Collecting langchain-openai (from -r requirements.txt (line 22))
  Downloading langchain_openai-0.3.4-py3-none-any.whl.metadata (2.3 kB)
Collecting langgraph (from -r requirements.txt (line 23))
  Downloading langgraph-0.2.70-py3-none-any.whl.metadata (17 kB)
Collecting traceloop-sdk (from -r requirements.txt (line 24))
  Downloading traceloop_sdk-0.38.4-py3-none-any.whl.metadata (3.9 kB)
Collecting opentelemetry-instrumentation-google-generativeai (from -r requir

If you run into issues installing psycopg2, please run the following commands (Linux only):

```
sudo apt update
sudo apt install python3-dev libpq-dev
```

You will need to restart the Jupyter kernel once the dependencies are installed.

In [None]:
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

## Setting up Google Cloud Account

#### Recommended account setup

If you are running this notebook locally, you may need to log in to Google Cloud by running the following commands from a terminal:

```
gcloud auth login
gcloud auth application-default login
```

If you are using Google Colab, you need to authenticate with your Google account by running the following notebook cell.

> Please remember that you will need to do this in each Jupyter notebook during this workshop.

In [4]:
# #@markdown ###Authenticate your Google Cloud Account and enable APIs.
# # Authenticate gcloud.
from google.colab import auth
auth.authenticate_user()

ModuleNotFoundError: No module named 'google.colab'
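
The `ModuleNotFoundError` above is expected when the cell runs outside Colab. One way to keep the same cell usable in both environments is to guard the import. The sketch below relies on a common heuristic (Colab runtimes usually have `google.colab` already loaded in `sys.modules`); the helper name `running_in_colab` is an assumption for illustration.

```python
import sys


def running_in_colab(modules=None):
    """Heuristic: treat the runtime as Colab when google.colab is loaded."""
    modules = sys.modules if modules is None else modules
    return "google.colab" in modules


if running_in_colab():
    # This import only succeeds inside a Colab runtime.
    from google.colab import auth

    auth.authenticate_user()
else:
    print("Not running in Colab; authenticate with the gcloud commands instead.")
```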

## Accessing Google Cloud Credit

Please redeem your $5 USD credit for this workshop. The link will be shared in the classroom.

The instructions will also require you to create a new GCP project. Create one!

## Enabling Google Service API

Before creating cloud resources (e.g. the database, Cloud Run services, and Reasoning Engine), we must first enable the corresponding service APIs.

In [13]:
# @markdown Replace the required placeholder text below. You can modify any other default values, if you like.

# Please change the project ID to the GCP project ID you just created.
project_id = "gen-lang-client-0521448746"  # @param {type:"string"}

# you can leave this the same.
region = "us-central1"  # @param {type:"string"}

!gcloud config set project {project_id} --quiet


To update your Application Default Credentials quota project, use the `gcloud auth application-default set-quota-project` command.
Updated property [core/project].


In [14]:
from googleapiclient import discovery
service = discovery.build("cloudresourcemanager", "v1")
request = service.projects().get(projectId=project_id)
response = request.execute()
project_number = response["projectNumber"]
project_number

'672065512482'

Here, we will enable a few services, including:

* `aiplatform.googleapis.com` -> used for the Gemini LLM and Reasoning Engine
* `run.googleapis.com` -> used for deploying to Cloud Run
* `cloudbuild.googleapis.com` -> used for building the Docker image and performing the deployment
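
`gcloud services enable` accepts several service names in a single invocation, so the list above can also be enabled with one command. The sketch below only builds and prints that command; the helper `build_enable_command` is hypothetical, and nothing is executed against your project.

```python
import shlex

# The three services described in the list above.
SERVICES = [
    "aiplatform.googleapis.com",
    "run.googleapis.com",
    "cloudbuild.googleapis.com",
]


def build_enable_command(services, project_id=None):
    """Build the argv list for a single `gcloud services enable` call."""
    cmd = ["gcloud", "services", "enable", *services]
    if project_id:
        # --project is a global gcloud flag; omit it to use the configured project.
        cmd.append(f"--project={project_id}")
    return cmd


# Print the command so it can be reviewed before running it in a terminal.
print(shlex.join(build_enable_command(SERVICES)))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would be one way to execute it from Python instead of using `!` cells.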

In [15]:
!gcloud config set core/disable_prompts True


!gcloud services enable artifactregistry.googleapis.com
!gcloud services enable compute.googleapis.com
!gcloud services enable aiplatform.googleapis.com
!gcloud services enable run.googleapis.com 
!gcloud services enable cloudbuild.googleapis.com
!gcloud services enable sqladmin.googleapis.com
!gcloud services enable cloudtrace.googleapis.com

!gcloud beta services identity create --service=aiplatform.googleapis.com --project={project_id}

!gcloud projects add-iam-policy-binding {project_id} \
    --member=serviceAccount:{project_number}-compute@developer.gserviceaccount.com \
    --role="roles/cloudbuild.builds.builder" -q

Updated property [core/disable_prompts].
Service identity created: service-672065512482@gcp-sa-aiplatform.iam.gserviceaccount.com
Updated IAM policy for project [gen-lang-client-0521448746].
bindings:
- members:
  - serviceAccount:service-672065512482@gcp-sa-aiplatform.iam.gserviceaccount.com
  role: roles/aiplatform.serviceAgent
- members:
  - serviceAccount:service-672065512482@gcp-sa-artifactregistry.iam.gserviceaccount.com
  role: roles/artifactregistry.serviceAgent
- members:
  - serviceAccount:672065512482-compute@developer.gserviceaccount.com
  - serviceAccount:672065512482@cloudbuild.gserviceaccount.com
  role: roles/cloudbuild.builds.builder
- members:
  - serviceAccount:service-672065512482@gcp-sa-cloudbuild.iam.gserviceaccount.com
  role: roles/cloudbuild.serviceAgent
- members:
  - serviceAccount:service-672065512482@compute-system.iam.gserviceaccount.com
  role: roles/compute.serviceAgent
- members:
  - serviceAccount:service-672065512482@containerregistry.iam.gserviceacco

# Deploying Dummy API server

Later in this workshop, your AI agent will interact with an API to get details about the online courses you offer and to create purchase requests. Hence, we will deploy a simple dummy API to Cloud Run.

If you want to see the details, check the `api/` directory.

Now let's deploy the Go API to Cloud Run:

In [16]:
# change this registry name to a unique name
registry_name = "mkrs"  # @param {type:"string"}

!gcloud artifacts repositories create {registry_name} \
      --repository-format=docker \
      --location={region} \
      --description="devfest artifact registry" \
      --immutable-tags       

registry_url = f"{region}-docker.pkg.dev/{project_id}/{registry_name}"

Create request issued for: [mkrs]
Waiting for operation [projects/gen-lang-client-0521448746/locations/us-central
1/operations/975def06-4fb4-4cf6-8717-cfa8b43bd8f1] to complete...done.         
Created repository [mkrs].


We will build the Docker image used by the API:

In [17]:
!gcloud builds submit api --tag {registry_url}/courses-api

Creating temporary archive of 6 file(s) totalling 8.2 KiB before compression.
Some files were not included in the source upload.

Check the gcloud log [/Users/pabrik/.config/gcloud/logs/2025.02.08/14.34.29.988853.log] to see which files and the contents of the
default gcloudignore file used (see `$ gcloud topic gcloudignore` to learn
more).

Uploading tarball of [api] to [gs://gen-lang-client-0521448746_cloudbuild/source/1739000070.222012-38d21b74ecee4a93a2f5fb2f43f6fd32.tgz]
Created [https://cloudbuild.googleapis.com/v1/projects/gen-lang-client-0521448746/locations/global/builds/b4688484-2d48-4e64-9db0-68961803138a].
Logs are available at [ https://console.cloud.google.com/cloud-build/builds/b4688484-2d48-4e64-9db0-68961803138a?project=672065512482 ].
Waiting for build to complete. Polling interval: 1 second(s).
----------------------------- REMOTE BUILD OUTPUT ------------------------------
starting build "b4688484-2d48-4e64-9db0-68961803138a"

FETCHSOURCE
Fetching storage object: gs

We will deploy the Docker image to Cloud Run so that the API is up and running:

In [18]:
!gcloud run deploy courses-api --allow-unauthenticated --region {region} --quiet --image {registry_url}/courses-api

Deploying container to Cloud Run service [courses-api] in project [gen-lang-client-0521448746] region [us-central1]
Deploying new service...
  . Creating Revision...
  . Routing traffic...
  . Setting IAM Policy...

Once it is deployed, run the following command to get the URL of your dummy API. Take note of it because we will use it later:

In [19]:
urls = !gcloud run services describe courses-api --region={region} --format='value(status.url)'
api_url = urls[0]
print(api_url)

https://courses-api-guckng3ccq-uc.a.run.app


Testing the API

In [20]:
!curl {api_url}/courses

[{"name":"software-security","display_name":"Software Security","description":"Learn how to secure your software","price":100,"currency":"USD"}]
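
The response is a JSON array of course objects. As a small sketch of how a client could consume it, the snippet below parses a sample body copied from the output above; the helper `summarize_courses` is hypothetical, and the field names are assumed to stay as shown in that sample.

```python
import json

# Sample body copied from the curl output above.
SAMPLE_BODY = (
    '[{"name":"software-security","display_name":"Software Security",'
    '"description":"Learn how to secure your software","price":100,"currency":"USD"}]'
)


def summarize_courses(body):
    """Turn the /courses JSON array into short human-readable lines."""
    courses = json.loads(body)
    return [f'{c["display_name"]}: {c["price"]} {c["currency"]}' for c in courses]


print(summarize_courses(SAMPLE_BODY))  # ['Software Security: 100 USD']
```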

# Creating Staging Bucket for AI Agent

Later, when we deploy the AI agent, we have to provide a staging GCS bucket used to store the pickle and other configuration of our Reasoning Engine. So, let's create a new empty bucket. Please change the `staging_bucket_name` variable below to a globally unique name.

Once the bucket is created, take note of its name.
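
If you are unsure how to pick a globally unique name, one option is to append a short random suffix to a readable prefix. The sketch below does that and applies a simplified version of the bucket naming rules (3 to 63 characters, lowercase letters, digits, dashes, underscores, and dots, starting and ending with a letter or digit). The helper `make_bucket_name` is hypothetical, and the check does not cover every rule that Cloud Storage enforces.

```python
import re
import uuid

# Simplified naming check; Cloud Storage enforces additional rules.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def make_bucket_name(prefix):
    """Return `prefix` plus a random 8-character hex suffix."""
    suffix = uuid.uuid4().hex[:8]
    name = f"{prefix.lower()}-{suffix}"
    if not _BUCKET_RE.fullmatch(name):
        raise ValueError(f"'{name}' does not look like a valid bucket name")
    return name


print(make_bucket_name("devfest24-demo-bucket"))
```

The printed value can then be assigned to `staging_bucket_name` in the next cell.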

In [21]:
# Change this to a globally unique name, e.g. by adding your own name. This bucket will be used later for storing the model.
staging_bucket_name = "devfest24-demo-bucket" # @param {type:"string"}

!gcloud storage buckets create gs://{staging_bucket_name} --project={project_id} --location={region} --uniform-bucket-level-access

Creating gs://devfest24-demo-bucket/...


# Data Preparation

In this workshop, we use written content from [OWASP CheatSheetSeries](https://github.com/OWASP/CheatSheetSeries) as the source documents for our RAG. To reduce cost, I have already curated a few files, listed in the `urls` variable. Instead of using the whole series, we build embeddings from these curated files only.

The code below downloads each file listed in `urls` and creates a `course_content.jsonl` file containing the file contents.

In [22]:
import json
import uuid
import requests
from pathlib import Path

urls = [
    "https://raw.githubusercontent.com/OWASP/CheatSheetSeries/refs/heads/master/cheatsheets/Authentication_Cheat_Sheet.md",
    "https://raw.githubusercontent.com/OWASP/CheatSheetSeries/refs/heads/master/cheatsheets/Authorization_Cheat_Sheet.md",
    "https://raw.githubusercontent.com/OWASP/CheatSheetSeries/refs/heads/master/cheatsheets/File_Upload_Cheat_Sheet.md",
    "https://raw.githubusercontent.com/OWASP/CheatSheetSeries/refs/heads/master/cheatsheets/Forgot_Password_Cheat_Sheet.md",
    "https://raw.githubusercontent.com/OWASP/CheatSheetSeries/refs/heads/master/cheatsheets/Password_Storage_Cheat_Sheet.md",
    "https://raw.githubusercontent.com/OWASP/CheatSheetSeries/refs/heads/master/cheatsheets/REST_Security_Cheat_Sheet.md",
    "https://raw.githubusercontent.com/OWASP/CheatSheetSeries/refs/heads/master/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.md"
]

def generate_course_content_jsonl():
    output_file = 'course_content.jsonl'
    
    with open(output_file, 'w') as jsonl_file:

        for url in urls:
            response = requests.get(url)
            if response.status_code == 200:
                content = response.text
                filename = url.split('/')[-1]                
                title = filename.replace('_', ' ').replace('.md', '')

                slug = title.lower().replace(' ', '-')
                            
                record = {
                    'id': str(uuid.uuid4()),
                    'title': title,
                    'content': content,
                    'file_path': str(url),
                    'slug': slug
                }                
                json.dump(record, jsonl_file)
                jsonl_file.write('\n')
            else:
                print(f"Failed to download content. Status code: {response.status_code}")

        
    print(f"JSONL file '{output_file}' has been generated successfully.")

generate_course_content_jsonl()


JSONL file 'course_content.jsonl' has been generated successfully.


Let's see what is inside the `course_content.jsonl` file:

In [23]:
import pandas as pd

df = pd.read_json('course_content.jsonl', lines=True)
df.head()

| | id | title | content | file_path | slug |
|---|---|---|---|---|---|
| 0 | 9db4bf98-68cc-4cce-9729-322e99c63d72 | Authentication Cheat Sheet | `# Authentication Cheat Sheet\n\n## Introductio...` | https://raw.githubusercontent.com/OWASP/CheatS... | authentication-cheat-sheet |
| 1 | ed2f05be-2f72-4a20-a1f0-b55f3a6a81ad | Authorization Cheat Sheet | `# Authorization Cheat Sheet\n\n## Introduction...` | https://raw.githubusercontent.com/OWASP/CheatS... | authorization-cheat-sheet |
| 2 | 7d5f7f1f-6203-4314-aadb-e5f81e4e7d64 | File Upload Cheat Sheet | `# File Upload Cheat Sheet\n\n## Introduction\n...` | https://raw.githubusercontent.com/OWASP/CheatS... | file-upload-cheat-sheet |
| 3 | dae018dc-0319-4a7a-b396-9d9f6ed41550 | Forgot Password Cheat Sheet | `# Forgot Password Cheat Sheet\n\n## Introducti...` | https://raw.githubusercontent.com/OWASP/CheatS... | forgot-password-cheat-sheet |
| 4 | b6be3c3a-691c-49d3-95aa-f56f8000a57e | Password Storage Cheat Sheet | `# Password Storage Cheat Sheet\n\n## Introduct...` | https://raw.githubusercontent.com/OWASP/CheatS... | password-storage-cheat-sheet |
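
JSONL stores one standalone JSON object per line, so the file can also be read without pandas. The following is a minimal standard-library sketch; the helper `read_jsonl` is hypothetical, and the demo uses a throwaway file so it runs even when `course_content.jsonl` is absent.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Return a list with one dict per non-empty line of a JSONL file."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Demo on a temporary file with a single record.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.jsonl")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write(json.dumps({"title": "Demo", "slug": "demo"}) + "\n")
    demo_records = read_jsonl(demo_path)

print(demo_records[0]["title"])  # Demo
```

Calling `read_jsonl("course_content.jsonl")` should return one dict per downloaded cheat sheet, with the same keys as the columns shown above.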


# Creating Embedding and Vector Store

This section demonstrates the process of creating embeddings and setting up a vector store for a course content retrieval system.

It covers the following key steps:

1. Importing the necessary libraries and setting up the database and its configuration
1. Connecting to a Google Cloud SQL instance
1. Loading the course content data from Markdown files
1. Creating embeddings for the course content using a Google Gemini embedding model
1. Storing the embeddings in a vector database for efficient similarity search

Setting up a few constants:

In [24]:
instance_name="devfest24-demo" # @param {type:"string"}
database_password = 'testing' # @param {type:"string"} #change this to your database password
database_name = 'testing' # @param {type:"string"} #change this to your database name
database_user = 'testing' # @param {type:"string"} #change this to your database user

# Don't update the lines below

embeddings_table_name = "course_content_embeddings"
chat_history_table_name = "chat_histories"
gemini_embedding_model = "text-embedding-004"

assert database_name, "⚠️ Please provide a database name"
assert database_user, "⚠️ Please provide a database user"
assert database_password, "⚠️ Please provide a database password"


## Setting Up PostgreSQL in Google Cloud SQL

Here we look up the account that is currently authenticated with gcloud.

In [25]:
# Look up the active gcloud account; the Cloud SQL Client role is granted to it in the next cell
current_user = !gcloud auth list --filter=status:ACTIVE --format="value(account)"
print(f"{current_user}")

['sugengdcahyo@gmail.com']


Before sending queries to the database, we have to grant the required permission so that our notebook can access it:

In [26]:
print(f"Granting Cloud SQL Client role to {current_user[0]}")
# granting cloudsql client role to the current user
!gcloud projects add-iam-policy-binding {project_id} \
  --member=user:{current_user[0]} \
  --role="roles/cloudsql.client"

Granting Cloud SQL Client role to sugengdcahyo@gmail.com
Updated IAM policy for project [gen-lang-client-0521448746].
bindings:
- members:
  - serviceAccount:service-672065512482@gcp-sa-aiplatform.iam.gserviceaccount.com
  role: roles/aiplatform.serviceAgent
- members:
  - serviceAccount:service-672065512482@gcp-sa-artifactregistry.iam.gserviceaccount.com
  role: roles/artifactregistry.serviceAgent
- members:
  - serviceAccount:672065512482-compute@developer.gserviceaccount.com
  - serviceAccount:672065512482@cloudbuild.gserviceaccount.com
  role: roles/cloudbuild.builds.builder
- members:
  - serviceAccount:service-672065512482@gcp-sa-cloudbuild.iam.gserviceaccount.com
  role: roles/cloudbuild.serviceAgent
- members:
  - user:sugengdcahyo@gmail.com
  role: roles/cloudsql.client
- members:
  - serviceAccount:service-672065512482@compute-system.iam.gserviceaccount.com
  role: roles/compute.serviceAgent
- members:
  - serviceAccount:service-672065512482@containerregistry.iam.gserviceacco

Next, we create a new PostgreSQL instance on Google Cloud SQL, along with a database and a PostgreSQL user/role that will be used to store the embeddings later on:

In [27]:
#@markdown Create and setup a Cloud SQL PostgreSQL instance, if not done already.
database_version = !gcloud sql instances describe {instance_name} --format="value(databaseVersion)"
if database_version[0].startswith("POSTGRES"):
  print("Found an existing Postgres Cloud SQL Instance!")
else:
  print("Creating new Cloud SQL instance...")
  !gcloud sql instances create {instance_name} --database-version=POSTGRES_15 \
    --region={region} --cpu=1 --memory=4GB --root-password={database_password} \
    --authorized-networks=0.0.0.0/0
# Create the database, if it does not exist.
out = !gcloud sql databases list --instance={instance_name} --filter="NAME:{database_name}" --format="value(NAME)"
if ''.join(out) == database_name:
  print("Database %s already exists, skipping creation." % database_name)
else:
  !gcloud sql databases create {database_name} --instance={instance_name}
# Create the database user for accessing the database.
!gcloud sql users create {database_user} \
  --instance={instance_name} \
  --password={database_password}

Creating new Cloud SQL instance...
Creating Cloud SQL instance for POSTGRES_15...done.                            
Created [https://sqladmin.googleapis.com/sql/v1beta4/projects/gen-lang-client-0521448746/instances/devfest24-demo].
NAME            DATABASE_VERSION  LOCATION       TIER              PRIMARY_ADDRESS  PRIVATE_ADDRESS  STATUS
devfest24-demo  POSTGRES_15       us-central1-c  db-custom-1-4096  34.42.192.52     -                RUNNABLE
Creating Cloud SQL database...done.                                            
Created database [testing].
instance: devfest24-demo
name: testing
project: gen-lang-client-0521448746
Creating Cloud SQL user...done.                                                
Created user [testing].


Here we get the IP address of the PostgreSQL instance we just created. Take note of the database host IP address.

In [29]:
# get the ip address of the instance
ip_addresses = !gcloud sql instances describe {instance_name} --project {project_id} --format 'value(ipAddresses.ipAddress)'
# Split the IP addresses and take the first one
database_host = ip_addresses[0].split(';')[0].strip()
print(f"Using database host: {database_host}")

Using database host: 34.42.192.52


## Prepare the embeddings

Now, we will build the embeddings from the content we have selected. 

Before creating the embeddings, we need to split the content of each file into chunks. This is usually required, especially when the content is too long, because an embedding model limits the number of input tokens it can accept.
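
To make the roles of chunk size and overlap concrete, here is a deliberately simplified character-window sketch. It is not how `MarkdownTextSplitter` works internally (that splitter prefers to break at Markdown structure); the helper `chunk_text` is hypothetical and only illustrates how consecutive chunks share `chunk_overlap` characters.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Tiny example: windows of 4 characters, each sharing 2 with the previous one.
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.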

In [30]:
from langchain.text_splitter import MarkdownTextSplitter

text_splitter = MarkdownTextSplitter(
  chunk_size=1000, 
  chunk_overlap=200)

from langchain_core.documents import Document

chunked = []
for index, row in df.iterrows():
    course_content_id = row["id"]
    title = row["title"]
    content = row["content"]
    splits = text_splitter.create_documents([content])
    for s in splits:
        metadata = {"course_content_id": course_content_id, "title": title}
        doc = Document(page_content=s.page_content, metadata=metadata)
        chunked.append(doc)

chunked[0]

Document(metadata={'course_content_id': '9db4bf98-68cc-4cce-9729-322e99c63d72', 'title': 'Authentication Cheat Sheet'}, page_content='# Authentication Cheat Sheet\n\n## Introduction\n\n**Authentication** (**AuthN**) is the process of verifying that an individual, entity, or website is who or what it claims to be by determining the validity of one or more authenticators (like passwords, fingerprints, or security tokens) that are used to back up this claim.\n\n**Digital Identity** is the unique representation of a subject engaged in an online transaction. A digital identity is always unique in the context of a digital service but does not necessarily need to be traceable back to a specific real-life subject.\n\n**Identity Proofing** establishes that a subject is actually who they claim to be. This concept is related to KYC concepts and it aims to bind a digital identity with a real person.')

In [37]:
len(chunked)

209

Once the file content is chunked into smaller pieces, we create an embedding for each chunk and store it in Cloud SQL.

Now let's initialize the Vertex AI SDK and create the embedding service.

In [38]:
from langchain_google_vertexai import VertexAIEmbeddings
import vertexai

# Initialize Vertex AI
vertexai.init(project=project_id, location=region)
# Create a Vertex AI Embeddings service
embeddings_service = VertexAIEmbeddings(model_name=gemini_embedding_model)

Now, let's construct the embeddings and store them in the database.

The function below performs these steps:
1. It initializes a `PostgresEngine`. This instance handles the database connection as well as authentication.
1. `ainit_chat_history_table()` creates the table named by `chat_history_table_name`.
1. `ainit_vectorstore_table()` creates the table that stores the chunked content, its embeddings, and metadata.
1. It initializes the `PostgresVectorStore` with the engine and the embedding service.
1. It calls `aadd_documents` with all chunked documents to create their embeddings and insert one record per chunk into the table.

In [39]:
from langchain_google_cloud_sql_pg import PostgresEngine, PostgresVectorStore
import uuid

async def create_vectorstore():
    engine = await PostgresEngine.afrom_instance(
        project_id,
        region,
        instance_name,
        database_name,
        user=database_user,
        password=database_password,
    )

    await engine.ainit_chat_history_table(
        table_name=chat_history_table_name
    )

    await engine.ainit_vectorstore_table(
        table_name=embeddings_table_name, vector_size=768, overwrite_existing=True
    )

    vector_store = await PostgresVectorStore.create(
        engine,
        table_name=embeddings_table_name,
        embedding_service=embeddings_service,
    )

    ids = [str(uuid.uuid4()) for _ in range(len(chunked))]
    await vector_store.aadd_documents(chunked, ids=ids)

await create_vectorstore()

TimeoutError: 

Once the vector store is populated, you can check its contents from the Google Cloud SQL data viewer.

# Retriever

Once we have data stored in Cloud SQL, we need a way to query it. This section covers how to create and use the PostgreSQL retriever to perform similarity search.

As in the previous section, we create a `PostgresEngine` to connect to the Cloud SQL instance:
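
Conceptually, similarity search embeds the query, scores it against every stored embedding, and returns the best matches. The sketch below illustrates that idea in pure Python with toy 3-dimensional vectors, assuming cosine similarity as the metric. The helpers `cosine_similarity` and `top_k` are hypothetical; in the workshop the real scoring happens inside the database through the vector store.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=2):
    """Return the texts of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy embeddings; real ones in this workshop have 768 dimensions.
docs = [
    ("forgot password flow", [0.9, 0.1, 0.0]),
    ("sql injection", [0.0, 0.2, 0.9]),
    ("password storage", [0.7, 0.6, 0.1]),
]
print(top_k([1.0, 0.2, 0.0], docs, k=2))
# ['forgot password flow', 'password storage']
```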

In [34]:
from langchain_google_cloud_sql_pg import PostgresEngine

pg_engine = PostgresEngine.from_instance(
    project_id=project_id,
    instance=instance_name,
    region=region,
    database=database_name,
    user=database_user,
    password=database_password,
)

We create the vector store object by using the engine and embedding service we created earlier:

In [40]:
from langchain_google_cloud_sql_pg import PostgresVectorStore

vector_store = PostgresVectorStore.create_sync(
    pg_engine,
    table_name=embeddings_table_name,
    embedding_service=embeddings_service,
)
retriever = vector_store.as_retriever(search_kwargs={"k": 10})

TimeoutError: 

Let's try some queries:

In [None]:
retriever.invoke("how to design forgot password?")

In [None]:
retriever.invoke("how to design security for authentication?")