<a href="https://colab.research.google.com/github/aipractices/genai-application-collab-excercise/blob/main/GenAI_Practice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Building Gen AI Application

# Understanding Gen AI Application Architecture

## Gen AI Architecture

A Gen AI architecture consists of the following components:

![Gen AI Architecture](https://thinkuldeep.com/images/genai-artchitecture.jpg)


### Gen AI User Interface - Front End
This is the front end of the Gen AI application: a user interface that collects user input and presents the response according to the business need, e.g. the [ChatGPT Web App](https://chatgpt.com/).  

### Gen AI Application - Backend
This is the Gen AI application backend. It accepts user input, converts it into prompts, sends them to an LLM, and returns the LLM's response to the front-end application. It also has local memory and is connected to a knowledge base, both of which are used to improve the prompt.

- Memory - A local memory system that caches responses, important metadata, and context, and sends them along with the prompt so that the exchange behaves as a session or continuous chat.
- Knowledge Base - A knowledge system, typically a vector database depending on the use case. It fetches the information that most closely matches the input prompt and uses it to augment the prompt; the improved prompt is then sent to the LLM. Knowledge is stored in the form of embeddings.
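
To make the role of memory and the knowledge base concrete, here is a minimal sketch of how a backend might assemble a prompt before calling an LLM. `ChatMemory` and `build_prompt` are hypothetical helpers invented for this illustration, and the knowledge snippet is a placeholder; no real LLM or vector database is involved.

```python
from collections import deque


class ChatMemory:
    """Keeps the most recent conversation turns (hypothetical helper)."""

    def __init__(self, max_turns=3):
        # A deque with maxlen silently drops the oldest turn when it is full.
        self.turns = deque(maxlen=max_turns)

    def add(self, user_text, model_text):
        self.turns.append((user_text, model_text))

    def as_text(self):
        return "\n".join(f"User: {u}\nAssistant: {m}" for u, m in self.turns)


def build_prompt(user_input, memory, knowledge_snippets):
    """Combine remembered turns, retrieved knowledge, and the new input."""
    parts = []
    if memory.turns:
        parts.append("Conversation so far:\n" + memory.as_text())
    if knowledge_snippets:
        parts.append("Relevant knowledge:\n" + "\n".join(knowledge_snippets))
    parts.append("User: " + user_input)
    return "\n\n".join(parts)


memory = ChatMemory(max_turns=2)
memory.add("Hi", "Hello! How can I help?")
prompt = build_prompt(
    "Where does Kuldeep work?",
    memory,
    ["Kuldeep works at Thoughtworks."],  # placeholder knowledge snippet
)
print(prompt)
```

Capping the number of remembered turns is one simple way to keep the prompt from growing without bound; a real backend might summarize older turns instead.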

## Large Language Models
These are GPT-style models trained on very large datasets. Many such models are available in repositories such as Hugging Face.   

> Fine Tuning - This is a technique to re-train a pre-trained generic model to meet our business need.

# Developing First Gen AI Application

Let's build our first Gen AI application, which takes a prompt and prints the response returned by an LLM available through the Gemini API.

We will use a Gemini model, which requires a Gemini API key.

Follow the steps below.
- Sign up on cloud.google.com, and go to https://console.cloud.google.com/
- Enable the Gemini API - https://console.cloud.google.com/apis/library/generativelanguage.googleapis.com
- Get an API key from Google AI Studio - https://aistudio.google.com/app/apikey
- Store the key as a Colab secret named `GOOGLE_API_KEY`; the notebook code reads it with `userdata.get('GOOGLE_API_KEY')`.

Reference: https://ai.google.dev/gemini-api/docs/text-generation

In [None]:
from google import genai
from google.colab import userdata

def query_gemini(prompt):
  client = genai.Client(api_key=userdata.get('GOOGLE_API_KEY'))

  response = client.models.generate_content(
      model="gemini-2.0-flash",
      contents=[prompt]
  )
  return response.text

prompt = "Tell me about Kuldeep"

print(query_gemini(prompt))


To give you the best information about "Kuldeep," I need a little more context.  "Kuldeep" is a fairly common name, particularly in India.  To help me narrow down who or what you're interested in, please tell me:

*   **Are you thinking of a specific person?** If so, do you know anything else about them? For example, their last name, occupation, or any achievements they might be known for?
*   **Are you interested in the meaning or origin of the name "Kuldeep" in general?**
*   **Are you looking for something else entirely related to the word "Kuldeep"?**

In the meantime, here's some general information that might be helpful:

*   **Meaning of the name:** "Kuldeep" is a Hindu/Indian name that typically means "light of the family" or "light of the clan." It comes from the Sanskrit words "Kul" (family/clan) and "Deep" (light/lamp).
*   **Popularity:** The name is relatively popular in India and among the Indian diaspora.

Once I have a better idea of what you're looking for, I can provi

# Improving The Prompt - Prompt Engineering
In the earlier example, the LLM responded with a generic answer and asked for more detail, because we did not provide enough context and information.  


A Gen AI response depends heavily on what we ask in the prompt. There are various techniques to improve the prompt: giving the model a persona and context, or providing one or more examples of how we want the response. These are called one-shot and few-shot prompting. It may take multiple attempts to get the prompt right. We can also use a chain of prompts, like a conversation with a person, or provide step-by-step instructions for the LLM to reach the output. Some call these chain-of-thought or tree-of-thought techniques.
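
As an illustration of persona, context, and few-shot examples, the sketch below assembles a prompt string from those parts. `build_few_shot_prompt` is a hypothetical helper and the `Q:`/`A:` layout is just one possible convention, not a format required by any model.

```python
def build_few_shot_prompt(persona, context, examples, question):
    """Assemble a prompt from a persona, context, worked examples, and the question."""
    lines = [f"You are {persona}.", f"Context: {context}", ""]
    for example_question, example_answer in examples:
        lines.append(f"Q: {example_question}")
        lines.append(f"A: {example_answer}")
        lines.append("")
    # Leave the final answer open so the model continues the pattern.
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


few_shot_prompt = build_few_shot_prompt(
    persona="a concise technical tutor",
    context="The reader is new to Gen AI.",
    examples=[
        ("What is an LLM?", "A large language model trained on large text datasets."),
        ("What is RAG?", "Retrieval-Augmented Generation: adding retrieved context to the prompt."),
    ],
    question="What is fine tuning?",
)
print(few_shot_prompt)
```

With a single example this would be one-shot prompting; with two or more it is few-shot. The resulting string could be passed to a function such as `query_gemini` from the first application.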



In [None]:
from google import genai
from google.colab import userdata

client = genai.Client(api_key=userdata.get('GOOGLE_API_KEY'))
chat = client.chats.create(model="gemini-2.0-flash")

def run_chatbot():
    print("Hello! I am thinkuldeep, how can I help you! Type 'quit' to exit.")

    while True:
        user_input = input("You: ")
        if user_input.lower() == 'quit':
            print("thinkuldeep: Goodbye!")
            break

        response = chat.send_message(user_input)
        print(f"thinkuldeep: {response.text}")

# Start the chatbot
run_chatbot()

Hello! I am thinkuldeep, how can I help you! Type 'quit' to exit.
You: test
thinkuldeep: Understood. You've entered "test". How can I help you with that?  Are you:

*   **Testing if I'm working?** (I am!)
*   **Wanting to test me on something specific?** (e.g., a math problem, a grammar question, a code snippet)
*   **Using this as a placeholder for a future question?**
*   **Something else entirely?**

Let me know what you'd like to do.

You: quit
thinkuldeep: Goodbye!


People call all these techniques prompt engineering, but I think writing a prompt is an art, the art of communication; I will explain this in a separate article. The scope of this article is building a Gen AI based application, so it does not focus much on so-called prompt engineering.

Even if you are very good at writing prompts, the LLM may still not respond as we expect. For example, an LLM may not know enough to summarize information about the individual "Kuldeep". But if our Gen AI application first fetches more information about Kuldeep and sends it to the LLM along with the prompt, the LLM responds better. Let's understand this in the next section.

# Retrieval-Augmented Generation (RAG)

LLMs are trained on huge datasets, but may still not know particular information about individuals or specific use cases. In such cases, RAG is a technique that retrieves contextual information from a knowledge base, augments the prompt with it, and sends the augmented prompt to the LLM for generation.
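
The three steps (retrieve, augment, generate) can be sketched offline before involving a real model. In this toy version, retrieval is plain keyword overlap rather than a vector search, the knowledge base is a hard-coded list of illustrative sentences, and `fake_llm` is a stand-in for a model call; all of these names are assumptions made for the example.

```python
import re

# Toy knowledge base; the entries are illustrative placeholders.
KNOWLEDGE_BASE = [
    "Kuldeep works at Thoughtworks as a Principal Consultant.",
    "RAG augments a prompt with retrieved context before generation.",
    "Embeddings are vector representations of text.",
]


def tokenize(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, documents):
    """Return the document sharing the most words with the query (keyword overlap)."""
    query_words = tokenize(query)
    return max(documents, key=lambda doc: len(query_words & tokenize(doc)))


def augment(prompt, context):
    """Attach the retrieved context to the user's prompt."""
    return f"{prompt}\n\nUse this context when answering:\n{context}"


def generate(prompt, llm):
    """`llm` is any callable that maps a prompt string to a response string."""
    return llm(prompt)


def fake_llm(prompt):
    # Stand-in for a real model call: it only reports the prompt size.
    return f"(model received {len(prompt)} characters)"


question = "Where does Kuldeep work?"
context = retrieve(question, KNOWLEDGE_BASE)
print(generate(augment(question, context), fake_llm))
```

Passing the model in as a callable keeps the pipeline testable without an API key; swapping `fake_llm` for a function that calls Gemini leaves the other steps unchanged.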

In [None]:
!pip install requests
!pip install beautifulsoup4

import requests
from google import genai
from google.colab import userdata

from bs4 import BeautifulSoup


def retrive(url):
  try:
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    text_content = ""
    for paragraph in soup.find_all("p"):
        text_content += paragraph.get_text(strip=True) + "\n"

    return text_content
  except requests.exceptions.RequestException as e:
    return f"Error fetching URL: {e}"
  except Exception as e:
    return f"An error occurred: {e}"

def augment(prompt, context):
  return f"{prompt} where context is {context}"

def generation(prompt):

  client = genai.Client(api_key=userdata.get('GOOGLE_API_KEY'))

  response = client.models.generate_content(
      model="gemini-2.0-flash",
      contents=[prompt]
  )
  return response.text

prompt = "Tell me about Kuldeep"

context = retrive("https://thinkuldeep.com/about/")

print(generation(augment(prompt, context)))

Okay, here's a summary of Kuldeep's professional and personal profile, organized according to the categories provided:

**📨 Contact Me**

*   **LinkedIn:** kuldeep-reck
*   **X (Twitter):** thinkuldeep
*   **Instagram:** thinkuldeep
*   **Facebook:** kuldeep.reck
*   **Personal Email:** thinkuldeep@gmail.com
*   **Official Email:** kuldeeps@thoughtworks.com

**📃 My Resume**

*   **Current Role:** Global Emerging Technology Leader and Principal Consultant at Thoughtworks.
*   **Past Role:** Director, Technology at Nagarro (13 years, various roles from developer).
*   **Education:**
    *   B.Tech (Hons) in Computer Science and Engineering, National Institute of Technology, Kurukshetra.
    *   Schooling from a small town in Rajasthan.
*   **Key Skills & Expertise:**
    *   Emerging Technologies: XR (AR/VR/MR), IoT, Metaverse, Blockchain.
    *   Software Development: CICD, TDD, Automation Testing, XP, Cloud-Native Architectures, Microservices.
    *   Data Projects: Estimations, Foreca

In [None]:
!pip install requests
!pip install beautifulsoup4

import requests
from google import genai
from google.colab import userdata
from bs4 import BeautifulSoup

client = genai.Client(api_key=userdata.get('GOOGLE_API_KEY'))
chat = client.chats.create(model="gemini-2.0-flash")

def retrive(url):
  try:
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    text_content = ""
    for paragraph in soup.find_all("p"):
        text_content += paragraph.get_text(strip=True) + "\n"

    return text_content
  except requests.exceptions.RequestException as e:
    return f"Error fetching URL: {e}"
  except Exception as e:
    return f"An error occurred: {e}"

def augment(prompt, context):
  return f"Context for all future communication {context}"

context = retrive("https://thinkuldeep.com/about/")

def run_chatbot():
    print("Hello! ask me about Kuldeep Singh! Type 'quit' to exit.")

    chat.send_message(augment("", context))

    while True:
        user_input = input("You: ")
        if user_input.lower() == 'quit':
            print("thinkuldeep: Goodbye!")
            break

        response = chat.send_message(user_input)
        print(f"thinkuldeep: {response.text}")

# Start the chatbot
run_chatbot()

Hello! ask me about Kuldeep Singh! Type 'quit' to exit.
You: where does kuldeep work
thinkuldeep: Kuldeep is currently associated with Thoughtworks as Global Emerging Technology Leader and Principal Consultant. He previously worked with Nagarro for 13 years.

You: what contact number of kuldeep
thinkuldeep: I do not have access to Kuldeep's personal contact information. The provided text includes his email addresses:

*   **Personal:** thinkuldeep@gmail.com
*   **Official:** kuldeeps@thoughtworks.com

You can try reaching out to him through these email addresses or via his social media profiles listed:

*   LinkedIn: kuldeep-reck
*   X (Twitter): thinkuldeep
*   Instagram: thinkuldeep
*   Facebook: kuldeep.reck

You: quit
thinkuldeep: Goodbye!


## Understanding Embeddings



Embeddings are vector representations of text, such as a sentence, and are used to find the relation between a query and data. In the example below there are some titles for specific content. Using embeddings and similarity search, we can find the closest match to the prompt.   
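
Before calling a real embedding model, it may help to see similarity search on hand-made vectors. The sketch below uses made-up 3-dimensional vectors (real embeddings have far more dimensions) and cosine similarity as the comparison; the titles and numbers are purely illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for the embeddings of three titles.
title_vectors = {
    "Patents Granted": [0.9, 0.1, 0.0],
    "Travel and trips": [0.1, 0.9, 0.2],
    "Books authored": [0.0, 0.2, 0.9],
}
query_vector = [0.2, 0.8, 0.1]  # pretend embedding of a travel-related query

# Pick the title whose vector points in the direction closest to the query.
best_title = max(
    title_vectors,
    key=lambda title: cosine_similarity(title_vectors[title], query_vector),
)
print(best_title)
```

The notebook code in this section ranks by a plain dot product instead; when all vectors have unit length, the dot product and cosine similarity produce the same ranking.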

In [None]:
!pip install requests
!pip install beautifulsoup4 pandas numpy

import os
import requests
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import google.generativeai as genai
from google.colab import userdata

pages = [
  {
    "title": "Open-sources and Community Applications",
    "content": "https://thinkuldeep.com/about/open-sources/"
  },
  {
    "title": "Live Streaming, Webinars",
    "content": "https://thinkuldeep.com/about/streaming/"
  },
  {
    "title": "Patents Granted",
    "content": "https://thinkuldeep.com/about/patents/"
  },
  {
    "title": "Awards received and public coverage",
    "content": "https://thinkuldeep.com/about/recognitions/"
  },
  {
    "title": "Me, my family and some moments, travel and trips",
    "content": "https://thinkuldeep.com/about/moments/"
  },
  {
    "title": "Book foreword, reviewed and authored",
    "content": "https://thinkuldeep.com/about/books/"
  }
]

df = pd.DataFrame(pages)
df.columns = ['Title', 'Url']

genai.configure(api_key=userdata.get('GOOGLE_API_KEY'))

model = 'models/embedding-001'
def pageEmbeddings(title, text):
  return genai.embed_content(model=model,
                             content=text,
                             task_type="retrieval_document",
                             title=title)["embedding"]

def findBestPage(query, dataframe):
  query_embedding = genai.embed_content(model=model,
                                        content=query,
                                        task_type="retrieval_query")
  dot_products = np.dot(np.stack(dataframe['Embeddings']), query_embedding["embedding"])
  idx = np.argmax(dot_products)
  return dataframe.iloc[idx]['Url']


prompt = "Tell me in brief Kuldeep's travels"

df['Embeddings'] = df.apply(lambda row: pageEmbeddings(row['Title'], row['Url']), axis=1)

bestPage = findBestPage(prompt, df)

print(bestPage)

https://thinkuldeep.com/about/moments/


## Implementing RAG using Similarity Search on Embeddings

Let's build a chatbot that gives better results based on the best match and additional context. Ask specific questions about Kuldeep.

In [None]:
!pip install requests
!pip install beautifulsoup4 pandas numpy
from google.colab import userdata

import os
import requests
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import google.generativeai as genai

pages = [
  {
    "title": "Open-sources and Community Applications",
    "content": "https://thinkuldeep.com/about/open-sources/"
  },
  {
    "title": "Live Streaming, Webinars",
    "content": "https://thinkuldeep.com/about/streaming/"
  },
  {
    "title": "Patents Granted",
    "content": "https://thinkuldeep.com/about/patents/"
  },
  {
    "title": "Awards received and public coverage, news",
    "content": "https://thinkuldeep.com/about/recognitions/"
  },
  {
    "title": "Me, my family and some moments, travel and trips",
    "content": "https://thinkuldeep.com/about/moments/"
  },
  {
    "title": "Book foreword, reviewed and authored",
    "content": "https://thinkuldeep.com/about/books/"
  }
]

df = pd.DataFrame(pages)
df.columns = ['Title', 'Url']

genai.configure(api_key=userdata.get('GOOGLE_API_KEY'))

model = 'models/embedding-001'
def pageEmbeddings(title, text):
  return genai.embed_content(model=model,
                             content=text,
                             task_type="retrieval_document",
                             title=title)["embedding"]

def findBestPage(query, dataframe):
  query_embedding = genai.embed_content(model=model,
                                        content=query,
                                        task_type="retrieval_query")
  dot_products = np.dot(np.stack(dataframe['Embeddings']), query_embedding["embedding"])
  idx = np.argmax(dot_products)
  return dataframe.iloc[idx]['Url']

df['Embeddings'] = df.apply(lambda row: pageEmbeddings(row['Title'], row['Url']), axis=1)

def retrive(url):
  try:
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    text_content = ""
    for paragraph in soup.find_all("p"):
        text_content += paragraph.get_text(strip=True) + "\n"
    for h3 in soup.find_all("h3"):
        text_content += h3.get_text(strip=True) + "\n"
    #print(text_content)
    return text_content
  except requests.exceptions.RequestException as e:
    return f"Error fetching URL: {e}"
  except Exception as e:
    return f"An error occurred: {e}"

def augment(prompt, context):
  return f"{prompt}. Here is the context: {context}"

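context = retrive("https://thinkuldeep.com/about/")

# Create the chat session in this cell so it does not rely on `chat` left over from an earlier cell.
chat = genai.GenerativeModel("gemini-2.0-flash").start_chat()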

def run_chatbot():
    print("Hello! ask me about Kuldeep Singh! Type 'quit' to exit.")

    chat.send_message(augment("", context))

    while True:
        user_input = input("You: ")
        if user_input.lower() == 'quit':
            print("thinkuldeep: Goodbye!")
            break

        bestPageMatching = findBestPage(user_input, df)
      #  print(f"bestPageMatching: {bestPageMatching}")
        response = chat.send_message(augment(user_input, retrive(bestPageMatching)))
        print(f"thinkuldeep: {response.text}")

run_chatbot()

Hello! ask me about Kuldeep Singh! Type 'quit' to exit.
You: tell me about kuldeep community contribution
thinkuldeep: Okay, based on the provided information, here's a summary of Kuldeep's community contributions, primarily focusing on open-source projects:

Kuldeep contributes to the open-source community through the following projects:

*   **OpenThread Commissioner Implementation in Java:** This project provides a guide for developers to build an external Commissioner in Java, bridging the gap in the OpenThread project.
*   **MQTT-Remote:** A web-based remote control for an LED, syncing a web button with a hardware button.
*   **ModelViewer:** WebXR and 3D printing demonstrations showcasing the possibilities of WebXR.
*   **3D Book:** An implementation of adding and turning pages in a book like a real book using code.
*   **Arium — An Automation framework for Unity/XR:** Part of the core development team and a reviewer for this framework designed for writing automation tests in Uni

# Fine Tuning

In [None]:
import google.generativeai as genai
from google.colab import userdata  # needed for userdata.get below
import time

genai.configure(api_key=userdata.get('GOOGLE_API_KEY'))

basemodel = "models/gemini-1.5-flash-001-tuning"

trainingdata = [
    {"text_input": "What is Gen AI", "output": "Good Question! Gen AI is branch of AI. All the best - thinkuldeep.com"},
    {"text_input": "What is LLM", "output": "Good Question! LLM means Large Language Model. All the best - thinkuldeep.com"},
    {"text_input": "What is Embeddings in Gen AI", "output": "Good Question! Embeddings is a vector representations. All the best - thinkuldeep.com"},
    {"text_input": "What is Fine Tuning", "output": "Good Question! retraining model specific dataset. All the best - thinkuldeep.com"},
    {"text_input": "What is prompt engineering", "output": "Good Question! It is about writing better prompts. All the best - thinkuldeep.com"},
    {"text_input": "What is RAG", "output": "Good Question! Retrieval Augmented Generation (RAG) is technique to augment prompts with externally retrieved data. All the best - thinkuldeep.com"},
    {"text_input": "What is a Transformer in Gen AI", "output": "Good Question! It is an LLM architecture. All the best - thinkuldeep.com"},
    {"text_input": "What is zero-shot learning in AI", "output": "Good Question! no example given in prompt. All the best - thinkuldeep.com"},
    {"text_input": "What is hallucination in Gen AI", "output": "Good Question! Gen AI providing false information. All the best - thinkuldeep.com"},
    {"text_input": "What is overfitting", "output": "Good Question! Data is too well trained, not training on unknowns . All the best - thinkuldeep.com"},
    {"text_input": "What is token limit in LLM", "output": "Good Question! Maximum token for query. All the best - thinkuldeep.com"},
    {"text_input": "What is unsupervised learning in AI", "output": "Good Question! learnings from unlabeled dataset. All the best - thinkuldeep.com"}
]

operation = genai.create_tuned_model(
    display_name="thinkuldeep",
    source_model=basemodel,
    epoch_count=5,
    batch_size=4,
    learning_rate=0.001,
    training_data=trainingdata,)

for status in operation.wait_bar():
  time.sleep(10)

print(operation.result())


  0%|          | 0/15 [00:00<?, ?it/s]

TunedModel(name='tunedModels/thinkuldeep-1t5bnlclgmt4eg3pgdkvbyaqmms5',
           source_model='models/gemini-1.5-flash-001-tuning',
           base_model='models/gemini-1.5-flash-001-tuning',
           display_name='thinkuldeep',
           description='',
           temperature=1.0,
           top_p=0.95,
           top_k=64,
           state=<State.ACTIVE: 2>,
           create_time=datetime.datetime(2025, 5, 16, 6, 50, 16, 356464, tzinfo=datetime.timezone.utc),
           update_time=datetime.datetime(2025, 5, 16, 6, 51, 4, 78592, tzinfo=datetime.timezone.utc),
           tuning_task=TuningTask(start_time=datetime.datetime(2025, 5, 16, 6, 50, 16, 508254, tzinfo=datetime.timezone.utc),
                                  complete_time=datetime.datetime(2025, 5, 16, 6, 51, 4, 78592, tzinfo=datetime.timezone.utc),
                                  snapshots=[...],
                                  hyperparameters=Hyperparameters(epoch_count=5,
                                         

## Using Fine Tuned Model

In [None]:
import google.generativeai as genai
from google.colab import userdata

genai.configure(api_key=userdata.get('GOOGLE_API_KEY'))
model = genai.GenerativeModel(model_name="tunedModels/thinkuldeep-1t5bnlclgmt4eg3pgdkvbyaqmms5")

prompt = "What is Gen AI"

print(model.generate_content(prompt).text)

Gen AI is a type of AI that is used to generate different text formats. All the best - thinkuldeep.com


# Conclusion

This completes the Gen AI in practice.