<a href="https://colab.research.google.com/github/ridhima2718/Casey_GeminiChatBot/blob/main/Casey_Chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
pip install -q -U google-generativeai

[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m164.2/164.2 kB[0m [31m2.2 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m718.3/718.3 kB[0m [31m7.7 MB/s[0m eta [36m0:00:00[0m
[?25h

What is Generative AI?

A Generative AI system is a model capable of generating ideas, thoughts and opinions with creativity that mimics humans on its own. The basis of a GenAI model relies in the concept of Deep Learning.

Models known as Large Language Models are trained on data sets of exponential sizes and are leveraged to generate output in the form of text, image, audio, video and code.

A building a LLM consists:
1. Processing the data which not only vast but of good quality (not noisy) and diverse to increase output accuracy.
2. The Large Language Model: The advanced model which is responsible for generating the output.
3. Training the model on the dataset and refining the model to produce the desired input.
4. Generating output by using the training data as a guide to produce unique ideas and concepts.

Working of Gemini API

Gemini allows using the pretrained models available for customized use. We can modify the model to serve a specific purpose according to our requirement.

To access the Geminin API interface, API keys are required. API keys are available in Google AI studio. The purpose of using a API keys is to provide authorisation for using the Gemini API interface. It also guards access to tuned models and files.

For colab, Add your API key to the secrets pane and allow its utilisation in the code to enable utilisation of API access service.

In [2]:
import google.generativeai as genai
from google.colab import userdata
import time

In [3]:
import pathlib
import textwrap
from IPython.display import display
from IPython.display import Markdown

Setting up and using the Gemini API key

In [4]:
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)

Below are all the models available with the Gemini API

In [5]:
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

models/gemini-1.0-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-latest
models/gemini-1.0-pro-vision-latest
models/gemini-1.5-flash
models/gemini-1.5-flash-001
models/gemini-1.5-flash-latest
models/gemini-1.5-pro
models/gemini-1.5-pro-001
models/gemini-1.5-pro-latest
models/gemini-pro
models/gemini-pro-vision


For this, we have used the 1.5 flash model. 1.5 Flash is lightweight with faster reasoning ability. It can breakdown bulk of text or other input data easily. It can process upto 1 million tokens.

In the current use, Gemini 1.5 Flash is preferred over Gemini 1.5 Pro for its ability to provide easy to understand answers quickly.

In [6]:
model = genai.GenerativeModel("gemini-1.5-flash-latest")

Now we can start the chat with the model and use it for our use. The start chat method of the gemini API allows for auto storage of the chat history for context to the model.

In other cases, we would need to setup a list history to keep track of the conversation.

In [7]:
chat = model.start_chat(history=[])

Gradio uses its own functions to maintain the history of the chat. To rewrite it into a form that Gemini would understand, we need the tranform history function.

In [8]:
def transform_history(history,system_prompt):
    new_history = []
    new_history.append({"parts": [{"text": system_prompt}], "role": "user"})
    for chat in history:
        new_history.append({"parts": [{"text": chat[0]}], "role": "user"})
        new_history.append({"parts": [{"text": chat[1]}], "role": "model"})
    return new_history

The response function is used to send query to the Gemini model to generate response. It takes parameters of the user message and the history, calls the transform function to retrive history of the conversation and generates a response. We define a system prompt which is fed to the history. This provides the model guidlines about the way it is supposed to perform. We also define the role as user. This is what makes this model a custom chat bot and makes it adopt the role assigned to it.

In [11]:
def response(message, history):
    global chat
    system_prompt = f"""You are a career advice bot. Your task is to advice students on the career path they should take or
    help them prepare for their selected career path. You can provide them an option to take a mock interview with you.

    In the first message ask the user what kind of service they want, career suggestions or mock interview.

    If the user selects the career suggestion service, ask them for their interests and skills. Using this information,
    give them some suggestions on some careers they can take up.
    Example:
    Casey: What are your interests and skills?
    User: My interests are sports and  helping people
    Casey: Based on your preferences, some possible career options are:
    Sports Physician: Provide medical care to athletes, helping them prevent and recover from injuries.
    Athletic Trainer: Work with athletes to prevent, diagnose, and treat sports-related injuries.
    Physical Therapist: Help patients recover from injuries, improve mobility, and manage pain, often working with athletes.
    Sports Coach: Train and develop athletes or sports teams to improve their performance.
    Personal Trainer: Help clients achieve their fitness goals through customized exercise programs.
    Sports Psychologist: Work with athletes to improve their mental health, focus, and performance.

    If the user says they want you to take their mock interview, then ask them for the type of role they
    are applying for and ask relevant questions. Ask the user questions one by one. Wait for the user to respond
    to one question then ask the next question. Do not ask all the questions at the same time. In total, ask about an average of
    6 to 7 questions. You dont need to show all the questions beforehand.
    Try to simulate an interview like experience for the user.
    Example:
    Casey: Tell me about yourself and your experience.
    User: I am xyz from abc place. I have studied [subject] at [college name].....
    Casey: What motivates you to apply for this role?
    User: I believe I would be a good fit because I am organised and skilled....
    Casey: Question 3
    and so on.
    Once all the questions are done, ask the user if they want feedback and provide it accordingly.
    Example:
    Casey: Based on the your responses, your strong points are x,y,z. The skills you can work more on are clarity, confidence, etc.
    Some other advice is a,b,c
    """
    chat.history = transform_history(history, system_prompt)
    response = chat.send_message(message)
    response.resolve()
    # return response.text

    for i in range(len(response.text)):
        time.sleep(0.005)
        yield response.text[: i+20]

Designing a prompt for a Generative AI model or any LLM is crucial to the development process. A well-framed and clear prompt is necessary to ensure that the model understands the requirements and can generate tailored output.

While writing a prompt, some best practices include using delimiters to specify instruction areas if there is a chance that user instructions can modify the system prompt. Additionally, if you want output in a specific format, ask the model for structured output in JSON, XML, or any other format.

While framing the prompt, consider all edge cases where ambiguity is possible. Make sure to tell the model how to handle assumptions. One industry best practice is to use one-shot prompting.
**One-shot** prompting is a technique where the model is given some examples or test cases of the expected behaviour. This helps reduce deviation from requirements and makes expectations clear to the system.

It is also important to not rush the model while it is generating output. Give it the steps to reach the solution and let it work out the solution on its own.

The process of developing a prompt is iterative. The idea of reaching the final, perfect prompt in the first iteration is an idealistic approach. Experimentation, error analysis and refining the prompt are what make a prompt perfect.

Now for the frontend, we can use Gradio which is a web interface which allows demo of machine learning models. Here it will build a ChatAssistant interface for conversation.

1. Using Gradio to render the interface for the assistant bot


In [13]:
pip install gradio

Collecting gradio
  Downloading gradio-4.38.1-py3-none-any.whl (12.4 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m12.4/12.4 MB[0m [31m30.9 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting aiofiles<24.0,>=22.0 (from gradio)
  Downloading aiofiles-23.2.1-py3-none-any.whl (15 kB)
Collecting altair<6.0,>=5.0 (from gradio)
  Downloading altair-5.3.0-py3-none-any.whl (857 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m857.8/857.8 kB[0m [31m36.7 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting fastapi (from gradio)
  Downloading fastapi-0.111.1-py3-none-any.whl (92 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m92.2/92.2 kB[0m [31m11.3 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting ffmpy (from gradio)
  Downloading ffmpy-0.3.2.tar.gz (5.5 kB)
  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting gradio-client==1.1.0 (from gradio)
  Downloading gradio_client-1.1.0-py3-none-any.whl (318 kB)
[2K     [90m━━━━━━━━━━━━━━

In [15]:
import gradio as gr
gr.ChatInterface(response,
                 title='Career Advice',
                 textbox=gr.Textbox(placeholder="Ask Casey"),
                 retry_btn=None).launch(debug=True)

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Running on public URL: https://45d7ce8e99bdeba3e4.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


Keyboard interruption in main thread... closing server.
Killing tunnel 127.0.0.1:7860 <> https://45d7ce8e99bdeba3e4.gradio.live




2. Using the panel library to display chatbot interface as embedded in the notebook

Panel is an open source library that provides interactive API call functions within the Python development environment. It streamlines the program interface.

In [None]:
!pip install panel



In [10]:
import panel as pn
pn.extension()
panels = []

def collect_messages(event):
    message = inp.value
    if message:
        hist.append((message, response(message, hist)))
        panels.append(
            pn.Row('User:', pn.pane.Markdown(message, styles={'padding': '10px', 'border-radius': '5px', 'margin': '5px 0', 'background-color': '#E0F7FA', 'overflow': 'hidden', 'word-wrap': 'break-word'}))
        )
        panels.append(
            pn.Row('Assistant:', pn.pane.Markdown(hist[-1][1], styles={'padding': '10px', 'border-radius': '5px', 'background-color': '#F6F6F6', 'margin': '5px 0', 'overflow': 'hidden', 'word-wrap': 'break-word'}))
        )
        inp.value = ''
        interactive_conversation.objects = list(panels)  # Update the conversation panel
        return pn.Column(*panels)

inp = pn.widgets.TextInput(value="Hi", placeholder='Enter text here…')
button_conversation = pn.widgets.Button(name="Chat!")
button_conversation.on_click(collect_messages)

interactive_conversation = pn.bind(collect_messages, button_conversation)

hist = []

dashboard = pn.Column(
    inp,
    pn.Row(button_conversation),
    interactive_conversation,
)

dashboard.servable()