#What is Gemini AI?

Gemini AI is a large language model (LLM) developed by Google DeepMind. It was released on December 6, 2023. Gemini is a “multimodal” model, meaning that it is trained on, and can work with, several kinds of data, such as text, code, images, and audio, rather than text alone.


##What is “multimodal” in Gemini AI?

“Multimodal” in Gemini AI refers to its ability to understand and process information from various sources, not just text.

-  Gemini AI is trained not only on text but also on images, audio, code, and other modalities.
-  Gemini AI can respond to multimodal prompts, combining text with images, audio, or code to generate new content. It can translate between different modalities, for example, generating text descriptions of images or creating images based on text descriptions.
-  It can even use multimodal understanding to perform tasks like answering complex questions that require information from multiple sources.


##Three Types of Gemini AI Model

![Gemini Model Size](https://drive.google.com/uc?id=19PY6P61jadO3WC1v61vR6xPtJwXnqpb8)

1.  Gemini Ultra: This is the largest and most capable model, designed to tackle highly complex tasks efficiently.
2.  Gemini Pro: Ideal for scaling across a broad range of tasks, Gemini Pro offers exceptional performance and adaptability.
3.  Gemini Nano: The most efficient model, built for on-device tasks (for example, on Android phones). Gemini Nano ensures optimal performance while conserving resources.


##The easiest way to test Gemini AI
The easiest way to test Gemini AI is to use [Google AI Studio](https://makersuite.google.com/).


##Documentation Reference
[Gemini AI](https://ai.google.dev/docs)

##Prompts and model tuning

Google AI Studio offers three prompt types:

- Freeform prompts - These prompts offer an open-ended prompting experience for generating content and responses to instructions. You can use both images and text data for your prompts.
- Structured prompts - This prompting technique lets you guide model output by providing a set of example requests and replies. Use this approach when you need more control over the structure of model output.
- Chat prompts - Use chat prompts to build conversational experiences. This prompting technique allows for multiple input and response turns to generate output.

Google AI Studio also lets you change the behavior of a model, using a technique called **tuning**.

Tuned model - Use this advanced technique to improve a model's responses for a specific task by providing more examples. Note that tuning is only available for legacy PaLM models. Turn on the Show legacy models option in Settings to enable this prompt.

##Freeform prompt

To create a multimodal prompt:

1.  Navigate to Google AI Studio.
2.  In the left panel, select Create new > Freeform prompt.
3.  In the right column Model field, select a model that supports images, such as the Gemini Pro Vision model.

![Freeprompt](https://drive.google.com/uc?id=119dQa12fZX668dYZUMImwt28lBu6HxTM)


##Add a replaceable variable to the prompt

Sometimes you want to change parts of a prompt dynamically. To vary a specific input, select Test input and add test prompts. You can also experiment by changing the run settings.

![Freeprompt](https://drive.google.com/uc?id=1Q-vy69B3iH574hqG8xikIDbheFfwsk5H)
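
The same idea can be sketched outside Google AI Studio with a plain Python template. This is an illustration only: the `{{name}}` placeholder syntax and the helper `fill_prompt` are assumptions made for this sketch, not part of Google AI Studio or the Gemini SDK.

```python
import re

def fill_prompt(template, values):
    """Replace each {{name}} placeholder with its value from `values`."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder '{name}'")
        return str(values[name])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

# Hypothetical template with two replaceable variables
template = "Write a one-line slogan for a {{product}} aimed at {{audience}}."
prompt = fill_prompt(template, {"product": "reusable water bottle", "audience": "hikers"})
print(prompt)
```

Keeping the fixed wording in one template and the changing parts in a dictionary makes it easy to test the same prompt against many inputs.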





##Structured Prompt

Structured prompts in Google AI Studio let you combine instructions with examples to show the model the kind of output you want, rather than just telling it what to do. This kind of prompting is called **few-shot** prompting.

It is useful when you want the model to stick to a consistent output format (e.g. structured JSON) or when it’s difficult to describe in words what you want the model to do (e.g. write in a particular style).

To create a structured prompt:

- Navigate to Google AI Studio.
- In the left panel, select Create new > Structured prompt. (Note: at most 500 static examples can be provided.)


Example : *generates advertising copy for products*

###To import examples from a file:

- In the top-right corner of the examples table, select Actions > Import examples.
- In the dialog, select a CSV or Google Sheets file in your Google Drive, or upload from your computer.
- In the import examples dialog, choose which columns to import and which to leave out. The dialog also lets you specify which data column imports to which table column in your structured prompt.

![StructuredPrompt](https://drive.google.com/uc?id=1ncUvjbpNs_GpPqSzpNY5dl1KTsnueBQO)
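
To make the column-mapping step concrete, here is a minimal Python sketch of reading example pairs from CSV data and choosing which columns act as input and output. The CSV content, the column names, and the helper `load_examples` are all hypothetical; Google AI Studio does this mapping through its import dialog, not through this code.

```python
import csv
import io

# Hypothetical CSV content; in practice this would be read from a file.
CSV_TEXT = """product,copy,internal_id
Old-school sneaker,Let's lace up!,101
Supersoft hoodie,Stay cozy all day.,102
"""

def load_examples(csv_text, input_column, output_column):
    """Return (input, output) pairs, ignoring every other column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row[input_column], row[output_column]) for row in reader]

examples = load_examples(CSV_TEXT, "product", "copy")
for product, copy in examples:
    print(f"{product} -> {copy}")
```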


##Chat prompt

To create a chat prompt:
- Navigate to Google AI Studio.
- In the left panel, select Create new > Chat prompt.


![StructuredPrompt](https://drive.google.com/uc?id=1OjURaf1ht-YbY1D55wzY6sjXpXuAWQH9)

###Teach your bot to chat better
By default, the chat replies with long blocks of text, which is not user friendly. Let's teach the bot to reply better by adding examples.


![StructuredPrompt](https://drive.google.com/uc?id=1squYycaoHUy3T5rAi3-fbt_hc8v-rsS6)



#Gemini API

###How to get API Key

Go to [Gemini API Key](https://makersuite.google.com/app/apikey) and click *Create API key in new project*.

###Verify your API key with a curl command



```
API_KEY="ADD YOUR_API_KEY"
curl -H 'Content-Type: application/json' \
     -d '{"contents":[
            {"role": "user",
              "parts":[{"text": "Give me five subcategories of jazz?"}]}]}' \
     "https://generativelanguage.googleapis.com/v1/models/gemini-pro:generateContent?key=${API_KEY}"
```
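
For readers who prefer Python to curl, the same request can be sketched with only the standard library. This is a minimal sketch: the helper name `build_request` is made up for this example, and the URL and JSON body simply mirror the curl command above. The network call is left commented out so the block runs without a key.

```python
import json
import urllib.request

def build_request(api_key, prompt):
    """Build (but do not send) a generateContent request for gemini-pro."""
    url = ("https://generativelanguage.googleapis.com/v1/models/"
           f"gemini-pro:generateContent?key={api_key}")
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_request("YOUR_API_KEY", "Give me five subcategories of jazz?")
print(request.full_url)

# To actually send it (needs a valid key and network access):
# with urllib.request.urlopen(request) as reply:
#     print(json.load(reply))
```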

###AI Models
- Gemini is Google's latest generation of generative models, and goes beyond the capabilities of the PaLM family of models.
- A key difference between the Gemini and PaLM models is that the Gemini vision model is able to handle image input. You can prompt Gemini models with text, or images, or both. PaLM models only handle text input and output.

![AI Models](https://drive.google.com/uc?id=1cEVWNhRD2_SM1TqcbYQ2Al-L5rsHq6T2)

##Gemini Model variations

![AI Models](https://drive.google.com/uc?id=1FLI3WIozk8Ir11OigSdEnHdlENhDbz6y)
![AI Models](https://drive.google.com/uc?id=1KPVNc0j_rEllG3ESgC8jrXWqhRb4oJFa)
![AI Models](https://drive.google.com/uc?id=1psuapdQl5w0vPBjZe-rke5FIAmpPXVfu)
![AI Models](https://drive.google.com/uc?id=1AI-rARBP8evvSm4m6lo1R4-utMiWhv1r)

###Gemini Meta Data
![AI Models](https://drive.google.com/uc?id=1B8MooryRyd57JTA9MtbSNFZ5294cji8Z)


#Safety Settings & Guidance

By default, safety settings block content with a medium or high probability of being unsafe across all four safety categories.

The adjustable safety filters cover the following categories:

- Harassment
- Hate speech
- Sexually explicit
- Dangerous

![AI ModelsSafety ](https://drive.google.com/uc?id=16bKGgULRxRYsRpMy8Tx6-qe6nz3tNHk2)
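
The default described above can be pictured as a comparison between a probability rating and a blocking threshold. The sketch below is only an illustration of that idea: the level and threshold names follow the ones used by the Gemini API, but the function `is_blocked` and its logic are assumptions of this example, not the service's actual implementation.

```python
# Probability ratings, ordered from least to most likely to be unsafe
PROBABILITY_ORDER = ["NEGLIGIBLE", "LOW", "MEDIUM", "HIGH"]

# Lowest rating that each threshold blocks; None means nothing is blocked
THRESHOLD_FLOOR = {
    "BLOCK_NONE": None,
    "BLOCK_ONLY_HIGH": "HIGH",
    "BLOCK_MEDIUM_AND_ABOVE": "MEDIUM",
    "BLOCK_LOW_AND_ABOVE": "LOW",
}

def is_blocked(probability, threshold="BLOCK_MEDIUM_AND_ABOVE"):
    """Return True if content rated `probability` is blocked at `threshold`."""
    floor = THRESHOLD_FLOOR[threshold]
    if floor is None:
        return False
    return PROBABILITY_ORDER.index(probability) >= PROBABILITY_ORDER.index(floor)

# With the default threshold, MEDIUM and HIGH are blocked
for level in PROBABILITY_ORDER:
    print(level, is_blocked(level))
```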

Large language models (LLMs) are useful in part because they are creative tools that can address many different language tasks. Unfortunately, this also means that they can generate output that you don't expect, including text that's offensive, insensitive, or factually incorrect. Gemini AI is designed in line with the [Google AI Principles](https://ai.google/responsibility/principles/).

![AI ModelsSafety ](https://drive.google.com/uc?id=18W2Fyh07401rc0l7HRXQSD0c5C8ST2Ei)


![AI ModelsSafety ](https://drive.google.com/uc?id=1ai8LY6y4kUTNVUyNqiL19E6lVuHyKX3C)

When we build our own applications with LLMs, we need to consider the following points:

- Understanding the safety risks of your application
- Considering adjustments to mitigate safety risks
- Performing safety testing appropriate to your use case
- Soliciting feedback from users and monitoring usage

#Prompt design 101 or Prompting

Prompt design is the process of creating prompts that elicit the desired response from language models. Writing well-structured prompts is an essential part of ensuring accurate, high-quality responses from a language model.

###What is a prompt?

A prompt is a natural language request submitted to a language model to receive a response back. Prompts can contain questions, instructions, contextual information, examples, and partial input for the model to complete or continue.


Every prompt you send to the model includes parameter values that control how the model generates a response.
The most common model parameters are:

1.  Max output tokens -  Specifies the maximum number of tokens that can be generated in the response (100 tokens correspond to roughly 60-80 words)

2.  Temperature -  The temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a more deterministic or less open-ended response, while higher temperatures can lead to more diverse or creative results.

3.  topK - It changes how the model selects tokens for output. A topK of 1 means the selected token is the most probable among all the tokens in the model's vocabulary (also called greedy decoding), while a topK of 3 means that the next token is selected from among the 3 most probable using the temperature.

4.  topP - It changes how the model selects tokens for output. Tokens are selected from the most to least probable until the sum of their probabilities equals the topP value. For example, if tokens A, B, and C have a probability of 0.3, 0.2, and 0.1 and the topP value is 0.5, then the model will select either A or B as the next token by using the temperature and exclude C as a candidate. The default topP value is 0.95.
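
The interaction between temperature, topK, and topP can be sketched in a few lines of Python. This is a toy illustration, not how Gemini is implemented: the function names, the example scores, and the order of the steps (temperature first, then topK, then topP) are assumptions of this sketch.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

def filter_candidates(probs, top_k=None, top_p=None):
    """Keep the top_k most probable tokens, then the shortest prefix of them
    whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda pair: pair[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, p in ranked:
            kept.append((tok, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept
    return ranked

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Sample one token from the candidates that survive filtering."""
    ranked = filter_candidates(apply_temperature(logits, temperature), top_k, top_p)
    tokens = [tok for tok, _ in ranked]
    weights = [p for _, p in ranked]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Made-up scores for four candidate tokens
logits = {"jazz": 2.0, "blues": 1.0, "rock": 0.5, "polka": -1.0}
print(sample_next_token(logits, top_k=1))  # greedy decoding: only one candidate survives
print(sample_next_token(logits, temperature=0.7, top_k=3, top_p=0.95))
```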


##Types of prompts

1.  **Zero-shot prompts** - These prompts don't contain examples for the model to replicate. They show the model's ability to complete the prompt without any additional examples or information, so the model has to rely on its pre-existing knowledge to generate a plausible answer. Use zero-shot prompts to generate creative text formats, such as poems, code, scripts, musical pieces, emails, and letters.

2.  **One-shot prompts** - These prompts provide the model with a single example to replicate and continue the pattern. This allows for the generation of predictable responses from the model.

3.  **Few-shot prompts** - These prompts provide the model with multiple examples to replicate. Use few-shot prompts to complete complicated tasks, such as synthesizing data based on a pattern.
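
The three prompt types differ only in how many worked examples precede the new input. The sketch below shows this with a small helper; the helper name `build_prompt`, the `Input:`/`Output:` labels, and the sample reviews are illustrative choices, not a format required by Gemini.

```python
def build_prompt(task, examples=(), query=""):
    """Assemble a prompt from an instruction, worked examples, and a new input."""
    lines = [task]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

task = "Classify the sentiment of each review as positive or negative."
query = "The battery died in an hour."

zero_shot = build_prompt(task, query=query)
one_shot = build_prompt(task, [("Great sound quality!", "positive")], query)
few_shot = build_prompt(
    task,
    [("Great sound quality!", "positive"), ("Arrived broken.", "negative")],
    query,
)
print(few_shot)
```

Ending the prompt with an empty `Output:` line invites the model to complete the pattern set by the examples.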


### Good Prompt Design Strategies



*   Give clear instructions
*   Include examples
*   Let the model complete partial input
*   Prompt the model to format its response
*   Add contextual information
*   Add prefixes
*   Experiment with different parameter values
*   Use different phrasing
*   Switch to an analogous task
*   Change the order of prompt content



#GeminiAI With Python

In [41]:
#Install gemini ai

!pip install -q -U google-generativeai

In [42]:
#Import necessary packages

import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown


def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

In [43]:
# Used to securely store your API key
from google.colab import userdata

In [None]:
# Setup API Key
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)

In [None]:
#List models
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

##Generate Text from Text Input

In [None]:
#Select the model

model = genai.GenerativeModel('gemini-pro')

In [None]:
%%time
# Time the request; the %%time cell magic must be the first line of the cell
response = model.generate_content("What is the meaning of life?")

In [None]:
#Generate text output
to_markdown(response.text)

In [None]:
#Check the prompt feedback (Safety)

response.prompt_feedback

In [None]:
#Check other possible outputs

response.candidates

In [None]:
%%time
# Stream the output; the %%time cell magic must be the first line of the cell
response = model.generate_content("What is the meaning of life?", stream=True)

In [None]:
#Stream the output
#Note: Some response attributes are not available in streaming until you've iterated through all the response chunks.

for chunk in response:
  print(chunk.text)
  print("_"*80)
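
When streaming, it is often useful to show each chunk as it arrives and still keep the complete text at the end. The sketch below does that for any iterable of objects with a `.text` attribute; the helper `collect_stream` is made up for this example, and `SimpleNamespace` objects stand in for real response chunks so the block runs without an API key.

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Print each chunk's text as it arrives and return the full text."""
    pieces = []
    for chunk in chunks:
        print(chunk.text, end="", flush=True)
        pieces.append(chunk.text)
    print()
    return "".join(pieces)

# Stand-in for a streamed response
fake_stream = [SimpleNamespace(text="The meaning "), SimpleNamespace(text="of life...")]
full_text = collect_stream(fake_stream)
```

With the SDK, the streamed `response` from `generate_content(..., stream=True)` could be passed in place of `fake_stream`.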

## Generate text from image and text inputs

In [None]:
#Get Sample Image
!curl -o image.jpg https://i.pinimg.com/originals/51/ba/7f/51ba7f2b243f804805eca35af937089e.jpg

In [None]:
#Load image using PIL.Image python package

import PIL.Image

img = PIL.Image.open('image.jpg')
img

In [None]:
#Select Gemini pro vision model for image input

model = genai.GenerativeModel('gemini-pro-vision')

In [None]:
#Generate content from image

response = model.generate_content(img)

to_markdown(response.text)

In [None]:
#Give Text input and image together

response = model.generate_content(["Write a short, engaging blog post based on this picture. It should include a description of the fruit in the photo and explain why eating fruit every day is healthy.", img], stream=True)
response.resolve()

In [None]:
to_markdown(response.text)

## Chat conversations

In [None]:
#Start a chat session using gemini-pro
model = genai.GenerativeModel('gemini-pro')
chat = model.start_chat(history=[])
chat

In [None]:
#Start Chat

response = chat.send_message("In one sentence, explain how AI could influence the future.")
to_markdown(response.text)

In [None]:
#list chat history

chat.history

In [None]:
#Use stream to continue sending messages

response = chat.send_message("Okay, how about a more detailed explanation to a high schooler?", stream=True)

for chunk in response:
  print(chunk.text)
  print("_"*80)

In [None]:
#Markdown chat list

for message in chat.history:
  display(to_markdown(f'**{message.role}**: {message.parts[0].text}'))

##Use Embeddings

The embedding service in the Gemini API generates state-of-the-art embeddings for words, phrases, and sentences. The resulting embeddings can then be used for NLP tasks, such as semantic search, text classification and clustering among many others.

Embedding is a technique used to represent information as a list of floating point numbers in an array. With Gemini, you can represent text (words, sentences, and blocks of text) in a vectorized form, making it easier to compare and contrast embeddings. For example, two texts that share a similar subject matter or sentiment should have similar embeddings, which can be identified through mathematical comparison techniques such as cosine similarity.
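
Cosine similarity itself is simple to compute. The sketch below uses toy three-dimensional vectors in place of real embeddings (which have many more dimensions); the vector values and variable names are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors standing in for the embeddings of three texts
cats = [0.9, 0.1, 0.0]
kittens = [0.8, 0.2, 0.1]
taxes = [0.0, 0.1, 0.9]

print(cosine_similarity(cats, kittens))  # close to 1: similar subject matter
print(cosine_similarity(cats, taxes))    # close to 0: unrelated subject matter
```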

Use the `embed_content` method to generate embeddings.

![AI Embedding ](https://drive.google.com/uc?id=1eHlaBX4K6Dum3f259xZaiinzW2dsuZ7R)


In [None]:
#Embedding for single string document retrieval
#The retrieval_document task type is the only task that accepts a title.

result = genai.embed_content(
    model="models/embedding-001",
    content="What is the meaning of life?",
    task_type="retrieval_document",
    title="Testing retrieval document embedding")

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED]')

In [None]:
#To handle batches of strings, pass a list of strings in content:

result = genai.embed_content(
    model="models/embedding-001",
    content=[
      'What is the meaning of life?',
      'How much wood would a woodchuck chuck?',
      'How does the brain work?'],
    task_type="retrieval_document",
    title="Embedding of list of strings")

# A list of inputs > A list of vectors output
for v in result['embedding']:
  print(str(v)[:50], '... TRIMMED ...')

In [None]:
#Show the content of the first candidate

response.candidates[0].content

In [None]:
#Embed the content of a response candidate

result = genai.embed_content(
    model = 'models/embedding-001',
    content = response.candidates[0].content)

# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED ...')

In [None]:
#List chat history

chat.history

In [None]:
#Embed the chat history

result = genai.embed_content(
    model = 'models/embedding-001',
    content = chat.history)

# A list of messages in > a list of vectors out
for v in result['embedding']:
  print(str(v)[:50], '... TRIMMED...')

##Safety Settings

In [None]:
#Test rude words safety setting

response = model.generate_content('[Questionable rude words prompt here]')
response.candidates

In [None]:
response.prompt_feedback

In [None]:
#Change Safety Setting
response = model.generate_content('[Questionable prompt here]',
                                  safety_settings={'HARASSMENT':'block_none'})
response.text
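
When a prompt or its response is blocked, there may be no text to read. The sketch below shows one defensive pattern. It assumes that accessing `response.text` raises `ValueError` when no valid text part is available; verify this against the SDK version you use. The helper `text_or_feedback` and the fake response class are made up so the block runs without an API key.

```python
from types import SimpleNamespace

def text_or_feedback(response):
    """Return the response text, or a short note when no text is available."""
    try:
        return response.text
    except ValueError:  # assumed SDK behavior for blocked or empty responses
        return f"No text returned. Prompt feedback: {response.prompt_feedback}"

class FakeBlockedResponse:
    """Stand-in for a blocked response."""
    prompt_feedback = "block_reason: SAFETY"

    @property
    def text(self):
        raise ValueError("response contains no valid part")

print(text_or_feedback(FakeBlockedResponse()))
print(text_or_feedback(SimpleNamespace(text="Hello!", prompt_feedback="")))
```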

## Encode messages

This section offers a fully-typed equivalent of the earlier image-and-text example, so you can better understand the lower-level details of how the SDK encodes messages.

In [None]:
import google.ai.generativelanguage as glm

In [None]:
#Create Encoded Messages

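#The SDK attempts to convert your message to a `glm.Content` object, which contains a list of `glm.Part` objects that each contain either:
#  1. text (a string), or
#  2. inline_data (a `glm.Blob`), where a blob holds binary data and a mime_type.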

model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content(
    glm.Content(
        parts = [
            glm.Part(text="Write a short, engaging blog post based on this picture."),
            glm.Part(
                inline_data=glm.Blob(
                    mime_type='image/jpeg',
                    data=pathlib.Path('image.jpg').read_bytes()
                )
            ),
        ],
    ),
    stream=True)

In [None]:
response.resolve()

to_markdown(response.text[:100] + "... [TRIMMED] ...")

##Multi-turn conversations

`genai.ChatSession` is just a wrapper around `GenerativeModel.generate_content`.

In [None]:
#Create chat model

model = genai.GenerativeModel('gemini-pro')

messages = [
    {'role':'user',
     'parts': ["Briefly explain how AI can benefit students"]}
]
response = model.generate_content(messages)

to_markdown(response.text)

In [None]:
#Note: For multi-turn conversations, you need to send the whole conversation history with each request. The API is stateless.

messages.append({'role':'model',
                 'parts':[response.text]})

messages.append({'role':'user',
                 'parts':["Okay, how about a more detailed explanation to a high school student?"]})

response = model.generate_content(messages)

to_markdown(response.text)
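
Because the API is stateless, a chat wrapper mostly has to do bookkeeping: append each turn to a list and resend the whole list. The sketch below shows that bookkeeping in isolation. The class `SimpleChat` and the stand-in `fake_generate` function are hypothetical and only illustrate the pattern; they are not the SDK's `ChatSession`.

```python
class SimpleChat:
    """Keeps the conversation history and resends all of it on every turn."""

    def __init__(self, generate):
        # `generate` takes the full message list and returns the reply text
        self._generate = generate
        self.history = []

    def send_message(self, text):
        self.history.append({"role": "user", "parts": [text]})
        reply = self._generate(self.history)
        self.history.append({"role": "model", "parts": [reply]})
        return reply

# Stand-in generator so the sketch runs offline; it reports how much history it saw
def fake_generate(messages):
    return f"(reply after seeing {len(messages)} message(s))"

chat_sketch = SimpleChat(fake_generate)
print(chat_sketch.send_message("Briefly explain how AI can benefit students"))
print(chat_sketch.send_message("Okay, how about a more detailed explanation?"))
```

With the SDK, the stand-in could be swapped for something like `lambda messages: model.generate_content(messages).text`.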

## Generation configuration

In [None]:
#The generation_config argument allows you to modify the generation parameters. Every prompt you send to the model includes parameter values that control how the model generates responses.

model = genai.GenerativeModel('gemini-pro')
response = model.generate_content(
    'Tell me a story about a magic backpack.',
    generation_config=genai.types.GenerationConfig(
        # Only one candidate for now.
        candidate_count=1,
        stop_sequences=['x'],
        max_output_tokens=20,
        temperature=1.0)
)

In [None]:
text = response.text

# Mark the output as cut off if generation stopped at the token limit
if response.candidates[0].finish_reason.name == "MAX_TOKENS":
    text += '...'

to_markdown(text)

In [None]:
response.prompt_feedback