<a href="https://colab.research.google.com/github/sridevi-1234/MLops_git/blob/master/gemini_api_walkthrough.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Introduction to GoogleAI**

Gemini was launched in December 2023 as the successor to PaLM (Pathways Language Model) and LaMDA (Language Model for Dialogue Applications).

You can use the Gemini chatbot, which was formerly known as Bard. Google announced Bard in February 2023 as a GenAI chatbot powered by LaMDA. The chatbot later switched to the PaLM model before finally switching to the Gemini model.

Here are a few reasons why you might choose Gemini:
- **Context Window:** In May 2024, Gemini 1.5 was updated with a context window of 2 million tokens. To put that in perspective, 2 million tokens can hold up to 2 hours of video, 22 hours of audio, 60K lines of code, or 1.4 million words of text.
- **Multimodal Capabilities:** Works with text, images, and videos.
- **Variety of options:** Available in several variants: Ultra, Pro, Flash, and Nano.
- **Generous free offerings:** Offers a free-to-use option.

**Important Links:**
1. [Gemini Chatbot](https://gemini.google.com/app)

**Dependencies:**
```python
! pip install google-generativeai
```

In [None]:
#!pip install google-generativeai

## **Importing Google Gemini AI**

In [None]:
import google.generativeai as genai

## **Setting the API Key**

In [None]:
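# Read the key with a context manager so the file is closed afterwards,
# and strip any trailing newline before passing it to the client
with open("keys/.gemini.txt") as f:
    key = f.read().strip()

genai.configure(api_key=key)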
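
A common alternative to a key file is an environment variable. The helper below is a minimal sketch and not part of the `google-generativeai` library: the function name `load_api_key` and the variable name `GEMINI_API_KEY` are assumptions chosen for illustration. It checks the environment first and falls back to the key file used above.

```python
import os
from pathlib import Path


def load_api_key(env_var="GEMINI_API_KEY", fallback_path="keys/.gemini.txt"):
    """Return the API key from an environment variable, else from a file.

    Both `env_var` and `fallback_path` are illustrative defaults.
    """
    key = os.environ.get(env_var)
    if key and key.strip():
        return key.strip()

    path = Path(fallback_path)
    if path.is_file():
        # strip() removes the trailing newline most editors add
        return path.read_text().strip()

    raise RuntimeError(f"No API key found in ${env_var} or {fallback_path}")


# Usage with the client imported earlier:
# genai.configure(api_key=load_api_key())
```

Keeping the key out of the notebook itself also makes it harder to commit it to version control by accident.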

## **Available Models**

In [None]:
for m in genai.list_models():
    print(m.name)

models/chat-bison-001
models/text-bison-001
models/embedding-gecko-001
models/gemini-1.0-pro-latest
models/gemini-1.0-pro
models/gemini-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-vision-latest
models/gemini-pro-vision
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-001
models/gemini-1.5-pro-002
models/gemini-1.5-pro
models/gemini-1.5-pro-exp-0801
models/gemini-1.5-pro-exp-0827
models/gemini-1.5-flash-latest
models/gemini-1.5-flash-001
models/gemini-1.5-flash-001-tuning
models/gemini-1.5-flash
models/gemini-1.5-flash-exp-0827
models/gemini-1.5-flash-002
models/gemini-1.5-flash-8b
models/gemini-1.5-flash-8b-001
models/gemini-1.5-flash-8b-latest
models/gemini-1.5-flash-8b-exp-0827
models/gemini-1.5-flash-8b-exp-0924
models/embedding-001
models/text-embedding-004
models/aqa
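
Not every model in this list can generate text; some are embedding or question-answering models. Each object returned by `genai.list_models()` carries a `supported_generation_methods` attribute, so the list can be narrowed down. The sketch below uses stand-in objects so that it runs offline; the helper name `names_supporting` and the sample method lists are illustrative assumptions.

```python
from types import SimpleNamespace


def names_supporting(models, method="generateContent"):
    """Return the names of models whose supported methods include `method`."""
    return [
        m.name
        for m in models
        if method in getattr(m, "supported_generation_methods", [])
    ]


# Stand-in objects with illustrative values. With the real client,
# pass genai.list_models() instead of sample_models.
sample_models = [
    SimpleNamespace(
        name="models/gemini-1.5-flash",
        supported_generation_methods=["generateContent", "countTokens"],
    ),
    SimpleNamespace(
        name="models/text-embedding-004",
        supported_generation_methods=["embedContent"],
    ),
]

print(names_supporting(sample_models))
```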


## **Prompting the Gemini Model**

In [None]:
# Initializing the GoogleAI Model
model = genai.GenerativeModel(model_name="gemini-1.5-flash")

# Defining the Prompt
user_prompt = """Complete the following:
                In our solar system, Earth is a """

# Calling the model with the prompt
response = model.generate_content(user_prompt)

# Printing the response
print(response.text)

In our solar system, Earth is a **planet**. 



In [None]:
user_prompt = """Generate some factual information to complete the following in 2-3 lines:
                In our solar system, Earth is a """

response = model.generate_content(user_prompt)

print(response.text)

In our solar system, Earth is a **rocky planet**, the **third planet from the Sun**, and the **only known planet to harbor life**. 



## **Adding a System Prompt**

**Important Note:** A system prompt can be specified with the `system_instruction` argument. `system_instruction` is not enabled for `models/gemini-pro`.

In [None]:
model = genai.GenerativeModel(model_name="models/gemini-1.5-pro-latest",
                              system_instruction="""Generate some factual information to complete the user input.
                              Completion must have maximum 2-3 lines.""")

user_prompt = """In our solar system, Earth is a """

response = model.generate_content(user_prompt)

print(response.text)

...rocky planet that orbits the Sun, and is the only known astronomical object to harbor life. 



## **Important Parameters**

**Reference - [HuggingFace Blog](https://huggingface.co/blog/how-to-generate)**

If you run the above code a few times, you will notice that the output changes across runs. Generative models are **non-deterministic**: even with the same input, they can produce different outputs. This behavior allows for creativity and diversity in the generated outputs, which is useful when generating different creative styles. Parameters such as `temperature`, `top_p`, and `top_k` help control this behavior.

- **candidate_count:** Controls the number of responses generated for a single prompt. The default value is 1. Increasing it generates more candidate responses and increases resource usage.
- **stop_sequences:** A list of strings that act as stop signs for the model; generation stops when one of them is produced.
- **max_output_tokens:** The maximum number of tokens the model will generate in the response.
- **temperature:** Acts as a control knob for the randomness of the model's output. A higher value results in more varied and creative responses; a lower value gives more predictable results.
- **top_p:** Ranges over [0.0, 1.0]. This is also known as **nucleus sampling**. The model considers only the smallest set of most likely next tokens whose cumulative probability reaches or exceeds the `top_p` value. A higher value creates a looser threshold, allowing the model to consider a wider range of probable options while still prioritizing the most likely ones. A lower `top_p` value creates a stricter threshold, leading to less diverse and more predictable outputs.
- **top_k:** Limits the candidates for the next token to the `k` most probable options in the probability distribution. A lower `k` value restricts the selection to a smaller pool of the most likely tokens, leading to less diverse and more predictable outputs.

Both `top_p` and `top_k` work in conjunction with the `temperature` parameter.
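
To make the interaction between these three parameters concrete, here is a minimal pure-Python sketch of one sampling step. It is an illustration under stated assumptions and not Gemini's actual implementation: the order of operations (temperature scaling, then `top_k`, then `top_p`), the toy vocabulary, and the logit values are all made up for the example, and `temperature` is assumed to be greater than 0.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_candidates(probs, top_k=None, top_p=1.0):
    """Return {index: renormalized prob} after top-k, then top-p filtering."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]  # keep only the k most probable tokens
    kept, cumulative = [], 0.0
    for idx, p in ranked:
        kept.append((idx, p))
        cumulative += p
        if cumulative >= top_p:  # nucleus is complete
            break
    total = sum(p for _, p in kept)
    return {idx: p / total for idx, p in kept}


def sample_token(logits, temperature=1.0, top_k=None, top_p=1.0, rng=random):
    """Sample one token index from the filtered distribution."""
    candidates = filter_candidates(softmax(logits, temperature), top_k, top_p)
    indices = list(candidates)
    weights = [candidates[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


vocab = ["planet", "rock", "sphere", "home", "star"]  # toy vocabulary
logits = [3.0, 2.0, 1.5, 1.0, 0.2]                    # made-up scores

for t in (0.1, 0.9):
    nucleus = filter_candidates(softmax(logits, t), top_k=32, top_p=0.1)
    print(t, [vocab[i] for i in nucleus])
```

In this toy setup, a small `top_p` such as 0.1 leaves only the single most likely token in the nucleus at both temperatures, so changing `temperature` alone has no visible effect on the sampled token.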

In [None]:
model = genai.GenerativeModel("gemini-1.5-flash")

# Setting our parameters
custom_config = genai.types.GenerationConfig(max_output_tokens=256, temperature=1.0)

user_prompt = """What is feature selection in data science? Explain in detail."""

# Passing our custom parameters to the generate_content method
response = model.generate_content(user_prompt, generation_config=custom_config)

print(response.text)

## Feature Selection: The Art of Choosing the Right Features for Your Model

Feature selection is a crucial step in data science, particularly in machine learning, where it plays a vital role in building efficient and accurate models. It's the process of **identifying and selecting the most relevant features from a dataset** for building a predictive model. By focusing on the most informative features, we achieve several benefits:

**Why Feature Selection Matters:**

* **Improved Model Performance:** Irrelevant features can introduce noise and bias, leading to overfitting and poor generalization on unseen data. Removing them enhances the model's accuracy and ability to predict future outcomes.
* **Reduced Complexity:** Fewer features simplify the model, making it easier to understand, interpret, and deploy. This also reduces computational time and resources needed for training and prediction.
* **Enhanced Interpretability:** By selecting the most influential features, we gain insights 

In [None]:
# Setting our parameters
custom_config = genai.types.GenerationConfig(temperature=0.1, top_p=0.1, top_k=32)

user_prompt = """What is feature selection in data science? Explain in detail."""

# Passing our custom parameters to the generate_content method
response = model.generate_content(user_prompt, generation_config=custom_config)

print(response.text)

## Feature Selection: The Art of Choosing the Right Variables

In data science, feature selection is a crucial process that involves **identifying and selecting the most relevant features (variables) from a dataset** for use in building a predictive model. It's like choosing the right ingredients for a recipe – the wrong ones can ruin the dish, while the right ones create a masterpiece.

**Why is Feature Selection Important?**

* **Improved Model Performance:** Irrelevant or redundant features can introduce noise and complexity, hindering model accuracy and generalization. Selecting the right features can lead to simpler, more interpretable, and more accurate models.
* **Reduced Overfitting:** Overfitting occurs when a model learns the training data too well, failing to generalize to unseen data. Feature selection helps prevent this by reducing the number of features, thus reducing the model's complexity.
* **Faster Training and Deployment:** Fewer features mean less data to process, l

In [None]:
# Setting our parameters
custom_config = genai.types.GenerationConfig(temperature=0.9, top_p=0.1, top_k=32)

user_prompt = """What is feature selection in data science? Explain in detail."""

# Passing our custom parameters to the generate_content method
response = model.generate_content(user_prompt, generation_config=custom_config)

print(response.text)

## Feature Selection: The Art of Choosing the Right Variables

In data science, feature selection is a crucial process that involves **identifying and selecting the most relevant features (variables) from a dataset** for use in building a predictive model. It's like choosing the right ingredients for a recipe – the wrong ones can ruin the dish, while the right ones create a masterpiece.

**Why is Feature Selection Important?**

* **Improved Model Performance:** Irrelevant or redundant features can introduce noise and complexity, hindering model accuracy and generalization. Selecting the right features can lead to simpler, more interpretable, and more accurate models.
* **Reduced Overfitting:** Overfitting occurs when a model learns the training data too well, failing to generalize to unseen data. Feature selection helps prevent this by reducing the number of features, thus reducing the model's complexity.
* **Faster Training and Deployment:** Fewer features mean less data to process, l