In [None]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.sandbox.google.com/github/https-deeplearning-ai/sc-gc-c4-gemini-public/blob/main/lesson-5/L5_colab_videos.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

# Cost Estimate

Running this notebook once with your Google Cloud account, using `Gemini 1.5 Flash` and skipping the "Finding a Needle in a Haystack" example (which has been converted to markdown), should cost less than 0.20 USD (as of August 2024). See the latest Gemini pricing [here](https://cloud.google.com/vertex-ai/generative-ai/pricing).

# SETUP

This is a follow-up to the [How to Set Up your Google Cloud Account](https://learn.deeplearning.ai/courses/large-multimodal-model-prompting-with-gemini/lesson/9/how-to-set-up-your-google-cloud-account-|-try-it-out-yourself-[optional]) instructions from the course [Large Multimodal Model Prompting with Gemini](https://learn.deeplearning.ai/courses/large-multimodal-model-prompting-with-gemini/lesson/1/introduction) on the [Learning Platform](https://learn.deeplearning.ai) of [DeepLearning.AI](https://www.deeplearning.ai).

### Install Vertex AI SDK and other Required Packages

In [None]:
%pip install --upgrade --user --quiet google-cloud-aiplatform

### Restart Runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.

The restart might take a minute or longer. After it's restarted, continue to the next step.

In [None]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

### Authenticate your Notebook Environment (Colab Only)

If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

**NOTE:** The Gmail address you use to authenticate this Colab notebook must be the same one you used to set up your Google Cloud account and your Project.

In [None]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud Project Information and Initialize Vertex AI SDK

**Add _your_ Project ID below**, which you created while following the [How to Set Up your Google Cloud Account](https://learn.deeplearning.ai/courses/large-multimodal-model-prompting-with-gemini/lesson/9/how-to-set-up-your-google-cloud-account-|-try-it-out-yourself-[optional]) instructions. If your `Project ID` was `dlai-shortcourse-on-gemini`, then you can run the cell below as it is. Otherwise, be sure to change it.

You can also look up your Project ID in your [Project Dashboard](https://console.cloud.google.com/projectselector2/home/dashboard).

In [None]:
PROJECT_ID = "dlai-shortcourse-on-gemini"  # @param {type:"string"}
LOCATION = "us-central1"  # @param {type:"string"}


import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION)

# IN COURSE VIDEO

The lesson video starts below.

# [Lesson 5: Developing Use Cases with Videos](https://learn.deeplearning.ai/courses/large-multimodal-model-prompting-with-gemini/lesson/6/developing-use-cases-videos)

In this lesson, you'll explore Gemini's multimodal capabilities by passing videos and text as input.

**Note:** In the latest version, `from vertexai.preview.generative_models` has been changed to `from vertexai.generative_models`.

`from vertexai.preview.generative_models` can still be used.

In [None]:
from vertexai.generative_models import GenerativeModel
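
Since both import paths are said to work, a notebook that has to run against different SDK versions could try them in order and use whichever is available. The following is a minimal sketch of that idea; the helper name `import_first_available` is hypothetical and not part of the Vertex AI SDK.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This candidate is not installed; try the next one.
            continue
    raise ImportError(f"None of these modules could be imported: {module_names}")


# Usage (requires google-cloud-aiplatform to be installed):
# generative_models = import_first_available(
#     ["vertexai.generative_models", "vertexai.preview.generative_models"]
# )
# GenerativeModel = generative_models.GenerativeModel
```

The usage lines are left commented out so the sketch can be read and run without the SDK installed.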

- Load the `gemini-pro-vision` model.
- When specifying `gemini-pro-vision`, the [gemini-1.0-pro-vision](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemini-pro-vision) model is used.

**Note:** The video uses `gemini-pro-vision`, which you can also use. Alternatively, use `gemini-1.5-flash-001`, which is dramatically cheaper than `gemini-pro-vision`. You can take a look at pricing [here](https://cloud.google.com/vertex-ai/generative-ai/pricing).

```Python
multimodal_model = GenerativeModel("gemini-pro-vision")
```

In [None]:
multimodal_model = GenerativeModel("gemini-1.5-flash-001")

## Digital Marketer

In [None]:
file_path_1 = "dlai-sc-gemini-bucket/vertex-ai-langchain.mp4"
video_uri_1 = f"gs://{file_path_1}"
video_url_1 = f"https://storage.googleapis.com/{file_path_1}"
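
The cell above builds two addresses for the same object: a `gs://` URI that is passed to the model and an HTTPS URL that is used for playback in the notebook. As a small sketch, the conversion can be wrapped in a helper; `gs_to_https` is a hypothetical name, and the resulting URL only plays in the browser if the object is publicly readable.

```python
def gs_to_https(gs_uri: str) -> str:
    """Convert a gs://bucket/object URI to its storage.googleapis.com URL."""
    prefix = "gs://"
    if not gs_uri.startswith(prefix):
        raise ValueError(f"Not a Cloud Storage URI: {gs_uri!r}")
    return "https://storage.googleapis.com/" + gs_uri[len(prefix):]


video_uri = "gs://dlai-sc-gemini-bucket/vertex-ai-langchain.mp4"
print(gs_to_https(video_uri))
# https://storage.googleapis.com/dlai-sc-gemini-bucket/vertex-ai-langchain.mp4
```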

In [None]:
import IPython

In [None]:
IPython.display.Video(video_url_1, width=450)

In [None]:
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    Part,
)

In [None]:
video_1 = Part.from_uri(video_uri_1, mime_type="video/mp4")

- Structure your prompt(s).
- Be specific with what you want the model to do for you.
- You can even specify the output format of the response from the model.
- In this case, you are asking for the response to be in JSON format.

In [None]:
role = """
You are a great digital marketer working on a new video.
"""

In [None]:
tasks = """
You will add the video to your website and to do this you
need to complete some tasks. Please make sure your answer
is structured.

Tasks:
- What is the title of the video?
- Write a summary of what is in the video.
- Generate metadata for the video in JSON that includes: \
Title, short description, language, and company.
"""

# tasks = """
# You will add the video to your website and to do this you
# need to complete some tasks. Please make sure your answer
# is structured.

# Tasks:
# - What is the title of the video?
# - Write a summary of what is in the video.
# - Generate metadata for the video that includes:\
# Title, short description, language, and company.
# """

- You can choose the number of variables you want for your prompt.
- More variables give you more flexibility to make specific changes to your prompts while keeping everything else the same.

In [None]:
# format_json = "Please output the metadata in JSON"

In [None]:
contents_1 = [video_1, role, tasks]

# contents_1 = [video_1, role, tasks, format_json]
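
One way to switch between the two `contents_1` variants without commenting lines in and out is to assemble the list from optional parts. This is only a sketch: `build_contents` is a hypothetical helper, and the strings below are shortened stand-ins for the notebook's `role`, `tasks`, and `format_json` (in the notebook, `video_1` would come first).

```python
def build_contents(*parts):
    """Return a contents list, skipping parts that are None or blank strings."""
    contents = []
    for part in parts:
        if part is None:
            continue
        if isinstance(part, str) and not part.strip():
            continue
        contents.append(part)
    return contents


role = "You are a great digital marketer working on a new video."
tasks = "Write a summary of what is in the video."
format_json = "Please output the metadata in JSON"

with_format = build_contents(role, tasks, format_json)
without_format = build_contents(role, tasks, None)
print(len(with_format), len(without_format))  # 3 2
```

Passing `None` for a part drops it, so a single call site can cover both prompt variants.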

- Feel free to change the `temperature`.

In [None]:
generation_config_1 = GenerationConfig(
    temperature=0.1,
)
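
To build intuition for what changing `temperature` does, here is a toy sketch of the textbook softmax-with-temperature calculation. It is not Gemini's implementation, and the scores are made up; it only illustrates why a low temperature makes the top choice dominate while a higher one spreads probability across alternatives.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - highest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (0.1, 0.9):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```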

In [None]:
responses = multimodal_model.generate_content(
    contents_1,
    generation_config=generation_config_1,
    stream=False
)

**Note**: If you set `stream=True`, the result is a sequence of partial responses, which you print chunk by chunk:
```Python
for response in responses:
    print(response.text, end="")
```
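
If you switch between the two modes often, a small helper can hide the difference. The sketch below assumes only what the notebook shows: a non-streaming result exposes `.text`, and a streaming result yields chunks that each expose `.text`. The helper name `collect_text` and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def collect_text(responses, stream):
    """Return the full generated text for either streaming mode."""
    if stream:
        # With stream=True the result is an iterable of partial responses.
        return "".join(chunk.text for chunk in responses)
    # With stream=False the result is a single response object.
    return responses.text


# Stand-in objects that only mimic the `.text` attribute used above.
fake_stream = [SimpleNamespace(text="Hello, "), SimpleNamespace(text="world")]
fake_single = SimpleNamespace(text="Hello, world")

print(collect_text(fake_stream, stream=True))
print(collect_text(fake_single, stream=False))
```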

**Note**: LLMs do not always produce the same results, especially because they are frequently updated. So the output you see in the video might differ from what you get.

In [None]:
print(responses.text, end="")

Here are the tasks you requested:

- **Title of the video:** Build AI-powered apps on Vertex AI with LangChain
- **Summary of the video:** This video explains how to use Vertex AI and LangChain to build AI-powered applications. The video covers common design patterns for using large language models, including how to include data from external sources and how to chain multiple models together. The video also shows how to use Vertex AI extensions to deploy LangChain applications.
- **Metadata for the video in JSON:**
```json
{
  "Title": "Build AI-powered apps on Vertex AI with LangChain",
  "short description": "Learn how to use Vertex AI and LangChain to build AI-powered applications. This video covers common design patterns, including how to include data from external sources and how to chain multiple models together. It also shows how to use Vertex AI extensions to deploy LangChain applications.",
  "language": "English",
  "company": "Google Cloud"
}
``` 
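
Because the prompt asks for the metadata in JSON, the reply can be parsed into a Python dictionary instead of being read by eye. The following is a rough sketch that assumes the model wraps the JSON in a fenced block, as in the output above; `extract_json_block` is a hypothetical helper, and the sample text is a shortened copy of that output.

```python
import json
import re

FENCE = "`" * 3  # three backticks


def extract_json_block(text):
    """Return the first fenced JSON object in text as a Python object."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(\{.*?\})\s*" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        raise ValueError("No fenced JSON object found in the response text")
    return json.loads(match.group(1))


sample_response = "\n".join([
    "- **Metadata for the video in JSON:**",
    FENCE + "json",
    '{"Title": "Build AI-powered apps on Vertex AI with LangChain",',
    ' "language": "English", "company": "Google Cloud"}',
    FENCE,
])

metadata = extract_json_block(sample_response)
print(metadata["Title"])
```

If the model returns malformed JSON, `json.loads` raises `json.JSONDecodeError`, so real use would want to handle that case as well.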


## Explaining the Educational Concepts

In [None]:
file_path_2 = "dlai-sc-gemini-bucket/descending-into-ml.mp4"
video_uri_2 = f"gs://{file_path_2}"
video_url_2 = f"https://storage.googleapis.com/{file_path_2}"

IPython.display.Video(video_url_2, width=450)

In [None]:
video_2 = Part.from_uri(video_uri_2, mime_type="video/mp4")

- You can even ask the model to base an answer on its answers to previous questions.
- You can also ask it to generate programming code based on previous answers.

In [None]:
prompt = """
Please have a look at the video and answer the following
questions.

Questions:
- Question 1: Which concept is explained in the video?
- Question 2: Based on your answer to Question 1,
can you explain the basic math of this concept?
- Question 3: Can you provide a simple scikit code example
explaining the concept?
"""

In [None]:
contents_2 = [video_2, prompt]

In [None]:
responses = multimodal_model.generate_content(
    contents_2,
    stream=False
)

**Note**: LLMs do not always produce the same results, especially because they are frequently updated. So the output you see in the video might differ from what you get.

In [None]:
print(responses.text)

Here are the answers to your questions:

**Question 1: Which concept is explained in the video?**

The video explains the concept of **Linear Regression**, a type of supervised learning model. 

**Question 2: Based on your answer to Question 1, can you explain the basic math of this concept?**

Linear regression aims to find the best-fitting line through a set of data points. The equation of this line is:

**y = wx + b**

* **y**: The target variable (dependent variable) we want to predict.
* **x**: The input feature (independent variable) used to predict y.
* **w**: The weight vector, which determines the slope of the line.
* **b**: The bias term, which determines the y-intercept of the line.

The goal of linear regression is to find the values of w and b that minimize the difference between the predicted y values and the true y values in the dataset. 

**Question 3: Can you provide a simple scikit code example explaining the concept?**

```python
import pandas as pd
from sklearn.line
```

- You can copy/paste and run your generated code in the cell below.

**Note:** LLMs are known to generate code that is incomplete or has bugs.
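
A cheap first check before running generated code is to see whether it even parses. The sketch below uses the standard library's `ast.parse`, which checks syntax without executing anything; `is_valid_python` is a hypothetical helper name. Passing this check does not mean the code is correct, only that it is not cut off mid-statement like the output above.

```python
import ast


def is_valid_python(source):
    """Return True if source parses as Python, False on a syntax error."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


truncated = "import pandas as pd\nfrom sklearn.line"
complete = "from sklearn.linear_model import LinearRegression\nmodel = LinearRegression()"

print(is_valid_python(truncated))  # False
print(is_valid_python(complete))   # True
```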

In [None]:
### you can copy/paste your generated code here:



- The cell below contains the code that was generated in the lecture video.

In [None]:
# # Import the necessary libraries
# import numpy as np
# import matplotlib.pyplot as plt
# from sklearn.linear_model import LinearRegression

# # Create some data
# X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
# y = np.dot(X, np.array([1, 2])) + 3

# # Fit the linear regression model
# model = LinearRegression()
# model.fit(X, y)

# # Make predictions
# y_pred = model.predict(X)

# # Plot the data and the fitted line
# plt.scatter(X[:, 1], y)
# plt.plot(X[:, 1], y_pred, color='red')
# plt.show()

## Extracting Information

In [None]:
file_path_4 = "dlai-sc-gemini-bucket/google-search.mp4"
video_uri_4 = f"gs://{file_path_4}"
video_url_4 = f"https://storage.googleapis.com/{file_path_4}"

IPython.display.Video(video_url_4, width=450)

In [None]:
video_4 = Part.from_uri(video_uri_4, mime_type="video/mp4")

**Note:** In the lecture video, everything was put in a single prompt (`prompt_4`):

```Python
prompt_4 = """
Answer the following questions using the video only.
Present the results in a table with a row for each question
and its answer.
Make sure the table is in markdown format.

Questions:
- What is the most searched sport?
- Who is the most searched scientist?

"""

contents_4 = [video_4, prompt_4]
```
But as also mentioned in the lecture, you can break it into separate variables (`questions` and `format_html`), as done in the notebook below. Feel free to pause the video and compare your notebook with the video to see the differences.

- Here, you have your questions.

In [None]:
questions = """
Answer the following questions using the video only.

Questions:
- What is the most searched sport?
- Who is the most searched scientist?
"""

# questions = """
# Answer the following questions using the video only.
# If the answer is not found in the video,
# say "Not found in video".

# Questions:
# - What is the most searched sport?
# - Who is the most searched scientist?
# """

- Here, you specify the output format.
- In this case, it is table format.

In [None]:
format_html = """
Format:
Present the results in a table with a row for each question
and its answer.
Make sure the table is in markdown format.
"""

In [None]:
contents_4 = [video_4, questions, format_html]

- Set the `temperature`. For now, it is `temperature=0.9`

In [None]:
generation_config_1 = GenerationConfig(
    temperature=0.9,
)

In [None]:
responses = multimodal_model.generate_content(
    contents_4,
    generation_config=generation_config_1,
    stream=True
)

**Note**: LLMs do not always produce the same results, especially because they are frequently updated. So the output you see in the video might differ from what you get.

In [None]:
for response in responses:
    print(response.text, end="")

| Question | Answer |
|---|---|
| What is the most searched sport? | Soccer/Football |
| Who is the most searched scientist? | Albert Einstein |

```
You can copy/paste your generation in this Markdown cell (double click here)
```
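
If you would rather work with the generated table in code than paste it into a cell, a simple pipe-delimited markdown table can be parsed with plain string operations. This is a minimal sketch with a hypothetical helper, `parse_markdown_table`; it assumes a header row, a separator row, and no escaped `|` characters inside cells.

```python
def parse_markdown_table(table_text):
    """Parse a simple pipe-delimited markdown table into a list of dicts."""
    lines = [line.strip() for line in table_text.strip().splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("Expected a header row and a separator row")

    def split_row(line):
        return [cell.strip() for cell in line.strip("|").split("|")]

    header = split_row(lines[0])
    # lines[1] is the |---|---| separator row and carries no data.
    return [dict(zip(header, split_row(line))) for line in lines[2:]]


table = """
| Question | Answer |
|---|---|
| What is the most searched sport? | Soccer/Football |
| Who is the most searched scientist? | Albert Einstein |
"""

rows = parse_markdown_table(table)
print(rows[0]["Answer"])  # Soccer/Football
```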

## Finding a Needle in a Haystack

- Load the [gemini-1.5-pro-001](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemini-pro-preview-0409) model.

<span style="color:red; font-weight:bold;">IMPORTANT ⚠️ : For this example, `gemini-1.5-pro-001` was used in the lecture video</span>

<span style="color:red; font-weight:bold;"> - PROMPTING THIS NEEDLE IN A HAYSTACK EXAMPLE SHOULD COST LESS THAN 5.0 USD PER EXECUTION WITH `gemini-1.5-pro-001` (as of August 2024). </span>

<span style="color:green; font-weight:bold;"> - Using `gemini-1.5-flash-001` instead should cost less than 0.25 USD (as of August 2024) per execution. </span>

In [None]:
# multimodal_model = GenerativeModel("gemini-1.5-pro-001")

multimodal_model = GenerativeModel("gemini-1.5-flash-001")

- Just like with images, you can send more than 1 video to the model.
- The following videos are from the **[LLMOps](https://learn.deeplearning.ai/courses/llmops/lesson/1/introduction)** short course, which you can enroll in on **[DeepLearning.AI's Short Courses Platform](https://learn.deeplearning.ai)**.

In [None]:
video_1 = Part.from_uri("gs://dlai-sc-gemini-bucket/sc-gc-c3-LLMOps_L1_v3.mp4.mp4",  mime_type="video/mp4")
video_2 = Part.from_uri("gs://dlai-sc-gemini-bucket/sc-gc-c3-LLMOps_L2_v4.mp4",  mime_type="video/mp4")
video_3 = Part.from_uri("gs://dlai-sc-gemini-bucket/sc-gc-c3-LLMOps_L3_v4.mp4",  mime_type="video/mp4")

In [None]:
from IPython.display import IFrame

- This displays only one of the three videos.
- To view the others, change the `file_path`.

In [None]:
file_path = "dlai-sc-gemini-bucket/sc-gc-c3-LLMOps_L2_v4.mp4"
video_url = f"https://storage.googleapis.com/{file_path}"

In [None]:
IFrame(video_url, width=560, height=315)  # Adjust width and height as needed

In [None]:
role = """
You are specialized in analyzing videos and finding \
a needle in a haystack.
"""

In [None]:
instruction = """
Here are three videos. Each is a lesson from the \
LLMOps course from DeepLearning.AI.
Your answers are only based on the videos.
"""

- In question 2, you are asking the model to find something very specific across these 3 videos.

In [None]:
questions = """
Answer the following questions:
1. Create a summary of each video and what is discussed in \
the video. \
Limit the summary to a max of 100 words.
2. In which of the three videos does the instructor run \
and explain this Python code: bq_client.query(). \
Where do you see this code in the video?
"""

In [None]:
contents_5 = [
    role,
    instruction,
    video_1,
    video_2,
    video_3,
    questions
]

# contents_5 = [
#     instruction,
#     video_1,
#     video_2,
#     video_3,
#     questions,
#     role,
# ]

**Note:** The code below is shown as markdown rather than in a code cell to prevent accidental execution.

```Python
responses = multimodal_model.generate_content(
    contents_5,
    stream=True
)
```

**Note**: LLMs do not always produce the same results, especially because they are frequently updated. So the output you see in the video might differ from what you get.

```Python
### this will take some time to run

for response in responses:
    print(response.text, end="")
```

Okay, here is a summary of each video: 

**Video 1 Summary:** The first video of the LLMOps course is an introduction to the fundamental concepts and ideas within LLMOps. The instructor explains the relationship between LLMOps and MLOps as well as how LLMOps differs from traditional MLOps. Key terms covered include data management, automation, and deployment as well as their importance in LLMOps and how they relate to one another. An example is given highlighting a typical LLMOps workflow. 

**Video 2 Summary:** The instructor continues to delve into the key concepts and components of LLMOps in the second video of the course by showing how to retrieve text data from a BigQuery data warehouse. The importance of dealing with large datasets that cannot fit into memory is highlighted. The instructor uses SQL to query a database that is too large to fit into memory, ultimately resulting in an error. 

**Video 3 Summary:** The third video of the LLMOps course moves into more advanced concept