<a href="https://colab.research.google.com/github/arnabksarkar/gemini-utils/blob/main/Long_video_to_Gemini_1_5.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Long YouTube video upload to Gemini 1.5

## Purpose

The purpose of this notebook is to explore how the Gemini long context window can be used to increase student engagement in long lecture classes.

Traditional lectures often lead to information overload, making it difficult for students to retain key concepts in a passive learning environment that offers limited opportunities for immediate clarification and necessitates time-consuming note-taking.

By leveraging the Gemini 1.5 long context window, students can question an entire lecture directly. This may make study sessions more efficient and help them revisit key concepts without rewatching the whole recording.


For illustration, this notebook uses a YouTube lecture video, but the same steps work with any .mp4 file. Loading the whole video into a single context lets the model answer questions that span the entire lecture.

## Prerequisites

Install necessary dependencies. We need two libraries to demonstrate this example.

1.   The [Google AI Python SDK](https://pypi.org/project/google-generativeai/)  to access the Gemini API.
2.   A lightweight Python library [pytubefix](https://pypi.org/project/pytubefix/) for downloading YouTube videos



In [None]:
!pip install -q -U google-generativeai
!pip install -q -U  pytubefix


## Import necessary packages

In [None]:
import textwrap
import os
import time

# Import the genAI library
import google.generativeai as genai

# Used to securely store your API key
from google.colab import userdata

# For Kaggle secrets
# from kaggle_secrets import UserSecretsClient

# Markdown
from IPython.display import Markdown

# To download Youtube video
from pytubefix import YouTube



#### A utility function

In [None]:
def to_markdown(text):
    text = text.replace("•", "  *")
    return Markdown(textwrap.indent(text, "> ", predicate=lambda _: True))
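To see what the helper does to its input, here is a minimal sketch that returns the transformed string instead of wrapping it in `Markdown`. The name `quote_block` is hypothetical and used only for this illustration.

```python
import textwrap


def quote_block(text):
    # Same transformation as to_markdown, minus the IPython Markdown wrapper:
    # bullet characters become Markdown list markers, and every line
    # (including blank ones) is prefixed with "> " to form a blockquote.
    text = text.replace("•", "  *")
    return textwrap.indent(text, "> ", predicate=lambda _: True)


print(quote_block("• first point\n\nsecond paragraph"))
```

The `predicate=lambda _: True` argument matters here: by default `textwrap.indent` skips whitespace-only lines, which would break the blockquote at every blank line.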


## Authenticate with Google Generative AI

Get your API key from https://aistudio.google.com/app/apikey and store it as a secret named `GEMINI_API_KEY` in your notebook environment (Kaggle or Google Colab).


If you are using **Kaggle**, uncomment the code below (and the `kaggle_secrets` import above) to access the secret.









In [None]:
# user_secrets = UserSecretsClient()
# GOOGLE_API_KEY = user_secrets.get_secret("GEMINI_API_KEY")

If you are using **Google Colab**, use the code below.

In [None]:
GOOGLE_API_KEY = userdata.get("GEMINI_API_KEY")


Configure Google GenAI with the API key.

In [None]:
genai.configure(api_key=GOOGLE_API_KEY)
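Outside Colab or Kaggle, the key could instead be read from an environment variable. The sketch below assumes the variable is also called `GEMINI_API_KEY` (matching the secret name used in this notebook). `read_api_key` is a hypothetical helper, not part of the SDK.

```python
import os


def read_api_key(env_var="GEMINI_API_KEY"):
    # Hypothetical helper: look the key up in the process environment and
    # fail early with a clear message if it is missing or empty.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

The returned value could then be passed to `genai.configure(api_key=...)` in place of `GOOGLE_API_KEY`.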

## Download the large video from YouTube



> This is a machine learning [lecture video](https://www.youtube.com/watch?v=lWGdFeMsjzg) from the University of Tübingen. It was uploaded to YouTube under the [Creative Commons Attribution license (reuse allowed)](https://support.google.com/youtube/answer/2797468). We use this video for demonstration purposes only.



In [None]:
# download the video to the current directory.
YouTube('https://www.youtube.com/watch?v=lWGdFeMsjzg').streams.get_highest_resolution().download()

'/content/Introduction to Machine Learning - 01 - Baby steps towards linear regression.mp4'

In [None]:
yt_video = 'Introduction to Machine Learning - 01 - Baby steps towards linear regression.mp4'
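The download call returns the full path of the saved file, as the cell output above shows, so the filename does not have to be typed by hand. A minimal sketch, assuming the path is the one recorded in that output (`downloaded_path` and `yt_video_name` are illustrative names):

```python
import os

# Path copied from the output of the download cell above.
downloaded_path = '/content/Introduction to Machine Learning - 01 - Baby steps towards linear regression.mp4'

# Keep only the filename; this is enough for later cells as long as the
# notebook's working directory is the download directory.
yt_video_name = os.path.basename(downloaded_path)
print(yt_video_name)
```

In practice the return value of `.download()` could be assigned to a variable directly instead of copying the path from the output.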


## Define helper functions

In [None]:
# Uploads the YouTube video to the Gemini API.
def upload_to_gemini(path, mime_type=None):
  file = genai.upload_file(path, mime_type=mime_type)
  print(f"Uploaded file '{file.display_name}' as: {file.uri}")
  return file


# Waits for the uploaded video file to become active.
# Some files uploaded to the Gemini API need to be processed before they can be
# used as prompt inputs.
def wait_for_files_active(file):
  print("Waiting for file processing...")
  file = genai.get_file(file.name)
  while file.state.name == "PROCESSING":
    print(".", end="", flush=True)
    time.sleep(10)
    file = genai.get_file(file.name)
  if file.state.name != "ACTIVE":
    raise Exception(f"File {file.name} failed to process (state: {file.state.name}).")

  print(f'Video processing complete: {file.name}')
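The polling loop above waits indefinitely if a file never leaves the `PROCESSING` state. One possible variant adds an upper bound on the wait. This is a sketch only: `wait_until_active`, its `fetch` argument, and the 600-second default are assumptions made for illustration, not part of the Gemini SDK.

```python
import time


def wait_until_active(fetch, poll_seconds=10, timeout_seconds=600, sleep=time.sleep):
    # `fetch` is any zero-argument callable that returns an object with a
    # `.state.name` attribute, e.g. lambda: genai.get_file(file.name).
    waited = 0
    current = fetch()
    while current.state.name == "PROCESSING":
        if waited >= timeout_seconds:
            raise TimeoutError(f"Still processing after {waited} seconds.")
        sleep(poll_seconds)
        waited += poll_seconds
        current = fetch()
    if current.state.name != "ACTIVE":
        raise RuntimeError(f"Processing ended in state {current.state.name}.")
    return current
```

Passing the fetch step in as a callable keeps the loop independent of the network, so it can be exercised with a stand-in object before being pointed at `genai.get_file`.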

In [None]:
# # Create the model
# generation_config = {
#   "temperature": 1,
#   "top_p": 0.95,
#   "top_k": 64,
#   "max_output_tokens": 8192,
#   "response_mime_type": "text/plain",
# }

# model = genai.GenerativeModel(
#   model_name="gemini-1.5-pro-latest",
#   generation_config=generation_config,
# )

## Let's upload the video file to Gemini

In [None]:
yt_video_file = upload_to_gemini(yt_video, mime_type="video/mp4")
wait_for_files_active(yt_video_file)


Uploaded file 'Introduction to Machine Learning - 01 - Baby steps towards linear regression.mp4' as: https://generativelanguage.googleapis.com/v1beta/files/l9ppi5zq3vy0
Waiting for file processing...
....Video processing complete: files/l9ppi5zq3vy0
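The MIME type passed to `upload_to_gemini` should match the file's actual container. One way to avoid hard-coding it is to derive it from the file extension with the standard library. This is a sketch under assumptions: `guess_video_mime_type` is a hypothetical helper, and falling back to `video/mp4` is a choice made for this notebook's .mp4 input.

```python
import mimetypes


def guess_video_mime_type(path, default="video/mp4"):
    # Guess from the file extension only; the file is never opened.
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("video/"):
        return default
    return mime_type


print(guess_video_mime_type("Introduction to Machine Learning - 01 - Baby steps towards linear regression.mp4"))
```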


## Cache the tokens

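We will now cache the uploaded video's tokens with a 5-minute TTL (time to live), so that follow-up questions within that window can reuse the processed video instead of sending it again.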

In [None]:
from google.generativeai import caching
import datetime

cache = caching.CachedContent.create(
    model="gemini-1.5-pro-001",
    display_name='machine_learning_desc',
    system_instruction=(
        'You are a helpful and informative bot that is an expert in analyzing videos included here, and your job is to answer '
        'the user\'s query based on the video file you have access to. Do not answer anything outside of the video context.'
    ),
    contents=yt_video_file,
    ttl=datetime.timedelta(minutes=5),
)
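With a TTL this short, it helps to know roughly how long the cache remains usable. The sketch below does the arithmetic locally with `datetime`. The creation time is an arbitrary example value, the helper names are hypothetical, and the service's own clock determines the actual expiry.

```python
import datetime


def cache_expiry(created_at, ttl):
    # The cache is expected to stop being usable once `ttl` has elapsed.
    return created_at + ttl


def seconds_left(created_at, ttl, now):
    # Remaining lifetime in seconds, clamped at zero once expired.
    remaining = cache_expiry(created_at, ttl) - now
    return max(0.0, remaining.total_seconds())


# Arbitrary example creation time, used only for illustration.
created = datetime.datetime(2024, 1, 1, 12, 0, 0, tzinfo=datetime.timezone.utc)
ttl = datetime.timedelta(minutes=5)
print(cache_expiry(created, ttl).isoformat())
print(seconds_left(created, ttl, created + datetime.timedelta(minutes=2)))
```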

## Ask Gemini 1.5 questions about the file

In [None]:
# query = "Can you please summarize the video in 5 sentences?"
# prompt = make_prompt(query)

In [None]:
# Construct a GenerativeModel which uses the created cache.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)

### Let's look at the usage.

In [None]:
def generate_content(prompt):
  answer = model.generate_content(prompt)
  print(f"Usage: {answer.usage_metadata}")
  return Markdown(answer.text)
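The usage metadata printed by `generate_content` reports how many prompt tokens were served from the cache. A minimal sketch of how those counts relate, using the figures this notebook recorded for its query (`prompt_token_count` 733754 and `cached_content_token_count` 733714); `cached_share` is a hypothetical helper:

```python
def cached_share(prompt_tokens, cached_tokens):
    # Returns (uncached prompt tokens, fraction of the prompt served from cache).
    uncached = prompt_tokens - cached_tokens
    return uncached, cached_tokens / prompt_tokens


# Values copied from the usage output recorded in this notebook.
uncached, fraction = cached_share(prompt_tokens=733754, cached_tokens=733714)
print(f"{uncached} uncached prompt tokens, {fraction:.4%} served from cache")
```

In this run nearly the whole prompt is the cached video, and only the short text question is sent fresh.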

In [None]:
generate_content([(
    'Describe a maximum of 5 topics described in the lecture video '
    'Display the topic names, a short description. Also list the timestamps '
    'they were introduced for the first time. Please use bullet points. ')])

Usage: prompt_token_count: 733754
candidates_token_count: 423
total_token_count: 734177
cached_content_token_count: 733714



The video describes the following topics:

* **Machine Learning I Course Introduction**  [0:17]
This is an introductory machine learning course with around 10 lectures, covering a broad range of machine learning topics with a focus on the mathematics behind them. The presenter states that this course will prepare the students to take other, more advanced, machine learning courses. 
* **Mathematics for Machine Learning** [2:30]
The course uses mathematical concepts, such as derivatives, to explain machine learning methods. Some students may be unfamiliar with the mathematics needed, but they can use a free online book "Mathematics for Machine Learning," by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, to supplement the course.
* **Machine Learning vs. Statistics** [6:26]
Both machine learning and statistics aim to discover patterns in data by building predictive models. However, the fields differ slightly in terms of their emphasis. Statistics focuses on understanding the world through data, using simpler models that are easier to interpret and develop theoretical analyses. In contrast, machine learning focuses on making accurate predictions, often using complex models and large datasets, and giving up on the interpretability of the statistical models.  
* **Types of Machine Learning Problems** [11:51]
The presenter describes 3 main types of machine learning problems:
    * **Supervised learning:** Input data includes labels and the algorithm learns to distinguish one class from another.  
    * **Unsupervised learning:** Input data is not labeled and the algorithm learns to cluster input data by the object/category they belong to.
    * **Reinforcement learning:** The algorithm learns by taking actions in a given environment and receiving rewards for completing a given task. 
* **Linear Regression** [15:00]
Linear regression is a classical statistical method that predicts a continuous variable from a single predictor. The model fits a linear function to a set of data, predicting the target (dependent) variable from the single predictor (independent) variable. 
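The bracketed timestamps in the answer could be turned into links that open the lecture at the matching position. This is a sketch only: `timestamp_to_seconds` and `jump_link` are hypothetical helpers, the `&t=<seconds>s` form assumes a standard YouTube watch URL, and model-reported timestamps should be spot-checked against the video.

```python
def timestamp_to_seconds(stamp):
    # Accepts "M:SS" or "H:MM:SS", with or without surrounding brackets.
    seconds = 0
    for part in stamp.strip("[] ").split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def jump_link(video_url, stamp):
    # Assumes a watch URL that already has a query string (?v=...).
    return f"{video_url}&t={timestamp_to_seconds(stamp)}s"


print(jump_link("https://www.youtube.com/watch?v=lWGdFeMsjzg", "[11:51]"))
```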
