# Google Colab using LLM & Webscrape Fun

# <font color=red>Mr Fugu Data Science</font>

# (◕‿◕✿)

# `Google Gemini 1.5 Flash & Pro`

+ Follow the link below to the basic landing page, where you can read the documentation, obtain an API key, and find related materials to start learning.

https://deepmind.google/technologies/gemini/flash/



# `Gemini Flash 1.5 Pricing`

https://ai.google.dev/pricing

+ What countries can access free usage: https://ai.google.dev/gemini-api/docs/available-regions

In [12]:
!pip install -q -U google-generativeai # Installing Gemini AI for use with API

# Google Gemini AI (API)
import google.generativeai as genai

# File management
import os

# If we call URLs
import requests

# Handle images
from IPython.display import Image

# Configure the client with the API key stored in the API_KEY environment variable
genai.configure(api_key=os.environ["API_KEY"])

In [None]:
# List of Available Models: NOT ALL WORK because of deprecation
for model_name in genai.list_models():
    if 'generateContent'  in model_name.supported_generation_methods:
        print(model_name.name)

# `Ex 1) Read Text from Image`

In [None]:
# Used with Google Colab to collect files:
from google.colab import files

uploaded = files.upload()  # a file picker opens so you can choose files; alternatively, point to an existing file path

In [None]:
img_=Image('IMG_7850.jpg') # my file for example

In [None]:
# Another model we can use; I will use it here
model = genai.GenerativeModel('gemini-1.5-pro-001')

# Text prompt for Gemini AI
prompt='analyze and extract all information from image, including best by date and all other information'

# Gemini takes our image and prompt and generates a response from them
result=model.generate_content([img_,prompt])


# Output text 
print(result.text)
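
The prompt above returns free text. If the extracted fields are needed later in code, one option is to ask for JSON in the prompt and parse the reply. The sketch below is only an illustration: the helper name `parse_json_reply` and the sample reply are hypothetical, and it assumes the model sometimes wraps its JSON in a Markdown code fence, which is not guaranteed either way.

```python
import json
import re


def parse_json_reply(text):
    """Parse a JSON object from a model reply, tolerating a Markdown code fence."""
    cleaned = text.strip()
    if cleaned.startswith("`" * 3):
        # Drop the opening fence (with an optional language tag) and the closing fence.
        cleaned = re.sub(r"^`{3}[a-zA-Z]*\s*", "", cleaned)
        cleaned = re.sub(r"\s*`{3}$", "", cleaned)
    return json.loads(cleaned)


# Hypothetical reply, for illustration only (not real model output).
fence = "`" * 3
sample_reply = f'{fence}json\n{{"product": "Example Oats", "best_by": "2025-01-31"}}\n{fence}'

fields = parse_json_reply(sample_reply)
print(fields["best_by"])
```

`json.loads` raises `json.JSONDecodeError` when the reply is not valid JSON, so in practice the call would usually sit inside a `try`/`except`.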

# `Document Processing:` 

+ Setup: https://ai.google.dev/gemini-api/docs/document-processing?lang=python
    
+ Improve Prompting: https://ai.google.dev/gemini-api/docs/file-prompting-strategies

# `Ex 2) PDF File Summarize `

https://ai.google.dev/gemini-api/docs/document-processing?lang=python

In [None]:
# Download Article:
!curl -L -o txt2vidDiffModel.pdf https://arxiv.org/pdf/2403.06098v2
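
`curl -o` saves whatever the URL returns under the name you give it, so a file ending in `.pdf` is not guaranteed to be a PDF. Before uploading, a quick check of the first bytes can catch a web page saved by mistake. This is a minimal sketch; `looks_like_pdf` is a hypothetical helper, and the two files it checks are small stand-ins created just for the demo.

```python
import os
import tempfile


def looks_like_pdf(path):
    """Return True if the file starts with the PDF signature '%PDF-'."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


# Stand-in files so the sketch runs without downloading anything.
with tempfile.TemporaryDirectory() as tmp:
    pdf_path = os.path.join(tmp, "real.pdf")
    html_path = os.path.join(tmp, "page_saved_as.pdf")
    with open(pdf_path, "wb") as handle:
        handle.write(b"%PDF-1.7\n")
    with open(html_path, "wb") as handle:
        handle.write(b"<!DOCTYPE html><html></html>")
    results = (looks_like_pdf(pdf_path), looks_like_pdf(html_path))

print(results)
```

In the notebook, the same check could be run on `txt2vidDiffModel.pdf` right before `genai.upload_file`.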

In [None]:
# Upload the PDF with the File API so the model can reference it
sample_file = genai.upload_file(path="txt2vidDiffModel.pdf",
                                display_name="text2Vid_DiffusionModel PDF")

print(f"Uploaded file '{sample_file.display_name}' as: {sample_file.uri}")


In [None]:
# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro-001")

# Prompt the model with text and the previously uploaded PDF.
response = model.generate_content([sample_file, "Can you summarize this document as a bulleted list?"])

print(response.text)

# `Ex 3) Analyze Video`


+ There is a caveat here: you can analyze videos, but if you are interested in the audio, the video you use needs captions/subtitles.
+ Also, the video needs to be downloaded somewhere first; at this time I am unsure how to stream the data directly without downloading.

I have found a workaround for this problem. It may not be the best way, but it is at least a proof of concept.

https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/video-understanding

In [None]:
import vertexai

from vertexai.generative_models import GenerativeModel, Part

# TODO(developer): Update project_id (your Google Cloud project ID, not an API key) and location
PROJECT_ID='your-project-id'  # placeholder
vertexai.init(project=PROJECT_ID, location="us-central1")

vision_model = GenerativeModel("gemini-1.5-pro-001")

# Generate text
response = vision_model.generate_content(
    [
        Part.from_uri(
            "https://youtu.be/CNbqLm7uqaA", mime_type="video/mp4"
        ),
        "What is in the video? Give a summary with bullet points.",
    ]
)
print(response.text)



# `Ex 4) Audio Summarizing or Transcriptions`

In [None]:
# !pip install pytube
!pip install pytubefix
from pytubefix import YouTube
import os

from google.colab import drive
drive.mount('/content/drive/')


def sanitize_filename(filename):
    return "".join(c if c.isalnum() or c in " ._-" else "_" for c in filename)


def audio(thelink, path):
    try:
        yt = YouTube(thelink)
        print('Title:', yt.title)
        print('Views:', yt.views)
        yd = yt.streams.get_audio_only()
        yt_title = sanitize_filename(yt.title)
        yd.download(output_path=path, filename=f'{yt_title}.mp3')
        print('Finished downloading audio')
    except Exception as e:
        print(f"Error: {e}")


def high(thelink, path):
    try:
        yt = YouTube(thelink)
        print('Title:', yt.title)
        print('Views:', yt.views)
        yt_title = sanitize_filename(yt.title)

        # Download the highest resolution video with a specified filename
        video_stream = yt.streams.filter().order_by("resolution").last()
        audio_stream = yt.streams.get_audio_only()

        video_filename = f'{yt_title}.mp4'
        audio_filename = f'{yt_title}.mp3'

        video_stream.download(output_path=path, filename=video_filename)
        audio_stream.download(output_path=path, filename=audio_filename)

        print('Finished downloading high resolution video and audio')

    except Exception as e:
        print(f"Error: {e}")


def low(thelink, path):
    try:
        yt = YouTube(thelink)
        print('Title:', yt.title)
        print('Views:', yt.views)
        yd = yt.streams.get_lowest_resolution()
        yt_title = sanitize_filename(yt.title)
        yd.download(output_path=path, filename=f'{yt_title}.mp4')
        print('Finished downloading low resolution video')
    except Exception as e:
        print(f"Error: {e}")


link_inp = input("Please Enter the link of the video: ")
path_inp = input("Please Enter the download path: ")
menu_inp = input('Select:\n1- Audio\n2- Highest Resolution\n3- Lowest Resolution\n')

if menu_inp == '1':
    audio(link_inp, path_inp)
elif menu_inp == '2':
    high(link_inp, path_inp)
elif menu_inp == '3':
    low(link_inp, path_inp)
else:
    print('Invalid input')
    
  


# ----------------------------------
# The download code above is not mine; it came from Reddit:
# https://www.reddit.com/r/learnpython/comments/156p9ju/pytube_errors_help_fix_or_recommend_other_library/
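
The `sanitize_filename` helper decides what the saved file is called, which matters later when the file is read back from disk. The sketch below repeats the same rule and shows the path it produces. The video title and the helper `build_audio_path` are hypothetical, used only to illustrate the replacement rule.

```python
import os


def sanitize_filename(filename):
    # Same rule as the download cell: keep letters, digits, space, dot,
    # underscore and hyphen; replace every other character with "_".
    return "".join(c if c.isalnum() or c in " ._-" else "_" for c in filename)


def build_audio_path(title, folder):
    # Join the folder and the sanitized title into the path the audio
    # file would be saved under.
    return os.path.join(folder, f"{sanitize_filename(title)}.mp3")


# Hypothetical title, chosen only to show how "|" and "'" are replaced.
example_title = "FRUSTRATING Runway ML | Kling AI Free Tier's NEED WORK"
example_path = build_audio_path(example_title, "/content/drive")
print(os.path.basename(example_path))
```

Building the path with the same function that named the download avoids retyping the sanitized name by hand in later cells.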

# `Summarize Audio`

In [None]:
# Initialize a Gemini model appropriate for your use case.
import google.generativeai as genai
import pathlib


from google.colab import userdata
api=userdata.get('Google_Gemini_API')

genai.configure(api_key=api)
model = genai.GenerativeModel('models/gemini-1.5-pro-001')

# Create the prompt.
prompt = "Please summarize the audio with bullet points and timestamps."

# Read the downloaded MP3 file's bytes, then pass the prompt and the
# audio to Gemini as inline data.
response = model.generate_content([
    prompt,
    {
        "mime_type": "audio/mp3",
        "data": pathlib.Path('/content/drive/FRUSTRATING Runway ML _ Kling AI Free Tier_s NEED WORK.mp3').read_bytes()
    }
])

# Output Gemini's response to the prompt and the inline audio.
print(response.text)
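
Sending the audio inline, as above, puts the whole file inside the request. As far as I know, the Gemini API documentation recommends the File API (`genai.upload_file`) once the total request size passes about 20 MB, so it can help to check the file size first. This is a minimal sketch under that assumption: the limit constant and the helper `build_inline_audio_part` are hypothetical, and the demo uses a tiny stand-in file instead of a real recording.

```python
import os
import pathlib
import tempfile

# Assumption: inline requests are limited to roughly 20 MB in total; check the
# current Gemini API documentation before relying on this number.
INLINE_LIMIT_BYTES = 20 * 1024 * 1024


def build_inline_audio_part(path, mime_type="audio/mp3", limit=INLINE_LIMIT_BYTES):
    """Return the inline dict used above, or None if the file is too large."""
    file_path = pathlib.Path(path)
    if file_path.stat().st_size > limit:
        return None  # too large to send inline; upload it with the File API instead
    return {"mime_type": mime_type, "data": file_path.read_bytes()}


# Small stand-in file so the sketch runs without a real recording.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.mp3")
    pathlib.Path(demo).write_bytes(b"\x00" * 1024)
    part = build_inline_audio_part(demo)
    too_big = build_inline_audio_part(demo, limit=100)

print(part["mime_type"], len(part["data"]), too_big)
```

When the helper returns `None`, the file can be uploaded with `genai.upload_file`, the same call used for the PDF earlier, and the returned file object passed to `generate_content` in place of the dict.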

# `Transcribe with timestamps:`

In [None]:
# Initialize a Gemini model appropriate for your use case.
import google.generativeai as genai
import pathlib


from google.colab import userdata
api=userdata.get('Google_Gemini_API')

genai.configure(api_key=api)
model = genai.GenerativeModel('models/gemini-1.5-pro-001')

# Create the prompt.
prompt = """Can you transcribe this audio, in the format of timecode,caption and separate each section by a new line"""

# Read the downloaded MP3 file's bytes, then pass the prompt and the
# audio to Gemini as inline data.
response = model.generate_content([
    prompt,
    {
        "mime_type": "audio/mp3",
        "data": pathlib.Path('/content/drive/FRUSTRATING Runway ML _ Kling AI Free Tier_s NEED WORK.mp3').read_bytes()
    }
])

# Output Gemini's response to the prompt and the inline audio.
print(response.text)
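
The prompt asks for `timecode,caption` lines, so the reply can be turned into structured rows for searching or subtitle files. The sketch below is one way to do that. It assumes the model actually follows the requested format, which is not guaranteed; lines that do not match are skipped. The sample text is made up for illustration and is not real model output.

```python
import re

# Matches a leading timecode such as 0:05, 00:05 or 01:02:03, followed by a comma.
TIMECODE_LINE = re.compile(r"^\s*(\d{1,2}:\d{2}(?::\d{2})?)\s*,\s*(.+?)\s*$")


def parse_transcript(text):
    """Turn 'timecode,caption' lines into (seconds, caption) tuples."""
    rows = []
    for line in text.splitlines():
        match = TIMECODE_LINE.match(line)
        if not match:
            continue  # skip blank lines and anything not in the expected shape
        timecode, caption = match.groups()
        seconds = 0
        for part in timecode.split(":"):
            seconds = seconds * 60 + int(part)
        rows.append((seconds, caption))
    return rows


# Hypothetical model output, for illustration only.
sample = (
    "00:00,Welcome back to the channel\n"
    "\n"
    "00:12,Today we test two tools\n"
    "01:02:03,Closing thoughts"
)

for seconds, caption in parse_transcript(sample):
    print(seconds, caption)
```

In the notebook, `parse_transcript(response.text)` would be the equivalent call on the real reply.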

In [1]:
# Ex) Reading a Schema for a database: ER Diagram:

In [None]:
uploaded = files.upload()
Image('Screen Shot 2024-09-19 at 5.42.39 PM.png')

# https://www.csub.edu/~ychoi2/MIS/Assignment/ERD_Practice.pdf

In [None]:
from google.colab import files 
from IPython.display import Image
uploaded = files.upload()

img_e=Image('Screen Shot 2024-09-19 at 5.42.39 PM.png')

model = genai.GenerativeModel('gemini-1.5-pro-001')

prompt = "Document the entities and relationships in this ER diagram."

contents = [prompt, img_e]

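# Use a more deterministic configuration with a low temperature
# (temperature=0.0 is an assumed value; adjust as needed)
generation_config = genai.types.GenerationConfig(temperature=0.0)

# Stream the response so it can be printed chunk by chunk
responses = model.generate_content(contents,
                                   generation_config=generation_config,
                                   stream=True)

print("-------Prompt--------")
# print_multimodal_prompt(contents)

print("\n-------Response--------")
for response in responses:
    print(response.text, end="")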

# Caching large requests to lower token usage and context, like short-term memory

https://ai.google.dev/gemini-api/docs/caching?lang=python

In [None]:
# Idea: connect this to web scraping

In [None]:
# audio analysis or segmentation

In [None]:
# video: you can summarize a video, or find a specific image within a video and extract information from it

# `Citations & Help:`

# ◔̯◔

https://deepmind.google/technologies/gemini/flash/

https://medium.com/rahasak/build-rag-application-using-a-llm-running-on-local-computer-with-ollama-and-langchain-e6513853fda0
    
https://cloud.google.com/vertex-ai/docs/tutorials/jupyter-notebooks (AI_ML Notebooks to practice)

https://www.youtube.com/watch?v=mA1K1tT56jo

https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_1_5_pro.ipynb

https://medium.com/@punya8147_26846/understanding-prompt-templates-in-langchain-f714cd7ab380

https://www.datacamp.com/tutorial/prompt-chaining-llm

https://medium.com/vectrix-ai/image-extraction-with-langchain-and-gemini-a-step-by-step-guide-02c79abcd679

https://medium.com/@ibrahimmukherjee/ai-for-drug-discovery-with-python-code-47e1fe3a8233

https://diverger.medium.com/building-asynchronous-llm-applications-in-python-f775da7b15d1

https://developers.google.com/youtube/v3/guides/uploading_a_video

https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/intro_multimodal_use_cases.ipynb

https://medium.com/pythoneers/building-a-video-insights-generator-using-gemini-flash-e4ee4fefd3ab

https://dev.to/deepgram/live-transcription-with-python-and-django-4aj2

https://www.assemblyai.com/blog/real-time-transcription-in-python/

https://www.reddit.com/r/learnpython/comments/156p9ju/pytube_errors_help_fix_or_recommend_other_library/

https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_python.ipynb

https://medium.com/@mohammed97ashraf/revolutionizing-image-data-extraction-a-comprehensive-guide-to-gemini-pro-vision-and-langchain-200bbc60b949

https://rohitraj-iit.medium.com/part-10-voice-chat-with-gemini-484514580033

https://blogs.jollytoday.com/how-to-use-llm-such-as-gemini-and-chatgpt-for-video-translation-e22ff076e885

https://medium.com/featurepreneur/extracting-audio-from-video-using-pythons-moviepy-library-e351cd652ab8
