<a href="https://colab.research.google.com/github/Soban-Saleem/Agentic-and-Robotic-AI-Engineer/blob/main/Exploring_Gemini_2_0.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install --upgrade --quiet google-genai


Setting up API key

In [3]:
from google.colab import userdata
GOOGLE_API_KEY: str = userdata.get('GOOGLE_API_KEY_1')
if GOOGLE_API_KEY:
  print("API key set")
else:
  print("API key not set")

API key set
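
Outside Colab, `google.colab.userdata` is not available, so the key has to come from somewhere else. The sketch below reads it from an environment variable instead; the helper name `load_api_key` and the variable name `GOOGLE_API_KEY` are assumptions made for illustration, not part of the notebook.

```python
import os


def load_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in an environment variable (hypothetical helper)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early instead of sending an empty key to the API
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The returned value could then be passed as `api_key=` when the client is created in the next cell.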


In [4]:
from google import genai
from google.genai import Client

client: Client = genai.Client(
    api_key=GOOGLE_API_KEY,
)
model: str = "gemini-2.0-flash-exp"

In [12]:
from google.genai.types import GenerateContentResponse
from IPython.display import display, Markdown

response: GenerateContentResponse = client.models.generate_content(
    model=model,
    contents="How does Matrix work?",
)

In [16]:
display(response)

GenerateContentResponse(candidates=[Candidate(content=Content(parts=[Part(video_metadata=None, thought=None, code_execution_result=None, executable_code=None, file_data=None, function_call=None, function_response=None, inline_data=None, text='Matrix is an open-source, decentralized communication protocol designed for real-time communication. It\'s not an application itself, but rather a set of standards and APIs that allow different clients, servers, and services to communicate with each other. Think of it like the HTTP protocol for web browsing, but for real-time chat and other communication methods.\n\nHere\'s a breakdown of how Matrix works, broken down into key concepts:\n\n**1. Decentralization & Federation:**\n\n* **Not One Central Server:** Unlike services like Slack or Discord, Matrix isn\'t tied to a single company or server. Instead, it\'s designed to be decentralized.\n* **Homeservers:** Individuals or organizations can run their own "homeservers." These servers store user d
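
The repr above shows the nesting `candidates -> content -> parts -> text`. As a rough sketch of how that structure can be walked by hand, the helper below joins the text of every part in the first candidate. `collect_text` is a hypothetical name, and the `SimpleNamespace` object is only a stand-in that mimics the attribute layout so the example runs without an API call.

```python
from types import SimpleNamespace


def collect_text(response) -> str:
    """Concatenate the text of every part in the first candidate (hypothetical helper)."""
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return ""
    parts = getattr(candidates[0].content, "parts", None) or []
    # Parts without text (for example function calls) are skipped
    return "".join(part.text for part in parts if getattr(part, "text", None))


# Stand-in object with the same attribute layout as the repr above
fake_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(
            content=SimpleNamespace(
                parts=[
                    SimpleNamespace(text="Hello, "),
                    SimpleNamespace(text=None),
                    SimpleNamespace(text="world"),
                ]
            )
        )
    ]
)
print(collect_text(fake_response))  # Hello, world
```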

In [18]:
display(Markdown(response.text))

Matrix is an open-source, decentralized communication protocol designed for real-time communication. It's not an application itself, but rather a set of standards and APIs that allow different clients, servers, and services to communicate with each other. Think of it like the HTTP protocol for web browsing, but for real-time chat and other communication methods.

Here's a breakdown of how Matrix works, broken down into key concepts:

**1. Decentralization & Federation:**

* **Not One Central Server:** Unlike services like Slack or Discord, Matrix isn't tied to a single company or server. Instead, it's designed to be decentralized.
* **Homeservers:** Individuals or organizations can run their own "homeservers." These servers store user data, messages, and participate in the network.
* **Federation:** Homeservers connect and communicate with each other, allowing users on different servers to interact seamlessly. This is similar to how email servers work.
* **Benefits of Decentralization:**
    * **No Single Point of Failure:** If one server goes down, the rest of the network continues to function.
    * **User Control:** You can choose which homeserver to use and control your own data.
    * **Resilience:** Less susceptible to censorship or corporate control.

**2. Rooms (Channels):**

* **Where Conversations Happen:** Users interact in "rooms," which are like chat channels or group conversations.
* **Persistence:** Room history is generally stored on the participating homeservers.
* **Access Control:** Rooms can be public, invite-only, or private.
* **Multiple Users:** Many users can join the same room, regardless of which homeserver they use.

**3. Users & IDs:**

* **Unique Identifiers:** Users are identified by their Matrix ID, which typically looks like `@username:homeserver.domain`. For example, `@alice:example.com`.
* **Homeserver Affiliation:** The ID includes the homeserver where the user is registered.
* **Federated Identity:** The system knows how to reach a user based on their ID, even if they're on a different server.

**4. The Matrix API (HTTP/JSON):**

* **Standard Interface:** The Matrix protocol defines a standardized HTTP API that clients and servers use to communicate.
* **JSON Format:** Data is exchanged using JSON, a common and lightweight data format.
* **Client-Server Communication:** Clients use the API to send and receive messages, create rooms, join rooms, manage user settings, etc.
* **Server-Server Communication (Federation):** Homeservers use the API to exchange data, replicate room history, and manage users who are part of shared rooms.

**5. End-to-End Encryption (E2EE):**

* **Security Focus:** Matrix supports end-to-end encryption using the Olm and Megolm algorithms.
* **User-Controlled Keys:** Encryption keys are managed by the client devices of each user, ensuring that only those users can read the messages.
* **Privacy:** Even the homeservers cannot access the content of encrypted messages.

**6. Clients & Bridges:**

* **Clients:** You interact with the Matrix network through client applications. These come in various forms, including:
    * **Web Clients:** Accessible through a web browser (e.g., Element Web).
    * **Desktop Clients:** Installed on your computer (e.g., Element Desktop, Fractal).
    * **Mobile Clients:** Apps for smartphones and tablets (e.g., Element Mobile).
* **Bridges:** These are software components that connect Matrix to other communication platforms. For example, you can use a bridge to connect your Matrix room to a Discord server, allowing users on both platforms to communicate.

**How a Typical Interaction Works:**

1. **User A on Homeserver 1 sends a message to User B on Homeserver 2 in a shared room:**
   * User A's client uses the Matrix API to send the message to Homeserver 1.
   * Homeserver 1 checks who else is in the room and uses the federation API to send the message to Homeserver 2.
   * Homeserver 2 delivers the message to User B's client.

2. **For E2EE:**
   * User A's client encrypts the message using keys shared with User B's client.
   * The encrypted message is sent across the Matrix network.
   * User B's client decrypts the message using its corresponding key.

**Key Advantages of Matrix:**

* **Open and Decentralized:** No single point of control or failure.
* **Federated:** Allows different servers to connect and interact.
* **End-to-End Encryption:** Provides strong privacy and security.
* **Open Source:** Encourages community development and transparency.
* **Extensible:** Allows for customization and the development of new features.
* **Bridges:** Allows for communication with other platforms.

**In Summary:**

Matrix provides a powerful and flexible communication protocol that prioritizes decentralization, user control, and security. It's more than just chat; it's a foundation for building various types of real-time communication applications. Instead of being controlled by a single company, the Matrix network is powered by a network of interconnected servers and clients that follow an open standard.

While it might seem complex, the underlying mechanics allow for seamless communication between users regardless of where they are registered, ensuring a resilient and user-centric communication experience.
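
The answer above describes Matrix IDs of the form `@username:homeserver.domain`. A minimal sketch of splitting such an ID into its two halves is shown below; `parse_matrix_id` is a hypothetical helper, and it only checks the overall shape rather than the full character rules of the Matrix specification.

```python
def parse_matrix_id(matrix_id: str) -> tuple[str, str]:
    """Split '@localpart:homeserver' into (localpart, homeserver)."""
    if not matrix_id.startswith("@") or ":" not in matrix_id:
        raise ValueError(f"Not a Matrix user ID: {matrix_id!r}")
    # Split on the first colon only, so a homeserver with a port stays intact
    localpart, homeserver = matrix_id[1:].split(":", 1)
    if not localpart or not homeserver:
        raise ValueError(f"Not a Matrix user ID: {matrix_id!r}")
    return localpart, homeserver


print(parse_matrix_id("@alice:example.com"))  # ('alice', 'example.com')
```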


Uploading a video from the local device for Gemini 2.0 Flash

In [19]:
from google.colab import files
uploaded = files.upload()


Saving Gemini_2.0_Flash_Testing.mp4 to Gemini_2.0_Flash_Testing.mp4


Downloading the video from a Google Drive URL

In [23]:
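# The /view share link returns an HTML page; the uc?export=download form requests the file itself.
# This assumes the file is shared publicly and is small enough to skip Drive's confirmation page.
!wget -q "https://drive.google.com/uc?export=download&id=1Oeyjejr0QFd405TYJLRi4haKdR7qkfN4" -O Gemini_2.0_Flash_Testing.mp4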
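
A Drive share link can return an HTML page rather than the video bytes, so it may be worth checking the saved file before sending it to the model. The sketch below is a heuristic only: it assumes the file is an MP4 whose first box is `ftyp`, which is the usual layout but not a full validation. `looks_like_mp4` is a hypothetical helper name.

```python
from pathlib import Path


def looks_like_mp4(path: str) -> bool:
    """Heuristic check: MP4 files normally carry b'ftyp' at bytes 4-8."""
    file_path = Path(path)
    if not file_path.is_file():
        return False
    with file_path.open("rb") as handle:
        header = handle.read(12)
    return len(header) >= 8 and header[4:8] == b"ftyp"


# Returns False if the download produced an HTML page or no file at all
print(looks_like_mp4("Gemini_2.0_Flash_Testing.mp4"))
```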


Analyzing both the visual and audio components of the video

In [37]:
from google.genai.types import Content, Part
prompt = """For each scene in this video,
            generate captions that describe the scene along with any spoken text placed in quotation marks.
            Place each caption into an object with the timecode of the caption in the video.
         """

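import time

# Upload the saved video through the Files API.
# Recent google-genai releases take `file=`; older ones used `path=`.
video = client.files.upload(file="Gemini_2.0_Flash_Testing.mp4")

# Video files are processed asynchronously; poll until processing has finished.
while video.state and video.state.name == "PROCESSING":
    time.sleep(5)
    video = client.files.get(name=video.name)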

response = client.models.generate_content(
    model=model,
    contents=[
        Content(
            role="user",
            parts=[
                Part.from_uri(
                    file_uri=video.uri or "",
                    mime_type=video.mime_type or ""),
                ]),
        prompt,
    ]
)

Markdown(response.text)

```json
[
    {
        "timecode": "00:00",
        "caption": "A close-up shot of a computer screen displaying a web browser. The tab title is \"Exploring\_Gemini\_2.0.ipynb\". There's a dark background and white text, which reads 'Matrix is an open-source, decentralized communication protocol designed for real-time communication. It's not an application itself, but rather APIs that allow different clients, servers, and services to communicate with each other. Think of it like the HTTP protocol for web browsing, but for other communication methods. Here's a breakdown of how Matrix works, broken down into key concepts: 1. Decentralization & Federation: • Not One Central Server: Unlike services like Slack or Discord, Matrix isn't tied to a single company or server. Instead, it's designed to be • Homeservers: Individuals or organizations can run their own \"homeservers.\" These servers store user data, messages, and participate in • Federation: Homeservers connect and communicate with each other, allowing users on different servers to interact seamlessly. This is how servers work. • Benefits of Decentralization: • No Single Point of Failure: If one server goes down, the rest of the network continues to function. • User Control: You can choose which homeserver to use and control your own data. • Resilience: Less susceptible to censorship or corporate control. 2. Rooms (Channels): • Where Conversations Happen: Users interact in \"rooms,\" which are like chat channels or group conversations. • Persistence: Room history is generally stored on the participating homeservers. • Access Control: Rooms can be public, invite-only, or private. • Multiple Users: Many users can join the same room, regardless of which homeserver they use.'"
    },
    {
        "timecode": "00:00",
        "caption": "\"My name is Subhan, and I am currently exploring the fascinating world of AI and machine learning.\""
    },
    {
        "timecode": "00:07",
        "caption": "\"I am excited to be working with Google Gemini 2.0 flash for this project, which demonstrates its ability to analyze both video and audio inputs seamlessly.\""
    },
    {
      "timecode": "00:23",
      "caption": "\"In this video, I will provide a brief introduction about myself.\""
    },
    {
      "timecode": "00:27",
      "caption": "\"I have a background in computer science, AI, and some web development, and I am passionate about learning and applying cutting-edge technologies to solve real-world problems.\""
    },
    {
      "timecode": "00:40",
      "caption": "\"I look forward to see how Gemini 2.0 flash analyzes this video and understands both the visual and audio components. Thank you.\""
    }
]
```
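
The model returned its captions as a fenced JSON block, so they can be loaded into Python objects for further processing. The sketch below is one possible approach under stated assumptions: the reply is a single fenced block, timecodes use `MM:SS`, and the sequence `\_` (which is not a valid JSON escape) should be read as a plain underscore. `parse_captions` and `timecode_to_seconds` are hypothetical helper names.

```python
import json

FENCE = "`" * 3  # the three-backtick Markdown fence marker


def parse_captions(text: str) -> list[dict]:
    """Strip an optional Markdown fence and decode the JSON caption list."""
    body = text.strip()
    if body.startswith(FENCE):
        # Drop the opening fence line, including a language tag such as "json"
        body = body.split("\n", 1)[1] if "\n" in body else ""
    if body.endswith(FENCE):
        body = body[: -len(FENCE)]
    # Assumption: "\_" is a Markdown-style escape that JSON would reject
    body = body.replace("\\_", "_")
    return json.loads(body)


def timecode_to_seconds(timecode: str) -> int:
    """Convert an 'MM:SS' timecode to a number of seconds."""
    minutes, seconds = timecode.split(":")
    return int(minutes) * 60 + int(seconds)


sample = FENCE + 'json\n[{"timecode": "00:07", "caption": "Exploring\\_Gemini"}]\n' + FENCE
captions = parse_captions(sample)
print(captions[0]["caption"], timecode_to_seconds(captions[0]["timecode"]))  # Exploring_Gemini 7
```

In the notebook, `parse_captions(response.text)` would be the corresponding call, provided the reply keeps the same shape.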
