# 1. Foreword

## In this notebook, we will use and showcase the capabilities of Gemma 3n running locally via Ollama.

**Our goal is to leverage the model's functionality as much as possible without relying on a GPU. This demonstrates that our code can run on any computer, even those with limited resources.**

This project embodies the spirit of **inclusivity and accessibility**. We recognize that not all students and researchers around the world have access to high-end machines with powerful GPUs. By developing this code to run efficiently on CPUs, we aim to let young students who cannot afford expensive hardware use the application effectively.

Additionally, our approach is beneficial for field researchers, such as archaeologists, who often work with lightweight equipment in remote areas with limited access to electrical power sources. The ability to run the model on a portable device enhances their research capabilities without the need for a constant internet connection.

By using Ollama to run the model, we ensure that our application remains accessible to a broader audience, fostering inclusivity in technology and research. Our vision is to enable everyone, regardless of the resources at their disposal, to benefit from advanced AI capabilities.

# 2. Package Installation Script

This section of the code installs the Python packages required by the project. It defines a function that installs each package with `pip`, which leaves a package untouched when its requirement is already satisfied, so nothing is installed twice.

## Key Features:

- **Dynamic Package Management**: The script can install specific versions of packages or the latest available version from the Python Package Index (PyPI) and Git repositories.

- **Error Handling**: The installation function captures any exceptions that occur during the installation process, providing feedback in case of any failures.

- **Clear Dependencies**: A dictionary (`required_packages`) lists all required packages along with their desired versions, making it easy to manage dependencies.

## Usage:
1. The function `install_package` hands a package to `pip`, which installs it only if the requirement is not already satisfied.
2. Each package in the `required_packages` dictionary is processed, and the installation procedure is invoked, ensuring all necessary libraries are available for the application to function properly.

This approach helps maintain a clean and organized environment essential for smooth development and deployment of the project.
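As an illustration of the two steps above, the following minimal sketch shows how an entry of `required_packages` could be turned into the requirement string given to `pip`, and how an existing installation could be detected with the standard library. The helper names `build_requirement` and `is_installed` are hypothetical and are not part of the notebook's code.

```python
from importlib import metadata


def build_requirement(package, version="latest"):
    """Return the string handed to pip, e.g. 'torch>=2.4.0' or 'gradio'."""
    # "latest" (or None) means no version constraint, so pip picks the newest release
    if version in (None, "latest"):
        return package
    return f"{package}{version}"


def is_installed(package):
    """Return True if a distribution with this name is already installed."""
    try:
        metadata.version(package)
        return True
    except metadata.PackageNotFoundError:
        return False


print(build_requirement("gradio", "latest"))
print(build_requirement("torch", ">=2.4.0"))
print(is_installed("a-package-that-does-not-exist-xyz"))
```

Checking with `importlib.metadata` only answers whether some version is present; comparing it against a specifier such as `>=2.4.0` is left to `pip` in this sketch.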

## Required Packages Notice

The following packages are included in the `required_packages` dictionary but are commented out to optimize application performance on machines without a GPU:

```python
required_packages = {
    "gradio": "latest",
    "ollama": "latest",
    "markdown2": "latest",
    # "torch": ">=2.4.0",
    # "transformers": ">=4.53.0"
}
```

In [1]:
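import subprocess
import sys

# Function to safely install packages
def install_package(package, version=None):
    """Install a package with pip; pip skips requirements that are already satisfied."""
    # Append a version specifier such as ">=2.4.0" when one is given
    requirement = f"{package}{version}" if version else package
    try:
        # check_call raises CalledProcessError when pip exits with an error,
        # which the IPython "!pip" shortcut does not do
        subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", requirement])
        print(f"Successfully installed {package}")
        return True
    except subprocess.CalledProcessError as e:
        print(f"Failed to install {package}: {e}")
        return False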

# List of required packages with versions
required_packages = {
    "gradio": "latest",
    "ollama": "latest",
    "markdown2": "latest",
    # "torch": ">=2.4.0",
    # "transformers": ">=4.53.0"
}

# Install packages
for package, version in required_packages.items():
    if version == "latest":
        install_package(package)
    else:
        install_package(package, version)

# Report completion only after every package has been processed
print("!!! Installation concluded !!!")

Successfully installed gradio
Successfully installed ollama
Successfully installed markdown2
!!! Installation concluded !!!


# 3. Imports
## Performance Optimization Notice

The following imports are commented out to enhance the performance of the application on machines without a GPU:

```python
# import torch
# from transformers import AutoProcessor, AutoModelForImageTextToText
# !pip list | grep -E 'torch|transformers'
```

In [2]:
%%time
import os
import sys
import psutil
import subprocess
import logging
import warnings
import gradio as gr
import ollama
from ollama import chat
from PIL import Image
import io
import base64
import markdown2
# import torch
# from transformers import AutoProcessor, AutoModelForImageTextToText
# !pip list | grep -E 'torch|transformers'

CPU times: user 4.11 s, sys: 458 ms, total: 4.57 s
Wall time: 5.11 s


# 4. Logger Configuration

This section of code is responsible for setting up a logging system in Python, enabling the application to record log messages at various levels (INFO, ERROR, etc.). Logging is essential for debugging and monitoring applications.

## Key Components:

1. **Logger Creation**:
   - A logger is created using `logging.getLogger(__name__)`, which allows tracking logs specific to the current module or script.

2. **Log Level Setting**:
   - The logger’s level is set to `INFO` using `logger.setLevel(logging.INFO)`, which means that all messages at this level and above (WARNING, ERROR, CRITICAL) will be captured.

3. **File Handler**:
   - A file handler is created with `logging.FileHandler('app.log')`, which writes log messages to a specified file (in this case, `app.log`).
   - It is configured to capture messages at the INFO level and above.

4. **Console Handler**:
   - A console handler is set up with `logging.StreamHandler()`, which outputs log messages to the console (standard output).
   - This handler is configured to capture messages only at the ERROR level and above, which helps to reduce console clutter during regular operations.

5. **Formatters**:
   - The format of the log messages is defined using `logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')`. This format includes:
     - Timestamp of the log message (`asctime`)
     - Name of the logger (`name`)
     - Level of the log message (`levelname`)
     - The log message itself (`message`)

6. **Adding Handlers**:
   - The configured file and console handlers are added to the logger with `logger.addHandler()`, allowing the logger to output messages to both the console and a log file with the specified formats.
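The effect of giving each handler its own level can be seen in a small, self-contained sketch. It uses in-memory streams in place of `app.log` and the console, and a hypothetical logger name (`handler_level_demo`), so it does not touch the notebook's real logger.

```python
import io
import logging

# Hypothetical logger name, used only for this demonstration
demo_logger = logging.getLogger("handler_level_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo records away from the root logger

# In-memory streams stand in for 'app.log' and the console
file_like = io.StringIO()
console_like = io.StringIO()

info_handler = logging.StreamHandler(file_like)
info_handler.setLevel(logging.INFO)
error_handler = logging.StreamHandler(console_like)
error_handler.setLevel(logging.ERROR)

formatter = logging.Formatter('%(name)s - %(levelname)s - %(message)s')
for handler in (info_handler, error_handler):
    handler.setFormatter(formatter)
    demo_logger.addHandler(handler)

demo_logger.debug("dropped by the logger level")
demo_logger.info("model loaded")
demo_logger.error("model failed")

print("file-like handler received:")
print(file_like.getvalue())
print("console-like handler received:")
print(console_like.getvalue())
```

The DEBUG record is discarded by the logger itself, the INFO record reaches only the handler set to `INFO`, and the ERROR record reaches both handlers.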

In [3]:
%%time
# Cell: Logger Configuration
# Create a logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Create file handler and set level to info
file_handler = logging.FileHandler('app.log')
file_handler.setLevel(logging.INFO)

# Create console handler and set level to error
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.ERROR)

# Create the formatter shared by both handlers
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

# Attach the formatter to the handlers
file_handler.setFormatter(formatter)
console_handler.setFormatter(formatter)

# Add handlers to logger
logger.addHandler(file_handler)
logger.addHandler(console_handler)

CPU times: user 998 µs, sys: 0 ns, total: 998 µs
Wall time: 948 µs


# 5. System Information Function

This section of code defines a function `print_system_info()` that prints information about the current system environment: Python version, current working directory, and system memory (RAM). When PyTorch can be imported, it also prints the PyTorch version and GPU availability.


In [4]:
# ==================================================
# _/$\_\%/_/&\_\@/_/$\_\%/_/&\_\@/_/$\_\%/_/&\_\@/_
# ==================================================
#  **********    System Information   *************
# ==================================================
# _/$\_\%/_/&\_\@/_/$\_\%/_/&\_\@/_/$\_\%/_/&\_\@/_
# ==================================================
def print_system_info(use_torch=False):
    print("System Information:")
    print(f"• Python version: {sys.version}")
    print(f"• Current working directory: {os.getcwd()}")
    
    if use_torch:
        # torch is not imported here: the module-level `import torch` below
        # runs before this function is called with use_torch=True
        print(f"• PyTorch version: {torch.__version__}")
        # Check GPU availability and details
        if torch.cuda.is_available():
            gpu_info = {
                "CUDA Available": torch.cuda.is_available(),
                "CUDA Device Count": torch.cuda.device_count(),
                "Current CUDA Device": torch.cuda.current_device(),
                "Device Name": torch.cuda.get_device_name(torch.cuda.current_device()),
                "Memory Allocated (MB)": round(torch.cuda.memory_allocated(0) / 1024**2, 2),
                "Memory Reserved (MB)": round(torch.cuda.memory_reserved(0) / 1024**2, 2),
            }
            
            print("\n⚡ GPU Detected:")
            for key, value in gpu_info.items():
                print(f"  • {key}: {value}")
        else:
            print("\n😭 No GPU detected. Running on CPU only.")

    # Memory information
    ram = psutil.virtual_memory()
    print("\n🐘 System Memory:")
    print(f"  • Total RAM: {round(ram.total / 1024**2, 2)} MB")
    print(f"  • Available RAM: {round(ram.available / 1024**2, 2)} MB")
    print(f"  • Used RAM: {round(ram.used / 1024**2, 2)} MB")
    print(f"  • RAM Percentage: {ram.percent}% used")

# Check if torch is imported
try:
    import torch
    torch_imported = True  # Indicate that torch is available
except ImportError:
    torch_imported = False  # Indicate that torch is not available

# Call the function based on whether torch is imported
if torch_imported:
    print_system_info(use_torch=True)  # Call the function with use_torch=True
else:
    print_system_info(use_torch=False)  # Call the function with use_torch=False

System Information:
• Python version: 3.11.11 (main, Dec  4 2024, 08:55:07) [GCC 11.4.0]
• Current working directory: /kaggle/working
• PyTorch version: 2.6.0+cu124

😭 No GPU detected. Running on CPU only.

🐘 System Memory:
  • Total RAM: 32102.89 MB
  • Available RAM: 30784.21 MB
  • Used RAM: 861.09 MB
  • RAM Percentage: 4.1% used
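The availability check in the cell above imports `torch` just to find out whether it exists. As a possible alternative, the sketch below asks the import system whether a module can be found without loading it. The helper name `is_module_available` is hypothetical; `importlib.util.find_spec` is part of the standard library.

```python
import importlib.util


def is_module_available(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "torch" is the optional dependency from this notebook; "json" is always present
use_torch = is_module_available("torch")
print(f"torch available: {use_torch}")
print(f"json available: {is_module_available('json')}")
```

A found module can still fail when it is actually imported (for example because of a broken installation), so the `try`/`except ImportError` pattern remains the stricter test.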


# 6. Ollama Installation Script

This command is used to install the Ollama software on your system by downloading and executing the installation script.

## Command Breakdown:

- **`!curl -fsSL https://ollama.com/install.sh`**:
  - `curl`: A command-line tool for transferring data with URLs. It is used here to download the installation script from the specified URL.
  - `-f`: This option makes `curl` fail on HTTP error responses (such as 404 or 500) instead of outputting the server's error page, so an error document is never passed on to the shell.
  - `-s`: This flag makes the `curl` operation silent, meaning no progress meter or error messages will be shown.
  - `-S`: This option tells `curl` to show errors if they occur, even if the `-s` (silent) option is used.
  - `-L`: This flag tells `curl` to follow redirects. If the URL is redirected to another location, `curl` will follow it to download the file.

- **`| sh`**:
  - The pipe `|` takes the output of the `curl` command (the contents of the `install.sh` script) and passes it to the shell (`sh`) for execution.
  - This allows the script to be run directly after being downloaded without the need to save it to a file first.
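Piping a download straight into a shell runs the script before anyone has read it. A more cautious variant, sketched below under the assumption that a two-step install is acceptable, saves the script to a file first so it can be inspected. The helper name `download_script` is hypothetical. The demonstration uses a local `file://` URL so that it runs offline; for the real installer the URL would be `https://ollama.com/install.sh`.

```python
import pathlib
import tempfile
import urllib.request


def download_script(url, destination):
    """Download a script to a local file and return its path."""
    # urlopen follows redirects and raises an error on HTTP failures,
    # which roughly mirrors curl's -L and -f options
    with urllib.request.urlopen(url, timeout=30) as response:
        data = response.read()
    path = pathlib.Path(destination)
    path.write_bytes(data)
    return path


# Offline demonstration: a temporary local file plays the role of the remote script
with tempfile.TemporaryDirectory() as tmp:
    source = pathlib.Path(tmp) / "source.sh"
    source.write_text("echo hello\n")
    saved = download_script(source.as_uri(), pathlib.Path(tmp) / "install.sh")
    downloaded_text = saved.read_text()

print(downloaded_text)
```

After reviewing the saved file, it can be executed in a notebook cell with `!sh install.sh`.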


In [5]:
%%time
!curl -fsSL https://ollama.com/install.sh | sh

>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
############################################################################################# 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
CPU times: user 939 ms, sys: 267 ms, total: 1.21 s
Wall time: 38.1 s


# 7. Running Ollama Server with subprocess

This command is used to start the Ollama server as a subprocess within a Python script. The server allows you to interact with the Ollama model API.

## Command Breakdown:

- **`subprocess.Popen`**:
  - `subprocess`: This is a Python module used to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.
  - `Popen`: A constructor from the `subprocess` module that executes a child program in a new process. It can be used to execute shell commands and interact with them.

- **Parameters**:
  - **`"ollama serve"`**: This is the command being executed. In this case, it starts the Ollama server, which provides an interface for interacting with the Ollama model.
  - **`shell=True`**: This option indicates that the command should be executed through the shell. It allows for shell-specific features, such as running commands like `ollama serve` directly as if you were typing it into the command line.
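`Popen` returns immediately, before the server is ready to accept requests, so a command issued right afterwards can fail. The sketch below polls the address reported by the installer output (`127.0.0.1:11434`) until something answers. The helper name `wait_for_server` and its default timings are assumptions made for this illustration.

```python
import time
import urllib.error
import urllib.request

# Talk to the local server directly, ignoring any proxy settings in the environment
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_server(url="http://127.0.0.1:11434", timeout=30.0, interval=0.5):
    """Poll the URL until it answers or the timeout expires; return True on success."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with _opener.open(url, timeout=2):
                return True
        except urllib.error.HTTPError:
            # An HTTP error status still means that a server is listening
            return True
        except OSError:
            # Connection refused or timed out: the server is not up yet
            time.sleep(interval)
    return False


# Usage in the notebook (commented out so the sketch runs without Ollama installed):
# process = subprocess.Popen(["ollama", "serve"])
# if not wait_for_server():
#     raise RuntimeError("Ollama server did not start in time")
```

Passing the command as a list (`["ollama", "serve"]`) avoids the intermediate shell, which this command does not need because it uses no shell features.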


In [6]:
%%time
process = subprocess.Popen("ollama serve", shell=True)

CPU times: user 286 µs, sys: 1.08 ms, total: 1.36 ms
Wall time: 1.13 ms


# 8. Pulling the Gemma 3n Model

This command downloads the `gemma3n:e2b` model from the Ollama model library to your local environment, so that the model files needed for running inference are available.

## Command Breakdown:

- **`!ollama`**:
  - This is the command-line interface (CLI) for interacting with the Ollama service. Using the `!` prefix indicates that this command is being run in a shell interface, often seen in Jupyter notebooks and some interactive environments.

- **`pull`**:
  - The `pull` command is used to download a specified model from the Ollama repository. This retrieves the model's weights and configuration files, making them available for local use.

- **`gemma3n:e2b`**:
  - This specifies the model you want to download, in `name:tag` form. The tag `e2b` selects the smaller variant of `gemma3n` (the cell also contains a commented-out command for the larger `e4b` variant), which suits CPU-only machines.
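The same download can also be requested from Python through the `ollama` package installed earlier, which can be convenient outside a notebook. The sketch below is a minimal illustration: the helper names `split_model_tag` and `pull_model` are hypothetical, and `pull_model` assumes that the Ollama server is already running.

```python
def split_model_tag(model):
    """Split 'name:tag' into its parts; Ollama assumes 'latest' when no tag is given."""
    name, _, tag = model.partition(":")
    return name, tag or "latest"


def pull_model(model, client=None):
    """Download a model through the Ollama Python client (requires a running server)."""
    if client is None:
        import ollama  # imported lazily so split_model_tag works without the package
        client = ollama
    return client.pull(model)


print(split_model_tag("gemma3n:e2b"))

# Usage in the notebook (commented out because it needs a running server):
# pull_model("gemma3n:e2b")
```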


In [7]:
%%time
# !ollama pull gemma3n:e4b 
!ollama pull gemma3n:e2b

Couldn't find '/root/.ollama/id_ed25519'. Generating new private key.
Your new public key is: 

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIInpHb96kdhV9sb7lufahmx8y6qp9TaKON8Bx0cviYXI

[GIN] 2025/07/01 - 15:59:51 | 200 |      65.848µs |       127.0.0.1 | HEAD     "/"


time=2025-07-01T15:59:51.801Z level=INFO source=routes.go:1235 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICE

pulling manifest

time=2025-07-01T15:59:52.793Z level=INFO source=download.go:177 msg="downloading 3839a254cf2d in 16 351 MB part(s)"

pulling 3839a254cf2d: 100% ▕██████████████████▏ 5.6 GB

time=2025-07-01T16:00:31.090Z level=INFO source=download.go:177 msg="downloading e0a42594d802 in 1 358 B part(s)"

pulling e0a42594d802: 100% ▕██████████████████▏  358 B

time=2025-07-01T16:00:32.375Z level=INFO source=download.go:177 msg="downloading 1adbfec9dcf0 in 1 8.4 KB part(s)"

pulling 1adbfec9dcf0: 100% ▕██████████████████▏ 8.4 KB

time=2025-07-01T16:00:35.039Z level=INFO source=download.go:177 msg="downloading a3e66f51d60b in 1 417 B part(s)"

# 9. Gradio Integration with Gemma Model

This application uses **Gradio** to create an interactive user interface that allows users to communicate with the **Gemma 3n** model.

## Overview:

- **Gradio** is a Python library that provides an easy way to create user interfaces for machine learning models.
- It allows developers to transform their models into web applications without the need for extensive coding in web frameworks.
- Users can input questions through a text box, and the application outputs responses generated by the Gemma model in a chat format.

## Key Features:

- **Interactive Interface**: Users can ask questions and receive answers in real-time.
- **Chat History**: The application keeps a history of the conversation for display. Note that each request sends only the current question to the model, so earlier turns are not used as context.
- **Model Integration**: Direct integration with the Gemma model enables complex inference tasks and interactions on user input.
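The notebook's `gemma_chat` function sends only the current question to the model. If earlier turns should also reach the model, one possible approach is to rebuild the full message list on every request, as in the sketch below. The helper name `build_messages` and the sample turn are made up for this illustration; the `user` and `assistant` roles follow the message format already used in the `chat(...)` call.

```python
def build_messages(turns, question):
    """Build a message list for a chat call from earlier (question, answer) pairs."""
    messages = []
    for past_question, past_answer in turns:
        messages.append({"role": "user", "content": past_question})
        messages.append({"role": "assistant", "content": past_answer})
    # The new question always comes last
    messages.append({"role": "user", "content": question})
    return messages


# Made-up earlier turn, used only to show the resulting structure
turns = [("How do you say 'donut' in German?", "Sample answer from the model.")]
messages = build_messages(turns, "And in French?")
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list could be passed as the `messages` argument of `chat(model='gemma3n:e2b', messages=messages)`. Longer histories increase the prompt size, which matters on CPU-only machines.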

## How to Use The Cruzeta Analysis Portal:

Choose a file or ask a question:

1. **For Image Files**: 
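   - Gemma 3n via Ollama can be asked to interpret an image and describe it, but you have to ask it explicitly to access the file. Keep in mind that `gemma_chat` sends the URL to the model as plain text inside the prompt and does not attach the image data itself.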
   - Use a prompt such as: *"Access the file and describe in detail what you see in the image."*

2. **For Audio Files**: 
   - Gemma 3n via Ollama cannot "listen" to audio. Another version of the model is required for audio analysis.
   - Use a prompt such as: *"Describe the characteristics of this audio file."*

3. **Without a File**: 
   - Simply enter your question in the provided text box and click "Submit". 
   - **You can also request text translations in the chat with a simple prompt.**
   - Use a prompt such as: *"How do you say 'donut' in German, French, and Portuguese?"*

In [8]:
# Unified function for chat with Gemma
def gemma_chat(history, url_input, question):
    try:
        # Validate the question input
        question = question.strip()
        if not question:
            return history, "<br>".join(history), "Please enter a valid question."
        if url_input:  # If a URL has been provided
            message_content = f"You have received a URL for a file: {url_input}. Question: {question}"
        else:  # No URL provided, just process the question
            message_content = f"Question: {question}"
        # Send the prompt to the Ollama model
        response = chat(model='gemma3n:e2b', messages=[
            {
                "role": "user",
                "content": message_content
            },
        ])
        answer = response['message']['content']
        # Convert the Markdown answer to HTML
        answer_html = markdown2.markdown(answer)
        # Build a new list so the stored state is replaced rather than mutated
        history = history + [
            f"<div style='color: blue;'>You: {question}</div>",
            f"<div style='color: green;'>Gemma 3n: {answer_html}</div>",
        ]
        return history, "<br>".join(history), answer
    except Exception as e:
        # Record the failure in app.log and keep the existing history on screen
        logger.error(f"gemma_chat failed: {e}")
        return history, "<br>".join(history), f"Error occurred: {str(e)}"

# Create Gradio interface
with gr.Blocks() as demo:
    history = gr.State([])
    with gr.Column():
        gr.Markdown("# Welcome to the Cruzeta Analysis Portal")
        chat_output = gr.HTML(label="Chat History")
        response_output = gr.Textbox(label="Response", placeholder="Model response will appear here...", interactive=False)
        url_input = gr.Textbox(lines=1, label="Enter File URL (audio/image) or leave it blank")
        question_input = gr.Textbox(lines=2, label="Ask Gemma")
        
        submit_button = gr.Button("Submit")
    
    # Connect inputs and outputs; the history state is also an output
    # so that it persists between clicks
    submit_button.click(
        gemma_chat,
        inputs=[history, url_input, question_input],
        outputs=[history, chat_output, response_output]
    )

# Launch the Gradio interface
demo.launch()

* Running on local URL:  http://127.0.0.1:7860
It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

* Running on public URL: https://267a0bdfba30c7dc9c.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


