<a target="_blank" href="https://colab.research.google.com/github/urcraft/llm_lecture_notebooks/blob/main/03_Gemini_API_Basics_Key_Setup_and_First_Prompts.ipynb">   <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>

# Gemini API Basics: Free Key, First Calls, and Prompting

## What you will learn
- What a closed-model API is and why teams use it.
- How to use a free Gemini API key in Colab.
- How prompt changes affect model outputs.

- Expected runtime: 30-45 minutes.
- Expected cost: free, provided your usage stays within the free-tier quota.


## Setup notes
1. Store your key as a Colab Secret named `GOOGLE_API_KEY`.
2. If key setup fails, you can still complete the analysis tasks using the reference outputs included in this notebook.


In [1]:
%pip -q install -U google-genai



In [2]:
import os
import time
import pandas as pd

GEMINI_AVAILABLE = False
GEMINI_ERROR = None

try:
    from google import genai

    api_key = os.getenv('GOOGLE_API_KEY')
    if not api_key:
        try:
            from google.colab import userdata
            api_key = userdata.get('GOOGLE_API_KEY')
        except Exception:
            api_key = None

    if not api_key:
        raise ValueError('GOOGLE_API_KEY not found. Set env var or Colab secret GOOGLE_API_KEY.')

    client = genai.Client(api_key=api_key)
    GEMINI_AVAILABLE = True
except Exception as e:
    GEMINI_ERROR = str(e)
    print('Gemini is not available in this runtime.')
    print('Reason:', GEMINI_ERROR)


In [3]:
MODEL_ID = 'gemini-2.5-flash'
print('Model:', MODEL_ID)


Model: gemini-2.5-flash


In [4]:
def run_gemini(prompt: str, system_instruction=None):
    """Send one prompt to Gemini and return a result dict (text, latency, input tokens, error)."""
    if not GEMINI_AVAILABLE:
        return {
            'ok': False,
            'model': MODEL_ID,
            'prompt': prompt,
            'system_instruction': system_instruction,
            'text': None,
            'latency_s': None,
            'tokens_in': None,
            'error': GEMINI_ERROR,
        }

    config = {'system_instruction': system_instruction} if system_instruction else None

    try:
        start = time.perf_counter()
        response = client.models.generate_content(model=MODEL_ID, contents=prompt, config=config)
        latency = time.perf_counter() - start

        # Separate call: counts tokens in the user prompt only (system_instruction is not included).
        token_info = client.models.count_tokens(model=MODEL_ID, contents=prompt)
        tokens_in = getattr(token_info, 'total_tokens', None)

        return {
            'ok': True,
            'model': MODEL_ID,
            'prompt': prompt,
            'system_instruction': system_instruction,
            'text': response.text,
            'latency_s': round(latency, 2),
            'tokens_in': tokens_in,
            'error': None,
        }
    except Exception as e:
        return {
            'ok': False,
            'model': MODEL_ID,
            'prompt': prompt,
            'system_instruction': system_instruction,
            'text': None,
            'latency_s': None,
            'tokens_in': None,
            'error': str(e),
        }
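
The `tokens_in` value in `run_gemini` comes from a separate `count_tokens` call on the user prompt alone. As a hedged alternative, responses from the `google-genai` SDK also carry a `usage_metadata` field, and the sketch below reads it defensively with `getattr`. The helper name `extract_usage` and the keys `tokens_out` and `tokens_total` are hypothetical, and the stand-in response object exists only so the example runs without an API key.

```python
from types import SimpleNamespace


def extract_usage(response):
    """Return input/output/total token counts from a response, or None where missing."""
    usage = getattr(response, 'usage_metadata', None)
    return {
        'tokens_in': getattr(usage, 'prompt_token_count', None),
        'tokens_out': getattr(usage, 'candidates_token_count', None),
        'tokens_total': getattr(usage, 'total_token_count', None),
    }


# Stand-in object (not a real API response) so the sketch runs offline.
fake_response = SimpleNamespace(
    usage_metadata=SimpleNamespace(
        prompt_token_count=12,
        candidates_token_count=340,
        total_token_count=352,
    )
)
print(extract_usage(fake_response))
```

Reading usage from the response avoids a second network call, and the reported input count should then reflect everything sent with the request rather than the user prompt alone.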


In [5]:
# First basic call (no system prompt)
first_prompt = 'Explain GPU and VRAM to a high-school student in 5 bullet points.'
first_result = run_gemini(first_prompt)

print('--- First Gemini Call ---')
print('Prompt:', first_prompt)
print('Error:', first_result['error'])
print('Output:\n')
print(first_result['text'])

# Prompt for system prompt comparison
comparison_prompt = 'Explain the concept of large language models (LLMs).'
system_prompt_1 = 'You are a helpful assistant that provides concise answers.'
system_prompt_2 = 'You are a helpful assistant that provides detailed explanations, especially for technical terms.'

print('\n--- System Prompt Comparison Setup ---')
print(f'User prompt: {comparison_prompt}')
print(f'System Prompt 1: {system_prompt_1}')
print(f'System Prompt 2: {system_prompt_2}')


--- First Gemini Call ---
Prompt: Explain GPU and VRAM to a high-school student in 5 bullet points.
Error: None
Output:

Here's an explanation of GPU and VRAM in 5 bullet points:

*   **GPU (Graphics Processing Unit): The Visual Artist:** Think of the GPU as a highly specialized super-artist inside your computer. While your main processor (CPU) is a general-purpose genius handling everything from spreadsheets to web browsing, the GPU is *specifically* designed to quickly render all the images, videos, and 3D graphics you see on your screen. It does this by performing thousands of simple calculations at the same time, perfect for creating detailed game worlds or editing high-resolution video.

*   **GPU vs. CPU: Different Strengths:** Imagine the CPU as a master chef who can cook any dish perfectly, but only one at a time. The GPU is like a whole team of sous chefs, each capable of doing a small, specific task very quickly (like chopping onions), allowing them to prepare huge meals much

In [6]:
baseline_result = run_gemini(comparison_prompt)
result_sp1 = run_gemini(comparison_prompt, system_instruction=system_prompt_1)
result_sp2 = run_gemini(comparison_prompt, system_instruction=system_prompt_2)

print('\n--- Output: No System Prompt ---')
print(baseline_result['text'])

print('\n--- Output: System Prompt 1 (Concise) ---')
print(result_sp1['text'])

print('\n--- Output: System Prompt 2 (Detailed) ---')
print(result_sp2['text'])

comparison_df = pd.DataFrame([
    {'prompt_type': 'No System Prompt', 'latency_s': baseline_result['latency_s'], 'tokens_in': baseline_result['tokens_in'], 'error': baseline_result['error'], 'text': baseline_result['text']},
    {'prompt_type': 'System Prompt 1: Concise', 'latency_s': result_sp1['latency_s'], 'tokens_in': result_sp1['tokens_in'], 'error': result_sp1['error'], 'text': result_sp1['text']},
    {'prompt_type': 'System Prompt 2: Detailed', 'latency_s': result_sp2['latency_s'], 'tokens_in': result_sp2['tokens_in'], 'error': result_sp2['error'], 'text': result_sp2['text']}
])

comparison_df



--- Output: No System Prompt ---
Large Language Models (LLMs) are a type of artificial intelligence (AI) program designed to understand, generate, and manipulate human language. They are at the forefront of the current AI revolution due to their remarkable capabilities in various text-based tasks.

Let's break down the concept:

### What's in the Name?

1.  **Language:**
    *   LLMs are specifically trained on vast amounts of text data (books, articles, websites, code, conversations, etc.).
    *   Their primary function is to process and produce natural language, meaning the way humans communicate.
    *   They learn the nuances of grammar, syntax, semantics, context, style, and even tone.

2.  **Model:**
    *   In AI, a "model" is a trained algorithm. It's a mathematical structure that has learned patterns and relationships from data.
    *   LLMs are typically built using a **neural network architecture**, specifically a type called **Transformers**. Transformers are highly effec

| | prompt_type | latency_s | tokens_in | error | text |
|---|---|---|---|---|---|
| 0 | No System Prompt | 11.97 | 12 | | Large Language Models (LLMs) are a type of art... |
| 1 | System Prompt 1: Concise | 1.76 | 12 | | Large Language Models (LLMs) are deep learning... |
| 2 | System Prompt 2: Detailed | 14.37 | 12 | | Large Language Models (LLMs) are a type of art... |


In [7]:
# Additional prompt examples for basic API calling
prompts = [
    'What is the difference between open-weight and closed models? Keep it simple.',
    'Compare open-weight and closed models for a small startup. Include privacy, cost, and control in a table.'
]

rows = [run_gemini(p) for p in prompts]
results_df = pd.DataFrame(rows)
results_df[['prompt', 'latency_s', 'tokens_in', 'error', 'text']]


| | prompt | latency_s | tokens_in | error | text |
|---|---|---|---|---|---|
| 0 | What is the difference between open-weight and... | 7.19 | 17 | | The core difference is about **who gets the "b... |
| 1 | Compare open-weight and closed models for a sm... | 17.24 | 24 | | For a small startup, the choice between open-w... |


In [8]:
# Optional: quick side-by-side excerpt view for teaching
preview_df = comparison_df.copy()
preview_df['text_preview'] = preview_df['text'].fillna('').str.slice(0, 240)
preview_df[['prompt_type', 'latency_s', 'tokens_in', 'error', 'text_preview']]


| | prompt_type | latency_s | tokens_in | error | text_preview |
|---|---|---|---|---|---|
| 0 | No System Prompt | 11.97 | 12 | | Large Language Models (LLMs) are a type of art... |
| 1 | System Prompt 1: Concise | 1.76 | 12 | | Large Language Models (LLMs) are deep learning... |
| 2 | System Prompt 2: Detailed | 14.37 | 12 | | Large Language Models (LLMs) are a type of art... |
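
The table reports latency and input tokens, but the system prompts mainly change how much text comes back. A minimal sketch for quantifying that is shown here. The helper `summarize_output` is hypothetical, and the sample strings are stand-ins, not real model outputs.

```python
def summarize_output(label, text):
    """Return simple length statistics for one model output (None-safe)."""
    text = text or ''
    return {
        'label': label,
        'n_chars': len(text),
        'n_words': len(text.split()),
        'n_lines': len(text.splitlines()),
    }


# Stand-in texts, not real model outputs.
samples = {
    'concise': 'LLMs are neural networks trained on text to predict the next token.',
    'detailed': (
        'LLMs are neural networks.\n'
        'They are trained on large text corpora.\n'
        'They generate text one token at a time.'
    ),
}

for label, text in samples.items():
    print(summarize_output(label, text))
```

In the notebook, the same helper could be applied to each `text` value in `comparison_df` to put output length next to latency.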


## Checkpoint
- Rewrite a weak prompt as a strong prompt for the same task.
- Note two differences you observe between the outputs.
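
One way to approach the rewrite is to state the audience, output format, and constraints explicitly instead of leaving them implied. The sketch below assembles such a prompt. The function `build_prompt` and its parameters are hypothetical illustrations, not part of any Gemini API.

```python
def build_prompt(task, audience=None, output_format=None, constraints=()):
    """Assemble a prompt from a task plus optional audience, format, and constraints."""
    parts = [task.strip()]
    if audience:
        parts.append(f'Audience: {audience}.')
    if output_format:
        parts.append(f'Format: {output_format}.')
    for constraint in constraints:
        parts.append(f'Constraint: {constraint}.')
    return ' '.join(parts)


# Weak version: only the task is stated.
weak_prompt = build_prompt('Explain GPU and VRAM.')

# Stronger version: same task, with audience, format, and a constraint made explicit.
strong_prompt = build_prompt(
    'Explain GPU and VRAM.',
    audience='a high-school student',
    output_format='5 bullet points',
    constraints=['use one everyday analogy'],
)

print(weak_prompt)
print(strong_prompt)
```

Passing both strings to `run_gemini` and comparing the results is one way to complete the checkpoint.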

## Reflection
- When would a closed API model be better than running local models?

## Troubleshooting
- If key lookup fails, reopen the Colab Secrets panel and verify that `GOOGLE_API_KEY` exists and that notebook access is enabled for it.
- Free-tier quota limits can cause temporary request failures; wait briefly and retry.
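
For transient quota failures, a common pattern is to retry with exponentially growing delays. The sketch below assumes the callable returns a dict with an `ok` key, as `run_gemini` does. The wrapper `call_with_retry`, its parameters, and the simulated `flaky` function are hypothetical. The sketch retries on any failure, so a missing key would also be retried until the attempts run out.

```python
import time


def call_with_retry(fn, max_attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call fn() until its result has ok=True, doubling the wait after each failure."""
    result = None
    for attempt in range(max_attempts):
        result = fn()
        if result.get('ok'):
            return result
        # Wait only if another attempt remains.
        if attempt < max_attempts - 1:
            sleep(base_delay_s * (2 ** attempt))
    return result


# Simulated flaky call: fails twice, then succeeds (no network involved).
calls = {'n': 0}


def flaky():
    calls['n'] += 1
    if calls['n'] < 3:
        return {'ok': False, 'text': None, 'error': 'quota exceeded (simulated)'}
    return {'ok': True, 'text': 'hello', 'error': None}


# Record delays instead of sleeping so the demo finishes instantly.
delays = []
result = call_with_retry(flaky, max_attempts=4, base_delay_s=0.5, sleep=delays.append)
print(result['text'], delays)  # prints: hello [0.5, 1.0]
```

In the notebook, this could wrap a real call as `call_with_retry(lambda: run_gemini(first_prompt))`.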
