
feat(gemini-image): add Gemini image generation package#3

Merged
williaby merged 1 commit into main from feat/gemini-image-generation
Dec 15, 2025

Conversation

@williaby
Contributor

@williaby williaby commented Dec 14, 2025

Summary

Adds a new gemini-image package to the monorepo for generating images using Google's Gemini API.

Features

  • Generate images from text prompts using Gemini 2.5 Flash or Gemini 3 Pro
  • Support for draft mode (1K resolution) and final mode (2K/4K resolution)
  • Story sequence generation (multiple related images)
  • Draft finalization with upscaling
  • CLI interface for easy command-line usage
  • Configurable aspect ratios and image sizes
  • Optional Google Search grounding for enhanced results

Package Structure

  • gemini_image.generator - Core image generation functions
  • gemini_image.models - Model configurations and constants
  • gemini_image.utils - Utility functions (API key handling, image encoding)
  • gemini_image.cli - Click-based command-line interface

Testing

  • 25 unit tests covering all core functionality
  • Mock-based testing to avoid API calls
  • Tests for error handling and edge cases

Test plan

  • All 25 package tests pass
  • Ruff linting passes
  • Pre-commit hooks pass
  • Pre-push hooks pass

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added Gemini Image Generation library for creating images from text prompts
    • Introduced command-line interface for image generation with support for multiple models
    • Added story sequence generation for creating multi-part image narratives
    • Implemented draft-to-final workflow for image refinement
    • Support for customizable image sizes, aspect ratios, and reference images
  • Documentation

    • Added comprehensive README with installation, usage examples, and API documentation


Add new workspace package for image generation using Google's Gemini
models (Nano Banana / Nano Banana Pro).

Features:
- Text-to-image generation with configurable resolution (1K/2K/4K)
- Multiple aspect ratios (1:1, 3:4, 4:3, 9:16, 16:9)
- Reference-based image editing and refinement
- Multi-part story sequence generation with visual continuity
- Draft-then-finalize workflow for 75% cost reduction
- Thinking mode with intermediate image visualization
- CLI tool (`gemini-image`) for command-line usage

Package structure:
- generator.py: Core generate_image(), generate_story_sequence(),
  finalize_draft() functions
- models.py: Model configurations and type definitions
- utils.py: Helper functions for API key, base64, file extensions
- cli.py: Full-featured command-line interface

Models supported:
- flash: Gemini 2.5 Flash (fast generation)
- pro: Gemini 3 Pro (4K, better text, thinking mode) [default]

Based on handoff from /home/byron/dev/library/scripts/generate_image.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Dec 14, 2025

Walkthrough

Introduces a new gemini-image package providing image generation via Google Gemini API, with comprehensive source modules, CLI interface, tests, and documentation. Integrates the package into the workspace build configuration.

Changes

Cohort / File(s) | Summary
Package metadata and documentation: packages/gemini-image/README.md, packages/gemini-image/pyproject.toml | Adds package README with features, installation, quick-start, API examples, and CLI documentation. Introduces pyproject.toml with project metadata, dependencies, entry points, build configuration, and semantic-release tooling.
Model configuration: packages/gemini-image/src/gemini_image/models.py | Defines type aliases (ModelKey, AspectRatio, ImageSize), TypedDict for ModelConfig, and public constants for available models ("flash", "pro"), default model, supported aspect ratios, and image sizes.
Utility functions: packages/gemini-image/src/gemini_image/utils.py | Provides helper functions for API key retrieval from environment or .env file, image base64 encoding/decoding, MIME type detection, and file extension mapping.
Core image generation: packages/gemini-image/src/gemini_image/generator.py | Implements main image generation logic using Gemini API, including generate_image(), generate_story_sequence(), and finalize_draft() functions with support for reference images, aspect ratios, draft-to-final workflows, and verbose logging.
Command-line interface: packages/gemini-image/src/gemini_image/cli.py | Adds CLI entry point with argument parsing for single/multi-part image generation, draft finalization, model listing, and output configuration.
Public API surface: packages/gemini-image/src/gemini_image/__init__.py | Exposes package version, model constants, and core generator functions as public API.
Test fixtures and configurations: packages/gemini-image/tests/conftest.py | Provides pytest fixtures for sample PNG images and mock Gemini API responses used across test modules.
Test coverage: packages/gemini-image/tests/test_*.py | Adds test modules validating model configurations, generator functions (including error paths), and utility functions for API key, image encoding, and file operations.
Workspace integration: pyproject.toml | Updates root configuration to include gemini-image package in Ruff, pytest, and uv workspace settings.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant Generator
    participant Utils
    participant GeminiAPI as Gemini API
    participant FileSystem as File System

    User->>CLI: invoke generate_image()
    CLI->>Utils: get_api_key()
    Utils-->>CLI: API key
    CLI->>Generator: generate_image(prompt, model, ...)
    Generator->>Utils: load_image_as_base64(ref_image)
    Utils-->>Generator: base64_data, mime_type
    Generator->>GeminiAPI: generate_content(contents, config)
    GeminiAPI-->>Generator: response with image parts
    Generator->>Generator: iterate parts, extract thoughts & images
    Generator->>Utils: decode_base64_image()
    Utils-->>Generator: image bytes
    Generator->>FileSystem: write image file
    FileSystem-->>Generator: file path
    Generator-->>CLI: return Path
    CLI-->>User: output file path
sequenceDiagram
    participant User
    participant CLI
    participant Generator
    participant GeminiAPI as Gemini API
    participant FileSystem as File System

    User->>CLI: invoke generate_story_sequence(base_prompt, 3 parts)
    CLI->>Generator: generate_story_sequence(...)
    loop for each part (1 to 3)
        Generator->>Generator: construct part-specific prompt
        Generator->>GeminiAPI: generate_content(prompt, prev_image as ref)
        GeminiAPI-->>Generator: response with image
        Generator->>FileSystem: write part_N.png
        FileSystem-->>Generator: file path
        Generator->>Generator: use output as reference for next part
    end
    Generator-->>CLI: return list of Paths
    CLI-->>User: output paths for all parts
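
The loop in the diagram above can be sketched as follows. This is a minimal illustration, not the package's actual generate_story_sequence: the `generate` callable, its argument order, and the prompt wording are assumptions made for the example. It only shows how each part's output becomes the reference image for the next part.

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical callable: (prompt, reference image or None, output path) -> written path
GenerateFn = Callable[[str, Optional[Path], Path], Path]


def story_sequence(
    base_prompt: str,
    num_parts: int,
    output_dir: Path,
    prefix: str,
    generate: GenerateFn,
) -> list[Path]:
    """Generate num_parts images, feeding each output in as the next reference."""
    if num_parts < 1:
        msg = "num_parts must be at least 1"
        raise ValueError(msg)

    paths: list[Path] = []
    previous: Optional[Path] = None  # the first part has no reference image
    for part in range(1, num_parts + 1):
        # Assumed prompt format; the real package may word this differently.
        prompt = f"{base_prompt} (part {part} of {num_parts})"
        out = output_dir / f"{prefix}_part{part}.png"
        previous = generate(prompt, previous, out)
        paths.append(previous)
    return paths
```

Passing the generator in as a callable keeps the chaining logic testable without any API call, which matches the mock-based testing approach described in the PR.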
sequenceDiagram
    participant User
    participant CLI
    participant Generator
    participant FileSystem as File System
    participant GeminiAPI as Gemini API

    User->>CLI: invoke finalize_draft(draft.png, prompt)
    CLI->>Generator: finalize_draft(draft_path, ...)
    Generator->>FileSystem: validate draft exists
    FileSystem-->>Generator: draft file
    Generator->>Generator: load draft as reference
    Generator->>GeminiAPI: generate_content(prompt, draft_ref, 2K resolution)
    GeminiAPI-->>Generator: high-res response
    Generator->>FileSystem: write final image
    FileSystem-->>Generator: file path
    Generator-->>CLI: return Path
    CLI-->>User: finalized image path
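
A rough sketch of the finalize flow in the diagram above: validate that the draft exists, then delegate to the generator with the draft as reference and a higher resolution. The `generate` callable, its keyword names (`reference`, `image_size`, `output`), and the `_final` output naming are assumptions for illustration and are not taken from the package.

```python
from pathlib import Path
from typing import Callable

# Hypothetical generator callable; keyword names are assumptions for this sketch.
GenerateFn = Callable[..., Path]


def finalize(
    draft_path: Path,
    prompt: str,
    generate: GenerateFn,
    image_size: str = "2K",
) -> Path:
    """Re-render a draft at higher resolution, using the draft as the reference."""
    if not draft_path.exists():
        msg = f"Draft not found: {draft_path}"
        raise FileNotFoundError(msg)

    # Assumed naming scheme: draft.png -> draft_final.png in the same directory.
    output = draft_path.with_name(f"{draft_path.stem}_final{draft_path.suffix}")
    return generate(
        prompt,
        reference=draft_path,
        image_size=image_size,
        output=output,
    )
```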

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Generator module (generator.py): Dense logic with lazy imports, Gemini API interaction, image encoding/decoding, multi-part sequencing, and draft finalization; requires careful review of error handling and API response processing.
  • CLI argument parsing (cli.py): Multiple conditional flows (list models, finalize, single image, story generation) with error handling and output path management.
  • Integration points: Verify correct data flow between utils, models, generator, and CLI; check that all public exports in __init__.py are properly defined; validate test coverage of error paths.
  • Model and utility modules: Simpler but verify completeness of constants and utility function implementations.

Suggested labels

python, tests, documentation, dependencies

Poem

🐰 A fuzzy gem for image dreams,
Where Gemini now streams,
Stories born from prompts so fine,
Draft to polish, each design—
Code hops forward, tests alight! 🎨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title 'feat(gemini-image): add Gemini image generation package' accurately and comprehensively summarizes the main change: introducing a new Gemini image generation package to the monorepo.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

@github-actions

✅ FIPS Compatibility Check

Metric Count
Errors 0
Warnings 0
Info 0

Status: ✅ PASSED

What is FIPS?

FIPS 140-2/140-3 is a US government standard for cryptographic modules.
Systems running Ubuntu LTS with fips-updates or similar configurations
restrict cryptographic algorithms to NIST-approved ones.

Common issues:

  • Using hashlib.md5() without usedforsecurity=False
  • Dependencies using non-approved algorithms (bcrypt, DES, RC4)
  • Weak cipher configurations
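
As an illustration of the first issue in the list above, the sketch below shows an MD5 call flagged as non-security use next to an approved alternative. The function names are hypothetical and nothing here is taken from the gemini-image package; `usedforsecurity` is a standard `hashlib` keyword available from Python 3.9.

```python
import hashlib


def cache_key(data: bytes) -> str:
    """Return an MD5 hex digest used only as a non-security cache key."""
    # usedforsecurity=False (Python 3.9+) tells FIPS-restricted builds that this
    # digest protects nothing, so the call is permitted rather than rejected.
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


def integrity_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest, a NIST-approved choice for integrity checks."""
    return hashlib.sha256(data).hexdigest()
```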

@coderabbitai coderabbitai Bot added the dependencies, documentation, python, and tests labels Dec 14, 2025
@github-actions

✅ Mutation Testing Results

Metric Value
Mutation Score 100.0%
Threshold 80%
Status Passed
What is Mutation Testing?

Mutation testing introduces small changes (mutations) to your code and checks if your tests detect them. A high mutation score indicates your tests are effective at catching bugs.

  • Killed mutants: Tests detected the change
  • Survived mutants: Tests did not detect the change (potential gap)
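
A small hand-written example of the idea, assuming a hypothetical validation function (real mutation tools generate mutants automatically, and this code is not from the package). The mutant flips `>=` to `>`; a boundary test kills it, while a weaker test lets it survive.

```python
from typing import Callable


def has_min_parts(num_parts: int) -> bool:
    """Original code: a story needs at least one part."""
    return num_parts >= 1


def has_min_parts_mutant(num_parts: int) -> bool:
    """Mutant: the comparison operator was changed from >= to >."""
    return num_parts > 1


def boundary_test(fn: Callable[[int], bool]) -> bool:
    """Return True when fn passes a test that checks the boundary values 0 and 1."""
    return fn(1) is True and fn(0) is False


def weak_test(fn: Callable[[int], bool]) -> bool:
    """Return True when fn passes a test that only checks a value far from the boundary."""
    return fn(5) is True
```

Here `boundary_test` passes for the original and fails for the mutant (killed), whereas `weak_test` passes for both (survived), which is the kind of gap a mutation score below 100% would point at.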


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 11

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30fc976 and 93a1cbf.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock, !**/*.lock
📒 Files selected for processing (12)
  • packages/gemini-image/README.md (1 hunks)
  • packages/gemini-image/pyproject.toml (1 hunks)
  • packages/gemini-image/src/gemini_image/__init__.py (1 hunks)
  • packages/gemini-image/src/gemini_image/cli.py (1 hunks)
  • packages/gemini-image/src/gemini_image/generator.py (1 hunks)
  • packages/gemini-image/src/gemini_image/models.py (1 hunks)
  • packages/gemini-image/src/gemini_image/utils.py (1 hunks)
  • packages/gemini-image/tests/conftest.py (1 hunks)
  • packages/gemini-image/tests/test_generator.py (1 hunks)
  • packages/gemini-image/tests/test_models.py (1 hunks)
  • packages/gemini-image/tests/test_utils.py (1 hunks)
  • pyproject.toml (4 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Ruff formatting with 88 character line length for Python code
Use Ruff linting with PyStrict-aligned rules including BLE, EM, SLF, INP, ISC, PGH, RSE, TID, YTT, FA, T10, and G rules
Tag assumptions with #CRITICAL, #ASSUME, or #EDGE comments including category and reason for verification

Files:

  • packages/gemini-image/src/gemini_image/utils.py
  • packages/gemini-image/src/gemini_image/cli.py
  • packages/gemini-image/tests/test_utils.py
  • packages/gemini-image/tests/test_models.py
  • packages/gemini-image/tests/conftest.py
  • packages/gemini-image/src/gemini_image/__init__.py
  • packages/gemini-image/tests/test_generator.py
  • packages/gemini-image/src/gemini_image/models.py
  • packages/gemini-image/src/gemini_image/generator.py
**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Use 120 character line length for Markdown documentation files

Files:

  • packages/gemini-image/README.md
pyproject.toml

⚙️ CodeRabbit configuration file

pyproject.toml: Review dependency changes for:

  • Version constraint appropriateness
  • Security implications of new dependencies
  • License compatibility

Files:

  • pyproject.toml
🧠 Learnings (2)
📚 Learning: 2025-12-14T22:54:23.007Z
Learnt from: CR
Repo: ByronWilliamsCPA/python-libs PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-14T22:54:23.007Z
Learning: Applies to tests/**/*.py : Use pytest fixtures defined in tests/conftest.py for test setup and teardown

Applied to files:

  • packages/gemini-image/tests/conftest.py
📚 Learning: 2025-12-14T22:54:23.007Z
Learnt from: CR
Repo: ByronWilliamsCPA/python-libs PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-14T22:54:23.007Z
Learning: Applies to src/**/*.py : Use BasedPyright for type checking in strict mode with strict inference enabled

Applied to files:

  • pyproject.toml
🧬 Code graph analysis (4)
packages/gemini-image/tests/test_utils.py (2)
packages/gemini-image/src/gemini_image/utils.py (4)
  • decode_base64_image (81-91)
  • get_api_key (13-44)
  • get_file_extension (94-110)
  • load_image_as_base64 (47-78)
packages/gemini-image/tests/conftest.py (2)
  • sample_image_path (26-30)
  • sample_image_bytes (16-22)
packages/gemini-image/src/gemini_image/__init__.py (2)
packages/gemini-image/src/gemini_image/generator.py (2)
  • generate_image (55-304)
  • generate_story_sequence (307-428)
packages/gemini-image/src/gemini_image/models.py (1)
  • ModelConfig (13-19)
packages/gemini-image/tests/test_generator.py (2)
packages/gemini-image/src/gemini_image/generator.py (3)
  • finalize_draft (431-501)
  • generate_story_sequence (307-428)
  • generate_image (55-304)
packages/gemini-image/tests/conftest.py (2)
  • mock_genai_response (34-51)
  • sample_image_path (26-30)
packages/gemini-image/src/gemini_image/generator.py (1)
packages/gemini-image/src/gemini_image/utils.py (3)
  • get_api_key (13-44)
  • get_file_extension (94-110)
  • load_image_as_base64 (47-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Mutation Testing / Mutation Testing
🔇 Additional comments (16)
pyproject.toml (2)

271-271: LGTM! Proper monorepo integration.

The gemini_image package is correctly added to:

  • Ruff's known-first-party for import sorting
  • BasedPyright's include paths for type checking
  • Pytest's testpaths and pythonpath for test discovery

This ensures consistent tooling across the monorepo.

Also applies to: 465-465, 528-529


700-703: [rewritten comment]
[classification tag]

packages/gemini-image/pyproject.toml (3)

1-66: Well-structured package configuration.

The pyproject.toml follows best practices:

  • Comprehensive metadata with classifiers and keywords
  • Clear Python version constraints
  • Proper semantic release configuration with package-specific tag format
  • Hatchling build configuration

26-26: No action needed — google-genai>=1.0.0 is valid.

The google-genai package exists on PyPI at version 1.55.0, which satisfies the specified requirement. The package name is correct and unambiguous.


25-27: The CLI implementation uses argparse (standard library), not click. No missing dependency here.

Likely an incorrect or invalid review comment.

packages/gemini-image/tests/test_utils.py (1)

1-111: Good test coverage for utility functions.

The tests comprehensively cover:

  • API key retrieval from environment and .env files
  • Image loading with MIME type detection
  • Base64 encoding/decoding round-trip
  • File extension mapping with fallback behavior

Test structure follows pytest best practices with clear class organization and descriptive method names.

packages/gemini-image/README.md (1)

99-104: Default model documentation is correct.

The DEFAULT_MODEL in gemini_image/models.py is set to "pro", which matches the table indicating pro as the default model.

packages/gemini-image/src/gemini_image/cli.py (1)

1-230: Documentation inconsistency: CLI uses argparse, not Click.

The PR summary states "Click-based command-line interface," but the implementation uses argparse (imported on line 5). This documentation inconsistency should be corrected in the PR description.

packages/gemini-image/tests/conftest.py (1)

1-59: LGTM!

The pytest fixtures are well-structured and provide appropriate mocks for testing the Gemini image generation functionality. The fixtures correctly simulate the API response structure and follow pytest best practices.

packages/gemini-image/tests/test_models.py (1)

1-38: LGTM!

The test coverage for model configurations is appropriate and validates all the key constants used throughout the package.

packages/gemini-image/src/gemini_image/utils.py (2)

47-78: LGTM!

The load_image_as_base64 function correctly handles image loading, MIME type detection, and base64 encoding with proper error handling for missing files.


81-110: LGTM!

The decode_base64_image and get_file_extension utility functions are simple, correct, and follow best practices.

packages/gemini-image/src/gemini_image/generator.py (3)

1-52: LGTM!

The lazy loading pattern for the google.genai dependency is well-implemented and provides clear error messages. The complexity warning suppressions are appropriately documented in the module docstring.


307-428: LGTM with note on type handling.

The generate_story_sequence function correctly implements multi-part story generation with visual continuity. The function properly validates inputs, builds appropriate prompts for each part, and uses the previous image as a reference.

Note: The function assumes output_prefix is always a Path object at line 386 (.stem access), which is safe because line 349 ensures it's a Path when None is provided. The type hint correctly reflects this.


431-501: LGTM!

The finalize_draft function correctly implements draft finalization by delegating to generate_image with appropriate parameters. The function properly validates the draft exists and provides sensible defaults for aspect ratio and image size.

packages/gemini-image/src/gemini_image/models.py (1)

22-37: The model IDs "gemini-2.5-flash-image" and "gemini-3-pro-image-preview" are valid and confirmed in official Google Gemini API documentation for image generation. No changes needed.

Comment on lines +3 to +4
A comprehensive image generation library built on Google's Gemini models (Nano Banana / Nano Banana Pro).


🧹 Nitpick | 🔵 Trivial

Clarify or remove internal codenames.

"Nano Banana / Nano Banana Pro" appears to be internal codenames that may confuse end users. Consider either removing these or adding context explaining what they refer to.

🤖 Prompt for AI Agents
In packages/gemini-image/README.md around lines 3-4, the README references
internal codenames "Nano Banana / Nano Banana Pro" which may confuse users;
either remove these codenames or add a short parenthetical explaining they are
internal model nicknames (or map them to their official public model names),
update the sentence accordingly to use the public/official model names or
include a brief clarifying phrase, and ensure the README remains clear and
user-facing.

Comment on lines +61 to +63
# Multi-part story sequence
from gemini_image import generate_story_sequence


🧹 Nitpick | 🔵 Trivial

Remove duplicate import statement.

generate_story_sequence is already imported on line 34. This duplicate import in the code example is redundant and may confuse readers.

 # Multi-part story sequence
-from gemini_image import generate_story_sequence
-
 images = generate_story_sequence(
🤖 Prompt for AI Agents
In packages/gemini-image/README.md around lines 61 to 63, the example duplicates
the import of generate_story_sequence (it's already imported on line 34); remove
the duplicate import line so the example uses the previously declared import, or
consolidate both examples to reference the single import at the top to avoid
redundancy.

Comment on lines +23 to +46
from gemini_image.generator import generate_image, generate_story_sequence
from gemini_image.models import (
    ASPECT_RATIOS,
    DEFAULT_MODEL,
    IMAGE_SIZES,
    MODELS,
    AspectRatio,
    ImageSize,
    ModelConfig,
    ModelKey,
)

__all__ = [
    "ASPECT_RATIOS",
    "DEFAULT_MODEL",
    "IMAGE_SIZES",
    "MODELS",
    "AspectRatio",
    "ImageSize",
    "ModelConfig",
    "ModelKey",
    "generate_image",
    "generate_story_sequence",
]

⚠️ Potential issue | 🟡 Minor

Missing finalize_draft export.

The README.md documents finalize_draft() in the API Reference section (lines 172-185), but this function is not exported from __init__.py. Users following the documentation won't be able to import it from the package namespace.

Apply this diff to export finalize_draft:

-from gemini_image.generator import generate_image, generate_story_sequence
+from gemini_image.generator import finalize_draft, generate_image, generate_story_sequence

And update __all__:

 __all__ = [
     "ASPECT_RATIOS",
     "DEFAULT_MODEL",
     "IMAGE_SIZES",
     "MODELS",
     "AspectRatio",
     "ImageSize",
     "ModelConfig",
     "ModelKey",
+    "finalize_draft",
     "generate_image",
     "generate_story_sequence",
 ]
🤖 Prompt for AI Agents
In packages/gemini-image/src/gemini_image/__init__.py around lines 23 to 46, the
function finalize_draft is documented in the README but not exported from the
package; import finalize_draft from its module (likely from
gemini_image.generator or the correct module where it's defined) at the top of
the file and add "finalize_draft" to the __all__ list so it is available from
the package namespace for users following the docs.

Comment on lines +183 to +185
if args.story_parts < 2:
    print("Error: Story must have at least 2 parts")  # noqa: T201
    sys.exit(1)

⚠️ Potential issue | 🟡 Minor

Story parts validation should match generator's requirement.

The CLI validates args.story_parts < 2 (line 183), requiring at least 2 parts, but generate_story_sequence in generator.py validates num_parts < 1 (line 340), requiring at least 1 part. This inconsistency could confuse users who might legitimately want a single-part story.

Apply this diff to align the validation:

    # Story sequence mode
    if args.story_parts:
-        if args.story_parts < 2:
-            print("Error: Story must have at least 2 parts")  # noqa: T201
+        if args.story_parts < 1:
+            print("Error: Story must have at least 1 part")  # noqa: T201
            sys.exit(1)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-if args.story_parts < 2:
-    print("Error: Story must have at least 2 parts")  # noqa: T201
-    sys.exit(1)
+if args.story_parts < 1:
+    print("Error: Story must have at least 1 part")  # noqa: T201
+    sys.exit(1)
🤖 Prompt for AI Agents
In packages/gemini-image/src/gemini_image/cli.py around lines 183 to 185, the
CLI enforces a minimum of 2 story parts while generate_story_sequence expects at
least 1; update the CLI validation to match the generator by changing the check
to require args.story_parts < 1 and adjust the error message to "Error: Story
must have at least 1 part" (and exit with sys.exit(1)) so the two validations
are consistent.

Comment on lines +122 to +128
img_data, mime_type = load_image_as_base64(img_path)
contents.append(
    types.Part.from_bytes(
        data=base64.standard_b64decode(img_data),
        mime_type=mime_type,
    )
)

🧹 Nitpick | 🔵 Trivial

Inefficient encoding/decoding roundtrip.

Lines 122-125 encode the image to base64 via load_image_as_base64, then immediately decode it back to bytes with base64.standard_b64decode(img_data). This is inefficient.

Consider one of these approaches:

  1. Option 1: Load the binary data directly without the encoding/decoding roundtrip:
with open(img_path, "rb") as f:
    img_bytes = f.read()
# Determine mime_type from suffix
suffix = img_path.suffix.lower()
mime_types = {".png": "image/png", ".jpg": "image/jpeg", ...}
mime_type = mime_types.get(suffix, "image/png")
contents.append(
    types.Part.from_bytes(
        data=img_bytes,
        mime_type=mime_type,
    )
)
  2. Option 2: Add a helper function in utils.py to load images as bytes directly, and use that instead of load_image_as_base64 here.
🤖 Prompt for AI Agents
In packages/gemini-image/src/gemini_image/generator.py around lines 122 to 128,
the code currently calls load_image_as_base64 and then immediately decodes it
back to bytes, causing an unnecessary base64 roundtrip; replace that with direct
binary reads (or call a new helper that returns bytes) and determine mime_type
from the file suffix (use a small suffix->mime map with sensible default) before
calling types.Part.from_bytes with the raw bytes and mime_type.
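
One possible shape for the helper suggested in Option 2 is sketched below. The name `load_image_bytes` and the use of the standard-library `mimetypes` module (instead of a hand-written suffix map) are assumptions for illustration, not part of the reviewed code.

```python
import mimetypes
from pathlib import Path


def load_image_bytes(
    path: Path,
    default_mime: str = "image/png",
) -> tuple[bytes, str]:
    """Read an image as raw bytes and guess its MIME type from the file name."""
    if not path.exists():
        msg = f"Image not found: {path}"
        raise FileNotFoundError(msg)

    # guess_type returns (type, encoding); type is None for unknown suffixes.
    mime_type, _ = mimetypes.guess_type(path.name)
    return path.read_bytes(), mime_type or default_mime
```

The returned bytes could then be passed straight to types.Part.from_bytes, avoiding the base64 encode and decode steps entirely.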

Comment on lines +29 to +35
if not api_key and env_file and env_file.exists():
    with open(env_file) as f:
        for line in f:
            line = line.strip()
            if line.startswith("GEMINI_API_KEY="):
                api_key = line.split("=", 1)[1].strip().strip('"').strip("'")
                break

🧹 Nitpick | 🔵 Trivial

Specify encoding when opening text files.

Line 30 opens the .env file without specifying an encoding parameter. For text files, it's best practice to explicitly specify the encoding (typically 'utf-8') to ensure consistent behavior across platforms.

Apply this diff:

    if not api_key and env_file and env_file.exists():
-        with open(env_file) as f:
+        with open(env_file, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line.startswith("GEMINI_API_KEY="):
                    api_key = line.split("=", 1)[1].strip().strip('"').strip("'")
                    break
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-if not api_key and env_file and env_file.exists():
-    with open(env_file) as f:
-        for line in f:
-            line = line.strip()
-            if line.startswith("GEMINI_API_KEY="):
-                api_key = line.split("=", 1)[1].strip().strip('"').strip("'")
-                break
+if not api_key and env_file and env_file.exists():
+    with open(env_file, encoding="utf-8") as f:
+        for line in f:
+            line = line.strip()
+            if line.startswith("GEMINI_API_KEY="):
+                api_key = line.split("=", 1)[1].strip().strip('"').strip("'")
+                break
🤖 Prompt for AI Agents
In packages/gemini-image/src/gemini_image/utils.py around lines 29 to 35, the
.env file is opened without an explicit encoding; update the open call to
specify encoding='utf-8' (e.g., with open(env_file, 'r', encoding='utf-8') as f)
so the file is read with a consistent text encoding across platforms while
preserving the existing logic.
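
Put together as a standalone function, the lookup with an explicit encoding could look roughly like this. It is a simplified sketch modelled on the snippet quoted in this comment; the function name `read_key` and the generic `name` parameter are assumptions, and the real get_api_key may differ (for example in how it reports a missing key).

```python
import os
from pathlib import Path
from typing import Optional


def read_key(name: str, env_file: Optional[Path] = None) -> Optional[str]:
    """Return the value of name from the environment, else from a .env file."""
    value = os.environ.get(name)
    if not value and env_file and env_file.exists():
        # Explicit encoding avoids depending on the platform's default locale.
        with open(env_file, encoding="utf-8") as f:
            for raw in f:
                line = raw.strip()
                if line.startswith(f"{name}="):
                    # Drop the key, then surrounding whitespace and quotes.
                    value = line.split("=", 1)[1].strip().strip('"').strip("'")
                    break
    return value or None
```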

    patch.object(generator, "_types", mock_types),
    patch.dict(os.environ, {}, clear=True),
):
    os.environ.pop("GEMINI_API_KEY", None)

🧹 Nitpick | 🔵 Trivial

Redundant operation after clear=True.

After patch.dict(os.environ, {}, clear=True) on line 44, calling os.environ.pop("GEMINI_API_KEY", None) on line 46 is redundant since the environment has already been cleared.

Apply this diff to remove the redundant line:

        with (
            patch.object(generator, "_genai", mock_genai),
            patch.object(generator, "_types", mock_types),
            patch.dict(os.environ, {}, clear=True),
        ):
-            os.environ.pop("GEMINI_API_KEY", None)
            with pytest.raises(ValueError, match="GEMINI_API_KEY"):
                generator.generate_image("test prompt")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
os.environ.pop("GEMINI_API_KEY", None)
with (
patch.object(generator, "_genai", mock_genai),
patch.object(generator, "_types", mock_types),
patch.dict(os.environ, {}, clear=True),
):
with pytest.raises(ValueError, match="GEMINI_API_KEY"):
generator.generate_image("test prompt")
🤖 Prompt for AI Agents
In packages/gemini-image/tests/test_generator.py at line 46, the call
os.environ.pop("GEMINI_API_KEY", None) is redundant because
patch.dict(os.environ, {}, clear=True) on line 44 already clears the
environment; remove the os.environ.pop(...) line so the redundant operation is
eliminated and the test relies on the cleared environment.
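
The behaviour this comment relies on can be checked in isolation. The sketch below (the function name is hypothetical) sets the variable in an outer patch, clears the environment in an inner one, and records what is visible inside and after the inner block.

```python
import os
from unittest.mock import patch


def key_state_inside_and_after() -> tuple[bool, bool]:
    """Return (key present inside the cleared patch, key restored afterwards)."""
    with patch.dict(os.environ, {"GEMINI_API_KEY": "outer"}):
        with patch.dict(os.environ, {}, clear=True):
            # clear=True empties os.environ for the duration of this block,
            # so no explicit pop() is needed.
            inside = "GEMINI_API_KEY" in os.environ
        # Leaving the block restores the previous contents.
        after = os.environ.get("GEMINI_API_KEY") == "outer"
    return inside, after
```

The first element being False is what makes the extra os.environ.pop call redundant.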

Comment on lines +134 to +139
results = generator.generate_story_sequence(
base_prompt="A test story",
num_parts=3,
output_dir=tmp_path,
output_prefix=tmp_path / "story",
)

⚠️ Potential issue | 🟡 Minor

Semantic mismatch: output_prefix expects a bare prefix, not a full path.

Line 138 passes tmp_path / "story" as output_prefix, but according to the generator implementation (line 386 in generator.py), output_prefix is used as Path(f"{output_prefix.stem}_part{part_num}.png"). This works because only the stem is used, but it is semantically incorrect: output_prefix should be a bare prefix (e.g., Path("story")), not a full path that includes the directory.

Apply this diff to fix the semantic issue:

            results = generator.generate_story_sequence(
                base_prompt="A test story",
                num_parts=3,
                output_dir=tmp_path,
-                output_prefix=tmp_path / "story",
+                output_prefix=Path("story"),
            )
🤖 Prompt for AI Agents
In packages/gemini-image/tests/test_generator.py around lines 134 to 139, the
test passes tmp_path / "story" (a full path) as output_prefix, but the generator
reads only output_prefix.stem and takes the directory from output_dir; change
the call to pass output_prefix=Path("story") and keep output_dir=tmp_path (i.e.,
replace tmp_path / "story" with Path("story")). Do not pass a bare str, because
the generator accesses .stem on the value. Update any related assertions in the
test that assumed a path-based prefix to reflect the bare-prefix usage.
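
To see why both forms currently produce the same filename, here is a small sketch. The helper part_filename is hypothetical; it only mirrors the expression quoted from generator.py in this comment and is not the package's actual function.

```python
from pathlib import Path


def part_filename(output_prefix: Path, part_num: int) -> Path:
    """Hypothetical helper mirroring the expression quoted in the review."""
    return Path(f"{output_prefix.stem}_part{part_num}.png")


# A full path and a bare prefix yield the same name: .stem drops the directory.
from_full_path = part_filename(Path("/tmp/out/story"), 1)
from_bare_prefix = part_filename(Path("story"), 1)

# A plain string is rejected, because str has no .stem attribute.
try:
    part_filename("story", 1)  # type: ignore[arg-type]
    str_accepted = True
except AttributeError:
    str_accepted = False

print(from_full_path, from_bare_prefix, str_accepted)
```

Since the directory in the prefix never reaches the output, passing Path("story") states the intent more honestly than tmp_path / "story".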

Comment on lines +36 to +39
with patch.dict(os.environ, {}, clear=True):
# Remove GEMINI_API_KEY if it exists
os.environ.pop("GEMINI_API_KEY", None)
assert get_api_key(env_file=env_file) == "file-key-456"

🧹 Nitpick | 🔵 Trivial

Redundant os.environ.pop after clear=True.

When patch.dict(os.environ, {}, clear=True) is used, the environment is already cleared. The subsequent os.environ.pop("GEMINI_API_KEY", None) is redundant.

     def test_get_api_key_from_env_file(self, tmp_path: Path) -> None:
         """Test getting API key from .env file."""
         env_file = tmp_path / ".env"
         env_file.write_text('GEMINI_API_KEY="file-key-456"')

         with patch.dict(os.environ, {}, clear=True):
-            # Remove GEMINI_API_KEY if it exists
-            os.environ.pop("GEMINI_API_KEY", None)
             assert get_api_key(env_file=env_file) == "file-key-456"
🤖 Prompt for AI Agents
In packages/gemini-image/tests/test_utils.py around lines 36 to 39, the test
uses patch.dict(os.environ, {}, clear=True), which already clears the
environment, so the subsequent os.environ.pop("GEMINI_API_KEY", None) is
redundant; remove that pop call together with its preceding comment so the test
relies solely on the cleared environment and still asserts
get_api_key(env_file=env_file) == "file-key-456".

Comment on lines +43 to +46
with patch.dict(os.environ, {}, clear=True):
os.environ.pop("GEMINI_API_KEY", None)
with pytest.raises(ValueError, match="GEMINI_API_KEY"):
get_api_key()

🧹 Nitpick | 🔵 Trivial

Same redundancy here.

The os.environ.pop is unnecessary after clear=True.

     def test_get_api_key_missing_raises(self) -> None:
         """Test that missing API key raises ValueError."""
         with patch.dict(os.environ, {}, clear=True):
-            os.environ.pop("GEMINI_API_KEY", None)
             with pytest.raises(ValueError, match="GEMINI_API_KEY"):
                 get_api_key()
🤖 Prompt for AI Agents
In packages/gemini-image/tests/test_utils.py around lines 43 to 46, the test
uses patch.dict(os.environ, {}, clear=True) and then redundantly calls
os.environ.pop("GEMINI_API_KEY", None); remove the os.environ.pop line, since
clear=True already empties the environment, leaving only the patch.dict context
and the pytest.raises assertion to verify that get_api_key() raises a ValueError
mentioning "GEMINI_API_KEY".
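
For illustration, both test_utils.py tests can be exercised without the pop call against a stand-in. The get_api_key below is a hypothetical reimplementation written only for this sketch (environment variable first, then a .env file, else ValueError); the real function in gemini_image.utils may differ.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional
from unittest.mock import patch


def get_api_key(env_file: Optional[Path] = None) -> str:
    """Hypothetical stand-in, NOT the package's implementation."""
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        return key
    if env_file is not None and env_file.exists():
        for line in env_file.read_text().splitlines():
            name, _, value = line.partition("=")
            if name.strip() == "GEMINI_API_KEY":
                return value.strip().strip('"')
    raise ValueError("GEMINI_API_KEY not found")


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text('GEMINI_API_KEY="file-key-456"')

    # Even with a key exported in the shell, clear=True alone hides it.
    with patch.dict(os.environ, {"GEMINI_API_KEY": "shell-key"}):
        with patch.dict(os.environ, {}, clear=True):
            from_file = get_api_key(env_file=env_file)
            try:
                get_api_key()
                missing_raised = False
            except ValueError:
                missing_raised = True

print(from_file, missing_raised)
```

Under these assumptions the file-based lookup and the missing-key error both behave as the tests expect with no explicit pop.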

@williaby williaby merged commit 5592c44 into main Dec 15, 2025
23 of 33 checks passed
@williaby williaby deleted the feat/gemini-image-generation branch December 15, 2025 02:27
Labels: dependencies, documentation, python, tests
