189 changes: 189 additions & 0 deletions packages/gemini-image/README.md
@@ -0,0 +1,189 @@
# Gemini Image Generation

A comprehensive image generation library built on Google's Gemini models (Nano Banana / Nano Banana Pro).

Comment on lines +3 to +4

🧹 Nitpick | 🔵 Trivial

Clarify or remove internal codenames.

"Nano Banana / Nano Banana Pro" appears to be internal codenames that may confuse end users. Consider either removing these or adding context explaining what they refer to.

🤖 Prompt for AI Agents
In packages/gemini-image/README.md around lines 3-4, the README references
internal codenames "Nano Banana / Nano Banana Pro" which may confuse users;
either remove these codenames or add a short parenthetical explaining they are
internal model nicknames (or map them to their official public model names),
update the sentence accordingly to use the public/official model names or
include a brief clarifying phrase, and ensure the README remains clear and
user-facing.

## Features

- **Text-to-image generation** with configurable resolution and aspect ratio
- **Reference-based editing** - modify existing images with prompts
- **Multi-part story generation** - sequential images with visual continuity
- **Draft-then-finalize workflow** - 75% cost reduction during iteration
- **Thinking mode** - visualize model reasoning with intermediate images

## Installation

```bash
# Using uv (recommended)
uv add byronwilliamscpa-gemini-image

# Using pip
pip install byronwilliamscpa-gemini-image
```

## Quick Start

### Set API Key

```bash
export GEMINI_API_KEY='your-api-key'
```

### Python API

```python
from gemini_image import generate_image, generate_story_sequence

# Basic text-to-image
result = generate_image("A futuristic city at sunset")
print(f"Image saved to: {result}")

# With resolution and aspect ratio
result = generate_image(
    "A technical blueprint",
    aspect_ratio="16:9",
    image_size="2K",
    verbose=True,
)

# Draft mode for iteration (1K resolution)
draft = generate_image(
    "A data governance diagram",
    is_draft=True,
)

# Reference-based editing
from pathlib import Path
edited = generate_image(
    "Make the title larger",
    reference_images=[Path("original.png")],
)

# Multi-part story sequence
from gemini_image import generate_story_sequence

Comment on lines +61 to +63

🧹 Nitpick | 🔵 Trivial

Remove duplicate import statement.

generate_story_sequence is already imported on line 34. This duplicate import in the code example is redundant and may confuse readers.

 # Multi-part story sequence
-from gemini_image import generate_story_sequence
-
 images = generate_story_sequence(
🤖 Prompt for AI Agents
In packages/gemini-image/README.md around lines 61 to 63, the example duplicates
the import of generate_story_sequence (it's already imported on line 34); remove
the duplicate import line so the example uses the previously declared import, or
consolidate both examples to reference the single import at the top to avoid
redundancy.

images = generate_story_sequence(
    "A journey through data governance evolution",
    num_parts=3,
    aspect_ratio="16:9",
)
```

### Command Line

```bash
# Basic generation
gemini-image "A serene mountain landscape at dawn"

# With output path
gemini-image "A data governance diagram" -o governance.png

# Draft mode (faster, lower cost)
gemini-image "A technical blueprint" --draft-mode -o draft.png

# Finalize draft at higher resolution
gemini-image --finalize draft.png --size 2K -o final.png

# Reference-based editing
gemini-image "Make the building taller" -r blueprint.png

# Multi-part story
gemini-image "Evolution of a data platform" --story-parts 4 -o evolution

# Show thinking process
gemini-image "Complex blueprint design" --save-thoughts --verbose

# List available models
gemini-image --list-models
```

## Models

| Key | Model | Features |
|-----|-------|----------|
| `flash` | Gemini 2.5 Flash | Fast generation |
| `pro` | Gemini 3 Pro (default) | 4K, better text rendering, thinking mode |

## Resolution Options (Pro Model)

| Size | Dimensions (16:9) | Use Case |
|------|-------------------|----------|
| 1K | ~1408 x 768 | Draft mode, fast iteration |
| 2K | 2752 x 1536 | Standard documents |
| 4K | 5504 x 3072 | High-detail, large prints |
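
The ~75% figure quoted for the draft workflow is not derived in this README. As a rough plausibility check, the sketch below compares pixel counts for the 16:9 dimensions in the table above. It assumes cost scales roughly with pixel count, which is an assumption for illustration only and not something the package or the model pricing is known to guarantee. The names `SIZES_16_9`, `pixel_count`, and `pixel_savings` are hypothetical and not part of the library.

```python
# Pixel counts for the 16:9 sizes listed in the table above.
# Assumption: cost is roughly proportional to pixel count. Actual pricing
# is not stated in this README, so treat this as a plausibility check only.
SIZES_16_9 = {
    "1K": (1408, 768),  # approximate, per the table
    "2K": (2752, 1536),
    "4K": (5504, 3072),
}


def pixel_count(size: str) -> int:
    """Total pixels for a named size at 16:9."""
    width, height = SIZES_16_9[size]
    return width * height


def pixel_savings(draft: str, final: str) -> float:
    """Fraction of pixels saved by rendering at `draft` instead of `final`."""
    return 1 - pixel_count(draft) / pixel_count(final)


for size in SIZES_16_9:
    print(f"{size}: {pixel_count(size):,} pixels")

print(f"1K vs 2K: {pixel_savings('1K', '2K'):.1%} fewer pixels")
```

Under that assumption, a 1K draft has about 74% fewer pixels than a 2K render, which is in line with the quoted ~75%.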

## Aspect Ratios

- `1:1` - Square
- `3:4` - Portrait
- `4:3` - Standard landscape
- `9:16` - Vertical/mobile
- `16:9` - Widescreen (default)

## Draft-Then-Finalize Workflow

Reduce costs by ~75% during iteration:

```bash
# 1. Generate draft at 1K
gemini-image "A technical blueprint" --draft-mode -o draft.png

# 2. Iterate on draft
gemini-image "Add more detail to the header" -r draft.png --draft-mode -o draft_v2.png

# 3. Finalize at 2K when satisfied
gemini-image --finalize draft_v2.png --size 2K -o final.png
```
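
The same three steps can be driven from Python. The sketch below is a minimal example, not the package's own implementation: `draft_then_finalize` is a hypothetical helper that receives the generation and finalization functions as arguments, so it relies only on keyword arguments listed in the API Reference (`is_draft`, `reference_images`, `image_size`, `output_path`).

```python
from pathlib import Path
from typing import Callable, Optional, Sequence


# Hypothetical helper, not part of gemini_image. `generate` and `finalize`
# are expected to behave like generate_image() and finalize_draft() as
# documented in the API Reference (both may return None on failure).
def draft_then_finalize(
    prompt: str,
    refinements: Sequence[str],
    generate: Callable[..., Optional[Path]],
    finalize: Callable[..., Optional[Path]],
    final_size: str = "2K",
    output_path: Optional[Path] = None,
) -> Optional[Path]:
    # 1. Generate the initial draft at draft resolution.
    draft = generate(prompt, is_draft=True)
    if draft is None:
        return None

    # 2. Iterate: each refinement edits the previous draft, still in draft mode.
    for instruction in refinements:
        revised = generate(instruction, reference_images=[draft], is_draft=True)
        if revised is None:
            break  # keep the last good draft
        draft = revised

    # 3. Finalize the last good draft at the requested resolution.
    return finalize(draft, image_size=final_size, output_path=output_path)
```

With the package installed, this would be called as `draft_then_finalize("A technical blueprint", ["Add more detail to the header"], generate_image, finalize_draft)`, assuming `finalize_draft` is importable (probably from `gemini_image.generator`). Passing the functions in keeps the helper testable with stubs and avoids API calls during tests.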

## API Reference

### `generate_image()`

```python
def generate_image(
    prompt: str,
    model_key: ModelKey = "pro",
    reference_images: list[Path] | None = None,
    output_path: Path | None = None,
    output_dir: Path | None = None,
    aspect_ratio: AspectRatio | None = None,
    image_size: ImageSize | None = None,
    use_search: bool = False,
    save_thoughts: bool = False,
    verbose: bool = False,
    is_draft: bool = False,
) -> Path | None:
```

### `generate_story_sequence()`

```python
def generate_story_sequence(
    base_prompt: str,
    num_parts: int,
    model_key: ModelKey = "pro",
    output_prefix: Path | None = None,
    output_dir: Path | None = None,
    aspect_ratio: AspectRatio | None = None,
    image_size: ImageSize | None = None,
    verbose: bool = False,
) -> list[Path]:
```

### `finalize_draft()`

```python
def finalize_draft(
    draft_path: Path,
    prompt: str | None = None,
    model_key: ModelKey = "pro",
    output_path: Path | None = None,
    output_dir: Path | None = None,
    aspect_ratio: AspectRatio | None = None,
    image_size: ImageSize | None = None,
    verbose: bool = False,
) -> Path | None:
```

## License

MIT
66 changes: 66 additions & 0 deletions packages/gemini-image/pyproject.toml
@@ -0,0 +1,66 @@
[project]
name = "byronwilliamscpa-gemini-image"
version = "0.1.0"
description = "Image generation using Google Gemini models (Nano Banana / Nano Banana Pro)"
readme = "README.md"
requires-python = ">=3.10,<3.15"
license = {text = "MIT"}
authors = [
    {name = "Byron Williams", email = "byronawilliams@gmail.com"}
]
keywords = ["gemini", "image-generation", "ai", "google", "genai", "text-to-image"]
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: 3.13",
    "Topic :: Multimedia :: Graphics",
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
    "Typing :: Typed",
]

dependencies = [
    "google-genai>=1.0.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.4.0",
    "pytest-cov>=4.1.0",
    "pytest-asyncio>=0.21.0",
]

[project.scripts]
gemini-image = "gemini_image.cli:main"

[project.urls]
Homepage = "https://github.com/ByronWilliamsCPA/python-libs"
Repository = "https://github.com/ByronWilliamsCPA/python-libs"
Documentation = "https://github.com/ByronWilliamsCPA/python-libs#readme"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/gemini_image"]

# Per-package semantic release configuration
[tool.semantic_release]
version_toml = ["pyproject.toml:project.version"]
tag_format = "gemini-image-v{version}"

[tool.semantic_release.commit_parser_options]
allowed_tags = ["feat", "fix", "perf", "refactor", "docs", "style", "test", "build", "ci", "chore"]
minor_tags = ["feat"]
patch_tags = ["fix", "perf"]

[tool.semantic_release.changelog]
changelog_file = "CHANGELOG.md"

[tool.semantic_release.branches.main]
match = "(main|master)"
prerelease = false
48 changes: 48 additions & 0 deletions packages/gemini-image/src/gemini_image/__init__.py
@@ -0,0 +1,48 @@
"""Gemini Image Generation Library.

A comprehensive image generation system built on Google's Gemini models.

Features:
    - Text-to-image generation with configurable resolution and aspect ratio
    - Reference-based image editing and refinement
    - Multi-part story sequence generation with visual continuity
    - Draft-then-finalize workflow for cost optimization
    - Thinking mode with intermediate image visualization

Models:
    - flash: Gemini 2.5 Flash (fast generation)
    - pro: Gemini 3 Pro (4K, better text rendering, thinking mode)

Example:
    >>> from gemini_image import generate_image, MODELS
    >>> result = generate_image("A futuristic city at sunset")
    >>> print(f"Image saved to: {result}")

"""

from gemini_image.generator import generate_image, generate_story_sequence
from gemini_image.models import (
    ASPECT_RATIOS,
    DEFAULT_MODEL,
    IMAGE_SIZES,
    MODELS,
    AspectRatio,
    ImageSize,
    ModelConfig,
    ModelKey,
)

__all__ = [
    "ASPECT_RATIOS",
    "DEFAULT_MODEL",
    "IMAGE_SIZES",
    "MODELS",
    "AspectRatio",
    "ImageSize",
    "ModelConfig",
    "ModelKey",
    "generate_image",
    "generate_story_sequence",
]
Comment on lines +23 to +46

⚠️ Potential issue | 🟡 Minor

Missing finalize_draft export.

The README.md documents finalize_draft() in the API Reference section (lines 172-185), but this function is not exported from __init__.py. Users following the documentation won't be able to import it from the package namespace.

Apply this diff to export finalize_draft:

-from gemini_image.generator import generate_image, generate_story_sequence
+from gemini_image.generator import finalize_draft, generate_image, generate_story_sequence

And update __all__:

 __all__ = [
     "ASPECT_RATIOS",
     "DEFAULT_MODEL",
     "IMAGE_SIZES",
     "MODELS",
     "AspectRatio",
     "ImageSize",
     "ModelConfig",
     "ModelKey",
+    "finalize_draft",
     "generate_image",
     "generate_story_sequence",
 ]
🤖 Prompt for AI Agents
In packages/gemini-image/src/gemini_image/__init__.py around lines 23 to 46, the
function finalize_draft is documented in the README but not exported from the
package; import finalize_draft from its module (likely from
gemini_image.generator or the correct module where it's defined) at the top of
the file and add "finalize_draft" to the __all__ list so it is available from
the package namespace for users following the docs.
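
A small regression check could catch this kind of drift between the documented API and the package exports. The sketch below is a minimal example under stated assumptions: `missing_exports` is a hypothetical helper, and a stand-in module built with `types.ModuleType` replaces the real `gemini_image` package so the snippet runs on its own.

```python
import types


def missing_exports(module: types.ModuleType, documented: list[str]) -> list[str]:
    """Return documented names absent from the module's __all__ or namespace."""
    exported = set(getattr(module, "__all__", []))
    return [
        name
        for name in documented
        if name not in exported or not hasattr(module, name)
    ]


# Stand-in for gemini_image as it appears in this diff (illustration only).
fake_pkg = types.ModuleType("gemini_image")
fake_pkg.generate_image = lambda *args, **kwargs: None
fake_pkg.generate_story_sequence = lambda *args, **kwargs: []
fake_pkg.__all__ = ["generate_image", "generate_story_sequence"]

# Names documented in the README's API Reference section.
documented_api = ["generate_image", "generate_story_sequence", "finalize_draft"]

print(missing_exports(fake_pkg, documented_api))  # → ['finalize_draft']
```

In a real test, `fake_pkg` would be replaced by `import gemini_image`, and the assertion would be that the returned list is empty.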


__version__ = "0.1.0"