feat(gemini-image): add Gemini image generation package #3
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,189 @@ | ||
| # Gemini Image Generation | ||
|
|
||
| A comprehensive image generation library built on Google's Gemini models (Nano Banana / Nano Banana Pro). | ||
|
|
||
| ## Features | ||
|
|
||
| - **Text-to-image generation** with configurable resolution and aspect ratio | ||
| - **Reference-based editing** - modify existing images with prompts | ||
| - **Multi-part story generation** - sequential images with visual continuity | ||
| - **Draft-then-finalize workflow** - 75% cost reduction during iteration | ||
| - **Thinking mode** - visualize model reasoning with intermediate images | ||
|
|
||
| ## Installation | ||
|
|
||
| ```bash | ||
| # Using uv (recommended) | ||
| uv add byronwilliamscpa-gemini-image | ||
|
|
||
| # Using pip | ||
| pip install byronwilliamscpa-gemini-image | ||
| ``` | ||
|
|
||
| ## Quick Start | ||
|
|
||
| ### Set API Key | ||
|
|
||
| ```bash | ||
| export GEMINI_API_KEY='your-api-key' | ||
| ``` | ||
|
|
||
| ### Python API | ||
|
|
||
| ```python | ||
| from gemini_image import generate_image, generate_story_sequence | ||
|
|
||
| # Basic text-to-image | ||
| result = generate_image("A futuristic city at sunset") | ||
| print(f"Image saved to: {result}") | ||
|
|
||
| # With resolution and aspect ratio | ||
| result = generate_image( | ||
| "A technical blueprint", | ||
| aspect_ratio="16:9", | ||
| image_size="2K", | ||
| verbose=True, | ||
| ) | ||
|
|
||
| # Draft mode for iteration (1K resolution) | ||
| draft = generate_image( | ||
| "A data governance diagram", | ||
| is_draft=True, | ||
| ) | ||
|
|
||
| # Reference-based editing | ||
| from pathlib import Path | ||
| edited = generate_image( | ||
| "Make the title larger", | ||
| reference_images=[Path("original.png")], | ||
| ) | ||
|
|
||
| # Multi-part story sequence | ||
| from gemini_image import generate_story_sequence | ||
|
|
||
|
Comment on lines +61 to +63
🧹 Nitpick | 🔵 Trivial
Remove duplicate import statement.
 # Multi-part story sequence
-from gemini_image import generate_story_sequence
-
 images = generate_story_sequence(
|
||
| images = generate_story_sequence( | ||
| "A journey through data governance evolution", | ||
| num_parts=3, | ||
| aspect_ratio="16:9", | ||
| ) | ||
| ``` | ||
|
|
||
| ### Command Line | ||
|
|
||
| ```bash | ||
| # Basic generation | ||
| gemini-image "A serene mountain landscape at dawn" | ||
|
|
||
| # With output path | ||
| gemini-image "A data governance diagram" -o governance.png | ||
|
|
||
| # Draft mode (faster, lower cost) | ||
| gemini-image "A technical blueprint" --draft-mode -o draft.png | ||
|
|
||
| # Finalize draft at higher resolution | ||
| gemini-image --finalize draft.png --size 2K -o final.png | ||
|
|
||
| # Reference-based editing | ||
| gemini-image "Make the building taller" -r blueprint.png | ||
|
|
||
| # Multi-part story | ||
| gemini-image "Evolution of a data platform" --story-parts 4 -o evolution | ||
|
|
||
| # Show thinking process | ||
| gemini-image "Complex blueprint design" --save-thoughts --verbose | ||
|
|
||
| # List available models | ||
| gemini-image --list-models | ||
| ``` | ||
|
|
||
| ## Models | ||
|
|
||
| | Key | Model | Features | | ||
| |-----|-------|----------| | ||
| | `flash` | Gemini 2.5 Flash | Fast generation | | ||
| | `pro` | Gemini 3 Pro (default) | 4K, better text rendering, thinking mode | | ||
|
|
||
| ## Resolution Options (Pro Model) | ||
|
|
||
| | Size | Dimensions (16:9) | Use Case | | ||
| |------|-------------------|----------| | ||
| | 1K | ~1408 x 768 | Draft mode, fast iteration | | ||
| | 2K | 2752 x 1536 | Standard documents | | ||
| | 4K | 5504 x 3072 | High-detail, large prints | | ||
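A quick arithmetic check on the sizes in this table, as a sketch only. It assumes cost scales roughly with pixel count, which the README does not state. Under that assumption, the 1K draft size carries about a quarter of the pixels of 2K, which is in line with the ~75% savings quoted for draft mode. `SIZES_16_9`, `pixel_count`, and `pixel_savings` are illustrative names, not part of the package.

```python
# Dimensions copied from the 16:9 column of the resolution table.
SIZES_16_9 = {
    "1K": (1408, 768),   # the README marks this size as approximate (~)
    "2K": (2752, 1536),
    "4K": (5504, 3072),
}


def pixel_count(size: str) -> int:
    """Total pixels for one of the listed 16:9 sizes."""
    width, height = SIZES_16_9[size]
    return width * height


def pixel_savings(draft: str, final: str) -> float:
    """Fraction of pixels avoided by rendering `draft` instead of `final`."""
    return 1 - pixel_count(draft) / pixel_count(final)


if __name__ == "__main__":
    for final in ("2K", "4K"):
        print(f"1K vs {final}: {pixel_savings('1K', final):.1%} fewer pixels")
```

Pixel count is only a proxy here; actual billing may depend on the model and on token accounting rather than on output dimensions.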
|
|
||
| ## Aspect Ratios | ||
|
|
||
| - `1:1` - Square | ||
| - `3:4` - Portrait | ||
| - `4:3` - Standard landscape | ||
| - `9:16` - Vertical/mobile | ||
| - `16:9` - Widescreen (default) | ||
|
|
||
| ## Draft-Then-Finalize Workflow | ||
|
|
||
| Reduce costs by ~75% during iteration: | ||
|
|
||
| ```bash | ||
| # 1. Generate draft at 1K | ||
| gemini-image "A technical blueprint" --draft-mode -o draft.png | ||
|
|
||
| # 2. Iterate on draft | ||
| gemini-image "Add more detail to the header" -r draft.png --draft-mode -o draft_v2.png | ||
|
|
||
| # 3. Finalize at 2K when satisfied | ||
| gemini-image --finalize draft_v2.png --size 2K -o final.png | ||
| ``` | ||
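The same three steps could be driven from Python. The sketch below is hypothetical: `draft_then_finalize` is not part of the package, and the two callables are injected so the control flow can be exercised without the library or an API key. The keyword arguments it passes (`is_draft`, `reference_images`, `image_size`) are taken from the signatures in the API Reference section.

```python
from pathlib import Path
from typing import Callable, Optional, Sequence

# Callables shaped like the documented generate_image() and finalize_draft();
# both are documented as returning a Path, or None on failure.
GenerateFn = Callable[..., Optional[Path]]
FinalizeFn = Callable[..., Optional[Path]]


def draft_then_finalize(
    generate: GenerateFn,
    finalize: FinalizeFn,
    prompt: str,
    revisions: Sequence[str] = (),
    final_size: str = "2K",
) -> Optional[Path]:
    """Draft at low resolution, apply revisions, then finalize once."""
    # 1. Generate the first draft.
    draft = generate(prompt, is_draft=True)

    # 2. Iterate, feeding each draft back in as the reference image.
    for revision in revisions:
        if draft is None:
            return None
        draft = generate(revision, reference_images=[draft], is_draft=True)

    if draft is None:
        return None

    # 3. Finalize the last draft at the requested size.
    return finalize(draft, image_size=final_size)
```

With the real package, `generate_image` and `finalize_draft` would be passed as the two callables. In this PR `finalize_draft` is not exported from the package root, so it would presumably need to be imported from `gemini_image.generator`.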
|
|
||
| ## API Reference | ||
|
|
||
| ### `generate_image()` | ||
|
|
||
| ```python | ||
| def generate_image( | ||
| prompt: str, | ||
| model_key: ModelKey = "pro", | ||
| reference_images: list[Path] | None = None, | ||
| output_path: Path | None = None, | ||
| output_dir: Path | None = None, | ||
| aspect_ratio: AspectRatio | None = None, | ||
| image_size: ImageSize | None = None, | ||
| use_search: bool = False, | ||
| save_thoughts: bool = False, | ||
| verbose: bool = False, | ||
| is_draft: bool = False, | ||
| ) -> Path | None: | ||
| ``` | ||
|
|
||
| ### `generate_story_sequence()` | ||
|
|
||
| ```python | ||
| def generate_story_sequence( | ||
| base_prompt: str, | ||
| num_parts: int, | ||
| model_key: ModelKey = "pro", | ||
| output_prefix: Path | None = None, | ||
| output_dir: Path | None = None, | ||
| aspect_ratio: AspectRatio | None = None, | ||
| image_size: ImageSize | None = None, | ||
| verbose: bool = False, | ||
| ) -> list[Path]: | ||
| ``` | ||
|
|
||
| ### `finalize_draft()` | ||
|
|
||
| ```python | ||
| def finalize_draft( | ||
| draft_path: Path, | ||
| prompt: str | None = None, | ||
| model_key: ModelKey = "pro", | ||
| output_path: Path | None = None, | ||
| output_dir: Path | None = None, | ||
| aspect_ratio: AspectRatio | None = None, | ||
| image_size: ImageSize | None = None, | ||
| verbose: bool = False, | ||
| ) -> Path | None: | ||
| ``` | ||
|
|
||
| ## License | ||
|
|
||
| MIT | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,66 @@ | ||
| [project] | ||
| name = "byronwilliamscpa-gemini-image" | ||
| version = "0.1.0" | ||
| description = "Image generation using Google Gemini models (Nano Banana / Nano Banana Pro)" | ||
| readme = "README.md" | ||
| requires-python = ">=3.10,<3.15" | ||
| license = {text = "MIT"} | ||
| authors = [ | ||
| {name = "Byron Williams", email = "byronawilliams@gmail.com"} | ||
| ] | ||
| keywords = ["gemini", "image-generation", "ai", "google", "genai", "text-to-image"] | ||
| classifiers = [ | ||
| "Development Status :: 4 - Beta", | ||
| "Intended Audience :: Developers", | ||
| "License :: OSI Approved :: MIT License", | ||
| "Programming Language :: Python :: 3.10", | ||
| "Programming Language :: Python :: 3.11", | ||
| "Programming Language :: Python :: 3.12", | ||
| "Programming Language :: Python :: 3.13", | ||
| "Topic :: Multimedia :: Graphics", | ||
| "Topic :: Scientific/Engineering :: Artificial Intelligence", | ||
| "Typing :: Typed", | ||
| ] | ||
|
|
||
| dependencies = [ | ||
| "google-genai>=1.0.0", | ||
| ] | ||
|
|
||
| [project.optional-dependencies] | ||
| dev = [ | ||
| "pytest>=7.4.0", | ||
| "pytest-cov>=4.1.0", | ||
| "pytest-asyncio>=0.21.0", | ||
| ] | ||
|
|
||
| [project.scripts] | ||
| gemini-image = "gemini_image.cli:main" | ||
|
|
||
| [project.urls] | ||
| Homepage = "https://github.com/ByronWilliamsCPA/python-libs" | ||
| Repository = "https://github.com/ByronWilliamsCPA/python-libs" | ||
| Documentation = "https://github.com/ByronWilliamsCPA/python-libs#readme" | ||
|
|
||
| [build-system] | ||
| requires = ["hatchling"] | ||
| build-backend = "hatchling.build" | ||
|
|
||
| [tool.hatch.build.targets.wheel] | ||
| packages = ["src/gemini_image"] | ||
|
|
||
| # Per-package semantic release configuration | ||
| [tool.semantic_release] | ||
| version_toml = ["pyproject.toml:project.version"] | ||
| tag_format = "gemini-image-v{version}" | ||
|
|
||
| [tool.semantic_release.commit_parser_options] | ||
| allowed_tags = ["feat", "fix", "perf", "refactor", "docs", "style", "test", "build", "ci", "chore"] | ||
| minor_tags = ["feat"] | ||
| patch_tags = ["fix", "perf"] | ||
|
|
||
| [tool.semantic_release.changelog] | ||
| changelog_file = "CHANGELOG.md" | ||
|
|
||
| [tool.semantic_release.branches.main] | ||
| match = "(main|master)" | ||
| prerelease = false |
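To illustrate what the release settings above imply, here is a simplified sketch of how a commit type could map to a version bump and a tag. It is an approximation written for this review, not python-semantic-release's actual implementation, and it ignores breaking-change markers. `bump_level` and `next_version` are hypothetical helper names.

```python
import re

# Values copied from the [tool.semantic_release] tables in pyproject.toml.
TAG_FORMAT = "gemini-image-v{version}"
MINOR_TAGS = {"feat"}
PATCH_TAGS = {"fix", "perf"}

# Conventional-commit subject: a type, an optional "(scope)", then ": ".
_HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?: ")


def bump_level(subject: str) -> str:
    """Return "minor", "patch", or "none" for a commit subject line."""
    match = _HEADER.match(subject)
    commit_type = match["type"] if match else ""
    if commit_type in MINOR_TAGS:
        return "minor"
    if commit_type in PATCH_TAGS:
        return "patch"
    return "none"


def next_version(current: str, level: str) -> str:
    """Apply a bump level to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in current.split("."))
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current


if __name__ == "__main__":
    subject = "feat(gemini-image): add Gemini image generation package"
    version = next_version("0.1.0", bump_level(subject))
    print(TAG_FORMAT.format(version=version))
```

Types such as `docs` or `chore` are in `allowed_tags` but in neither bump list, so under this reading they would not trigger a release.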
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| """Gemini Image Generation Library. | ||
|
|
||
| A comprehensive image generation system built on Google's Gemini models. | ||
|
|
||
| Features: | ||
| - Text-to-image generation with configurable resolution and aspect ratio | ||
| - Reference-based image editing and refinement | ||
| - Multi-part story sequence generation with visual continuity | ||
| - Draft-then-finalize workflow for cost optimization | ||
| - Thinking mode with intermediate image visualization | ||
|
|
||
| Models: | ||
| - flash: Gemini 2.5 Flash (fast generation) | ||
| - pro: Gemini 3 Pro (4K, better text rendering, thinking mode) | ||
|
|
||
| Example: | ||
| >>> from gemini_image import generate_image, MODELS | ||
| >>> result = generate_image("A futuristic city at sunset") | ||
| >>> print(f"Image saved to: {result}") | ||
|
|
||
| """ | ||
|
|
||
| from gemini_image.generator import generate_image, generate_story_sequence | ||
| from gemini_image.models import ( | ||
| ASPECT_RATIOS, | ||
| DEFAULT_MODEL, | ||
| IMAGE_SIZES, | ||
| MODELS, | ||
| AspectRatio, | ||
| ImageSize, | ||
| ModelConfig, | ||
| ModelKey, | ||
| ) | ||
|
|
||
| __all__ = [ | ||
| "ASPECT_RATIOS", | ||
| "DEFAULT_MODEL", | ||
| "IMAGE_SIZES", | ||
| "MODELS", | ||
| "AspectRatio", | ||
| "ImageSize", | ||
| "ModelConfig", | ||
| "ModelKey", | ||
| "generate_image", | ||
| "generate_story_sequence", | ||
| ] | ||
|
Comment on lines +23 to +46
Missing `finalize_draft` export. The README.md documents `finalize_draft()`, but this module does not export it. Apply this diff to export it:
-from gemini_image.generator import generate_image, generate_story_sequence
+from gemini_image.generator import finalize_draft, generate_image, generate_story_sequence
And update `__all__`:
 __all__ = [
     "ASPECT_RATIOS",
     "DEFAULT_MODEL",
     "IMAGE_SIZES",
     "MODELS",
     "AspectRatio",
     "ImageSize",
     "ModelConfig",
     "ModelKey",
+    "finalize_draft",
     "generate_image",
     "generate_story_sequence",
 ]
|
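A mismatch like this between documented and exported names could be caught by a small check in the test suite. The sketch below is one possible approach, not something in this PR: `missing_exports` is a hypothetical helper, and the demo uses a stand-in module so that it runs without the package installed.

```python
import types


def missing_exports(module: types.ModuleType, documented: set[str]) -> set[str]:
    """Return documented names that are absent from the module's __all__."""
    exported = set(getattr(module, "__all__", ()))
    return documented - exported


if __name__ == "__main__":
    # Stand-in mirroring the function exports of __init__.py in this PR.
    demo = types.ModuleType("demo_pkg")
    demo.__all__ = ["generate_image", "generate_story_sequence"]

    # The three functions listed under "API Reference" in the README.
    documented = {"generate_image", "generate_story_sequence", "finalize_draft"}
    print(sorted(missing_exports(demo, documented)))
```

In a real test, `module` would be the imported `gemini_image` package and the assertion would be that the returned set is empty.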
||
|
|
||
| __version__ = "0.1.0" | ||
🧹 Nitpick | 🔵 Trivial
Clarify or remove internal codenames.
"Nano Banana / Nano Banana Pro" appears to be internal codenames that may confuse end users. Consider either removing these or adding context explaining what they refer to.