Gemini Image Generation API

A Next.js application that uses Google's experimental Gemini 2.0 Flash model to generate and edit images through a RESTful API. It runs standalone or via Docker Compose and includes persistent storage for images and metadata, interactive API documentation with Swagger UI, and helper scripts for management. The API exposes endpoints for generating images from text prompts, editing images with natural-language instructions, and health monitoring, so developers can integrate AI-powered image features into their own applications or services.

Features • Demo • Quick Start • API • Use Cases • Docker

Demo

Portrait & People Photography

Generate professional-quality portraits and people photography for business, social media, or marketing.

Prompt: "Professional portrait photo of a business person in a modern office setting, wearing a suit, photorealistic, 8k resolution, studio lighting"

Product Visualization & Editing

Create product images and easily modify product attributes with simple text instructions.

Original Product Image	Edited Product Image
Prompt: "Modern smartphone with sleek design on a white background, product photography, 8k resolution, studio lighting"	Edit Prompt: "Change the smartphone color to blue and add a holographic display showing a 3D map"

Key Features

✨ High-Quality Image Generation - Create photorealistic images from text prompts
🎨 Image Editing - Modify existing images with natural language instructions
📊 Metadata Tracking - Store and retrieve image generation metadata
📝 Interactive API Documentation - Explore the API with Swagger UI
🔄 Persistent Storage - Images and metadata are saved between sessions
🚀 Fast Response Times - Optimized for quick image generation
🔍 Health Monitoring - API health check endpoint for monitoring
⚙️ Configurable - Easily adjust settings via environment variables

Prerequisites

Node.js 18+ installed
Google Cloud account with Gemini API access
Gemini API key

Quick Start

Clone the repository

```
git clone https://github.com/jkmaina/gemini-image-generator.git
cd gemini-image-generator
```

Install dependencies
```
npm install
```
Set up environment variables
```
cp .env.example .env.local
```
Edit .env.local and add your Gemini API key:
```
GEMINI_API_KEY=your_api_key_here
```
Start the development server
```
npm run dev
```
Access the application
- Web Interface: http://localhost:3010
- API Documentation: http://localhost:3010/docs
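
A missing or placeholder API key is an easy setup mistake to make in the steps above. The sketch below shows one way such a check could look. `loadConfig` and `AppConfig` are hypothetical names for illustration and are not taken from this project's code.

```typescript
// Hypothetical helper: the project's real configuration loading may differ.
type Env = Record<string, string | undefined>;

interface AppConfig {
  geminiApiKey: string;
}

function loadConfig(env: Env): AppConfig {
  const key = (env.GEMINI_API_KEY ?? "").trim();
  // Reject a missing key and the placeholder value from the example file.
  if (key === "" || key === "your_api_key_here") {
    throw new Error("GEMINI_API_KEY is not set; add it to .env.local");
  }
  return { geminiApiKey: key };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`, so a misconfiguration fails early instead of on the first image request.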

API Reference

Endpoints

Endpoint	Method	Description
`/api/generate`	POST	Generate an image from a text prompt
`/api/edit`	POST	Edit an existing image with instructions
`/api/health`	GET	Check API health status
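
As a rough illustration, the table above could be mirrored in a small typed map on the client side. The paths and methods come from the table. `ENDPOINTS`, `endpointUrl`, and `isHealthy` are hypothetical names, and because the health response body is not described here, the sketch only looks at the HTTP status.

```typescript
// Endpoint paths and methods as listed in the table above.
const ENDPOINTS = {
  generate: { path: "/api/generate", method: "POST" },
  edit: { path: "/api/edit", method: "POST" },
  health: { path: "/api/health", method: "GET" },
} as const;

type EndpointName = keyof typeof ENDPOINTS;

// Join a base URL and an endpoint path without doubling the slash.
function endpointUrl(baseUrl: string, name: EndpointName): string {
  return baseUrl.replace(/\/+$/, "") + ENDPOINTS[name].path;
}

// Resolves to true when the health endpoint answers with a 2xx status.
// The response body is ignored because its format is not documented here.
async function isHealthy(baseUrl: string): Promise<boolean> {
  try {
    const res = await fetch(endpointUrl(baseUrl, "health"), {
      method: ENDPOINTS.health.method,
    });
    return res.ok;
  } catch {
    return false; // network error: treat the service as unhealthy
  }
}
```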

Using the Web Interface

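Open http://localhost:3000 in your browser (if your server reports a different port, such as the 3010 listed in Quick Start, use that instead)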
Enter a text prompt describing the image you want to generate
Click "Generate" and wait for your image
For editing, select an existing image and provide editing instructions

Using the API Directly

Generate a new image (adjust the port if your server listens on a different one):

```
curl -X POST http://localhost:3000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a professional headshot of a business executive"}'
```

Edit an existing image:

```
curl -X POST http://localhost:3000/api/edit \
  -H "Content-Type: application/json" \
  -d '{
    "imageUrl": "/generated-images/your-image-filename.png",
    "prompt": "change the background to a city skyline"
  }'
```
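
The same two requests could be made from TypeScript with `fetch`. This is a minimal sketch: the request bodies match the curl examples above, but the function names are hypothetical, and the response is returned as `unknown` on the assumption that the API answers with JSON whose fields are not documented in this section.

```typescript
interface GenerateRequest {
  prompt: string;
}

interface EditRequest {
  imageUrl: string;
  prompt: string;
}

// Build the fetch options shared by both POST endpoints.
function jsonPost(body: GenerateRequest | EditRequest) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

async function postJson(url: string, body: GenerateRequest | EditRequest): Promise<unknown> {
  const res = await fetch(url, jsonPost(body));
  if (!res.ok) {
    throw new Error(`Request to ${url} failed with status ${res.status}`);
  }
  // Assumption: the response is JSON; its shape is not specified here.
  return res.json();
}

function generateImage(baseUrl: string, prompt: string): Promise<unknown> {
  return postJson(`${baseUrl}/api/generate`, { prompt });
}

function editImage(baseUrl: string, imageUrl: string, prompt: string): Promise<unknown> {
  return postJson(`${baseUrl}/api/edit`, { imageUrl, prompt });
}
```

For example, `generateImage("http://localhost:3000", "a professional headshot of a business executive")` sends the same payload as the first curl command.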

Use Cases

Professional Portraits: Generate headshots and professional photos
Product Photography: Create and edit product images for e-commerce
Marketing Materials: Design visual content for marketing campaigns
Social Media Content: Generate engaging visuals for social platforms
UI/UX Prototyping: Create interface mockups and design elements
Real Estate Visualization: Generate or edit property images
Fashion & Apparel: Visualize clothing items with different styles and colors

Docker Support (Optional)

If you prefer using Docker:

```
# Start the application
docker-compose up -d

# Stop the application
docker-compose down
```

Project Structure

```
├── app/                  # Next.js application code
│   ├── api/              # API routes
│   └── docs/             # API documentation
├── public/               # Static files
│   └── generated-images  # Generated images storage
├── lib/                  # Utility functions
└── data/                 # Metadata storage
```

Technical Stack

Frontend/Backend: Next.js 14
API Documentation: Swagger UI / OpenAPI
Containerization: Docker with Docker Compose (optional)
Image Generation: Google Gemini 2.0 Flash API
Storage: File-based with JSON metadata
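
To make "file-based with JSON metadata" more concrete, here is a sketch of what a per-image metadata record could look like. The record shape and every name below are hypothetical. The fields the project actually stores are not listed in this README.

```typescript
// Hypothetical record shape, for illustration only.
interface ImageMetadata {
  filename: string;
  prompt: string;
  createdAt: string; // ISO 8601 timestamp
}

function buildMetadata(filename: string, prompt: string, now: Date): ImageMetadata {
  return { filename, prompt, createdAt: now.toISOString() };
}

// Derive the JSON file name that pairs with an image, e.g. "cat.png" -> "cat.json".
function metadataFileName(imageFilename: string): string {
  return imageFilename.replace(/\.[^./]+$/, "") + ".json";
}

// Pretty-print the record so the stored file stays easy to read and diff.
function serializeMetadata(meta: ImageMetadata): string {
  return JSON.stringify(meta, null, 2);
}
```

Keeping one small JSON file per image avoids needing a database, at the cost of a directory scan when listing many images.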

Limitations

The Gemini 2.0 Flash model is experimental and may produce inconsistent results
Image generation quality depends on the clarity and specificity of prompts
API rate limits apply based on your Google Cloud account tier
Large batch processing may require additional optimization
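
Because rate limits apply, callers that send many requests may want to retry with exponential backoff. The following is a generic sketch, not part of this project. `withRetry` and `backoffDelayMs` are hypothetical helpers, and the delay values are arbitrary defaults.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Run `task`, retrying failures that `shouldRetry` accepts, up to `maxAttempts` tries in total.
async function withRetry<T>(
  task: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  maxAttempts = 4,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Give up when out of attempts or when the error is not retryable.
      if (attempt + 1 >= maxAttempts || !shouldRetry(err)) throw err;
      await new Promise<void>((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

A caller would pass a `shouldRetry` predicate that recognizes whatever error its request code raises for a rate-limited response. How this API signals rate limiting is not specified in this section.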

Contributing

Fork the repository
Create your feature branch (`git checkout -b feature/amazing-feature`)
Commit your changes (`git commit -m 'Add some amazing feature'`)
Push to the branch (`git push origin feature/amazing-feature`)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

If you find this project helpful, please give it a ⭐️ on GitHub!

Made with ❤️ by jkmaina
