A Next.js application that uses Google's experimental Gemini 2.0 Flash model to generate and edit images through a REST API. It runs standalone or under Docker Compose and includes persistent storage for images and metadata, interactive API documentation with Swagger UI, and helper scripts for management. The API exposes endpoints for generating images from text prompts, editing images with natural language instructions, and checking service health, so other applications and services can integrate AI-powered image capabilities.
Generate professional-quality portraits and people photography for business, social media, or marketing.
Prompt: "Professional portrait photo of a business person in a modern office setting, wearing a suit, photorealistic, 8k resolution, studio lighting"
Create product images and easily modify product attributes with simple text instructions.
- ✨ High-Quality Image Generation - Create photorealistic images from text prompts
- 🎨 Image Editing - Modify existing images with natural language instructions
- 📊 Metadata Tracking - Store and retrieve image generation metadata
- 📝 Interactive API Documentation - Explore the API with Swagger UI
- 🔄 Persistent Storage - Images and metadata are saved between sessions
- 🚀 Fast Response Times - Optimized for quick image generation
- 🔍 Health Monitoring - API health check endpoint for monitoring
- ⚙️ Configurable - Easily adjust settings via environment variables
- Node.js 18+ installed
- Google Cloud account with Gemini API access
- Gemini API key
- Clone the repository
    git clone https://github.com/jkmaina/gemini-image-generator.git
    cd gemini-image-generator
- Install dependencies
    npm install
- Set up environment variables
    cp .env.example .env.local
  Edit .env.local and add your Gemini API key:
    GEMINI_API_KEY=your_api_key_here
- Start the development server
    npm run dev
- Access the application
  - Web Interface: http://localhost:3010
  - API Documentation: http://localhost:3010/docs
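A missing or placeholder API key is an easy setup mistake to make. The sketch below shows one way to check the contents of .env.local before starting the server. The helper names `parseEnv` and `hasUsableApiKey` are hypothetical and not part of this project, and the parser handles only plain KEY=value lines.

```typescript
// Hypothetical pre-flight check; these helpers are illustrative and not part of the project.
// Parses simple KEY=value lines, skipping blank lines and `#` comments.
function parseEnv(text: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    env[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return env;
}

// True only when GEMINI_API_KEY is set and is not the placeholder from .env.example.
function hasUsableApiKey(env: Record<string, string>): boolean {
  const key = env["GEMINI_API_KEY"];
  return key !== undefined && key !== "" && key !== "your_api_key_here";
}
```

To use it, read .env.local as text (for example with Node's `fs.readFileSync`) and pass the contents to `parseEnv`.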
Endpoint | Method | Description |
---|---|---|
/api/generate | POST | Generate an image from a text prompt |
/api/edit | POST | Edit an existing image with instructions |
/api/health | GET | Check API health status |
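The health endpoint can be polled from a monitoring script. The sketch below is one minimal way to do that, assuming only that /api/health answers a GET with a success status when the service is up. The response body is not documented here, so it is ignored. The names `healthUrl` and `isHealthy` and the 5-second timeout are illustrative choices.

```typescript
// Hypothetical helpers; only the /api/health path comes from the endpoint table.
// Builds the health URL, tolerating a trailing slash on the base URL.
function healthUrl(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/api/health`;
}

// Resolves to true when the endpoint answers with a 2xx status within the timeout.
async function isHealthy(baseUrl: string, timeoutMs = 5000): Promise<boolean> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(healthUrl(baseUrl), { signal: controller.signal });
    return response.ok;
  } catch {
    // Network errors and timeouts both count as "not healthy".
    return false;
  } finally {
    clearTimeout(timer);
  }
}
```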
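- Open the web interface in your browser: http://localhost:3000, or whichever port your server reports at startup (the setup steps above use 3010)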
- Enter a text prompt describing the image you want to generate
- Click "Generate" and wait for your image
- For editing, select an existing image and provide editing instructions
Generate a new image:
curl -X POST http://localhost:3000/api/generate \
-H "Content-Type: application/json" \
-d '{"prompt": "a professional headshot of a business executive"}'
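The same request can be sent from application code. The sketch below mirrors the curl command using the built-in `fetch` available in Node.js 18+. The function names and the default base URL (taken from the curl example) are assumptions, and because the response schema is not described in this README, the result is returned as `unknown`.

```typescript
// Hypothetical client helpers; only the endpoint path and the `prompt` field come from the curl example.
interface GenerateRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the request without sending it, so it can be inspected or logged.
function buildGenerateRequest(prompt: string, baseUrl = "http://localhost:3000"): GenerateRequest {
  if (prompt.trim() === "") {
    throw new Error("prompt must not be empty");
  }
  return {
    url: `${baseUrl}/api/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt }),
    },
  };
}

// Sends the request and returns the parsed JSON body as-is.
async function generateImage(prompt: string, baseUrl?: string): Promise<unknown> {
  const { url, init } = buildGenerateRequest(prompt, baseUrl);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Generate request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}
```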
Edit an existing image:
curl -X POST http://localhost:3000/api/edit \
-H "Content-Type: application/json" \
-d '{
"imageUrl": "/generated-images/your-image-filename.png",
"prompt": "change the background to a city skyline"
}'
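An equivalent of the edit request in TypeScript might look like the sketch below. The `imageUrl` and `prompt` fields come from the curl example. The function names, the default base URL, and the untyped return value are assumptions, since the response format is not documented here.

```typescript
// Hypothetical client helpers for /api/edit; field names are taken from the curl example.
interface EditPayload {
  imageUrl: string; // path of a previously generated image, e.g. under /generated-images/
  prompt: string;   // natural language editing instruction
}

// Builds the request without sending it.
function buildEditRequest(payload: EditPayload, baseUrl = "http://localhost:3000") {
  if (payload.imageUrl.trim() === "" || payload.prompt.trim() === "") {
    throw new Error("imageUrl and prompt are both required");
  }
  return {
    url: `${baseUrl}/api/edit`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

// Sends the edit request and returns the parsed JSON body as-is.
async function editImage(payload: EditPayload, baseUrl?: string): Promise<unknown> {
  const { url, init } = buildEditRequest(payload, baseUrl);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Edit request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}
```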
- Professional Portraits: Generate headshots and professional photos
- Product Photography: Create and edit product images for e-commerce
- Marketing Materials: Design visual content for marketing campaigns
- Social Media Content: Generate engaging visuals for social platforms
- UI/UX Prototyping: Create interface mockups and design elements
- Real Estate Visualization: Generate or edit property images
- Fashion & Apparel: Visualize clothing items with different styles and colors
If you prefer using Docker:
# Start the application
docker-compose up -d
# Stop the application
docker-compose down
├── app/ # Next.js application code
│ ├── api/ # API routes
│ └── docs/ # API documentation
├── public/ # Static files
│ └── generated-images/ # Generated images storage
├── lib/ # Utility functions
└── data/ # Metadata storage
- Frontend/Backend: Next.js 14
- API Documentation: Swagger UI / OpenAPI
- Containerization: Docker with Docker Compose (optional)
- Image Generation: Google Gemini 2.0 Flash API
- Storage: File-based with JSON metadata
- The Gemini 2.0 Flash model is experimental and may produce inconsistent results
- Image generation quality depends on the clarity and specificity of prompts
- API rate limits apply based on your Google Cloud account tier
- Large batch processing may require additional optimization
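When rate limits or inconsistent results cause occasional failures, a client can retry with exponential backoff. The sketch below is a generic pattern and not something this project ships. The names, the attempt count, and the delay values are illustrative assumptions. It also retries on every error, whereas a real client would likely retry only on failures it knows to be transient.

```typescript
// Hypothetical retry helper; all names and numeric defaults are illustrative.
// Delay before the retry that follows attempt number `attempt` (0-based):
// baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Runs `task`, retrying on any thrown error until maxAttempts is reached.
async function withRetry<T>(task: () => Promise<T>, maxAttempts = 4): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) {
        await new Promise<void>((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
      }
    }
  }
  throw lastError;
}
```

Any promise-returning call can be wrapped this way, for example a function that posts a prompt to /api/generate.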
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
If you find this project helpful, please give it a ⭐️ on GitHub!