Generate unique robot avatars for programming languages and concepts using AI image generation models.
- Dual Model Support: Choose between OpenAI GPT-4o and Google Gemini 2.5 Flash Image (Nano Banana)
- Smart Research: Both models research the concept before generating to ensure accurate representation
- Reference-Based Generation: Uses existing robot designs as style references for consistency
- Intelligent Caching: Reuses existing images to avoid unnecessary API costs
- Cost Tracking: Real-time cost tracking for both models with session summaries
- Gallery View: Browse all generated robots in a visual gallery
- Model Testing: Built-in test functionality to verify API connectivity
Feature | OpenAI GPT-4o | Google Gemini 2.5 Flash Image |
---|---|---|
Cost per Image | ~$0.1664 | ~$0.0387 (4.3x cheaper!) |
Quality | High quality, detailed | High quality, fast generation |
Speed | Moderate | Fast |
Best For | Complex designs, high fidelity | Rapid prototyping, cost efficiency |
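Google Gemini 2.5 Flash Image pricing:
- $30.00 per 1 million output tokens
- Each image uses 1,290 output tokens
- Cost per image: $0.0387 (1,290 × $30.00 / 1,000,000)
OpenAI GPT-4o pricing:
- Input: $10.00 per 1M tokens
- Output: $40.00 per 1M tokens
- Image generation: ~4,160 output tokens
- Cost per image: ~$0.1664 (4,160 × $40.00 / 1,000,000)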
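The per-image figures above follow directly from the token counts and the output-token prices. A minimal sketch of that arithmetic is shown here; the `PRICING` table and `costPerImage` function are illustrative names, not taken from `server.js`, and input-token cost is ignored for simplicity.

```javascript
// Prices in USD per 1 million output tokens, taken from the pricing notes above.
const PRICING = {
  google: { outputPerMillion: 30.0, tokensPerImage: 1290 },
  openai: { outputPerMillion: 40.0, tokensPerImage: 4160 }, // ~4,160 is approximate
};

// Cost of one image = tokens used * price per million tokens / 1,000,000.
function costPerImage(model) {
  const { outputPerMillion, tokensPerImage } = PRICING[model];
  return (tokensPerImage * outputPerMillion) / 1_000_000;
}

console.log(costPerImage("google").toFixed(4)); // 0.0387
console.log(costPerImage("openai").toFixed(4)); // 0.1664
```

Dividing the two results gives roughly 4.3, which matches the ratio quoted in the comparison table.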
- Clone the repository:
git clone <repository-url>
cd "Robot Image Playground"
- Install dependencies:
npm install
- Start the server:
npm start
- Open your browser and navigate to:
http://localhost:3000
- Select a Model: Use the dropdown at the top to choose between OpenAI and Google Gemini
- Enter a Prompt: Type a programming language or concept (e.g., "Python", "Security", "React Native")
- Click Generate: The system will:
  - Research the concept to understand its visual identity
  - Analyze reference images for consistent style
  - Generate a unique robot avatar
  - Display cost information
Click the "Test Model" button to verify that your selected model is working correctly. This will:
- Test text generation capabilities
- Test image generation capabilities
- Report the test cost
The session cost summary shows:
- Number of API calls per model
- Total cost per model
- Combined total cost
Cached images (reused from previous generations) incur no additional cost.
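One way to keep such a summary is a small in-memory tracker that skips cached results. The sketch below is hypothetical: `SessionCosts` and its methods are illustrative names, and the actual tracking code in `server.js` may be structured differently.

```javascript
// Hypothetical session tracker; names are illustrative, not from server.js.
class SessionCosts {
  constructor() {
    this.byModel = {}; // model name -> { calls, total }
  }

  // Record one generation. Cached results are skipped because they cost nothing.
  record(model, cost, cached = false) {
    if (cached) return;
    const entry = (this.byModel[model] ??= { calls: 0, total: 0 });
    entry.calls += 1;
    entry.total += cost;
  }

  // Per-model call counts and totals, plus a combined total across models.
  summary() {
    const combined = Object.values(this.byModel)
      .reduce((sum, entry) => sum + entry.total, 0);
    return { byModel: this.byModel, combined };
  }
}

const session = new SessionCosts();
session.record("google", 0.0387);
session.record("google", 0.0387, true); // cached, not counted
session.record("openai", 0.1664);
console.log(session.summary());
```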
/api/generate: Generate a robot image.
Request:
{
"prompt": "Python",
"model": "openai" | "google"
}
Response:
{
"success": true,
"filename": "python_1234567890.png",
"research": "Research results...",
"tokenUsage": { ... },
"cost": "$0.0387",
"cached": false
}
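A client call for this request might look like the sketch below. The base URL comes from the setup steps and the path is the `/api/generate` endpoint named in this README; the `POST` method and JSON content type are assumptions, and both function names are hypothetical.

```javascript
const MODELS = ["openai", "google"];

// Build the request for /api/generate. The POST method and the JSON content
// type are assumptions; only the body fields come from the documented request.
function buildGenerateRequest(prompt, model, baseUrl = "http://localhost:3000") {
  if (!MODELS.includes(model)) {
    throw new Error(`model must be one of: ${MODELS.join(", ")}`);
  }
  return {
    url: `${baseUrl}/api/generate`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt, model }),
    },
  };
}

// Send the request and return the parsed response body.
async function generateRobot(prompt, model) {
  const { url, options } = buildGenerateRequest(prompt, model);
  const response = await fetch(url, options);
  return response.json();
}
```

With the server running, `generateRobot("Python", "google")` should resolve to an object shaped like the response above. Checking `cached` before adding `cost` to a running total avoids counting reused images.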
/api/test-model: Test model connectivity and functionality.
Request:
{
"model": "openai" | "google"
}
Response:
{
"success": true,
"message": "Model is working!",
"cost": "$0.0001"
}
Get list of all generated images.
Response:
[
{
"filename": "python_1234567890.png",
"name": "python"
}
]
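In the example response, `name` appears to be the filename with its numeric suffix and extension removed. The sketch below derives a gallery entry that way, assuming filenames follow a `<name>_<digits>.png` pattern; that pattern is inferred from the example, and `toGalleryEntry` is a hypothetical helper.

```javascript
// Assumes filenames look like "<name>_<digits>.png", as in the example response.
function toGalleryEntry(filename) {
  const match = /^(.+)_(\d+)\.png$/.exec(filename);
  if (!match) return null; // not a generated robot image
  return { filename, name: match[1] };
}

console.log(toGalleryEntry("python_1234567890.png"));
// { filename: 'python_1234567890.png', name: 'python' }
```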
Both models use the same prompt generation system:
- Research Phase: Understands the concept's visual identity, colors, and characteristics
- Style Analysis: Analyzes reference images to maintain consistent robot style
- Prompt Building: Creates detailed generation instructions
- Generation: Model-specific image generation
- Finds related robots from existing library
- Uses them as primary references for consistency
- Maintains visual identity across variations
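How related robots are selected is not documented here. As a rough illustration only, the sketch below treats a library robot as related when its name shares at least one word with the prompt; `tokenize` and `findRelatedRobots` are hypothetical, and the real matching rule may differ.

```javascript
// Split text into lowercase alphanumeric words.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Hypothetical matcher: a robot is "related" if its name shares a word with the prompt.
function findRelatedRobots(prompt, libraryNames) {
  const promptWords = new Set(tokenize(prompt));
  return libraryNames.filter((name) =>
    tokenize(name).some((word) => promptWords.has(word))
  );
}

console.log(findRelatedRobots("React Native", ["react", "python", "native_script"]));
// [ 'react', 'native_script' ]
```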
Robot Image Playground/
├── index.html # Web interface
├── server.js # Express server with dual model support
├── package.json # Dependencies
├── Generated/ # Generated robot images
├── Reference Images/ # Reference robot designs
└── Png/ # Additional reference images
The API keys are currently hardcoded in server.js. For production use, consider using environment variables:
OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
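Reading those variables at startup could look like the following sketch. `loadApiKeys` and `REQUIRED_KEYS` are hypothetical names; the only things taken from this README are the two variable names.

```javascript
// Maps each model to the environment variable that holds its key.
const REQUIRED_KEYS = { openai: "OPENAI_API_KEY", google: "GOOGLE_API_KEY" };

// Read API keys from the environment and fail fast if any are missing.
function loadApiKeys(env = process.env) {
  const keys = {};
  const missing = [];
  for (const [model, variable] of Object.entries(REQUIRED_KEYS)) {
    if (env[variable]) keys[model] = env[variable];
    else missing.push(variable);
  }
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return keys;
}
```

The variables can be exported in the shell before `npm start`. Loading them from a `.env` file would need an extra step, such as the `dotenv` package.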
npm run dev
This uses nodemon for automatic server restarts on file changes.
To add a new model:
- Add the model option to the dropdown in index.html
- Implement a generateWith[ModelName] function in server.js
- Update the /api/generate endpoint to handle the new model
- Add test functionality in /api/test-model
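One way to keep the endpoint change small is to route model names through a registry, so that a new model needs only one more entry. This is a sketch under assumptions: the generator bodies are placeholders, and `GENERATORS` and `resolveGenerator` are hypothetical names that may not match how `server.js` dispatches today.

```javascript
// Placeholder generators; real ones would call each provider's API.
async function generateWithOpenAI(prompt) {
  return { model: "openai", prompt };
}
async function generateWithGoogle(prompt) {
  return { model: "google", prompt };
}

// Registry of supported models. Adding a model means adding one entry here.
const GENERATORS = new Map([
  ["openai", generateWithOpenAI],
  ["google", generateWithGoogle],
]);

// Look up the generator for a model name, rejecting unknown names early.
function resolveGenerator(model) {
  const generator = GENERATORS.get(model);
  if (!generator) {
    throw new Error(`Unsupported model: ${model}`);
  }
  return generator;
}
```

A route handler could then call `resolveGenerator(req.body.model)(req.body.prompt)` and return a client error when the lookup throws.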
- Verify API keys are correct
- Check network connectivity
- Ensure API quotas haven't been exceeded
- Check console logs in browser developer tools
- Review server logs for detailed error messages
- Verify reference images are present
- Use Google Gemini for cost-effective generation
- Leverage caching by reusing common prompts
- Monitor the cost summary panel
MIT