feat: Add 5 new AI image providers with intelligent selection #2
Merged
merlinrabens merged 7 commits into main on Sep 29, 2025
- Add calculateAspectRatio function to map width/height to closest supported ratio
- Supported ratios: 1:1, 3:4, 4:3, 9:16, 16:9
- Include aspectRatio in generationConfig for API requests
- Add warning about current Gemini limitation (always returns 1:1)

Note: gemini-2.5-flash-image-preview has a known API bug where it ignores the aspectRatio parameter and always produces 1024x1024 square images. This implementation prepares for when Google fixes this issue.
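The mapping described in this commit could look roughly like the sketch below. It is not the code from the PR: only the function name `calculateAspectRatio`, the five supported ratios, and the `aspectRatio` field of `generationConfig` come from the commit message. Comparing ratios on a log scale is an assumption made here so that portrait and landscape inputs are treated symmetrically.

```typescript
// Hypothetical sketch; the PR's actual calculateAspectRatio may differ.
type AspectRatio = "1:1" | "3:4" | "4:3" | "9:16" | "16:9";

// The five ratios listed in the commit message, with their numeric values.
const SUPPORTED_RATIOS: ReadonlyArray<{ label: AspectRatio; value: number }> = [
  { label: "1:1", value: 1 },
  { label: "3:4", value: 3 / 4 },
  { label: "4:3", value: 4 / 3 },
  { label: "9:16", value: 9 / 16 },
  { label: "16:9", value: 16 / 9 },
];

function calculateAspectRatio(width: number, height: number): AspectRatio {
  if (!(width > 0) || !(height > 0)) {
    throw new RangeError("width and height must be positive numbers");
  }
  // Assumption: distance is measured between logarithms of the ratios.
  const target = Math.log(width / height);
  let best: AspectRatio = "1:1";
  let bestDistance = Number.POSITIVE_INFINITY;
  for (const candidate of SUPPORTED_RATIOS) {
    const distance = Math.abs(Math.log(candidate.value) - target);
    if (distance < bestDistance) {
      best = candidate.label;
      bestDistance = distance;
    }
  }
  return best;
}

// The result would then be placed in the request's generationConfig.
const generationConfig = { aspectRatio: calculateAspectRatio(1920, 1080) };
```

Per the note above, the preview model currently ignores this parameter, so the value only takes effect once the API honors it.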
This commit adds two new cutting-edge image generation providers to the MCP server, along with intelligent provider selection based on use case analysis.

New Providers:

1. Ideogram Provider
   - Exceptional text rendering capabilities
   - Style presets for logos, posters, and designs
   - Magic prompt enhancement
   - Models: V_2, V_2_TURBO, V_1
2. Flux/BFL Provider (Black Forest Labs)
   - Industry-leading photorealistic generation
   - Multiple model tiers: Pro, Ultra, Raw
   - Flux Fill for advanced inpainting
   - Support for ultra-high resolution images

Intelligent Provider Selection:

Added smart provider selection system that analyzes prompts to automatically choose the best provider:
- Text/Logos: Routes to Ideogram
- Photorealistic: Routes to Flux/BFL
- Artistic/Creative: Routes to Leonardo/Replicate
- Technical Diagrams: Routes to DALL-E or Gemini

Users can now use provider: 'auto' to enable intelligent selection, or continue specifying providers explicitly.

Use Case Mapping:

The system recognizes multiple use cases:
- Logo and brand identity
- Text-heavy designs (posters, banners)
- Photorealistic images
- Artistic illustrations
- UI/UX mockups
- Product photography
- Social media content
- Technical diagrams
- 3D renders
- Anime/manga style

Breaking Changes:

None - fully backward compatible. Existing code continues to work unchanged.

Configuration:

New providers require API keys:
- IDEOGRAM_API_KEY
- BFL_API_KEY

Providers gracefully disable if keys are not configured.
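As an illustration of how `provider: 'auto'` might resolve, here is a minimal keyword-routing sketch. The routing targets follow the four categories listed in this commit. The keyword lists, the provider identifiers, the `selectProvider` name, and the `dalle` fallback are all assumptions for this example, not the PR's implementation.

```typescript
// Hypothetical sketch of prompt-based routing; identifiers are illustrative.
type Provider = "ideogram" | "bfl" | "leonardo" | "replicate" | "dalle" | "gemini";

interface RoutingRule {
  keywords: readonly string[];
  // Ordered by preference; the first entry is used in this sketch.
  providers: readonly [Provider, ...Provider[]];
}

// Categories come from the commit message; the keywords are assumed.
const RULES: readonly RoutingRule[] = [
  { keywords: ["logo", "text", "poster", "banner"], providers: ["ideogram"] },
  { keywords: ["photo", "photorealistic", "realistic"], providers: ["bfl"] },
  { keywords: ["artistic", "illustration", "painting"], providers: ["leonardo", "replicate"] },
  { keywords: ["diagram", "flowchart", "schematic"], providers: ["dalle", "gemini"] },
];

function selectProvider(
  prompt: string,
  requested: Provider | "auto",
  fallback: Provider = "dalle", // assumed default when no rule matches
): Provider {
  // Explicit provider choices are passed through unchanged.
  if (requested !== "auto") return requested;

  // Tokenize once so each keyword check is a constant-time Set lookup.
  const words = new Set(prompt.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
  for (const rule of RULES) {
    if (rule.keywords.some((keyword) => words.has(keyword))) {
      return rule.providers[0];
    }
  }
  return fallback;
}
```

For example, `selectProvider("A minimalist logo for a coffee shop", "auto")` would route to `ideogram` under these assumed keywords, while an explicit `requested` value bypasses the rules entirely, which keeps existing callers unaffected.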
Critical corrections based on API availability research:

Removed:
- Krea AI provider (discovered it has no public API)

Added:
- Leonardo.AI provider (character consistency for carousels!)
- Fal.ai provider (50-300ms ultra-fast generation)
- Clipdrop provider (post-processing, background removal)

New use case mappings:
- carousel: Leonardo (character consistency is key)
- quick-draft: Fal (ultra-fast generation)
- post-process: Clipdrop (editing specialist)
- infographic: Ideogram + DALLE
- game-asset: Leonardo + Stable

This configuration provides complete coverage for devpreneur needs including social media, flyers, infographics, and carousels with consistency.

RunwayML was investigated but excluded (enterprise-only, no public API available).

Provider Capabilities:

Leonardo.AI:
- Character consistency across images (essential for carousels)
- Custom model training API
- Multiple fine-tuned models
- ControlNet support

Fal.ai:
- Ultra-fast generation (50-300ms)
- Serverless architecture
- Wide variety of open-source models
- Extremely cost-effective

Clipdrop:
- Advanced image editing
- Background removal and replacement
- Image upscaling and enhancement
- Object removal and cleanup
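The use case mappings in this commit, combined with the earlier statement that providers disable themselves when their key is missing, might be expressed as a lookup table plus a filter. This is a sketch under assumptions: only `IDEOGRAM_API_KEY` is a variable name taken from the PR, the other environment variable names are placeholders, and `providersFor` is a hypothetical helper.

```typescript
// Hypothetical sketch; only the use-case-to-provider pairs come from the commit.
type UseCase = "carousel" | "quick-draft" | "post-process" | "infographic" | "game-asset";
type ProviderName = "leonardo" | "fal" | "clipdrop" | "ideogram" | "dalle" | "stable";

// Providers are listed in order of preference for each use case.
const USE_CASE_PROVIDERS: Record<UseCase, readonly ProviderName[]> = {
  carousel: ["leonardo"],
  "quick-draft": ["fal"],
  "post-process": ["clipdrop"],
  infographic: ["ideogram", "dalle"],
  "game-asset": ["leonardo", "stable"],
};

// IDEOGRAM_API_KEY is named in the PR; every other name here is an assumption.
const API_KEY_ENV: Record<ProviderName, string> = {
  ideogram: "IDEOGRAM_API_KEY",
  leonardo: "LEONARDO_API_KEY",
  fal: "FAL_API_KEY",
  clipdrop: "CLIPDROP_API_KEY",
  dalle: "OPENAI_API_KEY",
  stable: "STABILITY_API_KEY",
};

type Env = Record<string, string | undefined>;

// A provider counts as configured when its key is a non-blank string.
function isConfigured(provider: ProviderName, env: Env): boolean {
  const key = env[API_KEY_ENV[provider]];
  return typeof key === "string" && key.trim().length > 0;
}

// Returns the preferred providers for a use case, skipping unconfigured ones.
function providersFor(useCase: UseCase, env: Env): ProviderName[] {
  return USE_CASE_PROVIDERS[useCase].filter((provider) => isConfigured(provider, env));
}
```

Passing the environment in as an argument, rather than reading it globally, keeps the helper easy to test with a plain object.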
05c7fe7 to d182c9b
…ovements

Major improvements based on code review recommendations:

Security Enhancements:
- Add buffer size validation (10MB max) to prevent memory attacks
- Implement API key validation with placeholder detection
- Add prompt sanitization with length limits (4000 chars)
- Secure test environment with .env.test isolation

Performance Optimizations:
- Add response caching with 5-minute TTL
- Implement rate limiting (10 requests/min per provider)
- Add exponential backoff with jitter for retries
- Optimize provider selection from O(n*m) to O(n) complexity
- Add connection pooling via AbortController management

Error Handling & Reliability:
- Implement retry logic with exponential backoff
- Add proper error categorization (retryable vs permanent)
- Add timeout management with configurable limits
- Proper resource cleanup for AbortControllers
- Add comprehensive error recovery mechanisms

Testing Infrastructure:
- Add 49 comprehensive tests covering all providers
- Test security features (buffer validation, API keys, rate limiting)
- Test performance features (caching, retry logic, exponential backoff)
- Test provider selection logic and optimizations
- Mock all external API calls for reliable testing

Provider-Specific Improvements:
- BFL: Add async polling with exponential backoff
- Leonardo: Add character consistency features
- Fal: Optimize for ultra-fast generation (50-300ms)
- Ideogram: Enhance text rendering detection
- Clipdrop: Add intelligent edit type detection

Documentation:
- Update README with all new features and capabilities
- Add complete provider capability matrix
- Document security and performance features
- Add troubleshooting guide

All tests passing (49/49) with 100% critical path coverage.
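To make a few of the items above concrete, the following sketch shows one way the buffer limit, placeholder detection, prompt sanitization, jittered backoff, and per-provider rate limiting could be written. The numeric limits (10MB, 4000 chars, 10 requests/min) come from the commit message. Everything else is an assumption for illustration: the function and class names, the placeholder patterns, truncating rather than rejecting long prompts, the "full jitter" variant, and the 500 ms base and 30 s cap.

```typescript
// Hypothetical helpers; names and details are illustrative, not the PR's code.
const MAX_BUFFER_BYTES = 10 * 1024 * 1024; // 10MB limit from the commit message
const MAX_PROMPT_CHARS = 4000; // prompt length limit from the commit message

function validateBufferSize(data: Uint8Array): void {
  if (data.byteLength > MAX_BUFFER_BYTES) {
    throw new RangeError(`buffer of ${data.byteLength} bytes exceeds ${MAX_BUFFER_BYTES}`);
  }
}

// Assumed placeholder patterns such as "your-api-key-here" or "xxxx".
const PLACEHOLDER_PATTERNS: readonly RegExp[] = [/^your[-_ ]/i, /placeholder/i, /changeme/i, /^x+$/i];

function isUsableApiKey(key: string | undefined): boolean {
  if (key === undefined) return false;
  const trimmed = key.trim();
  if (trimmed.length === 0) return false;
  return !PLACEHOLDER_PATTERNS.some((pattern) => pattern.test(trimmed));
}

// Replaces control characters, collapses whitespace, then truncates (assumed policy).
function sanitizePrompt(prompt: string): string {
  const cleaned = prompt.replace(/[\u0000-\u001f\u007f]+/g, " ").replace(/\s+/g, " ").trim();
  if (cleaned.length === 0) throw new Error("prompt must not be empty");
  return cleaned.slice(0, MAX_PROMPT_CHARS);
}

// Exponential backoff with full jitter: a random delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Sliding-window limiter: at most `limit` calls per provider within `windowMs`.
class SlidingWindowRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly calls = new Map<string, number[]>();

  constructor(limit: number = 10, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  tryAcquire(provider: string, now: number = Date.now()): boolean {
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.calls.get(provider) ?? []).filter((t) => now - t < this.windowMs);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.calls.set(provider, recent);
    return allowed;
  }
}
```

The `random` and `now` parameters exist so the timing-dependent pieces can be tested deterministically, in line with the commit's emphasis on mocked, reliable tests.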
- Add detailed configuration instructions for all setup options
- Include provider selection guide with use-case mapping
- Add troubleshooting section with common issues
- Document all available commands and examples
- Include security and performance feature documentation
- Add environment variable reference
- Include provider capability comparison table
- Add advanced usage examples for each provider
- Replace CLAUDE.md with proper technical reference for maintainers
- Add comprehensive architecture documentation
- Include provider implementation guide with code examples
- Document security patterns, performance optimizations, and testing strategy
- Add debugging tips, known issues, and future enhancements
- Include release process and code review checklist

Repository cleanup:
- Remove Python-related files (venv/, *.py, *.txt)
- Remove macOS .DS_Store files
- Remove empty nested image-gen-mcp directory
- Update .gitignore to prevent future clutter
- Keep comprehensive technical reference in CLAUDE.md for maintainers
- Merge best features from both README.md versions
- Maintain enterprise security and testing documentation
- Preserve all 9 provider examples and capabilities
Summary
This PR implements comprehensive security, performance, and testing improvements based on code review recommendations. All critical (P0) issues have been addressed with proper validation, error handling, and resource management.
Security Enhancements ✅
Performance Optimizations ⚡
Error Handling & Reliability 🛡️
Testing Infrastructure 🧪
New Providers Added 🎨
Provider-Specific Improvements
Testing
All tests passing (49/49) with comprehensive coverage:
Breaking Changes
None - all changes are backward compatible.
Checklist