Empower every Spring AI Java developer with enterprise-grade model routing capabilities at zero cost.
Mayfly is an enterprise-grade model routing plugin built on Spring AI. It gives Java developers in China out-of-the-box load balancing, failover, circuit breaking, and related capabilities, with deep integration for domestic Chinese models (ZhiPu, Tongyi, DeepSeek, etc.).

| Feature | Description |
|---|---|
| Unified Multi-Model API | Single interface to call different vendor models, hiding API differences |
| Intelligent Routing | Supports fixed, weighted, and rule-based (SpEL) routing strategies |
| Load Balancing | Round-robin and weighted round-robin load balancing algorithms |
| Failover | Automatic failover to backup models with cooldown mechanism |
| Circuit Breaking | Circuit breaker and rate limiter based on Resilience4j |
| Monitoring & Observability | Complete monitoring metrics based on Micrometer |
| Domestic Model Integration | Deep integration with ZhiPu, Tongyi Qwen, DeepSeek, and other domestic models |
| Zero-Configuration Integration | Spring Boot Starter auto-configuration, minimal setup in just 3 lines |

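To make the "failover with cooldown" row concrete, here is a minimal sketch of the idea in plain Java. It is not Mayfly's implementation: the class `CooldownFailover`, its method names, and the choice to pass the current time in explicitly are all assumptions made for illustration. A model that throws is skipped until its cooldown has elapsed, and the next model in the list is tried instead.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: try models in order, skipping any that failed recently. */
final class CooldownFailover {
    private final List<String> models;   // ordered: primary first, then backups
    private final Duration cooldown;
    private final Map<String, Instant> failedAt = new HashMap<>();

    CooldownFailover(List<String> models, Duration cooldown) {
        this.models = List.copyOf(models);
        this.cooldown = cooldown;
    }

    /** Calls the first model not in cooldown; on failure, marks it and tries the next. */
    <T> T call(Function<String, T> invoke, Instant now) {
        RuntimeException last = null;
        for (String model : models) {
            Instant failed = failedAt.get(model);
            if (failed != null && now.isBefore(failed.plus(cooldown))) {
                continue; // still cooling down, do not send traffic here
            }
            try {
                T result = invoke.apply(model);
                failedAt.remove(model); // the model answered, so clear its failure mark
                return result;
            } catch (RuntimeException e) {
                failedAt.put(model, now); // start (or restart) the cooldown
                last = e;
            }
        }
        throw new IllegalStateException("No model available", last);
    }
}
```

The sketch is single-threaded for brevity; a real router would need a concurrent map and would likely restrict failover to specific retryable exception types.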

Add the Maven dependency:

    <dependency>
        <groupId>io.mayfly</groupId>
        <artifactId>mayfly-spring-boot-starter</artifactId>
        <version>1.0.0</version>
    </dependency>

Add the configuration to `application.yml` (for a complete example, see `application-example.yml`):
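
    mayfly:
      models:
        - name: zhipu-primary
          provider: zhipu
          api-key: ${ZHIPU_API_KEY}
          model: glm-4
          weight: 70
        - name: tongyi-backup
          provider: tongyi
          api-key: ${TONGYI_API_KEY}
          model: qwen-max
          weight: 30

Then inject `ModelRouter` into a service and call it:

    // Spring AI 1.x package names; also import Mayfly's ModelRouter.
    import org.springframework.ai.chat.model.ChatResponse;
    import org.springframework.ai.chat.prompt.Prompt;
    import org.springframework.stereotype.Service;

    @Service
    public class ChatService {

        private final ModelRouter modelRouter;

        // Constructor injection: the starter auto-configures the ModelRouter bean.
        public ChatService(ModelRouter modelRouter) {
            this.modelRouter = modelRouter;
        }

        public ChatResponse chat(String message) {
            Prompt prompt = new Prompt(message);
            return modelRouter.chat(prompt);
        }
    }

That's it! Mayfly automatically handles routing, load balancing, failover, and the rest of the complex logic.
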
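To illustrate what the `weight` values in the configuration above can mean in practice, the following sketch implements smooth weighted round-robin, one common weighted round-robin variant. The README does not say which variant Mayfly uses, so treat the class `SmoothWeightedRoundRobin` and its methods as hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of smooth weighted round-robin; not Mayfly's actual code. */
final class SmoothWeightedRoundRobin {
    private final Map<String, Integer> weights = new LinkedHashMap<>();
    private final Map<String, Integer> current = new LinkedHashMap<>();
    private int totalWeight;

    /** Registers a model; each name is expected to be added once. */
    synchronized void add(String model, int weight) {
        if (weight <= 0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        weights.put(model, weight);
        current.put(model, 0);
        totalWeight += weight;
    }

    /** Picks the next model, spreading picks in proportion to the weights. */
    synchronized String next() {
        if (weights.isEmpty()) {
            throw new IllegalStateException("no models registered");
        }
        String best = null;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            int score = current.get(e.getKey()) + e.getValue(); // raise by own weight
            current.put(e.getKey(), score);
            if (best == null || score > current.get(best)) {
                best = e.getKey();
            }
        }
        current.merge(best, -totalWeight, Integer::sum); // penalize the winner
        return best;
    }
}
```

With weights 70 and 30, 100 consecutive calls to `next()` return `zhipu-primary` 70 times and `tongyi-backup` 30 times, interleaved rather than in long runs.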
Mayfly is built on Spring AI as an enterprise-grade enhancement layer:

    ┌─────────────────────────────────────────────────────────┐
    │                    Application Layer                    │
    │                (User's Spring Boot App)                 │
    ├─────────────────────────────────────────────────────────┤
    │                Mayfly Enhancement Layer                 │
    │  ┌───────────┬───────────┬───────────┬──────────────┐   │
    │  │   Smart   │   Load    │ Failover  │   Circuit    │   │
    │  │  Router   │ Balancing │           │   Breaker    │   │
    │  ├───────────┼───────────┼───────────┼──────────────┤   │
    │  │   Model   │  Health   │  Metrics  │    Config    │   │
    │  │ Registry  │   Check   │ Collector │   Manager    │   │
    │  └───────────┴───────────┴───────────┴──────────────┘   │
    ├─────────────────────────────────────────────────────────┤
    │                     Spring AI Layer                     │
    │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐     │
    │  │  ZhiPu  │  │ Tongyi  │  │DeepSeek │  │ Others  │     │
    │  │ Adapter │  │ Adapter │  │ Adapter │  │ Adapter │     │
    │  └─────────┘  └─────────┘  └─────────┘  └─────────┘     │
    ├─────────────────────────────────────────────────────────┤
    │                     Model Services                      │
    │    ZhiPu AI    Tongyi Qwen    DeepSeek    ...           │
    └─────────────────────────────────────────────────────────┘

| Capability | Spring AI | Spring Cloud | Mayfly |
|---|---|---|---|
| Basic AI Calls | ✅ | ❌ | ✅ (on Spring AI) |
| Model Routing | ❌ | ❌ | ✅ |
| Load Balancing | ❌ | ✅ | ✅ (for AI models) |
| Circuit Breaking | ❌ | ✅ | ✅ (AI-optimized) |
| Domestic Models | ❌ | ❌ | ✅ Deep Support |
| Zero-Config | ❌ | ❌ | ✅ |

Key Differentiators:
- First of its Kind: The first enterprise-grade model governance tool for the Spring AI ecosystem
- Zero-Intrusion: Built on Spring AI, with no code changes required
- Domestic Focus: Deep adaptation for Chinese LLM providers (ZhiPu, Tongyi, DeepSeek)
- Complete Observability: Built-in Micrometer metrics with Prometheus support
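
The rule-based strategy in the full configuration below gives each rule a `condition`, a `target-model`, and a `priority`. A rough sketch of how such rules could be evaluated follows. It makes two assumptions that the README does not confirm: a lower `priority` number is evaluated first, and the first matching rule wins. It also swaps the SpEL conditions for plain Java predicates so that the example runs without Spring; `PriorityRuleRouter` and `Rule` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch: first matching rule wins, lowest priority number first. */
final class PriorityRuleRouter {

    /** Mirrors the name / condition / target-model / priority fields of a rule. */
    record Rule(String name, Predicate<Map<String, String>> condition,
                String targetModel, int priority) {}

    private final List<Rule> rules;

    PriorityRuleRouter(List<Rule> rules) {
        List<Rule> sorted = new ArrayList<>(rules);
        sorted.sort(Comparator.comparingInt(Rule::priority)); // assumed: 1 beats 99
        this.rules = List.copyOf(sorted);
    }

    /** Returns the target model of the first rule whose condition accepts the metadata. */
    String route(Map<String, String> metadata) {
        for (Rule rule : rules) {
            if (rule.condition().test(metadata)) {
                return rule.targetModel();
            }
        }
        throw new IllegalStateException("No routing rule matched");
    }
}
```

Under these assumptions, a request tagged with both `userType=VIP` and `taskType=CODE` goes to `zhipu-primary`, because the `vip-users` rule (priority 1) is checked before `code-tasks` (priority 2). The catch-all `default` rule at priority 99 only applies when nothing else matches.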

Full configuration reference:

    mayfly:
      enabled: true
      # Model Configuration
      models:
        - name: zhipu-primary
          provider: zhipu
          api-key: ${ZHIPU_API_KEY}
          model: glm-4
          weight: 70
          timeout: 30000
          max-retries: 2
          tags:
            - primary
            - chinese
        - name: tongyi-backup
          provider: tongyi
          api-key: ${TONGYI_API_KEY}
          model: qwen-max
          weight: 30
          tags:
            - backup
        - name: deepseek-coder
          provider: deepseek
          api-key: ${DEEPSEEK_API_KEY}
          model: deepseek-coder
          tags:
            - code
      # Routing Configuration
      router:
        strategy: rule-based # fixed, weighted, rule-based
        rules:
          - name: vip-users
            condition: "#request.metadata?.userType == 'VIP'"
            target-model: zhipu-primary
            priority: 1
          - name: code-tasks
            condition: "#request.metadata?.taskType == 'CODE'"
            target-model: deepseek-coder
            priority: 2
          - name: default
            condition: "true"
            target-model: zhipu-primary
            priority: 99
      # Load Balancer Configuration
      loadbalancer:
        strategy: weighted-round-robin # round-robin, weighted-round-robin
        health-check:
          enabled: true
          interval: 30s
          timeout: 5s
          unhealthy-threshold: 3
      # Failover Configuration
      failover:
        enabled: true
        max-retries: 2
        cooldown-duration: 60s
        retryable-exceptions:
          - java.net.SocketTimeoutException
          - org.springframework.web.client.HttpServerErrorException
      # Circuit Breaker Configuration
      circuit-breaker:
        enabled: true
        failure-rate-threshold: 50
        wait-duration-in-open-state: 60s
        sliding-window-size: 10
        minimum-number-of-calls: 5
      # Rate Limiter Configuration
      rate-limiter:
        enabled: true
        limit-refresh-period: 1s
        limit-for-period: 100
        timeout-duration: 0s
      # Monitoring Configuration
      monitor:
        enabled: true

We welcome contributions of all kinds! Please check our Contribution Guide to learn how to participate in project development.

- Issues: Submit issues or feature requests
- Email: git@xsjyby.asia
| Phase | Time | Milestone | Goals |
|---|---|---|---|
| Phase 1 | 2026.4-5 | MVP Enhancement | Support 8+ models, complete documentation |
| Phase 2 | 2026.6-7 | Production Ready | Performance optimization, Docker support, 3+ enterprise users |
| Phase 3 | 2026.8-10 | Community Growth | 100+ Stars, 500+ users, 5+ paid customers |
| Phase 4 | 2026.11-2027.3 | Ecosystem Maturity | 20+ models, 10+ partners, industry standard |
This project is licensed under the Apache License 2.0.