38 changes: 38 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -329,16 +329,53 @@ The ZaiClient provides access to comprehensive AI services:
- `glm-4` - Standard GLM-4 model
- `glm-4-air` - Lightweight version for speed
- `glm-4-flash` - Ultra-fast response model
- `glm-4-0520` - GLM-4 model version 0520
- `glm-4-airx` - Extended Air model with additional features
- `glm-4-long` - Optimized for long-context conversations
- `glm-4-voice` - Specialized for voice interactions
- `glm-4.1v-thinking-flash` - Visual reasoning model with thinking capabilities
- `glm-z1-air` - Optimized for mathematical and logical reasoning
- `glm-z1-airx` - Fastest domestic inference model with 200 tokens/s
- `glm-z1-flash` - Completely free reasoning model service
- `glm-4-air-250414` - Enhanced with reinforcement learning optimization
- `glm-4-flash-250414` - Latest free language model
- `glm-4-flashx` - Enhanced Flash version with ultra-fast inference speed
- `glm-4-9b` - Open-source model with 9 billion parameters
- `glm-4-assistant` - AI assistant for various business scenarios
- `glm-4-alltools` - Agent model for complex task planning and execution
- `chatglm3-6b` - Open-source base model with 6 billion parameters
- `codegeex-4` - Code generation and completion model

### Audio Speech Recognition
- `glm-asr` - Context-aware audio transcription model

### Real-time Interaction
- `glm-realtime-air` - Real-time video call model with cross-modal reasoning
- `glm-realtime-flash` - Fast real-time video call model

### Vision Models
- `glm-4v-plus` - Enhanced vision model
- `glm-4v` - Standard vision model
- `glm-4v-plus-0111` - Variable resolution video and image understanding
- `glm-4v-flash` - Free and powerful image understanding model

### Image Generation
- `cogview-3-plus` - Enhanced image generation
- `cogview-3` - Standard image generation
- `cogview-3-flash` - Free image generation model
- `cogview-4-250304` - Advanced image generation with text capabilities
- `cogview-4` - Advanced image generation for precise and personalized AI image expression

### Video Generation
- `cogvideox` - Video generation from text or images
- `cogvideox-flash` - Free video generation model
- `cogvideox-2` - New video generation model
- `viduq1-text` - High-performance video generation from text input
- `viduq1-image` - Video generation from first frame image and text description
- `viduq1-start-end` - Video generation from first and last frame images
- `vidu2-image` - Enhanced video generation from first frame image and text description
- `vidu2-start-end` - Enhanced video generation from first and last frame images
- `vidu2-reference` - Video generation with reference images of people, objects, etc.

### Embeddings
- `embedding-3` - Latest embedding model
@@ -347,6 +384,7 @@ The ZaiClient provides access to comprehensive AI services:
### Specialized
- `charglm-3` - Character interaction model
- `cogtts` - Text-to-speech model
- `rerank` - Text reordering and relevance scoring
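
The identifiers listed above are plain strings, so one way to avoid typos is to keep them in a single lookup and validate any model name before sending a request. The sketch below is only an illustration under stated assumptions: `ModelCatalog`, its category keys, and its helper methods are hypothetical names that are not part of the SDK, and the model IDs are copied by hand from the lists above rather than read from `Constants`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the SDK: groups a few of the model IDs
// from the lists above by category so callers can validate a name up front.
public class ModelCatalog {

    // Category keys ("text", "vision", ...) are an assumption made for this sketch.
    static final Map<String, List<String>> BY_CATEGORY = new LinkedHashMap<>();

    static {
        BY_CATEGORY.put("text", List.of("glm-4-plus", "glm-4", "glm-z1-air", "glm-4-flashx"));
        BY_CATEGORY.put("vision", List.of("glm-4v-plus", "glm-4v", "glm-4v-flash"));
        BY_CATEGORY.put("image", List.of("cogview-4", "cogview-3-flash"));
        BY_CATEGORY.put("video", List.of("cogvideox", "viduq1-text", "vidu2-reference"));
        BY_CATEGORY.put("specialized", List.of("charglm-3", "cogtts", "rerank"));
    }

    /** Returns the first model registered for a category, or throws if the category is unknown. */
    static String defaultModel(String category) {
        List<String> models = BY_CATEGORY.get(category);
        if (models == null) {
            throw new IllegalArgumentException("Unknown category: " + category);
        }
        return models.get(0);
    }

    /** Returns true if the model ID appears in any category of this catalog. */
    static boolean isKnown(String modelId) {
        return BY_CATEGORY.values().stream().anyMatch(models -> models.contains(modelId));
    }

    public static void main(String[] args) {
        // Print the default model for one category and check two IDs.
        System.out.println("Default video model: " + defaultModel("video"));
        System.out.println("rerank known? " + isKnown("rerank"));
        System.out.println("glm-4-turbo known? " + isKnown("glm-4-turbo"));
    }
}
```

In a real project the string literals would more likely be replaced by the corresponding fields of `ai.z.openapi.core.Constants` (for example `Constants.ModelRerank`), so that a renamed or removed model surfaces as a compile error instead of a failed request.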

## 📈 Release Notes

62 changes: 50 additions & 12 deletions README_CN.md
@@ -352,28 +352,66 @@ public class AIController {
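## 🤖 Supported Models

### Text Generation Models
- `glm-4-plus` - Latest GLM-4 Plus model
- `glm-4-0520` - GLM-4 standard edition
- `glm-4-long` - Long-text processing version
- `glm-4-airx` - Lightweight version
- `glm-4-air` - Fast-response version
- `glm-4-flashx` - Ultra-fast response version
- `glm-4-flash` - Flash version
- `glm-4-plus` - Enhanced GLM-4 with stronger capabilities
- `glm-4` - Standard GLM-4 model
- `glm-4-air` - Lightweight version optimized for speed
- `glm-4-flash` - Ultra-fast response model
- `glm-4-0520` - GLM-4 model version 0520
- `glm-4-airx` - Extended Air model with additional features
- `glm-4-long` - Optimized for long-context conversations
- `glm-4-voice` - Designed specifically for voice interactions
- `glm-4.1v-thinking-flash` - Visual reasoning model with thinking capabilities
- `glm-z1-air` - Optimized for mathematical and logical reasoning
- `glm-z1-airx` - Fastest domestic inference model, 200 tokens/s
- `glm-z1-flash` - Completely free reasoning model service
- `glm-4-air-250414` - Enhanced through reinforcement learning optimization
- `glm-4-flash-250414` - Latest free language model
- `glm-4-flashx` - Enhanced Flash version with ultra-fast inference speed
- `glm-4-9b` - Open-source model with 9 billion parameters
- `glm-4-assistant` - AI assistant for various business scenarios
- `glm-4-alltools` - Agent model for complex task planning and execution
- `chatglm3-6b` - Open-source base model with 6 billion parameters
- `codegeex-4` - Code generation and completion model

### Audio Speech Recognition
- `glm-asr` - Context-aware audio transcription model

### Real-time Interaction
- `glm-realtime-air` - Real-time video call model with cross-modal reasoning
- `glm-realtime-flash` - Fast real-time video call model

### Vision Models
- `glm-4v-plus` - Multimodal understanding model
- `glm-4v` - Visual understanding model
- `glm-4v-plus` - Enhanced vision model
- `glm-4v` - Standard vision model
- `glm-4v-plus-0111` - Variable-resolution video and image understanding
- `glm-4v-flash` - Free and powerful image understanding model

### Image Generation Models
- `cogview-3-plus` - High-quality image generation
- `cogview-3-plus` - Enhanced image generation
- `cogview-3` - Standard image generation
- `cogview-3-flash` - Free image generation model
- `cogview-4-250304` - Advanced image generation with text capabilities
- `cogview-4` - Advanced image generation for precise and personalized AI image expression

### Video Generation Models
- `cogvideox` - Video generation from text or images
- `cogvideox-flash` - Free video generation model
- `cogvideox-2` - New video generation model
- `viduq1-text` - High-performance video generation from text input
- `viduq1-image` - Video generation from first frame image and text description
- `viduq1-start-end` - Video generation from first and last frame images
- `vidu2-image` - Enhanced video generation from first frame image and text description
- `vidu2-start-end` - Enhanced video generation from first and last frame images
- `vidu2-reference` - Video generation using reference images of people, objects, etc.

### Embedding Models
- `embedding-3` - Latest embedding model
- `embedding-2` - Standard embedding model
- `embedding-2` - Previous-generation embedding model

### Specialized Models
- `charglm-3` - Role-playing model
- `charglm-3` - Character interaction model
- `cogtts` - Text-to-speech model
- `rerank` - Text reranking and relevance scoring
- `emohaa` - Emotion analysis model

## 📈 Release Notes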
184 changes: 168 additions & 16 deletions core/src/main/java/ai/z/openapi/core/Constants.java
@@ -75,6 +75,93 @@ private Constants() {
*/
public static final String ModelChatGLM4Voice = "glm-4-voice";

/**
* GLM-4.1V Thinking Flash model - Visual reasoning model with thinking capabilities.
*/
public static final String ModelChatGLM41VThinkingFlash = "glm-4.1v-thinking-flash";

/**
* GLM-Z1 Air model - Optimized for mathematical and logical reasoning.
*/
public static final String ModelChatGLMZ1Air = "glm-z1-air";

/**
* GLM-Z1 AirX model - Fastest domestic inference model with 200 tokens/s.
*/
public static final String ModelChatGLMZ1AirX = "glm-z1-airx";

/**
* GLM-Z1 Flash model - Completely free reasoning model service.
*/
public static final String ModelChatGLMZ1Flash = "glm-z1-flash";

/**
* GLM-4 Air 250414 model - Enhanced with reinforcement learning optimization.
*/
public static final String ModelChatGLM4Air250414 = "glm-4-air-250414";

/**
* GLM-4 Flash 250414 model - Latest free language model.
*/
public static final String ModelChatGLM4Flash250414 = "glm-4-flash-250414";

/**
* GLM-4 FlashX model - Enhanced Flash version with ultra-fast inference speed.
*/
public static final String ModelChatGLM4FlashX = "glm-4-flashx";

/**
* GLM-4 9B model - Open-source model with 9 billion parameters.
*/
public static final String ModelChatGLM49B = "glm-4-9b";

/**
* GLM-4 Assistant model - AI assistant for various business scenarios.
*/
public static final String ModelChatGLM4Assistant = "glm-4-assistant";

/**
* GLM-4 AllTools model - Agent model for complex task planning and execution.
*/
public static final String ModelChatGLM4AllTools = "glm-4-alltools";

/**
* ChatGLM3 6B model - Open-source base model with 6 billion parameters.
*/
public static final String ModelChatGLM36B = "chatglm3-6b";

/**
* CodeGeeX-4 model - Code generation and completion model.
*/
public static final String ModelCodeGeeX4 = "codegeex-4";

// =============================================================================
// Audio Speech Recognition Models
// =============================================================================

/**
* GLM-ASR model - Context-aware audio transcription model that converts audio to
* fluent and readable text. Supports Chinese, English, and various Chinese dialects.
* Improved performance in noisy environments.
*/
public static final String ModelGLMASR = "glm-asr";

// =============================================================================
// Real-time Interaction Models
// =============================================================================

/**
* GLM-Realtime Air model - Real-time video call model with cross-modal reasoning
* capabilities across text, audio, and video. Supports real-time interruption.
*/
public static final String ModelGLMRealtimeAir = "glm-realtime-air";

/**
* GLM-Realtime Flash model - Fast real-time video call model with cross-modal
* reasoning capabilities. Supports camera interaction and screen sharing.
*/
public static final String ModelGLMRealtimeFlash = "glm-realtime-flash";

// =============================================================================
// Vision Models (Image Understanding)
// =============================================================================
@@ -89,6 +176,16 @@ private Constants() {
*/
public static final String ModelChatGLM4V = "glm-4v";

/**
* GLM-4V Plus 0111 model - Variable resolution video and image understanding.
*/
public static final String ModelChatGLM4VPlus0111 = "glm-4v-plus-0111";

/**
* GLM-4V Flash model - Free and powerful image understanding model.
*/
public static final String ModelChatGLM4VFlash = "glm-4v-flash";

// =============================================================================
// Image Generation Models
// =============================================================================
@@ -103,6 +200,75 @@ private Constants() {
*/
public static final String ModelCogView = "cogview-3";

/**
* CogView-3 Flash model - Free image generation model.
*/
public static final String ModelCogView3Flash = "cogview-3-flash";

/**
* CogView-4 250304 model - Advanced image generation with text capabilities.
*/
public static final String ModelCogView4250304 = "cogview-4-250304";

/**
* CogView-4 model - Advanced image generation for precise and personalized AI image
* expression.
*/
public static final String ModelCogView4 = "cogview-4";

// =============================================================================
// Video Generation Models
// =============================================================================

/**
* CogVideoX model - Video generation from text or images.
*/
public static final String ModelCogVideoX = "cogvideox";

/**
* CogVideoX Flash model - Free video generation model.
*/
public static final String ModelCogVideoXFlash = "cogvideox-flash";

/**
* CogVideoX-2 model - New video generation model.
*/
public static final String ModelCogVideoX2 = "cogvideox-2";

/**
* Vidu Q1 Text model - High-performance video generation from text input. Supports
* general and anime styles.
*/
public static final String ModelViduQ1Text = "viduq1-text";

/**
* Vidu Q1 Image model - Video generation from first frame image and text description.
*/
public static final String ModelViduQ1Image = "viduq1-image";

/**
* Vidu Q1 Start-End model - Video generation from first and last frame images.
*/
public static final String ModelViduQ1StartEnd = "viduq1-start-end";

/**
* Vidu 2 Image model - Enhanced video generation from first frame image and text
* description.
*/
public static final String ModelVidu2Image = "vidu2-image";

/**
* Vidu 2 Start-End model - Enhanced video generation from first and last frame
* images.
*/
public static final String ModelVidu2StartEnd = "vidu2-start-end";

/**
* Vidu 2 Reference model - Video generation with reference images of people, objects,
* etc.
*/
public static final String ModelVidu2Reference = "vidu2-reference";

// =============================================================================
// Embedding Models
// =============================================================================
@@ -131,23 +297,9 @@ private Constants() {
*/
public static final String ModelTTS = "cogtts";

// =============================================================================
// API Invocation Methods
// =============================================================================

/**
* Asynchronous invocation method - For non-blocking API calls.
*/
public static final String INVOKE_METHOD_ASYNC = "async-invoke";

/**
* Server-Sent Events invocation method - For streaming responses.
*/
public static final String INVOKE_METHOD_SSE = "sse-invoke";

/**
* Standard synchronous invocation method - For blocking API calls.
* Rerank model - Text reordering and relevance scoring.
*/
public static final String INVOKE_METHOD = "invoke";
public static final String ModelRerank = "rerank";

}
@@ -32,11 +32,4 @@ public class ImageResult {
@JsonFormat(with = JsonFormat.Feature.ACCEPT_SINGLE_VALUE_AS_ARRAY)
List<Image> data;

/**
* Task number submitted by user in client request or task number generated by
* platform
*/
@JsonProperty("request_id")
private String requestId;

}