diff --git a/models/cometapi/.env.example b/models/cometapi/.env.example new file mode 100644 index 000000000..da8d5af49 --- /dev/null +++ b/models/cometapi/.env.example @@ -0,0 +1,4 @@ +INSTALL_METHOD=remote +REMOTE_INSTALL_URL=debug.dify.ai +REMOTE_INSTALL_PORT=5003 +REMOTE_INSTALL_KEY=********-****-****-****-************ diff --git a/models/cometapi/PRIVACY.md b/models/cometapi/PRIVACY.md new file mode 100644 index 000000000..2e62121e1 --- /dev/null +++ b/models/cometapi/PRIVACY.md @@ -0,0 +1,84 @@ +# Privacy Policy + +🛡️ **CometAPI Privacy and Data Protection Policy** + +CometAPI takes user privacy and data protection very seriously. This policy details how we collect, use, store, and protect your personal information. + +## 1. 💬 AI Model API Interaction Data Protection + +We are committed to protecting your data privacy, especially during your interactions with AI models. To ensure this, we: + +- **Do not collect user communications with AI models**: We never collect, store, or record any conversation content between you and the AI models, nor will we in the future. + +- **Do not save user communications with AI models**: Your communication content is not saved to our servers or databases, ensuring your conversations remain completely private and secure. Additionally, storing hundreds of millions of daily requests would be prohibitively expensive and unsustainable for us. + +Through these measures, we ensure that when using our website, you can enjoy a safe and private interactive environment. + +## 2. 
👤 User Information + +### 2.1 💻 Automatically Collected Information + +When you visit our website, we may automatically collect the following information: + +- Your IP address +- Browser type and version +- Time and date of access +- Pages visited and action records +- Device type and operating system + +### 2.2 📝 Information You Provide + +We may collect your personal information through the following means: + +- Username and email address provided during account registration +- Contact information submitted through online forms, email, or other methods +- Feedback provided when participating in surveys or comments + +### 2.3 🔗 Third-Party Login Information + +#### 2.3.1 Github Login + +With your prior consent, when using Github login, we will collect your Github username and email for account registration and login authorization. + +For more information, please refer to the [Github Privacy Policy](https://docs.github.com/en/site-policy/privacy-policies/github-privacy-statement). + +#### 2.3.2 Google Login + +With your prior consent, when using Google login, we will collect your Google username and email for account registration and login authorization. + +For more information, please refer to the [Google Privacy Policy](https://policies.google.com/privacy). + +## 3. ✅ Use of Information + +We may use the collected information for the following purposes: + +- Providing and improving our services +- Personalizing your user experience +- Processing your requests, orders, or transactions +- Sending promotional information, updates, and other relevant communications (you can opt out) +- Monitoring and analyzing website usage to enhance website performance and functionality +- Protecting our rights, property, or safety + +## 4. ❌ Information Sharing + +We do not sell or rent your personal information to third parties. + +## 5. 
🔒 Information Security + +We implement reasonable technical and organizational measures to protect your personal information from unauthorized access, use, or disclosure. However, no internet transmission or electronic storage method is completely secure, and we cannot guarantee absolute security. + +## 6. 🍪 Cookies and Tracking Technologies + +We use cookies and similar technologies to enhance user experience, analyze website traffic, and personalize content. You can disable cookies in your browser settings, but this may affect the website's normal functionality. + +## 7. 🔄 Changes to Privacy Policy + +We may update this Privacy and Data Protection Policy from time to time. Changes will be posted on this page with an updated date. Please review regularly to stay informed of the latest policy. + +## 8. 📧 Contact Us + +If you have any questions or comments about this Privacy and Data Protection Policy, please contact our customer service through the contact information in the footer of this website. + +--- + +Thank you for your attention to and understanding of our Privacy and Data Protection Policy. \ No newline at end of file diff --git a/models/cometapi/README.md b/models/cometapi/README.md new file mode 100644 index 000000000..84762b18b --- /dev/null +++ b/models/cometapi/README.md @@ -0,0 +1,99 @@ +# CometAPI +All AI Models in One API - 500+ AI Models + +## Overview + +CometAPI is a comprehensive AI platform that provides unified access to over 500 cutting-edge AI models through a single, powerful API. Our platform simplifies AI integration by offering a diverse ecosystem of language models, image generation tools, video creation services, and specialized AI capabilities - all accessible through one streamlined interface. 
+ +### Key Features + +- **🚀 500+ AI Models**: Access to the latest and most powerful AI models including GPT-4, Claude, GLM-4.5, Qwen3-Coder, Kimi K2, Grok 4, and many more +- **🎨 Multi-Modal Capabilities**: Support for text generation, image creation, video production, music composition, and audio processing +- **⚡ Unified API**: Single API endpoint for all AI services, reducing integration complexity +- **🔄 Real-time Access**: Get instant access to newly released AI models as they become available +- **💰 Cost-Effective**: Competitive pricing with flexible usage plans +- **🛡️ Enterprise Ready**: Robust security, reliability, and scalable infrastructure + +### Model Support List + +The following table shows all LLM model providers supported in CometAPI: + +| # | Model Provider | Folder Name | Status | Notes | +|---|----------------|-------------|--------|-------| +| 1 | Anthropic Claude | `anthropic` | ✅ | Claude series | +| 2 | DeepSeek | `deepseek` | ✅ | - | +| 3 | Google Gemini | `gemini` | ✅ | - | +| 4 | Meta Llama | `llama` | ✅ | Meta's Llama models | +| 5 | OpenAI | `openai` | ✅ | GPT series | +| 6 | Qwen | `qwen` | ✅ | Alibaba's Qwen models | +| 7 | X-AI | `x-ai` | ✅ | Grok models | + +**Total: 7 LLM Model Providers Supported** +For access to additional models beyond those listed, please visit: [www.cometapi.com](https://www.cometapi.com/) + + +## Configuration + +### Step 1: Get Your API Key(If you don't have a CometAPI key yet) +1. Visit [CometAPI Console](https://api.cometapi.com/console/token) +2. Sign up for a free account (1M free tokens available for new users!) +3. Generate your API key from the dashboard + +### Step 2: Setup in Dify +1. Navigate to **Settings** → **Model Provider** +2. Find **CometAPI** in the provider list +3. Enter your API key +4. Save the configuration + +### Step 3: Start Using +Once configured, you can access all 500+ AI models through CometAPI's unified interface. 
+ +## Quick Start + +After installation and configuration, you can immediately start using any of the available AI models: + +- Choose from text generation models for content creation +- Use image generation APIs for visual content +- Access video creation tools for multimedia projects +- Leverage specialized coding models for development tasks + +## Pricing + +CometAPI offers flexible pricing options: +- **Free Tier**: 1M tokens for new users +- **Pay-as-you-go**: Cost-effective usage-based pricing +- **Enterprise Plans**: Custom solutions for large-scale deployments + + +Visit our [Pricing Page](https://api.cometapi.com/pricing) for detailed information. + +## Documentation & Support + +- **📚 API Documentation**: [api.cometapi.com/doc](https://api.cometapi.com/doc) +- **🚀 Quick Start Guide**: [api.cometapi.com/panel](https://api.cometapi.com/panel) +- **💬 Discord Community**: [Join our Discord](https://discord.com/invite/HMpuV6FCrG) +- **📧 Email Support**: [support@cometapi.com](mailto:support@cometapi.com) + +## Social Links + +- **🐦 Twitter**: [@cometapi2025](https://x.com/cometapi2025) +- **💬 Discord**: [Join our Community](https://discord.com/invite/HMpuV6FCrG) + + +## Why Choose CometAPI? + +1. **Comprehensive Coverage**: Access to the most extensive collection of AI models in the industry +2. **Unified Interface**: One API to rule them all - no need to manage multiple providers +3. **Latest Models**: Always up-to-date with the newest AI breakthroughs +4. **Developer Friendly**: Comprehensive documentation and active community support +5. **Reliable Infrastructure**: Enterprise-grade reliability and performance +6. **Competitive Pricing**: Best value for money with transparent pricing + +--- + +**Ready to get started?** [Sign up now](https://api.cometapi.com/login) and claim your free 1M tokens! 
+ +For more information, visit our official website: [www.cometapi.com](https://www.cometapi.com/) + + + diff --git a/models/cometapi/_assets/cometapi_large.png b/models/cometapi/_assets/cometapi_large.png new file mode 100644 index 000000000..a77ddc5cd Binary files /dev/null and b/models/cometapi/_assets/cometapi_large.png differ diff --git a/models/cometapi/_assets/cometapi_small.png b/models/cometapi/_assets/cometapi_small.png new file mode 100644 index 000000000..c546d6980 Binary files /dev/null and b/models/cometapi/_assets/cometapi_small.png differ diff --git a/models/cometapi/main.py b/models/cometapi/main.py new file mode 100644 index 000000000..7e1a983db --- /dev/null +++ b/models/cometapi/main.py @@ -0,0 +1,6 @@ +from dify_plugin import Plugin, DifyPluginEnv + +plugin = Plugin(DifyPluginEnv(MAX_REQUEST_TIMEOUT=120)) + +if __name__ == '__main__': + plugin.run() diff --git a/models/cometapi/manifest.yaml b/models/cometapi/manifest.yaml new file mode 100644 index 000000000..403548452 --- /dev/null +++ b/models/cometapi/manifest.yaml @@ -0,0 +1,41 @@ +version: 0.0.1 +type: plugin +author: cometapi +name: cometapi +label: + en_US: CometAPI + zh_Hans: CometAPI + ja_JP: CometAPI + pt_BR: CometAPI +description: + en_US: "500+ AI Model API, All In One API. Just In CometAPI." + zh_Hans: "500+ AI 模型 API,一站集成,尽在 CometAPI。" + ja_JP: "500以上のAIモデルAPIを一括提供。すべてはCometAPIに。" + pt_BR: "500+ modelos de IA em uma única API. Tudo em um só lugar : CometAPI." 
+icon: cometapi_small.png +icon_dark: cometapi_small.png +resource: + memory: 268435456 + permission: + model: + enabled: true + llm: true + text_embedding: true + tts: true + speech2text: true +plugins: + models: + - provider/cometapi.yaml +meta: + version: 0.0.1 + arch: + - amd64 + - arm64 + runner: + language: python + version: "3.12" + entrypoint: main + minimum_dify_version: 1.0.0 +created_at: 2025-08-05T11:06:28.002855+08:00 +privacy: PRIVACY.md +verified: false diff --git a/models/cometapi/models/common_openai.py b/models/cometapi/models/common_openai.py new file mode 100644 index 000000000..8a78e7c54 --- /dev/null +++ b/models/cometapi/models/common_openai.py @@ -0,0 +1,49 @@ +from collections.abc import Mapping + +import openai +from httpx import Timeout + +from dify_plugin.errors.model import InvokeAuthorizationError, InvokeBadRequestError, InvokeConnectionError, InvokeError, InvokeRateLimitError, InvokeServerUnavailableError + + +class _CommonOpenAI: + def _to_credential_kwargs(self, credentials: Mapping) -> dict: + """ + Transform credentials to kwargs for model instance + + :param credentials: + :return: + """ + credentials_kwargs = { + "api_key": credentials['api_key'], + "timeout": Timeout(315.0, read=300.0, write=10.0, connect=5.0), + "max_retries": 1, + } + + # CometAPI uses fixed base URL + credentials_kwargs["base_url"] = "https://api.cometapi.com/v1" + + return credentials_kwargs + + @property + def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]: + """ + Map model invoke error to unified error + The key is the error type thrown to the caller + The value is the error type thrown by the model, + which needs to be converted into a unified error type for the caller. 
+ + :return: Invoke error mapping + """ + return { + InvokeConnectionError: [openai.APIConnectionError, openai.APITimeoutError], + InvokeServerUnavailableError: [openai.InternalServerError], + InvokeRateLimitError: [openai.RateLimitError], + InvokeAuthorizationError: [openai.AuthenticationError, openai.PermissionDeniedError], + InvokeBadRequestError: [ + openai.BadRequestError, + openai.NotFoundError, + openai.UnprocessableEntityError, + openai.APIError, + ], + } diff --git a/models/cometapi/models/llm/__init__.py b/models/cometapi/models/llm/__init__.py new file mode 100755 index 000000000..e69de29bb diff --git a/models/cometapi/models/llm/_position.yaml b/models/cometapi/models/llm/_position.yaml new file mode 100644 index 000000000..0fce7b6c5 --- /dev/null +++ b/models/cometapi/models/llm/_position.yaml @@ -0,0 +1,84 @@ +# anthropic models (9) +- claude-3-5-haiku-20241022 +- claude-3-5-sonnet-20240620 +- claude-3-5-sonnet-20241022 +- claude-3-7-sonnet-20250219 +- claude-3-haiku-20240307 +- claude-3-opus-20240229 +- claude-3-sonnet-20240229 +- claude-opus-4-20250514 +- claude-sonnet-4-20250514 + +# deepseek models (5) +- deepseek-chat +- deepseek-r1-0528 +- deepseek-reasoner +- deepseek-v3 +- deepseek-v3-250324 + +# gemini models (5) +- gemini-2.0-flash +- gemini-2.0-flash-lite-preview-02-05 +- gemini-2.5-flash +- gemini-2.5-flash-lite-preview-06-17 +- gemini-2.5-pro + +# llama models (4) +- llama-3-70b +- llama-3-8b +- llama-4-maverick +- llama-4-scout + +# openai models (37) +- chatgpt-4o-latest +- gpt-3.5-turbo +- gpt-3.5-turbo-0125 +- gpt-3.5-turbo-1106 +- gpt-3.5-turbo-16k +- gpt-3.5-turbo-instruct +- gpt-4 +- gpt-4-0125-preview +- gpt-4-1106-preview +- gpt-4-turbo +- gpt-4-turbo-2024-04-09 +- gpt-4-turbo-preview +- gpt-4.1 +- gpt-4.1-2025-04-14 +- gpt-4.1-mini +- gpt-4.1-mini-2025-04-14 +- gpt-4.1-nano +- gpt-4.1-nano-2025-04-14 +- gpt-4o +- gpt-4o-2024-05-13 +- gpt-4o-2024-08-06 +- gpt-4o-2024-11-20 +- gpt-4o-mini +- gpt-4o-mini-2024-07-18 +- o1 +- o1-mini 
+- o1-mini-2024-09-12 +- o1-preview +- o1-preview-2024-09-12 +- o3 +- o3-2025-04-16 +- o3-mini +- o3-mini-2025-01-31 +- o3-pro +- o3-pro-2025-06-10 +- o4-mini +- o4-mini-2025-04-16 + +# qwen models (5) +- qwen2-72b-instruct +- qwen2.5-72b-instruct +- qwen3-30b-a3b +- qwen3-coder +- qwen3-coder-480b-a35b-instruct + +# x-ai models (6) +- grok-2-1212 +- grok-3-beta +- grok-3-fast +- grok-3-mini +- grok-3-mini-fast +- grok-4-0709 diff --git a/models/cometapi/models/llm/anthropic/claude-3-5-haiku-20241022.yaml b/models/cometapi/models/llm/anthropic/claude-3-5-haiku-20241022.yaml new file mode 100644 index 000000000..a1942b25a --- /dev/null +++ b/models/cometapi/models/llm/anthropic/claude-3-5-haiku-20241022.yaml @@ -0,0 +1,98 @@ +model: claude-3-5-haiku-20241022 +label: + en_US: claude-3-5-haiku-20241022 +model_type: llm +features: + - agent-thought + - tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 8192 + min: 1 + max: 8192 + - name: response_format + use_template: response_format + - name: prompt_caching_message_flow + label: + zh_Hans: 大消息自动缓存阈值 + en_US: Cache Message Flow Threshold + type: int + default: 0 + required: false + help: + zh_Hans: 当用户或助手消息的单词数超过此阈值时,自动为该消息添加缓存控制。0 表示禁用。 + en_US: If a user or assistant message exceeds this word count, add cache_control automatically. 0 disables the feature. + - name: prompt_caching_system_message + label: + zh_Hans: 缓存系统消息 + en_US: Cache System Message + type: boolean + default: true + required: false + help: + zh_Hans: 启用系统消息的提示缓存。中的内容将自动缓存。 + en_US: Enable prompt caching for system messages. 
Content within will be cached. + - name: prompt_caching_images + label: + zh_Hans: 缓存图片 + en_US: Cache Images + type: boolean + default: true + required: false + help: + zh_Hans: 启用图片的提示缓存。 + en_US: Enable prompt caching for images. + - name: prompt_caching_documents + label: + zh_Hans: 缓存文档 + en_US: Cache Documents + type: boolean + default: true + required: false + help: + zh_Hans: 启用文档的提示缓存。 + en_US: Enable prompt caching for documents. + - name: prompt_caching_tool_definitions + label: + zh_Hans: 缓存工具定义 + en_US: Cache Tool Definitions + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具定义的提示缓存。 + en_US: Enable prompt caching for tool definitions. + - name: prompt_caching_tool_results + label: + zh_Hans: 缓存工具结果 + en_US: Cache Tool Results + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具结果的提示缓存。 + en_US: Enable prompt caching for tool results. +pricing: + input: '1.00' + output: '5.00' + unit: '0.000001' + currency: USD diff --git a/models/cometapi/models/llm/anthropic/claude-3-5-sonnet-20240620.yaml b/models/cometapi/models/llm/anthropic/claude-3-5-sonnet-20240620.yaml new file mode 100644 index 000000000..fadc351f2 --- /dev/null +++ b/models/cometapi/models/llm/anthropic/claude-3-5-sonnet-20240620.yaml @@ -0,0 +1,100 @@ +model: claude-3-5-sonnet-20240620 +label: + en_US: claude-3-5-sonnet-20240620 +model_type: llm +features: + - agent-thought + - vision + - tool-call + - stream-tool-call + - document +model_properties: + mode: chat + context_size: 200000 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 8192 + min: 1 + max: 8192 + - name: response_format + use_template: response_format + - name: prompt_caching_message_flow + label: + zh_Hans: 大消息自动缓存阈值 + en_US: Cache Message Flow Threshold + type: int + default: 0 + required: false + help: + zh_Hans: 当用户或助手消息的单词数超过此阈值时,自动为该消息添加缓存控制。0 表示禁用。 + en_US: If a user or assistant message exceeds this word count, add cache_control automatically. 0 disables the feature. + - name: prompt_caching_system_message + label: + zh_Hans: 缓存系统消息 + en_US: Cache System Message + type: boolean + default: true + required: false + help: + zh_Hans: 启用系统消息的提示缓存。中的内容将自动缓存。 + en_US: Enable prompt caching for system messages. Content within will be cached. + - name: prompt_caching_images + label: + zh_Hans: 缓存图片 + en_US: Cache Images + type: boolean + default: true + required: false + help: + zh_Hans: 启用图片的提示缓存。 + en_US: Enable prompt caching for images. + - name: prompt_caching_documents + label: + zh_Hans: 缓存文档 + en_US: Cache Documents + type: boolean + default: true + required: false + help: + zh_Hans: 启用文档的提示缓存。 + en_US: Enable prompt caching for documents. + - name: prompt_caching_tool_definitions + label: + zh_Hans: 缓存工具定义 + en_US: Cache Tool Definitions + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具定义的提示缓存。 + en_US: Enable prompt caching for tool definitions. + - name: prompt_caching_tool_results + label: + zh_Hans: 缓存工具结果 + en_US: Cache Tool Results + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具结果的提示缓存。 + en_US: Enable prompt caching for tool results. 
+pricing: + input: '3.00' + output: '15.00' + unit: '0.000001' + currency: USD diff --git a/models/cometapi/models/llm/anthropic/claude-3-5-sonnet-20241022.yaml b/models/cometapi/models/llm/anthropic/claude-3-5-sonnet-20241022.yaml new file mode 100644 index 000000000..35674aabd --- /dev/null +++ b/models/cometapi/models/llm/anthropic/claude-3-5-sonnet-20241022.yaml @@ -0,0 +1,100 @@ +model: claude-3-5-sonnet-20241022 +label: + en_US: claude-3-5-sonnet-20241022 +model_type: llm +features: + - agent-thought + - vision + - tool-call + - stream-tool-call + - document +model_properties: + mode: chat + context_size: 200000 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 8192 + min: 1 + max: 8192 + - name: response_format + use_template: response_format + - name: prompt_caching_message_flow + label: + zh_Hans: 大消息自动缓存阈值 + en_US: Cache Message Flow Threshold + type: int + default: 0 + required: false + help: + zh_Hans: 当用户或助手消息的单词数超过此阈值时,自动为该消息添加缓存控制。0 表示禁用。 + en_US: If a user or assistant message exceeds this word count, add cache_control automatically. 0 disables the feature. + - name: prompt_caching_system_message + label: + zh_Hans: 缓存系统消息 + en_US: Cache System Message + type: boolean + default: true + required: false + help: + zh_Hans: 启用系统消息的提示缓存。中的内容将自动缓存。 + en_US: Enable prompt caching for system messages. Content within will be cached. + - name: prompt_caching_images + label: + zh_Hans: 缓存图片 + en_US: Cache Images + type: boolean + default: true + required: false + help: + zh_Hans: 启用图片的提示缓存。 + en_US: Enable prompt caching for images. 
+ - name: prompt_caching_documents + label: + zh_Hans: 缓存文档 + en_US: Cache Documents + type: boolean + default: true + required: false + help: + zh_Hans: 启用文档的提示缓存。 + en_US: Enable prompt caching for documents. + - name: prompt_caching_tool_definitions + label: + zh_Hans: 缓存工具定义 + en_US: Cache Tool Definitions + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具定义的提示缓存。 + en_US: Enable prompt caching for tool definitions. + - name: prompt_caching_tool_results + label: + zh_Hans: 缓存工具结果 + en_US: Cache Tool Results + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具结果的提示缓存。 + en_US: Enable prompt caching for tool results. +pricing: + input: '3.00' + output: '15.00' + unit: '0.000001' + currency: USD diff --git a/models/cometapi/models/llm/anthropic/claude-3-7-sonnet-20250219.yaml b/models/cometapi/models/llm/anthropic/claude-3-7-sonnet-20250219.yaml new file mode 100644 index 000000000..4c00aa490 --- /dev/null +++ b/models/cometapi/models/llm/anthropic/claude-3-7-sonnet-20250219.yaml @@ -0,0 +1,131 @@ +model: claude-3-7-sonnet-20250219 +label: + en_US: claude-3-7-sonnet-20250219 +model_type: llm +features: + - agent-thought + - vision + - tool-call + - stream-tool-call + - document +model_properties: + mode: chat + context_size: 200000 +parameter_rules: + - name: thinking + label: + zh_Hans: 推理模式 + en_US: Thinking Mode + type: boolean + default: false + required: false + help: + zh_Hans: 控制模型的推理能力。启用时,temperature、top_p和top_k将被禁用。 + en_US: Controls the model's thinking capability. When enabled, temperature, top_p and top_k will be disabled. + - name: thinking_budget + label: + zh_Hans: 推理预算 + en_US: Thinking Budget + type: int + default: 1024 + min: 0 + max: 128000 + required: false + help: + zh_Hans: 推理的预算限制(最小1024),必须小于max_tokens。仅在推理模式启用时可用。 + en_US: Budget limit for thinking (minimum 1024), must be less than max_tokens. Only available when thinking mode is enabled. 
+ - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 64000 + min: 1 + max: 128000 + - name: response_format + use_template: response_format + - name: extended_output + label: + zh_Hans: 扩展输出 + en_US: Extended Output + type: boolean + default: false + help: + zh_Hans: 启用长达128K标记的输出能力。 + en_US: Enable capability for up to 128K output tokens. + - name: prompt_caching_message_flow + label: + zh_Hans: 大消息自动缓存阈值 + en_US: Cache Message Flow Threshold + type: int + default: 0 + required: false + help: + zh_Hans: 当用户或助手消息的单词数超过此阈值时,自动为该消息添加缓存控制。0 表示禁用。 + en_US: If a user or assistant message exceeds this word count, add cache_control automatically. 0 disables the feature. + - name: prompt_caching_system_message + label: + zh_Hans: 缓存系统消息 + en_US: Cache System Message + type: boolean + default: true + required: false + help: + zh_Hans: 启用系统消息的提示缓存。中的内容将自动缓存。 + en_US: Enable prompt caching for system messages. Content within will be cached. + - name: prompt_caching_images + label: + zh_Hans: 缓存图片 + en_US: Cache Images + type: boolean + default: true + required: false + help: + zh_Hans: 启用图片的提示缓存。 + en_US: Enable prompt caching for images. + - name: prompt_caching_documents + label: + zh_Hans: 缓存文档 + en_US: Cache Documents + type: boolean + default: true + required: false + help: + zh_Hans: 启用文档的提示缓存。 + en_US: Enable prompt caching for documents. + - name: prompt_caching_tool_definitions + label: + zh_Hans: 缓存工具定义 + en_US: Cache Tool Definitions + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具定义的提示缓存。 + en_US: Enable prompt caching for tool definitions. 
+ - name: prompt_caching_tool_results + label: + zh_Hans: 缓存工具结果 + en_US: Cache Tool Results + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具结果的提示缓存。 + en_US: Enable prompt caching for tool results. +pricing: + input: '3.00' + output: '15.00' + unit: '0.000001' + currency: USD diff --git a/models/cometapi/models/llm/anthropic/claude-3-haiku-20240307.yaml b/models/cometapi/models/llm/anthropic/claude-3-haiku-20240307.yaml new file mode 100644 index 000000000..cb2af1308 --- /dev/null +++ b/models/cometapi/models/llm/anthropic/claude-3-haiku-20240307.yaml @@ -0,0 +1,39 @@ +model: claude-3-haiku-20240307 +label: + en_US: claude-3-haiku-20240307 +model_type: llm +features: + - agent-thought + - vision + - tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 4096 + min: 1 + max: 4096 + - name: response_format + use_template: response_format +pricing: + input: '0.25' + output: '1.25' + unit: '0.000001' + currency: USD diff --git a/models/cometapi/models/llm/anthropic/claude-3-opus-20240229.yaml b/models/cometapi/models/llm/anthropic/claude-3-opus-20240229.yaml new file mode 100644 index 000000000..101f54c3f --- /dev/null +++ b/models/cometapi/models/llm/anthropic/claude-3-opus-20240229.yaml @@ -0,0 +1,39 @@ +model: claude-3-opus-20240229 +label: + en_US: claude-3-opus-20240229 +model_type: llm +features: + - agent-thought + - vision + - tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 4096 + min: 1 + max: 4096 + - name: response_format + use_template: response_format +pricing: + input: '15.00' + output: '75.00' + unit: '0.000001' + currency: USD diff --git a/models/cometapi/models/llm/anthropic/claude-3-sonnet-20240229.yaml b/models/cometapi/models/llm/anthropic/claude-3-sonnet-20240229.yaml new file mode 100644 index 000000000..daf55553f --- /dev/null +++ b/models/cometapi/models/llm/anthropic/claude-3-sonnet-20240229.yaml @@ -0,0 +1,39 @@ +model: claude-3-sonnet-20240229 +label: + en_US: claude-3-sonnet-20240229 +model_type: llm +features: + - agent-thought + - vision + - tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 4096 + min: 1 + max: 4096 + - name: response_format + use_template: response_format +pricing: + input: '3.00' + output: '15.00' + unit: '0.000001' + currency: USD diff --git a/models/cometapi/models/llm/anthropic/claude-opus-4-20250514.yaml b/models/cometapi/models/llm/anthropic/claude-opus-4-20250514.yaml new file mode 100644 index 000000000..398ab6f9c --- /dev/null +++ b/models/cometapi/models/llm/anthropic/claude-opus-4-20250514.yaml @@ -0,0 +1,123 @@ +model: claude-opus-4-20250514 +label: + en_US: claude-opus-4-20250514 +model_type: llm +features: + - agent-thought + - vision + - tool-call + - stream-tool-call + - document +model_properties: + mode: chat + context_size: 200000 +parameter_rules: + - name: thinking + label: + zh_Hans: 推理模式 + en_US: Thinking Mode + type: boolean + default: false + required: false + help: + zh_Hans: 控制模型的推理能力。启用时,temperature、top_p和top_k将被禁用。 + en_US: Controls the model's thinking capability. When enabled, temperature, top_p and top_k will be disabled. + - name: thinking_budget + label: + zh_Hans: 推理预算 + en_US: Thinking Budget + type: int + default: 1024 + min: 0 + max: 32000 + required: false + help: + zh_Hans: 推理的预算限制(最小1024),必须小于max_tokens。仅在推理模式启用时可用。 + en_US: Budget limit for thinking (minimum 1024), must be less than max_tokens. Only available when thinking mode is enabled. + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 32000 + min: 1 + max: 32000 + - name: response_format + use_template: response_format + - name: prompt_caching_message_flow + label: + zh_Hans: 大消息自动缓存阈值 + en_US: Cache Message Flow Threshold + type: int + default: 0 + required: false + help: + zh_Hans: 当用户或助手消息的单词数超过此阈值时,自动为该消息添加缓存控制。0 表示禁用。 + en_US: If a user or assistant message exceeds this word count, add cache_control automatically. 0 disables the feature. + - name: prompt_caching_system_message + label: + zh_Hans: 缓存系统消息 + en_US: Cache System Message + type: boolean + default: true + required: false + help: + zh_Hans: 启用系统消息的提示缓存。中的内容将自动缓存。 + en_US: Enable prompt caching for system messages. Content within will be cached. + - name: prompt_caching_images + label: + zh_Hans: 缓存图片 + en_US: Cache Images + type: boolean + default: true + required: false + help: + zh_Hans: 启用图片的提示缓存。 + en_US: Enable prompt caching for images. + - name: prompt_caching_documents + label: + zh_Hans: 缓存文档 + en_US: Cache Documents + type: boolean + default: true + required: false + help: + zh_Hans: 启用文档的提示缓存。 + en_US: Enable prompt caching for documents. + - name: prompt_caching_tool_definitions + label: + zh_Hans: 缓存工具定义 + en_US: Cache Tool Definitions + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具定义的提示缓存。 + en_US: Enable prompt caching for tool definitions. + - name: prompt_caching_tool_results + label: + zh_Hans: 缓存工具结果 + en_US: Cache Tool Results + type: boolean + default: true + required: false + help: + zh_Hans: 启用工具结果的提示缓存。 + en_US: Enable prompt caching for tool results. 
+pricing: + input: '15.00' + output: '75.00' + unit: '0.000001' + currency: USD + diff --git a/models/cometapi/models/llm/anthropic/claude-sonnet-4-20250514.yaml b/models/cometapi/models/llm/anthropic/claude-sonnet-4-20250514.yaml new file mode 100644 index 000000000..3304e4ea0 --- /dev/null +++ b/models/cometapi/models/llm/anthropic/claude-sonnet-4-20250514.yaml @@ -0,0 +1,122 @@ +model: claude-sonnet-4-20250514 +label: + en_US: claude-sonnet-4-20250514 +model_type: llm +features: + - agent-thought + - vision + - tool-call + - stream-tool-call + - document +model_properties: + mode: chat + context_size: 200000 +parameter_rules: + - name: thinking + label: + zh_Hans: 推理模式 + en_US: Thinking Mode + type: boolean + default: false + required: false + help: + zh_Hans: 控制模型的推理能力。启用时,temperature、top_p和top_k将被禁用。 + en_US: Controls the model's thinking capability. When enabled, temperature, top_p and top_k will be disabled. + - name: thinking_budget + label: + zh_Hans: 推理预算 + en_US: Thinking Budget + type: int + default: 1024 + min: 0 + max: 64000 + required: false + help: + zh_Hans: 推理的预算限制(最小1024),必须小于max_tokens。仅在推理模式启用时可用。 + en_US: Budget limit for thinking (minimum 1024), must be less than max_tokens. Only available when thinking mode is enabled. + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+    required: false
+  - name: max_tokens
+    use_template: max_tokens
+    required: true
+    default: 64000
+    min: 1
+    max: 64000
+  - name: response_format
+    use_template: response_format
+  - name: prompt_caching_message_flow
+    label:
+      zh_Hans: 大消息自动缓存阈值
+      en_US: Cache Message Flow Threshold
+    type: int
+    default: 0
+    required: false
+    help:
+      zh_Hans: 当用户或助手消息的单词数超过此阈值时,自动为该消息添加缓存控制。0 表示禁用。
+      en_US: If a user or assistant message exceeds this word count, add cache_control automatically. 0 disables the feature.
+  - name: prompt_caching_system_message
+    label:
+      zh_Hans: 缓存系统消息
+      en_US: Cache System Message
+    type: boolean
+    default: true
+    required: false
+    help:
+      zh_Hans: 启用系统消息的提示缓存。系统消息中的内容将自动缓存。
+      en_US: Enable prompt caching for system messages. Content within the system message will be cached.
+  - name: prompt_caching_images
+    label:
+      zh_Hans: 缓存图片
+      en_US: Cache Images
+    type: boolean
+    default: true
+    required: false
+    help:
+      zh_Hans: 启用图片的提示缓存。
+      en_US: Enable prompt caching for images.
+  - name: prompt_caching_documents
+    label:
+      zh_Hans: 缓存文档
+      en_US: Cache Documents
+    type: boolean
+    default: true
+    required: false
+    help:
+      zh_Hans: 启用文档的提示缓存。
+      en_US: Enable prompt caching for documents.
+  - name: prompt_caching_tool_definitions
+    label:
+      zh_Hans: 缓存工具定义
+      en_US: Cache Tool Definitions
+    type: boolean
+    default: true
+    required: false
+    help:
+      zh_Hans: 启用工具定义的提示缓存。
+      en_US: Enable prompt caching for tool definitions.
+  - name: prompt_caching_tool_results
+    label:
+      zh_Hans: 缓存工具结果
+      en_US: Cache Tool Results
+    type: boolean
+    default: true
+    required: false
+    help:
+      zh_Hans: 启用工具结果的提示缓存。
+      en_US: Enable prompt caching for tool results.
+pricing: + input: '3.00' + output: '15.00' + unit: '0.000001' + currency: USD diff --git a/models/cometapi/models/llm/deepseek/deepseek-chat.yaml b/models/cometapi/models/llm/deepseek/deepseek-chat.yaml new file mode 100755 index 000000000..64a522893 --- /dev/null +++ b/models/cometapi/models/llm/deepseek/deepseek-chat.yaml @@ -0,0 +1,56 @@ +model: deepseek-chat +label: + en_US: deepseek-chat +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 128000 +parameter_rules: + - name: temperature + use_template: temperature + type: float + default: 1 + min: 0.0 + max: 2.0 + help: + zh_Hans: 控制生成结果的多样性和随机性。数值越小,越严谨;数值越大,越发散。 + en_US: Control the diversity and randomness of generated results. The smaller the value, the more rigorous it is; the larger the value, the more divergent it is. + - name: max_tokens + use_template: max_tokens + type: int + default: 4096 + min: 1 + max: 4096 + help: + zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。 + en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter. + - name: top_p + use_template: top_p + type: float + default: 1 + min: 0.01 + max: 1.0 + help: + zh_Hans: 控制生成结果的随机性。数值越小,随机性越弱;数值越大,随机性越强。一般而言,top_p 和 temperature 两个参数选择一个进行调整即可。 + en_US: Control the randomness of generated results. The smaller the value, the weaker the randomness; the larger the value, the stronger the randomness. Generally speaking, you can adjust one of the two parameters top_p and temperature. + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: frequency_penalty + use_template: frequency_penalty + default: 0 + min: -2.0 + max: 2.0 + help: + zh_Hans: 介于 -2.0 和 2.0 之间的数字。如果该值为正,那么新 token 会根据其在已有文本中的出现频率受到相应的惩罚,降低模型重复相同内容的可能性。 + en_US: A number between -2.0 and 2.0. If the value is positive, new tokens are penalized based on their frequency of occurrence in existing text, reducing the likelihood that the model will repeat the same content. diff --git a/models/cometapi/models/llm/deepseek/deepseek-r1-0528.yaml b/models/cometapi/models/llm/deepseek/deepseek-r1-0528.yaml new file mode 100755 index 000000000..2a26c9366 --- /dev/null +++ b/models/cometapi/models/llm/deepseek/deepseek-r1-0528.yaml @@ -0,0 +1,57 @@ +model: deepseek-r1-0528 +label: + en_US: deepseek-r1-0528 +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call + - agent-thought +model_properties: + mode: chat + context_size: 128000 +parameter_rules: + - name: temperature + use_template: temperature + type: float + default: 1 + min: 0.0 + max: 2.0 + help: + zh_Hans: 控制生成结果的多样性和随机性。数值越小,越严谨;数值越大,越发散。 + en_US: Control the diversity and randomness of generated results. The smaller the value, the more rigorous it is; the larger the value, the more divergent it is. + - name: max_tokens + use_template: max_tokens + type: int + default: 4096 + min: 1 + max: 4096 + help: + zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。 + en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter. + - name: top_p + use_template: top_p + type: float + default: 1 + min: 0.01 + max: 1.0 + help: + zh_Hans: 控制生成结果的随机性。数值越小,随机性越弱;数值越大,随机性越强。一般而言,top_p 和 temperature 两个参数选择一个进行调整即可。 + en_US: Control the randomness of generated results. The smaller the value, the weaker the randomness; the larger the value, the stronger the randomness. Generally speaking, you can adjust one of the two parameters top_p and temperature. 
+ - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: frequency_penalty + use_template: frequency_penalty + default: 0 + min: -2.0 + max: 2.0 + help: + zh_Hans: 介于 -2.0 和 2.0 之间的数字。如果该值为正,那么新 token 会根据其在已有文本中的出现频率受到相应的惩罚,降低模型重复相同内容的可能性。 + en_US: A number between -2.0 and 2.0. If the value is positive, new tokens are penalized based on their frequency of occurrence in existing text, reducing the likelihood that the model will repeat the same content. diff --git a/models/cometapi/models/llm/deepseek/deepseek-reasoner.yaml b/models/cometapi/models/llm/deepseek/deepseek-reasoner.yaml new file mode 100644 index 000000000..af9883943 --- /dev/null +++ b/models/cometapi/models/llm/deepseek/deepseek-reasoner.yaml @@ -0,0 +1,31 @@ +model: deepseek-reasoner +label: + zh_Hans: deepseek-reasoner + en_US: deepseek-reasoner +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call + - agent-thought +model_properties: + mode: chat + context_size: 128000 +parameter_rules: + - name: max_tokens + use_template: max_tokens + min: 1 + max: 8192 + default: 4096 + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/deepseek/deepseek-v3-250324.yaml b/models/cometapi/models/llm/deepseek/deepseek-v3-250324.yaml new file mode 100755 index 000000000..41caad3d0 --- /dev/null +++ b/models/cometapi/models/llm/deepseek/deepseek-v3-250324.yaml @@ -0,0 +1,56 @@ +model: deepseek-v3-250324 +label: + en_US: deepseek-v3-250324 +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 32000 +parameter_rules: + - name: temperature + 
use_template: temperature + type: float + default: 1 + min: 0.0 + max: 2.0 + help: + zh_Hans: 控制生成结果的多样性和随机性。数值越小,越严谨;数值越大,越发散。 + en_US: Control the diversity and randomness of generated results. The smaller the value, the more rigorous it is; the larger the value, the more divergent it is. + - name: max_tokens + use_template: max_tokens + type: int + default: 4096 + min: 1 + max: 4096 + help: + zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。 + en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter. + - name: top_p + use_template: top_p + type: float + default: 1 + min: 0.01 + max: 1.0 + help: + zh_Hans: 控制生成结果的随机性。数值越小,随机性越弱;数值越大,随机性越强。一般而言,top_p 和 temperature 两个参数选择一个进行调整即可。 + en_US: Control the randomness of generated results. The smaller the value, the weaker the randomness; the larger the value, the stronger the randomness. Generally speaking, you can adjust one of the two parameters top_p and temperature. + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: frequency_penalty + use_template: frequency_penalty + default: 0 + min: -2.0 + max: 2.0 + help: + zh_Hans: 介于 -2.0 和 2.0 之间的数字。如果该值为正,那么新 token 会根据其在已有文本中的出现频率受到相应的惩罚,降低模型重复相同内容的可能性。 + en_US: A number between -2.0 and 2.0. If the value is positive, new tokens are penalized based on their frequency of occurrence in existing text, reducing the likelihood that the model will repeat the same content. 
diff --git a/models/cometapi/models/llm/deepseek/deepseek-v3.yaml b/models/cometapi/models/llm/deepseek/deepseek-v3.yaml new file mode 100755 index 000000000..9f5904573 --- /dev/null +++ b/models/cometapi/models/llm/deepseek/deepseek-v3.yaml @@ -0,0 +1,54 @@ +model: deepseek-v3 +label: + en_US: deepseek-v3 +model_type: llm +features: + - vision +model_properties: + mode: chat + context_size: 32000 +parameter_rules: + - name: temperature + use_template: temperature + type: float + default: 1 + min: 0.0 + max: 2.0 + help: + zh_Hans: 控制生成结果的多样性和随机性。数值越小,越严谨;数值越大,越发散。 + en_US: Control the diversity and randomness of generated results. The smaller the value, the more rigorous it is; the larger the value, the more divergent it is. + - name: max_tokens + use_template: max_tokens + type: int + default: 4096 + min: 1 + max: 4096 + help: + zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。 + en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter. + - name: top_p + use_template: top_p + type: float + default: 1 + min: 0.01 + max: 1.0 + help: + zh_Hans: 控制生成结果的随机性。数值越小,随机性越弱;数值越大,随机性越强。一般而言,top_p 和 temperature 两个参数选择一个进行调整即可。 + en_US: Control the randomness of generated results. The smaller the value, the weaker the randomness; the larger the value, the stronger the randomness. Generally speaking, you can adjust one of the two parameters top_p and temperature. + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: frequency_penalty + use_template: frequency_penalty + default: 0 + min: -2.0 + max: 2.0 + help: + zh_Hans: 介于 -2.0 和 2.0 之间的数字。如果该值为正,那么新 token 会根据其在已有文本中的出现频率受到相应的惩罚,降低模型重复相同内容的可能性。 + en_US: A number between -2.0 and 2.0. 
If the value is positive, new tokens are penalized based on their frequency of occurrence in existing text, reducing the likelihood that the model will repeat the same content. diff --git a/models/cometapi/models/llm/gemini/gemini-2.0-flash-lite-preview-02-05.yaml b/models/cometapi/models/llm/gemini/gemini-2.0-flash-lite-preview-02-05.yaml new file mode 100755 index 000000000..fc354a989 --- /dev/null +++ b/models/cometapi/models/llm/gemini/gemini-2.0-flash-lite-preview-02-05.yaml @@ -0,0 +1,45 @@ +model: gemini-2.0-flash-lite-preview-02-05 +label: + en_US: gemini-2.0-flash-lite-preview-02-05 +model_type: llm +features: + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 1048576 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: presence_penalty + use_template: presence_penalty + - name: frequency_penalty + use_template: frequency_penalty + - name: max_tokens + use_template: max_tokens + default: 8192 + min: 1 + max: 8192 + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/gemini/gemini-2.0-flash.yaml b/models/cometapi/models/llm/gemini/gemini-2.0-flash.yaml new file mode 100755 index 000000000..4545283a8 --- /dev/null +++ b/models/cometapi/models/llm/gemini/gemini-2.0-flash.yaml @@ -0,0 +1,45 @@ +model: gemini-2.0-flash +label: + en_US: gemini-2.0-flash +model_type: llm +features: + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 1048576 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: presence_penalty + use_template: presence_penalty + - name: frequency_penalty + use_template: frequency_penalty + - name: max_tokens + use_template: max_tokens + default: 8192 + min: 1 + max: 8192 + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/gemini/gemini-2.5-flash-lite-preview-06-17.yaml b/models/cometapi/models/llm/gemini/gemini-2.5-flash-lite-preview-06-17.yaml new file mode 100755 index 000000000..f051b7083 --- /dev/null +++ b/models/cometapi/models/llm/gemini/gemini-2.5-flash-lite-preview-06-17.yaml @@ -0,0 +1,80 @@ +model: gemini-2.5-flash-lite-preview-06-17 +label: + en_US: gemini-2.5-flash-lite-preview-06-17 +model_type: llm +features: + - multi-tool-call + - stream-tool-call + - vision +model_properties: + mode: chat + context_size: 1048576 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: presence_penalty + use_template: presence_penalty + - name: frequency_penalty + use_template: frequency_penalty + - name: max_tokens + use_template: max_tokens + default: 65536 + min: 1 + max: 65536 + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - name: enable_thinking + required: false + type: string + default: dynamic + label: + zh_Hans: 思考模式 + en_US: Thinking mode + help: + zh_Hans: 是否开启思考模式。 + en_US: Whether to enable thinking mode. 
+ options: + - dynamic + - manual + - name: reasoning_budget + label: + zh_Hans: 思考预算 + en_US: Reasoning budget + type: int + help: + zh_Hans: 思考预算,单位为 token + en_US: Reasoning budget, in tokens + required: false + min: 512 + max: 24576 + - name: exclude_reasoning_tokens + label: + zh_Hans: 隐藏思考过程 + en_US: Hide the thought process + type: boolean + default: true + help: + zh_Hans: 是否隐藏思考过程。 + en_US: Whether to hide the thought process. + required: false diff --git a/models/cometapi/models/llm/gemini/gemini-2.5-flash.yaml b/models/cometapi/models/llm/gemini/gemini-2.5-flash.yaml new file mode 100755 index 000000000..8acf2e289 --- /dev/null +++ b/models/cometapi/models/llm/gemini/gemini-2.5-flash.yaml @@ -0,0 +1,80 @@ +model: gemini-2.5-flash +label: + en_US: gemini-2.5-flash +model_type: llm +features: + - multi-tool-call + - stream-tool-call + - vision +model_properties: + mode: chat + context_size: 1048576 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: presence_penalty + use_template: presence_penalty + - name: frequency_penalty + use_template: frequency_penalty + - name: max_tokens + use_template: max_tokens + default: 65536 + min: 1 + max: 65536 + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - name: enable_thinking + required: false + type: string + default: dynamic + label: + zh_Hans: 思考模式 + en_US: Thinking mode + help: + zh_Hans: 是否开启思考模式。 + en_US: Whether to enable thinking mode. 
+ options: + - dynamic + - manual + - name: reasoning_budget + label: + zh_Hans: 思考预算 + en_US: Reasoning budget + type: int + help: + zh_Hans: 思考预算,单位为 token + en_US: Reasoning budget, in tokens + required: false + min: 0 + max: 24576 + - name: exclude_reasoning_tokens + label: + zh_Hans: 隐藏思考过程 + en_US: Hide the thought process + type: boolean + default: true + help: + zh_Hans: 是否隐藏思考过程。 + en_US: Whether to hide the thought process. + required: false diff --git a/models/cometapi/models/llm/gemini/gemini-2.5-pro.yaml b/models/cometapi/models/llm/gemini/gemini-2.5-pro.yaml new file mode 100755 index 000000000..eafad66cf --- /dev/null +++ b/models/cometapi/models/llm/gemini/gemini-2.5-pro.yaml @@ -0,0 +1,81 @@ +model: gemini-2.5-pro +label: + en_US: gemini-2.5-pro +model_type: llm +features: + - multi-tool-call + - stream-tool-call + - vision + - document +model_properties: + mode: chat + context_size: 1048576 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: presence_penalty + use_template: presence_penalty + - name: frequency_penalty + use_template: frequency_penalty + - name: max_tokens + use_template: max_tokens + default: 65536 + min: 1 + max: 65536 + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - name: enable_thinking + required: false + type: string + default: dynamic + label: + zh_Hans: 思考模式 + en_US: Thinking mode + help: + zh_Hans: 切换思考模式。 + en_US: Switch thinking mode. 
+ options: + - dynamic + - manual + - name: reasoning_budget + label: + zh_Hans: 思考预算 + en_US: Reasoning budget + type: int + help: + zh_Hans: 思考预算,单位为 token + en_US: Reasoning budget, in tokens + required: false + min: 128 + max: 32768 + - name: exclude_reasoning_tokens + label: + zh_Hans: 隐藏思考过程 + en_US: Hide the thought process + type: boolean + default: true + help: + zh_Hans: 是否隐藏思考过程。 + en_US: Whether to hide the thought process. + required: false diff --git a/models/cometapi/models/llm/llama/llama-3-70b.yaml b/models/cometapi/models/llm/llama/llama-3-70b.yaml new file mode 100755 index 000000000..0f6d37f47 --- /dev/null +++ b/models/cometapi/models/llm/llama/llama-3-70b.yaml @@ -0,0 +1,31 @@ +model: llama-3-70b +label: + en_US: llama-3-70b +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 8192 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 512 + min: 1 + max: 2048 diff --git a/models/cometapi/models/llm/llama/llama-3-8b.yaml b/models/cometapi/models/llm/llama/llama-3-8b.yaml new file mode 100755 index 000000000..090374ec9 --- /dev/null +++ b/models/cometapi/models/llm/llama/llama-3-8b.yaml @@ -0,0 +1,31 @@ +model: llama-3-8b +label: + en_US: llama-3-8b +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 8192 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: max_tokens + use_template: max_tokens + required: true + default: 512 + min: 1 + max: 2048 diff --git a/models/cometapi/models/llm/llama/llama-4-maverick.yaml b/models/cometapi/models/llm/llama/llama-4-maverick.yaml new file mode 100755 index 000000000..1021ee43c --- /dev/null +++ b/models/cometapi/models/llm/llama/llama-4-maverick.yaml @@ -0,0 +1,31 @@ +model: llama-4-maverick +label: + en_US: llama-4-maverick +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 8192 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+    required: false
+  - name: max_tokens
+    use_template: max_tokens
+    required: true
+    default: 512
+    min: 1
+    max: 2048
diff --git a/models/cometapi/models/llm/llama/llama-4-scout.yaml b/models/cometapi/models/llm/llama/llama-4-scout.yaml
new file mode 100755
index 000000000..b4a88cd06
--- /dev/null
+++ b/models/cometapi/models/llm/llama/llama-4-scout.yaml
@@ -0,0 +1,32 @@
+model: llama-4-scout
+label:
+  en_US: llama-4-scout
+model_type: llm
+features:
+  - tool-call
+  - multi-tool-call
+  - stream-tool-call
+  - vision
+model_properties:
+  mode: chat
+  context_size: 8192
+parameter_rules:
+  - name: temperature
+    use_template: temperature
+  - name: top_p
+    use_template: top_p
+  - name: top_k
+    label:
+      zh_Hans: 取样数量
+      en_US: Top k
+    type: int
+    help:
+      zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
+      en_US: Only sample from the top K options for each subsequent token.
+    required: false
+  - name: max_tokens
+    use_template: max_tokens
+    required: true
+    default: 512
+    min: 1
+    max: 2048
diff --git a/models/cometapi/models/llm/llm.py b/models/cometapi/models/llm/llm.py
new file mode 100755
index 000000000..0a83c5f47
--- /dev/null
+++ b/models/cometapi/models/llm/llm.py
@@ -0,0 +1,137 @@
+from collections.abc import Generator
+from typing import Optional, Union
+from dify_plugin.entities.model import AIModelEntity, ModelFeature
+from dify_plugin.entities.model.llm import LLMResult, LLMResultChunk, LLMResultChunkDelta
+from dify_plugin.entities.model.message import PromptMessage, PromptMessageTool
+from dify_plugin import OAICompatLargeLanguageModel
+
+
+class CometAPILargeLanguageModel(OAICompatLargeLanguageModel):
+    def _update_credential(self, model: str, credentials: dict):
+        credentials["endpoint_url"] = "https://api.cometapi.com/v1"
+        credentials["mode"] = self.get_model_mode(model).value
+        credentials["openai_api_key"] = credentials["api_key"]  # Map api_key to openai_api_key
+        schema = self.get_model_schema(model, credentials)
+        if schema and {ModelFeature.TOOL_CALL, ModelFeature.MULTI_TOOL_CALL}.intersection(
+            schema.features or []
+        ):
+            credentials["function_calling_type"] = "tool_call"
+
+        # Add CometAPI specific headers
+        credentials["extra_headers"] = {
+            "HTTP-Referer": "https://dify.ai/",
+            "X-Title": "Dify"
+        }
+
+    def _invoke(
+        self,
+        model: str,
+        credentials: dict,
+        prompt_messages: list[PromptMessage],
+        model_parameters: dict,
+        tools: Optional[list[PromptMessageTool]] = None,
+        stop: Optional[list[str]] = None,
+        stream: bool = True,
+        user: Optional[str] = None,
+    ) -> Union[LLMResult, Generator]:
+        self._update_credential(model, credentials)
+        # reasoning
+        reasoning_params = {}
+        reasoning_budget = model_parameters.pop('reasoning_budget', None)
+        enable_thinking = model_parameters.pop('enable_thinking', None)
+        if enable_thinking == 'dynamic':
+            reasoning_budget = -1
+        if reasoning_budget is not None:
+            reasoning_params['max_tokens'] = reasoning_budget
+        reasoning_effort = model_parameters.pop('reasoning_effort', None)
+        if reasoning_effort is not None:
+            reasoning_params['effort'] = reasoning_effort
+        exclude_reasoning_tokens = model_parameters.pop('exclude_reasoning_tokens', None)
+        if exclude_reasoning_tokens is not None:
+            reasoning_params['exclude'] = exclude_reasoning_tokens
+        if reasoning_params:
+            model_parameters['reasoning'] = reasoning_params
+        return self._generate(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)
+
+    def validate_credentials(self, model: str, credentials: dict) -> None:
+        self._update_credential(model, credentials)
+        return super().validate_credentials(model, credentials)
+
+    def _generate(
+        self,
+        model: str,
+        credentials: dict,
+        prompt_messages: list[PromptMessage],
+        model_parameters: dict,
+        tools: Optional[list[PromptMessageTool]] = None,
+        stop: Optional[list[str]] = None,
+        stream: bool = True,
+        user: Optional[str] = None,
+    ) -> Union[LLMResult, Generator]:
+        self._update_credential(model, credentials)
+        return super()._generate(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)
+
+    def _wrap_thinking_by_reasoning_content(self, delta: dict, is_reasoning: bool) -> tuple[str, bool]:
+        """
+        If the reasoning response is from delta.get("reasoning") or delta.get("reasoning_content"),
+        we wrap it with HTML think tag.
+
+        :param delta: delta dictionary from LLM streaming response
+        :param is_reasoning: is reasoning
+        :return: tuple of (processed_content, is_reasoning)
+        """
+
+        content = delta.get("content") or ""
+        # NOTE(hzw): CometAPI uses "reasoning" instead of "reasoning_content".
+        reasoning_content = delta.get("reasoning") or delta.get("reasoning_content")
+
+        if reasoning_content:
+            if not is_reasoning:
+                content = "<think>\n" + reasoning_content  # open think block on first reasoning delta
+                is_reasoning = True
+            else:
+                content = reasoning_content
+        elif is_reasoning and content:
+            content = "</think>\n" + content  # close think block when normal content resumes
+            is_reasoning = False
+        return content, is_reasoning
+
+    def _generate_block_as_stream(
+        self,
+        model: str,
+        credentials: dict,
+        prompt_messages: list[PromptMessage],
+        model_parameters: dict,
+        tools: Optional[list[PromptMessageTool]] = None,
+        stop: Optional[list[str]] = None,
+        user: Optional[str] = None,
+    ) -> Generator:
+        resp = super()._generate(model, credentials, prompt_messages, model_parameters, tools, stop, False, user)
+        yield LLMResultChunk(
+            model=model,
+            prompt_messages=prompt_messages,
+            delta=LLMResultChunkDelta(
+                index=0,
+                message=resp.message,
+                usage=self._calc_response_usage(
+                    model=model,
+                    credentials=credentials,
+                    prompt_tokens=resp.usage.prompt_tokens,
+                    completion_tokens=resp.usage.completion_tokens,
+                ),
+                finish_reason="stop",
+            ),
+        )
+
+    def get_customizable_model_schema(self, model: str, credentials: dict) -> AIModelEntity:
+        return super().get_customizable_model_schema(model, credentials)
+
+    def get_num_tokens(
+        self,
+        model: str,
+        credentials: dict,
+        prompt_messages: list[PromptMessage],
+        tools: Optional[list[PromptMessageTool]] = None,
+    ) -> int:
+        self._update_credential(model, credentials)
+        return super().get_num_tokens(model, credentials, prompt_messages, tools)
diff --git a/models/cometapi/models/llm/openai/chatgpt-4o-latest.yaml b/models/cometapi/models/llm/openai/chatgpt-4o-latest.yaml
new file mode 100644
index 000000000..633bacb89
--- /dev/null
+++ b/models/cometapi/models/llm/openai/chatgpt-4o-latest.yaml
@@ -0,0 +1,39 @@
+model: chatgpt-4o-latest
+label:
+  zh_Hans: chatgpt-4o-latest
+  en_US: chatgpt-4o-latest
+model_type: llm
+features:
+- multi-tool-call
+- agent-thought
+- stream-tool-call
+- vision
+model_properties:
+  mode: chat
+  context_size: 128000
+parameter_rules:
+- name: temperature
+  use_template: temperature
+- name: top_p
+  use_template: top_p
+- name: presence_penalty
+  use_template: presence_penalty
+- name: frequency_penalty
+  use_template: frequency_penalty
+- name: max_tokens
+  use_template: max_tokens
+  default: 512
+  min: 1
+  max: 16384
+- name: response_format
+  label:
+    zh_Hans: 回复格式
+    en_US: Response Format
+  type: string
+  help:
+    zh_Hans: 指定模型必须输出的格式
+    en_US: specifying the format that the model must output
+  required: false
+  options:
+  - text
+  - json_object
diff --git a/models/cometapi/models/llm/openai/gpt-3.5-turbo-0125.yaml b/models/cometapi/models/llm/openai/gpt-3.5-turbo-0125.yaml
new file mode 100644
index 000000000..cb160df7a
--- /dev/null
+++ b/models/cometapi/models/llm/openai/gpt-3.5-turbo-0125.yaml
@@ -0,0 +1,38 @@
+model: gpt-3.5-turbo-0125
+label:
+  zh_Hans: gpt-3.5-turbo-0125
+  en_US: gpt-3.5-turbo-0125
+model_type: llm
+features:
+- multi-tool-call
+- agent-thought
+- stream-tool-call
+model_properties:
+  mode: chat
+  context_size: 16385
+parameter_rules:
+- name: temperature
+  use_template: temperature
+- name: top_p
+  use_template: top_p
+- name: presence_penalty
+  use_template: presence_penalty
+- name: frequency_penalty
+  use_template: frequency_penalty
+- name: max_tokens
+  use_template: max_tokens
+  default: 512
+  min: 1
+  max: 4096
+- 
name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/gpt-3.5-turbo-1106.yaml b/models/cometapi/models/llm/openai/gpt-3.5-turbo-1106.yaml new file mode 100644 index 000000000..2003a9a2e --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-3.5-turbo-1106.yaml @@ -0,0 +1,38 @@ +model: gpt-3.5-turbo-1106 +label: + zh_Hans: gpt-3.5-turbo-1106 + en_US: gpt-3.5-turbo-1106 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +model_properties: + mode: chat + context_size: 16385 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 4096 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/gpt-3.5-turbo-16k.yaml b/models/cometapi/models/llm/openai/gpt-3.5-turbo-16k.yaml new file mode 100644 index 000000000..47dee27b4 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-3.5-turbo-16k.yaml @@ -0,0 +1,28 @@ +model: gpt-3.5-turbo-16k +label: + zh_Hans: gpt-3.5-turbo-16k + en_US: gpt-3.5-turbo-16k +model_type: llm +features: + - multi-tool-call + - agent-thought + - stream-tool-call +model_properties: + mode: chat + context_size: 16385 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: presence_penalty + use_template: presence_penalty + - name: frequency_penalty + use_template: 
frequency_penalty + - name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 16385 + - name: response_format + use_template: response_format diff --git a/models/cometapi/models/llm/openai/gpt-3.5-turbo-instruct.yaml b/models/cometapi/models/llm/openai/gpt-3.5-turbo-instruct.yaml new file mode 100644 index 000000000..2def79cb4 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-3.5-turbo-instruct.yaml @@ -0,0 +1,25 @@ +model: gpt-3.5-turbo-instruct +label: + zh_Hans: gpt-3.5-turbo-instruct + en_US: gpt-3.5-turbo-instruct +model_type: llm +features: [] +model_properties: + mode: chat + context_size: 4096 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 4096 +- name: response_format + use_template: response_format diff --git a/models/cometapi/models/llm/openai/gpt-3.5-turbo.yaml b/models/cometapi/models/llm/openai/gpt-3.5-turbo.yaml new file mode 100644 index 000000000..d106b9933 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-3.5-turbo.yaml @@ -0,0 +1,38 @@ +model: gpt-3.5-turbo +label: + zh_Hans: gpt-3.5-turbo + en_US: gpt-3.5-turbo +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +model_properties: + mode: chat + context_size: 16385 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 4096 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text 
+ - json_object diff --git a/models/cometapi/models/llm/openai/gpt-4-0125-preview.yaml b/models/cometapi/models/llm/openai/gpt-4-0125-preview.yaml new file mode 100644 index 000000000..4871e08fa --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4-0125-preview.yaml @@ -0,0 +1,47 @@ +model: gpt-4-0125-preview +label: + zh_Hans: gpt-4-0125-preview + en_US: gpt-4-0125-preview +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 4096 +- name: seed + label: + zh_Hans: 种子 + en_US: Seed + type: int + help: + zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。 + en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend. 
+ required: false +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/gpt-4-1106-preview.yaml b/models/cometapi/models/llm/openai/gpt-4-1106-preview.yaml new file mode 100644 index 000000000..69abea53f --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4-1106-preview.yaml @@ -0,0 +1,47 @@ +model: gpt-4-1106-preview +label: + zh_Hans: gpt-4-1106-preview + en_US: gpt-4-1106-preview +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 4096 +- name: seed + label: + zh_Hans: 种子 + en_US: Seed + type: int + help: + zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。 + en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend. 
+ required: false +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/gpt-4-turbo-2024-04-09.yaml b/models/cometapi/models/llm/openai/gpt-4-turbo-2024-04-09.yaml new file mode 100644 index 000000000..3c13a6d04 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4-turbo-2024-04-09.yaml @@ -0,0 +1,48 @@ +model: gpt-4-turbo-2024-04-09 +label: + zh_Hans: gpt-4-turbo-2024-04-09 + en_US: gpt-4-turbo-2024-04-09 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 4096 +- name: seed + label: + zh_Hans: 种子 + en_US: Seed + type: int + help: + zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。 + en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend. 
+ required: false +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/gpt-4-turbo-preview.yaml b/models/cometapi/models/llm/openai/gpt-4-turbo-preview.yaml new file mode 100644 index 000000000..e1e4dde74 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4-turbo-preview.yaml @@ -0,0 +1,47 @@ +model: gpt-4-turbo-preview +label: + zh_Hans: gpt-4-turbo-preview + en_US: gpt-4-turbo-preview +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 4096 +- name: seed + label: + zh_Hans: 种子 + en_US: Seed + type: int + help: + zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。 + en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend. 
+ required: false +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/gpt-4-turbo.yaml b/models/cometapi/models/llm/openai/gpt-4-turbo.yaml new file mode 100644 index 000000000..e54cce440 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4-turbo.yaml @@ -0,0 +1,48 @@ +model: gpt-4-turbo +label: + zh_Hans: gpt-4-turbo + en_US: gpt-4-turbo +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 4096 +- name: seed + label: + zh_Hans: 种子 + en_US: Seed + type: int + help: + zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。 + en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend. 
+ required: false +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/gpt-4.1-2025-04-14.yaml b/models/cometapi/models/llm/openai/gpt-4.1-2025-04-14.yaml new file mode 100644 index 000000000..f0a1d589d --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4.1-2025-04-14.yaml @@ -0,0 +1,42 @@ +model: gpt-4.1-2025-04-14 +label: + zh_Hans: gpt-4.1-2025-04-14 + en_US: gpt-4.1-2025-04-14 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 1047576 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 32768 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4.1-mini-2025-04-14.yaml b/models/cometapi/models/llm/openai/gpt-4.1-mini-2025-04-14.yaml new file mode 100644 index 000000000..6e586855c --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4.1-mini-2025-04-14.yaml @@ -0,0 +1,42 @@ +model: gpt-4.1-mini-2025-04-14 +label: + zh_Hans: gpt-4.1-mini-2025-04-14 + en_US: gpt-4.1-mini-2025-04-14 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 1047576 +parameter_rules: +- name: temperature + use_template: temperature +- name: 
top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 32768 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4.1-mini.yaml b/models/cometapi/models/llm/openai/gpt-4.1-mini.yaml new file mode 100644 index 000000000..675e6b625 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4.1-mini.yaml @@ -0,0 +1,42 @@ +model: gpt-4.1-mini +label: + zh_Hans: gpt-4.1-mini + en_US: gpt-4.1-mini +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 1047576 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 32768 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4.1-nano-2025-04-14.yaml b/models/cometapi/models/llm/openai/gpt-4.1-nano-2025-04-14.yaml new file mode 100644 index 000000000..631d6ba23 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4.1-nano-2025-04-14.yaml @@ -0,0 +1,42 @@ +model: gpt-4.1-nano-2025-04-14 +label: + zh_Hans: gpt-4.1-nano-2025-04-14 + en_US: 
gpt-4.1-nano-2025-04-14 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 1047576 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 32768 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4.1-nano.yaml b/models/cometapi/models/llm/openai/gpt-4.1-nano.yaml new file mode 100644 index 000000000..8aaea8047 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4.1-nano.yaml @@ -0,0 +1,42 @@ +model: gpt-4.1-nano +label: + zh_Hans: gpt-4.1-nano + en_US: gpt-4.1-nano +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 1047576 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 32768 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4.1.yaml b/models/cometapi/models/llm/openai/gpt-4.1.yaml new file mode 100644 
index 000000000..7a9a2750f --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4.1.yaml @@ -0,0 +1,42 @@ +model: gpt-4.1 +label: + zh_Hans: gpt-4.1 + en_US: gpt-4.1 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 1047576 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 32768 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4.yaml b/models/cometapi/models/llm/openai/gpt-4.yaml new file mode 100644 index 000000000..6d841d22e --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4.yaml @@ -0,0 +1,47 @@ +model: gpt-4 +label: + zh_Hans: gpt-4 + en_US: gpt-4 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +model_properties: + mode: chat + context_size: 8192 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 8192 +- name: seed + label: + zh_Hans: 种子 + en_US: Seed + type: int + help: + zh_Hans: 如果指定,模型将尽最大努力进行确定性采样,使得重复的具有相同种子和参数的请求应该返回相同的结果。不能保证确定性,您应该参考 system_fingerprint 响应参数来监视变化。 + en_US: If specified, model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the 
same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend. + required: false +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/gpt-4o-2024-05-13.yaml b/models/cometapi/models/llm/openai/gpt-4o-2024-05-13.yaml new file mode 100644 index 000000000..fe94a633f --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4o-2024-05-13.yaml @@ -0,0 +1,39 @@ +model: gpt-4o-2024-05-13 +label: + zh_Hans: gpt-4o-2024-05-13 + en_US: gpt-4o-2024-05-13 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 2048 + min: 1 + max: 4096 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/gpt-4o-2024-08-06.yaml b/models/cometapi/models/llm/openai/gpt-4o-2024-08-06.yaml new file mode 100644 index 000000000..fc49b93d5 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4o-2024-08-06.yaml @@ -0,0 +1,42 @@ +model: gpt-4o-2024-08-06 +label: + zh_Hans: gpt-4o-2024-08-06 + en_US: gpt-4o-2024-08-06 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + 
use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 8192 + min: 1 + max: 16384 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4o-2024-11-20.yaml b/models/cometapi/models/llm/openai/gpt-4o-2024-11-20.yaml new file mode 100644 index 000000000..0a7552ada --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4o-2024-11-20.yaml @@ -0,0 +1,42 @@ +model: gpt-4o-2024-11-20 +label: + zh_Hans: gpt-4o-2024-11-20 + en_US: gpt-4o-2024-11-20 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 8192 + min: 1 + max: 16384 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4o-mini-2024-07-18.yaml b/models/cometapi/models/llm/openai/gpt-4o-mini-2024-07-18.yaml new file mode 100644 index 000000000..ad50b7851 --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4o-mini-2024-07-18.yaml @@ -0,0 +1,42 @@ +model: gpt-4o-mini-2024-07-18 
+label: + zh_Hans: gpt-4o-mini-2024-07-18 + en_US: gpt-4o-mini-2024-07-18 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 16384 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4o-mini.yaml b/models/cometapi/models/llm/openai/gpt-4o-mini.yaml new file mode 100644 index 000000000..95660975e --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4o-mini.yaml @@ -0,0 +1,42 @@ +model: gpt-4o-mini +label: + zh_Hans: gpt-4o-mini + en_US: gpt-4o-mini +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 512 + min: 1 + max: 16384 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/gpt-4o.yaml 
b/models/cometapi/models/llm/openai/gpt-4o.yaml new file mode 100644 index 000000000..6c37f90ef --- /dev/null +++ b/models/cometapi/models/llm/openai/gpt-4o.yaml @@ -0,0 +1,42 @@ +model: gpt-4o +label: + zh_Hans: gpt-4o + en_US: gpt-4o +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: temperature + use_template: temperature +- name: top_p + use_template: top_p +- name: presence_penalty + use_template: presence_penalty +- name: frequency_penalty + use_template: frequency_penalty +- name: max_tokens + use_template: max_tokens + default: 8192 + min: 1 + max: 16384 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/o1-mini-2024-09-12.yaml b/models/cometapi/models/llm/openai/o1-mini-2024-09-12.yaml new file mode 100644 index 000000000..abcb29c3d --- /dev/null +++ b/models/cometapi/models/llm/openai/o1-mini-2024-09-12.yaml @@ -0,0 +1,28 @@ +model: o1-mini-2024-09-12 +label: + zh_Hans: o1-mini-2024-09-12 + en_US: o1-mini-2024-09-12 +model_type: llm +features: +- agent-thought +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 65536 + min: 1 + max: 65536 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/o1-mini.yaml b/models/cometapi/models/llm/openai/o1-mini.yaml new file mode 100644 index 000000000..9c12f094e --- /dev/null +++ 
b/models/cometapi/models/llm/openai/o1-mini.yaml @@ -0,0 +1,28 @@ +model: o1-mini +label: + zh_Hans: o1-mini + en_US: o1-mini +model_type: llm +features: +- agent-thought +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 65536 + min: 1 + max: 65536 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/o1-preview-2024-09-12.yaml b/models/cometapi/models/llm/openai/o1-preview-2024-09-12.yaml new file mode 100644 index 000000000..020582430 --- /dev/null +++ b/models/cometapi/models/llm/openai/o1-preview-2024-09-12.yaml @@ -0,0 +1,28 @@ +model: o1-preview-2024-09-12 +label: + zh_Hans: o1-preview-2024-09-12 + en_US: o1-preview-2024-09-12 +model_type: llm +features: +- agent-thought +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 32768 + min: 1 + max: 32768 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/o1-preview.yaml b/models/cometapi/models/llm/openai/o1-preview.yaml new file mode 100644 index 000000000..aa1f0c115 --- /dev/null +++ b/models/cometapi/models/llm/openai/o1-preview.yaml @@ -0,0 +1,28 @@ +model: o1-preview +label: + zh_Hans: o1-preview + en_US: o1-preview +model_type: llm +features: +- agent-thought +model_properties: + mode: chat + context_size: 128000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 32768 + min: 1 + max: 32768 +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + 
help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/openai/o1.yaml b/models/cometapi/models/llm/openai/o1.yaml new file mode 100644 index 000000000..380082ca2 --- /dev/null +++ b/models/cometapi/models/llm/openai/o1.yaml @@ -0,0 +1,47 @@ +model: o1 +label: + zh_Hans: o1 + en_US: o1 +model_type: llm +features: +- multi-tool-call +- agent-thought +- stream-tool-call +- vision +model_properties: + mode: chat + context_size: 200000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 50000 + min: 1 + max: 50000 +- name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + options: + - low + - medium + - high +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/o3-2025-04-16.yaml b/models/cometapi/models/llm/openai/o3-2025-04-16.yaml new file mode 100644 index 000000000..03105e3cc --- /dev/null +++ b/models/cometapi/models/llm/openai/o3-2025-04-16.yaml @@ -0,0 +1,47 @@ +model: o3-2025-04-16 +label: + zh_Hans: o3-2025-04-16 + en_US: o3-2025-04-16 +model_type: llm +features: +- agent-thought +- tool-call +- vision +- stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 100000 + min: 1 + max: 100000 +- name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + 
options: + - low + - medium + - high +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/o3-mini-2025-01-31.yaml b/models/cometapi/models/llm/openai/o3-mini-2025-01-31.yaml new file mode 100644 index 000000000..cb2c8018a --- /dev/null +++ b/models/cometapi/models/llm/openai/o3-mini-2025-01-31.yaml @@ -0,0 +1,46 @@ +model: o3-mini-2025-01-31 +label: + zh_Hans: o3-mini-2025-01-31 + en_US: o3-mini-2025-01-31 +model_type: llm +features: +- agent-thought +- tool-call +- stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 100000 + min: 1 + max: 100000 +- name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + options: + - low + - medium + - high +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/o3-mini.yaml b/models/cometapi/models/llm/openai/o3-mini.yaml new file mode 100644 index 000000000..f4a673925 --- /dev/null +++ b/models/cometapi/models/llm/openai/o3-mini.yaml @@ -0,0 +1,46 @@ +model: o3-mini +label: + zh_Hans: o3-mini + en_US: o3-mini +model_type: llm +features: +- agent-thought +- tool-call +- stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 100000 + 
min: 1 + max: 100000 +- name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + options: + - low + - medium + - high +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/o3-pro-2025-06-10.yaml b/models/cometapi/models/llm/openai/o3-pro-2025-06-10.yaml new file mode 100644 index 000000000..d51db26d6 --- /dev/null +++ b/models/cometapi/models/llm/openai/o3-pro-2025-06-10.yaml @@ -0,0 +1,46 @@ +model: o3-pro-2025-06-10 +label: + zh_Hans: o3-pro-2025-06-10 + en_US: o3-pro-2025-06-10 +model_type: llm +features: +- agent-thought +- tool-call +- vision +model_properties: + mode: chat + context_size: 200000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 100000 + min: 1 + max: 100000 +- name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + options: + - low + - medium + - high +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/o3-pro.yaml b/models/cometapi/models/llm/openai/o3-pro.yaml new file mode 100644 index 000000000..9aa5c4b22 --- /dev/null +++ b/models/cometapi/models/llm/openai/o3-pro.yaml @@ -0,0 +1,46 @@ +model: o3-pro +label: + zh_Hans: o3-pro + en_US: o3-pro +model_type: 
llm +features: +- agent-thought +- tool-call +- vision +model_properties: + mode: chat + context_size: 200000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 100000 + min: 1 + max: 100000 +- name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + options: + - low + - medium + - high +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/o3.yaml b/models/cometapi/models/llm/openai/o3.yaml new file mode 100644 index 000000000..01fc1070a --- /dev/null +++ b/models/cometapi/models/llm/openai/o3.yaml @@ -0,0 +1,47 @@ +model: o3 +label: + zh_Hans: o3 + en_US: o3 +model_type: llm +features: +- agent-thought +- vision +- tool-call +- stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 100000 + min: 1 + max: 100000 +- name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + options: + - low + - medium + - high +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/o4-mini-2025-04-16.yaml b/models/cometapi/models/llm/openai/o4-mini-2025-04-16.yaml new file mode 100644 index 000000000..41085fbcc --- 
/dev/null +++ b/models/cometapi/models/llm/openai/o4-mini-2025-04-16.yaml @@ -0,0 +1,47 @@ +model: o4-mini-2025-04-16 +label: + zh_Hans: o4-mini-2025-04-16 + en_US: o4-mini-2025-04-16 +model_type: llm +features: +- agent-thought +- tool-call +- vision +- stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 100000 + min: 1 + max: 100000 +- name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + options: + - low + - medium + - high +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/openai/o4-mini.yaml b/models/cometapi/models/llm/openai/o4-mini.yaml new file mode 100644 index 000000000..789ee187c --- /dev/null +++ b/models/cometapi/models/llm/openai/o4-mini.yaml @@ -0,0 +1,47 @@ +model: o4-mini +label: + zh_Hans: o4-mini + en_US: o4-mini +model_type: llm +features: +- agent-thought +- tool-call +- vision +- stream-tool-call +model_properties: + mode: chat + context_size: 200000 +parameter_rules: +- name: max_tokens + use_template: max_tokens + default: 100000 + min: 1 + max: 100000 +- name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + options: + - low + - medium + - high +- name: response_format + label: + zh_Hans: 回复格式 + en_US: response_format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object + - 
json_schema +- name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/qwen/qwen2-72b-instruct.yaml b/models/cometapi/models/llm/qwen/qwen2-72b-instruct.yaml new file mode 100755 index 000000000..8daf73871 --- /dev/null +++ b/models/cometapi/models/llm/qwen/qwen2-72b-instruct.yaml @@ -0,0 +1,36 @@ +model: qwen2-72b-instruct +label: + en_US: qwen2-72b-instruct +model_type: llm +features: + - tool-call +model_properties: + mode: chat + context_size: 32768 +parameter_rules: + - name: temperature + use_template: temperature + - name: max_tokens + use_template: max_tokens + type: int + default: 512 + min: 1 + max: 4096 + help: + zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。 + en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter. + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: frequency_penalty + use_template: frequency_penalty +metadata: + updated: "2025-08-06" diff --git a/models/cometapi/models/llm/qwen/qwen2.5-72b-instruct.yaml b/models/cometapi/models/llm/qwen/qwen2.5-72b-instruct.yaml new file mode 100755 index 000000000..5799a27c4 --- /dev/null +++ b/models/cometapi/models/llm/qwen/qwen2.5-72b-instruct.yaml @@ -0,0 +1,34 @@ +model: qwen2.5-72b-instruct +label: + en_US: qwen2.5-72b-instruct +model_type: llm +features: + - tool-call +model_properties: + mode: chat + context_size: 131072 +parameter_rules: + - name: temperature + use_template: temperature + - name: max_tokens + use_template: max_tokens + type: int + default: 512 + min: 1 + max: 8192 + help: + zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。 + en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter. 
+ - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: frequency_penalty + use_template: frequency_penalty diff --git a/models/cometapi/models/llm/qwen/qwen3-30b-a3b.yaml b/models/cometapi/models/llm/qwen/qwen3-30b-a3b.yaml new file mode 100755 index 000000000..4c3a708b0 --- /dev/null +++ b/models/cometapi/models/llm/qwen/qwen3-30b-a3b.yaml @@ -0,0 +1,34 @@ +model: qwen3-30b-a3b +label: + en_US: qwen3-30b-a3b +model_type: llm +features: + - tool-call +model_properties: + mode: chat + context_size: 131072 +parameter_rules: + - name: temperature + use_template: temperature + - name: max_tokens + use_template: max_tokens + type: int + default: 512 + min: 1 + max: 8192 + help: + zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。 + en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter. + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. 
+ required: false + - name: frequency_penalty + use_template: frequency_penalty diff --git a/models/cometapi/models/llm/qwen/qwen3-coder-480b-a35b-instruct.yaml b/models/cometapi/models/llm/qwen/qwen3-coder-480b-a35b-instruct.yaml new file mode 100755 index 000000000..1f0bc5363 --- /dev/null +++ b/models/cometapi/models/llm/qwen/qwen3-coder-480b-a35b-instruct.yaml @@ -0,0 +1,34 @@ +model: qwen3-coder-480b-a35b-instruct +label: + en_US: qwen3-coder-480b-a35b-instruct +model_type: llm +features: + - tool-call +model_properties: + mode: chat + context_size: 262144 +parameter_rules: + - name: temperature + use_template: temperature + - name: max_tokens + use_template: max_tokens + type: int + default: 512 + min: 1 + max: 8192 + help: + zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。 + en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter. + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: frequency_penalty + use_template: frequency_penalty diff --git a/models/cometapi/models/llm/qwen/qwen3-coder.yaml b/models/cometapi/models/llm/qwen/qwen3-coder.yaml new file mode 100755 index 000000000..b40842dee --- /dev/null +++ b/models/cometapi/models/llm/qwen/qwen3-coder.yaml @@ -0,0 +1,34 @@ +model: qwen3-coder +label: + en_US: qwen3-coder +model_type: llm +features: + - tool-call +model_properties: + mode: chat + context_size: 262144 +parameter_rules: + - name: temperature + use_template: temperature + - name: max_tokens + use_template: max_tokens + type: int + default: 512 + min: 1 + max: 8192 + help: + zh_Hans: 指定生成结果长度的上限。如果生成结果截断,可以调大该参数。 + en_US: Specifies the upper limit on the length of generated results. If the generated results are truncated, you can increase this parameter. 
+ - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the top K options for each subsequent token. + required: false + - name: frequency_penalty + use_template: frequency_penalty diff --git a/models/cometapi/models/llm/x-ai/grok-2-1212.yaml b/models/cometapi/models/llm/x-ai/grok-2-1212.yaml new file mode 100644 index 000000000..ad1cb1734 --- /dev/null +++ b/models/cometapi/models/llm/x-ai/grok-2-1212.yaml @@ -0,0 +1,41 @@ +model: grok-2-1212 +label: + en_US: grok-2-1212 +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 131072 +parameter_rules: + - name: temperature + use_template: temperature + default: 1.0 + min: 0.0 + max: 2.0 + - name: top_p + use_template: top_p + - name: presence_penalty + use_template: presence_penalty + min: -2.0 + max: 2.0 + - name: frequency_penalty + use_template: frequency_penalty + min: -2.0 + max: 2.0 + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - json_object + - json_schema + - name: json_schema + use_template: json_schema diff --git a/models/cometapi/models/llm/x-ai/grok-3-beta.yaml b/models/cometapi/models/llm/x-ai/grok-3-beta.yaml new file mode 100644 index 000000000..569db2ceb --- /dev/null +++ b/models/cometapi/models/llm/x-ai/grok-3-beta.yaml @@ -0,0 +1,40 @@ +model: grok-3-beta +label: + en_US: grok-3-beta +model_type: llm +features: + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 131072 +parameter_rules: + - name: temperature + use_template: temperature + - name: top_p + use_template: top_p + - name: top_k + label: + zh_Hans: 取样数量 + en_US: Top k + type: int + help: + zh_Hans: 仅从每个后续标记的前 K 个选项中采样。 + en_US: Only sample from the 
top K options for each subsequent token. + required: false + - name: presence_penalty + use_template: presence_penalty + - name: frequency_penalty + use_template: frequency_penalty + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - text + - json_object diff --git a/models/cometapi/models/llm/x-ai/grok-3-fast.yaml b/models/cometapi/models/llm/x-ai/grok-3-fast.yaml new file mode 100644 index 000000000..bf69c71f4 --- /dev/null +++ b/models/cometapi/models/llm/x-ai/grok-3-fast.yaml @@ -0,0 +1,47 @@ +model: grok-3-fast +label: + en_US: grok-3-fast +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 131072 +parameter_rules: + - name: temperature + use_template: temperature + default: 1.0 + min: 0.0 + max: 2.0 + - name: top_p + use_template: top_p + - name: frequency_penalty + use_template: frequency_penalty + min: -2.0 + max: 2.0 + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - json_object + - json_schema + - name: json_schema + use_template: json_schema + - name: search_parameters + label: + zh_Hans: 联网搜索参数 + en_US: Live Search Parameters + type: text + default: "{\n \"mode\": \"auto\"\n}" + help: + zh_Hans: 传递给联网搜索的参数,具体参数见 https://docs.x.ai/docs/api-reference#chat-completions + en_US: Parameters to pass to the live search, see https://docs.x.ai/docs/api-reference#chat-completions + required: false diff --git a/models/cometapi/models/llm/x-ai/grok-3-mini-fast.yaml b/models/cometapi/models/llm/x-ai/grok-3-mini-fast.yaml new file mode 100644 index 000000000..0498ba25c --- /dev/null +++ b/models/cometapi/models/llm/x-ai/grok-3-mini-fast.yaml @@ -0,0 +1,55 @@ +model: 
grok-3-mini-fast +label: + en_US: grok-3-mini-fast +model_type: llm +features: + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 131072 +parameter_rules: + - name: temperature + use_template: temperature + default: 1.0 + min: 0.0 + max: 2.0 + - name: top_p + use_template: top_p + - name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: false + options: + - low + - high + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - json_object + - json_schema + - name: json_schema + use_template: json_schema + - name: search_parameters + label: + zh_Hans: 联网搜索参数 + en_US: Live Search Parameters + type: text + default: "{\n \"mode\": \"auto\"\n}" + help: + zh_Hans: 传递给联网搜索的参数,具体参数见 https://docs.x.ai/docs/api-reference#chat-completions + en_US: Parameters to pass to the live search, see https://docs.x.ai/docs/api-reference#chat-completions + required: false diff --git a/models/cometapi/models/llm/x-ai/grok-3-mini.yaml b/models/cometapi/models/llm/x-ai/grok-3-mini.yaml new file mode 100644 index 000000000..97310b4aa --- /dev/null +++ b/models/cometapi/models/llm/x-ai/grok-3-mini.yaml @@ -0,0 +1,56 @@ +model: grok-3-mini +label: + en_US: grok-3-mini +model_type: llm +features: + - agent-thought + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 131072 +parameter_rules: + - name: temperature + use_template: temperature + default: 1.0 + min: 0.0 + max: 2.0 + - name: top_p + use_template: top_p + - name: reasoning_effort + label: + zh_Hans: 推理工作 + en_US: reasoning_effort + type: string + help: + zh_Hans: 限制推理模型的推理工作 + en_US: constrains effort on reasoning for reasoning models + required: 
false + options: + - low + - high + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - json_object + - json_schema + - name: json_schema + use_template: json_schema + - name: search_parameters + label: + zh_Hans: 联网搜索参数 + en_US: Live Search Parameters + type: text + default: "{\n \"mode\": \"auto\"\n}" + help: + zh_Hans: 传递给联网搜索的参数,具体参数见 https://docs.x.ai/docs/api-reference#chat-completions + en_US: Parameters to pass to the live search, see https://docs.x.ai/docs/api-reference#chat-completions + required: false diff --git a/models/cometapi/models/llm/x-ai/grok-4-0709.yaml b/models/cometapi/models/llm/x-ai/grok-4-0709.yaml new file mode 100644 index 000000000..92ec0646f --- /dev/null +++ b/models/cometapi/models/llm/x-ai/grok-4-0709.yaml @@ -0,0 +1,44 @@ +model: grok-4-0709 +label: + en_US: grok-4-0709 +model_type: llm +features: + - agent-thought + - tool-call + - multi-tool-call + - stream-tool-call +model_properties: + mode: chat + context_size: 256000 +parameter_rules: + - name: temperature + use_template: temperature + default: 1.0 + min: 0.0 + max: 2.0 + - name: top_p + use_template: top_p + - name: response_format + label: + zh_Hans: 回复格式 + en_US: Response Format + type: string + help: + zh_Hans: 指定模型必须输出的格式 + en_US: specifying the format that the model must output + required: false + options: + - json_object + - json_schema + - name: json_schema + use_template: json_schema + - name: search_parameters + label: + zh_Hans: 联网搜索参数 + en_US: Live Search Parameters + type: text + default: "{\n \"mode\": \"auto\"\n}" + help: + zh_Hans: 传递给联网搜索的参数,具体参数见 https://docs.x.ai/docs/api-reference#chat-completions + en_US: Parameters to pass to the live search, see https://docs.x.ai/docs/api-reference#chat-completions + required: false diff --git a/models/cometapi/models/speech2text/_position.yaml 
b/models/cometapi/models/speech2text/_position.yaml new file mode 100644 index 000000000..69a0aeb5c --- /dev/null +++ b/models/cometapi/models/speech2text/_position.yaml @@ -0,0 +1,4 @@ +# unknown models (3) +- gpt-4o-mini-transcribe +- gpt-4o-transcribe +- whisper-1 diff --git a/models/cometapi/models/speech2text/gpt-4o-mini-transcribe.yaml b/models/cometapi/models/speech2text/gpt-4o-mini-transcribe.yaml new file mode 100644 index 000000000..f432019e4 --- /dev/null +++ b/models/cometapi/models/speech2text/gpt-4o-mini-transcribe.yaml @@ -0,0 +1,5 @@ +model: gpt-4o-mini-transcribe +model_type: speech2text +model_properties: + file_upload_limit: 25 + supported_file_extensions: flac,mp3,mp4,mpeg,mpga,m4a,ogg,wav,webm \ No newline at end of file diff --git a/models/cometapi/models/speech2text/gpt-4o-transcribe.yaml b/models/cometapi/models/speech2text/gpt-4o-transcribe.yaml new file mode 100644 index 000000000..ed2216696 --- /dev/null +++ b/models/cometapi/models/speech2text/gpt-4o-transcribe.yaml @@ -0,0 +1,5 @@ +model: gpt-4o-transcribe +model_type: speech2text +model_properties: + file_upload_limit: 25 + supported_file_extensions: flac,mp3,mp4,mpeg,mpga,m4a,ogg,wav,webm \ No newline at end of file diff --git a/models/cometapi/models/speech2text/speech2text.py b/models/cometapi/models/speech2text/speech2text.py new file mode 100644 index 000000000..662a8726c --- /dev/null +++ b/models/cometapi/models/speech2text/speech2text.py @@ -0,0 +1,61 @@ +from typing import IO, Optional + +from openai import OpenAI + +from dify_plugin import Speech2TextModel +from dify_plugin.errors.model import CredentialsValidateFailedError +from ..common_openai import _CommonOpenAI + +class OpenAISpeech2TextModel(_CommonOpenAI, Speech2TextModel): + """ + Model class for OpenAI Speech to text model. 
+ """ + + def _invoke(self, model: str, credentials: dict, + file: IO[bytes], user: Optional[str] = None) \ + -> str: + """ + Invoke speech2text model + + :param model: model name + :param credentials: model credentials + :param file: audio file + :param user: unique user id + :return: text for given audio file + """ + return self._speech2text_invoke(model, credentials, file) + + def validate_credentials(self, model: str, credentials: dict) -> None: + """ + Validate model credentials + + :param model: model name + :param credentials: model credentials + :return: + """ + try: + audio_file_path = self._get_demo_file_path() + + with open(audio_file_path, 'rb') as audio_file: + self._speech2text_invoke(model, credentials, audio_file) + except Exception as ex: + raise CredentialsValidateFailedError(str(ex)) + + def _speech2text_invoke(self, model: str, credentials: dict, file: IO[bytes]) -> str: + """ + Invoke speech2text model + + :param model: model name + :param credentials: model credentials + :param file: audio file + :return: text for given audio file + """ + # transform credentials to kwargs for model instance + credentials_kwargs = self._to_credential_kwargs(credentials) + + # init model client + client = OpenAI(**credentials_kwargs) + + response = client.audio.transcriptions.create(model=model, file=file) + + return response.text diff --git a/models/cometapi/models/speech2text/whisper-1.yaml b/models/cometapi/models/speech2text/whisper-1.yaml new file mode 100644 index 000000000..6c14c7661 --- /dev/null +++ b/models/cometapi/models/speech2text/whisper-1.yaml @@ -0,0 +1,5 @@ +model: whisper-1 +model_type: speech2text +model_properties: + file_upload_limit: 25 + supported_file_extensions: flac,mp3,mp4,mpeg,mpga,m4a,ogg,wav,webm diff --git a/models/cometapi/models/text_embedding/_position.yaml b/models/cometapi/models/text_embedding/_position.yaml new file mode 100644 index 000000000..74610bb34 --- /dev/null +++ 
b/models/cometapi/models/text_embedding/_position.yaml @@ -0,0 +1,4 @@ +# unknown models (3) +- text-embedding-3-large +- text-embedding-3-small +- text-embedding-ada-002 diff --git a/models/cometapi/models/text_embedding/text-embedding-3-large.yaml b/models/cometapi/models/text_embedding/text-embedding-3-large.yaml new file mode 100644 index 000000000..9489170fd --- /dev/null +++ b/models/cometapi/models/text_embedding/text-embedding-3-large.yaml @@ -0,0 +1,9 @@ +model: text-embedding-3-large +model_type: text-embedding +model_properties: + context_size: 8191 + max_chunks: 32 +pricing: + input: '0.00013' + unit: '0.001' + currency: USD diff --git a/models/cometapi/models/text_embedding/text-embedding-3-small.yaml b/models/cometapi/models/text_embedding/text-embedding-3-small.yaml new file mode 100644 index 000000000..586ba2b28 --- /dev/null +++ b/models/cometapi/models/text_embedding/text-embedding-3-small.yaml @@ -0,0 +1,9 @@ +model: text-embedding-3-small +model_type: text-embedding +model_properties: + context_size: 8191 + max_chunks: 32 +pricing: + input: '0.00002' + unit: '0.001' + currency: USD diff --git a/models/cometapi/models/text_embedding/text-embedding-ada-002.yaml b/models/cometapi/models/text_embedding/text-embedding-ada-002.yaml new file mode 100644 index 000000000..ef1c49b01 --- /dev/null +++ b/models/cometapi/models/text_embedding/text-embedding-ada-002.yaml @@ -0,0 +1,9 @@ +model: text-embedding-ada-002 +model_type: text-embedding +model_properties: + context_size: 8097 + max_chunks: 32 +pricing: + input: '0.0001' + unit: '0.001' + currency: USD diff --git a/models/cometapi/models/text_embedding/text_embedding.py b/models/cometapi/models/text_embedding/text_embedding.py new file mode 100644 index 000000000..0b91c55e3 --- /dev/null +++ b/models/cometapi/models/text_embedding/text_embedding.py @@ -0,0 +1,236 @@ +import base64 +import time +from typing import Optional, Union + +import numpy as np +import tiktoken +from dify_plugin import 
TextEmbeddingModel +from dify_plugin.entities.model import EmbeddingInputType, PriceType +from dify_plugin.entities.model.text_embedding import ( + EmbeddingUsage, + TextEmbeddingResult, +) +from dify_plugin.errors.model import CredentialsValidateFailedError +from openai import OpenAI + +from ..common_openai import _CommonOpenAI + + +class OpenAITextEmbeddingModel(_CommonOpenAI, TextEmbeddingModel): + """ + Model class for OpenAI text embedding model. + """ + + def _invoke( + self, + model: str, + credentials: dict, + texts: list[str], + user: Optional[str] = None, + input_type: EmbeddingInputType = EmbeddingInputType.DOCUMENT, + ) -> TextEmbeddingResult: + """ + Invoke text embedding model + + :param model: model name + :param credentials: model credentials + :param texts: texts to embed + :param user: unique user id + :return: embeddings result + """ + # transform credentials to kwargs for model instance + credentials_kwargs = self._to_credential_kwargs(credentials) + # init model client + client = OpenAI(**credentials_kwargs) + + extra_model_kwargs = {} + if user: + extra_model_kwargs["user"] = user + + extra_model_kwargs["encoding_format"] = "base64" + + # get model properties + context_size = self._get_context_size(model, credentials) + max_chunks = self._get_max_chunks(model, credentials) + + embeddings: list[list[float]] = [[] for _ in range(len(texts))] + tokens = [] + indices = [] + used_tokens = 0 + + try: + enc = tiktoken.encoding_for_model(model) + except KeyError: + enc = tiktoken.get_encoding("cl100k_base") + + for i, text in enumerate(texts): + token = enc.encode(text) + for j in range(0, len(token), context_size): + tokens += [token[j : j + context_size]] + indices += [i] + + batched_embeddings = [] + _iter = range(0, len(tokens), max_chunks) + + for i in _iter: + # call embedding model + embeddings_batch, embedding_used_tokens = self._embedding_invoke( + model=model, + client=client, + texts=tokens[i : i + max_chunks], + 
extra_model_kwargs=extra_model_kwargs, + ) + + used_tokens += embedding_used_tokens + batched_embeddings += embeddings_batch + + results: list[list[list[float]]] = [[] for _ in range(len(texts))] + num_tokens_in_batch: list[list[int]] = [[] for _ in range(len(texts))] + for i in range(len(indices)): + results[indices[i]].append(batched_embeddings[i]) + num_tokens_in_batch[indices[i]].append(len(tokens[i])) + + for i in range(len(texts)): + _result = results[i] + if len(_result) == 0: + embeddings_batch, embedding_used_tokens = self._embedding_invoke( + model=model, + client=client, + texts="", + extra_model_kwargs=extra_model_kwargs, + ) + + used_tokens += embedding_used_tokens + average = embeddings_batch[0] + else: + average = np.average(_result, axis=0, weights=num_tokens_in_batch[i]) + embedding = (average / np.linalg.norm(average)).tolist() # type: ignore + if np.isnan(embedding).any(): + raise ValueError("Normalized embedding is nan please try again") + embeddings[i] = embedding + + # calc usage + usage = self._calc_response_usage( + model=model, credentials=credentials, tokens=used_tokens + ) + + return TextEmbeddingResult(embeddings=embeddings, usage=usage, model=model) + + def get_num_tokens( + self, model: str, credentials: dict, texts: list[str] + ) -> list[int]: + """ + Get number of tokens for given prompt messages + + :param model: model name + :param credentials: model credentials + :param texts: texts to embed + :return: + """ + if len(texts) == 0: + return [] + + try: + enc = tiktoken.encoding_for_model(model) + except KeyError: + enc = tiktoken.get_encoding("cl100k_base") + + total_num_tokens = [] + for text in texts: + # calculate the number of tokens in the encoded text + tokenized_text = enc.encode(text) + total_num_tokens.append(len(tokenized_text)) + + return total_num_tokens + + def validate_credentials(self, model: str, credentials: dict) -> None: + """ + Validate model credentials + + :param model: model name + :param credentials: model 
credentials + :return: + """ + try: + # transform credentials to kwargs for model instance + credentials_kwargs = self._to_credential_kwargs(credentials) + client = OpenAI(**credentials_kwargs) + + # call embedding model + self._embedding_invoke( + model=model, client=client, texts=["ping"], extra_model_kwargs={} + ) + except Exception as ex: + raise CredentialsValidateFailedError(str(ex)) + + def _embedding_invoke( + self, + model: str, + client: OpenAI, + texts: Union[list[str], str], + extra_model_kwargs: dict, + ) -> tuple[list[list[float]], int]: + """ + Invoke embedding model + + :param model: model name + :param client: model client + :param texts: texts to embed + :param extra_model_kwargs: extra model kwargs + :return: embeddings and used tokens + """ + # call embedding model + response = client.embeddings.create( + input=texts, + model=model, + **extra_model_kwargs, + ) + + if ( + "encoding_format" in extra_model_kwargs + and extra_model_kwargs["encoding_format"] == "base64" + ): + # decode base64 embedding + return ( + [ + list( + np.frombuffer(base64.b64decode(data.embedding), dtype="float32") + ) + for data in response.data + ], # type: ignore + response.usage.total_tokens, + ) + + return [data.embedding for data in response.data], response.usage.total_tokens + + def _calc_response_usage( + self, model: str, credentials: dict, tokens: int + ) -> EmbeddingUsage: + """ + Calculate response usage + + :param model: model name + :param credentials: model credentials + :param tokens: input tokens + :return: usage + """ + # get input price info + input_price_info = self.get_price( + model=model, + credentials=credentials, + price_type=PriceType.INPUT, + tokens=tokens, + ) + + # transform usage + usage = EmbeddingUsage( + tokens=tokens, + total_tokens=tokens, + unit_price=input_price_info.unit_price, + price_unit=input_price_info.unit, + total_price=input_price_info.total_amount, + currency=input_price_info.currency, + latency=time.perf_counter() - 
self.started_at, + ) + + return usage diff --git a/models/cometapi/models/tts/_position.yaml b/models/cometapi/models/tts/_position.yaml new file mode 100644 index 000000000..eb8a24c04 --- /dev/null +++ b/models/cometapi/models/tts/_position.yaml @@ -0,0 +1,4 @@ +# unknown models (3) +- gpt-4o-mini-tts +- tts-1 +- tts-1-hd diff --git a/models/cometapi/models/tts/gpt-4o-mini-tts.yaml b/models/cometapi/models/tts/gpt-4o-mini-tts.yaml new file mode 100644 index 000000000..9caa26d53 --- /dev/null +++ b/models/cometapi/models/tts/gpt-4o-mini-tts.yaml @@ -0,0 +1,46 @@ +model: gpt-4o-mini-tts +model_type: tts +model_properties: + default_voice: 'alloy' + voices: + - mode: 'alloy' + name: 'Alloy' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'ash' + name: 'Ash' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'ballad' + name: 'Ballad' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'coral' + name: 'Coral' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'echo' + name: 'Echo' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'fable' + name: 'Fable' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'onyx' + name: 'Onyx' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'nova' + name: 'Nova' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'sage' + name: 'Sage' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'shimmer' + name: 'Shimmer' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'verse' + name: 'Verse' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + 
word_limit: 3500 + audio_type: 'mp3' + max_workers: 5 +pricing: + input: '0.012' + output: '0' + unit: '0.001' + currency: USD diff --git a/models/cometapi/models/tts/tts-1-hd.yaml b/models/cometapi/models/tts/tts-1-hd.yaml new file mode 100644 index 000000000..cfdc0835e --- /dev/null +++ b/models/cometapi/models/tts/tts-1-hd.yaml @@ -0,0 +1,46 @@ +model: tts-1-hd +model_type: tts +model_properties: + default_voice: 'alloy' + voices: + - mode: 'alloy' + name: 'Alloy' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'ash' + name: 'Ash' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'ballad' + name: 'Ballad' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'coral' + name: 'Coral' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'echo' + name: 'Echo' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'fable' + name: 'Fable' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'onyx' + name: 'Onyx' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'nova' + name: 'Nova' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'sage' + name: 'Sage' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'shimmer' + name: 'Shimmer' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'verse' + name: 'Verse' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + word_limit: 3500 + audio_type: 'mp3' + max_workers: 5 +pricing: + input: '0.03' + output: '0' + unit: '0.001' + currency: USD diff --git a/models/cometapi/models/tts/tts-1.yaml b/models/cometapi/models/tts/tts-1.yaml new file mode 
100644 index 000000000..b2904df27 --- /dev/null +++ b/models/cometapi/models/tts/tts-1.yaml @@ -0,0 +1,46 @@ +model: tts-1 +model_type: tts +model_properties: + default_voice: 'alloy' + voices: + - mode: 'alloy' + name: 'Alloy' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'ash' + name: 'Ash' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'ballad' + name: 'Ballad' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'coral' + name: 'Coral' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'echo' + name: 'Echo' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'fable' + name: 'Fable' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'onyx' + name: 'Onyx' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'nova' + name: 'Nova' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'sage' + name: 'Sage' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'shimmer' + name: 'Shimmer' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + - mode: 'verse' + name: 'Verse' + language: ['zh-Hans', 'en-US', 'de-DE', 'fr-FR', 'es-ES', 'it-IT', 'th-TH', 'id-ID'] + word_limit: 3500 + audio_type: 'mp3' + max_workers: 5 +pricing: + input: '0.015' + output: '0' + unit: '0.001' + currency: USD diff --git a/models/cometapi/models/tts/tts.py b/models/cometapi/models/tts/tts.py new file mode 100644 index 000000000..c706d67e8 --- /dev/null +++ b/models/cometapi/models/tts/tts.py @@ -0,0 +1,212 @@ +from collections.abc import Generator +import concurrent.futures +from functools import reduce +from io import BytesIO +from typing import Optional 
from openai import OpenAI
from pydub import AudioSegment

from dify_plugin import TTSModel
from dify_plugin.errors.model import (
    CredentialsValidateFailedError,
    InvokeBadRequestError,
)

from ..common_openai import _CommonOpenAI


class OpenAIText2SpeechModel(_CommonOpenAI, TTSModel):
    """
    Model class for the OpenAI-compatible CometAPI text-to-speech endpoint.
    """

    def _invoke(
        self,
        model: str,
        tenant_id: str,
        credentials: dict,
        content_text: str,
        voice: str,
        user: Optional[str] = None,
    ) -> bytes | Generator[bytes, None, None]:
        """
        Invoke the text-to-speech model.

        :param model: model name
        :param tenant_id: user tenant id (unused here; kept for interface compatibility)
        :param credentials: model credentials
        :param content_text: text content to be synthesized
        :param voice: model timbre
        :param user: unique user id
        :return: synthesized audio, streamed as mp3 byte chunks
        """
        voices = self.get_tts_model_voices(model=model, credentials=credentials)
        if not voices:
            raise InvokeBadRequestError("No voices found for the model")

        # `voices` is a list of {"name": ..., "value": ...} dicts; fall back to
        # the model's default voice when the requested one is unknown.
        if not voice or voice not in [d["value"] for d in voices]:
            voice = self._get_model_default_voice(model, credentials)

        return self._tts_invoke_streaming(
            model=model, credentials=credentials, content_text=content_text, voice=voice
        )

    def validate_credentials(
        self, model: str, credentials: dict, user: Optional[str] = None
    ) -> None:
        """
        Validate credentials by synthesizing a short probe sentence with the
        model's default voice.

        :param model: model name
        :param credentials: model credentials
        :param user: unique user id
        :raises CredentialsValidateFailedError: when synthesis fails
        """
        try:
            self._tts_invoke(
                model=model,
                credentials=credentials,
                content_text="Hello Dify!",
                voice=self._get_model_default_voice(model, credentials),
            )
        except Exception as ex:
            raise CredentialsValidateFailedError(str(ex)) from ex

    def _tts_invoke(
        self, model: str, credentials: dict, content_text: str, voice: str
    ) -> bytes:
        """
        Synthesize *content_text* into a single audio payload.

        Long text is split into sentences, each sentence is synthesized
        concurrently, and the resulting clips are concatenated with pydub.

        :param model: model name
        :param credentials: model credentials
        :param content_text: text content to be synthesized
        :param voice: model timbre
        :return: combined audio bytes in the model's configured audio type
        """
        audio_type = self._get_model_audio_type(model, credentials)
        word_limit = self._get_model_word_limit(model, credentials) or 500
        max_workers = self._get_model_workers_limit(model, credentials)
        try:
            sentences = list(
                self._split_text_into_sentences(
                    org_text=content_text, max_length=word_limit
                )
            )
            audio_bytes_list = []

            # Synthesize every sentence concurrently while preserving order.
            with concurrent.futures.ThreadPoolExecutor(
                max_workers=max_workers
            ) as executor:
                futures = [
                    executor.submit(
                        self._process_sentence,
                        sentence=sentence,
                        model=model,
                        voice=voice,
                        credentials=credentials,
                    )
                    for sentence in sentences
                ]
                for future in futures:
                    try:
                        # Fetch the result once (the original called
                        # future.result() twice per iteration).
                        audio_bytes = future.result()
                        if audio_bytes:
                            audio_bytes_list.append(audio_bytes)
                    except Exception as ex:
                        raise InvokeBadRequestError(str(ex))

            if not audio_bytes_list:
                raise InvokeBadRequestError("No audio bytes found")

            audio_segments = [
                AudioSegment.from_file(BytesIO(audio_bytes), format=audio_type)
                for audio_bytes in audio_bytes_list
                if audio_bytes
            ]
            combined_segment = reduce(lambda x, y: x + y, audio_segments)
            buffer: BytesIO = BytesIO()
            combined_segment.export(buffer, format=audio_type)
            buffer.seek(0)
            return buffer.read()
        except Exception as ex:
            raise InvokeBadRequestError(str(ex))

    def _tts_invoke_streaming(
        self, model: str, credentials: dict, content_text: str, voice: str
    ) -> Generator[bytes, None, None]:
        """
        Stream synthesized audio as mp3 chunks.

        Text longer than the model's word limit is split into sentences which
        are synthesized concurrently and streamed back in order.

        :param model: model name
        :param credentials: model credentials
        :param content_text: text content to be synthesized
        :param voice: model timbre
        :return: generator of mp3 byte chunks
        """
        try:
            # doc: https://platform.openai.com/docs/guides/text-to-speech
            credentials_kwargs = self._to_credential_kwargs(credentials)
            client = OpenAI(**credentials_kwargs)

            voices = self.get_tts_model_voices(model=model, credentials=credentials)
            if not voices:
                raise InvokeBadRequestError("No voices found for the model")

            # BUG FIX: `voices` is a list of dicts, so the original membership
            # test `voice not in voices` was always true and silently replaced
            # every requested voice with the default one.
            if not voice or voice not in {d["value"] for d in voices}:
                voice = self._get_model_default_voice(model, credentials)

            word_limit = self._get_model_word_limit(model, credentials) or 500
            if len(content_text) > word_limit:
                sentences = list(
                    self._split_text_into_sentences(
                        content_text, max_length=word_limit
                    )
                )
                # `with` guarantees the executor is shut down (the original
                # leaked it), and max(1, ...) avoids max_workers == 0.
                with concurrent.futures.ThreadPoolExecutor(
                    max_workers=max(1, min(3, len(sentences)))
                ) as executor:
                    futures = [
                        executor.submit(
                            client.audio.speech.with_streaming_response.create,
                            model=model,
                            response_format="mp3",
                            input=sentence,
                            voice=voice,  # type: ignore
                        )
                        for sentence in sentences
                    ]
                    for future in futures:
                        # Use the streaming response as a context manager so
                        # the connection is closed (the original called
                        # __enter__() and never exited).
                        with future.result() as streamed_response:
                            yield from streamed_response.iter_bytes(1024)
            else:
                with client.audio.speech.with_streaming_response.create(
                    model=model,
                    voice=voice,  # type: ignore
                    response_format="mp3",
                    input=content_text.strip(),
                ) as streamed_response:
                    yield from streamed_response.iter_bytes(1024)
        except Exception as ex:
            raise InvokeBadRequestError(str(ex))

    def _process_sentence(self, sentence: str, model: str, voice, credentials: dict):
        """
        Synthesize a single sentence via the (non-streaming) speech API.

        :param sentence: text content to be synthesized
        :param model: model name
        :param voice: model timbre
        :param credentials: model credentials
        :return: raw audio bytes, or None when the API returned nothing
        """
        credentials_kwargs = self._to_credential_kwargs(credentials)
        client = OpenAI(**credentials_kwargs)
        response = client.audio.speech.create(
            model=model, voice=voice, input=sentence.strip()
        )
        # BUG FIX: the response body can only be consumed once; the original
        # called response.read() twice and returned the second (empty) read.
        audio_bytes = response.read()
        if isinstance(audio_bytes, bytes):
            return audio_bytes
import logging

from dify_plugin import ModelProvider
from dify_plugin.entities.model import ModelType
from dify_plugin.errors.model import CredentialsValidateFailedError

logger = logging.getLogger(__name__)


class CometapiProvider(ModelProvider):
    """Model provider for CometAPI, an OpenAI-compatible multi-model gateway."""

    def validate_provider_credentials(self, credentials: dict) -> None:
        """
        Validate provider credentials; raise an exception on failure.

        :param credentials: provider credentials, matching the form defined in
            ``provider_credential_schema``.
        :raises CredentialsValidateFailedError: when the API key is rejected
        """
        try:
            model_instance = self.get_model_instance(ModelType.LLM)

            # `gpt-4.1-nano` is used purely as a cheap probe model to confirm
            # the API key works, regardless of which model the user configures.
            model_instance.validate_credentials(
                model="gpt-4.1-nano", credentials=credentials
            )
        except CredentialsValidateFailedError:
            # Re-raise as-is: callers specifically catch this type.
            raise
        except Exception:
            # Lazy %-style args so the message is only formatted when logged;
            # bare `raise` preserves the original traceback.
            logger.exception(
                "%s credentials validate failed",
                self.get_provider_schema().provider,
            )
            raise
+ icon: cometapi_small.png +background: "#1E40AF" +configurate_methods: + - predefined-model + - customizable-model +extra: + python: + source: provider/cometapi.py + provider_source: provider/cometapi.py + model_sources: + - "models/llm/llm.py" + - "models/text_embedding/text_embedding.py" + - "models/speech2text/speech2text.py" + - "models/tts/tts.py" +help: + title: + en_US: Get your API key from CometAPI Console + zh_Hans: 从 CometAPI 控制台获取 API Key + ja_JP: CometAPI コンソールから API キーを取得 + pt_BR: Obtenha sua chave de API do Console CometAPI + url: + en_US: https://api.cometapi.com/console/token + zh_Hans: https://api.cometapi.com/console/token + ja_JP: https://api.cometapi.com/console/token + pt_BR: https://api.cometapi.com/console/token +supported_model_types: + - llm + - text-embedding + - speech2text + - tts +icon_large: + en_US: cometapi_large.png + zh_Hans: cometapi_large.png + ja_JP: cometapi_large.png + pt_BR: cometapi_large.png +icon_small: + en_US: cometapi_small.png + zh_Hans: cometapi_small.png + ja_JP: cometapi_small.png + pt_BR: cometapi_small.png +label: + en_US: CometAPI + zh_Hans: CometAPI + ja_JP: CometAPI + pt_BR: CometAPI +model_credential_schema: + credential_form_schemas: + - variable: api_key + label: + en_US: API Key + zh_Hans: API 密钥 + ja_JP: API キー + pt_BR: Chave da API + type: secret-input + required: true + placeholder: + en_US: Enter your API Key + zh_Hans: 在此输入您的 API Key + ja_JP: API キーを入力してください + pt_BR: Digite sua chave de API + - variable: mode + label: + en_US: Completion mode + zh_Hans: 完成模式 + ja_JP: 完了モード + pt_BR: Modo de conclusão + type: select + required: false + default: chat + options: + - label: + en_US: Completion + zh_Hans: 补全 + ja_JP: 完了 + pt_BR: Conclusão + value: completion + - label: + en_US: Chat + zh_Hans: 对话 + ja_JP: チャット + pt_BR: Chat + value: chat + placeholder: + en_US: Select completion mode + zh_Hans: 选择对话类型 + ja_JP: 完了モードを選択 + pt_BR: Selecione o modo de conclusão + show_on: + - variable: __model_type + value: 
"""
Automatic _position.yaml generator.

Recursively scans the model YAML files under ``models/<model_type>/<provider>/``,
renames files whose name does not match the ``model`` field inside them, groups
the models by provider, and (re)generates one ``_position.yaml`` per model type.
"""

from collections import defaultdict
from pathlib import Path


def find_yaml_files(root_dir):
    """
    Recursively collect model YAML files under *root_dir*.

    Files starting with '_' (e.g. _position.yaml) and manifest.yaml are
    skipped, as is any YAML without a top-level ``model`` field.  File names
    are corrected to match the model name as a side effect.
    """
    # Imported lazily so the pure formatting helpers below can be used
    # without PyYAML installed.
    import yaml

    yaml_files = []
    root_path = Path(root_dir)

    for yaml_file in root_path.rglob("*.yaml"):
        # Skip special/bookkeeping files.
        if yaml_file.name.startswith('_') or yaml_file.name in ['manifest.yaml']:
            continue

        # Only keep real model configuration files (must contain a `model` field).
        try:
            with open(yaml_file, 'r', encoding='utf-8') as f:
                content = yaml.safe_load(f)
            if isinstance(content, dict) and 'model' in content:
                yaml_files.append(check_and_fix_filename(yaml_file))
        except Exception as e:
            print(f"Warning: Unable to read file {yaml_file}: {e}")

    return yaml_files


def extract_provider_from_path(yaml_file, base_dir):
    """
    Extract the provider name from a file path.

    Example: models/llm/openai/gpt-4.yaml -> openai
    """
    relative_path = yaml_file.relative_to(base_dir)
    parts = relative_path.parts

    # Files live directly in provider directories, e.g. openai/gpt-4.yaml,
    # so the provider is the first path component.
    if len(parts) >= 2:
        return parts[0]

    return "unknown"


def check_and_fix_filename(yaml_file):
    """
    Rename *yaml_file* so its stem matches the ``model`` field inside it.

    Returns the (possibly renamed) path; on any problem the original path is
    returned unchanged and a message is printed.
    """
    import yaml  # lazy import, see find_yaml_files

    try:
        with open(yaml_file, 'r', encoding='utf-8') as f:
            content = yaml.safe_load(f)
        model_name = content.get('model', '')

        if not model_name:
            print(f"Warning: No 'model' field found in {yaml_file}")
            return yaml_file

        if yaml_file.stem != model_name:
            new_file_path = yaml_file.parent / f"{model_name}.yaml"

            # Never clobber an existing file with a different path.
            if new_file_path.exists() and new_file_path != yaml_file:
                print(
                    f"Error: Cannot rename {yaml_file.name} to "
                    f"{new_file_path.name} - target file already exists"
                )
                return yaml_file

            try:
                yaml_file.rename(new_file_path)
                print(
                    f"Renamed: {yaml_file.name} -> {new_file_path.name} "
                    f"(model: {model_name})"
                )
                return new_file_path
            except Exception as e:
                print(f"Error renaming {yaml_file.name} to {new_file_path.name}: {e}")
                return yaml_file

        return yaml_file

    except Exception as e:
        print(f"Error checking filename for {yaml_file}: {e}")
        return yaml_file


def load_model_info(yaml_file):
    """
    Load a model YAML file and return its basic info dict, or None on error.
    """
    import yaml  # lazy import, see find_yaml_files

    try:
        with open(yaml_file, 'r', encoding='utf-8') as f:
            content = yaml.safe_load(f)
        return {
            'model': content.get('model', ''),
            'label': content.get('label', {}).get('en_US', content.get('model', '')),
            'model_type': content.get('model_type', 'llm'),
            'file_path': yaml_file,
        }
    except Exception as e:
        print(f"Error loading {yaml_file}: {e}")
        return None


def group_models_by_provider(yaml_files, base_dir):
    """
    Group model info dicts by provider name.
    """
    provider_groups = defaultdict(list)

    for yaml_file in yaml_files:
        provider = extract_provider_from_path(yaml_file, base_dir)
        model_info = load_model_info(yaml_file)

        if model_info:
            provider_groups[provider].append(model_info)

    return provider_groups


def get_provider_display_name(provider):
    """
    Display name for a provider (currently just the folder name as-is).
    """
    return provider


def generate_position_yaml_content(provider_groups):
    """
    Render the _position.yaml body: providers sorted by name, each with a
    comment header and its models sorted by name, blank-line separated.
    """
    lines = []

    for provider in sorted(provider_groups.keys()):
        models = provider_groups[provider]
        if not models:
            continue

        display_name = get_provider_display_name(provider)
        lines.append(f"# {display_name} models ({len(models)})")

        for model in sorted(models, key=lambda x: x['model']):
            lines.append(f"- {model['model']}")

        lines.append("")  # blank-line separator between providers

    # Drop the trailing separator so the file ends with exactly one newline.
    if lines and lines[-1] == "":
        lines.pop()

    return "\n".join(lines) + "\n"


def update_position_yaml(model_type_dir, provider_groups):
    """
    Write the generated content to <model_type_dir>/_position.yaml, backing up
    any existing file first.
    """
    position_file = model_type_dir / "_position.yaml"

    if not provider_groups:
        print(f"No models found for {model_type_dir}")
        return

    content = generate_position_yaml_content(provider_groups)

    try:
        if position_file.exists():
            backup_file = position_file.with_suffix('.yaml.backup')
            position_file.rename(backup_file)
            print(f"Backed up existing file to {backup_file}")

        with open(position_file, 'w', encoding='utf-8') as f:
            f.write(content)

        print(f"Updated {position_file}")
        print(f"Total models: {sum(len(models) for models in provider_groups.values())}")

    except Exception as e:
        print(f"Error updating {position_file}: {e}")


def main():
    """
    Regenerate _position.yaml for every model type directory under ../models.
    """
    # The script lives in tools/, so the plugin root is its parent.
    script_dir = Path(__file__).parent
    cometapi_dir = script_dir.parent
    models_dir = cometapi_dir / "models"

    if not models_dir.exists():
        print(f"Models directory not found: {models_dir}")
        return

    print(f"Scanning models directory: {models_dir}")

    model_types = ['llm', 'text_embedding', 'rerank', 'tts', 'speech2text', 'moderation']

    for model_type in model_types:
        model_type_dir = models_dir / model_type
        if not model_type_dir.exists():
            continue

        print(f"\nProcessing {model_type} models...")

        # Reuse find_yaml_files instead of duplicating the scan/validate/fix
        # logic inline (the original re-implemented the same loop here).
        yaml_files = find_yaml_files(model_type_dir)

        if not yaml_files:
            print(f"No YAML files found in {model_type_dir}")
            continue

        provider_groups = group_models_by_provider(yaml_files, model_type_dir)
        update_position_yaml(model_type_dir, provider_groups)

    print("\n✅ _position.yaml files updated successfully!")


if __name__ == "__main__":
    main()
+ # Check exclude patterns + if any(yaml_file.match(pattern) or yaml_file.name == pattern for pattern in exclude_patterns): + continue + + # Verify it's a readable YAML file + try: + with open(yaml_file, 'r', encoding='utf-8') as f: + yaml.safe_load(f) + yaml_files.append(yaml_file) + except Exception as e: + self.errors.append(f"Warning: Unable to read {yaml_file}: {e}") + + return yaml_files + + def delete_nested_object(self, data: Dict[Any, Any], key_path: str) -> bool: + """ + Delete nested object using dot notation + Returns True if object was found and deleted + """ + keys = key_path.split('.') + current = data + + # Navigate to parent of target object + for key in keys[:-1]: + if not isinstance(current, dict) or key not in current: + return False + current = current[key] + + # Delete the target key + target_key = keys[-1] + if isinstance(current, dict) and target_key in current: + del current[target_key] + return True + + return False + + def add_nested_object(self, data: Dict[Any, Any], key_path: str, value: Any, array_mode: str = 'auto') -> bool: + """ + Add nested object using dot notation + array_mode: 'auto' (detect), 'append', 'prepend', 'insert:index', 'replace' + Returns True if object was successfully added + """ + keys = key_path.split('.') + current = data + + # Navigate to parent, creating nested dicts as needed + for key in keys[:-1]: + if not isinstance(current, dict): + return False + if key not in current: + current[key] = {} + current = current[key] + + # Handle the target key + target_key = keys[-1] + if not isinstance(current, dict): + return False + + # If auto mode, detect if target should be array or object + if array_mode == 'auto': + # Check if target already exists and is an array + if target_key in current and isinstance(current[target_key], list): + # Target is an array, append to it + current[target_key].append(value) + return True + else: + # Target is not an array, set as regular object + current[target_key] = value + return 
True + + # Handle specific array modes + if array_mode == 'replace': + current[target_key] = value + return True + + # For other array modes, ensure target is an array + if target_key not in current: + current[target_key] = [] + + if not isinstance(current[target_key], list): + return False + + # Add element based on array mode + if array_mode == 'append': + current[target_key].append(value) + elif array_mode == 'prepend': + current[target_key].insert(0, value) + elif array_mode == 'insert': + # For insert mode, we need an index from the operation + return False # This should be handled by the caller + else: + return False + + return True + + def modify_nested_object(self, data: Dict[Any, Any], key_path: str, value: Any) -> bool: + """ + Modify existing nested object using dot notation + Returns True if object was found and modified + """ + keys = key_path.split('.') + current = data + + # Navigate to parent of target object + for key in keys[:-1]: + if not isinstance(current, dict) or key not in current: + return False + current = current[key] + + # Modify the target key only if it exists + target_key = keys[-1] + if isinstance(current, dict) and target_key in current: + current[target_key] = value + return True + + return False + + def get_nested_object(self, data: Dict[Any, Any], key_path: str) -> tuple[Any, bool]: + """ + Get nested object using dot notation + Returns (object, exists) tuple + """ + keys = key_path.split('.') + current = data + + # Navigate to target object + for key in keys: + if not isinstance(current, dict) or key not in current: + return None, False + current = current[key] + + return current, True + + def is_array_target(self, data: Dict[Any, Any], key_path: str) -> bool: + """ + Check if the target object is an array + """ + obj, exists = self.get_nested_object(data, key_path) + return exists and isinstance(obj, list) + + def parse_value(self, value_str: str) -> Any: + """ + Parse string value to appropriate type + """ + if not value_str: 
+ return "" + + # Try to parse as JSON first (handles objects, arrays, etc.) + try: + return json.loads(value_str) + except json.JSONDecodeError: + # If not JSON, return as string + return value_str + + def add_to_array(self, data: Dict[Any, Any], key_path: str, value: Any, mode: str = 'append') -> bool: + """ + Add element to array using dot notation + mode: 'append' (default), 'prepend', 'insert:index' + Returns True if element was successfully added + """ + keys = key_path.split('.') + current = data + + # Navigate to parent of target array + for key in keys[:-1]: + if not isinstance(current, dict): + return False + if key not in current: + current[key] = {} + current = current[key] + + # Get the target array + target_key = keys[-1] + if not isinstance(current, dict): + return False + + # Create array if it doesn't exist + if target_key not in current: + current[target_key] = [] + + # Ensure target is an array + if not isinstance(current[target_key], list): + return False + + # Add element based on mode + if mode == 'append': + current[target_key].append(value) + elif mode == 'prepend': + current[target_key].insert(0, value) + elif mode.startswith('insert:'): + try: + index = int(mode.split(':')[1]) + current[target_key].insert(index, value) + except (ValueError, IndexError): + return False + else: + return False + + return True + + def remove_from_array(self, data: Dict[Any, Any], key_path: str, value: Any = None, index: int = None) -> bool: + """ + Remove element from array using dot notation + Can remove by value or by index + Returns True if element was successfully removed + """ + keys = key_path.split('.') + current = data + + # Navigate to parent of target array + for key in keys[:-1]: + if not isinstance(current, dict) or key not in current: + return False + current = current[key] + + # Get the target array + target_key = keys[-1] + if not isinstance(current, dict) or target_key not in current: + return False + + if not isinstance(current[target_key], 
list): + return False + + target_array = current[target_key] + + # Remove by index + if index is not None: + try: + target_array.pop(index) + return True + except IndexError: + return False + + # Remove by value + if value is not None: + try: + target_array.remove(value) + return True + except ValueError: + return False + + return False + + def modify_array_element(self, data: Dict[Any, Any], key_path: str, index: int, value: Any) -> bool: + """ + Modify array element at specific index + Returns True if element was successfully modified + """ + keys = key_path.split('.') + current = data + + # Navigate to parent of target array + for key in keys[:-1]: + if not isinstance(current, dict) or key not in current: + return False + current = current[key] + + # Get the target array + target_key = keys[-1] + if not isinstance(current, dict) or target_key not in current: + return False + + if not isinstance(current[target_key], list): + return False + + target_array = current[target_key] + + # Modify element at index + try: + target_array[index] = value + return True + except IndexError: + return False + + def process_file(self, yaml_file: Path, operations: List[Dict[str, Any]]) -> bool: + """ + Process a single YAML file to perform specified operations + Returns True if file was modified + """ + self.processed_files.append(yaml_file) + + try: + # Load YAML content + with open(yaml_file, 'r', encoding='utf-8') as f: + content = yaml.safe_load(f) + + if not isinstance(content, dict): + return False + + # Track if any modifications were made + modified = False + completed_operations = [] + + # Perform each operation + for operation in operations: + op_type = operation['type'] + key_path = operation['key'] + value = operation.get('value') + target_type = operation.get('target_type', 'auto') + array_mode = operation.get('array_mode', 'append') + index = operation.get('index') + + success = False + op_description = "" + + if op_type == 'delete': + success = 
self.delete_nested_object(content, key_path) + op_description = f"delete {key_path}" + + elif op_type == 'add': + if target_type == 'object': + success = self.add_nested_object(content, key_path, value, 'replace') + op_description = f"add object {key_path}={value}" + elif target_type == 'array': + if array_mode == 'insert' and index is not None: + # Special handling for insert with index + keys = key_path.split('.') + current = content + for key in keys[:-1]: + if not isinstance(current, dict): + success = False + break + if key not in current: + current[key] = {} + current = current[key] + else: + target_key = keys[-1] + if isinstance(current, dict): + if target_key not in current: + current[target_key] = [] + if isinstance(current[target_key], list): + try: + current[target_key].insert(index, value) + success = True + except IndexError: + success = False + op_description = f"insert into array {key_path}[{index}]: {value}" + else: + success = self.add_nested_object(content, key_path, value, array_mode) + op_description = f"{array_mode} to array {key_path}: {value}" + else: # auto + success = self.add_nested_object(content, key_path, value, 'auto') + op_description = f"add (auto-detect) {key_path}={value}" + + elif op_type == 'modify': + if target_type == 'array' and index is not None: + success = self.modify_array_element(content, key_path, index, value) + op_description = f"modify array {key_path}[{index}]={value}" + else: + success = self.modify_nested_object(content, key_path, value) + op_description = f"modify {key_path}={value}" + + elif op_type == 'replace': + success = self.add_nested_object(content, key_path, value, 'replace') + type_desc = "array" if target_type == 'array' else "object" + op_description = f"replace {type_desc} {key_path}={value}" + + elif op_type == 'remove': + if target_type == 'array': + if array_mode == 'by-value': + success = self.remove_from_array(content, key_path, value=value) + op_description = f"remove from array {key_path}: 
{value}" + elif array_mode == 'by-index' and index is not None: + success = self.remove_from_array(content, key_path, index=index) + op_description = f"remove from array {key_path}[{index}]" + else: + success = False + else: + success = self.delete_nested_object(content, key_path) + op_description = f"remove {key_path}" + + if success: + completed_operations.append(op_description) + modified = True + + # If no modifications, skip file + if not modified: + return False + + # In dry-run mode, just report what would be done + if self.dry_run: + print(f"[DRY RUN] Would perform on {yaml_file.relative_to(yaml_file.parents[3])}: {', '.join(completed_operations)}") + return True + + # Create backup if enabled + if self.backup: + backup_file = yaml_file.with_suffix(f'.yaml.backup.{datetime.now().strftime("%Y%m%d_%H%M%S")}') + shutil.copy2(yaml_file, backup_file) + + # Write modified content back to file with proper YAML formatting + with open(yaml_file, 'w', encoding='utf-8') as f: + yaml.dump(content, f, + default_flow_style=False, + allow_unicode=True, + sort_keys=False, + indent=2, + width=float('inf'), # Prevent line wrapping + default_style=None) + + self.modified_files.append(yaml_file) + print(f"✓ Modified {yaml_file.relative_to(yaml_file.parents[3])}: {', '.join(completed_operations)}") + + return True + + except Exception as e: + error_msg = f"Error processing {yaml_file}: {e}" + self.errors.append(error_msg) + print(f"✗ {error_msg}") + return False + + def batch_operate(self, root_dir: Path, operations: List[Dict[str, Any]], include_patterns: List[str] = None, exclude_patterns: List[str] = None) -> None: + """ + Batch perform operations on objects in YAML files + """ + print(f"🔍 Scanning directory: {root_dir}") + + # Print operations summary + op_summary = [] + for op in operations: + op_type = op['type'] + key = op['key'] + value = op.get('value') + if op_type == 'delete': + op_summary.append(f"delete {key}") + elif value is not None: + 
op_summary.append(f"{op_type} {key}={value}") + else: + op_summary.append(f"{op_type} {key}") + + print(f"🎯 Operations to perform: {', '.join(op_summary)}") + + if self.dry_run: + print("🧪 Running in DRY-RUN mode - no files will be modified") + + # Find all YAML files + yaml_files = self.find_yaml_files(root_dir, exclude_patterns) + + # Filter by include patterns (now required) + # Convert patterns to work with relative paths from root_dir + filtered_files = [] + for yaml_file in yaml_files: + # Get relative path from root_dir + try: + rel_path = yaml_file.relative_to(root_dir) + # Check if any include pattern matches the relative path + if any(fnmatch.fnmatch(str(rel_path), pattern) for pattern in include_patterns): + filtered_files.append(yaml_file) + except ValueError: + # File is not under root_dir, skip it + continue + yaml_files = filtered_files + + if not yaml_files: + print("❌ No YAML files found matching criteria") + return + + print(f"📄 Found {len(yaml_files)} YAML files to process") + print() + + # Process each file + for yaml_file in yaml_files: + self.process_file(yaml_file, operations) + + # Print summary + print() + print("=" * 60) + print("📊 SUMMARY") + print("=" * 60) + print(f"Total files processed: {len(self.processed_files)}") + print(f"Files modified: {len(self.modified_files)}") + print(f"Errors encountered: {len(self.errors)}") + + if self.errors: + print("\n⚠️ ERRORS:") + for error in self.errors: + print(f" - {error}") + + if self.modified_files and not self.dry_run: + print(f"\n✅ Successfully modified {len(self.modified_files)} files") + if self.backup: + print("💾 Backup files created with timestamp suffix") + + +def parse_arguments(): + """Parse command line arguments""" + parser = argparse.ArgumentParser( + description="Batch operations on objects in YAML files", + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=""" +使用示例: + # 删除对象 - 推荐用法 + python tools/yaml_batch_operator.py --operator delete --key pricing --dry-run 
--include "models/llm/openai/*.yaml" + + # 添加对象 + python tools/yaml_batch_operator.py --operator add --key pricing --value '{"input": "0.50", "output": "0.75"}' --type object --include "models/llm/anthropic/*.yaml" + + # 修改对象 + python tools/yaml_batch_operator.py --operator modify --key pricing.input --value '"0.60"' --type object --include "models/llm/gemini/*.yaml" + """ + ) + + parser.add_argument( + '--operator', '-o', + action='append', + choices=['delete', 'add', 'modify', 'replace', 'remove'], + required=True, + help='Operation to perform (can be used multiple times for different operations)' + ) + + parser.add_argument( + '--key', '-k', + action='append', + required=True, + help='Object keys to operate on (supports dot notation for nested objects, use multiple times to match --operator)' + ) + + parser.add_argument( + '--value', '-v', + action='append', + help='Values for add/modify operations (JSON format supported, use multiple times to match --operator)' + ) + + parser.add_argument( + '--type', '-t', + action='append', + choices=['object', 'array', 'auto'], + default=None, + help='Specify target type: object (key-value), array (list with -), or auto (detect). Use multiple times to match --operator' + ) + + parser.add_argument( + '--array-mode', + action='append', + choices=['append', 'prepend', 'insert', 'by-value', 'by-index'], + help='Array operation mode: append (end), prepend (start), insert (at index), by-value (remove), by-index (remove). 
Use multiple times to match --operator' + ) + + parser.add_argument( + '--index', '-i', + action='append', + type=int, + help='Index for array operations (use multiple times to match --operator)' + ) + + parser.add_argument( + '--dir', + type=str, + default='.', + help='Root directory to search (default: current directory)' + ) + + parser.add_argument( + '--include', + nargs='+', + required=True, + help='只处理匹配这些模式的文件 (使用相对路径,例如 "models/llm/openai/*.yaml" 或 "openai/*.yaml") - 出于安全考虑此参数为必需' + ) + + parser.add_argument( + '--exclude', + nargs='+', + default=['_position.yaml', 'manifest.yaml', '*.backup'], + help='Exclude files matching these patterns' + ) + + parser.add_argument( + '--dry-run', + action='store_true', + help='Preview changes without modifying files' + ) + + parser.add_argument( + '--no-backup', + action='store_true', + help='Skip creating backup files' + ) + + parser.add_argument( + '--verbose', '-b', + action='store_true', + help='Enable verbose output' + ) + + return parser.parse_args() + + +def validate_arguments(args): + """Validate command line arguments""" + # Check if directory exists + root_dir = Path(args.dir) + if not root_dir.exists(): + print(f"❌ Error: Directory '{args.dir}' does not exist") + sys.exit(1) + + if not root_dir.is_dir(): + print(f"❌ Error: '{args.dir}' is not a directory") + sys.exit(1) + + # Check that operator and key lists have compatible lengths + num_operators = len(args.operator) + num_keys = len(args.key) + num_values = len(args.value) if args.value else 0 + num_types = len(args.type) if args.type else 0 + num_array_modes = len(args.array_mode) if args.array_mode else 0 + num_indices = len(args.index) if args.index else 0 + + if num_operators != num_keys: + print(f"❌ Error: Number of operators ({num_operators}) must match number of keys ({num_keys})") + sys.exit(1) + + # Check if values are required for certain operations + for i, op in enumerate(args.operator): + if op in ['add', 'modify', 'replace', 'remove']: + # 
For remove operation with array type, value might be required + if op == 'remove': + array_mode = args.array_mode[i] if args.array_mode and i < len(args.array_mode) else None + if array_mode == 'by-value' and (not args.value or i >= len(args.value)): + print(f"❌ Error: Remove operation with 'by-value' mode requires a value for key '{args.key[i]}'") + sys.exit(1) + elif array_mode == 'by-index' and (not args.index or i >= len(args.index)): + print(f"❌ Error: Remove operation with 'by-index' mode requires an index for key '{args.key[i]}'") + sys.exit(1) + elif op in ['add', 'modify', 'replace']: + if not args.value or i >= len(args.value) or not args.value[i]: + print(f"❌ Error: Operation '{op}' requires a value for key '{args.key[i]}'") + sys.exit(1) + + # Validate key format + for key in args.key: + if not key.strip(): + print(f"❌ Error: Empty key specified") + sys.exit(1) + + return root_dir + + +def prepare_operations(args) -> List[Dict[str, Any]]: + """Prepare operations list from arguments""" + operations = [] + operator = YAMLObjectOperator() # Temporary instance for value parsing + + for i, op_type in enumerate(args.operator): + operation = { + 'type': op_type, + 'key': args.key[i] + } + + # Add value for operations that need it + if op_type in ['add', 'modify', 'replace'] and args.value and i < len(args.value): + operation['value'] = operator.parse_value(args.value[i]) + elif op_type == 'remove': + # For remove operation, value might be needed for by-value mode + array_mode = args.array_mode[i] if args.array_mode and i < len(args.array_mode) else None + if array_mode == 'by-value' and args.value and i < len(args.value): + operation['value'] = operator.parse_value(args.value[i]) + + # Add target type + if args.type and i < len(args.type): + operation['target_type'] = args.type[i] + else: + operation['target_type'] = 'auto' + + # Add array mode + if args.array_mode and i < len(args.array_mode): + operation['array_mode'] = args.array_mode[i] + else: + 
operation['array_mode'] = 'append' # default + + # Add index for operations that need it + if args.index and i < len(args.index): + operation['index'] = args.index[i] + + operations.append(operation) + + return operations + + +def main(): + """Main function""" + args = parse_arguments() + root_dir = validate_arguments(args) + + # Prepare operations + operations = prepare_operations(args) + + # Create operator instance + operator = YAMLObjectOperator( + dry_run=args.dry_run, + backup=not args.no_backup + ) + + try: + # Perform batch operations + operator.batch_operate( + root_dir=root_dir, + operations=operations, + include_patterns=args.include, + exclude_patterns=args.exclude + ) + + except KeyboardInterrupt: + print("\n⏹️ Operation cancelled by user") + sys.exit(1) + except Exception as e: + print(f"\n❌ Unexpected error: {e}") + sys.exit(1) + + +if __name__ == "__main__": + main()