fix(route/huggingface): add huggingface group models detail #20646

WuNein · 2025-12-06T11:10:05Z

Involved Issue / 该 PR 相关 Issue

Close #

Example for the Proposed Route(s) / 路由地址示例

/huggingface/models/deepseek-ai
/huggingface/models/facebook
/huggingface/models/ianyang02

New RSS Route Checklist / 新 RSS 路由检查表

New Route / 新的路由
- Follows Script Standard / 跟随路由规范
Anti-bot or rate limit / 反爬/频率限制
- If yes, do your code reflect this sign? / 如果有, 是否有对应的措施?
[x ] Date and time / 日期和时间
- [ x] Parsed / 可以解析
- [ x] Correct time zone / 时区正确
New package added / 添加了新的包
Puppeteer

Note / 说明

在今天#20631 merge的基础上增加了模型的详细内容

Adding model detail on the base of #20631

github-actions · 2025-12-06T11:14:57Z

Successfully generated as following:

http://localhost:1200/huggingface/models/deepseek-ai - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Huggingface deepseek-ai Models</title>
    <link>https://huggingface.co/deepseek-ai/models?sort=created</link>
    <atom:link href="http://localhost:1200/huggingface/models/deepseek-ai" rel="self" type="application/rss+xml"></atom:link>
    <description>Huggingface deepseek-ai Models - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sat, 06 Dec 2025 11:14:50 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2-Speciale</title>
      <description>deepseek-ai/DeepSeek-V3.2-Speciale Text Generation • 685B • Updated 5 days ago • 4.72k • 516
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.2-Exp-Base
        base_model_relation: finetune
        ---
        # DeepSeek-V3.2: Efficient Reasoning &amp;amp; Agentic AI
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2/blob/main/assets/paper.pdf&quot;&gt;&lt;b&gt;Technical Report&lt;/b&gt;👁️&lt;/a&gt;
        &lt;/p&gt;
        ## Introduction
        We introduce **DeepSeek-V3.2**, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:
        1. **DeepSeek Sparse Attention (DSA):** We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.
        2. **Scalable Reinforcement Learning Framework:** By implementing a robust RL protocol and scaling post-training compute, *DeepSeek-V3.2* performs comparably to GPT-5. Notably, our high-compute variant, **DeepSeek-V3.2-Speciale**, **surpasses GPT-5** and exhibits reasoning proficiency on par with Gemini-3.0-Pro.
        - *Achievement:* 🥇 **Gold-medal performance** in the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI).
        3. **Large-Scale Agentic Task Synthesis Pipeline:** To integrate **reasoning into tool-use** scenarios, we developed a novel synthesis pipeline that systematically generates training data at scale. This facilitates scalable agentic post-training, improving compliance and generalization in complex interactive environments.
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/benchmark.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        We have also released the final submissions for IOI 2025, ICPC World Finals, IMO 2025 and CMO 2025, which were selected based on our designed pipeline. These materials are provided for the community to conduct secondary verification. The files can be accessed at `assets/olympiad_cases`.
        ## Chat Template
        DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a &quot;thinking with tools&quot; capability.
        To assist the community in understanding and adapting to this new template, we have provided a dedicated `encoding` folder, which contains Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model and how to parse the model&#39;s text output.
        A brief example is illustrated below:
        ```python
        import transformers
        # encoding/encoding_dsv32.py
        from encoding_dsv32 import encode_messages, parse_message_from_completion_text
        tokenizer = transformers.AutoTokenizer.from_pretrained(&quot;deepseek-ai/DeepSeek-V3.2&quot;)
        messages = [
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;hello&quot;},
        {&quot;role&quot;: &quot;assistant&quot;, &quot;content&quot;: &quot;Hello! I am DeepSeek.&quot;, &quot;reasoning_content&quot;: &quot;thinking...&quot;},
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;1+1=?&quot;}
        ]
        encode_config = dict(thinking_mode=&quot;thinking&quot;, drop_thinking=True, add_default_bos_token=True)
        # messages -&amp;gt; string
        prompt = encode_messages(messages, **encode_config)
        # Output: &quot;&amp;lt;｜begin▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;hello&amp;lt;｜Assistant｜&amp;gt;Hello! I am DeepSeek.&amp;lt;｜end▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;1+1=?&amp;lt;｜Assistant｜&amp;gt;&lt;think&gt;&quot;
        # string -&amp;gt; tokens
        tokens = tokenizer.encode(prompt)
        # Output: [0, 128803, 33310, 128804, 128799, 19923, 3, 342, 1030, 22651, 4374, 1465, 16, 1, 128803, 19, 13, 19, 127252, 128804, 128798]
        ```
        Important Notes:
        1. This release does not include a Jinja-format chat template. Please refer to the Python code mentioned above.
        2. The output parsing function included in the code is designed to handle well-formatted strings only. It does not attempt to correct or recover from malformed output that the model might occasionally generate. It is not suitable for production use without robust error handling.
        3. A new role named `developer` has been introduced in the chat template. This role is dedicated exclusively to search agent scenarios and is designated for no other tasks. The official API does not accept messages assigned to `developer`.
        ## How to Run Locally
        The model structure of DeepSeek-V3.2 and DeepSeek-V3.2-Speciale are the same as DeepSeek-V3.2-Exp. Please visit [DeepSeek-V3.2-Exp](https://github.com/deepseek-ai/DeepSeek-V3.2-Exp) repo for more information about running this model locally.
        Usage Recommendations:
        1. For local deployment, we recommend setting the sampling parameters to `temperature = 1.0, top_p = 0.95`.
        2. Please note that the DeepSeek-V3.2-Speciale variant is designed exclusively for deep reasoning tasks and does not support the tool-calling functionality.
        ## License
        This repository and the model weights are licensed under the [MIT License](LICENSE).
        ## Citation
        ```
        @misc{deepseekai2025deepseekv32,
        title={DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models},
        author={DeepSeek-AI},
        year={2025},
        }
        ```
        ## Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).
        &lt;/think&gt;</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale</guid>
      <pubDate>Mon, 01 Dec 2025 03:06:03 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2</title>
      <description>deepseek-ai/DeepSeek-V3.2 Text Generation • 685B • Updated 5 days ago • 18.1k • • 738
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.2-Exp-Base
        base_model_relation: finetune
        ---
        # DeepSeek-V3.2: Efficient Reasoning &amp;amp; Agentic AI
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2/blob/main/assets/paper.pdf&quot;&gt;&lt;b&gt;Technical Report&lt;/b&gt;👁️&lt;/a&gt;
        &lt;/p&gt;
        ## Introduction
        We introduce **DeepSeek-V3.2**, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:
        1. **DeepSeek Sparse Attention (DSA):** We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.
        2. **Scalable Reinforcement Learning Framework:** By implementing a robust RL protocol and scaling post-training compute, *DeepSeek-V3.2* performs comparably to GPT-5. Notably, our high-compute variant, **DeepSeek-V3.2-Speciale**, **surpasses GPT-5** and exhibits reasoning proficiency on par with Gemini-3.0-Pro.
        - *Achievement:* 🥇 **Gold-medal performance** in the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI).
        3. **Large-Scale Agentic Task Synthesis Pipeline:** To integrate **reasoning into tool-use** scenarios, we developed a novel synthesis pipeline that systematically generates training data at scale. This facilitates scalable agentic post-training, improving compliance and generalization in complex interactive environments.
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/benchmark.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        We have also released the final submissions for IOI 2025, ICPC World Finals, IMO 2025 and CMO 2025, which were selected based on our designed pipeline. These materials are provided for the community to conduct secondary verification. The files can be accessed at `assets/olympiad_cases`.
        ## Chat Template
        DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a &quot;thinking with tools&quot; capability.
        To assist the community in understanding and adapting to this new template, we have provided a dedicated `encoding` folder, which contains Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model and how to parse the model&#39;s text output.
        A brief example is illustrated below:
        ```python
        import transformers
        # encoding/encoding_dsv32.py
        from encoding_dsv32 import encode_messages, parse_message_from_completion_text
        tokenizer = transformers.AutoTokenizer.from_pretrained(&quot;deepseek-ai/DeepSeek-V3.2&quot;)
        messages = [
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;hello&quot;},
        {&quot;role&quot;: &quot;assistant&quot;, &quot;content&quot;: &quot;Hello! I am DeepSeek.&quot;, &quot;reasoning_content&quot;: &quot;thinking...&quot;},
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;1+1=?&quot;}
        ]
        encode_config = dict(thinking_mode=&quot;thinking&quot;, drop_thinking=True, add_default_bos_token=True)
        # messages -&amp;gt; string
        prompt = encode_messages(messages, **encode_config)
        # Output: &quot;&amp;lt;｜begin▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;hello&amp;lt;｜Assistant｜&amp;gt;Hello! I am DeepSeek.&amp;lt;｜end▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;1+1=?&amp;lt;｜Assistant｜&amp;gt;&lt;think&gt;&quot;
        # string -&amp;gt; tokens
        tokens = tokenizer.encode(prompt)
        # Output: [0, 128803, 33310, 128804, 128799, 19923, 3, 342, 1030, 22651, 4374, 1465, 16, 1, 128803, 19, 13, 19, 127252, 128804, 128798]
        ```
        Important Notes:
        1. This release does not include a Jinja-format chat template. Please refer to the Python code mentioned above.
        2. The output parsing function included in the code is designed to handle well-formatted strings only. It does not attempt to correct or recover from malformed output that the model might occasionally generate. It is not suitable for production use without robust error handling.
        3. A new role named `developer` has been introduced in the chat template. This role is dedicated exclusively to search agent scenarios and is designated for no other tasks. The official API does not accept messages assigned to `developer`.
        ## How to Run Locally
        The model structure of DeepSeek-V3.2 and DeepSeek-V3.2-Speciale are the same as DeepSeek-V3.2-Exp. Please visit [DeepSeek-V3.2-Exp](https://github.com/deepseek-ai/DeepSeek-V3.2-Exp) repo for more information about running this model locally.
        Usage Recommendations:
        1. For local deployment, we recommend setting the sampling parameters to `temperature = 1.0, top_p = 0.95`.
        2. Please note that the DeepSeek-V3.2-Speciale variant is designed exclusively for deep reasoning tasks and does not support the tool-calling functionality.
        ## License
        This repository and the model weights are licensed under the [MIT License](LICENSE).
        ## Citation
        ```
        @misc{deepseekai2025deepseekv32,
        title={DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models},
        author={DeepSeek-AI},
        year={2025},
        }
        ```
        ## Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).
        &lt;/think&gt;</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2</guid>
      <pubDate>Mon, 01 Dec 2025 03:04:59 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-Math-V2</title>
      <description>deepseek-ai/DeepSeek-Math-V2 Text Generation • 685B • Updated 9 days ago • 8.96k • 637
        ---
        license: apache-2.0
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-Math-V2
        ---
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot;&gt;&lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot;&gt;&lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot;&gt;&lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;br&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot;&gt;&lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot;&gt;&lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot;&gt;&lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;br&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-Apache%202.0-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;br&gt;
        &lt;/div&gt;
        # DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
        ## 1. Introduction
        Large language models have made significant progress in mathematical reasoning, which serves as an important testbed for AI and could impact scientific research if further advanced.
        By scaling reasoning with reinforcement learning that rewards correct final answers, LLMs have improved from poor performance to saturating quantitative reasoning competitions like AIME and HMMT in one year.
        However, this approach faces fundamental limitations.
        Pursuing higher final answer accuracy doesn&#39;t address a key issue: correct answers don&#39;t guarantee correct reasoning.
        Moreover, many mathematical tasks like theorem proving require rigorous step-by-step derivation rather than numerical answers, making final answer rewards inapplicable.
        To push the limits of deep reasoning, we believe it is necessary to verify the comprehensiveness and rigor of mathematical reasoning.
        Self-verification is particularly important for scaling test-time compute, especially for open problems without known solutions.
        Towards self-verifiable mathematical reasoning, we investigate how to train an accurate and faithful LLM-based verifier for theorem proving.
        We then train a proof generator using the verifier as the reward model, and incentivize the generator to identify and resolve as many issues as possible in their own proofs before finalizing them.
        To maintain the generation-verification gap as the generator becomes stronger, we propose to scale verification compute to automatically label new hard-to-verify proofs, creating training data to further improve the verifier.
        Our resulting model, DeepSeekMath-V2, demonstrates strong theorem-proving capabilities, achieving gold-level scores on IMO 2025 and CMO 2024 and a near-perfect 118/120 on Putnam 2024 with scaled test-time compute.
        While much work remains, these results suggest that self-verifiable mathematical reasoning is a feasible research direction that may help develop more capable mathematical AI systems.
        ## 2. Evaluation Results
        Below are evaluation results on [IMO-ProofBench](https://github.com/google-deepmind/superhuman/tree/main/imobench) (developed by the DeepMind team behind DeepThink IMO-Gold) and recent mathematics competitions including IMO 2025, CMO 2024, and Putnam 2024.
        **IMO-ProofBench**
        &lt;p align=&quot;center&quot;&gt;
        &lt;img width=&quot;100%&quot; src=&quot;https://raw.githubusercontent.com/deepseek-ai/DeepSeek-Math-V2/refs/heads/main/figures/IMO-ProofBench.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/p&gt;
        ---
        **Mathematics Competitions**
        &lt;p align=&quot;center&quot;&gt;
        &lt;img width=&quot;41%&amp;quot;&quot; src=&quot;https://raw.githubusercontent.com/deepseek-ai/DeepSeek-Math-V2/refs/heads/main/figures/Competitions.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/p&gt;
        ## 4. Quick Start
        DeepSeekMath-V2 is built on top of DeepSeek-V3.2-Exp-Base.
        For inference support, please refer to [the DeepSeek-V3.2-Exp github repository](https://github.com/deepseek-ai/DeepSeek-V3.2-Exp).
        ## 6. License
        This repository and the model weights are licensed under [the Apache License, Version 2.0 (Apache 2.0)](LICENSE).
        ## 7. Citation
        ```
        @misc{deepseek-math-v2,
        author = {Zhihong Shao, Yuxiang Luo, Chengda Lu, Z.Z. Ren, Jiewen Hu, Tian Ye, Zhibin Gou, Shirong Ma, Xiaokang Zhang},
        title = {DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning},
        year = {2025},
        }
        ```
        ## 8. Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-Math-V2</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-Math-V2</guid>
      <pubDate>Thu, 27 Nov 2025 02:35:52 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2-Exp</title>
      <description>deepseek-ai/DeepSeek-V3.2-Exp Text Generation • 685B • Updated 18 days ago • 66.2k • • 899
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.2-Exp-Base
        base_model_relation: finetune
        ---
        # DeepSeek-V3.2-Exp
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        ## Introduction
        We are excited to announce the official release of DeepSeek-V3.2-Exp, an experimental version of our model. As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.
        This experimental release represents our ongoing research into more efficient transformer architectures, particularly focusing on improving computational efficiency when processing extended text sequences.
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/cost.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        - DeepSeek Sparse Attention (DSA) achieves fine-grained sparse attention for the first time, delivering substantial improvements in long-context training and inference efficiency while maintaining virtually identical model output quality.
        - To rigorously evaluate the impact of introducing sparse attention, we deliberately aligned the training configurations of DeepSeek-V3.2-Exp with V3.1-Terminus. Across public benchmarks in various domains, DeepSeek-V3.2-Exp demonstrates performance on par with V3.1-Terminus.
        | Benchmark | DeepSeek-V3.1-Terminus | DeepSeek-V3.2-Exp |
        | :--- | :---: | :---: |
        | **Reasoning Mode w/o Tool Use** | | |
        | MMLU-Pro | 85.0 | 85.0 |
        | GPQA-Diamond | 80.7 | 79.9 |
        | Humanity&#39;s Last Exam | 21.7 | 19.8 |
        | LiveCodeBench | 74.9 | 74.1 |
        | AIME 2025 | 88.4 | 89.3 |
        | HMMT 2025 | 86.1 | 83.6 |
        | Codeforces | 2046 | 2121 |
        | Aider-Polyglot | 76.1 | 74.5 |
        | **Agentic Tool Use** | | |
        | BrowseComp | 38.5 | 40.1 |
        | BrowseComp-zh | 45.0 | 47.9 |
        | SimpleQA | 96.8 | 97.1 |
        | SWE Verified | 68.4 | 67.8 |
        | SWE-bench Multilingual | 57.8 | 57.9 |
        | Terminal-bench | 36.7 | 37.7 |
        ## Update
        - 2025.11.17: **We have identified that previous versions of the inference demo code contained an implementation discrepancy in Rotary Position Embedding (RoPE) within the indexer module, potentially leading to degraded model performance.** Specifically, the input tensor to RoPE in the indexer module requires a non-interleaved layout, whereas RoPE in the MLA module expects an interleaved layout. This issue has now been resolved. Please refer to the updated version of the inference demo code and take note of this implementation detail.
        ## How to Run Locally
        ### HuggingFace
        We provide an updated inference demo code in the [inference](https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp/tree/main/inference) folder to help the community quickly get started with our model and understand its architectural details.
        First convert huggingface model weights to the the format required by our inference demo. Set `MP` to match your available GPU count:
        ```bash
        cd inference
        export EXPERTS=256
        python convert.py --hf-ckpt-path ${HF_CKPT_PATH} --save-path ${SAVE_PATH} --n-experts ${EXPERTS} --model-parallel ${MP}
        ```
        Launch the interactive chat interface and start exploring DeepSeek&#39;s capabilities:
        ```bash
        export CONFIG=config_671B_v3.2.json
        torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --interactive
        ```
        ### SGLang
        #### Installation with Docker
        ```
        # H200
        docker pull lmsysorg/sglang:dsv32
        # MI350
        docker pull lmsysorg/sglang:dsv32-rocm
        # NPUs
        docker pull lmsysorg/sglang:dsv32-a2
        docker pull lmsysorg/sglang:dsv32-a3
        ```
        #### Launch Command
        ```bash
        python -m sglang.launch_server --model deepseek-ai/DeepSeek-V3.2-Exp --tp 8 --dp 8 --enable-dp-attention
        ```
        ### vLLM
        vLLM provides day-0 support of DeepSeek-V3.2-Exp. See the [recipes](https://docs.vllm.ai/projects/recipes/en/latest/DeepSeek/DeepSeek-V3_2-Exp.html) for up-to-date details.
        ## Open-Source Kernels
        For TileLang kernels with **better readability and research-purpose design**, please refer to [TileLang](https://github.com/tile-ai/tilelang/tree/main/examples/deepseek_v32).
        For **high-performance CUDA kernels**, indexer logit kernels (including paged versions) are available in [DeepGEMM](https://github.com/deepseek-ai/DeepGEMM/pull/200). Sparse attention kernels are released in [FlashMLA](https://github.com/deepseek-ai/FlashMLA/pull/98).
        ## License
        This repository and the model weights are licensed under the [MIT License](LICENSE).
        ## Citation
        ```
        @misc{deepseekai2024deepseekv32,
        title={DeepSeek-V3.2-Exp: Boosting Long-Context Efficiency with DeepSeek Sparse Attention},
        author={DeepSeek-AI},
        year={2025},
        }
        ```
        ## Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp</guid>
      <pubDate>Mon, 17 Nov 2025 18:39:54 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-OCR</title>
      <description>deepseek-ai/DeepSeek-OCR Image-Text-to-Text • 3B • Updated Nov 4 • 5.47M • 2.93k
        ---
        pipeline_tag: image-text-to-text
        language:
        - multilingual
        tags:
        - deepseek
        - vision-language
        - ocr
        - custom_code
        license: mit
        library_name: transformers
        ---
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek AI&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-OCR&quot;&gt;&lt;b&gt;🌟 Github&lt;/b&gt;&lt;/a&gt; |
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR&quot;&gt;&lt;b&gt;📥 Model Download&lt;/b&gt;&lt;/a&gt; |
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-OCR/blob/main/DeepSeek_OCR_paper.pdf&quot;&gt;&lt;b&gt;📄 Paper Link&lt;/b&gt;&lt;/a&gt; |
        &lt;a href=&quot;https://arxiv.org/abs/2510.18234&quot;&gt;&lt;b&gt;📄 Arxiv Paper Link&lt;/b&gt;&lt;/a&gt; |
        &lt;/p&gt;
        &lt;h2&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/papers/2510.18234&quot;&gt;DeepSeek-OCR: Contexts Optical Compression&lt;/a&gt;
        &lt;/p&gt;
        &lt;/h2&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/fig1.png&quot; style=&quot;width: 1000px&quot; align=&quot;center&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/p&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/papers/2510.18234&quot;&gt;Explore the boundaries of visual-text compression.&lt;/a&gt;
        &lt;/p&gt;
        ## Usage
        Inference using Huggingface transformers on NVIDIA GPUs. Requirements tested on python 3.12.9 + CUDA11.8：
        ```
        torch==2.6.0
        transformers==4.46.3
        tokenizers==0.20.3
        einops
        addict
        easydict
        pip install flash-attn==2.7.3 --no-build-isolation
        ```
        ```python
        from transformers import AutoModel, AutoTokenizer
        import torch
        import os
        os.environ[&quot;CUDA_VISIBLE_DEVICES&quot;] = &#39;0&#39;
        model_name = &#39;deepseek-ai/DeepSeek-OCR&#39;
        tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
        model = AutoModel.from_pretrained(model_name, _attn_implementation=&#39;flash_attention_2&#39;, trust_remote_code=True, use_safetensors=True)
        model = model.eval().cuda().to(torch.bfloat16)
        # prompt = &quot;&lt;img referrerpolicy=&quot;no-referrer&quot;&gt;\nFree OCR. &quot;
        prompt = &quot;&lt;img referrerpolicy=&quot;no-referrer&quot;&gt;\n&amp;lt;|grounding|&amp;gt;Convert the document to markdown. &quot;
        image_file = &#39;your_image.jpg&#39;
        output_path = &#39;your/output/dir&#39;
        # infer(self, tokenizer, prompt=&#39;&#39;, image_file=&#39;&#39;, output_path = &#39; &#39;, base_size = 1024, image_size = 640, crop_mode = True, test_compress = False, save_results = False):
        # Tiny: base_size = 512, image_size = 512, crop_mode = False
        # Small: base_size = 640, image_size = 640, crop_mode = False
        # Base: base_size = 1024, image_size = 1024, crop_mode = False
        # Large: base_size = 1280, image_size = 1280, crop_mode = False
        # Gundam: base_size = 1024, image_size = 640, crop_mode = True
        res = model.infer(tokenizer, prompt=prompt, image_file=image_file, output_path = output_path, base_size = 1024, image_size = 640, crop_mode=True, save_results = True, test_compress = True)
        ```
        ## vLLM
        Refer to [🌟GitHub](https://github.com/deepseek-ai/DeepSeek-OCR/) for guidance on model inference acceleration and PDF processing, etc.&lt;!-- --&gt;
        [2025/10/23] 🚀🚀🚀 DeepSeek-OCR is now officially supported in upstream [vLLM](https://docs.vllm.ai/projects/recipes/en/latest/DeepSeek/DeepSeek-OCR.html#installing-vllm).
        ```shell
        uv venv
        source .venv/bin/activate
        # Until v0.11.1 release, you need to install vLLM from nightly build
        uv pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
        ```
        ```python
        from vllm import LLM, SamplingParams
        from vllm.model_executor.models.deepseek_ocr import NGramPerReqLogitsProcessor
        from PIL import Image
        # Create model instance
        llm = LLM(
        model=&quot;deepseek-ai/DeepSeek-OCR&quot;,
        enable_prefix_caching=False,
        mm_processor_cache_gb=0,
        logits_processors=[NGramPerReqLogitsProcessor]
        )
        # Prepare batched input with your image file
        image_1 = Image.open(&quot;path/to/your/image_1.png&quot;).convert(&quot;RGB&quot;)
        image_2 = Image.open(&quot;path/to/your/image_2.png&quot;).convert(&quot;RGB&quot;)
        prompt = &quot;&lt;img referrerpolicy=&quot;no-referrer&quot;&gt;\nFree OCR.&quot;
        model_input = [
        {
        &quot;prompt&quot;: prompt,
        &quot;multi_modal_data&quot;: {&quot;image&quot;: image_1}
        },
        {
        &quot;prompt&quot;: prompt,
        &quot;multi_modal_data&quot;: {&quot;image&quot;: image_2}
        }
        ]
        sampling_param = SamplingParams(
        temperature=0.0,
        max_tokens=8192,
        # ngram logit processor args
        extra_args=dict(
        ngram_size=30,
        window_size=90,
        whitelist_token_ids={128821, 128822}, # whitelist: ,
        ),
        skip_special_tokens=False,
        )
        # Generate output
        model_outputs = llm.generate(model_input, sampling_param)
        # Print output
        for output in model_outputs:
        print(output.outputs[0].text)
        ```
        ## Visualizations
        &lt;table&gt;
        &lt;tbody&gt;&lt;tr&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/show1.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/show2.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/show3.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/show4.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;/tbody&gt;&lt;/table&gt;
        ## Acknowledgement
        We would like to thank [Vary](https://github.com/Ucas-HaoranWei/Vary/), [GOT-OCR2.0](https://github.com/Ucas-HaoranWei/GOT-OCR2.0/), [MinerU](https://github.com/opendatalab/MinerU), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [OneChart](https://github.com/LingyvKong/OneChart), [Slow Perception](https://github.com/Ucas-HaoranWei/Slow-Perception) for their valuable models and ideas.
        We also appreciate the benchmarks: [Fox](https://github.com/ucaslcl/Fox), [OminiDocBench](https://github.com/opendatalab/OmniDocBench).
        ## Citation
        ```bibtex
        @article{wei2025deepseek,
        title={DeepSeek-OCR: Contexts Optical Compression},
        author={Wei, Haoran and Sun, Yaofeng and Li, Yukun},
        journal={arXiv preprint arXiv:2510.18234},
        year={2025}
        }</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-OCR</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-OCR</guid>
      <pubDate>Mon, 03 Nov 2025 18:36:12 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2-Exp-Base</title>
      <description>deepseek-ai/DeepSeek-V3.2-Exp-Base Text Generation • 685B • Updated Oct 9 • 217 • 48
        ---
        license: mit
        library_name: transformers
        ---</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp-Base</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp-Base</guid>
      <pubDate>Wed, 08 Oct 2025 18:09:46 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.1-Terminus</title>
      <description>deepseek-ai/DeepSeek-V3.1-Terminus Text Generation • 685B • Updated Sep 29 • 81k • • 354
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.1-Base
        ---
        # DeepSeek-V3.1-Terminus
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        ## Introduction
        This update maintains the model&#39;s original capabilities while addressing issues reported by users, including:
        - Language consistency: Reducing instances of mixed Chinese-English text and occasional abnormal characters;
        - Agent capabilities: Further optimizing the performance of the Code Agent and Search Agent.
        | Benchmark | DeepSeek-V3.1 | DeepSeek-V3.1-Terminus |
        | :--- | :---: | :---: |
        | **Reasoning Mode w/o Tool Use** | | |
        | MMLU-Pro | 84.8 | 85.0 |
        | GPQA-Diamond | 80.1 | 80.7 |
        | Humanity&#39;s Last Exam | 15.9 | 21.7 |
        | LiveCodeBench | 74.8 | 74.9 |
        | Codeforces | 2091 | 2046 |
        | Aider-Polyglot | 76.3 | 76.1 |
        | **Agentic Tool Use** | | |
        | BrowseComp | 30.0 | 38.5 |
        | BrowseComp-zh | 49.2 | 45.0 |
        | SimpleQA | 93.4 | 96.8 |
        | SWE Verified | 66.0 | 68.4 |
        | SWE-bench Multilingual | 54.5 | 57.8 |
        | Terminal-bench | 31.3 | 36.7 |
        **The template and tool-set of search agent have been updated, which is shown in `assets/search_tool_trajectory.html`.**
        ## How to Run Locally
        The model structure of DeepSeek-V3.1-Terminus is the same as DeepSeek-V3. Please visit [DeepSeek-V3](https://github.com/deepseek-ai/DeepSeek-V3) repo for more information about running this model locally.
        For the model&#39;s chat template other than search agent, please refer to the [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1) repo.
        **Here we also provide an updated inference demo code in the `inference` folder to help the community get started with running our model and understand the details of model architecture.**
        **NOTE: In the current model checkpoint, the parameters of `self_attn.o_proj` do not conform to the UE8M0 FP8 scale data format. This is a known issue and will be corrected in future model releases.**
        ## License
        This repository and the model weights are licensed under the [MIT License](LICENSE).
        ## Citation
        ```
        @misc{deepseekai2024deepseekv3technicalreport,
        title={DeepSeek-V3 Technical Report},
        author={DeepSeek-AI},
        year={2024},
        eprint={2412.19437},
        archivePrefix={arXiv},
        primaryClass={cs.CL},
        url={https://arxiv.org/abs/2412.19437},
        }
        ```
        ## Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus</guid>
      <pubDate>Sun, 28 Sep 2025 17:52:07 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.1</title>
      <description>deepseek-ai/DeepSeek-V3.1 Text Generation • 685B • Updated Sep 5 • 82.2k • • 808
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.1-Base
        ---
        # DeepSeek-V3.1
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        ## Introduction
        DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements in multiple aspects:
        - **Hybrid thinking mode**: One model supports both thinking mode and non-thinking mode by changing the chat template.
        - **Smarter tool calling**: Through post-training optimization, the model&#39;s performance in tool usage and agent tasks has significantly improved.
        - **Higher thinking efficiency**: DeepSeek-V3.1-Think achieves comparable answer quality to DeepSeek-R1-0528, while responding more quickly.
        DeepSeek-V3.1 is post-trained on the top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases. The 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended by 3.3x to 209B tokens.
        Additionally, DeepSeek-V3.1 is trained using the **UE8M0 FP8 scale data format on both model weights and activations** to ensure compatibility with microscaling data formats. Please refer to [DeepGEMM](https://github.com/deepseek-ai/DeepGEMM) for more details.
        ## Model Downloads
        &lt;div align=&quot;center&quot;&gt;
        | **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download** |
        | :------------: | :------------: | :------------: | :------------: | :------------: |
        | DeepSeek-V3.1-Base | 671B | 37B | 128K | [HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) \| [ModelScope](https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.1-Base) |
        | DeepSeek-V3.1 | 671B | 37B | 128K | [HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-V3.1) \| [ModelScope](https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.1) |
        &lt;/div&gt;
        ## Chat Template
        The details of our chat template is described in `tokenizer_config.json` and `assets/chat_template.jinja`. Here is a brief description.
        ### Non-Thinking
        #### First-Turn
        Prefix:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;`
        With the given prefix, DeepSeek V3.1 generates responses to queries in non-thinking mode. Unlike DeepSeek V3, it introduces an additional token ``.
        #### Multi-Turn
        Context:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;{response}&amp;lt;｜end▁of▁sentence｜&amp;gt;...&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;{response}&amp;lt;｜end▁of▁sentence｜&amp;gt;`
        Prefix:
        `&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;`
        By concatenating the context and the prefix, we obtain the correct prompt for the query.
        ### Thinking
        #### First-Turn
        Prefix:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;&lt;think&gt;`
        The prefix of thinking mode is similar to DeepSeek-R1.
        #### Multi-Turn
        Context:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;&lt;/think&gt;{response}&amp;lt;｜end▁of▁sentence｜&amp;gt;...&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;{response}&amp;lt;｜end▁of▁sentence｜&amp;gt;`
        Prefix:
        `&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;&lt;think&gt;`
        The multi-turn template is the same with non-thinking multi-turn chat template. It means the thinking token in the last turn will be dropped but the `&lt;/think&gt;` is retained in every turn of context.
        ### ToolCall
        Toolcall is supported in non-thinking mode. The format is:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}\n\n{tool_description}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;` where the tool_description is
        ```
        ## Tools
        You have access to the following tools:
        ### {tool_name1}
        Description: {description}
        Parameters: {json.dumps(parameters)}
        IMPORTANT: ALWAYS adhere to this exact format for tool use:
        &amp;lt;｜tool▁calls▁begin｜&amp;gt;&amp;lt;｜tool▁call▁begin｜&amp;gt;tool_call_name&amp;lt;｜tool▁sep｜&amp;gt;tool_call_arguments&amp;lt;｜tool▁call▁end｜&amp;gt;{additional_tool_calls}&amp;lt;｜tool▁calls▁end｜&amp;gt;
        Where:
        - `tool_call_name` must be an exact match to one of the available tools
        - `tool_call_arguments` must be valid JSON that strictly follows the tool&#39;s Parameters Schema
        - For multiple tool calls, chain them directly without separators or spaces
        ```
        ### Code-Agent
        We support various code agent frameworks. Please refer to the above toolcall format to create your own code agents. An example is shown in `assets/code_agent_trajectory.html`.
        ### Search-Agent
        We design a specific format for searching toolcall in thinking mode, to support search agent.
        For complex questions that require accessing external or up-to-date information, DeepSeek-V3.1 can leverage a user-provided search tool through a multi-turn tool-calling process.
        Please refer to the `assets/search_tool_trajectory.html` and `assets/search_python_tool_trajectory.html` for the detailed template.
        ## Evaluation
        | Category | Benchmark (Metric) | DeepSeek V3.1-NonThinking | DeepSeek V3 0324 | DeepSeek V3.1-Thinking | DeepSeek R1 0528
        |----------|----------------------------------|-----------------|---|---|---|
        | General |
        | | MMLU-Redux (EM) | 91.8 | 90.5 | 93.7 | 93.4
        | | MMLU-Pro (EM) | 83.7 | 81.2 | 84.8 | 85.0
        | | GPQA-Diamond (Pass@1) | 74.9 | 68.4 | 80.1 | 81.0
        | | Humanity&#39;s Last Exam (Pass@1) | - | - | 15.9 | 17.7
        |Search Agent|
        | | BrowseComp | - | - | 30.0 | 8.9
        | | BrowseComp_zh | - | - | 49.2 | 35.7
        | | Humanity&#39;s Last Exam (Python + Search) |- | - | 29.8 | 24.8
        | | SimpleQA | - | - | 93.4 | 92.3
        | Code |
        | | LiveCodeBench (2408-2505) (Pass@1) | 56.4 | 43.0 | 74.8 | 73.3
        | | Codeforces-Div1 (Rating) | - | - | 2091 | 1930
        | | Aider-Polyglot (Acc.) | 68.4 | 55.1 | 76.3 | 71.6
        | Code Agent|
        | | SWE Verified (Agent mode) | 66.0 | 45.4 | - | 44.6
        | | SWE-bench Multilingual (Agent mode) | 54.5 | 29.3 | - | 30.5
        | | Terminal-bench (Terminus 1 framework) | 31.3 | 13.3 | - | 5.7
        | Math |
        | | AIME 2024 (Pass@1) | 66.3 | 59.4 | 93.1 | 91.4
        | | AIME 2025 (Pass@1) | 49.8 | 51.3 | 88.4 | 87.5
        | | HMMT 2025 (Pass@1) | 33.5 | 29.2 | 84.2 | 79.4 |
        Note:
        - Search agents are evaluated with our internal search framework, which uses a commercial search API + webpage filter + 128K context window. Seach agent results of R1-0528 are eva

...

github-actions · 2025-12-06T11:14:57Z

http://localhost:1200/huggingface/models/facebook - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Huggingface facebook Models</title>
    <link>https://huggingface.co/facebook/models?sort=created</link>
    <atom:link href="http://localhost:1200/huggingface/models/facebook" rel="self" type="application/rss+xml"></atom:link>
    <description>Huggingface facebook Models - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sat, 06 Dec 2025 11:14:55 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>facebook/testing_instructions</title>
      <description>facebook/testing_instructions Updated 2 days ago • 2
        ---
        extra_gated_fields:
        First Name: text
        Last Name: text
        Date of birth: date_picker
        Country: country
        Affiliation: text
        I accept the terms and conditions: checkbox
        geo: ip_location
        Test request?: checkbox
        By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox
        extra_gated_description: &amp;gt;-
        The information you provide will be collected, stored, processed and shared in
        accordance with the [Meta Privacy
        Policy](https://www.facebook.com/privacy/policy/).
        extra_gated_button_content: Submit
        extra_gated_heading: &quot;Please be sure to provide your full legal name, date of birth, and full organization name with all corporate identifiers. Avoid the use of acronyms and special characters. Failure to follow these instructions may prevent you from accessing this model and others on Hugging Face. You will not have the ability to edit this form after submission, so please ensure all information is accurate.&quot;
        language:
        - en
        tags:
        - meta-ai
        - meta-pytorch
        license: fair-noncommercial-research-license
        ---</description>
      <link>https://huggingface.co/facebook/testing_instructions</link>
      <guid isPermaLink="false">https://huggingface.co/facebook/testing_instructions</guid>
      <pubDate>Thu, 04 Dec 2025 00:10:57 GMT</pubDate>
    </item>
    <item>
      <title>facebook/omniASR-LLM-7B-ZS</title>
      <description>facebook/omniASR-LLM-7B-ZS Automatic Speech Recognition • Updated 8 days ago • 3
        ---
        license: apache-2.0
        datasets:
        - facebook/omnilingual-asr-corpus
        pipeline_tag: automatic-speech-recognition
        ---
        # Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
        &lt;div align=&quot;center&quot; style=&quot;lline-height: 1.2; font-size:16px; margin-bottom: 30px;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/facebook&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        🤗 Hugging Face
        &lt;/a&gt; |
        &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        🐙 GitHub
        &lt;/a&gt; |
        &lt;a href=&quot;https://huggingface.co/spaces/facebook/omniasr-transcriptions&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        🤖️ Demo
        &lt;/a&gt; |
        &lt;a href=&quot;https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        📃 Paper
        &lt;/a&gt; |
        &lt;a href=&quot;https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        📝 Blogpost
        &lt;/a&gt; |
        &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        📄 License: Apache 2.0
        &lt;/a&gt;
        &lt;/div&gt;
        # Model Card for omniASR-LLM-7B-ZS
        ## Model Description
        This model is part of the **Omnilingual ASR** family released by Meta AI. The original suite includes:
        &lt;!-- TODO : add new tokenizer, we&#39;ll get two tokenizer, add mssing speed numbers--&gt;
        | Model Name | Features | Parameters | Download Size (FP32) | Inference VRAM¹ | Real-Time Factor¹ (relative speed)² |
        |---------------------|---------------|------------:|---------------:|---------------:|-----------:|
        | [`omniASR_W2V_300M`](https://huggingface.co/facebook/omniASR-W2V-300M) | SSL | 317_390_592 | 1.2 GiB | | |
        | [`omniASR_W2V_1B`](https://huggingface.co/facebook/omniASR-W2V-1B) | SSL | 965_514_752 | 3.6 GiB | | |
        | [`omniASR_W2V_3B`](https://huggingface.co/facebook/omniASR-W2V-3B) | SSL | 3_064_124_672 | 12.0 GiB | | |
        | [`omniASR_W2V_7B`](https://huggingface.co/facebook/omniASR-W2V-7B) | SSL | 6_488_487_168 | 25.0 GiB | | |
        | [`omniASR_CTC_300M`](https://huggingface.co/facebook/omniASR-CTC-300M) | ASR | 325_494_996 | 1.3 GiB | ~2 GiB | 0.001 (96x) |
        | [`omniASR_CTC_1B`](https://huggingface.co/facebook/omniASR-CTC-1B) | ASR | 975_065_300 | 3.7 GiB | ~3 GiB | 0.002 (48x) |
        | [`omniASR_CTC_3B`](https://huggingface.co/facebook/omniASR-CTC-3B) | ASR | 3_080_423_636 | 12.0 GiB | ~8 GiB | 0.003 (32x) |
        | [`omniASR_CTC_7B`](https://huggingface.co/facebook/omniASR-CTC-7B) | ASR | 6_504_786_132 | 25.0 GiB | ~15 GiB | 0.006 (16x) |
        | [`omniASR_LLM_300M`](https://huggingface.co/facebook/omniASR-LLM-300M) | ASR with optional language conditioning | 1_627_603_584 | 6.1 GiB | ~5 GiB | 0.090 (~1x) |
        | [`omniASR_LLM_1B`](https://huggingface.co/facebook/omniASR-LLM-1B) | ASR with optional language conditioning | 2_275_710_592 | 8.5 GiB | ~6 GiB | 0.091 (~1x) |
        | [`omniASR_LLM_3B`](https://huggingface.co/facebook/omniASR-LLM-3B) | ASR with optional language conditioning | 4_376_679_040 | 17.0 GiB | ~10 GiB | 0.093 (~1x) |
        | [`omniASR_LLM_7B`](https://huggingface.co/facebook/omniASR-LLM-7B) | ASR with optional language conditioning | 7_801_041_536 | 30.0 GiB | ~17 GiB | 0.092 (~1x) |
        | [`omniASR_LLM_7B_ZS`](https://huggingface.co/facebook/omniASR-LLM-7B-ZS) | Zero-Shot ASR | 7_810_900_608 | 30.0 GiB | ~20 GiB |&amp;nbsp;0.194 (~0.5x) |
        ¹ (batch=1, audio_len=30s, BF16, A100)
        ² Relative speed to `omniASR_LLM_7B`
        ---
        ## Installation
        The models were developed using [fairseq2](https://github.com/facebookresearch/fairseq2), a research-focused sequence modeling toolkit. While we provide a **reference** inference pipeline that works across platforms, audio support requires [libsndfile](https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#system-dependencies) (Mac: `brew install libsndfile`; Windows may need an additional [setup](https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#installing-on-windows)).
        ```bash
        # using pip
        pip install omnilingual-asr
        # using uv
        uv add omnilingual-asr
        ```
        ## Inference
        ```python
        from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
        pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
        audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
        lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
        transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)
        ```
        ## Supported Languages
        To view the full list of 1600+ supported languages, you can access the language list [programmatically](/src/omnilingual_asr/models/wav2vec2_llama/lang_ids.py):
        ```python
        from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
        # Print all supported languages
        print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
        print(supported_langs)
        # Check if a specific language is supported
        if &quot;eng_Latn&quot; in supported_langs:
        print(&quot;English (Latin script) is supported!&quot;)
        ```
        Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...
        ---
        ## Training
        To further finetune the released checkpoints on your own data, use our [data preparation guide](/workflows/dataprep/README.md) followed by the [finetuning recipe guide](/workflows/recipes/wav2vec2/asr/README.md).
        ---
        ## Citation
        **BibTeX:**
        ```bibtex
        @misc{omnilingualasr2025,
        title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-7B-ZS https://huggingface.co/facebook/omniASR-LLM-7B-ZSThu, 27 Nov 2025 23:41:34 GMT<title>facebook/omniASR-LLM-7B</title>facebook/omniASR-LLM-7B Automatic Speech Recognition • Updated 8 days ago • 1

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-LLM-7B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-7B https://huggingface.co/facebook/omniASR-LLM-7BThu, 27 Nov 2025 23:26:06 GMT<title>facebook/omniASR-LLM-3B</title>facebook/omniASR-LLM-3B Automatic Speech Recognition • Updated 8 days ago • 1

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-LLM-3B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-3B https://huggingface.co/facebook/omniASR-LLM-3BThu, 27 Nov 2025 23:08:56 GMT<title>facebook/omniASR-LLM-1B</title>facebook/omniASR-LLM-1B Automatic Speech Recognition • Updated 8 days ago • 3

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-LLM-1B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-1B https://huggingface.co/facebook/omniASR-LLM-1BThu, 27 Nov 2025 22:37:27 GMT<title>facebook/omniASR-LLM-300M</title>facebook/omniASR-LLM-300M Automatic Speech Recognition • Updated 8 days ago • 3

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-LLM-300M

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-300M https://huggingface.co/facebook/omniASR-LLM-300MThu, 27 Nov 2025 22:04:09 GMT<title>facebook/omniASR-CTC-7B</title>facebook/omniASR-CTC-7B Automatic Speech Recognition • Updated 8 days ago • 1

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-CTC-7B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-CTC-7B https://huggingface.co/facebook/omniASR-CTC-7BThu, 27 Nov 2025 21:31:20 GMT<title>facebook/omniASR-CTC-3B</title>facebook/omniASR-CTC-3B Automatic Speech Recognition • Updated 8 days ago • 2

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-CTC-3B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-CTC-3B https://huggingface.co/facebook/omniASR-CTC-3BThu, 27 Nov 2025 15:47:08 GMT<title>facebook/omniASR-CTC-1B</title>facebook/omniASR-CTC-1B Automatic Speech Recognition • Updated 8 days ago • 1

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/faceb

</details>


...

github-actions · 2025-12-06T11:14:58Z

http://localhost:1200/huggingface/models/ianyang02 - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Huggingface ianyang02 Models</title>
    <link>https://huggingface.co/ianyang02/models?sort=created</link>
    <atom:link href="http://localhost:1200/huggingface/models/ianyang02" rel="self" type="application/rss+xml"></atom:link>
    <description>Huggingface ianyang02 Models - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sat, 06 Dec 2025 11:14:56 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>ianyang02/aita_qwen3-30b</title>
      <description>ianyang02/aita_qwen3-30b Updated about 13 hours ago</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-30b</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-30b</guid>
      <pubDate>Fri, 05 Dec 2025 14:17:52 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_dpo</title>
      <description>ianyang02/aita_qwen3-4b_dpo Updated 15 days ago
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_dpo
        tags:
        - generated_from_trainer
        - trl
        - dpo
        licence: license
        ---
        # Model Card for aita_qwen3-4b_dpo
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        question = &quot;If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?&quot;
        generator = pipeline(&quot;text-generation&quot;, model=&quot;ianyang02/aita_qwen3-4b_dpo&quot;, device=&quot;cuda&quot;)
        output = generator([{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: question}], max_new_tokens=128, return_full_text=False)[0]
        print(output[&quot;generated_text&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/4xnikck8)
        This model was trained with DPO, a method introduced in [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://huggingface.co/papers/2305.18290).
        ### Framework versions
        - TRL: 0.24.0
        - Transformers: 4.57.1
        - Pytorch: 2.9.0
        - Datasets: 4.3.0
        - Tokenizers: 0.22.1
        ## Citations
        Cite DPO as:
        ```bibtex
        @inproceedings{rafailov2023direct,
        title = {{Direct Preference Optimization: Your Language Model is Secretly a Reward Model}},
        author = {Rafael Rafailov and Archit Sharma and Eric Mitchell and Christopher D. Manning and Stefano Ermon and Chelsea Finn},
        year = 2023,
        booktitle = {Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023},
        url = {http://papers.nips.cc/paper_files/paper/2023/hash/a85b405ed65c6477a4fe8302b5e06ce7-Abstract-Conference.html},
        editor = {Alice Oh and Tristan Naumann and Amir Globerson and Kate Saenko and Moritz Hardt and Sergey Levine},
        }
        ```
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_dpo</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_dpo</guid>
      <pubDate>Fri, 21 Nov 2025 15:09:21 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/ppo_model_qwen3-4b_aita_h200_2</title>
      <description>ianyang02/ppo_model_qwen3-4b_aita_h200_2 Text Generation • Updated 15 days ago • 66
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: peft
        model_name: ppo_model_qwen3-4b_aita_h200_2
        tags:
        - base_model:adapter:Qwen/Qwen3-4B-Instruct-2507
        - lora
        - transformers
        licence: license
        pipeline_tag: text-generation
        ---
        # Model Card for ppo_model_qwen3-4b_aita_h200_2
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        question = &quot;If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?&quot;
        generator = pipeline(&quot;text-generation&quot;, model=&quot;ianyang02/ppo_model_qwen3-4b_aita_h200_2&quot;, device=&quot;cuda&quot;)
        output = generator([{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: question}], max_new_tokens=128, return_full_text=False)[0]
        print(output[&quot;generated_text&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/x99g8ppd)
        This model was trained with PPO, a method introduced in [Fine-Tuning Language Models from Human Preferences](https://huggingface.co/papers/1909.08593).
        ### Framework versions
        - PEFT 0.18.0
        - TRL: 0.25.1
        - Transformers: 4.57.1
        - Pytorch: 2.7.0.dev20250224+cu126
        - Datasets: 4.4.1
        - Tokenizers: 0.22.1
        ## Citations
        Cite PPO as:
        ```bibtex
        @article{mziegler2019fine-tuning,
        title = {{Fine-Tuning Language Models from Human Preferences}},
        author = {Daniel M. Ziegler and Nisan Stiennon and Jeffrey Wu and Tom B. Brown and Alec Radford and Dario Amodei and Paul F. Christiano and Geoffrey Irving},
        year = 2019,
        eprint = {arXiv:1909.08593}
        }
        ```
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200_2</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200_2</guid>
      <pubDate>Fri, 21 Nov 2025 08:19:14 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_2</title>
      <description>ianyang02/aita_qwen3-4b_2 Text Classification • 4B • Updated 16 days ago • 23
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_2
        tags:
        - generated_from_trainer
        - trl
        - reward-trainer
        licence: license
        ---
        # Model Card for aita_qwen3-4b_2
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b_2&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/lacazue4)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.25.1
        - Transformers: 4.57.1
        - Pytorch: 2.5.0.dev20240818+cu124
        - Datasets: 4.4.1
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_2</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_2</guid>
      <pubDate>Wed, 19 Nov 2025 23:02:33 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_2_checkpoint_2</title>
      <description>ianyang02/aita_qwen3-4b_2_checkpoint_2 Text Classification • 4B • Updated 16 days ago • 25
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_2
        tags:
        - generated_from_trainer
        - trl
        - reward-trainer
        licence: license
        ---
        # Model Card for aita_qwen3-4b_2
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b_2&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/lacazue4)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.25.1
        - Transformers: 4.57.1
        - Pytorch: 2.5.0.dev20240818+cu124
        - Datasets: 4.4.1
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_2</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_2</guid>
      <pubDate>Wed, 19 Nov 2025 19:52:00 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_2_checkpoint_1</title>
      <description>ianyang02/aita_qwen3-4b_2_checkpoint_1 Text Classification • 4B • Updated 16 days ago • 4
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_2
        tags:
        - generated_from_trainer
        - trl
        - reward-trainer
        licence: license
        ---
        # Model Card for aita_qwen3-4b_2
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b_2&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/lacazue4)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.25.1
        - Transformers: 4.57.1
        - Pytorch: 2.5.0.dev20240818+cu124
        - Datasets: 4.4.1
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_1</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_1</guid>
      <pubDate>Wed, 19 Nov 2025 16:40:34 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/ppo_model_qwen3-4b_aita_h200</title>
      <description>ianyang02/ppo_model_qwen3-4b_aita_h200 Updated 17 days ago
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: ppo_model_qwen3-4b_aita_h200
        tags:
        - generated_from_trainer
        - trl
        - ppo
        licence: license
        ---
        # Model Card for ppo_model_qwen3-4b_aita_h200
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        question = &quot;If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?&quot;
        generator = pipeline(&quot;text-generation&quot;, model=&quot;ianyang02/ppo_model_qwen3-4b_aita_h200&quot;, device=&quot;cuda&quot;)
        output = generator([{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: question}], max_new_tokens=128, return_full_text=False)[0]
        print(output[&quot;generated_text&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/ht40cr4d)
        This model was trained with PPO, a method introduced in [Fine-Tuning Language Models from Human Preferences](https://huggingface.co/papers/1909.08593).
        ### Framework versions
        - TRL: 0.24.0
        - Transformers: 4.57.1
        - Pytorch: 2.9.0
        - Datasets: 4.3.0
        - Tokenizers: 0.22.1
        ## Citations
        Cite PPO as:
        ```bibtex
        @article{mziegler2019fine-tuning,
        title = {{Fine-Tuning Language Models from Human Preferences}},
        author = {Daniel M. Ziegler and Nisan Stiennon and Jeffrey Wu and Tom B. Brown and Alec Radford and Dario Amodei and Paul F. Christiano and Geoffrey Irving},
        year = 2019,
        eprint = {arXiv:1909.08593}
        }
        ```
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200</guid>
      <pubDate>Tue, 18 Nov 2025 16:21:51 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b</title>
      <description>ianyang02/aita_qwen3-4b Text Classification • 4B • Updated 20 days ago • 93
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b
        tags:
        - generated_from_trainer
        - trl
        - reward-trainer
        licence: license
        ---
        # Model Card for aita_qwen3-4b
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/ofxhxi45)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.24.0
        - Transformers: 4.57.1
        - Pytorch: 2.9.0
        - Datasets: 4.3.0
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b</guid>
      <pubDate>Sun, 16 Nov 2025 01:32:06 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_fine_tuned_Qwen3_0.6B</title>
      <description>ianyang02/aita_fine_tuned_Qwen3_0.6B Updated 24 days ago</description>
      <link>https://huggingface.co/ianyang02/aita_fine_tuned_Qwen3_0.6B</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_fine_tuned_Qwen3_0.6B</guid>
      <pubDate>Tue, 11 Nov 2025 16:18:49 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_Qwen3-0.6B</title>
      <description>ianyang02/aita_Qwen3-0.6B Text Classification • 0.6B • Updated 28 days ago • 50
        ---
        base_model: Qwen/Qwen3-0.6B
        library_name: transformers
        model_name: aita_Qwen3-0.6B
        tags:
        - generated_from_trainer
        - reward-trainer
        - trl
        licence: license
        ---
        # Model Card for aita_Qwen3-0.6B
        This model is a fine-tuned version of [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_Qwen3-0.6B&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/7xwpy737)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.24.0
        - Transformers: 4.57.1
        - Pytorch: 2.9.0
        - Datasets: 4.3.0
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_Qwen3-0.6B</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_Qwen3-0.6B</guid>
      <pubDate>Sat, 08 Nov 2025 07:48:49 GMT</pubDate>
    </item>
  </channel>
</rss>

…proper state !!!!!

github-actions · 2025-12-06T11:25:13Z

Successfully generated as following:

http://localhost:1200/huggingface/models/deepseek-ai - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Huggingface deepseek-ai Models</title>
    <link>https://huggingface.co/deepseek-ai/models?sort=created</link>
    <atom:link href="http://localhost:1200/huggingface/models/deepseek-ai" rel="self" type="application/rss+xml"></atom:link>
    <description>Huggingface deepseek-ai Models - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sat, 06 Dec 2025 11:25:05 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2-Speciale</title>
      <description>deepseek-ai/DeepSeek-V3.2-Speciale Text Generation • 685B • Updated 5 days ago • 4.72k • 517
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.2-Exp-Base
        base_model_relation: finetune
        ---
        # DeepSeek-V3.2: Efficient Reasoning &amp;amp; Agentic AI
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2/blob/main/assets/paper.pdf&quot;&gt;&lt;b&gt;Technical Report&lt;/b&gt;👁️&lt;/a&gt;
        &lt;/p&gt;
        ## Introduction
        We introduce **DeepSeek-V3.2**, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:
        1. **DeepSeek Sparse Attention (DSA):** We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.
        2. **Scalable Reinforcement Learning Framework:** By implementing a robust RL protocol and scaling post-training compute, *DeepSeek-V3.2* performs comparably to GPT-5. Notably, our high-compute variant, **DeepSeek-V3.2-Speciale**, **surpasses GPT-5** and exhibits reasoning proficiency on par with Gemini-3.0-Pro.
        - *Achievement:* 🥇 **Gold-medal performance** in the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI).
        3. **Large-Scale Agentic Task Synthesis Pipeline:** To integrate **reasoning into tool-use** scenarios, we developed a novel synthesis pipeline that systematically generates training data at scale. This facilitates scalable agentic post-training, improving compliance and generalization in complex interactive environments.
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/benchmark.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        We have also released the final submissions for IOI 2025, ICPC World Finals, IMO 2025 and CMO 2025, which were selected based on our designed pipeline. These materials are provided for the community to conduct secondary verification. The files can be accessed at `assets/olympiad_cases`.
        ## Chat Template
        DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a &quot;thinking with tools&quot; capability.
        To assist the community in understanding and adapting to this new template, we have provided a dedicated `encoding` folder, which contains Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model and how to parse the model&#39;s text output.
        A brief example is illustrated below:
        ```python
        import transformers
        # encoding/encoding_dsv32.py
        from encoding_dsv32 import encode_messages, parse_message_from_completion_text
        tokenizer = transformers.AutoTokenizer.from_pretrained(&quot;deepseek-ai/DeepSeek-V3.2&quot;)
        messages = [
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;hello&quot;},
        {&quot;role&quot;: &quot;assistant&quot;, &quot;content&quot;: &quot;Hello! I am DeepSeek.&quot;, &quot;reasoning_content&quot;: &quot;thinking...&quot;},
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;1+1=?&quot;}
        ]
        encode_config = dict(thinking_mode=&quot;thinking&quot;, drop_thinking=True, add_default_bos_token=True)
        # messages -&amp;gt; string
        prompt = encode_messages(messages, **encode_config)
        # Output: &quot;&amp;lt;｜begin▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;hello&amp;lt;｜Assistant｜&amp;gt;Hello! I am DeepSeek.&amp;lt;｜end▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;1+1=?&amp;lt;｜Assistant｜&amp;gt;&lt;think&gt;&quot;
        # string -&amp;gt; tokens
        tokens = tokenizer.encode(prompt)
        # Output: [0, 128803, 33310, 128804, 128799, 19923, 3, 342, 1030, 22651, 4374, 1465, 16, 1, 128803, 19, 13, 19, 127252, 128804, 128798]
        ```
        Important Notes:
        1. This release does not include a Jinja-format chat template. Please refer to the Python code mentioned above.
        2. The output parsing function included in the code is designed to handle well-formatted strings only. It does not attempt to correct or recover from malformed output that the model might occasionally generate. It is not suitable for production use without robust error handling.
        3. A new role named `developer` has been introduced in the chat template. This role is dedicated exclusively to search agent scenarios and is designated for no other tasks. The official API does not accept messages assigned to `developer`.
        ## How to Run Locally
        The model structure of DeepSeek-V3.2 and DeepSeek-V3.2-Speciale are the same as DeepSeek-V3.2-Exp. Please visit [DeepSeek-V3.2-Exp](https://github.com/deepseek-ai/DeepSeek-V3.2-Exp) repo for more information about running this model locally.
        Usage Recommendations:
        1. For local deployment, we recommend setting the sampling parameters to `temperature = 1.0, top_p = 0.95`.
        2. Please note that the DeepSeek-V3.2-Speciale variant is designed exclusively for deep reasoning tasks and does not support the tool-calling functionality.
        ## License
        This repository and the model weights are licensed under the [MIT License](LICENSE).
        ## Citation
        ```
        @misc{deepseekai2025deepseekv32,
        title={DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models},
        author={DeepSeek-AI},
        year={2025},
        }
        ```
        ## Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).
        &lt;/think&gt;</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale</guid>
      <pubDate>Mon, 01 Dec 2025 03:06:03 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2</title>
      <description>deepseek-ai/DeepSeek-V3.2 Text Generation • 685B • Updated 5 days ago • 18.1k • • 738
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.2-Exp-Base
        base_model_relation: finetune
        ---
        # DeepSeek-V3.2: Efficient Reasoning &amp;amp; Agentic AI
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2/blob/main/assets/paper.pdf&quot;&gt;&lt;b&gt;Technical Report&lt;/b&gt;👁️&lt;/a&gt;
        &lt;/p&gt;
        ## Introduction
        We introduce **DeepSeek-V3.2**, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:
        1. **DeepSeek Sparse Attention (DSA):** We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.
        2. **Scalable Reinforcement Learning Framework:** By implementing a robust RL protocol and scaling post-training compute, *DeepSeek-V3.2* performs comparably to GPT-5. Notably, our high-compute variant, **DeepSeek-V3.2-Speciale**, **surpasses GPT-5** and exhibits reasoning proficiency on par with Gemini-3.0-Pro.
        - *Achievement:* 🥇 **Gold-medal performance** in the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI).
        3. **Large-Scale Agentic Task Synthesis Pipeline:** To integrate **reasoning into tool-use** scenarios, we developed a novel synthesis pipeline that systematically generates training data at scale. This facilitates scalable agentic post-training, improving compliance and generalization in complex interactive environments.
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/benchmark.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        We have also released the final submissions for IOI 2025, ICPC World Finals, IMO 2025 and CMO 2025, which were selected based on our designed pipeline. These materials are provided for the community to conduct secondary verification. The files can be accessed at `assets/olympiad_cases`.
        ## Chat Template
        DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a &quot;thinking with tools&quot; capability.
        To assist the community in understanding and adapting to this new template, we have provided a dedicated `encoding` folder, which contains Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model and how to parse the model&#39;s text output.
        A brief example is illustrated below:
        ```python
        import transformers
        # encoding/encoding_dsv32.py
        from encoding_dsv32 import encode_messages, parse_message_from_completion_text
        tokenizer = transformers.AutoTokenizer.from_pretrained(&quot;deepseek-ai/DeepSeek-V3.2&quot;)
        messages = [
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;hello&quot;},
        {&quot;role&quot;: &quot;assistant&quot;, &quot;content&quot;: &quot;Hello! I am DeepSeek.&quot;, &quot;reasoning_content&quot;: &quot;thinking...&quot;},
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;1+1=?&quot;}
        ]
        encode_config = dict(thinking_mode=&quot;thinking&quot;, drop_thinking=True, add_default_bos_token=True)
        # messages -&amp;gt; string
        prompt = encode_messages(messages, **encode_config)
        # Output: &quot;&amp;lt;｜begin▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;hello&amp;lt;｜Assistant｜&amp;gt;Hello! I am DeepSeek.&amp;lt;｜end▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;1+1=?&amp;lt;｜Assistant｜&amp;gt;&lt;think&gt;&quot;
        # string -&amp;gt; tokens
        tokens = tokenizer.encode(prompt)
        # Output: [0, 128803, 33310, 128804, 128799, 19923, 3, 342, 1030, 22651, 4374, 1465, 16, 1, 128803, 19, 13, 19, 127252, 128804, 128798]
        ```
        Important Notes:
        1. This release does not include a Jinja-format chat template. Please refer to the Python code mentioned above.
        2. The output parsing function included in the code is designed to handle well-formatted strings only. It does not attempt to correct or recover from malformed output that the model might occasionally generate. It is not suitable for production use without robust error handling.
        3. A new role named `developer` has been introduced in the chat template. This role is dedicated exclusively to search agent scenarios and is designated for no other tasks. The official API does not accept messages assigned to `developer`.
        ## How to Run Locally
        The model structure of DeepSeek-V3.2 and DeepSeek-V3.2-Speciale are the same as DeepSeek-V3.2-Exp. Please visit [DeepSeek-V3.2-Exp](https://github.com/deepseek-ai/DeepSeek-V3.2-Exp) repo for more information about running this model locally.
        Usage Recommendations:
        1. For local deployment, we recommend setting the sampling parameters to `temperature = 1.0, top_p = 0.95`.
        2. Please note that the DeepSeek-V3.2-Speciale variant is designed exclusively for deep reasoning tasks and does not support the tool-calling functionality.
        ## License
        This repository and the model weights are licensed under the [MIT License](LICENSE).
        ## Citation
        ```
        @misc{deepseekai2025deepseekv32,
        title={DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models},
        author={DeepSeek-AI},
        year={2025},
        }
        ```
        ## Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).
        &lt;/think&gt;</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2</guid>
      <pubDate>Mon, 01 Dec 2025 03:04:59 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-Math-V2</title>
      <description>deepseek-ai/DeepSeek-Math-V2 Text Generation • 685B • Updated 9 days ago • 8.96k • 637
        ---
        license: apache-2.0
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-Math-V2
        ---
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot;&gt;&lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot;&gt;&lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot;&gt;&lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;br&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot;&gt;&lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot;&gt;&lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot;&gt;&lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;br&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-Apache%202.0-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;br&gt;
        &lt;/div&gt;
        # DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
        ## 1. Introduction
        Large language models have made significant progress in mathematical reasoning, which serves as an important testbed for AI and could impact scientific research if further advanced.
        By scaling reasoning with reinforcement learning that rewards correct final answers, LLMs have improved from poor performance to saturating quantitative reasoning competitions like AIME and HMMT in one year.
        However, this approach faces fundamental limitations.
        Pursuing higher final answer accuracy doesn&#39;t address a key issue: correct answers don&#39;t guarantee correct reasoning.
        Moreover, many mathematical tasks like theorem proving require rigorous step-by-step derivation rather than numerical answers, making final answer rewards inapplicable.
        To push the limits of deep reasoning, we believe it is necessary to verify the comprehensiveness and rigor of mathematical reasoning.
        Self-verification is particularly important for scaling test-time compute, especially for open problems without known solutions.
        Towards self-verifiable mathematical reasoning, we investigate how to train an accurate and faithful LLM-based verifier for theorem proving.
        We then train a proof generator using the verifier as the reward model, and incentivize the generator to identify and resolve as many issues as possible in their own proofs before finalizing them.
        To maintain the generation-verification gap as the generator becomes stronger, we propose to scale verification compute to automatically label new hard-to-verify proofs, creating training data to further improve the verifier.
        Our resulting model, DeepSeekMath-V2, demonstrates strong theorem-proving capabilities, achieving gold-level scores on IMO 2025 and CMO 2024 and a near-perfect 118/120 on Putnam 2024 with scaled test-time compute.
        While much work remains, these results suggest that self-verifiable mathematical reasoning is a feasible research direction that may help develop more capable mathematical AI systems.
        ## 2. Evaluation Results
        Below are evaluation results on [IMO-ProofBench](https://github.com/google-deepmind/superhuman/tree/main/imobench) (developed by the DeepMind team behind DeepThink IMO-Gold) and recent mathematics competitions including IMO 2025, CMO 2024, and Putnam 2024.
        **IMO-ProofBench**
        &lt;p align=&quot;center&quot;&gt;
        &lt;img width=&quot;100%&quot; src=&quot;https://raw.githubusercontent.com/deepseek-ai/DeepSeek-Math-V2/refs/heads/main/figures/IMO-ProofBench.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/p&gt;
        ---
        **Mathematics Competitions**
        &lt;p align=&quot;center&quot;&gt;
        &lt;img width=&quot;41%&amp;quot;&quot; src=&quot;https://raw.githubusercontent.com/deepseek-ai/DeepSeek-Math-V2/refs/heads/main/figures/Competitions.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/p&gt;
        ## 4. Quick Start
        DeepSeekMath-V2 is built on top of DeepSeek-V3.2-Exp-Base.
        For inference support, please refer to [the DeepSeek-V3.2-Exp github repository](https://github.com/deepseek-ai/DeepSeek-V3.2-Exp).
        ## 6. License
        This repository and the model weights are licensed under [the Apache License, Version 2.0 (Apache 2.0)](LICENSE).
        ## 7. Citation
        ```
        @misc{deepseek-math-v2,
        author = {Zhihong Shao, Yuxiang Luo, Chengda Lu, Z.Z. Ren, Jiewen Hu, Tian Ye, Zhibin Gou, Shirong Ma, Xiaokang Zhang},
        title = {DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning},
        year = {2025},
        }
        ```
        ## 8. Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-Math-V2</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-Math-V2</guid>
      <pubDate>Thu, 27 Nov 2025 02:35:52 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2-Exp</title>
      <description>deepseek-ai/DeepSeek-V3.2-Exp Text Generation • 685B • Updated 18 days ago • 66.2k • • 899
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.2-Exp-Base
        base_model_relation: finetune
        ---
        # DeepSeek-V3.2-Exp
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        ## Introduction
        We are excited to announce the official release of DeepSeek-V3.2-Exp, an experimental version of our model. As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.
        This experimental release represents our ongoing research into more efficient transformer architectures, particularly focusing on improving computational efficiency when processing extended text sequences.
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/cost.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        - DeepSeek Sparse Attention (DSA) achieves fine-grained sparse attention for the first time, delivering substantial improvements in long-context training and inference efficiency while maintaining virtually identical model output quality.
        - To rigorously evaluate the impact of introducing sparse attention, we deliberately aligned the training configurations of DeepSeek-V3.2-Exp with V3.1-Terminus. Across public benchmarks in various domains, DeepSeek-V3.2-Exp demonstrates performance on par with V3.1-Terminus.
        | Benchmark | DeepSeek-V3.1-Terminus | DeepSeek-V3.2-Exp |
        | :--- | :---: | :---: |
        | **Reasoning Mode w/o Tool Use** | | |
        | MMLU-Pro | 85.0 | 85.0 |
        | GPQA-Diamond | 80.7 | 79.9 |
        | Humanity&#39;s Last Exam | 21.7 | 19.8 |
        | LiveCodeBench | 74.9 | 74.1 |
        | AIME 2025 | 88.4 | 89.3 |
        | HMMT 2025 | 86.1 | 83.6 |
        | Codeforces | 2046 | 2121 |
        | Aider-Polyglot | 76.1 | 74.5 |
        | **Agentic Tool Use** | | |
        | BrowseComp | 38.5 | 40.1 |
        | BrowseComp-zh | 45.0 | 47.9 |
        | SimpleQA | 96.8 | 97.1 |
        | SWE Verified | 68.4 | 67.8 |
        | SWE-bench Multilingual | 57.8 | 57.9 |
        | Terminal-bench | 36.7 | 37.7 |
        ## Update
        - 2025.11.17: **We have identified that previous versions of the inference demo code contained an implementation discrepancy in Rotary Position Embedding (RoPE) within the indexer module, potentially leading to degraded model performance.** Specifically, the input tensor to RoPE in the indexer module requires a non-interleaved layout, whereas RoPE in the MLA module expects an interleaved layout. This issue has now been resolved. Please refer to the updated version of the inference demo code and take note of this implementation detail.
        ## How to Run Locally
        ### HuggingFace
        We provide an updated inference demo code in the [inference](https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp/tree/main/inference) folder to help the community quickly get started with our model and understand its architectural details.
        First convert huggingface model weights to the the format required by our inference demo. Set `MP` to match your available GPU count:
        ```bash
        cd inference
        export EXPERTS=256
        python convert.py --hf-ckpt-path ${HF_CKPT_PATH} --save-path ${SAVE_PATH} --n-experts ${EXPERTS} --model-parallel ${MP}
        ```
        Launch the interactive chat interface and start exploring DeepSeek&#39;s capabilities:
        ```bash
        export CONFIG=config_671B_v3.2.json
        torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --interactive
        ```
        ### SGLang
        #### Installation with Docker
        ```
        # H200
        docker pull lmsysorg/sglang:dsv32
        # MI350
        docker pull lmsysorg/sglang:dsv32-rocm
        # NPUs
        docker pull lmsysorg/sglang:dsv32-a2
        docker pull lmsysorg/sglang:dsv32-a3
        ```
        #### Launch Command
        ```bash
        python -m sglang.launch_server --model deepseek-ai/DeepSeek-V3.2-Exp --tp 8 --dp 8 --enable-dp-attention
        ```
        ### vLLM
        vLLM provides day-0 support of DeepSeek-V3.2-Exp. See the [recipes](https://docs.vllm.ai/projects/recipes/en/latest/DeepSeek/DeepSeek-V3_2-Exp.html) for up-to-date details.
        ## Open-Source Kernels
        For TileLang kernels with **better readability and research-purpose design**, please refer to [TileLang](https://github.com/tile-ai/tilelang/tree/main/examples/deepseek_v32).
        For **high-performance CUDA kernels**, indexer logit kernels (including paged versions) are available in [DeepGEMM](https://github.com/deepseek-ai/DeepGEMM/pull/200). Sparse attention kernels are released in [FlashMLA](https://github.com/deepseek-ai/FlashMLA/pull/98).
        ## License
        This repository and the model weights are licensed under the [MIT License](LICENSE).
        ## Citation
        ```
        @misc{deepseekai2024deepseekv32,
        title={DeepSeek-V3.2-Exp: Boosting Long-Context Efficiency with DeepSeek Sparse Attention},
        author={DeepSeek-AI},
        year={2025},
        }
        ```
        ## Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp</guid>
      <pubDate>Mon, 17 Nov 2025 18:39:54 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-OCR</title>
      <description>deepseek-ai/DeepSeek-OCR Image-Text-to-Text • 3B • Updated Nov 4 • 5.47M • 2.93k
        ---
        pipeline_tag: image-text-to-text
        language:
        - multilingual
        tags:
        - deepseek
        - vision-language
        - ocr
        - custom_code
        license: mit
        library_name: transformers
        ---
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek AI&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-OCR&quot;&gt;&lt;b&gt;🌟 Github&lt;/b&gt;&lt;/a&gt; |
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR&quot;&gt;&lt;b&gt;📥 Model Download&lt;/b&gt;&lt;/a&gt; |
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-OCR/blob/main/DeepSeek_OCR_paper.pdf&quot;&gt;&lt;b&gt;📄 Paper Link&lt;/b&gt;&lt;/a&gt; |
        &lt;a href=&quot;https://arxiv.org/abs/2510.18234&quot;&gt;&lt;b&gt;📄 Arxiv Paper Link&lt;/b&gt;&lt;/a&gt; |
        &lt;/p&gt;
        &lt;h2&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/papers/2510.18234&quot;&gt;DeepSeek-OCR: Contexts Optical Compression&lt;/a&gt;
        &lt;/p&gt;
        &lt;/h2&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/fig1.png&quot; style=&quot;width: 1000px&quot; align=&quot;center&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/p&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/papers/2510.18234&quot;&gt;Explore the boundaries of visual-text compression.&lt;/a&gt;
        &lt;/p&gt;
        ## Usage
        Inference using Huggingface transformers on NVIDIA GPUs. Requirements tested on python 3.12.9 + CUDA11.8：
        ```
        torch==2.6.0
        transformers==4.46.3
        tokenizers==0.20.3
        einops
        addict
        easydict
        pip install flash-attn==2.7.3 --no-build-isolation
        ```
        ```python
        from transformers import AutoModel, AutoTokenizer
        import torch
        import os
        os.environ[&quot;CUDA_VISIBLE_DEVICES&quot;] = &#39;0&#39;
        model_name = &#39;deepseek-ai/DeepSeek-OCR&#39;
        tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
        model = AutoModel.from_pretrained(model_name, _attn_implementation=&#39;flash_attention_2&#39;, trust_remote_code=True, use_safetensors=True)
        model = model.eval().cuda().to(torch.bfloat16)
        # prompt = &quot;&lt;img referrerpolicy=&quot;no-referrer&quot;&gt;\nFree OCR. &quot;
        prompt = &quot;&lt;img referrerpolicy=&quot;no-referrer&quot;&gt;\n&amp;lt;|grounding|&amp;gt;Convert the document to markdown. &quot;
        image_file = &#39;your_image.jpg&#39;
        output_path = &#39;your/output/dir&#39;
        # infer(self, tokenizer, prompt=&#39;&#39;, image_file=&#39;&#39;, output_path = &#39; &#39;, base_size = 1024, image_size = 640, crop_mode = True, test_compress = False, save_results = False):
        # Tiny: base_size = 512, image_size = 512, crop_mode = False
        # Small: base_size = 640, image_size = 640, crop_mode = False
        # Base: base_size = 1024, image_size = 1024, crop_mode = False
        # Large: base_size = 1280, image_size = 1280, crop_mode = False
        # Gundam: base_size = 1024, image_size = 640, crop_mode = True
        res = model.infer(tokenizer, prompt=prompt, image_file=image_file, output_path = output_path, base_size = 1024, image_size = 640, crop_mode=True, save_results = True, test_compress = True)
        ```
        ## vLLM
        Refer to [🌟GitHub](https://github.com/deepseek-ai/DeepSeek-OCR/) for guidance on model inference acceleration and PDF processing, etc.&lt;!-- --&gt;
        [2025/10/23] 🚀🚀🚀 DeepSeek-OCR is now officially supported in upstream [vLLM](https://docs.vllm.ai/projects/recipes/en/latest/DeepSeek/DeepSeek-OCR.html#installing-vllm).
        ```shell
        uv venv
        source .venv/bin/activate
        # Until v0.11.1 release, you need to install vLLM from nightly build
        uv pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
        ```
        ```python
        from vllm import LLM, SamplingParams
        from vllm.model_executor.models.deepseek_ocr import NGramPerReqLogitsProcessor
        from PIL import Image
        # Create model instance
        llm = LLM(
        model=&quot;deepseek-ai/DeepSeek-OCR&quot;,
        enable_prefix_caching=False,
        mm_processor_cache_gb=0,
        logits_processors=[NGramPerReqLogitsProcessor]
        )
        # Prepare batched input with your image file
        image_1 = Image.open(&quot;path/to/your/image_1.png&quot;).convert(&quot;RGB&quot;)
        image_2 = Image.open(&quot;path/to/your/image_2.png&quot;).convert(&quot;RGB&quot;)
        prompt = &quot;&lt;img referrerpolicy=&quot;no-referrer&quot;&gt;\nFree OCR.&quot;
        model_input = [
        {
        &quot;prompt&quot;: prompt,
        &quot;multi_modal_data&quot;: {&quot;image&quot;: image_1}
        },
        {
        &quot;prompt&quot;: prompt,
        &quot;multi_modal_data&quot;: {&quot;image&quot;: image_2}
        }
        ]
        sampling_param = SamplingParams(
        temperature=0.0,
        max_tokens=8192,
        # ngram logit processor args
        extra_args=dict(
        ngram_size=30,
        window_size=90,
        whitelist_token_ids={128821, 128822}, # whitelist: ,
        ),
        skip_special_tokens=False,
        )
        # Generate output
        model_outputs = llm.generate(model_input, sampling_param)
        # Print output
        for output in model_outputs:
        print(output.outputs[0].text)
        ```
        ## Visualizations
        &lt;table&gt;
        &lt;tbody&gt;&lt;tr&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/show1.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/show2.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/show3.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/assets/show4.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;/tbody&gt;&lt;/table&gt;
        ## Acknowledgement
        We would like to thank [Vary](https://github.com/Ucas-HaoranWei/Vary/), [GOT-OCR2.0](https://github.com/Ucas-HaoranWei/GOT-OCR2.0/), [MinerU](https://github.com/opendatalab/MinerU), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [OneChart](https://github.com/LingyvKong/OneChart), [Slow Perception](https://github.com/Ucas-HaoranWei/Slow-Perception) for their valuable models and ideas.
        We also appreciate the benchmarks: [Fox](https://github.com/ucaslcl/Fox), [OminiDocBench](https://github.com/opendatalab/OmniDocBench).
        ## Citation
        ```bibtex
        @article{wei2025deepseek,
        title={DeepSeek-OCR: Contexts Optical Compression},
        author={Wei, Haoran and Sun, Yaofeng and Li, Yukun},
        journal={arXiv preprint arXiv:2510.18234},
        year={2025}
        }</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-OCR</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-OCR</guid>
      <pubDate>Mon, 03 Nov 2025 18:36:12 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2-Exp-Base</title>
      <description>deepseek-ai/DeepSeek-V3.2-Exp-Base Text Generation • 685B • Updated Oct 9 • 217 • 48
        ---
        license: mit
        library_name: transformers
        ---</description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp-Base</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp-Base</guid>
      <pubDate>Wed, 08 Oct 2025 18:09:46 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.1-Terminus</title>
      <description>deepseek-ai/DeepSeek-V3.1-Terminus Text Generation • 685B • Updated Sep 29 • 81k • • 354
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.1-Base
        ---
        # DeepSeek-V3.1-Terminus
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        ## Introduction
        This update maintains the model&#39;s original capabilities while addressing issues reported by users, including:
        - Language consistency: Reducing instances of mixed Chinese-English text and occasional abnormal characters;
        - Agent capabilities: Further optimizing the performance of the Code Agent and Search Agent.
        | Benchmark | DeepSeek-V3.1 | DeepSeek-V3.1-Terminus |
        | :--- | :---: | :---: |
        | **Reasoning Mode w/o Tool Use** | | |
        | MMLU-Pro | 84.8 | 85.0 |
        | GPQA-Diamond | 80.1 | 80.7 |
        | Humanity&#39;s Last Exam | 15.9 | 21.7 |
        | LiveCodeBench | 74.8 | 74.9 |
        | Codeforces | 2091 | 2046 |
        | Aider-Polyglot | 76.3 | 76.1 |
        | **Agentic Tool Use** | | |
        | BrowseComp | 30.0 | 38.5 |
        | BrowseComp-zh | 49.2 | 45.0 |
        | SimpleQA | 93.4 | 96.8 |
        | SWE Verified | 66.0 | 68.4 |
        | SWE-bench Multilingual | 54.5 | 57.8 |
        | Terminal-bench | 31.3 | 36.7 |
        **The template and tool-set of search agent have been updated, which is shown in `assets/search_tool_trajectory.html`.**
        ## How to Run Locally
        The model structure of DeepSeek-V3.1-Terminus is the same as DeepSeek-V3. Please visit [DeepSeek-V3](https://github.com/deepseek-ai/DeepSeek-V3) repo for more information about running this model locally.
        For the model&#39;s chat template other than search agent, please refer to the [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1) repo.
        **Here we also provide an updated inference demo code in the `inference` folder to help the community get started with running our model and understand the details of model architecture.**
        **NOTE: In the current model checkpoint, the parameters of `self_attn.o_proj` do not conform to the UE8M0 FP8 scale data format. This is a known issue and will be corrected in future model releases.**
        ## License
        This repository and the model weights are licensed under the [MIT License](LICENSE).
        ## Citation
        ```
        @misc{deepseekai2024deepseekv3technicalreport,
        title={DeepSeek-V3 Technical Report},
        author={DeepSeek-AI},
        year={2024},
        eprint={2412.19437},
        archivePrefix={arXiv},
        primaryClass={cs.CL},
        url={https://arxiv.org/abs/2412.19437},
        }
        ```
        ## Contact
        If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus</guid>
      <pubDate>Sun, 28 Sep 2025 17:52:07 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.1</title>
      <description>deepseek-ai/DeepSeek-V3.1 Text Generation • 685B • Updated Sep 5 • 82.2k • • 808
        ---
        license: mit
        library_name: transformers
        base_model:
        - deepseek-ai/DeepSeek-V3.1-Base
        ---
        # DeepSeek-V3.1
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        ## Introduction
        DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements in multiple aspects:
        - **Hybrid thinking mode**: One model supports both thinking mode and non-thinking mode by changing the chat template.
        - **Smarter tool calling**: Through post-training optimization, the model&#39;s performance in tool usage and agent tasks has significantly improved.
        - **Higher thinking efficiency**: DeepSeek-V3.1-Think achieves comparable answer quality to DeepSeek-R1-0528, while responding more quickly.
        DeepSeek-V3.1 is post-trained on the top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases. The 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended by 3.3x to 209B tokens.
        Additionally, DeepSeek-V3.1 is trained using the **UE8M0 FP8 scale data format on both model weights and activations** to ensure compatibility with microscaling data formats. Please refer to [DeepGEMM](https://github.com/deepseek-ai/DeepGEMM) for more details.
        ## Model Downloads
        &lt;div align=&quot;center&quot;&gt;
        | **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download** |
        | :------------: | :------------: | :------------: | :------------: | :------------: |
        | DeepSeek-V3.1-Base | 671B | 37B | 128K | [HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) \| [ModelScope](https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.1-Base) |
        | DeepSeek-V3.1 | 671B | 37B | 128K | [HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-V3.1) \| [ModelScope](https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.1) |
        &lt;/div&gt;
        ## Chat Template
        The details of our chat template is described in `tokenizer_config.json` and `assets/chat_template.jinja`. Here is a brief description.
        ### Non-Thinking
        #### First-Turn
        Prefix:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;`
        With the given prefix, DeepSeek V3.1 generates responses to queries in non-thinking mode. Unlike DeepSeek V3, it introduces an additional token ``.
        #### Multi-Turn
        Context:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;{response}&amp;lt;｜end▁of▁sentence｜&amp;gt;...&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;{response}&amp;lt;｜end▁of▁sentence｜&amp;gt;`
        Prefix:
        `&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;`
        By concatenating the context and the prefix, we obtain the correct prompt for the query.
        ### Thinking
        #### First-Turn
        Prefix:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;&lt;think&gt;`
        The prefix of thinking mode is similar to DeepSeek-R1.
        #### Multi-Turn
        Context:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;&lt;/think&gt;{response}&amp;lt;｜end▁of▁sentence｜&amp;gt;...&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;{response}&amp;lt;｜end▁of▁sentence｜&amp;gt;`
        Prefix:
        `&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;&lt;think&gt;`
        The multi-turn template is the same with non-thinking multi-turn chat template. It means the thinking token in the last turn will be dropped but the `&lt;/think&gt;` is retained in every turn of context.
        ### ToolCall
        Toolcall is supported in non-thinking mode. The format is:
        `&amp;lt;｜begin▁of▁sentence｜&amp;gt;{system prompt}\n\n{tool_description}&amp;lt;｜User｜&amp;gt;{query}&amp;lt;｜Assistant｜&amp;gt;` where the tool_description is
        ```
        ## Tools
        You have access to the following tools:
        ### {tool_name1}
        Description: {description}
        Parameters: {json.dumps(parameters)}
        IMPORTANT: ALWAYS adhere to this exact format for tool use:
        &amp;lt;｜tool▁calls▁begin｜&amp;gt;&amp;lt;｜tool▁call▁begin｜&amp;gt;tool_call_name&amp;lt;｜tool▁sep｜&amp;gt;tool_call_arguments&amp;lt;｜tool▁call▁end｜&amp;gt;{additional_tool_calls}&amp;lt;｜tool▁calls▁end｜&amp;gt;
        Where:
        - `tool_call_name` must be an exact match to one of the available tools
        - `tool_call_arguments` must be valid JSON that strictly follows the tool&#39;s Parameters Schema
        - For multiple tool calls, chain them directly without separators or spaces
        ```
        ### Code-Agent
        We support various code agent frameworks. Please refer to the above toolcall format to create your own code agents. An example is shown in `assets/code_agent_trajectory.html`.
        ### Search-Agent
        We design a specific format for searching toolcall in thinking mode, to support search agent.
        For complex questions that require accessing external or up-to-date information, DeepSeek-V3.1 can leverage a user-provided search tool through a multi-turn tool-calling process.
        Please refer to the `assets/search_tool_trajectory.html` and `assets/search_python_tool_trajectory.html` for the detailed template.
        ## Evaluation
        | Category | Benchmark (Metric) | DeepSeek V3.1-NonThinking | DeepSeek V3 0324 | DeepSeek V3.1-Thinking | DeepSeek R1 0528
        |----------|----------------------------------|-----------------|---|---|---|
        | General |
        | | MMLU-Redux (EM) | 91.8 | 90.5 | 93.7 | 93.4
        | | MMLU-Pro (EM) | 83.7 | 81.2 | 84.8 | 85.0
        | | GPQA-Diamond (Pass@1) | 74.9 | 68.4 | 80.1 | 81.0
        | | Humanity&#39;s Last Exam (Pass@1) | - | - | 15.9 | 17.7
        |Search Agent|
        | | BrowseComp | - | - | 30.0 | 8.9
        | | BrowseComp_zh | - | - | 49.2 | 35.7
        | | Humanity&#39;s Last Exam (Python + Search) |- | - | 29.8 | 24.8
        | | SimpleQA | - | - | 93.4 | 92.3
        | Code |
        | | LiveCodeBench (2408-2505) (Pass@1) | 56.4 | 43.0 | 74.8 | 73.3
        | | Codeforces-Div1 (Rating) | - | - | 2091 | 1930
        | | Aider-Polyglot (Acc.) | 68.4 | 55.1 | 76.3 | 71.6
        | Code Agent|
        | | SWE Verified (Agent mode) | 66.0 | 45.4 | - | 44.6
        | | SWE-bench Multilingual (Agent mode) | 54.5 | 29.3 | - | 30.5
        | | Terminal-bench (Terminus 1 framework) | 31.3 | 13.3 | - | 5.7
        | Math |
        | | AIME 2024 (Pass@1) | 66.3 | 59.4 | 93.1 | 91.4
        | | AIME 2025 (Pass@1) | 49.8 | 51.3 | 88.4 | 87.5
        | | HMMT 2025 (Pass@1) | 33.5 | 29.2 | 84.2 | 79.4 |
        Note:
        - Search agents are evaluated with our internal search framework, which uses a commercial search API + webpage filter + 128K context window. Seach agent results of R1-0528 are eva

...

github-actions · 2025-12-06T11:25:14Z

http://localhost:1200/huggingface/models/facebook - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Huggingface facebook Models</title>
    <link>https://huggingface.co/facebook/models?sort=created</link>
    <atom:link href="http://localhost:1200/huggingface/models/facebook" rel="self" type="application/rss+xml"></atom:link>
    <description>Huggingface facebook Models - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sat, 06 Dec 2025 11:25:11 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>facebook/testing_instructions</title>
      <description>facebook/testing_instructions Updated 2 days ago • 2
        ---
        extra_gated_fields:
        First Name: text
        Last Name: text
        Date of birth: date_picker
        Country: country
        Affiliation: text
        I accept the terms and conditions: checkbox
        geo: ip_location
        Test request?: checkbox
        By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox
        extra_gated_description: &amp;gt;-
        The information you provide will be collected, stored, processed and shared in
        accordance with the [Meta Privacy
        Policy](https://www.facebook.com/privacy/policy/).
        extra_gated_button_content: Submit
        extra_gated_heading: &quot;Please be sure to provide your full legal name, date of birth, and full organization name with all corporate identifiers. Avoid the use of acronyms and special characters. Failure to follow these instructions may prevent you from accessing this model and others on Hugging Face. You will not have the ability to edit this form after submission, so please ensure all information is accurate.&quot;
        language:
        - en
        tags:
        - meta-ai
        - meta-pytorch
        license: fair-noncommercial-research-license
        ---</description>
      <link>https://huggingface.co/facebook/testing_instructions</link>
      <guid isPermaLink="false">https://huggingface.co/facebook/testing_instructions</guid>
      <pubDate>Thu, 04 Dec 2025 00:10:57 GMT</pubDate>
    </item>
    <item>
      <title>facebook/omniASR-LLM-7B-ZS</title>
      <description>facebook/omniASR-LLM-7B-ZS Automatic Speech Recognition • Updated 8 days ago • 3
        ---
        license: apache-2.0
        datasets:
        - facebook/omnilingual-asr-corpus
        pipeline_tag: automatic-speech-recognition
        ---
        # Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
        &lt;div align=&quot;center&quot; style=&quot;lline-height: 1.2; font-size:16px; margin-bottom: 30px;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/facebook&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        🤗 Hugging Face
        &lt;/a&gt; |
        &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        🐙 GitHub
        &lt;/a&gt; |
        &lt;a href=&quot;https://huggingface.co/spaces/facebook/omniasr-transcriptions&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        🤖️ Demo
        &lt;/a&gt; |
        &lt;a href=&quot;https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        📃 Paper
        &lt;/a&gt; |
        &lt;a href=&quot;https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        📝 Blogpost
        &lt;/a&gt; |
        &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        📄 License: Apache 2.0
        &lt;/a&gt;
        &lt;/div&gt;
        # Model Card for omniASR-LLM-7B-ZS
        ## Model Description
        This model is part of the **Omnilingual ASR** family released by Meta AI. The original suite includes:
        &lt;!-- TODO : add new tokenizer, we&#39;ll get two tokenizer, add mssing speed numbers--&gt;
        | Model Name | Features | Parameters | Download Size (FP32) | Inference VRAM¹ | Real-Time Factor¹ (relative speed)² |
        |---------------------|---------------|------------:|---------------:|---------------:|-----------:|
        | [`omniASR_W2V_300M`](https://huggingface.co/facebook/omniASR-W2V-300M) | SSL | 317_390_592 | 1.2 GiB | | |
        | [`omniASR_W2V_1B`](https://huggingface.co/facebook/omniASR-W2V-1B) | SSL | 965_514_752 | 3.6 GiB | | |
        | [`omniASR_W2V_3B`](https://huggingface.co/facebook/omniASR-W2V-3B) | SSL | 3_064_124_672 | 12.0 GiB | | |
        | [`omniASR_W2V_7B`](https://huggingface.co/facebook/omniASR-W2V-7B) | SSL | 6_488_487_168 | 25.0 GiB | | |
        | [`omniASR_CTC_300M`](https://huggingface.co/facebook/omniASR-CTC-300M) | ASR | 325_494_996 | 1.3 GiB | ~2 GiB | 0.001 (96x) |
        | [`omniASR_CTC_1B`](https://huggingface.co/facebook/omniASR-CTC-1B) | ASR | 975_065_300 | 3.7 GiB | ~3 GiB | 0.002 (48x) |
        | [`omniASR_CTC_3B`](https://huggingface.co/facebook/omniASR-CTC-3B) | ASR | 3_080_423_636 | 12.0 GiB | ~8 GiB | 0.003 (32x) |
        | [`omniASR_CTC_7B`](https://huggingface.co/facebook/omniASR-CTC-7B) | ASR | 6_504_786_132 | 25.0 GiB | ~15 GiB | 0.006 (16x) |
        | [`omniASR_LLM_300M`](https://huggingface.co/facebook/omniASR-LLM-300M) | ASR with optional language conditioning | 1_627_603_584 | 6.1 GiB | ~5 GiB | 0.090 (~1x) |
        | [`omniASR_LLM_1B`](https://huggingface.co/facebook/omniASR-LLM-1B) | ASR with optional language conditioning | 2_275_710_592 | 8.5 GiB | ~6 GiB | 0.091 (~1x) |
        | [`omniASR_LLM_3B`](https://huggingface.co/facebook/omniASR-LLM-3B) | ASR with optional language conditioning | 4_376_679_040 | 17.0 GiB | ~10 GiB | 0.093 (~1x) |
        | [`omniASR_LLM_7B`](https://huggingface.co/facebook/omniASR-LLM-7B) | ASR with optional language conditioning | 7_801_041_536 | 30.0 GiB | ~17 GiB | 0.092 (~1x) |
        | [`omniASR_LLM_7B_ZS`](https://huggingface.co/facebook/omniASR-LLM-7B-ZS) | Zero-Shot ASR | 7_810_900_608 | 30.0 GiB | ~20 GiB |&amp;nbsp;0.194 (~0.5x) |
        ¹ (batch=1, audio_len=30s, BF16, A100)
        ² Relative speed to `omniASR_LLM_7B`
        ---
        ## Installation
        The models were developed using [fairseq2](https://github.com/facebookresearch/fairseq2), a research-focused sequence modeling toolkit. While we provide a **reference** inference pipeline that works across platforms, audio support requires [libsndfile](https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#system-dependencies) (Mac: `brew install libsndfile`; Windows may need an additional [setup](https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#installing-on-windows)).
        ```bash
        # using pip
        pip install omnilingual-asr
        # using uv
        uv add omnilingual-asr
        ```
        ## Inference
        ```python
        from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
        pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
        audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
        lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
        transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)
        ```
        ## Supported Languages
        To view the full list of 1600+ supported languages, you can access the language list [programmatically](/src/omnilingual_asr/models/wav2vec2_llama/lang_ids.py):
        ```python
        from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
        # Print all supported languages
        print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
        print(supported_langs)
        # Check if a specific language is supported
        if &quot;eng_Latn&quot; in supported_langs:
        print(&quot;English (Latin script) is supported!&quot;)
        ```
        Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...
        ---
        ## Training
        To further finetune the released checkpoints on your own data, use our [data preparation guide](/workflows/dataprep/README.md) followed by the [finetuning recipe guide](/workflows/recipes/wav2vec2/asr/README.md).
        ---
        ## Citation
        **BibTeX:**
        ```bibtex
        @misc{omnilingualasr2025,
        title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-7B-ZS https://huggingface.co/facebook/omniASR-LLM-7B-ZSThu, 27 Nov 2025 23:41:34 GMT<title>facebook/omniASR-LLM-7B</title>facebook/omniASR-LLM-7B Automatic Speech Recognition • Updated 8 days ago • 1

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-LLM-7B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-7B https://huggingface.co/facebook/omniASR-LLM-7BThu, 27 Nov 2025 23:26:06 GMT<title>facebook/omniASR-LLM-3B</title>facebook/omniASR-LLM-3B Automatic Speech Recognition • Updated 8 days ago • 1

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-LLM-3B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-3B https://huggingface.co/facebook/omniASR-LLM-3BThu, 27 Nov 2025 23:08:56 GMT<title>facebook/omniASR-LLM-1B</title>facebook/omniASR-LLM-1B Automatic Speech Recognition • Updated 8 days ago • 3

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-LLM-1B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-1B https://huggingface.co/facebook/omniASR-LLM-1BThu, 27 Nov 2025 22:37:27 GMT<title>facebook/omniASR-LLM-300M</title>facebook/omniASR-LLM-300M Automatic Speech Recognition • Updated 8 days ago • 3

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-LLM-300M

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-LLM-300M https://huggingface.co/facebook/omniASR-LLM-300MThu, 27 Nov 2025 22:04:09 GMT<title>facebook/omniASR-CTC-7B</title>facebook/omniASR-CTC-7B Automatic Speech Recognition • Updated 8 days ago • 1

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-CTC-7B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-CTC-7B https://huggingface.co/facebook/omniASR-CTC-7BThu, 27 Nov 2025 21:31:20 GMT<title>facebook/omniASR-CTC-3B</title>facebook/omniASR-CTC-3B Automatic Speech Recognition • Updated 8 days ago • 2

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE" style="margin: 2px;">
📄 License: Apache 2.0
</a>
</div>

Model Card for omniASR-CTC-3B

Model Description

This model is part of the Omnilingual ASR family released by Meta AI. The original suite includes:

Model Name	Features	Parameters	Download Size (FP32)	Inference VRAM¹	Real-Time Factor¹ (relative speed)²
`omniASR_W2V_300M`	SSL	317_390_592	1.2 GiB
`omniASR_W2V_1B`	SSL	965_514_752	3.6 GiB
`omniASR_W2V_3B`	SSL	3_064_124_672	12.0 GiB
`omniASR_W2V_7B`	SSL	6_488_487_168	25.0 GiB
`omniASR_CTC_300M`	ASR	325_494_996	1.3 GiB	~2 GiB	0.001 (96x)
`omniASR_CTC_1B`	ASR	975_065_300	3.7 GiB	~3 GiB	0.002 (48x)
`omniASR_CTC_3B`	ASR	3_080_423_636	12.0 GiB	~8 GiB	0.003 (32x)
`omniASR_CTC_7B`	ASR	6_504_786_132	25.0 GiB	~15 GiB	0.006 (16x)
`omniASR_LLM_300M`	ASR with optional language conditioning	1_627_603_584	6.1 GiB	~5 GiB	0.090 (~1x)
`omniASR_LLM_1B`	ASR with optional language conditioning	2_275_710_592	8.5 GiB	~6 GiB	0.091 (~1x)
`omniASR_LLM_3B`	ASR with optional language conditioning	4_376_679_040	17.0 GiB	~10 GiB	0.093 (~1x)
`omniASR_LLM_7B`	ASR with optional language conditioning	7_801_041_536	30.0 GiB	~17 GiB	0.092 (~1x)
`omniASR_LLM_7B_ZS`	Zero-Shot ASR	7_810_900_608	30.0 GiB	~20 GiB	0.194 (~0.5x)
¹ (batch=1, audio_len=30s, BF16, A100)
² Relative speed to `omniASR_LLM_7B`

Installation

The models were developed using fairseq2, a research-focused sequence modeling toolkit. While we provide a reference inference pipeline that works across platforms, audio support requires libsndfile (Mac: brew install libsndfile; Windows may need an additional setup).

# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr

Inference

from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)

Supported Languages

To view the full list of 1600+ supported languages, you can access the language list programmatically:

from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

BibTeX:

@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}

Developed by: Meta AI / Omnilingual ASR Team([GitHub][1])
Model type: End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).
Language(s) (NLP): 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers 348 under-served languages across many writing systems (Latin, Arabic, Devanagari, etc.).([GitHub][1])
License: Apache-2.0 (for the model and code), CC-BY-4.0 for the facebook/omnilingual-asr-corpus dataset.([GitHub][1])

[1]: https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file "GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages"
[2]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus/blob/main/README.md "README.md · facebook/omnilingual-asr-corpus at main"
[3]: https://huggingface.co/datasets/facebook/omnilingual-asr-corpus?utm_source=chatgpt.com "facebook/omnilingual-asr-corpus · Datasets at ..."
[4]: https://venturebeat.com/ai/meta-returns-to-open-source-ai-with-omnilingual-asr-models-that-can?utm_source=chatgpt.com "Meta returns to open source AI with Omnilingual ASR ..."
[5]: https://huggingface.co/spaces/facebook/omniasr-transcriptions?utm_source=chatgpt.com "Omnilingual ASR Media Transcription"
[6]: https://huggingface.co/collections/bezzam/omnilingual-asr-1-600-languages?utm_source=chatgpt.com "Omnilingual ASR (1600+ Languages) - a bezzam Collection"
https://huggingface.co/facebook/omniASR-CTC-3B https://huggingface.co/facebook/omniASR-CTC-3BThu, 27 Nov 2025 15:47:08 GMT<title>facebook/omniASR-CTC-1B</title>facebook/omniASR-CTC-1B Automatic Speech Recognition • Updated 8 days ago • 1

license: apache-2.0
datasets:

facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

<div align="center" style="lline-height: 1.2; font-size:16px; margin-bottom: 30px;">
<a href="https://huggingface.co/facebook" target="_blank" style="margin: 2px;">
🤗 Hugging Face
</a> |
<a href="https://github.com/facebookresearch/omnilingual-asr" target="_blank" style="margin: 2px;">
🐙 GitHub
</a> |
<a href="https://huggingface.co/spaces/facebook/omniasr-transcriptions" target="_blank" style="margin: 2px;">
🤖️ Demo
</a> |
<a href="https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/" target="_blank" style="margin: 2px;">
📃 Paper
</a> |
<a href="https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/" target="_blank" style="margin: 2px;">
📝 Blogpost
</a> |
<a href="https://github.com/faceb

</details>


...

github-actions · 2025-12-06T11:25:15Z

http://localhost:1200/huggingface/models/ianyang02 - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Huggingface ianyang02 Models</title>
    <link>https://huggingface.co/ianyang02/models?sort=created</link>
    <atom:link href="http://localhost:1200/huggingface/models/ianyang02" rel="self" type="application/rss+xml"></atom:link>
    <description>Huggingface ianyang02 Models - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sat, 06 Dec 2025 11:25:13 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>ianyang02/aita_qwen3-30b</title>
      <description>ianyang02/aita_qwen3-30b Updated about 13 hours ago</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-30b</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-30b</guid>
      <pubDate>Fri, 05 Dec 2025 14:17:52 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_dpo</title>
      <description>ianyang02/aita_qwen3-4b_dpo Updated 15 days ago
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_dpo
        tags:
        - generated_from_trainer
        - trl
        - dpo
        licence: license
        ---
        # Model Card for aita_qwen3-4b_dpo
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        question = &quot;If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?&quot;
        generator = pipeline(&quot;text-generation&quot;, model=&quot;ianyang02/aita_qwen3-4b_dpo&quot;, device=&quot;cuda&quot;)
        output = generator([{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: question}], max_new_tokens=128, return_full_text=False)[0]
        print(output[&quot;generated_text&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/4xnikck8)
        This model was trained with DPO, a method introduced in [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://huggingface.co/papers/2305.18290).
        ### Framework versions
        - TRL: 0.24.0
        - Transformers: 4.57.1
        - Pytorch: 2.9.0
        - Datasets: 4.3.0
        - Tokenizers: 0.22.1
        ## Citations
        Cite DPO as:
        ```bibtex
        @inproceedings{rafailov2023direct,
        title = {{Direct Preference Optimization: Your Language Model is Secretly a Reward Model}},
        author = {Rafael Rafailov and Archit Sharma and Eric Mitchell and Christopher D. Manning and Stefano Ermon and Chelsea Finn},
        year = 2023,
        booktitle = {Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023},
        url = {http://papers.nips.cc/paper_files/paper/2023/hash/a85b405ed65c6477a4fe8302b5e06ce7-Abstract-Conference.html},
        editor = {Alice Oh and Tristan Naumann and Amir Globerson and Kate Saenko and Moritz Hardt and Sergey Levine},
        }
        ```
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_dpo</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_dpo</guid>
      <pubDate>Fri, 21 Nov 2025 15:09:21 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/ppo_model_qwen3-4b_aita_h200_2</title>
      <description>ianyang02/ppo_model_qwen3-4b_aita_h200_2 Text Generation • Updated 15 days ago • 66
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: peft
        model_name: ppo_model_qwen3-4b_aita_h200_2
        tags:
        - base_model:adapter:Qwen/Qwen3-4B-Instruct-2507
        - lora
        - transformers
        licence: license
        pipeline_tag: text-generation
        ---
        # Model Card for ppo_model_qwen3-4b_aita_h200_2
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        question = &quot;If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?&quot;
        generator = pipeline(&quot;text-generation&quot;, model=&quot;ianyang02/ppo_model_qwen3-4b_aita_h200_2&quot;, device=&quot;cuda&quot;)
        output = generator([{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: question}], max_new_tokens=128, return_full_text=False)[0]
        print(output[&quot;generated_text&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/x99g8ppd)
        This model was trained with PPO, a method introduced in [Fine-Tuning Language Models from Human Preferences](https://huggingface.co/papers/1909.08593).
        ### Framework versions
        - PEFT 0.18.0
        - TRL: 0.25.1
        - Transformers: 4.57.1
        - Pytorch: 2.7.0.dev20250224+cu126
        - Datasets: 4.4.1
        - Tokenizers: 0.22.1
        ## Citations
        Cite PPO as:
        ```bibtex
        @article{mziegler2019fine-tuning,
        title = {{Fine-Tuning Language Models from Human Preferences}},
        author = {Daniel M. Ziegler and Nisan Stiennon and Jeffrey Wu and Tom B. Brown and Alec Radford and Dario Amodei and Paul F. Christiano and Geoffrey Irving},
        year = 2019,
        eprint = {arXiv:1909.08593}
        }
        ```
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200_2</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200_2</guid>
      <pubDate>Fri, 21 Nov 2025 08:19:14 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_2</title>
      <description>ianyang02/aita_qwen3-4b_2 Text Classification • 4B • Updated 16 days ago • 23
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_2
        tags:
        - generated_from_trainer
        - trl
        - reward-trainer
        licence: license
        ---
        # Model Card for aita_qwen3-4b_2
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b_2&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/lacazue4)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.25.1
        - Transformers: 4.57.1
        - Pytorch: 2.5.0.dev20240818+cu124
        - Datasets: 4.4.1
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_2</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_2</guid>
      <pubDate>Wed, 19 Nov 2025 23:02:33 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_2_checkpoint_2</title>
      <description>ianyang02/aita_qwen3-4b_2_checkpoint_2 Text Classification • 4B • Updated 16 days ago • 25
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_2
        tags:
        - generated_from_trainer
        - trl
        - reward-trainer
        licence: license
        ---
        # Model Card for aita_qwen3-4b_2
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b_2&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/lacazue4)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.25.1
        - Transformers: 4.57.1
        - Pytorch: 2.5.0.dev20240818+cu124
        - Datasets: 4.4.1
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_2</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_2</guid>
      <pubDate>Wed, 19 Nov 2025 19:52:00 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_2_checkpoint_1</title>
      <description>ianyang02/aita_qwen3-4b_2_checkpoint_1 Text Classification • 4B • Updated 16 days ago • 4
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_2
        tags:
        - generated_from_trainer
        - trl
        - reward-trainer
        licence: license
        ---
        # Model Card for aita_qwen3-4b_2
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b_2&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/lacazue4)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.25.1
        - Transformers: 4.57.1
        - Pytorch: 2.5.0.dev20240818+cu124
        - Datasets: 4.4.1
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_1</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_1</guid>
      <pubDate>Wed, 19 Nov 2025 16:40:34 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/ppo_model_qwen3-4b_aita_h200</title>
      <description>ianyang02/ppo_model_qwen3-4b_aita_h200 Updated 17 days ago
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: ppo_model_qwen3-4b_aita_h200
        tags:
        - generated_from_trainer
        - trl
        - ppo
        licence: license
        ---
        # Model Card for ppo_model_qwen3-4b_aita_h200
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        question = &quot;If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?&quot;
        generator = pipeline(&quot;text-generation&quot;, model=&quot;ianyang02/ppo_model_qwen3-4b_aita_h200&quot;, device=&quot;cuda&quot;)
        output = generator([{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: question}], max_new_tokens=128, return_full_text=False)[0]
        print(output[&quot;generated_text&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/ht40cr4d)
        This model was trained with PPO, a method introduced in [Fine-Tuning Language Models from Human Preferences](https://huggingface.co/papers/1909.08593).
        ### Framework versions
        - TRL: 0.24.0
        - Transformers: 4.57.1
        - Pytorch: 2.9.0
        - Datasets: 4.3.0
        - Tokenizers: 0.22.1
        ## Citations
        Cite PPO as:
        ```bibtex
        @article{mziegler2019fine-tuning,
        title = {{Fine-Tuning Language Models from Human Preferences}},
        author = {Daniel M. Ziegler and Nisan Stiennon and Jeffrey Wu and Tom B. Brown and Alec Radford and Dario Amodei and Paul F. Christiano and Geoffrey Irving},
        year = 2019,
        eprint = {arXiv:1909.08593}
        }
        ```
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200</guid>
      <pubDate>Tue, 18 Nov 2025 16:21:51 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b</title>
      <description>ianyang02/aita_qwen3-4b Text Classification • 4B • Updated 20 days ago • 93
        ---
        base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b
        tags:
        - generated_from_trainer
        - trl
        - reward-trainer
        licence: license
        ---
        # Model Card for aita_qwen3-4b
        This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/ofxhxi45)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.24.0
        - Transformers: 4.57.1
        - Pytorch: 2.9.0
        - Datasets: 4.3.0
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b</guid>
      <pubDate>Sun, 16 Nov 2025 01:32:06 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_fine_tuned_Qwen3_0.6B</title>
      <description>ianyang02/aita_fine_tuned_Qwen3_0.6B Updated 24 days ago</description>
      <link>https://huggingface.co/ianyang02/aita_fine_tuned_Qwen3_0.6B</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_fine_tuned_Qwen3_0.6B</guid>
      <pubDate>Tue, 11 Nov 2025 16:18:49 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_Qwen3-0.6B</title>
      <description>ianyang02/aita_Qwen3-0.6B Text Classification • 0.6B • Updated 28 days ago • 50
        ---
        base_model: Qwen/Qwen3-0.6B
        library_name: transformers
        model_name: aita_Qwen3-0.6B
        tags:
        - generated_from_trainer
        - reward-trainer
        - trl
        licence: license
        ---
        # Model Card for aita_Qwen3-0.6B
        This model is a fine-tuned version of [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B).
        It has been trained using [TRL](https://github.com/huggingface/trl).
        ## Quick start
        ```python
        from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_Qwen3-0.6B&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        ```
        ## Training procedure
        [&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;](https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/7xwpy737)
        This model was trained with Reward.
        ### Framework versions
        - TRL: 0.24.0
        - Transformers: 4.57.1
        - Pytorch: 2.9.0
        - Datasets: 4.3.0
        - Tokenizers: 0.22.1
        ## Citations
        Cite TRL as:
        ```bibtex
        @misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        ```</description>
      <link>https://huggingface.co/ianyang02/aita_Qwen3-0.6B</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_Qwen3-0.6B</guid>
      <pubDate>Sat, 08 Nov 2025 07:48:49 GMT</pubDate>
    </item>
  </channel>
</rss>

lib/routes/huggingface/models.ts

github-actions · 2025-12-07T03:02:32Z

Successfully generated as following:

http://localhost:1200/huggingface/models/deepseek-ai - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Huggingface deepseek-ai Models</title>
    <link>https://huggingface.co/deepseek-ai/models?sort=created</link>
    <atom:link href="http://localhost:1200/huggingface/models/deepseek-ai" rel="self" type="application/rss+xml"></atom:link>
    <description>Huggingface deepseek-ai Models - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sun, 07 Dec 2025 03:02:24 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2-Speciale</title>
      <description>deepseek-ai/DeepSeek-V3.2-Speciale Text Generation • 685B • Updated 6 days ago • 4.72k • 521&lt;hr&gt;
        &lt;p&gt;license: mit
        library_name: transformers
        base_model:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;deepseek-ai/DeepSeek-V3.2-Exp-Base
        base_model_relation: finetune&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;DeepSeek-V3.2: Efficient Reasoning &amp;amp; Agentic AI&lt;/h1&gt;
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2/blob/main/assets/paper.pdf&quot;&gt;&lt;b&gt;Technical Report&lt;/b&gt;👁️&lt;/a&gt;
        &lt;/p&gt;
        &lt;h2&gt;Introduction&lt;/h2&gt;
        &lt;p&gt;We introduce &lt;strong&gt;DeepSeek-V3.2&lt;/strong&gt;, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:&lt;/p&gt;
        &lt;ol&gt;
        &lt;li&gt;&lt;strong&gt;DeepSeek Sparse Attention (DSA):&lt;/strong&gt; We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.&lt;/li&gt;
        &lt;li&gt;&lt;strong&gt;Scalable Reinforcement Learning Framework:&lt;/strong&gt; By implementing a robust RL protocol and scaling post-training compute, &lt;em&gt;DeepSeek-V3.2&lt;/em&gt; performs comparably to GPT-5. Notably, our high-compute variant, &lt;strong&gt;DeepSeek-V3.2-Speciale&lt;/strong&gt;, &lt;strong&gt;surpasses GPT-5&lt;/strong&gt; and exhibits reasoning proficiency on par with Gemini-3.0-Pro.
        &lt;ul&gt;
        &lt;li&gt;&lt;em&gt;Achievement:&lt;/em&gt; 🥇 &lt;strong&gt;Gold-medal performance&lt;/strong&gt; in the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI).&lt;/li&gt;
        &lt;/ul&gt;
        &lt;/li&gt;
        &lt;li&gt;&lt;strong&gt;Large-Scale Agentic Task Synthesis Pipeline:&lt;/strong&gt; To integrate &lt;strong&gt;reasoning into tool-use&lt;/strong&gt; scenarios, we developed a novel synthesis pipeline that systematically generates training data at scale. This facilitates scalable agentic post-training, improving compliance and generalization in complex interactive environments.&lt;/li&gt;
        &lt;/ol&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale/resolve/main/assets/benchmark.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;p&gt;We have also released the final submissions for IOI 2025, ICPC World Finals, IMO 2025 and CMO 2025, which were selected based on our designed pipeline. These materials are provided for the community to conduct secondary verification. The files can be accessed at &lt;code&gt;assets/olympiad_cases&lt;/code&gt;.&lt;/p&gt;
        &lt;h2&gt;Chat Template&lt;/h2&gt;
        &lt;p&gt;DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a &quot;thinking with tools&quot; capability.&lt;/p&gt;
        &lt;p&gt;To assist the community in understanding and adapting to this new template, we have provided a dedicated &lt;code&gt;encoding&lt;/code&gt; folder, which contains Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model and how to parse the model&#39;s text output.&lt;/p&gt;
        &lt;p&gt;A brief example is illustrated below:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import transformers
        # encoding/encoding_dsv32.py
        from encoding_dsv32 import encode_messages, parse_message_from_completion_text
        tokenizer = transformers.AutoTokenizer.from_pretrained(&quot;deepseek-ai/DeepSeek-V3.2&quot;)
        messages = [
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;hello&quot;},
        {&quot;role&quot;: &quot;assistant&quot;, &quot;content&quot;: &quot;Hello! I am DeepSeek.&quot;, &quot;reasoning_content&quot;: &quot;thinking...&quot;},
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;1+1=?&quot;}
        ]
        encode_config = dict(thinking_mode=&quot;thinking&quot;, drop_thinking=True, add_default_bos_token=True)
        # messages -&amp;gt; string
        prompt = encode_messages(messages, **encode_config)
        # Output: &quot;&amp;lt;｜begin▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;hello&amp;lt;｜Assistant｜&amp;gt;&amp;lt;/think&amp;gt;Hello! I am DeepSeek.&amp;lt;｜end▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;1+1=?&amp;lt;｜Assistant｜&amp;gt;&amp;lt;think&amp;gt;&quot;
        # string -&amp;gt; tokens
        tokens = tokenizer.encode(prompt)
        # Output: [0, 128803, 33310, 128804, 128799, 19923, 3, 342, 1030, 22651, 4374, 1465, 16, 1, 128803, 19, 13, 19, 127252, 128804, 128798]
        &lt;/code&gt;&lt;/pre&gt;
        &lt;p&gt;Important Notes:&lt;/p&gt;
        &lt;ol&gt;
        &lt;li&gt;This release does not include a Jinja-format chat template. Please refer to the Python code mentioned above.&lt;/li&gt;
        &lt;li&gt;The output parsing function included in the code is designed to handle well-formatted strings only. It does not attempt to correct or recover from malformed output that the model might occasionally generate. It is not suitable for production use without robust error handling.&lt;/li&gt;
        &lt;li&gt;A new role named &lt;code&gt;developer&lt;/code&gt; has been introduced in the chat template. This role is dedicated exclusively to search agent scenarios and is designated for no other tasks. The official API does not accept messages assigned to &lt;code&gt;developer&lt;/code&gt;.&lt;/li&gt;
        &lt;/ol&gt;
        &lt;h2&gt;How to Run Locally&lt;/h2&gt;
        &lt;p&gt;The model structure of DeepSeek-V3.2 and DeepSeek-V3.2-Speciale are the same as DeepSeek-V3.2-Exp. Please visit &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V3.2-Exp&quot;&gt;DeepSeek-V3.2-Exp&lt;/a&gt; repo for more information about running this model locally.&lt;/p&gt;
        &lt;p&gt;Usage Recommendations:&lt;/p&gt;
        &lt;ol&gt;
        &lt;li&gt;For local deployment, we recommend setting the sampling parameters to &lt;code&gt;temperature = 1.0, top_p = 0.95&lt;/code&gt;.&lt;/li&gt;
        &lt;li&gt;Please note that the DeepSeek-V3.2-Speciale variant is designed exclusively for deep reasoning tasks and does not support the tool-calling functionality.&lt;/li&gt;
        &lt;/ol&gt;
        &lt;h2&gt;License&lt;/h2&gt;
        &lt;p&gt;This repository and the model weights are licensed under the &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot;&gt;MIT License&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Citation&lt;/h2&gt;
        &lt;pre&gt;&lt;code&gt;@misc{deepseekai2025deepseekv32,
        title={DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models},
        author={DeepSeek-AI},
        year={2025},
        }
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Contact&lt;/h2&gt;
        &lt;p&gt;If you have any questions, please raise an issue or contact us at &lt;a href=&quot;https://huggingface.co/deepseek-ai/service@deepseek.com&quot;&gt;service@deepseek.com&lt;/a&gt;.&lt;/p&gt;
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale</guid>
      <pubDate>Mon, 01 Dec 2025 03:06:03 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2</title>
      <description>deepseek-ai/DeepSeek-V3.2 Text Generation • 685B • Updated 6 days ago • 18.1k • • 752&lt;hr&gt;
        &lt;p&gt;license: mit
        library_name: transformers
        base_model:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;deepseek-ai/DeepSeek-V3.2-Exp-Base
        base_model_relation: finetune&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;DeepSeek-V3.2: Efficient Reasoning &amp;amp; Agentic AI&lt;/h1&gt;
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2/blob/main/assets/paper.pdf&quot;&gt;&lt;b&gt;Technical Report&lt;/b&gt;👁️&lt;/a&gt;
        &lt;/p&gt;
        &lt;h2&gt;Introduction&lt;/h2&gt;
        &lt;p&gt;We introduce &lt;strong&gt;DeepSeek-V3.2&lt;/strong&gt;, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:&lt;/p&gt;
        &lt;ol&gt;
        &lt;li&gt;&lt;strong&gt;DeepSeek Sparse Attention (DSA):&lt;/strong&gt; We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.&lt;/li&gt;
        &lt;li&gt;&lt;strong&gt;Scalable Reinforcement Learning Framework:&lt;/strong&gt; By implementing a robust RL protocol and scaling post-training compute, &lt;em&gt;DeepSeek-V3.2&lt;/em&gt; performs comparably to GPT-5. Notably, our high-compute variant, &lt;strong&gt;DeepSeek-V3.2-Speciale&lt;/strong&gt;, &lt;strong&gt;surpasses GPT-5&lt;/strong&gt; and exhibits reasoning proficiency on par with Gemini-3.0-Pro.
        &lt;ul&gt;
        &lt;li&gt;&lt;em&gt;Achievement:&lt;/em&gt; 🥇 &lt;strong&gt;Gold-medal performance&lt;/strong&gt; in the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI).&lt;/li&gt;
        &lt;/ul&gt;
        &lt;/li&gt;
        &lt;li&gt;&lt;strong&gt;Large-Scale Agentic Task Synthesis Pipeline:&lt;/strong&gt; To integrate &lt;strong&gt;reasoning into tool-use&lt;/strong&gt; scenarios, we developed a novel synthesis pipeline that systematically generates training data at scale. This facilitates scalable agentic post-training, improving compliance and generalization in complex interactive environments.&lt;/li&gt;
        &lt;/ol&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2/resolve/main/assets/benchmark.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;p&gt;We have also released the final submissions for IOI 2025, ICPC World Finals, IMO 2025 and CMO 2025, which were selected based on our designed pipeline. These materials are provided for the community to conduct secondary verification. The files can be accessed at &lt;code&gt;assets/olympiad_cases&lt;/code&gt;.&lt;/p&gt;
        &lt;h2&gt;Chat Template&lt;/h2&gt;
        &lt;p&gt;DeepSeek-V3.2 introduces significant updates to its chat template compared to prior versions. The primary changes involve a revised format for tool calling and the introduction of a &quot;thinking with tools&quot; capability.&lt;/p&gt;
        &lt;p&gt;To assist the community in understanding and adapting to this new template, we have provided a dedicated &lt;code&gt;encoding&lt;/code&gt; folder, which contains Python scripts and test cases demonstrating how to encode messages in OpenAI-compatible format into input strings for the model and how to parse the model&#39;s text output.&lt;/p&gt;
        &lt;p&gt;A brief example is illustrated below:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import transformers
        # encoding/encoding_dsv32.py
        from encoding_dsv32 import encode_messages, parse_message_from_completion_text
        tokenizer = transformers.AutoTokenizer.from_pretrained(&quot;deepseek-ai/DeepSeek-V3.2&quot;)
        messages = [
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;hello&quot;},
        {&quot;role&quot;: &quot;assistant&quot;, &quot;content&quot;: &quot;Hello! I am DeepSeek.&quot;, &quot;reasoning_content&quot;: &quot;thinking...&quot;},
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;1+1=?&quot;}
        ]
        encode_config = dict(thinking_mode=&quot;thinking&quot;, drop_thinking=True, add_default_bos_token=True)
        # messages -&amp;gt; string
        prompt = encode_messages(messages, **encode_config)
        # Output: &quot;&amp;lt;｜begin▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;hello&amp;lt;｜Assistant｜&amp;gt;&amp;lt;/think&amp;gt;Hello! I am DeepSeek.&amp;lt;｜end▁of▁sentence｜&amp;gt;&amp;lt;｜User｜&amp;gt;1+1=?&amp;lt;｜Assistant｜&amp;gt;&amp;lt;think&amp;gt;&quot;
        # string -&amp;gt; tokens
        tokens = tokenizer.encode(prompt)
        # Output: [0, 128803, 33310, 128804, 128799, 19923, 3, 342, 1030, 22651, 4374, 1465, 16, 1, 128803, 19, 13, 19, 127252, 128804, 128798]
        &lt;/code&gt;&lt;/pre&gt;
        &lt;p&gt;Important Notes:&lt;/p&gt;
        &lt;ol&gt;
        &lt;li&gt;This release does not include a Jinja-format chat template. Please refer to the Python code mentioned above.&lt;/li&gt;
        &lt;li&gt;The output parsing function included in the code is designed to handle well-formatted strings only. It does not attempt to correct or recover from malformed output that the model might occasionally generate. It is not suitable for production use without robust error handling.&lt;/li&gt;
        &lt;li&gt;A new role named &lt;code&gt;developer&lt;/code&gt; has been introduced in the chat template. This role is dedicated exclusively to search agent scenarios and is designated for no other tasks. The official API does not accept messages assigned to &lt;code&gt;developer&lt;/code&gt;.&lt;/li&gt;
        &lt;/ol&gt;
        &lt;h2&gt;How to Run Locally&lt;/h2&gt;
        &lt;p&gt;The model structure of DeepSeek-V3.2 and DeepSeek-V3.2-Speciale are the same as DeepSeek-V3.2-Exp. Please visit &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V3.2-Exp&quot;&gt;DeepSeek-V3.2-Exp&lt;/a&gt; repo for more information about running this model locally.&lt;/p&gt;
        &lt;p&gt;Usage Recommendations:&lt;/p&gt;
        &lt;ol&gt;
        &lt;li&gt;For local deployment, we recommend setting the sampling parameters to &lt;code&gt;temperature = 1.0, top_p = 0.95&lt;/code&gt;.&lt;/li&gt;
        &lt;li&gt;Please note that the DeepSeek-V3.2-Speciale variant is designed exclusively for deep reasoning tasks and does not support the tool-calling functionality.&lt;/li&gt;
        &lt;/ol&gt;
        &lt;h2&gt;License&lt;/h2&gt;
        &lt;p&gt;This repository and the model weights are licensed under the &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot;&gt;MIT License&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Citation&lt;/h2&gt;
        &lt;pre&gt;&lt;code&gt;@misc{deepseekai2025deepseekv32,
        title={DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models},
        author={DeepSeek-AI},
        year={2025},
        }
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Contact&lt;/h2&gt;
        &lt;p&gt;If you have any questions, please raise an issue or contact us at &lt;a href=&quot;https://huggingface.co/deepseek-ai/service@deepseek.com&quot;&gt;service@deepseek.com&lt;/a&gt;.&lt;/p&gt;
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2</guid>
      <pubDate>Mon, 01 Dec 2025 03:04:59 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-Math-V2</title>
      <description>deepseek-ai/DeepSeek-Math-V2 Text Generation • 685B • Updated 10 days ago • 8.96k • 637&lt;hr&gt;
        &lt;p&gt;license: apache-2.0
        library_name: transformers
        base_model:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;deepseek-ai/DeepSeek-Math-V2&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot;&gt;&lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot;&gt;&lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot;&gt;&lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;br&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot;&gt;&lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot;&gt;&lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot;&gt;&lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;
        &lt;br&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-Apache%202.0-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;br&gt;
        &lt;/div&gt;
        &lt;h1&gt;DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning&lt;/h1&gt;
        &lt;h2&gt;1. Introduction&lt;/h2&gt;
        &lt;p&gt;Large language models have made significant progress in mathematical reasoning, which serves as an important testbed for AI and could impact scientific research if further advanced.
        By scaling reasoning with reinforcement learning that rewards correct final answers, LLMs have improved from poor performance to saturating quantitative reasoning competitions like AIME and HMMT in one year.
        However, this approach faces fundamental limitations.
        Pursuing higher final answer accuracy doesn&#39;t address a key issue: correct answers don&#39;t guarantee correct reasoning.
        Moreover, many mathematical tasks like theorem proving require rigorous step-by-step derivation rather than numerical answers, making final answer rewards inapplicable.
        To push the limits of deep reasoning, we believe it is necessary to verify the comprehensiveness and rigor of mathematical reasoning.
        Self-verification is particularly important for scaling test-time compute, especially for open problems without known solutions.
        Towards self-verifiable mathematical reasoning, we investigate how to train an accurate and faithful LLM-based verifier for theorem proving.
        We then train a proof generator using the verifier as the reward model, and incentivize the generator to identify and resolve as many issues as possible in their own proofs before finalizing them.
        To maintain the generation-verification gap as the generator becomes stronger, we propose to scale verification compute to automatically label new hard-to-verify proofs, creating training data to further improve the verifier.
        Our resulting model, DeepSeekMath-V2, demonstrates strong theorem-proving capabilities, achieving gold-level scores on IMO 2025 and CMO 2024 and a near-perfect 118/120 on Putnam 2024 with scaled test-time compute.
        While much work remains, these results suggest that self-verifiable mathematical reasoning is a feasible research direction that may help develop more capable mathematical AI systems.&lt;/p&gt;
        &lt;h2&gt;2. Evaluation Results&lt;/h2&gt;
        &lt;p&gt;Below are evaluation results on &lt;a href=&quot;https://github.com/google-deepmind/superhuman/tree/main/imobench&quot;&gt;IMO-ProofBench&lt;/a&gt; (developed by the DeepMind team behind DeepThink IMO-Gold) and recent mathematics competitions including IMO 2025, CMO 2024, and Putnam 2024.&lt;/p&gt;
        &lt;p&gt;&lt;strong&gt;IMO-ProofBench&lt;/strong&gt;&lt;/p&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;img width=&quot;100%&quot; src=&quot;https://raw.githubusercontent.com/deepseek-ai/DeepSeek-Math-V2/refs/heads/main/figures/IMO-ProofBench.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/p&gt;
        &lt;hr&gt;
        &lt;p&gt;&lt;strong&gt;Mathematics Competitions&lt;/strong&gt;&lt;/p&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;img width=&quot;41%&amp;quot;&quot; src=&quot;https://raw.githubusercontent.com/deepseek-ai/DeepSeek-Math-V2/refs/heads/main/figures/Competitions.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/p&gt;
        &lt;h2&gt;4. Quick Start&lt;/h2&gt;
        &lt;p&gt;DeepSeekMath-V2 is built on top of DeepSeek-V3.2-Exp-Base.
        For inference support, please refer to &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V3.2-Exp&quot;&gt;the DeepSeek-V3.2-Exp github repository&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;6. License&lt;/h2&gt;
        &lt;p&gt;This repository and the model weights are licensed under &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot;&gt;the Apache License, Version 2.0 (Apache 2.0)&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;7. Citation&lt;/h2&gt;
        &lt;pre&gt;&lt;code&gt;@misc{deepseek-math-v2,
        author = {Zhihong Shao, Yuxiang Luo, Chengda Lu, Z.Z. Ren, Jiewen Hu, Tian Ye, Zhibin Gou, Shirong Ma, Xiaokang Zhang},
        title = {DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning},
        year = {2025},
        }
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;8. Contact&lt;/h2&gt;
        &lt;p&gt;If you have any questions, please raise an issue or contact us at &lt;a href=&quot;mailto:service@deepseek.com&quot;&gt;service@deepseek.com&lt;/a&gt;.&lt;/p&gt;
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-Math-V2</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-Math-V2</guid>
      <pubDate>Thu, 27 Nov 2025 02:35:52 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2-Exp</title>
      <description>deepseek-ai/DeepSeek-V3.2-Exp Text Generation • 685B • Updated 19 days ago • 66.2k • • 900&lt;hr&gt;
        &lt;p&gt;license: mit
        library_name: transformers
        base_model:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;deepseek-ai/DeepSeek-V3.2-Exp-Base
        base_model_relation: finetune&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;DeepSeek-V3.2-Exp&lt;/h1&gt;
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;h2&gt;Introduction&lt;/h2&gt;
        &lt;p&gt;We are excited to announce the official release of DeepSeek-V3.2-Exp, an experimental version of our model. As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.&lt;/p&gt;
        &lt;p&gt;This experimental release represents our ongoing research into more efficient transformer architectures, particularly focusing on improving computational efficiency when processing extended text sequences.&lt;/p&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp/resolve/main/assets/cost.png&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;ul&gt;
        &lt;li&gt;
        &lt;p&gt;DeepSeek Sparse Attention (DSA) achieves fine-grained sparse attention for the first time, delivering substantial improvements in long-context training and inference efficiency while maintaining virtually identical model output quality.&lt;/p&gt;
        &lt;/li&gt;
        &lt;li&gt;
        &lt;p&gt;To rigorously evaluate the impact of introducing sparse attention, we deliberately aligned the training configurations of DeepSeek-V3.2-Exp with V3.1-Terminus. Across public benchmarks in various domains, DeepSeek-V3.2-Exp demonstrates performance on par with V3.1-Terminus.&lt;/p&gt;
        &lt;/li&gt;
        &lt;/ul&gt;
        &lt;table&gt;
        &lt;thead&gt;
        &lt;tr&gt;
        &lt;th style=&quot;text-align:left&quot;&gt;Benchmark&lt;/th&gt;
        &lt;th style=&quot;text-align:center&quot;&gt;DeepSeek-V3.1-Terminus&lt;/th&gt;
        &lt;th style=&quot;text-align:center&quot;&gt;DeepSeek-V3.2-Exp&lt;/th&gt;
        &lt;/tr&gt;
        &lt;/thead&gt;
        &lt;tbody&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;&lt;strong&gt;Reasoning Mode w/o Tool Use&lt;/strong&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;MMLU-Pro&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;85.0&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;85.0&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;GPQA-Diamond&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;80.7&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;79.9&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;Humanity&#39;s Last Exam&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;21.7&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;19.8&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;LiveCodeBench&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;74.9&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;74.1&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;AIME 2025&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;88.4&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;89.3&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;HMMT 2025&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;86.1&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;83.6&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;Codeforces&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;2046&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;2121&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;Aider-Polyglot&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;76.1&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;74.5&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;&lt;strong&gt;Agentic Tool Use&lt;/strong&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;BrowseComp&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;38.5&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;40.1&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;BrowseComp-zh&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;45.0&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;47.9&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;SimpleQA&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;96.8&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;97.1&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;SWE Verified&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;68.4&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;67.8&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;SWE-bench Multilingual&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;57.8&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;57.9&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;Terminal-bench&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;36.7&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;37.7&lt;/td&gt;
        &lt;/tr&gt;
        &lt;/tbody&gt;
        &lt;/table&gt;
        &lt;h2&gt;Update&lt;/h2&gt;
        &lt;ul&gt;
        &lt;li&gt;2025.11.17: &lt;strong&gt;We have identified that previous versions of the inference demo code contained an implementation discrepancy in Rotary Position Embedding (RoPE) within the indexer module, potentially leading to degraded model performance.&lt;/strong&gt; Specifically, the input tensor to RoPE in the indexer module requires a non-interleaved layout, whereas RoPE in the MLA module expects an interleaved layout. This issue has now been resolved. Please refer to the updated version of the inference demo code and take note of this implementation detail.&lt;/li&gt;
        &lt;/ul&gt;
        &lt;h2&gt;How to Run Locally&lt;/h2&gt;
        &lt;h3&gt;HuggingFace&lt;/h3&gt;
        &lt;p&gt;We provide an updated inference demo code in the &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp/tree/main/inference&quot;&gt;inference&lt;/a&gt; folder to help the community quickly get started with our model and understand its architectural details.&lt;/p&gt;
        &lt;p&gt;First convert huggingface model weights to the the format required by our inference demo. Set &lt;code&gt;MP&lt;/code&gt; to match your available GPU count:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;cd inference
        export EXPERTS=256
        python convert.py --hf-ckpt-path ${HF_CKPT_PATH} --save-path ${SAVE_PATH} --n-experts ${EXPERTS} --model-parallel ${MP}
        &lt;/code&gt;&lt;/pre&gt;
        &lt;p&gt;Launch the interactive chat interface and start exploring DeepSeek&#39;s capabilities:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;export CONFIG=config_671B_v3.2.json
        torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --interactive
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h3&gt;SGLang&lt;/h3&gt;
        &lt;h4&gt;Installation with Docker&lt;/h4&gt;
        &lt;pre&gt;&lt;code&gt;# H200
        docker pull lmsysorg/sglang:dsv32
        # MI350
        docker pull lmsysorg/sglang:dsv32-rocm
        # NPUs
        docker pull lmsysorg/sglang:dsv32-a2
        docker pull lmsysorg/sglang:dsv32-a3
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h4&gt;Launch Command&lt;/h4&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;python -m sglang.launch_server --model deepseek-ai/DeepSeek-V3.2-Exp --tp 8 --dp 8 --enable-dp-attention
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h3&gt;vLLM&lt;/h3&gt;
        &lt;p&gt;vLLM provides day-0 support of DeepSeek-V3.2-Exp. See the &lt;a href=&quot;https://docs.vllm.ai/projects/recipes/en/latest/DeepSeek/DeepSeek-V3_2-Exp.html&quot;&gt;recipes&lt;/a&gt; for up-to-date details.&lt;/p&gt;
        &lt;h2&gt;Open-Source Kernels&lt;/h2&gt;
        &lt;p&gt;For TileLang kernels with &lt;strong&gt;better readability and research-purpose design&lt;/strong&gt;, please refer to &lt;a href=&quot;https://github.com/tile-ai/tilelang/tree/main/examples/deepseek_v32&quot;&gt;TileLang&lt;/a&gt;.&lt;/p&gt;
        &lt;p&gt;For &lt;strong&gt;high-performance CUDA kernels&lt;/strong&gt;, indexer logit kernels (including paged versions) are available in &lt;a href=&quot;https://github.com/deepseek-ai/DeepGEMM/pull/200&quot;&gt;DeepGEMM&lt;/a&gt;. Sparse attention kernels are released in &lt;a href=&quot;https://github.com/deepseek-ai/FlashMLA/pull/98&quot;&gt;FlashMLA&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;License&lt;/h2&gt;
        &lt;p&gt;This repository and the model weights are licensed under the &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot;&gt;MIT License&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Citation&lt;/h2&gt;
        &lt;pre&gt;&lt;code&gt;@misc{deepseekai2024deepseekv32,
        title={DeepSeek-V3.2-Exp: Boosting Long-Context Efficiency with DeepSeek Sparse Attention},
        author={DeepSeek-AI},
        year={2025},
        }
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Contact&lt;/h2&gt;
        &lt;p&gt;If you have any questions, please raise an issue or contact us at &lt;a href=&quot;https://huggingface.co/deepseek-ai/service@deepseek.com&quot;&gt;service@deepseek.com&lt;/a&gt;.&lt;/p&gt;
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp</guid>
      <pubDate>Mon, 17 Nov 2025 18:39:54 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-OCR</title>
      <description>deepseek-ai/DeepSeek-OCR Image-Text-to-Text • 3B • Updated Nov 4 • 5.47M • 2.93k&lt;hr&gt;
        &lt;p&gt;pipeline_tag: image-text-to-text
        language:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;multilingual
        tags:&lt;/li&gt;
        &lt;li&gt;deepseek&lt;/li&gt;
        &lt;li&gt;vision-language&lt;/li&gt;
        &lt;li&gt;ocr&lt;/li&gt;
        &lt;li&gt;custom_code
        license: mit
        library_name: transformers&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek AI&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-OCR&quot;&gt;&lt;b&gt;🌟 Github&lt;/b&gt;&lt;/a&gt; |
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR&quot;&gt;&lt;b&gt;📥 Model Download&lt;/b&gt;&lt;/a&gt; |
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-OCR/blob/main/DeepSeek_OCR_paper.pdf&quot;&gt;&lt;b&gt;📄 Paper Link&lt;/b&gt;&lt;/a&gt; |
        &lt;a href=&quot;https://arxiv.org/abs/2510.18234&quot;&gt;&lt;b&gt;📄 Arxiv Paper Link&lt;/b&gt;&lt;/a&gt; |
        &lt;/p&gt;
        &lt;h2&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/papers/2510.18234&quot;&gt;DeepSeek-OCR: Contexts Optical Compression&lt;/a&gt;
        &lt;/p&gt;
        &lt;/h2&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR/resolve/main/assets/fig1.png&quot; style=&quot;width: 1000px&quot; align=&quot;center&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/p&gt;
        &lt;p align=&quot;center&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/papers/2510.18234&quot;&gt;Explore the boundaries of visual-text compression.&lt;/a&gt;
        &lt;/p&gt;
        &lt;h2&gt;Usage&lt;/h2&gt;
        &lt;p&gt;Inference using Huggingface transformers on NVIDIA GPUs. Requirements tested on python 3.12.9 + CUDA11.8：&lt;/p&gt;
        &lt;pre&gt;&lt;code&gt;torch==2.6.0
        transformers==4.46.3
        tokenizers==0.20.3
        einops
        addict
        easydict
        pip install flash-attn==2.7.3 --no-build-isolation
        &lt;/code&gt;&lt;/pre&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from transformers import AutoModel, AutoTokenizer
        import torch
        import os
        os.environ[&quot;CUDA_VISIBLE_DEVICES&quot;] = &#39;0&#39;
        model_name = &#39;deepseek-ai/DeepSeek-OCR&#39;
        tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
        model = AutoModel.from_pretrained(model_name, _attn_implementation=&#39;flash_attention_2&#39;, trust_remote_code=True, use_safetensors=True)
        model = model.eval().cuda().to(torch.bfloat16)
        # prompt = &quot;&amp;lt;image&amp;gt;\nFree OCR. &quot;
        prompt = &quot;&amp;lt;image&amp;gt;\n&amp;lt;|grounding|&amp;gt;Convert the document to markdown. &quot;
        image_file = &#39;your_image.jpg&#39;
        output_path = &#39;your/output/dir&#39;
        # infer(self, tokenizer, prompt=&#39;&#39;, image_file=&#39;&#39;, output_path = &#39; &#39;, base_size = 1024, image_size = 640, crop_mode = True, test_compress = False, save_results = False):
        # Tiny: base_size = 512, image_size = 512, crop_mode = False
        # Small: base_size = 640, image_size = 640, crop_mode = False
        # Base: base_size = 1024, image_size = 1024, crop_mode = False
        # Large: base_size = 1280, image_size = 1280, crop_mode = False
        # Gundam: base_size = 1024, image_size = 640, crop_mode = True
        res = model.infer(tokenizer, prompt=prompt, image_file=image_file, output_path = output_path, base_size = 1024, image_size = 640, crop_mode=True, save_results = True, test_compress = True)
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;vLLM&lt;/h2&gt;
        &lt;p&gt;Refer to &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-OCR/&quot;&gt;🌟GitHub&lt;/a&gt; for guidance on model inference acceleration and PDF processing, etc.&lt;!-- --&gt;&lt;/p&gt;
        &lt;p&gt;[2025/10/23] 🚀🚀🚀 DeepSeek-OCR is now officially supported in upstream &lt;a href=&quot;https://docs.vllm.ai/projects/recipes/en/latest/DeepSeek/DeepSeek-OCR.html#installing-vllm&quot;&gt;vLLM&lt;/a&gt;.&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;uv venv
        source .venv/bin/activate
        # Until v0.11.1 release, you need to install vLLM from nightly build
        uv pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
        &lt;/code&gt;&lt;/pre&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from vllm import LLM, SamplingParams
        from vllm.model_executor.models.deepseek_ocr import NGramPerReqLogitsProcessor
        from PIL import Image
        # Create model instance
        llm = LLM(
        model=&quot;deepseek-ai/DeepSeek-OCR&quot;,
        enable_prefix_caching=False,
        mm_processor_cache_gb=0,
        logits_processors=[NGramPerReqLogitsProcessor]
        )
        # Prepare batched input with your image file
        image_1 = Image.open(&quot;path/to/your/image_1.png&quot;).convert(&quot;RGB&quot;)
        image_2 = Image.open(&quot;path/to/your/image_2.png&quot;).convert(&quot;RGB&quot;)
        prompt = &quot;&amp;lt;image&amp;gt;\nFree OCR.&quot;
        model_input = [
        {
        &quot;prompt&quot;: prompt,
        &quot;multi_modal_data&quot;: {&quot;image&quot;: image_1}
        },
        {
        &quot;prompt&quot;: prompt,
        &quot;multi_modal_data&quot;: {&quot;image&quot;: image_2}
        }
        ]
        sampling_param = SamplingParams(
        temperature=0.0,
        max_tokens=8192,
        # ngram logit processor args
        extra_args=dict(
        ngram_size=30,
        window_size=90,
        whitelist_token_ids={128821, 128822}, # whitelist: &amp;lt;td&amp;gt;, &amp;lt;/td&amp;gt;
        ),
        skip_special_tokens=False,
        )
        # Generate output
        model_outputs = llm.generate(model_input, sampling_param)
        # Print output
        for output in model_outputs:
        print(output.outputs[0].text)
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Visualizations&lt;/h2&gt;
        &lt;table&gt;
        &lt;tbody&gt;&lt;tr&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR/resolve/main/assets/show1.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR/resolve/main/assets/show2.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR/resolve/main/assets/show3.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;td&gt;&lt;img src=&quot;https://huggingface.co/deepseek-ai/DeepSeek-OCR/resolve/main/assets/show4.jpg&quot; style=&quot;width: 500px&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;/tbody&gt;&lt;/table&gt;
        &lt;h2&gt;Acknowledgement&lt;/h2&gt;
        &lt;p&gt;We would like to thank &lt;a href=&quot;https://github.com/Ucas-HaoranWei/Vary/&quot;&gt;Vary&lt;/a&gt;, &lt;a href=&quot;https://github.com/Ucas-HaoranWei/GOT-OCR2.0/&quot;&gt;GOT-OCR2.0&lt;/a&gt;, &lt;a href=&quot;https://github.com/opendatalab/MinerU&quot;&gt;MinerU&lt;/a&gt;, &lt;a href=&quot;https://github.com/PaddlePaddle/PaddleOCR&quot;&gt;PaddleOCR&lt;/a&gt;, &lt;a href=&quot;https://github.com/LingyvKong/OneChart&quot;&gt;OneChart&lt;/a&gt;, &lt;a href=&quot;https://github.com/Ucas-HaoranWei/Slow-Perception&quot;&gt;Slow Perception&lt;/a&gt; for their valuable models and ideas.&lt;/p&gt;
        &lt;p&gt;We also appreciate the benchmarks: &lt;a href=&quot;https://github.com/ucaslcl/Fox&quot;&gt;Fox&lt;/a&gt;, &lt;a href=&quot;https://github.com/opendatalab/OmniDocBench&quot;&gt;OminiDocBench&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Citation&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@article{wei2025deepseek,
        title={DeepSeek-OCR: Contexts Optical Compression},
        author={Wei, Haoran and Sun, Yaofeng and Li, Yukun},
        journal={arXiv preprint arXiv:2510.18234},
        year={2025}
        }&lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-OCR</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-OCR</guid>
      <pubDate>Mon, 03 Nov 2025 18:36:12 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.2-Exp-Base</title>
      <description>deepseek-ai/DeepSeek-V3.2-Exp-Base Text Generation • 685B • Updated Oct 9 • 217 • 48&lt;hr&gt;
        &lt;h2&gt;license: mit
        library_name: transformers&lt;/h2&gt;
      </description>
      <link>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp-Base</link>
      <guid isPermaLink="false">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp-Base</guid>
      <pubDate>Wed, 08 Oct 2025 18:09:46 GMT</pubDate>
    </item>
    <item>
      <title>deepseek-ai/DeepSeek-V3.1-Terminus</title>
      <description>deepseek-ai/DeepSeek-V3.1-Terminus Text Generation • 685B • Updated Sep 29 • 81k • • 354&lt;hr&gt;
        &lt;p&gt;license: mit
        library_name: transformers
        base_model:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;deepseek-ai/DeepSeek-V3.1-Base&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;DeepSeek-V3.1-Terminus&lt;/h1&gt;
        &lt;!-- markdownlint-disable first-line-h1 --&gt;
        &lt;!-- markdownlint-disable html --&gt;
        &lt;!-- markdownlint-disable no-duplicate-header --&gt;
        &lt;div align=&quot;center&quot;&gt;
        &lt;img src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true&quot; width=&quot;60%&quot; alt=&quot;DeepSeek-V3&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/div&gt;
        &lt;hr&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://www.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Homepage&quot; src=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://chat.deepseek.com/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Chat&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%96%20Chat-DeepSeek%20V3-536af5?color=536af5&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Hugging Face&quot; src=&quot;https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://discord.gg/Tc7c45Zzu5&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Discord&quot; src=&quot;https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&amp;amp;logoColor=white&amp;amp;color=7289da&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Wechat&quot; src=&quot;https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;a href=&quot;https://twitter.com/deepseek_ai&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;Twitter Follow&quot; src=&quot;https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&amp;amp;logoColor=white&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;div align=&quot;center&quot; style=&quot;line-height: 1;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/deepseek-ai/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        &lt;img alt=&quot;License&quot; src=&quot;https://img.shields.io/badge/License-MIT-f5de53?&amp;amp;color=f5de53&quot; style=&quot;display: inline-block; vertical-align: middle;&quot; referrerpolicy=&quot;no-referrer&quot;&gt;
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;h2&gt;Introduction&lt;/h2&gt;
        &lt;p&gt;This update maintains the model&#39;s original capabilities while addressing issues reported by users, including:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;Language consistency: Reducing instances of mixed Chinese-English text and occasional abnormal characters;&lt;/li&gt;
        &lt;li&gt;Agent capabilities: Further optimizing the performance of the Code Agent and Search Agent.&lt;/li&gt;
        &lt;/ul&gt;
        &lt;table&gt;
        &lt;thead&gt;
        &lt;tr&gt;
        &lt;th style=&quot;text-align:left&quot;&gt;Benchmark&lt;/th&gt;
        &lt;th style=&quot;text-align:center&quot;&gt;DeepSeek-V3.1&lt;/th&gt;
        &lt;th style=&quot;text-align:center&quot;&gt;DeepSeek-V3.1-Terminus&lt;/th&gt;
        &lt;/tr&gt;
        &lt;/thead&gt;
        &lt;tbody&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;&lt;strong&gt;Reasoning Mode w/o Tool Use&lt;/strong&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;MMLU-Pro&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;84.8&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;85.0&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;GPQA-Diamond&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;80.1&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;80.7&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;Humanity&#39;s Last Exam&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;15.9&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;21.7&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;LiveCodeBench&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;74.8&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;74.9&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;Codeforces&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;2091&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;2046&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;Aider-Polyglot&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;76.3&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;76.1&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;&lt;strong&gt;Agentic Tool Use&lt;/strong&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;BrowseComp&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;30.0&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;38.5&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;BrowseComp-zh&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;49.2&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;45.0&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;SimpleQA&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;93.4&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;96.8&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;SWE Verified&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;66.0&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;68.4&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td style=&quot;text-align:left&quot;&gt;SWE-bench Multilingual&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;54.5&lt;/td&gt;
        &lt;td style=&quot;text-align:center&quot;&gt;57.8&lt;/td&gt;

...

github-actions · 2025-12-07T03:02:33Z

http://localhost:1200/huggingface/models/facebook - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Huggingface facebook Models</title>
    <link>https://huggingface.co/facebook/models?sort=created</link>
    <atom:link href="http://localhost:1200/huggingface/models/facebook" rel="self" type="application/rss+xml"></atom:link>
    <description>Huggingface facebook Models - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sun, 07 Dec 2025 03:02:30 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>facebook/testing_instructions</title>
      <description>facebook/testing_instructions Updated 3 days ago • 2&lt;hr&gt;
        &lt;pre&gt;&lt;code&gt;extra_gated_fields:
        First Name: text
        Last Name: text
        Date of birth: date_picker
        Country: country
        Affiliation: text
        I accept the terms and conditions: checkbox
        geo: ip_location
        Test request?: checkbox
        By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox
        extra_gated_description: &amp;gt;-
        The information you provide will be collected, stored, processed and shared in
        accordance with the [Meta Privacy
        Policy](https://www.facebook.com/privacy/policy/).
        extra_gated_button_content: Submit
        extra_gated_heading: &quot;Please be sure to provide your full legal name, date of birth, and full organization name with all corporate identifiers. Avoid the use of acronyms and special characters. Failure to follow these instructions may prevent you from accessing this model and others on Hugging Face. You will not have the ability to edit this form after submission, so please ensure all information is accurate.&quot;
        language:
        - en
        tags:
        - meta-ai
        - meta-pytorch
        license: fair-noncommercial-research-license
        ---
        &lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/facebook/testing_instructions</link>
      <guid isPermaLink="false">https://huggingface.co/facebook/testing_instructions</guid>
      <pubDate>Thu, 04 Dec 2025 00:10:57 GMT</pubDate>
    </item>
    <item>
      <title>facebook/omniASR-LLM-7B-ZS</title>
      <description>facebook/omniASR-LLM-7B-ZS Automatic Speech Recognition • Updated 9 days ago • 3&lt;hr&gt;
        &lt;p&gt;license: apache-2.0
        datasets:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;facebook/omnilingual-asr-corpus
        pipeline_tag: automatic-speech-recognition&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages&lt;/h1&gt;
        &lt;div align=&quot;center&quot; style=&quot;lline-height: 1.2; font-size:16px; margin-bottom: 30px;&quot;&gt;
        &lt;a href=&quot;https://huggingface.co/facebook&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        🤗 Hugging Face
        &lt;/a&gt; |
        &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        🐙 GitHub
        &lt;/a&gt; |
        &lt;a href=&quot;https://huggingface.co/spaces/facebook/omniasr-transcriptions&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        🤖️ Demo
        &lt;/a&gt; |
        &lt;a href=&quot;https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        📃 Paper
        &lt;/a&gt; |
        &lt;a href=&quot;https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
        📝 Blogpost
        &lt;/a&gt; |
        &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
        📄 License: Apache 2.0
        &lt;/a&gt;
        &lt;/div&gt;
        &lt;h1&gt;Model Card for omniASR-LLM-7B-ZS&lt;/h1&gt;
        &lt;h2&gt;Model Description&lt;/h2&gt;
        &lt;p&gt;This model is part of the &lt;strong&gt;Omnilingual ASR&lt;/strong&gt; family released by Meta AI. The original suite includes:&lt;/p&gt;
        &lt;!-- TODO : add new tokenizer, we&#39;ll get two tokenizer, add mssing speed numbers--&gt;
        &lt;table&gt;
        &lt;thead&gt;
        &lt;tr&gt;
        &lt;th&gt;Model Name&lt;/th&gt;
        &lt;th&gt;Features&lt;/th&gt;
        &lt;th style=&quot;text-align:right&quot;&gt;Parameters&lt;/th&gt;
        &lt;th style=&quot;text-align:right&quot;&gt;Download Size (FP32)&lt;/th&gt;
        &lt;th style=&quot;text-align:right&quot;&gt;Inference VRAM¹&lt;/th&gt;
        &lt;th style=&quot;text-align:right&quot;&gt;Real-Time Factor¹ (relative speed)²&lt;/th&gt;
        &lt;/tr&gt;
        &lt;/thead&gt;
        &lt;tbody&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-300M&quot;&gt;&lt;code&gt;omniASR_W2V_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;SSL&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;317_390_592&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;1.2 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-1B&quot;&gt;&lt;code&gt;omniASR_W2V_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;SSL&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;965_514_752&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;3.6 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-3B&quot;&gt;&lt;code&gt;omniASR_W2V_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;SSL&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;3_064_124_672&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;12.0 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-7B&quot;&gt;&lt;code&gt;omniASR_W2V_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;SSL&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;6_488_487_168&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;25.0 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-300M&quot;&gt;&lt;code&gt;omniASR_CTC_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;ASR&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;325_494_996&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;1.3 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;~2 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;0.001 (96x)&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-1B&quot;&gt;&lt;code&gt;omniASR_CTC_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;ASR&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;975_065_300&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;3.7 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;~3 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;0.002 (48x)&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-3B&quot;&gt;&lt;code&gt;omniASR_CTC_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;ASR&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;3_080_423_636&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;12.0 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;~8 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;0.003 (32x)&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-7B&quot;&gt;&lt;code&gt;omniASR_CTC_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;ASR&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;6_504_786_132&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;25.0 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;~15 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;0.006 (16x)&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-300M&quot;&gt;&lt;code&gt;omniASR_LLM_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;1_627_603_584&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;6.1 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;~5 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;0.090 (~1x)&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-1B&quot;&gt;&lt;code&gt;omniASR_LLM_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;2_275_710_592&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;8.5 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;~6 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;0.091 (~1x)&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-3B&quot;&gt;&lt;code&gt;omniASR_LLM_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;4_376_679_040&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;17.0 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;~10 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;0.093 (~1x)&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-7B&quot;&gt;&lt;code&gt;omniASR_LLM_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;7_801_041_536&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;30.0 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;~17 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;0.092 (~1x)&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-7B-ZS&quot;&gt;&lt;code&gt;omniASR_LLM_7B_ZS&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;Zero-Shot ASR&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;7_810_900_608&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;30.0 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;~20 GiB&lt;/td&gt;
        &lt;td style=&quot;text-align:right&quot;&gt;0.194 (~0.5x)&lt;/td&gt;
        &lt;/tr&gt;
        &lt;/tbody&gt;
        &lt;/table&gt;
        &lt;p&gt;¹ (batch=1, audio_len=30s, BF16, A100)&lt;/p&gt;
        &lt;p&gt;² Relative speed to &lt;code&gt;omniASR_LLM_7B&lt;/code&gt;&lt;/p&gt;
        &lt;hr&gt;
        &lt;h2&gt;Installation&lt;/h2&gt;
        &lt;p&gt;The models were developed using &lt;a href=&quot;https://github.com/facebookresearch/fairseq2&quot;&gt;fairseq2&lt;/a&gt;, a research-focused sequence modeling toolkit. While we provide a &lt;strong&gt;reference&lt;/strong&gt; inference pipeline that works across platforms, audio support requires &lt;a href=&quot;https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#system-dependencies&quot;&gt;libsndfile&lt;/a&gt; (Mac: &lt;code&gt;brew install libsndfile&lt;/code&gt;; Windows may need an additional &lt;a href=&quot;https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#installing-on-windows&quot;&gt;setup&lt;/a&gt;).&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# using pip
        pip install omnilingual-asr
        # using uv
        uv add omnilingual-asr
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Inference&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
        pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
        audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
        lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
        transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Supported Languages&lt;/h2&gt;
        &lt;p&gt;To view the full list of 1600+ supported languages, you can access the language list &lt;a href=&quot;https://huggingface.co/src/omnilingual_asr/models/wav2vec2_llama/lang_ids.py&quot;&gt;programmatically&lt;/a&gt;:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
        # Print all supported languages
        print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
        print(supported_langs)
        # Check if a specific language is supported
        if &quot;eng_Latn&quot; in supported_langs:
        print(&quot;English (Latin script) is supported!&quot;)
        &lt;/code&gt;&lt;/pre&gt;
        &lt;p&gt;Languages follow the format &lt;code&gt;{language_code}_{script}&lt;/code&gt;, for example &lt;code&gt;eng_Latn&lt;/code&gt; - English (Latin script), &lt;code&gt;cmn_Hans&lt;/code&gt; - Mandarin Chinese (Simplified), ...&lt;/p&gt;
        &lt;hr&gt;
        &lt;h2&gt;Training&lt;/h2&gt;
        &lt;p&gt;To further finetune the released checkpoints on your own data, use our &lt;a href=&quot;https://huggingface.co/workflows/dataprep/README.md&quot;&gt;data preparation guide&lt;/a&gt; followed by the &lt;a href=&quot;https://huggingface.co/workflows/recipes/wav2vec2/asr/README.md&quot;&gt;finetuning recipe guide&lt;/a&gt;.&lt;/p&gt;
        &lt;hr&gt;
        &lt;h2&gt;Citation&lt;/h2&gt;
        &lt;p&gt;&lt;strong&gt;BibTeX:&lt;/strong&gt;&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{omnilingualasr2025,
        title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Developed by:&lt;/strong&gt; Meta AI / Omnilingual ASR Team(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model type:&lt;/strong&gt; End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Language(s) (NLP):&lt;/strong&gt; 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers &lt;strong&gt;348 under-served languages&lt;/strong&gt; across many writing systems (Latin, Arabic, Devanagari, etc.).(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;License:&lt;/strong&gt; Apache-2.0 (for the model and code), CC-BY-4.0 for the &lt;code&gt;facebook/omnilingual-asr-corpus&lt;/code&gt; dataset.(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
</description><link>https://huggingface.co/facebook/omniASR-LLM-7B-ZS</link><guid isPermaLink="false">https://huggingface.co/facebook/omniASR-LLM-7B-ZS</guid><pubDate>Thu, 27 Nov 2025 23:41:34 GMT</pubDate></item><item><title>facebook/omniASR-LLM-7B</title><description>facebook/omniASR-LLM-7B Automatic Speech Recognition • Updated 9 days ago • 12&lt;hr&gt;
&lt;p&gt;license: apache-2.0
datasets:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h1&gt;Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages&lt;/h1&gt;
&lt;div align=&quot;center&quot; style=&quot;lline-height: 1.2; font-size:16px; margin-bottom: 30px;&quot;&gt;
  &lt;a href=&quot;https://huggingface.co/facebook&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🤗 Hugging Face
  &lt;/a&gt; |
  &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🐙 GitHub
  &lt;/a&gt; |
  &lt;a href=&quot;https://huggingface.co/spaces/facebook/omniasr-transcriptions&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🤖️ Demo
  &lt;/a&gt; |
  &lt;a href=&quot;https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    📃 Paper
  &lt;/a&gt; |
  &lt;a href=&quot;https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    📝 Blogpost
  &lt;/a&gt; |
  &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
    📄 License: Apache 2.0
  &lt;/a&gt;
&lt;/div&gt;
&lt;h1&gt;Model Card for omniASR-LLM-7B&lt;/h1&gt;
&lt;h2&gt;Model Description&lt;/h2&gt;
&lt;p&gt;This model is part of the &lt;strong&gt;Omnilingual ASR&lt;/strong&gt; family released by Meta AI. The original suite includes:&lt;/p&gt;
&lt;!-- TODO : add new tokenizer, we&#39;ll get two tokenizer, add mssing speed numbers--&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Name&lt;/th&gt;
&lt;th&gt;Features&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Parameters&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Download Size (FP32)&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Inference VRAM¹&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Real-Time Factor¹ (relative speed)²&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-300M&quot;&gt;&lt;code&gt;omniASR_W2V_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;317_390_592&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1.2 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-1B&quot;&gt;&lt;code&gt;omniASR_W2V_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;965_514_752&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3.6 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-3B&quot;&gt;&lt;code&gt;omniASR_W2V_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3_064_124_672&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;12.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-7B&quot;&gt;&lt;code&gt;omniASR_W2V_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6_488_487_168&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;25.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-300M&quot;&gt;&lt;code&gt;omniASR_CTC_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;325_494_996&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1.3 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~2 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.001 (96x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-1B&quot;&gt;&lt;code&gt;omniASR_CTC_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;975_065_300&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3.7 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~3 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.002 (48x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-3B&quot;&gt;&lt;code&gt;omniASR_CTC_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3_080_423_636&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;12.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~8 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.003 (32x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-7B&quot;&gt;&lt;code&gt;omniASR_CTC_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6_504_786_132&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;25.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~15 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.006 (16x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-300M&quot;&gt;&lt;code&gt;omniASR_LLM_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1_627_603_584&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6.1 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~5 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.090 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-1B&quot;&gt;&lt;code&gt;omniASR_LLM_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;2_275_710_592&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;8.5 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~6 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.091 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-3B&quot;&gt;&lt;code&gt;omniASR_LLM_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;4_376_679_040&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;17.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~10 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.093 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-7B&quot;&gt;&lt;code&gt;omniASR_LLM_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;7_801_041_536&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;30.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~17 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.092 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-7B-ZS&quot;&gt;&lt;code&gt;omniASR_LLM_7B_ZS&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Zero-Shot ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;7_810_900_608&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;30.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~20 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.194 (~0.5x)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;¹ (batch=1, audio_len=30s, BF16, A100)&lt;/p&gt;
&lt;p&gt;² Relative speed to &lt;code&gt;omniASR_LLM_7B&lt;/code&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Installation&lt;/h2&gt;
&lt;p&gt;The models were developed using &lt;a href=&quot;https://github.com/facebookresearch/fairseq2&quot;&gt;fairseq2&lt;/a&gt;, a research-focused sequence modeling toolkit. While we provide a &lt;strong&gt;reference&lt;/strong&gt; inference pipeline that works across platforms, audio support requires &lt;a href=&quot;https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#system-dependencies&quot;&gt;libsndfile&lt;/a&gt; (Mac: &lt;code&gt;brew install libsndfile&lt;/code&gt;; Windows may need an additional &lt;a href=&quot;https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#installing-on-windows&quot;&gt;setup&lt;/a&gt;).&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Inference&lt;/h2&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Supported Languages&lt;/h2&gt;
&lt;p&gt;To view the full list of 1600+ supported languages, you can access the language list &lt;a href=&quot;https://huggingface.co/src/omnilingual_asr/models/wav2vec2_llama/lang_ids.py&quot;&gt;programmatically&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Languages follow the format &lt;code&gt;{language_code}_{script}&lt;/code&gt;, for example &lt;code&gt;eng_Latn&lt;/code&gt; - English (Latin script), &lt;code&gt;cmn_Hans&lt;/code&gt; - Mandarin Chinese (Simplified), ...&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Training&lt;/h2&gt;
&lt;p&gt;To further finetune the released checkpoints on your own data, use our &lt;a href=&quot;https://huggingface.co/workflows/dataprep/README.md&quot;&gt;data preparation guide&lt;/a&gt; followed by the &lt;a href=&quot;https://huggingface.co/workflows/recipes/wav2vec2/asr/README.md&quot;&gt;finetuning recipe guide&lt;/a&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Citation&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;BibTeX:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Developed by:&lt;/strong&gt; Meta AI / Omnilingual ASR Team(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model type:&lt;/strong&gt; End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Language(s) (NLP):&lt;/strong&gt; 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers &lt;strong&gt;348 under-served languages&lt;/strong&gt; across many writing systems (Latin, Arabic, Devanagari, etc.).(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;License:&lt;/strong&gt; Apache-2.0 (for the model and code), CC-BY-4.0 for the &lt;code&gt;facebook/omnilingual-asr-corpus&lt;/code&gt; dataset.(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
</description><link>https://huggingface.co/facebook/omniASR-LLM-7B</link><guid isPermaLink="false">https://huggingface.co/facebook/omniASR-LLM-7B</guid><pubDate>Thu, 27 Nov 2025 23:26:06 GMT</pubDate></item><item><title>facebook/omniASR-LLM-3B</title><description>facebook/omniASR-LLM-3B Automatic Speech Recognition • Updated 9 days ago • 1&lt;hr&gt;
&lt;p&gt;license: apache-2.0
datasets:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h1&gt;Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages&lt;/h1&gt;
&lt;div align=&quot;center&quot; style=&quot;lline-height: 1.2; font-size:16px; margin-bottom: 30px;&quot;&gt;
  &lt;a href=&quot;https://huggingface.co/facebook&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🤗 Hugging Face
  &lt;/a&gt; |
  &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🐙 GitHub
  &lt;/a&gt; |
  &lt;a href=&quot;https://huggingface.co/spaces/facebook/omniasr-transcriptions&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🤖️ Demo
  &lt;/a&gt; |
  &lt;a href=&quot;https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    📃 Paper
  &lt;/a&gt; |
  &lt;a href=&quot;https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    📝 Blogpost
  &lt;/a&gt; |
  &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
    📄 License: Apache 2.0
  &lt;/a&gt;
&lt;/div&gt;
&lt;h1&gt;Model Card for omniASR-LLM-3B&lt;/h1&gt;
&lt;h2&gt;Model Description&lt;/h2&gt;
&lt;p&gt;This model is part of the &lt;strong&gt;Omnilingual ASR&lt;/strong&gt; family released by Meta AI. The original suite includes:&lt;/p&gt;
&lt;!-- TODO : add new tokenizer, we&#39;ll get two tokenizer, add mssing speed numbers--&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Name&lt;/th&gt;
&lt;th&gt;Features&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Parameters&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Download Size (FP32)&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Inference VRAM¹&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Real-Time Factor¹ (relative speed)²&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-300M&quot;&gt;&lt;code&gt;omniASR_W2V_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;317_390_592&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1.2 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-1B&quot;&gt;&lt;code&gt;omniASR_W2V_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;965_514_752&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3.6 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-3B&quot;&gt;&lt;code&gt;omniASR_W2V_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3_064_124_672&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;12.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-7B&quot;&gt;&lt;code&gt;omniASR_W2V_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6_488_487_168&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;25.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-300M&quot;&gt;&lt;code&gt;omniASR_CTC_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;325_494_996&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1.3 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~2 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.001 (96x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-1B&quot;&gt;&lt;code&gt;omniASR_CTC_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;975_065_300&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3.7 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~3 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.002 (48x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-3B&quot;&gt;&lt;code&gt;omniASR_CTC_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3_080_423_636&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;12.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~8 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.003 (32x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-7B&quot;&gt;&lt;code&gt;omniASR_CTC_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6_504_786_132&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;25.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~15 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.006 (16x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-300M&quot;&gt;&lt;code&gt;omniASR_LLM_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1_627_603_584&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6.1 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~5 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.090 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-1B&quot;&gt;&lt;code&gt;omniASR_LLM_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;2_275_710_592&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;8.5 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~6 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.091 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-3B&quot;&gt;&lt;code&gt;omniASR_LLM_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;4_376_679_040&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;17.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~10 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.093 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-7B&quot;&gt;&lt;code&gt;omniASR_LLM_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;7_801_041_536&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;30.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~17 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.092 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-7B-ZS&quot;&gt;&lt;code&gt;omniASR_LLM_7B_ZS&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Zero-Shot ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;7_810_900_608&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;30.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~20 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.194 (~0.5x)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;¹ (batch=1, audio_len=30s, BF16, A100)&lt;/p&gt;
&lt;p&gt;² Relative speed to &lt;code&gt;omniASR_LLM_7B&lt;/code&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Installation&lt;/h2&gt;
&lt;p&gt;The models were developed using &lt;a href=&quot;https://github.com/facebookresearch/fairseq2&quot;&gt;fairseq2&lt;/a&gt;, a research-focused sequence modeling toolkit. While we provide a &lt;strong&gt;reference&lt;/strong&gt; inference pipeline that works across platforms, audio support requires &lt;a href=&quot;https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#system-dependencies&quot;&gt;libsndfile&lt;/a&gt; (Mac: &lt;code&gt;brew install libsndfile&lt;/code&gt;; Windows may need an additional &lt;a href=&quot;https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#installing-on-windows&quot;&gt;setup&lt;/a&gt;).&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Inference&lt;/h2&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Supported Languages&lt;/h2&gt;
&lt;p&gt;To view the full list of 1600+ supported languages, you can access the language list &lt;a href=&quot;https://huggingface.co/src/omnilingual_asr/models/wav2vec2_llama/lang_ids.py&quot;&gt;programmatically&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Languages follow the format &lt;code&gt;{language_code}_{script}&lt;/code&gt;, for example &lt;code&gt;eng_Latn&lt;/code&gt; - English (Latin script), &lt;code&gt;cmn_Hans&lt;/code&gt; - Mandarin Chinese (Simplified), ...&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Training&lt;/h2&gt;
&lt;p&gt;To further finetune the released checkpoints on your own data, use our &lt;a href=&quot;https://huggingface.co/workflows/dataprep/README.md&quot;&gt;data preparation guide&lt;/a&gt; followed by the &lt;a href=&quot;https://huggingface.co/workflows/recipes/wav2vec2/asr/README.md&quot;&gt;finetuning recipe guide&lt;/a&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Citation&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;BibTeX:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Developed by:&lt;/strong&gt; Meta AI / Omnilingual ASR Team(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model type:&lt;/strong&gt; End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Language(s) (NLP):&lt;/strong&gt; 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers &lt;strong&gt;348 under-served languages&lt;/strong&gt; across many writing systems (Latin, Arabic, Devanagari, etc.).(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;License:&lt;/strong&gt; Apache-2.0 (for the model and code), CC-BY-4.0 for the &lt;code&gt;facebook/omnilingual-asr-corpus&lt;/code&gt; dataset.(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
</description><link>https://huggingface.co/facebook/omniASR-LLM-3B</link><guid isPermaLink="false">https://huggingface.co/facebook/omniASR-LLM-3B</guid><pubDate>Thu, 27 Nov 2025 23:08:56 GMT</pubDate></item><item><title>facebook/omniASR-LLM-1B</title><description>facebook/omniASR-LLM-1B Automatic Speech Recognition • Updated 9 days ago • 3&lt;hr&gt;
&lt;p&gt;license: apache-2.0
datasets:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h1&gt;Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages&lt;/h1&gt;
&lt;div align=&quot;center&quot; style=&quot;lline-height: 1.2; font-size:16px; margin-bottom: 30px;&quot;&gt;
  &lt;a href=&quot;https://huggingface.co/facebook&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🤗 Hugging Face
  &lt;/a&gt; |
  &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🐙 GitHub
  &lt;/a&gt; |
  &lt;a href=&quot;https://huggingface.co/spaces/facebook/omniasr-transcriptions&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🤖️ Demo
  &lt;/a&gt; |
  &lt;a href=&quot;https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    📃 Paper
  &lt;/a&gt; |
  &lt;a href=&quot;https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    📝 Blogpost
  &lt;/a&gt; |
  &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
    📄 License: Apache 2.0
  &lt;/a&gt;
&lt;/div&gt;
&lt;h1&gt;Model Card for omniASR-LLM-1B&lt;/h1&gt;
&lt;h2&gt;Model Description&lt;/h2&gt;
&lt;p&gt;This model is part of the &lt;strong&gt;Omnilingual ASR&lt;/strong&gt; family released by Meta AI. The original suite includes:&lt;/p&gt;
&lt;!-- TODO : add new tokenizer, we&#39;ll get two tokenizer, add mssing speed numbers--&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Name&lt;/th&gt;
&lt;th&gt;Features&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Parameters&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Download Size (FP32)&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Inference VRAM¹&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Real-Time Factor¹ (relative speed)²&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-300M&quot;&gt;&lt;code&gt;omniASR_W2V_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;317_390_592&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1.2 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-1B&quot;&gt;&lt;code&gt;omniASR_W2V_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;965_514_752&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3.6 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-3B&quot;&gt;&lt;code&gt;omniASR_W2V_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3_064_124_672&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;12.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-7B&quot;&gt;&lt;code&gt;omniASR_W2V_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6_488_487_168&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;25.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-300M&quot;&gt;&lt;code&gt;omniASR_CTC_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;325_494_996&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1.3 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~2 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.001 (96x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-1B&quot;&gt;&lt;code&gt;omniASR_CTC_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;975_065_300&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3.7 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~3 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.002 (48x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-3B&quot;&gt;&lt;code&gt;omniASR_CTC_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3_080_423_636&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;12.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~8 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.003 (32x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-7B&quot;&gt;&lt;code&gt;omniASR_CTC_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6_504_786_132&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;25.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~15 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.006 (16x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-300M&quot;&gt;&lt;code&gt;omniASR_LLM_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1_627_603_584&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6.1 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~5 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.090 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-1B&quot;&gt;&lt;code&gt;omniASR_LLM_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;2_275_710_592&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;8.5 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~6 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.091 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-3B&quot;&gt;&lt;code&gt;omniASR_LLM_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;4_376_679_040&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;17.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~10 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.093 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-7B&quot;&gt;&lt;code&gt;omniASR_LLM_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR with optional language conditioning&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;7_801_041_536&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;30.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~17 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.092 (~1x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-LLM-7B-ZS&quot;&gt;&lt;code&gt;omniASR_LLM_7B_ZS&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Zero-Shot ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;7_810_900_608&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;30.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;~20 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;0.194 (~0.5x)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;¹ (batch=1, audio_len=30s, BF16, A100)&lt;/p&gt;
&lt;p&gt;² Relative speed to &lt;code&gt;omniASR_LLM_7B&lt;/code&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Installation&lt;/h2&gt;
&lt;p&gt;The models were developed using &lt;a href=&quot;https://github.com/facebookresearch/fairseq2&quot;&gt;fairseq2&lt;/a&gt;, a research-focused sequence modeling toolkit. While we provide a &lt;strong&gt;reference&lt;/strong&gt; inference pipeline that works across platforms, audio support requires &lt;a href=&quot;https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#system-dependencies&quot;&gt;libsndfile&lt;/a&gt; (Mac: &lt;code&gt;brew install libsndfile&lt;/code&gt;; Windows may need an additional &lt;a href=&quot;https://github.com/facebookresearch/fairseq2?tab=readme-ov-file#installing-on-windows&quot;&gt;setup&lt;/a&gt;).&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# using pip
pip install omnilingual-asr
# using uv
uv add omnilingual-asr
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Inference&lt;/h2&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from omnilingual_asr.models.inference.pipeline import ASRInferencePipeline
pipeline = ASRInferencePipeline(model_card=&quot;omniASR_LLM_7B&quot;)
audio_files = [&quot;/path/to/eng_audio1.flac&quot;, &quot;/path/to/deu_audio2.wav&quot;]
lang = [&quot;eng_Latn&quot;, &quot;deu_Latn&quot;]
transcriptions = pipeline.transcribe(audio_files, lang=lang, batch_size=2)
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Supported Languages&lt;/h2&gt;
&lt;p&gt;To view the full list of 1600+ supported languages, you can access the language list &lt;a href=&quot;https://huggingface.co/src/omnilingual_asr/models/wav2vec2_llama/lang_ids.py&quot;&gt;programmatically&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from omnilingual_asr.models.wav2vec2_llama.lang_ids import supported_langs
# Print all supported languages
print(f&quot;Total supported languages: {len(supported_langs)}&quot;)
print(supported_langs)
# Check if a specific language is supported
if &quot;eng_Latn&quot; in supported_langs:
    print(&quot;English (Latin script) is supported!&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Languages follow the format &lt;code&gt;{language_code}_{script}&lt;/code&gt;, for example &lt;code&gt;eng_Latn&lt;/code&gt; - English (Latin script), &lt;code&gt;cmn_Hans&lt;/code&gt; - Mandarin Chinese (Simplified), ...&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Training&lt;/h2&gt;
&lt;p&gt;To further finetune the released checkpoints on your own data, use our &lt;a href=&quot;https://huggingface.co/workflows/dataprep/README.md&quot;&gt;data preparation guide&lt;/a&gt; followed by the &lt;a href=&quot;https://huggingface.co/workflows/recipes/wav2vec2/asr/README.md&quot;&gt;finetuning recipe guide&lt;/a&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;h2&gt;Citation&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;BibTeX:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{omnilingualasr2025,
  title={{Omnilingual ASR}: Open-Source Multilingual Speech Recognition for 1600+ Languages},
  author={{Omnilingual ASR Team} and Keren, Gil and Kozhevnikov, Artyom and Meng, Yen and Ropers, Christophe and Setzler, Matthew and Wang, Skyler and Adebara, Ife and Auli, Michael and Can, Balioglu and Chan, Kevin and Cheng, Chierh and Chuang, Joe and Droof, Caley and Duppenthaler, Mark and Duquenne, Paul-Ambroise and Erben, Alexander and Gao, Cynthia and Mejia Gonzalez, Gabriel and Lyu, Kehan and Miglani, Sagar and Pratap, Vineel and Sadagopan, Kaushik Ram and Saleem, Safiyyah and Turkatenko, Arina and Ventayol-Boada, Albert and Yong, Zheng-Xin and Chung, Yu-An and Maillard, Jean and Moritz, Rashel and Mourachko, Alexandre and Williamson, Mary and Yates, Shireen},
  year={2025},
  url={https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/},
}
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Developed by:&lt;/strong&gt; Meta AI / Omnilingual ASR Team(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model type:&lt;/strong&gt; End-to-end automatic speech recognition model (wav2vec2-style encoder with CTC head / encoder-decoder, depending on checkpoint).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Language(s) (NLP):&lt;/strong&gt; 1,600+ languages overall in Omnilingual ASR; this corpus release specifically covers &lt;strong&gt;348 under-served languages&lt;/strong&gt; across many writing systems (Latin, Arabic, Devanagari, etc.).(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;License:&lt;/strong&gt; Apache-2.0 (for the model and code), CC-BY-4.0 for the &lt;code&gt;facebook/omnilingual-asr-corpus&lt;/code&gt; dataset.(&lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr?tab=readme-ov-file&quot; title=&quot;GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual SpeechRecognition for 1600+ Languages&quot;&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
</description><link>https://huggingface.co/facebook/omniASR-LLM-1B</link><guid isPermaLink="false">https://huggingface.co/facebook/omniASR-LLM-1B</guid><pubDate>Thu, 27 Nov 2025 22:37:27 GMT</pubDate></item><item><title>facebook/omniASR-LLM-300M</title><description>facebook/omniASR-LLM-300M Automatic Speech Recognition • Updated 9 days ago • 3&lt;hr&gt;
&lt;p&gt;license: apache-2.0
datasets:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;facebook/omnilingual-asr-corpus
pipeline_tag: automatic-speech-recognition&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h1&gt;Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages&lt;/h1&gt;
&lt;div align=&quot;center&quot; style=&quot;lline-height: 1.2; font-size:16px; margin-bottom: 30px;&quot;&gt;
  &lt;a href=&quot;https://huggingface.co/facebook&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🤗 Hugging Face
  &lt;/a&gt; |
  &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🐙 GitHub
  &lt;/a&gt; |
  &lt;a href=&quot;https://huggingface.co/spaces/facebook/omniasr-transcriptions&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    🤖️ Demo
  &lt;/a&gt; |
  &lt;a href=&quot;https://ai.meta.com/research/publications/omnilingual-asr-open-source-multilingual-speech-recognition-for-1600-languages/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    📃 Paper
  &lt;/a&gt; |
  &lt;a href=&quot;https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/&quot; target=&quot;_blank&quot; style=&quot;margin: 2px;&quot;&gt;
    📝 Blogpost
  &lt;/a&gt; |
  &lt;a href=&quot;https://github.com/facebookresearch/omnilingual-asr/blob/main/LICENSE&quot; style=&quot;margin: 2px;&quot;&gt;
    📄 License: Apache 2.0
  &lt;/a&gt;
&lt;/div&gt;
&lt;h1&gt;Model Card for omniASR-LLM-300M&lt;/h1&gt;
&lt;h2&gt;Model Description&lt;/h2&gt;
&lt;p&gt;This model is part of the &lt;strong&gt;Omnilingual ASR&lt;/strong&gt; family released by Meta AI. The original suite includes:&lt;/p&gt;
&lt;!-- TODO : add new tokenizer, we&#39;ll get two tokenizer, add mssing speed numbers--&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Name&lt;/th&gt;
&lt;th&gt;Features&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Parameters&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Download Size (FP32)&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Inference VRAM¹&lt;/th&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Real-Time Factor¹ (relative speed)²&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-300M&quot;&gt;&lt;code&gt;omniASR_W2V_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;317_390_592&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;1.2 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-1B&quot;&gt;&lt;code&gt;omniASR_W2V_1B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;965_514_752&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3.6 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-3B&quot;&gt;&lt;code&gt;omniASR_W2V_3B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;3_064_124_672&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;12.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-W2V-7B&quot;&gt;&lt;code&gt;omniASR_W2V_7B&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;SSL&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;6_488_487_168&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;25.0 GiB&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://huggingface.co/facebook/omniASR-CTC-300M&quot;&gt;&lt;code&gt;omniASR_CTC_300M&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ASR&lt;/td&gt;
&lt;td style=&quot;text-align:right&quot;&gt;325_494_996&lt;/td&gt;
&lt;td style=&quot;text-align:rig

...

github-actions · 2025-12-07T03:02:34Z

http://localhost:1200/huggingface/models/ianyang02 - Success ✔️

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Huggingface ianyang02 Models</title>
    <link>https://huggingface.co/ianyang02/models?sort=created</link>
    <atom:link href="http://localhost:1200/huggingface/models/ianyang02" rel="self" type="application/rss+xml"></atom:link>
    <description>Huggingface ianyang02 Models - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <language>en</language>
    <lastBuildDate>Sun, 07 Dec 2025 03:02:32 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>ianyang02/aita_qwen3-30b</title>
      <description>ianyang02/aita_qwen3-30b Updated 1 day ago</description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-30b</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-30b</guid>
      <pubDate>Fri, 05 Dec 2025 14:17:52 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_dpo</title>
      <description>ianyang02/aita_qwen3-4b_dpo Updated 15 days ago&lt;hr&gt;
        &lt;p&gt;base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_dpo
        tags:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;generated_from_trainer&lt;/li&gt;
        &lt;li&gt;trl&lt;/li&gt;
        &lt;li&gt;dpo
        licence: license&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;Model Card for aita_qwen3-4b_dpo&lt;/h1&gt;
        &lt;p&gt;This model is a fine-tuned version of &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507&quot;&gt;Qwen/Qwen3-4B-Instruct-2507&lt;/a&gt;.
        It has been trained using &lt;a href=&quot;https://github.com/huggingface/trl&quot;&gt;TRL&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Quick start&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from transformers import pipeline
        question = &quot;If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?&quot;
        generator = pipeline(&quot;text-generation&quot;, model=&quot;ianyang02/aita_qwen3-4b_dpo&quot;, device=&quot;cuda&quot;)
        output = generator([{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: question}], max_new_tokens=128, return_full_text=False)[0]
        print(output[&quot;generated_text&quot;])
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Training procedure&lt;/h2&gt;
        &lt;p&gt;&lt;a href=&quot;https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/4xnikck8&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;This model was trained with DPO, a method introduced in &lt;a href=&quot;https://huggingface.co/papers/2305.18290&quot;&gt;Direct Preference Optimization: Your Language Model is Secretly a Reward Model&lt;/a&gt;.&lt;/p&gt;
        &lt;h3&gt;Framework versions&lt;/h3&gt;
        &lt;ul&gt;
        &lt;li&gt;TRL: 0.24.0&lt;/li&gt;
        &lt;li&gt;Transformers: 4.57.1&lt;/li&gt;
        &lt;li&gt;Pytorch: 2.9.0&lt;/li&gt;
        &lt;li&gt;Datasets: 4.3.0&lt;/li&gt;
        &lt;li&gt;Tokenizers: 0.22.1&lt;/li&gt;
        &lt;/ul&gt;
        &lt;h2&gt;Citations&lt;/h2&gt;
        &lt;p&gt;Cite DPO as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@inproceedings{rafailov2023direct,
        title = {{Direct Preference Optimization: Your Language Model is Secretly a Reward Model}},
        author = {Rafael Rafailov and Archit Sharma and Eric Mitchell and Christopher D. Manning and Stefano Ermon and Chelsea Finn},
        year = 2023,
        booktitle = {Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023},
        url = {http://papers.nips.cc/paper_files/paper/2023/hash/a85b405ed65c6477a4fe8302b5e06ce7-Abstract-Conference.html},
        editor = {Alice Oh and Tristan Naumann and Amir Globerson and Kate Saenko and Moritz Hardt and Sergey Levine},
        }
        &lt;/code&gt;&lt;/pre&gt;
        &lt;p&gt;Cite TRL as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        &lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_dpo</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_dpo</guid>
      <pubDate>Fri, 21 Nov 2025 15:09:21 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/ppo_model_qwen3-4b_aita_h200_2</title>
      <description>ianyang02/ppo_model_qwen3-4b_aita_h200_2 Text Generation • Updated 15 days ago • 66&lt;hr&gt;
        &lt;p&gt;base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: peft
        model_name: ppo_model_qwen3-4b_aita_h200_2
        tags:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;base_model:adapter:Qwen/Qwen3-4B-Instruct-2507&lt;/li&gt;
        &lt;li&gt;lora&lt;/li&gt;
        &lt;li&gt;transformers
        licence: license
        pipeline_tag: text-generation&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;Model Card for ppo_model_qwen3-4b_aita_h200_2&lt;/h1&gt;
        &lt;p&gt;This model is a fine-tuned version of &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507&quot;&gt;Qwen/Qwen3-4B-Instruct-2507&lt;/a&gt;.
        It has been trained using &lt;a href=&quot;https://github.com/huggingface/trl&quot;&gt;TRL&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Quick start&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from transformers import pipeline
        question = &quot;If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?&quot;
        generator = pipeline(&quot;text-generation&quot;, model=&quot;ianyang02/ppo_model_qwen3-4b_aita_h200_2&quot;, device=&quot;cuda&quot;)
        output = generator([{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: question}], max_new_tokens=128, return_full_text=False)[0]
        print(output[&quot;generated_text&quot;])
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Training procedure&lt;/h2&gt;
        &lt;p&gt;&lt;a href=&quot;https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/x99g8ppd&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;This model was trained with PPO, a method introduced in &lt;a href=&quot;https://huggingface.co/papers/1909.08593&quot;&gt;Fine-Tuning Language Models from Human Preferences&lt;/a&gt;.&lt;/p&gt;
        &lt;h3&gt;Framework versions&lt;/h3&gt;
        &lt;ul&gt;
        &lt;li&gt;PEFT 0.18.0&lt;/li&gt;
        &lt;li&gt;TRL: 0.25.1&lt;/li&gt;
        &lt;li&gt;Transformers: 4.57.1&lt;/li&gt;
        &lt;li&gt;Pytorch: 2.7.0.dev20250224+cu126&lt;/li&gt;
        &lt;li&gt;Datasets: 4.4.1&lt;/li&gt;
        &lt;li&gt;Tokenizers: 0.22.1&lt;/li&gt;
        &lt;/ul&gt;
        &lt;h2&gt;Citations&lt;/h2&gt;
        &lt;p&gt;Cite PPO as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@article{mziegler2019fine-tuning,
        title = {{Fine-Tuning Language Models from Human Preferences}},
        author = {Daniel M. Ziegler and Nisan Stiennon and Jeffrey Wu and Tom B. Brown and Alec Radford and Dario Amodei and Paul F. Christiano and Geoffrey Irving},
        year = 2019,
        eprint = {arXiv:1909.08593}
        }
        &lt;/code&gt;&lt;/pre&gt;
        &lt;p&gt;Cite TRL as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        &lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200_2</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200_2</guid>
      <pubDate>Fri, 21 Nov 2025 08:19:14 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_2</title>
      <description>ianyang02/aita_qwen3-4b_2 Text Classification • 4B • Updated 17 days ago • 23&lt;hr&gt;
        &lt;p&gt;base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_2
        tags:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;generated_from_trainer&lt;/li&gt;
        &lt;li&gt;trl&lt;/li&gt;
        &lt;li&gt;reward-trainer
        licence: license&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;Model Card for aita_qwen3-4b_2&lt;/h1&gt;
        &lt;p&gt;This model is a fine-tuned version of &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507&quot;&gt;Qwen/Qwen3-4B-Instruct-2507&lt;/a&gt;.
        It has been trained using &lt;a href=&quot;https://github.com/huggingface/trl&quot;&gt;TRL&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Quick start&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b_2&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Training procedure&lt;/h2&gt;
        &lt;p&gt;&lt;a href=&quot;https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/lacazue4&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;This model was trained with Reward.&lt;/p&gt;
        &lt;h3&gt;Framework versions&lt;/h3&gt;
        &lt;ul&gt;
        &lt;li&gt;TRL: 0.25.1&lt;/li&gt;
        &lt;li&gt;Transformers: 4.57.1&lt;/li&gt;
        &lt;li&gt;Pytorch: 2.5.0.dev20240818+cu124&lt;/li&gt;
        &lt;li&gt;Datasets: 4.4.1&lt;/li&gt;
        &lt;li&gt;Tokenizers: 0.22.1&lt;/li&gt;
        &lt;/ul&gt;
        &lt;h2&gt;Citations&lt;/h2&gt;
        &lt;p&gt;Cite TRL as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        &lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_2</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_2</guid>
      <pubDate>Wed, 19 Nov 2025 23:02:33 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_2_checkpoint_2</title>
      <description>ianyang02/aita_qwen3-4b_2_checkpoint_2 Text Classification • 4B • Updated 17 days ago • 25&lt;hr&gt;
        &lt;p&gt;base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_2
        tags:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;generated_from_trainer&lt;/li&gt;
        &lt;li&gt;trl&lt;/li&gt;
        &lt;li&gt;reward-trainer
        licence: license&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;Model Card for aita_qwen3-4b_2&lt;/h1&gt;
        &lt;p&gt;This model is a fine-tuned version of &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507&quot;&gt;Qwen/Qwen3-4B-Instruct-2507&lt;/a&gt;.
        It has been trained using &lt;a href=&quot;https://github.com/huggingface/trl&quot;&gt;TRL&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Quick start&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b_2&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Training procedure&lt;/h2&gt;
        &lt;p&gt;&lt;a href=&quot;https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/lacazue4&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;This model was trained with Reward.&lt;/p&gt;
        &lt;h3&gt;Framework versions&lt;/h3&gt;
        &lt;ul&gt;
        &lt;li&gt;TRL: 0.25.1&lt;/li&gt;
        &lt;li&gt;Transformers: 4.57.1&lt;/li&gt;
        &lt;li&gt;Pytorch: 2.5.0.dev20240818+cu124&lt;/li&gt;
        &lt;li&gt;Datasets: 4.4.1&lt;/li&gt;
        &lt;li&gt;Tokenizers: 0.22.1&lt;/li&gt;
        &lt;/ul&gt;
        &lt;h2&gt;Citations&lt;/h2&gt;
        &lt;p&gt;Cite TRL as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        &lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_2</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_2</guid>
      <pubDate>Wed, 19 Nov 2025 19:52:00 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b_2_checkpoint_1</title>
      <description>ianyang02/aita_qwen3-4b_2_checkpoint_1 Text Classification • 4B • Updated 17 days ago • 4&lt;hr&gt;
        &lt;p&gt;base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b_2
        tags:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;generated_from_trainer&lt;/li&gt;
        &lt;li&gt;trl&lt;/li&gt;
        &lt;li&gt;reward-trainer
        licence: license&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;Model Card for aita_qwen3-4b_2&lt;/h1&gt;
        &lt;p&gt;This model is a fine-tuned version of &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507&quot;&gt;Qwen/Qwen3-4B-Instruct-2507&lt;/a&gt;.
        It has been trained using &lt;a href=&quot;https://github.com/huggingface/trl&quot;&gt;TRL&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Quick start&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b_2&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Training procedure&lt;/h2&gt;
        &lt;p&gt;&lt;a href=&quot;https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/lacazue4&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;This model was trained with Reward.&lt;/p&gt;
        &lt;h3&gt;Framework versions&lt;/h3&gt;
        &lt;ul&gt;
        &lt;li&gt;TRL: 0.25.1&lt;/li&gt;
        &lt;li&gt;Transformers: 4.57.1&lt;/li&gt;
        &lt;li&gt;Pytorch: 2.5.0.dev20240818+cu124&lt;/li&gt;
        &lt;li&gt;Datasets: 4.4.1&lt;/li&gt;
        &lt;li&gt;Tokenizers: 0.22.1&lt;/li&gt;
        &lt;/ul&gt;
        &lt;h2&gt;Citations&lt;/h2&gt;
        &lt;p&gt;Cite TRL as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        &lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_1</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b_2_checkpoint_1</guid>
      <pubDate>Wed, 19 Nov 2025 16:40:34 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/ppo_model_qwen3-4b_aita_h200</title>
      <description>ianyang02/ppo_model_qwen3-4b_aita_h200 Updated 18 days ago&lt;hr&gt;
        &lt;p&gt;base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: ppo_model_qwen3-4b_aita_h200
        tags:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;generated_from_trainer&lt;/li&gt;
        &lt;li&gt;trl&lt;/li&gt;
        &lt;li&gt;ppo
        licence: license&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;Model Card for ppo_model_qwen3-4b_aita_h200&lt;/h1&gt;
        &lt;p&gt;This model is a fine-tuned version of &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507&quot;&gt;Qwen/Qwen3-4B-Instruct-2507&lt;/a&gt;.
        It has been trained using &lt;a href=&quot;https://github.com/huggingface/trl&quot;&gt;TRL&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Quick start&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from transformers import pipeline
        question = &quot;If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?&quot;
        generator = pipeline(&quot;text-generation&quot;, model=&quot;ianyang02/ppo_model_qwen3-4b_aita_h200&quot;, device=&quot;cuda&quot;)
        output = generator([{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: question}], max_new_tokens=128, return_full_text=False)[0]
        print(output[&quot;generated_text&quot;])
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Training procedure&lt;/h2&gt;
        &lt;p&gt;&lt;a href=&quot;https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/ht40cr4d&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;This model was trained with PPO, a method introduced in &lt;a href=&quot;https://huggingface.co/papers/1909.08593&quot;&gt;Fine-Tuning Language Models from Human Preferences&lt;/a&gt;.&lt;/p&gt;
        &lt;h3&gt;Framework versions&lt;/h3&gt;
        &lt;ul&gt;
        &lt;li&gt;TRL: 0.24.0&lt;/li&gt;
        &lt;li&gt;Transformers: 4.57.1&lt;/li&gt;
        &lt;li&gt;Pytorch: 2.9.0&lt;/li&gt;
        &lt;li&gt;Datasets: 4.3.0&lt;/li&gt;
        &lt;li&gt;Tokenizers: 0.22.1&lt;/li&gt;
        &lt;/ul&gt;
        &lt;h2&gt;Citations&lt;/h2&gt;
        &lt;p&gt;Cite PPO as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@article{mziegler2019fine-tuning,
        title = {{Fine-Tuning Language Models from Human Preferences}},
        author = {Daniel M. Ziegler and Nisan Stiennon and Jeffrey Wu and Tom B. Brown and Alec Radford and Dario Amodei and Paul F. Christiano and Geoffrey Irving},
        year = 2019,
        eprint = {arXiv:1909.08593}
        }
        &lt;/code&gt;&lt;/pre&gt;
        &lt;p&gt;Cite TRL as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        &lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/ppo_model_qwen3-4b_aita_h200</guid>
      <pubDate>Tue, 18 Nov 2025 16:21:51 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_qwen3-4b</title>
      <description>ianyang02/aita_qwen3-4b Text Classification • 4B • Updated 21 days ago • 93&lt;hr&gt;
        &lt;p&gt;base_model: Qwen/Qwen3-4B-Instruct-2507
        library_name: transformers
        model_name: aita_qwen3-4b
        tags:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;generated_from_trainer&lt;/li&gt;
        &lt;li&gt;trl&lt;/li&gt;
        &lt;li&gt;reward-trainer
        licence: license&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;Model Card for aita_qwen3-4b&lt;/h1&gt;
        &lt;p&gt;This model is a fine-tuned version of &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507&quot;&gt;Qwen/Qwen3-4B-Instruct-2507&lt;/a&gt;.
        It has been trained using &lt;a href=&quot;https://github.com/huggingface/trl&quot;&gt;TRL&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Quick start&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_qwen3-4b&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Training procedure&lt;/h2&gt;
        &lt;p&gt;&lt;a href=&quot;https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/ofxhxi45&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;This model was trained with Reward.&lt;/p&gt;
        &lt;h3&gt;Framework versions&lt;/h3&gt;
        &lt;ul&gt;
        &lt;li&gt;TRL: 0.24.0&lt;/li&gt;
        &lt;li&gt;Transformers: 4.57.1&lt;/li&gt;
        &lt;li&gt;Pytorch: 2.9.0&lt;/li&gt;
        &lt;li&gt;Datasets: 4.3.0&lt;/li&gt;
        &lt;li&gt;Tokenizers: 0.22.1&lt;/li&gt;
        &lt;/ul&gt;
        &lt;h2&gt;Citations&lt;/h2&gt;
        &lt;p&gt;Cite TRL as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        &lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/ianyang02/aita_qwen3-4b</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_qwen3-4b</guid>
      <pubDate>Sun, 16 Nov 2025 01:32:06 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_fine_tuned_Qwen3_0.6B</title>
      <description>ianyang02/aita_fine_tuned_Qwen3_0.6B Updated 25 days ago</description>
      <link>https://huggingface.co/ianyang02/aita_fine_tuned_Qwen3_0.6B</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_fine_tuned_Qwen3_0.6B</guid>
      <pubDate>Tue, 11 Nov 2025 16:18:49 GMT</pubDate>
    </item>
    <item>
      <title>ianyang02/aita_Qwen3-0.6B</title>
      <description>ianyang02/aita_Qwen3-0.6B Text Classification • 0.6B • Updated 28 days ago • 50&lt;hr&gt;
        &lt;p&gt;base_model: Qwen/Qwen3-0.6B
        library_name: transformers
        model_name: aita_Qwen3-0.6B
        tags:&lt;/p&gt;
        &lt;ul&gt;
        &lt;li&gt;generated_from_trainer&lt;/li&gt;
        &lt;li&gt;reward-trainer&lt;/li&gt;
        &lt;li&gt;trl
        licence: license&lt;/li&gt;
        &lt;/ul&gt;
        &lt;hr&gt;
        &lt;h1&gt;Model Card for aita_Qwen3-0.6B&lt;/h1&gt;
        &lt;p&gt;This model is a fine-tuned version of &lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-0.6B&quot;&gt;Qwen/Qwen3-0.6B&lt;/a&gt;.
        It has been trained using &lt;a href=&quot;https://github.com/huggingface/trl&quot;&gt;TRL&lt;/a&gt;.&lt;/p&gt;
        &lt;h2&gt;Quick start&lt;/h2&gt;
        &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from transformers import pipeline
        text = &quot;The capital of France is Paris.&quot;
        rewarder = pipeline(model=&quot;ianyang02/aita_Qwen3-0.6B&quot;, device=&quot;cuda&quot;)
        output = rewarder(text)[0]
        print(output[&quot;score&quot;])
        &lt;/code&gt;&lt;/pre&gt;
        &lt;h2&gt;Training procedure&lt;/h2&gt;
        &lt;p&gt;&lt;a href=&quot;https://wandb.ai/ianyang02-university-of-washington/huggingface/runs/7xwpy737&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg&quot; alt=&quot;Visualize in Weights &amp;amp; Biases&quot; width=&quot;150&quot; height=&quot;24&quot; referrerpolicy=&quot;no-referrer&quot;&gt;&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;This model was trained with Reward.&lt;/p&gt;
        &lt;h3&gt;Framework versions&lt;/h3&gt;
        &lt;ul&gt;
        &lt;li&gt;TRL: 0.24.0&lt;/li&gt;
        &lt;li&gt;Transformers: 4.57.1&lt;/li&gt;
        &lt;li&gt;Pytorch: 2.9.0&lt;/li&gt;
        &lt;li&gt;Datasets: 4.3.0&lt;/li&gt;
        &lt;li&gt;Tokenizers: 0.22.1&lt;/li&gt;
        &lt;/ul&gt;
        &lt;h2&gt;Citations&lt;/h2&gt;
        &lt;p&gt;Cite TRL as:&lt;/p&gt;
        &lt;pre&gt;&lt;code class=&quot;language-bibtex&quot;&gt;@misc{vonwerra2022trl,
        title = {{TRL: Transformer Reinforcement Learning}},
        author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\&#39;e}dec},
        year = 2020,
        journal = {GitHub repository},
        publisher = {GitHub},
        howpublished = {\url{https://github.com/huggingface/trl}}
        }
        &lt;/code&gt;&lt;/pre&gt;
      </description>
      <link>https://huggingface.co/ianyang02/aita_Qwen3-0.6B</link>
      <guid isPermaLink="false">https://huggingface.co/ianyang02/aita_Qwen3-0.6B</guid>
      <pubDate>Sat, 08 Nov 2025 07:48:49 GMT</pubDate>
    </item>
  </channel>
</rss>

github-actions · 2026-01-06T23:36:26Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

WuNein added 2 commits December 6, 2025 18:59

Adding Huggingface Model details

5b70b58

more robust

d0ae277

github-actions bot added route auto: ready to review labels Dec 6, 2025

fix deepscan's problem, shall not be a problem !!!!! 404 should be a …

826df29

…proper state !!!!!

TonyRL reviewed Dec 6, 2025

View reviewed changes

lib/routes/huggingface/models.ts Outdated Show resolved Hide resolved

md2html and image url fix

ddfdea8

WuNein requested a review from TonyRL December 7, 2025 03:03

TonyRL merged commit fd12cf8 into DIYgod:master Dec 7, 2025
31 checks passed

github-actions bot locked as resolved and limited conversation to collaborators Jan 6, 2026

fix(route/huggingface): add huggingface group models detail #20646

fix(route/huggingface): add huggingface group models detail #20646

Uh oh!

Conversation

WuNein commented Dec 6, 2025

Involved Issue / 该 PR 相关 Issue

Example for the Proposed Route(s) / 路由地址示例

New RSS Route Checklist / 新 RSS 路由检查表

Note / 说明

Uh oh!

github-actions bot commented Dec 6, 2025

Uh oh!

github-actions bot commented Dec 6, 2025

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Model Card for omniASR-LLM-7B

Model Description

Installation

Inference

Supported Languages

Languages follow the format {language_code}_{script}, for example eng_Latn - English (Latin script), cmn_Hans - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Model Card for omniASR-LLM-3B

Model Description

Installation

Inference

Supported Languages

Languages follow the format {language_code}_{script}, for example eng_Latn - English (Latin script), cmn_Hans - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Model Card for omniASR-LLM-1B

Model Description

Installation

Inference

Supported Languages

Languages follow the format {language_code}_{script}, for example eng_Latn - English (Latin script), cmn_Hans - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Model Card for omniASR-LLM-300M

Model Description

Installation

Inference

Supported Languages

Languages follow the format {language_code}_{script}, for example eng_Latn - English (Latin script), cmn_Hans - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Model Card for omniASR-CTC-7B

Model Description

Installation

Inference

Supported Languages

Languages follow the format {language_code}_{script}, for example eng_Latn - English (Latin script), cmn_Hans - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Model Card for omniASR-CTC-3B

Model Description

Installation

Inference

Supported Languages

Languages follow the format {language_code}_{script}, for example eng_Latn - English (Latin script), cmn_Hans - Mandarin Chinese (Simplified), ...

Training

To further finetune the released checkpoints on your own data, use our data preparation guide followed by the finetuning recipe guide.

Citation

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Uh oh!

github-actions bot commented Dec 6, 2025

Uh oh!

github-actions bot commented Dec 6, 2025

Uh oh!

github-actions bot commented Dec 6, 2025

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...

Languages follow the format `{language_code}_{script}`, for example `eng_Latn` - English (Latin script), `cmn_Hans` - Mandarin Chinese (Simplified), ...