
Conversation

@mldangelo
Member

Summary

Adds support for HuggingFace's OpenAI-compatible chat completions API via a new huggingface:chat provider.

  • New HuggingfaceChatCompletionProvider class extending OpenAiChatCompletionProvider for code reuse (sketched below)
  • Auto-detection in huggingface:text-generation when apiEndpoint contains /v1
  • Supports chatCompletion: true/false config to explicitly control the format
  • Maps HuggingFace parameters (max_new_tokens) to the OpenAI format (max_tokens)
  • Registry support for huggingface:chat:<model> syntax
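
As a rough sketch of the delegation (hypothetical import path and simplified option handling; the real class in src/providers/huggingface.ts is more complete):

```ts
import { OpenAiChatCompletionProvider } from './openai/chat'; // hypothetical path

// Sketch: reuse the OpenAI-compatible implementation, pointed at the HF router,
// translating HuggingFace-style generation parameters to OpenAI equivalents.
class HuggingfaceChatCompletionProvider extends OpenAiChatCompletionProvider {
  constructor(modelName: string, options: { config?: Record<string, any> } = {}) {
    const { max_new_tokens, ...rest } = options.config ?? {};
    super(modelName, {
      ...options,
      config: {
        ...rest,
        apiBaseUrl: rest.apiBaseUrl ?? 'https://router.huggingface.co/v1',
        // Map max_new_tokens -> max_tokens without clobbering an explicit max_tokens.
        ...(max_new_tokens !== undefined && rest.max_tokens === undefined
          ? { max_tokens: max_new_tokens }
          : {}),
      },
    });
  }
}
```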

Usage

```yaml
providers:
  - id: huggingface:chat:deepseek-ai/DeepSeek-R1
    config:
      temperature: 0.7
      max_new_tokens: 1000

  # Or with text-generation (auto-detects from URL)
  - id: huggingface:text-generation:meta-llama/Llama-3.3-70B-Instruct
    config:
      apiEndpoint: https://router.huggingface.co/v1/chat/completions
```

Test plan

  • Unit tests added for chat completion functionality
  • Integration tested with multiple models:
    • DeepSeek-R1
    • Llama-3.3-70B-Instruct
    • Qwen2.5-72B-Instruct
    • Llama-3.1-8B-Instruct (via auto-detect)
  • Existing HuggingFace tests still pass
  • Documentation updated
  • Example added in examples/huggingface-chat/

cc @jameshiester - this fixes the issue you reported where the HuggingFace text-generation provider was sending the wrong request format to /v1/chat/completions.
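
For context, the two request shapes differ roughly as follows (illustrative payloads based on the test expectations in this PR, not captured traffic):

```ts
// HuggingFace Inference API (text-generation) shape, i.e. what was being sent:
const inferencePayload = {
  inputs: 'What is 2+2?',
  parameters: { temperature: 0.7, max_new_tokens: 1000 },
};

// OpenAI-compatible shape, i.e. what /v1/chat/completions expects:
const chatPayload = {
  model: 'deepseek-ai/DeepSeek-R1',
  messages: [{ role: 'user', content: 'What is 2+2?' }],
  temperature: 0.7,
  max_tokens: 1000,
};
```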

🤖 Generated with Claude Code

@use-tusk
Contributor

use-tusk bot commented Feb 2, 2026

⏩ No test execution environment matched (885be34)

Commit    Status                                    Created (UTC)
d39af5f   ⏩ No test execution environment matched   Feb 2, 2026 10:55 PM
10a88f0   ⏩ No test execution environment matched   Feb 2, 2026 11:30 PM
be79f36   ⏩ No test execution environment matched   Feb 3, 2026 12:16 AM
e3f15f2   ⏩ No test execution environment matched   Feb 3, 2026 12:17 AM
ba88e67   ⏩ No test execution environment matched   Feb 3, 2026 7:09 AM
3a91865   ⏩ No test execution environment matched   Feb 3, 2026 7:16 AM
22fa5a9   ⏩ No test execution environment matched   Feb 3, 2026 7:24 AM
228bd3c   ⏩ No test execution environment matched   Feb 3, 2026 7:35 AM
c8712a1   ⏩ No test execution environment matched   Feb 3, 2026 7:46 AM
317db54   ⏩ No test execution environment matched   Feb 3, 2026 3:45 PM
885be34   ⏩ No test execution environment matched   Feb 3, 2026 5:38 PM


@jameshiester
Contributor

Should there be a smoke test?

@coderabbitai
Contributor

coderabbitai bot commented Feb 2, 2026

📝 Walkthrough

This pull request adds support for HuggingFace OpenAI-compatible chat completions. It introduces a new HuggingfaceChatCompletionProvider class alongside the existing Inference API provider, enabling conditional routing based on API endpoint and configuration. The changes include a new provider factory function, auto-detection logic for the chat completion format, provider registry updates, documentation describing the chat models feature, an example configuration, and corresponding test coverage for the new chat completion path.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks: ✅ 3 passed

Title check: ✅ Passed. The PR title accurately and concisely describes the main feature addition (a new HuggingFace chat completion provider), which aligns with the core changes across the codebase.
Description check: ✅ Passed. The PR description comprehensively relates to the changeset, detailing the implementation approach, usage examples, test coverage, and the specific issue addressed, all directly supported by the code and documentation changes.
Docstring coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/providers/registry.ts (1)

1555-1582: ⚠️ Potential issue | 🟡 Minor

Include sentence-similarity in the HuggingFace path error message. It’s a supported task but not listed, which can mislead users.

🛠️ Suggested update
-          `Invalid Huggingface provider path: ${providerPath}. Use one of the following providers: huggingface:chat:<model name>, huggingface:text-generation:<model name>, huggingface:feature-extraction:<model name>, huggingface:text-classification:<model name>, huggingface:token-classification:<model name>`,
+          `Invalid Huggingface provider path: ${providerPath}. Use one of the following providers: huggingface:chat:<model name>, huggingface:text-generation:<model name>, huggingface:feature-extraction:<model name>, huggingface:text-classification:<model name>, huggingface:token-classification:<model name>, huggingface:sentence-similarity:<model name>`,
...
-        `Invalid Huggingface provider path: ${providerPath}. Use one of the following providers: huggingface:chat:<model name>, huggingface:text-generation:<model name>, huggingface:feature-extraction:<model name>, huggingface:text-classification:<model name>, huggingface:token-classification:<model name>`,
+        `Invalid Huggingface provider path: ${providerPath}. Use one of the following providers: huggingface:chat:<model name>, huggingface:text-generation:<model name>, huggingface:feature-extraction:<model name>, huggingface:text-classification:<model name>, huggingface:token-classification:<model name>, huggingface:sentence-similarity:<model name>`,
🧹 Nitpick comments (1)
examples/huggingface-chat/promptfooconfig.yaml (1)

35-37: Consider a quirky prompt for the simple test. It’ll make the example more engaging.

Based on learnings: For trivial test cases in configuration, make them quirky and fun to increase engagement.

Comment on lines 5 to 42
```yaml
description: HuggingFace Chat Completion Tests

providers:
  # New dedicated huggingface:chat provider (recommended approach)
  - id: huggingface:chat:deepseek-ai/DeepSeek-R1
    config:
      temperature: 0.1
      max_new_tokens: 100
    label: DeepSeek-R1

  - id: huggingface:chat:meta-llama/Llama-3.3-70B-Instruct
    config:
      temperature: 0.1
      max_new_tokens: 100
    label: Llama-3.3-70B

  - id: huggingface:chat:Qwen/Qwen2.5-72B-Instruct
    config:
      temperature: 0.1
      max_new_tokens: 100
    label: Qwen2.5-72B

  # text-generation with auto-detection from URL (backward compatible)
  - id: huggingface:text-generation:meta-llama/Llama-3.1-8B-Instruct
    config:
      apiEndpoint: https://router.huggingface.co/v1/chat/completions
      temperature: 0.1
      max_new_tokens: 100
    label: Llama-3.1-8B (auto-detect)

prompts:
  - 'What is 2+2? Answer with just the number.'

tests:
  - vars: {}
    assert:
      - type: contains
        value: '4'
```

⚠️ Potential issue | 🟡 Minor

Reorder sections to match required field order. prompts should come before providers.

🛠️ Suggested reorder
 description: HuggingFace Chat Completion Tests
 
+prompts:
+  - 'What is 2+2? Answer with just the number.'
+
 providers:
   # New dedicated huggingface:chat provider (recommended approach)
   - id: huggingface:chat:deepseek-ai/DeepSeek-R1
@@
- 
-prompts:
-  - 'What is 2+2? Answer with just the number.'
-
 tests:
   - vars: {}
     assert:
       - type: contains
         value: '4'
As per coding guidelines: Field order in promptfooconfig.yaml must be: description, env, prompts, providers, defaultTest, scenarios, tests.
🤖 Prompt for AI Agents
In `@examples/huggingface-chat/promptfooconfig.yaml` around lines 5 - 42, Move the
prompts section to appear before the providers section in promptfooconfig.yaml
so the top-level keys follow the required order (description, env, prompts,
providers, defaultTest, scenarios, tests); specifically relocate the "prompts:"
block (the "'What is 2+2?..." entry) above the list of "providers:" entries and
ensure missing intermediate keys (env, defaultTest, scenarios) are present (can
be empty) in that sequence to preserve the mandated field order.

Comment on lines 1 to 4
```diff
 ---
 sidebar_label: HuggingFace
-description: Configure HuggingFace's text classification, embedding, and NER models for LLM testing and eval tasks
+description: Configure HuggingFace's chat models, text classification, embedding, and NER models for LLM testing and eval tasks
 ---
```

⚠️ Potential issue | 🟡 Minor

Add required front matter fields (and expand description length).

🛠️ Suggested front matter
 ---
+title: HuggingFace provider
 sidebar_label: HuggingFace
-description: Configure HuggingFace's chat models, text classification, embedding, and NER models for LLM testing and eval tasks
+sidebar_position: 40
+description: Configure HuggingFace chat models, embeddings, classification, and NER for Promptfoo evals using Inference API tasks and OpenAI‑compatible chat endpoints and routers.
 ---
As per coding guidelines: Front matter is required with title (under 60 chars), description (150-160 chars), and sidebar_position.
🤖 Prompt for AI Agents
In `@site/docs/providers/huggingface.md` around lines 1 - 4, The front matter in
the document currently only contains sidebar_label and description; add the
required YAML fields by inserting a title (under 60 characters), a description
expanded to ~150–160 characters, and a numeric sidebar_position key alongside
the existing sidebar_label and description; update the existing description
value (the "description" field) to meet the 150–160 char length and ensure the
newly added "title" and "sidebar_position" keys are present near the top of the
file so static site tooling picks them up.

:::

```diff
-## Examples
+### Examples
```

⚠️ Potential issue | 🟡 Minor

Don’t demote the “Examples” heading. It can break existing anchor links.

🛠️ Suggested change
-### Examples
+## Examples
As per coding guidelines: Don't modify headings as they are often externally linked.
🤖 Prompt for AI Agents
In `@site/docs/providers/huggingface.md` at line 60, Restore the "Examples"
heading to its original level instead of demoting it; ensure the heading text
"Examples" remains at the same Markdown header level (the "### Examples" token)
so existing anchors/links keep working and do not change other surrounding
headings or content.

Comment on lines 174 to 186
```ts
private useChatCompletionFormat(): boolean {
  // Explicit config takes precedence
  if (this.config.chatCompletion !== undefined) {
    return this.config.chatCompletion;
  }
  // Auto-detect based on endpoint URL
  if (this.config.apiEndpoint) {
    return (
      this.config.apiEndpoint.includes('/v1/chat/completions') ||
      this.config.apiEndpoint.includes('/v1')
    );
  }
  return false;
}
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

HuggingFace OpenAI-compatible API endpoints /v1/chat/completions /v1/completions documentation

💡 Result:

Hugging Face has two “OpenAI-compatible” surfaces that document /v1/chat/completions and/or /v1/completions:

  1. Hugging Face Inference Providers (HF Router)
     • Chat Completions: documented under Inference Providers → Chat Completion; examples use the OpenAI SDK with base_url="https://router.huggingface.co/v1" (the SDK then hits the OpenAI-style /v1/chat/completions route). [1]
     • (Legacy) Completions: documented under Inference Providers → Text Generation; examples show base_url="https://router.huggingface.co/<provider>/v1/completions" (i.e., the OpenAI-style /v1/completions route). [2]
  2. Hugging Face Inference Endpoints running TGI (Text Generation Inference)
     • /v1/chat/completions: documented as the Messages API, "fully compatible with the OpenAI Chat Completion API," with curl examples directly calling .../v1/chat/completions. [3]
     • /v1/completions: documented at a high level in the TGI engine overview as "Fully OpenAI-compatible /v1/chat or /v1/completions APIs". [4]
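
In SDK terms, that first surface looks roughly like this (a sketch; the model ID is illustrative):

```ts
import OpenAI from 'openai';

// The OpenAI SDK pointed at the HF router; the SDK appends /chat/completions itself.
const client = new OpenAI({
  baseURL: 'https://router.huggingface.co/v1',
  apiKey: process.env.HF_TOKEN,
});

async function main() {
  const completion = await client.chat.completions.create({
    model: 'deepseek-ai/DeepSeek-R1',
    messages: [{ role: 'user', content: 'What is 2+2?' }],
  });
  console.log(completion.choices[0].message.content);
}

main();
```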

🏁 Script executed:

# First, find the HuggingFace provider file and understand the context
find . -name "huggingface.ts" -type f | head -5

Repository: promptfoo/promptfoo

Length of output: 93


🏁 Script executed:

# Search for how useChatCompletionFormat is used in the provider
rg "useChatCompletionFormat" -A 5 -B 2

Repository: promptfoo/promptfoo

Length of output: 1122


🏁 Script executed:

# Check what getChatProvider returns and how it handles the response
rg "getChatProvider" -A 10 src/providers/huggingface.ts

Repository: promptfoo/promptfoo

Length of output: 806


🏁 Script executed:

# Check the inference API path to understand what format it expects
rg "callApi.*prompt" -B 3 -A 10 src/providers/huggingface.ts | head -50

Repository: promptfoo/promptfoo

Length of output: 1277


🏁 Script executed:

# Check if there are tests for different endpoint patterns
find . -name "*huggingface*test*" -o -name "*huggingface*.spec.ts" | head -10

Repository: promptfoo/promptfoo

Length of output: 110


🏁 Script executed:

# Look at the broader context of useChatCompletionFormat to see all endpoint paths users might pass
rg "apiEndpoint" src/providers/huggingface.ts -B 2 -A 2

Repository: promptfoo/promptfoo

Length of output: 2334


Auto-detect is too broad and can misroute /v1/completions. HuggingFace offers both chat (/v1/chat/completions) and completions (/v1/completions) OpenAI-compatible endpoints with different request formats. The current check includes('/v1') matches both, causing completions endpoints to receive messages format. This needs narrowing to avoid breaking valid /v1/completions usage.

🛠️ Suggested narrowing
-    if (this.config.apiEndpoint) {
-      return (
-        this.config.apiEndpoint.includes('/v1/chat/completions') ||
-        this.config.apiEndpoint.includes('/v1')
-      );
-    }
+    if (this.config.apiEndpoint) {
+      if (this.config.apiEndpoint.includes('/v1/chat/completions')) {
+        return true;
+      }
+      // Treat base /v1 as OpenAI-compatible, but avoid /v1/completions or other non-chat paths
+      return /\/v1\/?$/.test(this.config.apiEndpoint);
+    }
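
To make the intended behavior concrete, the narrowed check would classify endpoints like this (illustrative helper, not the provider's actual method):

```ts
// Chat-specific path, or a bare OpenAI-compatible /v1 base URL.
const isChatEndpoint = (apiEndpoint: string): boolean =>
  apiEndpoint.includes('/v1/chat/completions') || /\/v1\/?$/.test(apiEndpoint);

isChatEndpoint('https://router.huggingface.co/v1/chat/completions'); // true
isChatEndpoint('https://router.huggingface.co/v1'); // true (OpenAI-compatible base)
isChatEndpoint('https://router.huggingface.co/v1/completions'); // false (legacy completions)
```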
🤖 Prompt for AI Agents
In `@src/providers/huggingface.ts` around lines 174 - 186, The auto-detect in
useChatCompletionFormat() is too broad because checking
apiEndpoint.includes('/v1') will match both chat and non-chat endpoints; update
the detection so that it only returns true for chat-specific endpoints (e.g.,
check for '/v1/chat' or '/v1/chat/completions' explicitly) while leaving
'/v1/completions' treated as non-chat; preserve the explicit
config.chatCompletion override and ensure the method returns false for generic
'/v1/completions' endpoints.

Comment on lines 712 to 714
```ts
throw new Error(
  `Invalid Huggingface provider path: ${providerPath}. Use one of the following providers: huggingface:chat:<model name>, huggingface:text-generation:<model name>, huggingface:feature-extraction:<model name>, huggingface:text-classification:<model name>, huggingface:token-classification:<model name>`,
);
```

⚠️ Potential issue | 🟡 Minor

Include sentence-similarity in the factory error message. It’s a supported task but missing in the list.

🛠️ Suggested update
-    `Invalid Huggingface provider path: ${providerPath}. Use one of the following providers: huggingface:chat:<model name>, huggingface:text-generation:<model name>, huggingface:feature-extraction:<model name>, huggingface:text-classification:<model name>, huggingface:token-classification:<model name>`,
+    `Invalid Huggingface provider path: ${providerPath}. Use one of the following providers: huggingface:chat:<model name>, huggingface:text-generation:<model name>, huggingface:feature-extraction:<model name>, huggingface:text-classification:<model name>, huggingface:token-classification:<model name>, huggingface:sentence-similarity:<model name>`,
🤖 Prompt for AI Agents
In `@src/providers/huggingface.ts` around lines 712 - 714, Update the thrown Error
message that references providerPath in the Huggingface provider factory to
include "sentence-similarity" among the supported task types; locate the throw
new Error(...) that lists "huggingface:chat, huggingface:text-generation,
huggingface:feature-extraction, huggingface:text-classification,
huggingface:token-classification" and add
"huggingface:sentence-similarity:<model name>" to that list so the error
correctly reflects supported tasks.

Comment on lines 388 to 514
```ts
describe('HuggingfaceTextGenerationProvider chat completion format', () => {
  it('auto-detects chat completion format from URL', async () => {
    const mockResponse = {
      ...defaultMockResponse,
      text: vi.fn().mockResolvedValue(
        JSON.stringify({
          choices: [{ message: { content: 'Chat response' } }],
        }),
      ),
    };
    mockFetch.mockResolvedValue(mockResponse);

    const provider = new HuggingfaceTextGenerationProvider('deepseek-ai/DeepSeek-R1', {
      config: {
        apiEndpoint: 'https://router.huggingface.co/v1/chat/completions',
        apiKey: 'test-key',
      },
    });
    const result = await provider.callApi('Test prompt');

    expect(mockFetch).toHaveBeenCalledTimes(1);
    const [url, options] = mockFetch.mock.calls[0];
    expect(url).toBe('https://router.huggingface.co/v1/chat/completions');
    const body = JSON.parse(options.body);
    expect(body).toHaveProperty('model', 'deepseek-ai/DeepSeek-R1');
    expect(body).toHaveProperty('messages');
    expect(body.messages[0]).toEqual({ role: 'user', content: 'Test prompt' });
    expect(result.output).toBe('Chat response');
  });

  it('uses explicit chatCompletion config', async () => {
    const mockResponse = {
      ...defaultMockResponse,
      text: vi.fn().mockResolvedValue(
        JSON.stringify({
          choices: [{ message: { content: 'Chat response' } }],
        }),
      ),
    };
    mockFetch.mockResolvedValue(mockResponse);

    const provider = new HuggingfaceTextGenerationProvider('my-model', {
      config: {
        apiEndpoint: 'https://my-custom-endpoint.com/api',
        chatCompletion: true,
      },
    });
    const result = await provider.callApi('Test prompt');

    expect(mockFetch).toHaveBeenCalledTimes(1);
    const [, options] = mockFetch.mock.calls[0];
    const body = JSON.parse(options.body);
    expect(body).toHaveProperty('messages');
    expect(result.output).toBe('Chat response');
  });

  it('maps HuggingFace parameters to OpenAI format', async () => {
    const mockResponse = {
      ...defaultMockResponse,
      text: vi.fn().mockResolvedValue(
        JSON.stringify({
          choices: [{ message: { content: 'Response' } }],
        }),
      ),
    };
    mockFetch.mockResolvedValue(mockResponse);

    const provider = new HuggingfaceTextGenerationProvider('model', {
      config: {
        apiEndpoint: 'https://api.example.com/v1/chat/completions',
        temperature: 0.7,
        top_p: 0.9,
        max_new_tokens: 100,
      },
    });
    await provider.callApi('Test');

    const [, options] = mockFetch.mock.calls[0];
    const body = JSON.parse(options.body);
    expect(body.temperature).toBe(0.7);
    expect(body.top_p).toBe(0.9);
    expect(body.max_tokens).toBe(100);
  });

  it('handles chat completion error response', async () => {
    const mockResponse = {
      ...defaultMockResponse,
      text: vi.fn().mockResolvedValue(
        JSON.stringify({
          error: { message: 'Model not found' },
        }),
      ),
    };
    mockFetch.mockResolvedValue(mockResponse);

    const provider = new HuggingfaceTextGenerationProvider('model', {
      config: {
        apiEndpoint: 'https://api.example.com/v1/chat/completions',
      },
    });
    const result = await provider.callApi('Test');

    expect(result.error).toContain('Model not found');
  });

  it('falls back to Inference API format when chatCompletion is false', async () => {
    const mockResponse = {
      ...defaultMockResponse,
      text: vi.fn().mockResolvedValue(JSON.stringify({ generated_text: 'Output' })),
    };
    mockFetch.mockResolvedValue(mockResponse);

    const provider = new HuggingfaceTextGenerationProvider('model', {
      config: {
        apiEndpoint: 'https://api.example.com/v1/chat/completions',
        chatCompletion: false, // Explicitly disable
      },
    });
    await provider.callApi('Test');

    const [, options] = mockFetch.mock.calls[0];
    const body = JSON.parse(options.body);
    expect(body).toHaveProperty('inputs');
    expect(body).toHaveProperty('parameters');
    expect(body).not.toHaveProperty('messages');
  });
});
```

⚠️ Potential issue | 🟠 Major

Add required 4xx/5xx, rate‑limit, config‑validation, and token‑usage tests for the chat path. The new block covers success and body‑error cases only.

As per coding guidelines: Every provider must have tests covering: success cases, error cases (4xx, 5xx, rate limits), configuration validation, and token usage tracking.
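
For instance, a 5xx case in the style of the existing specs might look like this (a sketch; the exact error shape depends on the provider's error handling):

```ts
it('surfaces HTTP 500 errors from the chat endpoint', async () => {
  mockFetch.mockResolvedValue({
    ...defaultMockResponse,
    ok: false,
    status: 500,
    statusText: 'Internal Server Error',
    text: vi.fn().mockResolvedValue(JSON.stringify({ error: { message: 'Upstream failure' } })),
  });

  const provider = new HuggingfaceTextGenerationProvider('model', {
    config: { apiEndpoint: 'https://api.example.com/v1/chat/completions' },
  });
  const result = await provider.callApi('Test');

  expect(result.error).toContain('Upstream failure');
});
```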

🤖 Prompt for AI Agents
In `@test/providers/index.test.ts` around lines 388 - 514, Add tests to cover HTTP
4xx/5xx errors, rate-limit responses, invalid configuration, and token-usage
tracking for the chat completion path by extending the existing describe block
that exercises HuggingfaceTextGenerationProvider and its callApi method;
specifically, add cases that (1) mock fetch to return status 400/500 with an
error body and assert result.error contains the message, (2) mock a 429 response
and/or a 200 with rate-limit headers (eg. x-ratelimit-reset, retry-after) and
assert the provider surfaces rate-limit info, (3) call new
HuggingfaceTextGenerationProvider instances with invalid configs (e.g., missing
apiEndpoint or invalid chatCompletion types) and assert validation errors, and
(4) mock responses that include usage/token fields (or headers) and assert the
provider records token usage on result. Use the same test utilities (mockFetch,
defaultMockResponse) and follow the style of existing tests (parsing
options.body, checking messages/inputs) to implement these additional specs.

@github-actions
Contributor

github-actions bot commented Feb 2, 2026

Security Review ✅

No critical issues found. This PR adds a well-structured HuggingFace chat completion provider that extends the existing OpenAI provider.

🟡 Minor Observations (3 items)
  • src/providers/huggingface.ts:88-89 - The inference provider parsing logic uses !modelName.includes(':'), which could misparse model names containing colons that aren't provider suffixes (illustrated below). This is a minor edge case and matches the existing documentation pattern.

  • test/providers/huggingface.test.ts:550-551 - Global fetch mock is set without cleanup in afterAll. The afterEach with vi.clearAllMocks() handles this, but consider using vi.restoreAllMocks() for complete mock restoration.

  • site/docs/providers/huggingface.md:185-186 - The curl example in docs exposes a pattern for API enumeration (curl https://huggingface.co/api/models/MODEL_ID?expand[]=inferenceProviderMapping). This is intentional and documented by HuggingFace, so not a security concern.
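
To illustrate the first observation, a hypothetical suffix parser that splits on the last colon reads any colon in a model ID as a provider suffix:

```ts
// Hypothetical parser, not the actual implementation:
const parse = (path: string) => {
  const idx = path.lastIndexOf(':');
  return idx === -1
    ? { model: path }
    : { model: path.slice(0, idx), provider: path.slice(idx + 1) };
};

parse('Qwen/QwQ-32B:featherless-ai'); // { model: 'Qwen/QwQ-32B', provider: 'featherless-ai' } (intended)
parse('org/model:v2'); // { model: 'org/model', provider: 'v2' } ('v2' misread as a provider)
```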

Notes

  • ✅ API keys properly sourced from environment variables (HF_TOKEN, HF_API_TOKEN) with no hardcoded secrets
  • ✅ Extends OpenAiChatCompletionProvider which has established security patterns for API communication
  • ✅ URL handling properly strips duplicate path segments (/chat/completions) to prevent malformed requests
  • ✅ Tests cover API key fallback behavior and error handling
  • ✅ PR scope (providers) is correct per conventions (not redteam-related)

Last updated: 2026-02-03T12:00:00Z | Reviewing: 885be34

@promptfoo-scanner promptfoo-scanner bot left a comment

👍 All Clear

I reviewed this PR which adds a new HuggingFace chat completion provider. The changes implement provider infrastructure code that connects applications to HuggingFace's OpenAI-compatible API. After analyzing the data flows, API key handling, and inherited capabilities from the OpenAI provider, no LLM security vulnerabilities were found.

Minimum severity threshold: 🟡 Medium | To re-scan after changes, comment @promptfoo-scanner

- Fix overly broad auto-detection: change `/v1` to `/v1/chat` to avoid
  matching non-chat endpoints like `/v1/completions`
- Add sentence-similarity to error messages in registry
- Remove unused factory function `createHuggingfaceProvider`
- Fix example config field order: prompts before providers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@mldangelo
Member Author

@jameshiester Re: smoke test - looking at the existing smoke tests in test/smoke/, they use deterministic providers (like echo) to test CLI behavior without making real API calls. For HuggingFace, a meaningful smoke test would require real API calls (which needs an API key and has latency).

What we could test in smoke tests without API calls:

  • Provider registry loads huggingface:chat: correctly (already covered by unit tests in test/providers/index.test.ts)
  • Config validation accepts the new provider format

What would require an integration test with a real API key:

  • End-to-end chat completion responses
  • Auto-detection behavior with real endpoints

I've also addressed the review feedback:

  • Fixed overly broad /v1 auto-detection → now checks /v1/chat specifically
  • Added sentence-similarity to error messages
  • Removed unused createHuggingfaceProvider factory function
  • Fixed example config field ordering

mldangelo and others added 2 commits February 2, 2026 16:16
Update documentation and code comment to accurately reflect that
auto-detection matches `/v1/chat` (not `/v1`) after the fix.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Ensures auto-detection only triggers for /v1/chat endpoints, not
/v1/completions (which would be a non-chat completion endpoint).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@jameshiester jameshiester self-requested a review February 3, 2026 05:05
mldangelo and others added 8 commits February 2, 2026 23:09
…verage

- Remove requiresApiKey() override that suppressed auth error messages
- Remove misleading repetition_penalty → frequency_penalty mapping
- Fix URL stripping to use anchored regex instead of String.replace
- Guard max_new_tokens mapping to not override explicit max_tokens
- Shallow-copy providerOptions to prevent mutation bugs
- Add dedicated test file with 29 tests covering identity, auth,
  URL handling, parameter mapping, errors, and registry integration
- Move chat delegation tests from index.test.ts to dedicated file
- Restore ## Examples heading in docs (was demoted, breaking anchors)
- Update providers index to show huggingface:chat as primary example
- Simplify example config

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
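
The anchored-regex fix, sketched (illustrative, not the exact code):

```ts
// An unanchored String.replace removes the first match anywhere in the string;
// anchoring the regex strips only a trailing /chat/completions segment.
const stripChatPath = (url: string): string => url.replace(/\/chat\/completions\/?$/, '');

stripChatPath('https://router.huggingface.co/v1/chat/completions'); // 'https://router.huggingface.co/v1'
stripChatPath('https://example.com/chat/completions/proxy/v1'); // unchanged: no trailing match
```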
…ce provider

- getApiKey() now checks this.env?.HF_TOKEN and this.env?.HF_API_TOKEN,
  matching the provider-level env override pattern used by other
  OpenAI-compatible wrappers
- HuggingfaceTextGenerationProvider.cleanup() forwards to inner chat
  provider, preventing MCP resource leaks when delegating

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
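
A sketch of that lookup order (provider-level this.env wins over process-level variables; the exact fallback chain is an assumption):

```ts
// Provider-level env overrides take precedence over process environment variables.
function resolveHfApiKey(env?: Record<string, string | undefined>): string | undefined {
  return (
    env?.HF_TOKEN ??
    env?.HF_API_TOKEN ??
    process.env.HF_TOKEN ??
    process.env.HF_API_TOKEN
  );
}
```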
- Qualify tool calling and streaming as model-dependent in HF docs
- Reduce example config to a single provider to avoid surprise costs
- Add cost note to example README

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
… IDs

Update docs and example to use HuggingFace's officially recommended
Inference Provider models that have been verified working via E2E
testing: DeepSeek-R1, openai/gpt-oss-120b, Qwen2.5-Coder-32B,
GLM-4.5, and google/gemma-3-27b-it.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…r routing

Add support for HuggingFace's Inference Provider routing system, which
allows routing requests to specific backend providers (Cerebras, Together,
Fireworks AI, etc.). Users can specify a provider via:

- Model name suffix: `huggingface:chat:org/model:provider-name`
- Config option: `config.inferenceProvider: 'provider-name'`

This enables access to models that require explicit provider selection,
such as `Qwen/QwQ-32B:featherless-ai`.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
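
For example (a config sketch; `together` is an illustrative provider slug):

```yaml
providers:
  # Provider selected via model-name suffix (takes precedence over config)
  - id: huggingface:chat:Qwen/QwQ-32B:featherless-ai

  # Provider selected via config
  - id: huggingface:chat:deepseek-ai/DeepSeek-R1
    config:
      inferenceProvider: together
```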
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…lability

Clarify that model name suffix takes precedence over inferenceProvider
config, and note that available models/providers change over time.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@mldangelo mldangelo merged commit cd709b7 into main Feb 3, 2026
45 checks passed
@mldangelo mldangelo deleted the feat/huggingface-chat-provider branch February 3, 2026 18:14