Proposal: Introduce Categorical Sampling Modes for the Prompt API #203

@isaacahouma

Description

Chromium Issue Tracker
https://issues.chromium.org/issues/502214118

Observations

Currently, the Prompt API does not expose sampling parameters to standard web pages. Raw sampling parameters such as temperature and topK were previously available, but they were recently deprecated in web page contexts and restricted to Chrome Extensions (see webmachinelearning/prompt-api#170).

This deprecation was a direct result of W3C TAG feedback regarding interoperability (see w3ctag/design-reviews#1093). Raw sampling parameters are model-internal scalars whose behavior drifts silently across different models and versions. Consequently, the same numerical input could produce vastly different behaviors depending on the user's browser and the specific model version they had downloaded, leading to silent breaks in API expectations.

Impact

Completely removing sampling controls limits the usefulness of the Prompt API, as developers lose the ability to fine-tune the model's output variety for specific use cases. Without these controls, developers cannot easily force a model to produce highly deterministic output for factual extraction, nor can they increase variance for creative brainstorming tasks.

Request for Specification

To provide developers with an easy lever for output variety while maintaining cross-browser interoperability, the Prompt API specification should introduce Categorical Sampling Modes.

Instead of passing raw numerical scalars, developers would pass a semantic enum (samplingMode) during session creation. Each browser vendor would then be responsible for mapping these semantic presets to the optimal raw parameters (e.g., temperature, topK, topP, minP) for their specific underlying model.

We propose the following five behavioral categories:

  • deterministic: For tasks requiring strict consistency and reproducibility (e.g., code generation or factual extraction).
  • precise: For highly focused outputs with minimal variation.
  • balanced: The default state, serving as the "sweet spot" for standard conversational interactions.
  • creative: For tasks where variety is preferred over strict factual accuracy.
  • imaginative: For maximum token diversity and brainstorming.
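To make the vendor-side responsibility concrete, the mapping from semantic presets to raw decoding parameters could be sketched as below. The numeric values and the `resolveSamplingMode` helper are purely illustrative assumptions, not specified values; each vendor would tune its own table per model and model version.

```javascript
// Illustrative vendor-side mapping from the proposed samplingMode
// presets to raw decoding parameters. The numbers here are invented
// for illustration only; a real implementation would tune them for
// its specific underlying model.
const SAMPLING_PRESETS = {
  deterministic: { temperature: 0.0, topK: 1 },   // greedy decoding
  precise:       { temperature: 0.3, topK: 3 },
  balanced:      { temperature: 0.7, topK: 40 },  // default "sweet spot"
  creative:      { temperature: 1.0, topK: 64 },
  imaginative:   { temperature: 1.3, topK: 128 }, // maximum diversity
};

function resolveSamplingMode(mode = "balanced") {
  const preset = SAMPLING_PRESETS[mode];
  if (!preset) {
    throw new TypeError(`Unknown samplingMode: ${mode}`);
  }
  return preset;
}
```

Because the enum names are behavioral rather than numeric, two browsers can map `"creative"` to different raw values yet still deliver comparable output variety, which is the interoperability property the TAG feedback asked for.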

To prevent ambiguity, this parameter should be mutually exclusive with any legacy raw parameters where they still exist. Providing both a samplingMode and a raw parameter like temperature should result in a TypeError.
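The mutual-exclusion rule could be enforced with a validation step like the following sketch. The option names match this proposal, but the `validateCreateOptions` helper itself is hypothetical; in practice the check would live inside the browser's implementation of session creation.

```javascript
// Sketch of the proposed mutual-exclusion check: supplying both a
// samplingMode and a legacy raw parameter (e.g. temperature or topK)
// must result in a TypeError. This helper is hypothetical.
function validateCreateOptions(options = {}) {
  const rawParams = ["temperature", "topK"].filter((k) => k in options);
  if ("samplingMode" in options && rawParams.length > 0) {
    throw new TypeError(
      `samplingMode cannot be combined with raw parameter(s): ${rawParams.join(", ")}`
    );
  }
  return options;
}
```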

Example

The proposed Web IDL additions would look like this:

enum AILanguageModelSamplingMode {
  "deterministic",
  "precise",
  "balanced",
  "creative",
  "imaginative"
};

dictionary AILanguageModelCreateOptions {
  // ... existing fields ...
  
  AILanguageModelSamplingMode samplingMode;
};
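From the developer's side, usage might look like the sketch below. It assumes the Prompt API's existing `LanguageModel.create()` / `session.prompt()` entry points and simply adds the proposed `samplingMode` field; the `extractFacts` wrapper is hypothetical.

```javascript
// Hypothetical usage sketch: a deterministic session for factual
// extraction. Assumes the existing LanguageModel.create() and
// session.prompt() API shape; only samplingMode is new.
async function extractFacts(text, LM = globalThis.LanguageModel) {
  const session = await LM.create({
    samplingMode: "deterministic", // strict consistency, reproducible output
  });
  return session.prompt(`List the key facts stated in: ${text}`);
}
```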

Metadata


Labels: Agenda+, enhancement (New feature or request), interop (Potential concerns about interoperability among multiple implementations of the API)
