Chromium Issue Tracker
https://issues.chromium.org/issues/502214118
Observations
Currently, the Prompt API does not expose sampling parameters to standard web pages. Previously, raw sampling parameters like temperature and topK were available, but they were recently deprecated in web page contexts and restricted only to Chrome Extensions (see webmachinelearning/prompt-api#170).
This deprecation was a direct result of W3C TAG feedback regarding interoperability (see w3ctag/design-reviews#1093). Raw sampling parameters are model-internal scalars whose behavior drifts silently across different models and versions. Consequently, the same numerical input could produce vastly different behaviors depending on the user's browser and the specific model version they had downloaded, leading to silent breaks in API expectations.
Impact
Completely removing sampling controls limits the usefulness of the Prompt API, as developers lose the ability to fine-tune the model's output variety for specific use cases. Without these controls, developers cannot easily force a model to produce highly deterministic output for factual extraction, nor can they increase variance for creative brainstorming tasks.
Request for Specification
To provide developers with an easy lever for output variety while maintaining cross-browser interoperability, the Prompt API specification should introduce Categorical Sampling Modes.
Instead of passing raw numerical scalars, developers would pass a semantic enum (samplingMode) during session creation. Each browser vendor would then be responsible for mapping these semantic presets to the optimal raw parameters (e.g., temperature, topK, topP, minP) for their specific underlying model.
We propose the following five behavioral categories:
deterministic: For tasks requiring strict consistency and reproducibility (e.g., code generation or factual extraction).
precise: For highly focused outputs with minimal variation.
balanced: The default state, serving as the "sweet spot" for standard conversational interactions.
creative: For tasks where variety is preferred over strict factual accuracy.
imaginative: For maximum token diversity and brainstorming.
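To illustrate the vendor-mapping responsibility described above, here is a hypothetical sketch of how an implementation might resolve a preset to raw parameters. The `resolveSamplingParams` helper and all numeric values are illustrative assumptions, not proposed spec values:

```javascript
// Illustrative only: one possible vendor-side mapping from semantic
// presets to raw sampling parameters. The numbers are assumptions;
// each vendor would tune them for its own underlying model.
const PRESET_MAP = {
  deterministic: { temperature: 0.0, topK: 1 },
  precise:       { temperature: 0.3, topK: 10 },
  balanced:      { temperature: 0.7, topK: 40 },
  creative:      { temperature: 1.0, topK: 80 },
  imaginative:   { temperature: 1.3, topK: 128 },
};

// Resolve a semantic preset to raw parameters, rejecting unknown modes.
function resolveSamplingParams(mode) {
  const params = PRESET_MAP[mode];
  if (!params) {
    throw new TypeError(`Unknown samplingMode: ${mode}`);
  }
  return params;
}
```

Because the mapping lives inside the browser, a vendor can retune these values whenever it ships a new model, without any observable API change for web pages.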
To prevent ambiguity, this parameter should be mutually exclusive with any legacy raw parameters where they still exist. Providing both a samplingMode and a raw parameter like temperature should result in a TypeError.
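The mutual-exclusivity rule could be sketched as follows; `validateCreateOptions` is a hypothetical helper shown only to make the intended behavior concrete:

```javascript
// Hypothetical validation sketch for the proposed mutual exclusivity:
// a samplingMode preset may not be combined with raw sampling scalars.
function validateCreateOptions(options) {
  const rawParams = ["temperature", "topK", "topP", "minP"];
  if (options.samplingMode !== undefined) {
    for (const param of rawParams) {
      if (options[param] !== undefined) {
        throw new TypeError(
          `samplingMode cannot be combined with raw parameter "${param}"`
        );
      }
    }
  }
  return options;
}
```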
Example
The proposed Web IDL additions would look like this:
enum AILanguageModelSamplingMode {
  "deterministic",
  "precise",
  "balanced",
  "creative",
  "imaginative"
};

dictionary AILanguageModelCreateOptions {
  // ... existing fields ...
  AILanguageModelSamplingMode samplingMode;
};
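From a page's perspective, usage would look like the sketch below. This assumes the proposed samplingMode option is adopted and that the session factory is LanguageModel.create(), as in the current Prompt API shape; the feature check makes the sketch safe to run where the API is absent:

```javascript
// Usage sketch for the proposed samplingMode option (assumption: the
// option is adopted as specified above). Falls back gracefully where
// the Prompt API is unavailable.
async function createSession(mode) {
  if (typeof LanguageModel === "undefined") {
    console.log("Prompt API not available in this environment.");
    return null;
  }
  // No raw temperature/topK here: under this proposal, samplingMode
  // is mutually exclusive with raw sampling parameters.
  return LanguageModel.create({ samplingMode: mode });
}

createSession("deterministic").then((session) => {
  if (session) console.log("Session created with deterministic sampling.");
});
```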