LLM Obs: Move hallucination detection evaluation doc #35309
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c8484e4cfa
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
content/en/llm_observability/evaluations/managed_evaluations/_index.md
Outdated
💡 Codex Review
Reviewed commit: 0afc7c05ce
content/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/_index.md
Outdated
content/en/llm_observability/evaluations/managed_evaluations/_index.md
Outdated
💡 Codex Review
Reviewed commit: d43571c900
{{< /tabs >}}

If your LLM provider restricts IP addresses, you can obtain the required IP ranges by visiting [Datadog's IP ranges documentation][2], selecting your `Datadog Site`, pasting the `GET` URL into your browser, and copying the `webhooks` section.
Learn more about the [compatibility requirements][2].
Preserve removed BYOK anchor target
Removing the "Connect your LLM provider account" section also removed the `#connect-your-llm-provider-account` anchor, but `content/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/_index.md` still links to `/llm_observability/evaluations/managed_evaluations#connect-your-llm-provider-account` ([2]). After this change, users following that custom-evaluation setup link are dropped at the top of the managed page with no matching section, so the provider-connection step is no longer reachable from the documented flow.
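One way to catch this class of regression before merge is a small anchor check. The following is only a sketch: the helper names (`slugify`, `heading_anchors`, `missing_anchors`) are hypothetical, and the slug rule is a simplified approximation of how heading anchors are typically generated, not the site generator's exact algorithm.

```python
import re


def slugify(heading: str) -> str:
    """Approximate a heading anchor: lowercase, drop punctuation, hyphenate spaces.

    Assumption: this mirrors common Markdown anchor rules only roughly.
    """
    text = heading.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)
    return re.sub(r"\s+", "-", text)


def heading_anchors(markdown: str) -> set[str]:
    """Collect approximate anchors for every ATX heading (1-6 leading '#')."""
    pattern = re.compile(r"^#{1,6}\s+(.+?)\s*$", flags=re.MULTILINE)
    return {slugify(match.group(1)) for match in pattern.finditer(markdown)}


def missing_anchors(links: list[str], pages: dict[str, str]) -> list[str]:
    """Return links whose '#fragment' matches no heading on the target page.

    `pages` maps a site path to that page's Markdown source (hypothetical input).
    """
    broken = []
    for link in links:
        path, _, fragment = link.partition("#")
        if fragment and fragment not in heading_anchors(pages.get(path, "")):
            broken.append(link)
    return broken
```

Run against the page source before and after the section removal, the link cited above would pass the first check and be reported by the second.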
joepeeples left a comment
Approved with a couple small edit suggestions, thanks!
## Estimated token usage

You can monitor the token usage of your LLM evaluations using [this dashboard][8].
Suggested change:
- You can monitor the token usage of your LLM evaluations using [this dashboard][8].
+ You can monitor the token usage of your LLM evaluations using the [LLM Evaluations Token Usage dashboard][8].
...t/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md
…_evaluations/_index.md Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
💡 Codex Review
Reviewed commit: 6db9d84b86
...t/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md
….com:DataDog/documentation into gergely.svigruha/templetized-hallu-detection
💡 Codex Review
Reviewed commit: 6713bbbe86
...ent/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/connect_to_account.md
💡 Codex Review
Reviewed commit: 282d759948
- [Language Mismatch][3] - Flags responses that are written in a different language than the user’s input
- [Sensitive Data Scanning][4] - Flags the presence of sensitive or regulated information in model inputs or outputs
Reconcile managed evaluation scope in this page
This new “Supported managed evaluations” list now limits managed evaluations to Language Mismatch and Sensitive Data Scanning, but the overview text in the same page still says managed evaluations include sentiment, topic relevancy, toxicity, failure to answer, and hallucination. That contradiction leaves readers with incompatible setup expectations (for example, looking for evaluations that are no longer listed as supported), so the page should be made internally consistent.
💡 Codex Review
Reviewed commit: c2a424c636
* Add secret ID notes (#35272)
  * add notes
  * small edit
* Update MCP docs: recommend custom connectors for Claude Desktop & claude.ai (#35285)
  * Update MCP docs: recommend custom connectors for Claude Desktop & claude.ai

    The local binary is no longer needed for Claude Desktop or claude.ai; both now support custom connectors with the remote MCP URL natively. Replaces the stdio/binary setup instructions with a link to the Claude help center guide.

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  * Simplify tab title to just "Claude" to cover all Claude products

    Addresses PR feedback: custom connectors work across Claude (web), Claude Desktop, and Claude Cowork, so "Claude" covers them all.

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
  * Say "including Claude Cowork" instead of "including Claude Desktop"

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

  ---------

  Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Remove preview feature notice from prompt optimization (#35288)

  Removed preview feature notice for Prompt Optimization.
* [DDSQL-1503] Follow-up on dd.logs() description (#35295)
  * Update dd.logs description
  * Fix spacing
* [MLObs] adding clarification notes about the metrics (#35248)
  * adding clairification notes about the metrics
  * remove typo newline
  * explain the metrics are only generated for certain keys
* [DOCS-13590] Add Fusion setup guide (#35059)
  * [DOCS-13590] Add Fusion setup guide
  * [DOCS-13590] Update preview callout
  * [DOCS-13590] Update preview callout text
  * [DOCS-13590] Add validation section
  * [DOCS-13590] Add US1-FED site support banner to Oracle Fusion integration setup guide
  * [DOCS-13590] Incorporate cswatt's feedback
  * [DOCS-13590] Remove ORA_FND_READ_ONLY_ACCESS_ABSTRACT permission
* Remove MCP Server Preview form alert from VS Code & Cursor extension docs (#35303)

  Remove 'The Datadog MCP Server is in Preview. Complete this form to request access.' from both VS Code and Cursor tabs on the IDE plugins page.

  Made-with: Cursor

  Co-authored-by: Sumedha Mehta <sumedha.mehta@datadoghq.com>
* [DOCS-13433] Fix valid tag characters to include commas (#35249)

  Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Docs13590/fusion integration ga (#35315)
  * [DOCS-13590] Remove preview banner and make doc public
  * [DOCS-13590] Add Oracle Fusion integration setup guide
* [DOCS-13642] Add US1-FED port restriction note to log forwarding docs (#35313)

  Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Update Go Live Debugger page with eBPF limitations (#35310)
* [DOCS-12531] Update integration developers getting started guide (#34741)
  * Rewrite requirements and getting-started
  * Update links
  * Make Vale corrections
  * Apply suggestions from code review

    Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
    Co-authored-by: Dominic Medina <115744456+dd-dominic@users.noreply.github.com>
    Co-authored-by: Eva Parish <eva.parish@datadoghq.com>

  ---------

  Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
  Co-authored-by: Dominic Medina <115744456+dd-dominic@users.noreply.github.com>
* [DOCS-13670] Standardize buffer section in destination docs (#35267)
  * [DOCS-13670] Standardize buffer section in destination docs

    Replace destination_buffer_numbered with destination_buffer shortcode.
  * updates
  * small edit
  * small edit
  * add for splunk hec
* Translation Pipeline PR (#35291)
  * Translated file updates
  * Translated file updates
  * Translated file updates
  * fix erroneously translated `tab` shortcodes
  * fix malformed link syntax

  ---------

  Co-authored-by: webops-guacbot[bot] <214537265+webops-guacbot[bot]@users.noreply.github.com>
  Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
* Add assets to support the Cdocs stepper (not in use yet) (#35312)
  * Sketch in stepper styles
  * Tweak styles
  * Check off completed steps
  * Flesh out example steps
  * Make steps searchable
  * Nudge elements
  * Update example step
  * Tweak stepper behavior
  * Use a green checkmark circle to mark completed tasks
  * Tweak button wording
  * Tweak wording
  * Tweak stepper line width
  * Tweak appearance
  * Improve focus visibility
  * Improve accessibility
  * Improve accessibility
  * Tweak checkmark
  * Tweak button text size
  * Tweak loading behavior
  * Button tweaks
  * Tweaks
  * Update demo markup
  * [wip] Incorporate feedback
  * Make the clicked step the active step
  * Prevent step titles from being hidden under the sticky menu
  * Tweak reset behavior
  * Style expand/collapse buttons as links
  * Improve responsiveness
  * Tweak styles
  * Tweak icons
  * Tweak spacing
  * Fix stepper icon URLs
  * Tone down expand/collapse toggle styling (#35284)

    Reduce visual weight of the expand all / collapse toggle so it reads as a quiet utility control rather than competing with step titles.

    - font-size: 16px → 14px
    - font-weight: 600 → 500
    - text-transform: uppercase → none (sentence case)
    - Add subtle letter-spacing
  * Tweaks
  * Delete stepper demo file
  * Revert changes in package.json
  * Implement Codex feedback
  * Fix bug
  * Update assets/styles/components/_collapsible-section.scss

    Co-authored-by: StefonSimmons <57869435+StefonSimmons@users.noreply.github.com>

  ---------

  Co-authored-by: Brett Blue <84536271+brett0000FF@users.noreply.github.com>
  Co-authored-by: StefonSimmons <57869435+StefonSimmons@users.noreply.github.com>
* LLM Obs: Move hallucination detection evaluation doc (#35309)
  * move hallucination doc
  * tweaks
  * add back screenshot
  * remove usused code
  * fixlinks
  * Update content/en/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/_index.md

    Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
  * add back account
  * links
  * fix title
  * more fixes

  ---------

  Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>

---------

Co-authored-by: May Lee <may.lee@datadoghq.com>
Co-authored-by: Reilly Wood <163153147+rgwood-dd@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Charles Jacquet <charles.jacquet@datadoghq.com>
Co-authored-by: Mariana Dutra <88353514+mariddc@users.noreply.github.com>
Co-authored-by: Xinyuan Guo <xinyuan.guo@datadoghq.com>
Co-authored-by: Bryce Eadie <bryce.eadie@datadoghq.com>
Co-authored-by: sumedham <87997309+sumedham@users.noreply.github.com>
Co-authored-by: Sumedha Mehta <sumedha.mehta@datadoghq.com>
Co-authored-by: Rosa Trieu <107086888+rtrieu@users.noreply.github.com>
Co-authored-by: Esther Kim <esther.kim@datadoghq.com>
Co-authored-by: ajwerner <awerner32@gmail.com>
Co-authored-by: Eva Parish <eva.parish@datadoghq.com>
Co-authored-by: Joe Peeples <joe.peeples@datadoghq.com>
Co-authored-by: Dominic Medina <115744456+dd-dominic@users.noreply.github.com>
Co-authored-by: webops-guacbot[bot] <214537265+webops-guacbot[bot]@users.noreply.github.com>
Co-authored-by: Jen Gilbert <jen.gilbert@datadoghq.com>
Co-authored-by: Brett Blue <84536271+brett0000FF@users.noreply.github.com>
Co-authored-by: StefonSimmons <57869435+StefonSimmons@users.noreply.github.com>
Co-authored-by: Gergely Svigruha <gsvigruha@users.noreply.github.com>