feat(aiobs): declarative recipe-based normalizer#60713
Conversation
|
Size Change: 0 B Total Size: 81 MB ℹ️ View Unchanged
|
|
👋 Visual changes detected for this PR. Review and approve in PostHog Visual Review If these changes are unexpected, they may be caused by a flaky test or a broken snapshot on master. Don't approve — rerun the job or wait for a fix. |
|
🎭 Playwright report · View test results →
These issues are not necessarily caused by your changes. |
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
products/ai_observability/frontend/normalizer/recipe/compile/compiler.ts:370-372
`operatorArg` silently returns `LiteralExpr(undefined)` when a required key is absent, so a recipe that typos `from:` as e.g. `fom:` compiles without error, then at runtime `JoinExpr`/`SelectExpr`/`RejectExpr`/`OmitExpr` each fall through to their "not an array/object" branches and produce `''` / `[]` / `{}` with no diagnostic. Because `asOperatorMapping` also discards unknown keys without complaint, both the misspelled key and the missing required key go completely unreported — the recipe just silently outputs wrong data.
```suggestion
function operatorArg(args: Record<string, unknown>, key: string): Expr {
if (!(key in args)) {
throw new Error(`operator requires a '${key}:' argument`)
}
return compileValue(args[key])
}
```
### Issue 2 of 2
products/ai_observability/frontend/normalizer/recipe/compile/operatorSuggestion.test.ts:6-10
Three distinct edit-distance cases (deletion, insertion, substitution) are asserted together in one test. If the first assertion fails, the other two don't run, and the failure message doesn't identify which edit type broke. Per the team's rule of preferring parameterised tests, these should each be their own row.
```suggestion
it.each([
['selct', 'select', 'deletion'],
['joins', 'join', 'insertion'],
['omat', 'omit', 'substitution'],
])('suggests %s → %s (%s)', (typo, expected) => {
expect(nearestOperator(typo, OPERATORS)).toBe(expected)
})
```
Reviews (1): Last reviewed commit: "chore(llma): fix typecheck in recipe nor..." | Re-trigger Greptile |
| if (result === NO_MATCH) { | ||
| // cajole.yaml matches anything, so NO_MATCH means a coverage gap, not a normal miss. | ||
| throw new Error( | ||
| `RecipeNormalizer: no recipe matched ${JSON.stringify(input)?.slice(0, 200)} — cajole.yaml should be the final catch-all` |
There was a problem hiding this comment.
Should we do a safe stringify that can't throw itself? Seems like if we get no match, something is really weird with input.
| litellm, | ||
| openaiChat, | ||
| openaiResponses, | ||
| otel, |
There was a problem hiding this comment.
Should the otel recipe be versioned?
There was a problem hiding this comment.
I don't think so, for the moment it just handles any shape, if at some point that becomes too complex or has a clear split, we can always do so then.
| } | ||
|
|
||
| function stringField(o: Record<string, unknown>, key: string): string | undefined { | ||
| return typeof o[key] === 'string' ? (o[key] as string) : undefined |
There was a problem hiding this comment.
Weird that you need to type cast in these functions, should we make a type guard instead?
| if (base.content === undefined) { | ||
| base.content = '' | ||
| } | ||
| return base as CompatMessage |
There was a problem hiding this comment.
It's not by that point, just the compiler didn't pick it up. Refactored it to make it clearer.
Radu-Raicea
left a comment
There was a problem hiding this comment.
Great, not much to say, I like the approach!
I'm thinking whether this belongs in frontend/ though (directory). It feels like we might want to parse properties with these recipes in the backend too (maybe), for example in the MCP?
Thought about this too. Will leave it here for the moment and see what we do with it when it when we need to call it from somewhere else. Another concern is that we might need to call this from Python code at some point for the MCP, which might be a whole other mess. Let's cross that bridge when we get there. |
Problem
Parsing of AI events input and output happens in a complex TS file. This means it's hard to follow as we add providers, and there's no way for users to customise it.
Changes
Add a new parsing system for AI events based on declarative rules under a feature flag. The scope for this is only internal as it's a large change code-wise, the fact that we don't change functionality simplifies review.
Also bundles some changes that allow us to load YAML files directly.
Deliberately does not wire this up yet, the shared test suit is the main exercise point for the new parser. This is because the plumbing would involve changes to many more files and bloat the PR even more, will handle it in a separate PR.
How did you test this code?
There is a comprehensive test suite that was already in use for the legacy path, the new parser passes it without any modifications to the suite except adding one extra test case.
Automatic notifications
Docs update
no