Fix LookML parser edge cases from real-world fixtures#37
Conversation
…es, duplicate refinements
- Handle both LookML filter syntaxes: shorthand [field: value] and block style {field: x, value: y}
- Skip placeholder measures (type: number with no SQL) used in template views
- Skip duplicate refinements (+view) to prevent "model already exists" errors
- Add 61 comprehensive edge case tests covering real-world patterns from 689 fixtures
- All 178 LookML tests pass; parser now handles 689 real-world fixtures from joshtemple/lkml
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 87196fc7c7
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| # Check if all parts are simple strings (IN clause) | ||
| if all(not p.startswith("-") and not re.match(r"^(>=|<=|!=|<>|>|<)", p) for p in parts): | ||
| # Check if all parts are non-numeric strings | ||
| if not all(p.replace(".", "").replace("-", "").isdigit() for p in parts): | ||
| quoted = ", ".join(f"'{p}'" for p in parts) |
There was a problem hiding this comment.
Treat numeric filter lists as IN conditions
When a filter value is a comma-separated list of numbers (e.g., filters: [order_id: "1,2,3"]), this logic only emits an IN (...) clause if not all parts are numeric. Numeric lists therefore fall through to the default branch and become {model}.field = '1,2,3', which changes filter semantics and will return no matches in typical SQL. This is a regression for numeric dimensions that use LookML list syntax and should produce IN (1,2,3) (or a numeric equivalent).
Useful? React with 👍 / 👎.
Summary
Fixed three critical bugs in the LookML parser discovered through testing against 689 real-world fixtures from joshtemple/lkml:
Block-style filters not parsed: LookML has two filter syntaxes (shorthand
[field: value]and block{field: x value: y}). The parser only handled shorthand, silently dropping block-style filters used in many real projects.Placeholder measures crash the parser: Template views use placeholder measures like
measure: my_measure { type: number }with no SQL. These are meant to be overridden but crashed with validation errors.Duplicate refinements cause errors: Files with multiple refinements of the same view (
view: +order_itemsappearing twice) crashed with "model already exists" instead of gracefully handling the pattern.Changes