Define Python Prelude in Laurel instead of Core#568
Merged
keyboardDrummer merged 90 commits intomainfrom Mar 17, 2026
Merged
Conversation
keyboardDrummer
commented
Mar 16, 2026
keyboardDrummer
commented
Mar 16, 2026
keyboardDrummer
commented
Mar 16, 2026
shigoel
reviewed
Mar 16, 2026
shigoel
reviewed
Mar 16, 2026
shigoel
reviewed
Mar 16, 2026
Contributor
|
Look good to me. |
thanhnguyen-aws
previously approved these changes
Mar 17, 2026
aqjune-aws
previously approved these changes
Mar 17, 2026
Contributor
aqjune-aws
left a comment
There was a problem hiding this comment.
Approving since Thanh approved this but his approval won't count as it from strata-reviewers. Probably we will still need one more approval.
b4b1d90
aqjune-aws
approved these changes
Mar 17, 2026
shigoel
approved these changes
Mar 17, 2026
olivier-aws
pushed a commit
that referenced
this pull request
Mar 17, 2026
### Changes 1. Support datatypes in field types. 2. Remove the override `datatype Box` hack that was used in Python. This is possible because of the previous change. 3. In the Python through Laurel pipeline, introduce `PythonRuntimeLaurelPart` and move a small part of `PythonLaurelCorePrelude` to it. In the final generated Core, declarations from `PythonRuntimeLaurelPart.lean` come before `PythonLaurelCorePrelude`. I will rename `PythonLaurelCorePrelude` to `PythonRuntimeCorePart` in a separate PR, to avoid conflicts. 4. Add `--update` option to run_py_analyze.sh to update the test expect files (the bytecode offsets in them change often) #### Caveats - Because DDM parsing also does a form of resolution, `PythonRuntimeCorePart.lean` needs a lot of type definitions that are already provided in `PythonPreludeInLaurel.lean`. Ideally we'd find a way so that's no longer necessary, using something like `#strata_parsing_only`. @joehendrix is that something we could add? I looked into adding that but it wasn't so simple. - Axioms in `PythonLaurelCorePrelude` can't be moved to `PythonRuntimeLaurelPart` because Laurel does not and I think should not support axioms. A long term solution would be that the Python pipeline stops relying on axioms. ### Future work - Turn on reporting of resolution errors in the Python through Laurel pipeline. This requires moving most definitions from PythonLaurelCorePrelude to PythonRuntimeLaurelPart. ### Tested 1. Updated T1_MutableFields test with a new test-case. 2. Refactoring, no tests needed 3. Updated Python expect files for Laurel because the reported locations are different 4. Manually used the --update option for `run_py_analyze.sh` By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Shilpi Goel <shigoel@gmail.com> Co-authored-by: Remy Willems <rwillems@dev-dsk-rwillems-1a-53862c30.eu-west-1.amazon.com> Co-authored-by: thanhnguyen-aws <ntson@amazon.com>
github-merge-queue Bot
pushed a commit
that referenced
this pull request
Mar 18, 2026
…dispatch (#515) Resolves overloaded service dispatch (e.g., `servicelib.connect("storage")`) by walking the Python AST to identify which PySpec modules are needed, replacing manual `--pyspec` flags. Restructures the PySpec-to-Core pipeline into composable steps and adds argument validation with user-friendly error messages. ## Changes - **Overload resolution** (`Specs/IdentifyOverloads.lean`) — New AST walker collects the exact set of PySpec modules used by matching `Call` nodes against an `OverloadTable`. A new `pyResolveOverloads` CLI command exposes this for tooling. `pyAnalyzeLaurel` now auto-discovers needed PySpec files, eliminating manual `--pyspec` flags. - **Single Laurel.translate pass** (`PySpecPipeline.lean`, `StrataMain.lean`) — PySpec and user Laurel declarations are merged into one program before translation to Core, eliminating `TypeTag` collisions from independent translation. - **PreludeInfo abstraction** (`PythonToLaurel.lean`) — Decouples `pythonToLaurel'` from `Core.Program` so prelude info can come from either Core or Laurel programs. External stubs are generated separately, allowing `pythonToLaurel'` to produce a clean user program. - **Module-prefix PySpec identifiers** (`Specs/ToLaurel.lean`) — Type/procedure names are prefixed by PySpec filename (e.g., `Tag` → `S3_Tag`) to prevent collisions. Name collisions across PySpec files are now hard errors. - **Argument validation** (`PythonToLaurel.lean`) — `combinePositionalAndKeywordArgs` rejects unknown kwargs and reports missing required args by original Python method name (e.g., `'put_item' called with missing required arguments: [Key, Data]`). Unknown methods and invalid service names also produce specific errors. - **User class method recognition** (`PythonToLaurel.lean`) — `userFunctions` now includes class methods in `ClassName_methodName` format, so calls to user-defined class methods are recognized by `hasModel`. - **Pipeline extraction** (`PySpecPipeline.lean`) — Implementation moved from `SimpleAPI.lean` into a dedicated module with composable steps (`buildPreludeInfo`, `combinePySpecLaurel`, `translateCombinedLaurel`, `replaceStubsWithPrelude`), keeping SimpleAPI as a thin stable surface. - **Self/kwargs handling** (`PySpecPipeline.lean`, `Specs/ToLaurel.lean`) — PySpec method signatures strip `self` (with validation that it exists) and expand `**kwargs: Unpack[TypedDict]` into individual parameters. Non-TypedDict kwargs report errors via `expandKwargsArgsM`. - **SpecDefault in DDM** (`Specs/DDM.lean`, `Specs/Decls.lean`) — `Arg.hasDefault : Bool` replaced with `Arg.default : Option SpecDefault` - **Field access Any coercions** (`PythonToLaurel.lean`) — After #568 resolved field types to concrete types, heap parameterization produces typed Box destructors (`Box..intVal!` returns `int`) but Python operators expect `Any`. Fixed by wrapping `FieldSelect` expressions in `from_int`/`from_bool`/etc. at the PythonToLaurel layer, keeping HeapParameterization language-agnostic. Replaced unused `classAttributeType` with nested `classFieldHighType` HashMap. `lookupFieldHighType` throws explicit errors for `self.field` on known classes; `tryLookupFieldHighType` falls back gracefully for `obj.field` on unknown types. - **Infrastructure** (`ReadPython.lean`, `Specs.lean`) — Extracted `runPyToStrata`/`readParseStrataFile`, added `PythonToStrataOptions` with perf logging. `List` → `HashMap`/`HashSet` for `preludeProcedures`, `preludeTypes`, `compositeTypeNames`. ## Test coverage Two new test files exercise the dispatch pipeline against a mock service fixture (`Storage`, `Messaging`, `servicelib` dispatch file) with 17 Python test scripts in `StrataTest/Languages/Python/Specs/dispatch_test/`: **`AnalyzeLaurelTest.lean`** — 12 data-driven end-to-end tests running through the full `pyAnalyzeLaurel` pipeline with concurrent execution via `IO.asTask`: - 4 positive: single service (`test_single_service.py`), multi-service (`test_multi_service.py`), type annotation fallback (`test_annotation_fallback.py`), required+optional args (`test_required_with_optional.py`) - 8 negative with exact error message string matching: - `test_invalid_service.py` — `'connect' called with unknown service name "invalid"; known services: [storage, messaging]` - `test_invalid_method.py` — `Unknown method 'nonexistent_method'` - `test_invalid_args.py` — `'put_item' called with unknown keyword arguments: [Wrong]` - `test_missing_required.py` — `'put_item' called with missing required arguments: [Key, Data]` - `test_extra_kwarg.py` — `'get_item' called with unknown keyword arguments: [Bogus]` - `test_no_args.py` — `'put_item' called with missing required arguments: [Bucket, Key, Data]` - `test_optional_missing_required.py` — `'list_items' called with missing required arguments: [Bucket]` - `test_positional_missing.py` — `'delete_item' called with missing required arguments: [Key]` **`IdentifyOverloadsTest.lean`** — 7 unit tests calling `resolveOverloads` directly, asserting it detects modules. - `test_single_service.py` → exactly `{Storage}` - `test_multi_service.py` → exactly `{Storage, Messaging}` - `test_class_dispatch.py` → dispatch inside class methods → `{Storage}` - `test_dispatch_in_conditional.py` → both if/else branches → `{Storage, Messaging}` - `test_dispatch_in_try.py` → dispatch in try block → `{Storage}` - `test_no_dispatch.py` → no dispatch calls → `{}` - `test_dispatch_in_loop.py` → variable argument (non-literal) → `{}` (correctly not resolved) **Existing test updates:** - `SpecsTest.lean` — Updated DDM format (`hasDefault: Bool` → `default: Option SpecDefault`) - `ToLaurelTest.lean` — Test fixtures updated with `self` parameter for method validation - `test_class_field_use.expected` — Updated with passing verification results now that field access type mismatch is fixed By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changes
datatype Boxhack that was used in Python. This is possible because of the previous change.PythonRuntimeLaurelPartand move a small part ofPythonLaurelCorePreludeto it. In the final generated Core, declarations fromPythonRuntimeLaurelPart.leancome beforePythonLaurelCorePrelude. I will renamePythonLaurelCorePreludetoPythonRuntimeCorePartin a separate PR, to avoid conflicts.--updateoption to run_py_analyze.sh to update the test expect files (the bytecode offsets in them change often)Caveats
Because DDM parsing also does a form of resolution,
PythonRuntimeCorePart.leanneeds a lot of type definitions that are already provided inPythonPreludeInLaurel.lean. Ideally we'd find a way so that's no longer necessary, using something like#strata_parsing_only. @joehendrix is that something we could add? I looked into adding that but it wasn't so simple.Axioms in
PythonLaurelCorePreludecan't be moved toPythonRuntimeLaurelPartbecause Laurel does not and I think should not support axioms. A long term solution would be that the Python pipeline stops relying on axioms.Future work
Tested
run_py_analyze.shBy submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.