feat/finishing typeddict inputs by blast-hardcheese · Pull Request #95 · replit/river-python

blast-hardcheese · 2024-09-28T04:24:37Z

Why

We got pretty close to having TypedDicts for river-python inputs before, but had to roll back due to a protocol mismatch.

Trying again, and also adding some tests to confirm that, at the very least, the Pydantic models can decode what was encoded by the TypedDict encoders. It's not a perfect science, but it should be good enough to start building more confidence as we make additional progress.
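
As a rough illustration of the parity idea, here is a minimal sketch, not the actual check in scripts/parity/check_parity.py. Every name in it (`GreetInput`, `encode_greet_input`, `GreetInputModel`, `check_parity`) is hypothetical, and a plain dataclass stands in for the generated Pydantic model so the sketch has no dependencies. The shape is what matters: encode with the TypedDict encoder, pass the payload through JSON, and confirm the model side can decode it.

```python
import json
from dataclasses import dataclass
from typing import TypedDict


class GreetInput(TypedDict):
    # Hypothetical generated TypedDict input type.
    name: str
    retries: int


def encode_greet_input(value: GreetInput) -> dict[str, object]:
    # Hypothetical generated encoder: TypedDict -> wire payload.
    return {"name": value["name"], "retries": value["retries"]}


@dataclass(frozen=True)
class GreetInputModel:
    # Stand-in for the generated Pydantic model (assumption: the real
    # model would validate field types in a similar way when decoding).
    name: str
    retries: int

    @classmethod
    def decode(cls, payload: dict[str, object]) -> "GreetInputModel":
        name = payload["name"]
        retries = payload["retries"]
        if not isinstance(name, str) or not isinstance(retries, int):
            raise TypeError(f"payload does not match model: {payload!r}")
        return cls(name=name, retries=retries)


def check_parity(value: GreetInput) -> GreetInputModel:
    # Round-trip through JSON so only wire-representable data is compared.
    wire = json.loads(json.dumps(encode_greet_input(value)))
    return GreetInputModel.decode(wire)
```

A check like this only shows that one side can read what the other wrote; it does not prove the two definitions are structurally identical.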

The reason for "janky" tests

There's a bit of a chicken-and-egg situation when trying to test code generation at runtime.

We have three options:

Write pytest handlers where each invocation runs the codegen with a temp target (like the shell script does here), writes a static file for each test into that directory, then executes a new Python process in that directory. The challenge with this is that it would suck to write or maintain.
Write pytest handlers that run the codegen with unique module name targets (like gen1, gen2, gen3, one for each codegen run necessary) and carefully juggle the imports to make sure we don't try to import something that's not there yet. This might be the best option, but I'm not convinced about the ergonomics at the moment. It might be OK though, with highly targeted .gitignore files.
Maintain a bespoke test runner, optimize for writing and maintaining these tests, and just acknowledge that we are doing something obscure and difficult.
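
To make the second option more concrete, here is a minimal sketch under stated assumptions. `fake_codegen` is a hypothetical stand-in for the real codegen, and `load_generated` and `run_codegen_case` are illustrative helper names, not part of river-python. Each run writes its output under a unique module name (gen1, gen2, ...) and imports it only after the file exists, which is the import juggling described above.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path
from types import ModuleType


def fake_codegen(target_dir: Path, module_name: str, constant: int) -> Path:
    """Hypothetical stand-in for the real codegen: writes one module file."""
    path = target_dir / f"{module_name}.py"
    path.write_text(f"GENERATED_VALUE = {constant}\n")
    return path


def load_generated(path: Path, module_name: str) -> ModuleType:
    """Import a freshly generated file under a unique module name."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {module_name} from {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


def run_codegen_case(index: int, constant: int) -> int:
    """Generate, import, and inspect one module, then clean up."""
    module_name = f"gen{index}"
    with tempfile.TemporaryDirectory() as tmp:
        path = fake_codegen(Path(tmp), module_name, constant)
        try:
            # The import happens only after the codegen has written the file.
            module = load_generated(path, module_name)
            return module.GENERATED_VALUE
        finally:
            # Drop the module so later runs cannot see stale generated code.
            sys.modules.pop(module_name, None)
```

In a pytest suite, a helper like `run_codegen_case` could sit behind a fixture. Writing into a temporary directory instead of the source tree would also sidestep the .gitignore concern, at the cost of generated code that is harder to inspect after a failure.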

I definitely wrote the tests here in a way that would give some coverage and also provide confidence, while intentionally deferring the above decision so we can keep making progress in the meantime.

What changed

Added some janky tests for comparing the encoding of both models
Fixed many bugs in the TypedDict codegen and encoders

Test plan

$ bash scripts/parity.sh
Using /tmp/river-codegen-parity.bAZ
Starting...
Verified

blast-hardcheese · 2024-09-30T04:52:01Z

@ryantm I edited the description to explain the odd testing story here

This has given enough confidence to move forward with the TypedDict inputs, though we still need a good story for structurally parsing and validating.

Yet more confidence that we are testing what we think we are

blast-hardcheese requested a review from a team as a code owner September 28, 2024 04:24

blast-hardcheese requested review from ryantm and removed request for a team September 28, 2024 04:24

blast-hardcheese force-pushed the dstewart/feat/finishing-typeddict-inputs branch 3 times, most recently from 58885f2 to bfb710d Compare September 28, 2024 05:15

ryantm approved these changes Sep 30, 2024

Comment thread replit_river/codegen/client.py

Comment thread scripts/parity/check_parity.py

blast-hardcheese added 7 commits September 30, 2024 11:51

Avoid default base_model

37115ca

elif chaining so we can start reducing early termination

3ad9d8e

Adding a janky parity test between Pydantic models and TypedDict models

59d4265

This has given enough confidence to move forward with the TypedDict inputs, though we still need a good story for structurally parsing and validating.

Assorted patches to make parity checks pass

f12bae9

Ignoring parity test errors

d94f87d

Getting mypy working against check_parity

965d7ce

Yet more confidence that we are testing what we think we are

Exit on error, cleanup after execution

9fefb77

blast-hardcheese force-pushed the dstewart/feat/finishing-typeddict-inputs branch from c319d0b to 9fefb77 Compare September 30, 2024 18:52

blast-hardcheese merged commit ca9d552 into main Sep 30, 2024

blast-hardcheese deleted the dstewart/feat/finishing-typeddict-inputs branch September 30, 2024 18:54

blast-hardcheese added the bug Something isn't working label Sep 30, 2024

blast-hardcheese merged 7 commits into main from dstewart/feat/finishing-typeddict-inputs