-
Notifications
You must be signed in to change notification settings - Fork 141
Generate RPC calls in Bridge Client #1123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might also want to generate all of that mess in temporalio/service.py. Off the top of my head, what I would suggest is to generate some file at temporalio/bridge/client_generated.py.
Could do the simple way of just having an init_workflow_service_client_calls(client: ServiceClient) that just does what it does today which is makes a _new_call for each known service name. Alternatively, a cleaner approach may be to just move the WorkflowService (and operator and cloud equivalents) to this generated file and have explicit methods created for each call that do as expected and then alias the service class publicly (or have the public WorkflowService class extend the internal/bridge form).
cretz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, just little things
temporalio/service.py
Outdated
| import temporalio.api.workflowservice.v1 | ||
| import temporalio.bridge.client | ||
| import temporalio.bridge.proto.health.v1 | ||
| import temporalio.bridge.services_generated as svc_gen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| import temporalio.bridge.services_generated as svc_gen | |
| import temporalio.bridge.services_generated |
Pedantic and not technically required, but usually we fully qualify each use when referring to ourselves in these files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated these to be fully qualified. I also moved the generated python to be in bridge/generated so I could add a pydocstyle exclusion for the file. If that doesn't seem reasonable, I can see about spitting out some autogen'd doc strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think that' be nice. Yeah, one issue about moving the base class into the bridge area is that it won't be as clear on https://python.temporal.io (we consider the bridge private code and don't show it in API docs). This will be the first time user-facing code references the bridge module (in this case for a class extension). But I don't think it's that big of a deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not huge, but I don't see any good reason to move the code to bridge rather than another generated file. Some autgenerated doc strings seem fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing blocking, though we may want to consider not putting user-facing components in internal bridge module. @tconley1428 - thoughts?
…ying on previous behavior but still contain the same assertions
…port in service.py
…s to a generated directory to allow pydocstyle exclusion. Add linter ignore rule on tests/worker/test_workflow.py:4840 to avoid a linter rule about method assignment.
a971dcf to
2812a80
Compare
tconley1428
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should consider
| file_descriptors: list[FileDescriptor], | ||
| output_file: str = "temporalio/bridge/src/client_rpc_generated.rs", | ||
| ): | ||
| print("generating bridge rpc calls") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe one print for the whole thing, but I'm not sure we need all the steps. That said, marginal
| futures = "0.3" | ||
| prost = "0.13" | ||
| pyo3 = { version = "0.25", features = [ | ||
| "extension-module", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No good place to put this comment, but I think we should consider adding a test for the actual user issue, just to have the coverage we fixed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a test that executes all the possible rpc calls and ensures that none result in the "Unknown RPC call" exception. The calls are made with empty request objects so exceptions are expected, but the absence of the "Unknown RPC call" exception should indicate that all the rpc calls are wired up from Python -> Bridge -> Core. I confirmed that removing one of the RPC calls from the generated Rust source produces a test failure which was the cause of the user issue.
Also open to other approaches on getting the coverage!
temporalio/service.py
Outdated
| import temporalio.api.workflowservice.v1 | ||
| import temporalio.bridge.client | ||
| import temporalio.bridge.proto.health.v1 | ||
| import temporalio.bridge.services_generated as svc_gen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not huge, but I don't see any good reason to move the code to bridge rather than another generated file. Some autgenerated doc strings seem fine.
…on is wired up through bridge and core
| "httpx>=0.28.1", | ||
| "pytest-pretty>=1.3.0", | ||
| "openai-agents[litellm]>=0.3,<0.4", | ||
| "googleapis-common-protos==1.70.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tconley1428 , want to confirm with you that adding this googleapis-common-protos to this dev dependecy list won't cause any adverse effects. The new test that imports to proto descriptors relies on it, but if it's not reasonable to add, I'll modify the test to take a different approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's any adverse impact. It would only affect folks working on the repo, not library users. If it is needed to improve the test coverage, worth it I'd say.
tests/test_service.py
Outdated
| rpc_call = getattr(target_service, method_name) | ||
| try: | ||
| await rpc_call(request, timeout=timedelta(milliseconds=1)) | ||
| except Exception as err: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we take a quick look that the exceptions which do occur are all after the call is made? If any of them fail with like, parameter checking or something before that, then they wouldn't be covered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The potential failures happen in the _BridgeServiceClient._rpc_call where there are 3 outcomes leading to exceptions that I'm aware of.
- The call to
connectthe rpc client fails. The connect method calls into the bridge source where it potentially raises aRuntimeError - The rpc call gets rejected by bridge before actually being dispatched due to the unknown RPC call error which is raised as a ValueError in the generated service clients on the Rust side.
- The rpc call is proxied through bridge and results in a Bridge RPCError which gets caught and re-raised as a python RPCError by the
_BridgeServiceClient.
With those listed out, I've updated the test to explicitly fail on the ValueError and explicitly pass on the RPCError which is what is returned when the call is proxied successfully but is rejected by the server.
I've left the RuntimeError unhandled as the test should have a client that's able to connect based on the existing test structure.
| gen-protos = "uv run scripts/gen_protos.py" | ||
| gen-protos-docker = "uv run scripts/gen_protos_docker.py" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VegetarianOrc - to confirm, we decided to move the simple uv-run type of steps out of gen_protos_docker any as poe sub-commands to these two commands right?
What was changed
This PR adds code generation of the RPC call wrapper in the bridge client to ensure ease the maintenance of manually updating the methods when core has a proto update. This generation has been added to the end of the
gen_protos_dockerscript.Some changes were included to make the generation possible:
rpc_callandrpc_call_on_traitmacros have been refactored to only include an exported macro namedrpc_callthat uses the fully qualified call syntax that was previously in the body ofrpc_call_on_trait.multiple-pymethodsfeature of PyO3 was enabled to allow theClientRefimpl to be split into two modules.clientmodule was elevated topub(crate)for referencing from theclient_rpc_generatedmodule.Why?
This allows us to easily keep the bridge client up to date with the RPC methods exposed via core.
Checklist
Closes [Bug] Batch operation feature in the Temporal Python SDK doesn't work #927
How was this tested:
The generated client was tested against the existing test suite and used to ensure the repro described in #927 correctly functions against a Temporal dev server.