Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Worker hangs after polling error #459

Closed
ndtretyak opened this issue Jan 10, 2024 · 5 comments
Closed

[Bug] Worker hangs after polling error #459

ndtretyak opened this issue Jan 10, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@ndtretyak
Copy link
Contributor

Describe the bug

Worker hangs after an error during polling.

Minimal Reproduction

I just start a worker with an incorrect token, so that my server responds with Request unauthorized..

import asyncio

from temporalio import activity, client, workflow, worker


@activity.defn
async def a() -> None:
    pass


@workflow.defn
class Workflow:
    @workflow.run
    async def run(self) -> None:
        pass


async def main():
    c = await client.Client.connect(
        "my_temporal_host:7233",
        rpc_metadata={"authorization": "wrong_token"},
        tls=True,
    )
    w = worker.Worker(
        c,
        task_queue="default",
        activities=[a],
        workflows=[Workflow],
    )
    await w.run()


if __name__ == "__main__":
    asyncio.run(main())

Output:

2024-01-10T14:27:15.726021Z  WARN temporal_sdk_core::worker::activities::activity_task_poller_stream: Error while polling for activity tasks error=Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None }
2024-01-10T14:27:15.726020Z  WARN temporal_sdk_core::worker::workflow::wft_poller: Error while polling for workflow tasks error=Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None }
2024-01-10T14:27:15.726316Z  WARN temporal_sdk_core::worker::workflow::wft_poller: Error while polling for workflow tasks error=Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None }
2024-01-10T14:27:15.726543Z  WARN temporal_sdk_core::worker::workflow::wft_poller: Error while polling for workflow tasks error=Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None }
2024-01-10T14:27:15.726584Z  WARN temporal_sdk_core::worker::workflow::wft_poller: Error while polling for workflow tasks error=Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None }
2024-01-10T14:27:15.727412Z  WARN temporal_sdk_core::worker::workflow::wft_poller: Error while polling for workflow tasks error=Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None }
2024-01-10T14:27:15.727472Z  WARN temporal_sdk_core::worker::workflow::workflow_stream: WFT poller errored, shutting down
2024-01-10T14:27:15.727490Z ERROR temporal_sdk_core::worker::workflow::workflow_stream: Workflow processing encountered fatal error and must shut down TonicError(Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None })
2024-01-10T14:27:15.728052Z  WARN temporal_sdk_core::worker::workflow::workflow_stream: WFT poller errored, shutting down
2024-01-10T14:27:15.728070Z ERROR temporal_sdk_core::worker::workflow::workflow_stream: Workflow processing encountered fatal error and must shut down TonicError(Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None })
2024-01-10T14:27:15.728168Z  WARN temporal_sdk_core::worker::workflow::workflow_stream: WFT poller errored, shutting down
2024-01-10T14:27:15.728475Z ERROR temporal_sdk_core::worker::workflow::workflow_stream: Workflow processing encountered fatal error and must shut down TonicError(Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None })
2024-01-10T14:27:15.728506Z  WARN temporal_sdk_core::worker::workflow::workflow_stream: WFT poller errored, shutting down
2024-01-10T14:27:15.728512Z ERROR temporal_sdk_core::worker::workflow::workflow_stream: Workflow processing encountered fatal error and must shut down TonicError(Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None })
2024-01-10T14:27:15.728945Z  WARN temporal_sdk_core::worker::workflow::workflow_stream: WFT poller errored, shutting down
2024-01-10T14:27:15.728958Z ERROR temporal_sdk_core::worker::workflow::workflow_stream: Workflow processing encountered fatal error and must shut down TonicError(Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None })
Worker failed, shutting down
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/temporalio/worker/_activity.py", line 160, in run
    task = await poll_task
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/temporalio/bridge/worker.py", line 98, in poll_activity_task
    await self._ref.poll_activity_task()
RuntimeError: Poll failure: Unhandled grpc error when activity polling: Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None }

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/temporalio/worker/_activity.py", line 180, in run
    raise RuntimeError("Activity worker failed") from err
RuntimeError: Activity worker failed
2024-01-10T14:27:15.729754Z  WARN temporal_sdk_core::worker::activities::activity_task_poller_stream: Error while polling for activity tasks error=Status { code: PermissionDenied, message: "Request unauthorized.", details: b"\x08\x07\x12\x15Request unauthorized.\x1aJ\nHtype.googleapis.com/temporal.api.errordetails.v1.PermissionDeniedFailure", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None }

Then, the process just hangs forever.

Environment/Versions

  • SDK Version: 1.5.0
@ndtretyak ndtretyak added the bug Something isn't working label Jan 10, 2024
@cretz
Copy link
Member

cretz commented Jan 10, 2024

Thanks for the report! The worker shutdown should succeed and the exception be thrown out of run() if those are the actual workflow and activities you're running. But if it's a more advanced activity that is not respecting cancellation, it's possible worker shutdown is not completing because it is cancelling activities but they are not respecting cancellation in some way. I will test with your above scenario and confirm.

@ndtretyak
Copy link
Contributor Author

I don't actually run the activities. The authorization token is incorrect, so the worker cannot retrieve any tasks from the server.

@cretz
Copy link
Member

cretz commented Jan 10, 2024

I have replicated this with your code. I think this is actually in our Rust Core layer, so I have opened temporalio/sdk-core#667.

@milani
Copy link

milani commented Jul 29, 2024

@cretz the ticket in rust sdk is resolved. does it mean this ticket is resolved and can be closed?

@cretz
Copy link
Member

cretz commented Aug 6, 2024

Yes, closing. This has not yet been made available in a released Python SDK version, but we hope to do another release soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants