Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: health check exception with opentelemetry tracing interceptors #5392

Merged
merged 10 commits into from Nov 15, 2022

Conversation

girishc13
Copy link
Contributor

@girishc13 girishc13 commented Nov 14, 2022

The async tracing interceptors expect the RPC stubs to be a coroutine function otherwise an exception is thrown for the RPC. When tracing was enabled, the health checks with docker or k8s containers failed silently and the Flow timed out eventually. The underlying issue was the usage of the default grpc_health.HealthServicer implementation with an async grpc server.

Goals:

  • use aio.HealthServicer and await keyword when calling the health servicer.
  • add an instrumentation test with a docker container to address the fix.

@girishc13 girishc13 self-assigned this Nov 14, 2022
@github-actions github-actions bot added size/S area/core This issue/PR affects the core codebase area/cli This issue/PR affects the command line interface labels Nov 14, 2022
@@ -60,6 +60,8 @@ def telemetry_wrapper(behavior, request_streaming, response_streaming):
return _wrap_rpc_behavior(next_handler, telemetry_wrapper)

def _intercept_aio_server_unary(self, behavior, handler_call_details):
from jina.helper import iscoroutinefunction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed if we decide to use the aio.HealthServicer approach. I was just experimenting if the interceptors can be easily modified to accept any method.

@girishc13
Copy link
Contributor Author

The latest commit (b6ddf2e) uses the hidden aio.HealthServicer that implements the health check stubs as async methods. This is one potential fix.

Although I'm not sure why this issue is masked when running the Flow locally and in integration tests without using docker.

The other potential solution is to adapt the instrumentation/_aio_server.py tracing interceptor to check if the function is a coroutine or a normal method and correctly use the await keyword only with coroutine functions.

@codecov
Copy link

codecov bot commented Nov 14, 2022

Codecov Report

Merging #5392 (1ec49cf) into master (babb04d) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5392      +/-   ##
==========================================
- Coverage   86.94%   86.93%   -0.02%     
==========================================
  Files         101      101              
  Lines        6611     6611              
==========================================
- Hits         5748     5747       -1     
- Misses        863      864       +1     
Flag Coverage Δ
jina 86.93% <100.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
jina/serve/runtimes/asyncio.py 85.32% <ø> (ø)
jina/serve/instrumentation/__init__.py 100.00% <100.00%> (ø)
jina/serve/runtimes/gateway/grpc/gateway.py 97.26% <100.00%> (ø)
jina/serve/runtimes/worker/request_handling.py 96.98% <0.00%> (-0.61%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@github-actions github-actions bot added size/M area/testing This issue/PR affects testing labels Nov 15, 2022
@girishc13 girishc13 changed the title Debug jinahub docker fix: health check exception with opentelemetry tracing interceptors Nov 15, 2022
@@ -182,6 +182,7 @@ def is_ready(ctrl_address: str, **kwargs) -> bool:
:return: True if status is ready else False.
"""

# print('--->address', ctrl_address)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remember to remove

@girishc13 girishc13 marked this pull request as ready for review November 15, 2022 09:53
@JoanFM JoanFM merged commit f738d34 into master Nov 15, 2022
@JoanFM JoanFM deleted the debug-jinahub-docker branch November 15, 2022 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cli This issue/PR affects the command line interface area/core This issue/PR affects the core codebase area/testing This issue/PR affects testing size/M size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants