[FEATURE REQ] More detailed logging on SDK internal flow #39385

Open
2 tasks done
XiaofeiCao opened this issue Mar 25, 2024 · 5 comments
Labels: Azure.Core azure-core

Comments

@XiaofeiCao
Contributor

XiaofeiCao commented Mar 25, 2024

Is your feature request related to a problem? Please describe.
VMWare is migrating to track2 and occasionally sees threads get stuck. When that happens, there is no HttpClient/Netty log indicating whether the request was sent.
Since the request never reached the backend service, the hang is most likely on the SDK side. We are unsure where: it could be in HttpPipeline, the Reactor chain, or HttpClient/Netty.

I'm aware that the current SDK has two kinds of logs: the request/response log in HttpLoggingPolicy and the HttpPipeline tracer. However, both track the final HTTP request, not our internal pipeline flow.

Describe the solution you'd like
We need more detailed logging to track the client-side request lifecycle, e.g. which HttpPipelinePolicy the request has reached, ideally both inbound and outbound.
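To illustrate the kind of per-policy trace being requested, here is a minimal sketch. It does not use azure-core: `Policy`, `traced`, and `send` are hypothetical stand-ins for a pipeline, and the model is synchronous, whereas the real pipeline is Reactor-based. The idea is only that each policy records when the request enters it (outbound) and when the response comes back through it (inbound).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class PipelineFlowLogSketch {
    /** Hypothetical stand-in for a pipeline policy (not the azure-core interface). */
    interface Policy {
        String process(String request, Function<String, String> next);
    }

    /** Wraps a policy so that entry (outbound) and exit (inbound) are both recorded. */
    static Policy traced(String name, Policy delegate, List<String> log) {
        return (request, next) -> {
            log.add("-> " + name);
            try {
                return delegate.process(request, next);
            } finally {
                log.add("<- " + name);
            }
        };
    }

    /** Runs the request through the policies in order, ending at the transport. */
    static String send(List<Policy> policies, Function<String, String> transport, String request) {
        Function<String, String> chain = transport;
        // Build the chain back to front so the first policy ends up outermost.
        for (int i = policies.size() - 1; i >= 0; i--) {
            Policy policy = policies.get(i);
            Function<String, String> next = chain;
            chain = req -> policy.process(req, next);
        }
        return chain.apply(request);
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Policy retry = (req, next) -> next.apply(req);             // passes the request through
        Policy auth = (req, next) -> next.apply(req + " [auth]");  // decorates the request
        List<Policy> policies = List.of(traced("retry", retry, log), traced("auth", auth, log));
        String response = send(policies, req -> {
            log.add("transport: " + req);
            return "200 OK";
        }, "GET /vm");
        log.add("response: " + response);
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

With a trace like this, a request that hangs would show a `->` entry for the last policy it reached and no matching `<-`, which narrows down where it stopped.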

Information Checklist
Kindly make sure that you have added all the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report

  • Description Added
  • Expected solution specified

/cc @alzimmermsft @weidongxu-microsoft

@github-actions github-actions bot added the needs-triage This is a new issue that needs to be triaged to the appropriate team. label Mar 25, 2024
@XiaofeiCao XiaofeiCao added the Azure.Core azure-core label Mar 25, 2024
@github-actions github-actions bot removed the needs-triage This is a new issue that needs to be triaged to the appropriate team. label Mar 25, 2024
@lmolkova
Member

@XiaofeiCao have you looked into distributed tracing?
We create a span per public API call (well, ideally; in Java we create one in Rest-proxy, which is not great) and one per HTTP request. Some details on what we record are here: https://github.com/Azure/azure-sdk/blob/main/docs/tracing/distributed-tracing-conventions.md#public-api-calls

Distributed traces are the industry-wide practice for exposing these details to client applications in a structured, visualization- and analysis-friendly way. More verbose data can come, but it would be expensive to collect and store, and should be off by default. For example, per-policy logs or traces would be overwhelming.

We can definitely add ad-hoc logs into policies to record key events, configs, branches, etc. if we see that a log is missing. We should also enrich our public spans with more details and are actively looking for customer feedback.

Let me know what you think.

@lmolkova
Member

lmolkova commented Mar 25, 2024

Also, one more thing to explore when diagnosing threading and async issues is Reactor metrics (https://projectreactor.io/docs/core/release/reference/#metrics) and the JVM metrics supported by OpenTelemetry or Micrometer. They would be a much more frugal way to find reactor-related, threading, locking, etc. issues.
Reactor metrics documentation needs some love, but maybe VMWare can pull some internal strings to make it better?
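As a rough illustration of the JVM-level signal mentioned above, here is a minimal sketch using only the JDK's `java.lang.management` API. It is not Reactor metrics and not an OpenTelemetry or Micrometer integration; the class and method names are made up for this example. It counts live threads by state and checks for deadlocked threads, which are the kinds of numbers a metrics library would export continuously.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

final class ThreadStateSnapshotSketch {
    /** Counts live threads per state; a growing BLOCKED or WAITING count hints at stuck threads. */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info != null) { // a thread may have exited between the two calls
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns how many threads are currently deadlocked on monitors or ownable synchronizers. */
    static int deadlockedThreadCount() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length; // null means no deadlock was found
    }

    public static void main(String[] args) {
        System.out.println("threads by state: " + countByState());
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

Sampling these values periodically is much cheaper than per-policy logging, though it only shows that threads are stuck, not which request or policy they belong to.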

@XiaofeiCao
Contributor Author

Thanks @lmolkova! Distributed tracing best fits our goal (glad to know that my previous perception of our Tracer was wrong), though there may be more work for VMWare to do to integrate with it (or not?).

We'll definitely consider Reactor metrics as well. Appreciate your help!

@lmolkova
Member

Thanks @XiaofeiCao, please let me know if you need any help or have examples where we can improve things. There is a lot we can and should improve, and I'm sure we can find ways!

@lmolkova lmolkova self-assigned this Mar 26, 2024
@weidongxu-microsoft
Member
