Few minor diagnostics perf improvements #38232

@FabianMeiswinkel FabianMeiswinkel commented Jan 6, 2024

Description

A few minor performance improvements for scenarios with diagnostics enabled, identified while going through CPU profiles.

No monitoring, no Java Agent

About 5 percent throughput improvement with the changes in this PR, primarily from no longer using ImmutableEnumMap (whose iterators have allocation overhead).
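
As a rough illustration of the kind of change described above, the sketch below keeps per-enum values in a plain `java.util.EnumMap` and reads them with direct `get` calls in an indexed loop, so no entry-set iterator is created on the hot path. The `Stage` enum and `StageLabels` class are hypothetical names used only for illustration; this is not the SDK's actual code, and whether a particular map's iterator allocates in a given workload is something to confirm with a profiler.

```java
import java.util.EnumMap;
import java.util.Locale;

// Hypothetical enum standing in for whatever keys the diagnostics code uses.
enum Stage { QUEUED, SENT, RECEIVED }

// Hypothetical lookup table keyed by enum constant.
final class StageLabels {
    // values() clones its backing array on every call, so cache it once.
    private static final Stage[] STAGES = Stage.values();

    // Populated once in the static initializer and never mutated afterwards.
    private static final EnumMap<Stage, String> LABELS = new EnumMap<>(Stage.class);

    static {
        for (Stage stage : STAGES) {
            LABELS.put(stage, stage.name().toLowerCase(Locale.ROOT));
        }
    }

    private StageLabels() {
    }

    // Direct lookup; EnumMap indexes by ordinal internally.
    static String labelOf(Stage stage) {
        return LABELS.get(stage);
    }

    // Joins all labels with an indexed loop and direct lookups,
    // so no entrySet() iterator object is created per call.
    static String joinAll() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < STAGES.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(LABELS.get(STAGES[i]));
        }
        return sb.toString();
    }
}
```

The trade-off in this sketch is that immutability is enforced by convention (a private map that is never written after initialization) rather than by the collection type.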

image
https://fabianm-perf-results.westeurope.kusto.windows.net/fabianm-java-perf-results?query=H4sIAAAAAAAEAG1RS2vCQBC%2bC%2f6HYU8JGI22QgmkoKUHS7USpdcySSbJFHdTdtdX6Y%2fvmkgV6p5mZ%2fZ7zLc5WkzRkCcKTBmVDD5xh8EX6SLQZLYba4TfT9qq2%2fmBfUWaYD2bP6%2fWk%2fkSHgHL2hvn%2fmX45tBouVYQxyASwvwVLansKG4T5G5qWZI3Ckf3QTgMwjEMH6K7MArDftieK%2fqpRpVVC5QExqK2Zs%2b2AlFrLllFW0PaDM7LXAuSse%2bouTHWgFmB17j7WNTzWrF1DKp0lxeXwKQkZUWjSgdnPofVtij4EBurM7Qec%2bFdjMTxTfmBRFbZplYkeiCmLuYNt%2fW6YhMsE%2bG7OnKNp1pKtrP8WrAhPsu14ufH%2f1ZpUGYrpWt%2bE%2bCudIAsI2MSF60P6RFSVt5f5j0YSqd8gp6Q2qmRhtMXZJULtNvpdn4BfO%2fL1hgCAAA%3d&web=0

database("fabianm-java-perf-results").Results
| where TIMESTAMP > ago(5d)
| where Operation == "ReadLatency"
| where TIMESTAMP > datetime(2024-01-05 18:30:00.0000000)
| where BranchName startswith "origin:users/fabianm"
| where TestVariationName in ("Read_NoMonitoring_NoJavaAgent")
| extend Suffix=strcat(iif(BranchName=="origin:users/fabianm/mainclone", "Baseline", "This-PR"), ":", CommitId)
| extend Name=strcat(Suffix, ":", TestVariationName)
| summarize avg(SuccessRate) by bin(TIMESTAMP, 1m), Name
| render timechart

Metrics only (tracing disabled) via Java Agent (but with Reactor instrumentation disabled)

The other micro-optimizations (avoiding String.format etc.) also increased throughput slightly. That improvement was relatively small, but consistently better than the main branch. With the change to avoid ImmutableEnumMap, the improvement increased across the board.
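
To make the `String.format` point concrete, here is a minimal sketch comparing a formatted string with an equivalent `StringBuilder` chain. The class name, the method names, and the `prefix.operation.statusCode` layout are assumptions for illustration and are not taken from the SDK.

```java
import java.util.Locale;

// Hypothetical helper; the name layout "prefix.operation.statusCode" is illustrative only.
final class MetricNames {
    private MetricNames() {
    }

    // Baseline: String.format parses the format string and boxes the int on every call.
    static String withFormat(String prefix, String operation, int statusCode) {
        return String.format(Locale.ROOT, "%s.%s.%d", prefix, operation, statusCode);
    }

    // Alternative: plain appends, with no format parsing and no varargs array.
    static String withBuilder(String prefix, String operation, int statusCode) {
        return new StringBuilder(prefix.length() + operation.length() + 8)
            .append(prefix)
            .append('.')
            .append(operation)
            .append('.')
            .append(statusCode)
            .toString();
    }
}
```

Both methods produce the same string; the second simply skips the formatter machinery, which is the sort of small saving that only shows up when the call sits on a per-request path.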

image
https://fabianm-perf-results.westeurope.kusto.windows.net/fabianm-java-perf-results?query=H4sIAAAAAAAEAG1RS2vCQBC%2bC%2f6HYU8RjEZboQgpaOnBUh%2botMcybibJlOym7G58lP74rlGqUPc0OzPfY2YSdLhBS4FIccOoVfiJWwy%2fyKShIVsVzopWZ3mKmo0f2OVkCNaT6fNqPZou4BEwK4NB0roU5x6NjksNcQxiSZi8oiMtD%2bI2QeKrjhUF%2fah%2fH0a9MBpA72F4Fw2jqBOd3hX92KCW%2bQwVgXVonN2xy0GUhjPWw8qSsd3zMNeCZN0bGq6N1WDWENTuPt49wZScYWnnujjU%2fxe%2fhlFG2o0rNyt9m3SlEbUN2vtpElhVacr72Doj0QXMaXBxFsc3%2fXQVspZFqUm0QYz93gs%2bxeucbbhYipaPhz7xVCrFbpJcC9bEZ7mT%2bLn532w1ylZK%2beQ3AW4zD5CSrF36Xbdgc4AN6%2bDvCG3oKa98hB6RxquRgeNNZO433Gw0G79Bcg%2fVKQIAAA%3d%3d&web=0

database("fabianm-java-perf-results").Results
| where TIMESTAMP > ago(5d)
| where Operation == "ReadLatency"
| where TIMESTAMP > datetime(2024-01-05 18:30:00.0000000)
| where BranchName startswith "origin:users/fabianm"
| where TestVariationName in ("Read_WithMetricsOnly_WithJavaAgentButNoReactor")
| extend Suffix=strcat(iif(BranchName=="origin:users/fabianm/mainclone", "Baseline", "This-PR"), ":", CommitId)
| extend Name=strcat(Suffix, ":", TestVariationName)
| summarize avg(SuccessRate) by bin(TIMESTAMP, 1m), Name
| render timechart

Metrics only (tracing disabled), but no Java Agent or other MeterRegistry subscriber, so the global Micrometer MeterRegistry is actually empty

Before this PR, even when no actual meter registry existed, the SDK code to populate metrics was still executed. The actual Micrometer APIs to register values are no-ops in this case, but the overhead was still visible.
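
The idea of checking for an actual registry before doing any metrics work can be sketched with standard-library types only. `CompositeSink` and `MetricsRecorder` below are hypothetical stand-ins, not Micrometer or SDK classes. In Micrometer terms, the analogous check is presumably whether the composite registry's `getRegistries()` set is empty, but that mapping is an assumption here.

```java
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;
import java.util.function.Consumer;

// Hypothetical stand-in for a composite registry that fans out to subscribers.
final class CompositeSink {
    private final Set<Consumer<String>> subscribers = new CopyOnWriteArraySet<>();

    void add(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    boolean isEmpty() {
        return subscribers.isEmpty();
    }

    void record(String sample) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(sample);
        }
    }
}

// Hypothetical recorder that skips all work when nobody is listening.
final class MetricsRecorder {
    private final CompositeSink sink;

    // Counts how often the expensive path actually ran (for demonstration only).
    int samplesBuilt = 0;

    MetricsRecorder(CompositeSink sink) {
        this.sink = sink;
    }

    void recordOperation(String operation, long latencyMs) {
        // Cheap check first: with no subscriber, recording would be a no-op anyway,
        // so avoid building the sample at all.
        if (sink.isEmpty()) {
            return;
        }
        samplesBuilt++;
        sink.record(operation + ":" + latencyMs);
    }
}
```

The check runs on every call rather than once at construction, so a subscriber that is attached later still starts receiving samples.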

image
https://fabianm-perf-results.westeurope.kusto.windows.net/fabianm-java-perf-results?query=H4sIAAAAAAAEAG1RS2vCQBC%2bC%2f6HYU8JGI22QgmkoKUHS30QpT3KJJkkU5JN2V1fpT%2b%2bayJVaPc0zMz3mG9TNBijJkdkGDPKyvvAPXqfpDJPkd6VRgu3H7VVt%2fMNh4IUwWY2f15vJvMVPALmtTNO3etwadFouJYQhiAiwvQVDcnkJP4nSO3UcEXOyB%2fde%2f7Q88cwfAju%2fMD3%2b377buinCmVSLLAi0AaV0Qc2BYhacc4y2GlSenA55laQtHlDxY2xBswSnMbd9t0SzMkoTvRSlqfton6xIUxykkY0wnS0%2flNY77KMj6E2KkHjMGfO1UsY%2futgUCHLpKwliR6IqU265LbeFKy9VSRcWwe28VRXFZtZeivYEF%2fkWvHL8p9rGpTeVZVtfhHgPreAJCGtI5uuC%2fEJYpbOb%2bw9GBZW%2bQw9I5VVIwXnX0gKm2m30%2b38AMdQe3UbAgAA&web=0

database("fabianm-java-perf-results").Results
| where TIMESTAMP > ago(5d)
| where Operation == "ReadLatency"
| where TIMESTAMP > datetime(2024-01-05 18:30:00.0000000)
| where BranchName startswith "origin:users/fabianm"
| where TestVariationName in ("Read_WithMetricsOnly_NoJavaAgent")
| extend Suffix=strcat(iif(BranchName=="origin:users/fabianm/mainclone", "Baseline", "This-PR"), ":", CommitId)
| extend Name=strcat(Suffix, ":", TestVariationName)
| summarize avg(SuccessRate) by bin(TIMESTAMP, 1h), Name
| render timechart

Tracing only (metrics disabled) with Java agent - but Reactor instrumentation disabled

About 3 percent improvement with this PR. While looking at CPU profiles, I also identified an issue in the AppInsights Java Agent. It looks like a fix will be available with the next version of the Java Agent.

CPU Profile showing potential improvement in the Java Agent
image

Throughput improvement with this PR
image

https://fabianm-perf-results.westeurope.kusto.windows.net/fabianm-java-perf-results?query=H4sIAAAAAAAEAG1R3UsCQRB%2fF%2fwfhn06wcvTCkK4oKIHI01M6lHGvbm7CXcvdvf8iP74xlNKyH36MTu%2fj5nJMOASPUUqxyWjNfEHrjH%2bJJfHjny9Cl51LmYH1G59w6YkRzAfjR9f53fjKdwCFlV0nXX%2bPl%2bEjYErC2kKakaYPWMgq3fqvEAmv4ENRYNkcBUn%2fTi5hv7N8DIZJslFcngn8vcOrS4naAh8QBf8hkMJqnJcsB3WnpzvHYc5NSQf3tBxE6whs4WoSbd4F4G5Q822aPCTrOCuIBsWk0oadKjcyPrgaiO1RkA1eWgrY2XwWuc5b1Np0Bgi5jz6i5imZ4P1DLLVq8qS6oK6lwOs%2bIDnJft4OlMdwUMpPFTGcBhlp4aN8NHuYH5s%2fjdkw%2fK1MVL8IsB1IQStyfuZLL0Dyx0s2Ua%2f1%2bhC34jznrpnOnEjB%2fvj6FJW3W61Wz8mEzDrMgIAAA%3d%3d&web=0

database("fabianm-java-perf-results").Results
| where TIMESTAMP > ago(5d)
| where Operation == "ReadLatency"
| where TIMESTAMP > datetime(2024-01-05 18:30:00.0000000)
| where BranchName startswith "origin:users/fabianm"
| where TestVariationName in ("Read_WithTracing_WithJavaAgent_NoReactorInstrumentation")
| extend Suffix=strcat(iif(BranchName=="origin:users/fabianm/mainclone", "Baseline", "This-PR"), ":", CommitId)
| extend Name=strcat(Suffix, ":", TestVariationName)
| summarize avg(SuccessRate) by bin(TIMESTAMP, 1m), Name
| render timechart

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@azure-sdk
Collaborator

API change check

API changes are not detected in this pull request.

@FabianMeiswinkel
Member Author

/azp run java - cosmos - tests

Azure Pipelines successfully started running 1 pipeline(s).

@FabianMeiswinkel
Member Author

/azp run java - cosmos - tests

Azure Pipelines successfully started running 1 pipeline(s).

Member

@kushagraThapar kushagraThapar left a comment

LGTM, thanks @FabianMeiswinkel

Member

@xinlian12 xinlian12 left a comment

LGTM, thanks

@FabianMeiswinkel FabianMeiswinkel merged commit 89fca6c into Azure:main Jan 14, 2024
65 checks passed
FabianMeiswinkel added a commit to FabianMeiswinkel/azure-sdk-for-java that referenced this pull request Feb 3, 2024
* Check for actual meterRegistry

* Update ClientTelemetryMetrics.java

* Update ClientTelemetryMetrics.java

* Update CosmosDiagnosticsTest.java

* Removing String.format in metrics

* Update ClientTelemetryMetrics.java

* Update Uri.java

* Update RntbdClientChannelHealthChecker.java

* Update RntbdToken.java

* Update FaultInjectionRequestContext.java

* Update RntbdTokenStream.java

* Update RntbdTokenStream.java

* Update RntbdTokenStream.java

* EnumMap changes

* Avoiding ImmutableEnumMaps

* Fixing unit tests

* Update ClientTelemetryMetrics.java

* Update FaultInjectionRequestContext.java
FabianMeiswinkel added a commit that referenced this pull request Feb 4, 2024
* Few minor diagnostics perf improvements (#38232)

* Check for actual meterRegistry

* Update ClientTelemetryMetrics.java

* Update ClientTelemetryMetrics.java

* Update CosmosDiagnosticsTest.java

* Removing String.format in metrics

* Update ClientTelemetryMetrics.java

* Update Uri.java

* Update RntbdClientChannelHealthChecker.java

* Update RntbdToken.java

* Update FaultInjectionRequestContext.java

* Update RntbdTokenStream.java

* Update RntbdTokenStream.java

* Update RntbdTokenStream.java

* EnumMap changes

* Avoiding ImmutableEnumMaps

* Fixing unit tests

* Update ClientTelemetryMetrics.java

* Update FaultInjectionRequestContext.java

* Update CHANGELOG.md

* Release preparations

* Update CHANGELOG.md

* Allowing to opt-out of E2E timeout for non-point operations via system property or environment variable (#38388)

* Allowing to opt-out of E2E timeout for non-point operations via system property or environment variable

* Tests

* Reacting to code review feedback

* remove winutils download steps (#38030)

Co-authored-by: annie-mac <xinlian@microsoft.com>

* addChannelAcquisitionContextInCosmosDiagnosticsBasedOnLatency (#38416)

* add channelAcquisitionContext in cosmosDiagnostics when reaching threshold

---------

Co-authored-by: annie-mac <xinlian@microsoft.com>

---------

Co-authored-by: Annie Liang <64233642+xinlian12@users.noreply.github.com>
Co-authored-by: annie-mac <xinlian@microsoft.com>
FabianMeiswinkel added a commit that referenced this pull request Feb 4, 2024
…se (#38626)

* Few minor diagnostics perf improvements (#38232)

* Check for actual meterRegistry

* Update ClientTelemetryMetrics.java

* Update ClientTelemetryMetrics.java

* Update CosmosDiagnosticsTest.java

* Removing String.format in metrics

* Update ClientTelemetryMetrics.java

* Update Uri.java

* Update RntbdClientChannelHealthChecker.java

* Update RntbdToken.java

* Update FaultInjectionRequestContext.java

* Update RntbdTokenStream.java

* Update RntbdTokenStream.java

* Update RntbdTokenStream.java

* EnumMap changes

* Avoiding ImmutableEnumMaps

* Fixing unit tests

* Update ClientTelemetryMetrics.java

* Update FaultInjectionRequestContext.java

* Update CHANGELOG.md

* Release preparations

* Update CHANGELOG.md

* Allowing to opt-out of E2E timeout for non-point operations via system property or environment variable (#38388)

* Allowing to opt-out of E2E timeout for non-point operations via system property or environment variable

* Tests

* Reacting to code review feedback

* remove winutils download steps (#38030)

Co-authored-by: annie-mac <xinlian@microsoft.com>

* addChannelAcquisitionContextInCosmosDiagnosticsBasedOnLatency (#38416)

* add channelAcquisitionContext in cosmosDiagnostics when reaching threshold

---------

Co-authored-by: annie-mac <xinlian@microsoft.com>

* Fixing release break for hotfix

* Update utils.py

* Update set_versions.py

* Update set_versions.py

* Update set_versions.py

* Update set_versions.py

---------

Co-authored-by: Annie Liang <64233642+xinlian12@users.noreply.github.com>
Co-authored-by: annie-mac <xinlian@microsoft.com>