Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

carbonserver/quota: throughput racy counter fixes and refactoring #446

Merged
merged 2 commits into from
Feb 11, 2022

Conversation

bom-d-van
Copy link
Member

@bom-d-van bom-d-van commented Feb 7, 2022

This PR includes two fixes. Most of the observable fix is in the first commit. The second commit is more subtle and more about refactoring.

  • carbonserver: fix broken/untimely quota usage report/reset logics

    The usageRefreshTimeout implementation in updateFileList is quite broken as it produces
    report metric gaps from time to time.

    And the listener.refreshQuotaAndUsage call after updateFileList in fileListUpdater loop
    also causes incorrect quota usage report.

  • carbonserver/quota: fix racy throughput usage collection and throttling, with some refactorings

    The original throughput usage collection and reset process are done though creating a new
    throughput counter, which might be racy as report and reset are done in parallel. This commit
    fixes it by resetting the same usage counter rather than creating new variables every time
    during usage report.

    At the same time, throughput reset are done in intervals (specified by carbonserver.quota-usage-report-frequency).
    This means that there might be latency between resets and causing unnecessary throttling
    when quota is saturated. This commit fixes it by checking against quota during the whole
    range rather than just use fixed quota (similar to sliding window).

    Include some tests for checking the changes above.

    The rest of the changes are mainly refactoring, like renames and more comments.

The usageRefreshTimeout implementation in updateFileList is quite broken as it produces
report metric gaps from time to time.

And the listener.refreshQuotaAndUsage call after updateFileList in fileListUpdater loop
also causes incorrect quota usage report.
@bom-d-van
Copy link
Member Author

The end result: the root throughput usage metrics is matching tcp.metricsReceived more accurately and no more jumps.

Screenshot 2022-02-07 at 10 48 22

…ng, with some refactorings

The original throughput usage collection and reset process are done though creating a new
throughput counter, which might be racy as report and reset are done in parallel. This commit
fixes it by resetting the same usage counter rather than creating new variables every time
during usage report.

At the same time, throughput reset are done in intervals (specified by carbonserver.quota-usage-report-frequency).
This means that there might be latency between resets and causing unnecessary throttling
when quota is saturated. This commit fixes it by checking against quota during the whole
range rather than just use fixed quota (similar to sliding window).

Include some tests for checking the changes above.

The rest of the changes are mainly refactoring, like renames and more comments.
@bom-d-van bom-d-van force-pushed the quota/throughput-racy-counter-fix branch from 5d32d94 to 5f4717b Compare February 7, 2022 10:19
@bom-d-van bom-d-van merged commit 8208e30 into master Feb 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants