Fixed commitments amount uint64 overflow on more active servers with flavor than committed#376
Merged
PhilippMatthes merged 1 commit intomainfrom Nov 7, 2025
Merged
Conversation
…tments and SyncReservations
8 tasks
Contributor
Test Coverage ReportCoverage in lib module (lib/): 48.9%Coverage in reservations module (reservations/internal/): 65.0%Coverage in scheduling module (scheduling/internal/): 68.3%Coverage in knowledge module (knowledge/internal/): 63.1% |
SoWieMarkus
added a commit
that referenced
this pull request
Nov 12, 2025
Fixed metrics and dashboards that use them. Will fix the alerts in a separate PR. Would be especially interested in a review of this commit: f7483ce ## TODO - [x] Add missing metrics - [x] Fix Metrics in Dashboards <del>Check Alerts Figure out where `component` label of metrics did come from in the past Add core alerts back to the services and scope them correctly not get alert on other service error</del> - [x] Why is the Gather() method called so many times? - [x] maybe service monitor config (matches all 3 operators and thats why for each prometheus scrape we are calling all the endpoints, thats why it appears 3x as much) - [x] Have a look at reservations metrics - [x] Reserverations Controller restarting without error? (OOM, commitment with 18446744073709551598 instances, see: #376) - [x] What happened to pg_stat_user_tables_size_bytes - [x] Open Connections just drops to zero and never comes back?
umswmayj
added a commit
that referenced
this pull request
May 7, 2026
…OnConflict in API, generation predicate - Add LastFullReconcileAt to ProjectQuotaStatus (separate watermark for incremental add detection, not advanced by PaygUsage-only recomputes) - isVMNewSinceLastReconcile now uses LastFullReconcileAt instead of LastReconcileAt to avoid false skips after CR-triggered recomputes - Add domainID validation in HandleQuota (400 if missing) - Wrap HandleQuota create/update in RetryOnConflict to handle concurrent status updates from the quota controller - Add projectQuotaGenerationChangePredicate (GenerationChangedPredicate) on ProjectQuota For() watch to prevent infinite reconcile loops from status-only updates - Add CreatedAt field to VM struct, populated in buildVMsFromHypervisors Ref: BLI #376
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The reservations operator ran into an uint64 overflow when there are more active servers with the given flavor of a commitment, than requested by the commitment (
commitment.Amount).This was because after iterating the active servers we expected
commitment.Amountto be negative if this case occurs. Sincecommitment.Amountis unsigned, it will never be negative.In this PR i fixed it by counting the matching servers and checking them before subtracting them form
commitment.Amount.(A cleaner solution might be to not modify the results of limes and instead create a commitment DTO)