fix(credit): snapshotting concurrency#3997
Conversation
📝 WalkthroughWalkthroughThis PR adds a Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
openmeter/credit/helper.go (1)
180-185: Add observability when lock acquisition fails.When the optimistic lock fails, the function silently returns
niland skips snapshot persistence. This is fine from a correctness standpoint (matches the PR objective), but there's no logging or metric emission to help operators understand when/how often this happens.Frequent lock contention causing snapshot misses could indicate a performance issue worth investigating. Consider adding a debug/info log here.
♻️ Suggested improvement
if err := transaction.RunWithNoValue(ctx, m.GrantRepo, func(ctx context.Context) error { return m.OwnerConnector.LockOwnerForTx(ctx, snapParams.owner, false) }); err != nil { // If we failed to acquire the lock we simply don't save the snapshot + m.Logger.DebugContext(ctx, "skipping snapshot persistence due to lock contention", "owner", snapParams.owner) return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openmeter/credit/helper.go` around lines 180 - 185, The block calling transaction.RunWithNoValue with m.GrantRepo and m.OwnerConnector.LockOwnerForTx currently swallows errors and returns nil; update this error branch to emit observability (e.g., process or package logger and/or metrics) so operators know when lock acquisition for snapParams.owner fails—log a descriptive message including the owner identifier (snapParams.owner) and the error returned from RunWithNoValue/LockOwnerForTx, and optionally increment a contention/failure metric; keep the current behavior of not persisting the snapshot after logging/metric emission.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openmeter/credit/balance/service_test.go`:
- Around line 97-99: Add a unit test that simulates lock acquisition failure by
making MockOwnerConnector.LockOwnerForTx return a non-nil error when called with
wait=false and then invoke the helper function in openmeter/credit/helper.go
(the code path that calls LockOwnerForTx with snapParams.owner, false); assert
that snapshot persistence is skipped (e.g., no calls to the persistence/mock
store or no snapshot written) and that the function returns gracefully without
propagating the lock error. Locate MockOwnerConnector.LockOwnerForTx and the
helper function in helper.go to implement the mock behavior and the assertions.
---
Nitpick comments:
In `@openmeter/credit/helper.go`:
- Around line 180-185: The block calling transaction.RunWithNoValue with
m.GrantRepo and m.OwnerConnector.LockOwnerForTx currently swallows errors and
returns nil; update this error branch to emit observability (e.g., process or
package logger and/or metrics) so operators know when lock acquisition for
snapParams.owner fails—log a descriptive message including the owner identifier
(snapParams.owner) and the error returned from RunWithNoValue/LockOwnerForTx,
and optionally increment a contention/failure metric; keep the current behavior
of not persisting the snapshot after logging/metric emission.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d8d3f1d1-a337-4293-9b66-726d3edf9a94
📒 Files selected for processing (8)
openmeter/credit/balance.goopenmeter/credit/balance/service_test.goopenmeter/credit/grant.goopenmeter/credit/grant/owner_connector.goopenmeter/credit/helper.goopenmeter/entitlement/adapter/entitlement.goopenmeter/entitlement/metered/grant_owner_adapter.goopenmeter/entitlement/repository.go
| func (m *MockOwnerConnector) LockOwnerForTx(ctx context.Context, id models.NamespacedID, wait bool) error { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for tests that might cover wait=false or NoWait behavior
rg -n "wait.*false|NoWait|LockOwnerForTx.*false" --type go -g '*_test.go'Repository: openmeterio/openmeter
Length of output: 47
🏁 Script executed:
# First, let's find where LockOwnerForTx with wait=false is actually called
rg -n "LockOwnerForTx.*false|wait.*false" --type go -g '*.go' | head -20Repository: openmeterio/openmeter
Length of output: 166
🏁 Script executed:
# Look at the helper.go file mentioned in the comment to see how wait=false is used
fd -n "helper.go" . | grep -E "balance|credit"Repository: openmeterio/openmeter
Length of output: 295
🏁 Script executed:
# Check the service_test.go file to understand what tests exist
wc -l openmeter/credit/balance/service_test.goRepository: openmeterio/openmeter
Length of output: 109
🏁 Script executed:
# Look at helper.go context around line 181 where wait=false is called
sed -n '170,190p' openmeter/credit/helper.goRepository: openmeterio/openmeter
Length of output: 1231
🏁 Script executed:
# Read the entire service_test.go to understand test coverage
cat -n openmeter/credit/balance/service_test.go | head -100Repository: openmeterio/openmeter
Length of output: 3812
🏁 Script executed:
# Also check what functions call the helper function with wait=false
rg -B 5 "LockOwnerForTx.*false" --type go openmeter/credit/helper.goRepository: openmeterio/openmeter
Length of output: 467
🏁 Script executed:
# Check if there are any other tests that might verify lock acquisition failures
rg -A 10 "func Test" openmeter/credit/balance/service_test.go | head -50Repository: openmeterio/openmeter
Length of output: 352
🏁 Script executed:
# Check if there are integration tests that might cover lock failure scenarios
fd service_test.go --type f | xargs wc -lRepository: openmeterio/openmeter
Length of output: 565
🏁 Script executed:
# Look at the rest of service_test.go to understand the test structure
tail -100 openmeter/credit/balance/service_test.goRepository: openmeterio/openmeter
Length of output: 3673
Add tests for lock acquisition failures with wait=false.
The mock's signature update is correct, but there's no test coverage for when lock acquisition fails. Looking at openmeter/credit/helper.go:181, the code gracefully skips snapshot persistence when LockOwnerForTx(ctx, snapParams.owner, false) fails—but no tests verify this behavior. Since the mock always returns nil, we can't exercise this error path.
Consider adding a test that verifies the snapshot is skipped gracefully when the lock can't be acquired with wait=false. This would catch regressions if the lock-contention handling changes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openmeter/credit/balance/service_test.go` around lines 97 - 99, Add a unit
test that simulates lock acquisition failure by making
MockOwnerConnector.LockOwnerForTx return a non-nil error when called with
wait=false and then invoke the helper function in openmeter/credit/helper.go
(the code path that calls LockOwnerForTx with snapParams.owner, false); assert
that snapshot persistence is skipped (e.g., no calls to the persistence/mock
store or no snapshot written) and that the function returns gracefully without
propagating the lock error. Locate MockOwnerConnector.LockOwnerForTx and the
helper function in helper.go to implement the mock behavior and the assertions.
Overview
A race-condition can happen between persisting new cached values and invalidating them, this change optimistically tries to acquire the lock and if not possible then simply drops the cache entry
Notes for reviewer
Summary by CodeRabbit
Release Notes