fix(credit): snapshotting concurrency by GAlexIHU · Pull Request #3997 · openmeterio/openmeter

GAlexIHU · 2026-03-23T17:43:38Z

Overview

A race condition can occur between persisting new cached values and invalidating them. This change optimistically tries to acquire the lock without waiting; if the lock cannot be acquired, the cache entry is simply dropped.

Notes for reviewer

Summary by CodeRabbit

Release Notes

Refactor
- Transaction locking calls across credit and entitlement operations now take a flag that controls whether lock acquisition waits or fails immediately.
- Updated test utilities to match the new lock method signature.

coderabbitai · 2026-03-23T17:44:00Z

Walkthrough

This PR adds a wait bool parameter to the LockOwnerForTx and LockEntitlementForTx locking methods across the codebase, controlling whether lock acquisitions should block or fail immediately. Call sites are updated to pass true (wait for lock) in most contexts, with false (NoWait) used specifically in snapshot persistence logic.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Credit Owner Lock Interface & Call Sites**: `openmeter/credit/grant/owner_connector.go`, `openmeter/credit/balance.go`, `openmeter/credit/grant.go`, `openmeter/credit/balance/service_test.go` | Added `wait bool` parameter to `OwnerConnector.LockOwnerForTx` interface method; updated all call sites to pass `true`; mock test double updated with new parameter. |
| **Entitlement Lock Interface & Implementation**: `openmeter/entitlement/repository.go`, `openmeter/entitlement/adapter/entitlement.go`, `openmeter/entitlement/metered/grant_owner_adapter.go` | Added `wait bool` parameter to `EntitlementRepo.LockEntitlementForTx` method; forwarded through adapter layers; implementation now conditionally applies `sql.WithLockAction(sql.NoWait)` when `wait == false`. |
| **Snapshot Persistence Logic**: `openmeter/credit/helper.go` | Wrapped owner lock acquisition in transaction scope using `LockOwnerForTx(ctx, owner, false)`, returning early on lock failure instead of continuing with snapshot creation logic. |
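At the SQL level, the `wait == false` branch corresponds to a row-locking clause with `NOWAIT`. The sketch below assumes PostgreSQL, where `SELECT ... FOR UPDATE NOWAIT` reports an error immediately if the row cannot be locked. It builds the query as a plain string purely for illustration; per the summary above, the real implementation goes through the query builder's `sql.WithLockAction(sql.NoWait)` option. The helper names, table, and column here are hypothetical.

```go
package main

import "fmt"

// lockClause returns the row-locking suffix for a SELECT statement.
// With wait=true the statement blocks until the row lock is free;
// with wait=false NOWAIT makes it fail immediately instead.
func lockClause(wait bool) string {
	if wait {
		return "FOR UPDATE"
	}
	return "FOR UPDATE NOWAIT"
}

// buildLockQuery is a hypothetical helper; the table and column names are
// made up for this example and do not reflect the real schema.
func buildLockQuery(wait bool) string {
	return "SELECT id FROM entitlements WHERE id = $1 " + lockClause(wait)
}

func main() {
	fmt.Println(buildLockQuery(true))
	fmt.Println(buildLockQuery(false))
}
```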

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Credit API Refactors (History Rewrite Part I) #2369: Modified the same LockOwnerForTx callsites and interface definition; that PR changed parameter types while this one adds the wait behavior parameter.
Balance & Usage Consistency (History Rewrite Part II) #2383: Updates owner/entitlement row-lock APIs with the same boolean wait parameter pattern and threads it through implementation layers.
feat(entitlements): improve snapshotting behavior #3557: Modifies snapshot-related logic in openmeter/credit/helper.go; overlaps on the snapshot timing and locking semantics.

Suggested reviewers

turip
chrisgacsal

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(credit): snapshotting concurrency' directly and clearly summarizes the main change: addressing a race condition in the snapshotting process by implementing optimistic locking behavior. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

openmeter/credit/helper.go (1)
180-185: Add observability when lock acquisition fails.

When the optimistic lock fails, the function silently returns nil and skips snapshot persistence. This is fine from a correctness standpoint (matches the PR objective), but there's no logging or metric emission to help operators understand when/how often this happens.

Frequent lock contention causing snapshot misses could indicate a performance issue worth investigating. Consider adding a debug/info log here.
♻️ Suggested improvement
 	if err := transaction.RunWithNoValue(ctx, m.GrantRepo, func(ctx context.Context) error {
 		return m.OwnerConnector.LockOwnerForTx(ctx, snapParams.owner, false)
 	}); err != nil {
 		// If we failed to acquire the lock we simply don't save the snapshot
+		m.Logger.DebugContext(ctx, "skipping snapshot persistence due to lock contention", "owner", snapParams.owner)
 		return nil
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openmeter/credit/helper.go` around lines 180 - 185, The block calling
transaction.RunWithNoValue with m.GrantRepo and m.OwnerConnector.LockOwnerForTx
currently swallows errors and returns nil; update this error branch to emit
observability (e.g., process or package logger and/or metrics) so operators know
when lock acquisition for snapParams.owner fails—log a descriptive message
including the owner identifier (snapParams.owner) and the error returned from
RunWithNoValue/LockOwnerForTx, and optionally increment a contention/failure
metric; keep the current behavior of not persisting the snapshot after
logging/metric emission.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@openmeter/credit/balance/service_test.go`:
- Around line 97-99: Add a unit test that simulates lock acquisition failure by
making MockOwnerConnector.LockOwnerForTx return a non-nil error when called with
wait=false and then invoke the helper function in openmeter/credit/helper.go
(the code path that calls LockOwnerForTx with snapParams.owner, false); assert
that snapshot persistence is skipped (e.g., no calls to the persistence/mock
store or no snapshot written) and that the function returns gracefully without
propagating the lock error. Locate MockOwnerConnector.LockOwnerForTx and the
helper function in helper.go to implement the mock behavior and the assertions.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d8d3f1d1-a337-4293-9b66-726d3edf9a94

📥 Commits

Reviewing files that changed from the base of the PR and between 91cdcc7 and fb74161.

📒 Files selected for processing (8)

openmeter/credit/balance.go
openmeter/credit/balance/service_test.go
openmeter/credit/grant.go
openmeter/credit/grant/owner_connector.go
openmeter/credit/helper.go
openmeter/entitlement/adapter/entitlement.go
openmeter/entitlement/metered/grant_owner_adapter.go
openmeter/entitlement/repository.go

coderabbitai · 2026-03-23T17:50:11Z

+func (m *MockOwnerConnector) LockOwnerForTx(ctx context.Context, id models.NamespacedID, wait bool) error {
 	return nil
 }


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for tests that might cover wait=false or NoWait behavior
rg -n "wait.*false|NoWait|LockOwnerForTx.*false" --type go -g '*_test.go'

Repository: openmeterio/openmeter

Length of output: 47

🏁 Script executed:

# First, let's find where LockOwnerForTx with wait=false is actually called
rg -n "LockOwnerForTx.*false|wait.*false" --type go -g '*.go' | head -20

Repository: openmeterio/openmeter

Length of output: 166

🏁 Script executed:

# Look at the helper.go file mentioned in the comment to see how wait=false is used
fd -n "helper.go" . | grep -E "balance|credit"

Repository: openmeterio/openmeter

Length of output: 295

🏁 Script executed:

# Check the service_test.go file to understand what tests exist
wc -l openmeter/credit/balance/service_test.go

Repository: openmeterio/openmeter

Length of output: 109

🏁 Script executed:

# Look at helper.go context around line 181 where wait=false is called
sed -n '170,190p' openmeter/credit/helper.go

Repository: openmeterio/openmeter

Length of output: 1231

🏁 Script executed:

# Read the entire service_test.go to understand test coverage
cat -n openmeter/credit/balance/service_test.go | head -100

Repository: openmeterio/openmeter

Length of output: 3812

🏁 Script executed:

# Also check what functions call the helper function with wait=false
rg -B 5 "LockOwnerForTx.*false" --type go openmeter/credit/helper.go

Repository: openmeterio/openmeter

Length of output: 467

🏁 Script executed:

# Check if there are any other tests that might verify lock acquisition failures
rg -A 10 "func Test" openmeter/credit/balance/service_test.go | head -50

Repository: openmeterio/openmeter

Length of output: 352

🏁 Script executed:

# Check if there are integration tests that might cover lock failure scenarios
fd service_test.go --type f | xargs wc -l

Repository: openmeterio/openmeter

Length of output: 565

🏁 Script executed:

# Look at the rest of service_test.go to understand the test structure
tail -100 openmeter/credit/balance/service_test.go

Repository: openmeterio/openmeter

Length of output: 3673

Add tests for lock acquisition failures with wait=false.

The mock's signature update is correct, but there's no test coverage for when lock acquisition fails. Looking at openmeter/credit/helper.go:181, the code gracefully skips snapshot persistence when LockOwnerForTx(ctx, snapParams.owner, false) fails—but no tests verify this behavior. Since the mock always returns nil, we can't exercise this error path.

Consider adding a test that verifies the snapshot is skipped gracefully when the lock can't be acquired with wait=false. This would catch regressions if the lock-contention handling changes.


fix(credit): cache concurrency

fb74161

GAlexIHU requested a review from a team as a code owner March 23, 2026 17:43

GAlexIHU added the release-note/bug-fix Release note: Bug Fixes label Mar 23, 2026

GAlexIHU temporarily deployed to dev March 23, 2026 17:43 — with GitHub Actions Inactive

coderabbitai Bot reviewed Mar 23, 2026

GAlexIHU enabled auto-merge (squash) March 23, 2026 17:53

tothandras approved these changes Mar 23, 2026

GAlexIHU merged commit b36c866 into main Mar 23, 2026
37 of 39 checks passed

GAlexIHU deleted the fix/balance-snapshot branch March 23, 2026 18:00
