Skip to content
This repository was archived by the owner on Jun 23, 2025. It is now read-only.

Conversation

@lucifercr07
Copy link
Contributor

@lucifercr07 lucifercr07 commented Mar 25, 2025

Issues:

  1. watch_manager map is not thread safe thus, when multiple go-routines for watch mode are triggered it causes race condition and errors out.
  2. cmdResNil was a shared object used across cmd package. At time multiple go-routines were updating the same object as part of r.R.Attrs.Fields["fingerprint"] = structpb.NewStringValue(strconv.FormatUint(uint64(c.Fingerprint()), 10)) or proto.Marshal which was causing concurrency error.

Testing done:

  • Validated with 900 concurrent watch connections with 10 updates for key/sec.
  • Observed no issues with watch response across clients.

Summary by CodeRabbit

  • New Features

    • Improved command processing now provides a standardized response when no valid data is available.
  • Refactor

    • Enhanced concurrent operations with robust synchronization in watch management, ensuring greater stability and data consistency during multi-threaded interactions.

@coderabbitai
Copy link

coderabbitai bot commented Mar 25, 2025

Walkthrough

The changes update how nil objects are handled in the command module by replacing a static nil response with a dynamically generated one via a new function, GetNilRes(). Additionally, the code within the Execute method now uses this function instead of an unused variable. Separate modifications in the WatchManager add a sync.RWMutex and apply locking mechanisms across its methods to ensure thread safety during concurrent operations.

Changes

File(s) Change Summary
internal/cmd/cmd_get.go, internal/cmd/cmds.go Refactored nil response handling by introducing GetNilRes(), which encapsulates the creation of a nil CmdRes. The Execute method now calls this function instead of using a static variable.
internal/server/…/watch_manager.go Added a sync.RWMutex field (mu) to WatchManager and incorporated locking (via defer) in methods like RegisterThread, HandleWatch, HandleUnwatch, CleanupThreadWatchSubscriptions, and NotifyWatchers for thread safety.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Caller
    participant CmdGet as cmdResFromObject
    participant NilRes as GetNilRes

    Caller->>CmdGet: Call cmdResFromObject(obj)
    alt obj is nil
         CmdGet->>NilRes: Request nil response
         NilRes-->>CmdGet: Return CmdRes with Response_VNil
    else obj is valid
         CmdGet-->>Caller: Processed non-nil response
    end
    CmdGet-->>Caller: Return final response
Loading
sequenceDiagram
    participant Thread as Thread
    participant WM as WatchManager

    Thread->>WM: Call method (e.g., RegisterThread)
    WM->>WM: Acquire mu lock (write)
    WM-->>WM: Update maps (clientWatchThreadMap, keyFPMap, fpClientMap)
    WM->>WM: Defer release of mu lock
    WM-->>Thread: Return result
Loading

Poem

I hop through lines of clever code,
With changes neat, a brand-new mode.
Nil responses now spring to life,
And mutex locks save threads from strife.
A bunny's cheer for every tweak,
CodeRabbit smiles—our changes speak!
🐇🌟

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
internal/cmd/cmd_get.go (1)

59-62: Improves thread safety by using per-request nil response

The change from cmdResNil to GetNilRes() addresses potential race conditions when multiple goroutines access the same shared object. This ensures that each request gets its own instance of the nil response object, preventing concurrent modifications.

Consider adding a small comment explaining why GetNilRes() is used instead of the static cmdResNil:

  if obj == nil {
+    // Use GetNilRes() instead of cmdResNil to avoid race conditions in concurrent environments
    return GetNilRes(), nil
  }
internal/cmd/cmds.go (1)

155-159: Good encapsulation of nil response creation

This new function properly encapsulates the creation of a nil response object, which helps maintain code consistency and addresses the concurrency issues mentioned in the PR.

Consider adding documentation to explain the purpose of this function:

+// GetNilRes returns a new CmdRes with a nil value response.
+// This function is used to ensure thread safety by creating a new response object
+// for each request instead of using a shared static object.
func GetNilRes() *CmdRes {
	return &CmdRes{R: &wire.Response{
		Value: &wire.Response_VNil{VNil: true},
	}}
}
internal/server/ironhawk/watch_manager.go (2)

36-42: Proper locking for RegisterThread

Adding mutex locking for the RegisterThread method ensures thread-safe modifications to the clientWatchThreadMap. The use of defer to unlock is a good practice to ensure the mutex is always released.

Since this method only writes to the map when t.Mode == "watch", consider using a read lock first and upgrading to a write lock only when necessary:

func (w *WatchManager) RegisterThread(t *IOThread) {
-	w.mu.Lock()
-	defer w.mu.Unlock()
+	if t.Mode == "watch" {
+		w.mu.Lock()
+		defer w.mu.Unlock()
+		w.clientWatchThreadMap[t.ClientID] = t
+	}
-	if t.Mode == "watch" {
-		w.clientWatchThreadMap[t.ClientID] = t
-	}
}

132-135: Thread safety added to NotifyWatchers

Adding mutex locking to the NotifyWatchers method ensures thread-safe access to the various maps when notifying watchers. This is a critical method for the watch functionality, and the locking improves reliability.

For optimal performance, you might consider using read locks for parts of this method that only read from the maps (before executing commands), as this would allow multiple concurrent reads:

func (w *WatchManager) NotifyWatchers(c *cmd.Cmd, shardManager *shardmanager.ShardManager, t *IOThread) {
-	w.mu.Lock()
-	defer w.mu.Unlock()
+	w.mu.RLock()
+	defer w.mu.RUnlock()

	key := c.Key()
	// Rest of the method...
}

However, since this method does modify maps in certain conditions (line 156), a more complex locking strategy with upgrading from read to write locks would be needed, which might add unnecessary complexity. The current approach with a single write lock is safer and easier to maintain.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 64e24d4 and 16fe270.

📒 Files selected for processing (3)
  • internal/cmd/cmd_get.go (1 hunks)
  • internal/cmd/cmds.go (2 hunks)
  • internal/server/ironhawk/watch_manager.go (5 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
internal/server/ironhawk/watch_manager.go (2)
internal/server/ironhawk/iothread.go (3)
  • t (36-84)
  • t (86-89)
  • IOThread (17-22)
internal/cmd/cmds.go (6)
  • c (31-33)
  • c (35-37)
  • c (39-44)
  • c (46-65)
  • c (148-148)
  • Cmd (23-29)
internal/cmd/cmd_get.go (1)
internal/cmd/cmds.go (1)
  • GetNilRes (155-159)
🔇 Additional comments (5)
internal/cmd/cmds.go (1)

46-48: Thread safety improvement for command execution

Changing from the static cmdResNil to a dynamically generated response through GetNilRes() is an excellent improvement for thread safety. This prevents race conditions in concurrent environments by ensuring each command execution gets its own fresh response object.

internal/server/ironhawk/watch_manager.go (4)

17-19: Thread safety improvement for WatchManager

Adding a mutex to the WatchManager struct is an excellent change that addresses race conditions when multiple goroutines interact with the watch manager's maps. This is a key fix for the concurrency issues described in the PR.


44-47: Thread safety added to HandleWatch

Adding mutex locking for the HandleWatch method ensures thread-safe access to the various maps modified in this method. Good use of defer to ensure the mutex is always unlocked.


77-80: Thread safety added to HandleUnwatch

The addition of mutex locking here addresses potential race conditions when multiple goroutines attempt to unsubscribe from watches concurrently. The use of defer ensures the mutex is properly released.


114-117: Thread safety added to CleanupThreadWatchSubscriptions

Adding mutex locking to the cleanup method ensures that the removal of watch subscriptions happens safely in concurrent environments. The consistent pattern of using defer for unlocking is good practice.

Comment on lines +155 to +160
func GetNilRes() *CmdRes {
return &CmdRes{R: &wire.Response{
Value: &wire.Response_VNil{VNil: true},
}}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this function? why can't we reuse cmdResNil?

Copy link
Contributor Author

@lucifercr07 lucifercr07 Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmdResNil was a shared object used across cmd package. At time multiple go-routines were updating the same object as part of r.R.Attrs.Fields["fingerprint"] = structpb.NewStringValue(strconv.FormatUint(uint64(c.Fingerprint()), 10)) or proto.Marshal which was causing concurrency error.

Example it was returned in cmdResFromObject for different go-routines when there's no value for object

func cmdResFromObject(obj *object.Obj) (*CmdRes, error) {
    if obj == nil {
        return cmdResNil, nil
    }

Can be validated from below logs, they've same object address for different go-routines

2025-03-25T12:38:46+05:30 DBG Response object address address=0x104c9cea0
2025-03-25T12:38:46+05:30 DBG notifying watchers for key key=k1 watchers=141
2025-03-25T12:38:46+05:30 DBG command executed client_id=f94ec20d-28d1-454e-b505-b0e7e0f0ab0d cmd="HANDSHAKE f94ec20d-28d1-454e-b505-b0e7e0f0ab0d command" mode=command took_ns=334
2025-03-25T12:38:46+05:30 DBG command executed client_id=28cd16a8-8f5a-4924-b16f-437a067b9185 cmd="GET.WATCH k1" mode=command took_ns=417
2025-03-25T12:38:46+05:30 DBG Response object address address=0x104c9cea0
fatal error: concurrent map iteration and map write

@arpitbbhayani arpitbbhayani merged commit cf52728 into dicedb:master Mar 26, 2025
3 of 4 checks passed
@arpitbbhayani
Copy link
Contributor

Thanks for the patch merged!

@arpitbbhayani arpitbbhayani mentioned this pull request Mar 27, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants