Skip to content

Conversation

@agaffney
Copy link
Contributor

@agaffney agaffney commented Nov 25, 2025


Summary by cubic

Add initial Handshake indexer that connects to a Handshake peer and streams blocks. It’s gated by the new INDEXER_HANDSHAKE_ADDRESS config and includes automatic reconnect with backoff.

  • New Features

    • New Handshake indexer with peer connect, sync callback, and state tracking.
    • Exponential reconnect backoff up to 2 minutes on disconnect, with logging.
    • New config: Indexer.handshakeAddress (env: INDEXER_HANDSHAKE_ADDRESS).
    • Split startup into startCardano and startHandshake.
  • Bug Fixes

    • Signal shutdown via Peer.DoneChan and avoid emitting async errors during shutdown.
    • Fix Update covenant parsing to expect 3 items; remove BlockHash and improve panic messages.

Written for commit f2d6cee. Summary will update automatically on new commits.

Summary by CodeRabbit

  • New Features

    • Handshake subsystem with automatic peer reconnection, exponential backoff, and sync-driven updates.
    • New configuration option to specify the handshake address.
  • Refactor

    • Split indexer startup into separate Cardano and handshake startup steps for clearer sequencing.
  • Bug Fixes

    • Improved shutdown/error signaling to avoid spurious errors during graceful shutdown.
  • Breaking Changes

    • Removed a previously exported field from the covenant update structure (tests updated).

✏️ Tip: You can customize this high-level summary in your review settings.

@agaffney agaffney requested a review from a team as a code owner November 25, 2025 01:51
@coderabbitai
Copy link

coderabbitai bot commented Nov 25, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds a HandshakeAddress config field and integrates a handshake subsystem into the indexer. Indexer gains a handshakeState field and Start() now invokes startCardano() followed by startHandshake(). The handshake code reads the configured address (no-op if empty), creates a Peer, connects to the remote address, and runs a Sync-driven loop that tracks block height, logs block/hash/prev-hash and iterates transactions to log covenant-related data. A reconnection goroutine implements exponential backoff (starting 1s, doubling, capped at 120s) and resets on successful connect. Peer.Close now closes the done channel; recvLoop and Sync check that done channel to avoid emitting async errors during shutdown. UpdateCovenant had its BlockHash field removed and related parsing and tests updated.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Inspect concurrency and channel lifecycle changes in internal/handshake/peer.go for potential send-after-close races and proper shutdown signaling (DoneChan, Close, recvLoop, Sync).
  • Review reconnection/backoff loop and error handling in internal/indexer/handshake.go, including reset-on-success behavior and proper goroutine termination when peer.DoneChan is closed.
  • Verify startup sequencing and handshakeState integration in internal/indexer/indexer.go (startCardano vs startHandshake).
  • Confirm new HandshakeAddress config field tags and usage in internal/config/config.go.
  • Validate covenant parsing changes and updated tests in internal/handshake/covenant*.go.

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: initial Handshake indexer support' directly and accurately describes the main objective of the changeset: adding initial Handshake indexer functionality.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/handshake-indexer

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
internal/indexer/handshake.go (1)

25-38: Consider applying reconnection logic to initial connection failures.

If the initial connection at line 31 fails, the function returns an error without attempting reconnection. This means the indexer will fail to start if the handshake peer is temporarily unavailable. Consider applying the same exponential backoff reconnection logic for initial connection failures.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 27a3f30 and ba87fc7.

📒 Files selected for processing (4)
  • internal/config/config.go (1 hunks)
  • internal/handshake/peer.go (1 hunks)
  • internal/indexer/handshake.go (1 hunks)
  • internal/indexer/indexer.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
internal/config/config.go (2)
internal/handshake/network.go (1)
  • Network (9-13)
internal/handshake/transaction.go (1)
  • Address (185-188)
internal/indexer/handshake.go (5)
internal/handshake/peer.go (2)
  • Peer (30-44)
  • NewPeer (48-65)
internal/indexer/indexer.go (1)
  • Indexer (57-65)
internal/config/config.go (1)
  • GetConfig (177-179)
internal/handshake/network.go (1)
  • NetworkMainnet (15-19)
internal/handshake/block.go (1)
  • Block (22-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Analyze (go)
🔇 Additional comments (6)
internal/config/config.go (1)

53-61: LGTM!

The new HandshakeAddress field is properly configured with consistent YAML and environment variable tags, following the established pattern in this struct.

internal/handshake/peer.go (1)

479-480: LGTM!

Closing the error channel properly signals shutdown to consumers. Each peer instance has its own error channel, so this is safe even when reconnection creates new peer instances.

internal/indexer/handshake.go (1)

18-23: LGTM!

The handshakeState struct appropriately tracks peer connection state, backoff timing, and block height.

internal/indexer/indexer.go (3)

57-65: LGTM!

The handshakeState field is properly added to track the handshake subsystem state.


79-87: LGTM!

The refactored Start() method cleanly separates Cardano and Handshake initialization concerns with appropriate error handling.


89-251: LGTM!

The extracted startCardano() method properly encapsulates all Cardano-related initialization logic. The refactoring maintains the original behavior while improving code organization.

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 4 files

Prompt for AI agents (all 1 issues)

Understand the root cause of the following 1 issues and fix them.


<file name="internal/handshake/peer.go">

<violation number="1" location="internal/handshake/peer.go:480">
Closing `errorCh` here is unsafe because `recvLoop` and the sync goroutine still send errors to the same channel; once the connection shuts down they will panic with “send on closed channel.”</violation>
</file>

Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR

_ = p.Close()
}
// Close error channel to signify shutdown to consumer
close(p.errorCh)
Copy link

@cubic-dev-ai cubic-dev-ai bot Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closing errorCh here is unsafe because recvLoop and the sync goroutine still send errors to the same channel; once the connection shuts down they will panic with “send on closed channel.”

Prompt for AI agents
Address the following comment on internal/handshake/peer.go at line 480:

<comment>Closing `errorCh` here is unsafe because `recvLoop` and the sync goroutine still send errors to the same channel; once the connection shuts down they will panic with “send on closed channel.”</comment>

<file context>
@@ -476,6 +476,8 @@ func (p *Peer) Sync(locator [][32]byte, syncFunc SyncFunc) error {
 			_ = p.Close()
 		}
+		// Close error channel to signify shutdown to consumer
+		close(p.errorCh)
 	}(locator)
 	return nil
</file context>
Fix with Cubic

@agaffney agaffney force-pushed the feat/handshake-indexer branch 2 times, most recently from c3edd89 to 3a52b50 Compare November 26, 2025 00:54
@agaffney
Copy link
Contributor Author

@cubic-dev-ai review

@cubic-dev-ai
Copy link

cubic-dev-ai bot commented Nov 26, 2025

@cubic-dev-ai review

@agaffney I've started the AI code review. It'll take a few minutes to complete.

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 4 files

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
internal/indexer/handshake.go (1)

48-82: Fix stale error value used in reconnect logging

Inside the reconnect loop, the err you log is always the one read from ErrorChan() (the initial disconnect), because the if err := i.handshakeConnectPeer(); line declares a new shadowed err. As a result, "connection to Handshake peer failed" will never show the actual dial/connect error for retry attempts, which makes debugging harder.

Reuse the same err variable instead of shadowing it so each failure logs the latest error:

-	// Async error handler
-	go func() {
-		err := <-i.handshakeState.peer.ErrorChan()
+	// Async error handler
+	go func() {
+		err := <-i.handshakeState.peer.ErrorChan()
 		if err != nil {
 			slog.Error(
 				"peer disconnected",
 				"error",
 				err,
 			)
 		}
 		// Try reconnecting to peer until we are successful
 		for {
-			if err := i.handshakeConnectPeer(); err == nil {
+			if err = i.handshakeConnectPeer(); err == nil {
 				i.handshakeState.peerBackoffDelay = 0
 				return
 			}
 			if i.handshakeState.peerBackoffDelay == 0 {
 				// Set initial backoff delay
 				i.handshakeState.peerBackoffDelay = 1 * time.Second
 			} else {
 				// Double backoff delay
 				i.handshakeState.peerBackoffDelay *= 2
 			}
 			// Don't delay longer than 2m
 			if i.handshakeState.peerBackoffDelay > 120*time.Second {
 				i.handshakeState.peerBackoffDelay = 120 * time.Second
 			}
 			slog.Error(
 				"connection to Handshake peer failed",
 				"error",
 				err,
 				"delay",
 				i.handshakeState.peerBackoffDelay.String(),
 			)
 			time.Sleep(i.handshakeState.peerBackoffDelay)
 		}
 	}()

If Peer.ErrorChan() ever closes cleanly without sending an error (e.g., on orderly shutdown), you may also want to treat err == nil as a signal to skip the reconnect loop entirely, to avoid reconnecting during shutdown.

🧹 Nitpick comments (1)
internal/indexer/handshake.go (1)

18-35: startHandshake gating and state wiring look good

Config‑gated startup and storing peerAddress into handshakeState before connecting is clean and keeps this subsystem nicely isolated. If you expect operators to sometimes forget to set Indexer.HandshakeAddress, consider an Info‑level log when the address is empty to make it obvious that the Handshake indexer is disabled, but this is optional.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c3edd89 and 3a52b50.

📒 Files selected for processing (4)
  • internal/config/config.go (1 hunks)
  • internal/handshake/peer.go (3 hunks)
  • internal/indexer/handshake.go (1 hunks)
  • internal/indexer/indexer.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • internal/handshake/peer.go
  • internal/config/config.go
  • internal/indexer/indexer.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/indexer/handshake.go (5)
internal/handshake/peer.go (2)
  • Peer (30-44)
  • NewPeer (48-65)
internal/indexer/indexer.go (1)
  • Indexer (57-65)
internal/config/config.go (1)
  • GetConfig (177-179)
internal/handshake/network.go (1)
  • NetworkMainnet (15-19)
internal/handshake/block.go (1)
  • Block (22-25)
🔇 Additional comments (1)
internal/indexer/handshake.go (1)

91-99: Sync callback and structured logging look solid

Incrementing a local blockHeight counter and emitting a slog.Debug with hex‑encoded hash and prevHash is a good minimal sync signal and consistent with the rest of the logging style. This should be easy to wire into future metrics if needed.

@agaffney agaffney force-pushed the feat/handshake-indexer branch from 3a52b50 to 26ec6e8 Compare November 26, 2025 21:45
@agaffney
Copy link
Contributor Author

{"timestamp":"2025-11-26T16:37:10-05:00","level":"DEBUG","msg":"Handshake domain update","component":"main","resdata":{"Version":0,"Records":[{"Name":"whois.oo00oo.","Address":"52.218.220.89"}]}}
{"timestamp":"2025-11-26T16:37:14-05:00","level":"DEBUG","msg":"Handshake domain update","component":"main","resdata":{"Version":0,"Records":[{"Name":"ns1.laboratory."},{"Name":"laboratory.","Address":"165.22.195.251"}]}}
{"timestamp":"2025-11-26T16:37:15-05:00","level":"DEBUG","msg":"Handshake domain update","component":"main","resdata":{"Version":0,"Records":[{"Name":"ns1.nameserver.io."},{"Name":"ns2.nameserver.io."},{"Name":"ns3.nameserver.io."},{"Name":"ns4.nameserver.io."}]}}
{"timestamp":"2025-11-26T16:37:15-05:00","level":"DEBUG","msg":"Handshake domain update","component":"main","resdata":{"Version":0,"Records":[{"Name":"ns1.nameserver.io."},{"Name":"ns2.nameserver.io."},{"Name":"ns3.nameserver.io."},{"Name":"ns4.nameserver.io."}]}}
{"timestamp":"2025-11-26T16:37:15-05:00","level":"DEBUG","msg":"Handshake domain update","component":"main","resdata":{"Version":0,"Records":[{"Name":"ns1.laboratory."},{"Name":"laboratory.","Address":"165.22.195.251"},{"Name":"ns1.laboratory.","Address":"165.22.195.251"}]}}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
internal/handshake/peer.go (1)

92-106: Race condition between Close() and error senders persists.

The guards at lines 193-198 and 485-490 check doneCh before sending to errorCh, but there's a TOCTOU race: Close() can execute between the check and the send, causing a panic on the closed errorCh.

A safer approach: either avoid closing errorCh (let GC reclaim it when the peer is no longer referenced), or use a mutex to protect the send path.

 func (p *Peer) Close() error {
 	p.mu.Lock()
 	defer p.mu.Unlock()
 	if p.conn == nil {
 		return errors.New("connection is not established")
 	}
 	if err := p.conn.Close(); err != nil {
 		return err
 	}
 	p.conn = nil
 	// Close done channel to signify shutdown internally
 	close(p.doneCh)
-	// Close error channel to signify shutdown to consumer
-	close(p.errorCh)
+	// Note: errorCh is not closed to avoid races with senders.
+	// Consumers should select on both errorCh and doneCh.
 	return nil
 }
🧹 Nitpick comments (2)
internal/handshake/peer.go (1)

193-201: Guard pattern is reasonable but consider helper extraction.

The shutdown guard pattern is duplicated at lines 193-198 and 485-490. Consider extracting to a helper method for consistency and to ensure both paths stay synchronized.

internal/indexer/handshake.go (1)

47-83: Consider adding shutdown signal for reconnection goroutine.

The async reconnection loop has no mechanism to stop during indexer shutdown. While the current os.Exit pattern terminates all goroutines, adding a done channel would enable graceful shutdown if the exit pattern changes in the future.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a52b50 and 26ec6e8.

📒 Files selected for processing (6)
  • internal/config/config.go (1 hunks)
  • internal/handshake/covenant.go (3 hunks)
  • internal/handshake/covenant_test.go (0 hunks)
  • internal/handshake/peer.go (3 hunks)
  • internal/indexer/handshake.go (1 hunks)
  • internal/indexer/indexer.go (2 hunks)
💤 Files with no reviewable changes (1)
  • internal/handshake/covenant_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/config/config.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (7)
internal/handshake/covenant.go (2)

65-82: LGTM - Improved panic message formatting.

Using fmt.Sprintf for panic messages provides clearer error context and is consistent with Go conventions.


120-150: UpdateCovenant BlockHash removal is verified as internally consistent.

All checks confirm the struct definition (lines 120-124), items length validation (3 items at line 134), and parsing logic are aligned after removing BlockHash. Tests have been appropriately updated, and the only external consumer (internal/indexer/handshake.go) accesses only ResourceData, which is unaffected. No orphaned references to UpdateCovenant.BlockHash remain in the codebase.

internal/indexer/handshake.go (2)

18-35: LGTM - Clean initialization flow.

The handshakeState struct is well-organized, and startHandshake correctly guards against missing configuration before attempting connection.


91-113: LGTM - Sync handler with appropriate logging.

Using slog.Debug for per-block logging avoids noise during high-volume sync while still providing observability when needed.

internal/indexer/indexer.go (3)

79-87: LGTM - Clean subsystem separation.

The refactoring cleanly separates Cardano and Handshake startup. Note that if startHandshake() fails, the already-started Cardano pipeline continues running. This may be intentional (independent subsystems), but consider whether a failure in one should affect the other.


57-65: LGTM - Struct extension for Handshake state.

Adding handshakeState to the Indexer struct cleanly encapsulates the Handshake subsystem's state.


89-251: LGTM - Clean extraction of Cardano startup logic.

The startCardano() function preserves the existing startup behavior. The os.Exit(1) calls on lines 236 and 245 are pre-existing patterns, not introduced by this PR.

@agaffney agaffney force-pushed the feat/handshake-indexer branch from 26ec6e8 to 2407685 Compare November 26, 2025 22:00
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
internal/indexer/indexer.go (1)

232-247: Pre-existing pattern: os.Exit(1) bypasses cleanup.

These os.Exit(1) calls existed before this PR, but they prevent deferred cleanup and graceful shutdown. Consider returning errors up the call chain or using a shutdown signal for future improvement.

internal/indexer/handshake.go (2)

18-23: Consider adding sync protection for handshakeState.

The handshakeState struct is accessed from the main goroutine (during startup) and the async error handler goroutine (during reconnection). While writes to different fields don't race in Go, concurrent read/write to the same field (e.g., peer, peerBackoffDelay) without synchronization is a data race.

Consider adding a mutex:

 type handshakeState struct {
+	mu               sync.Mutex
 	peer             *handshake.Peer
 	peerAddress      string
 	peerBackoffDelay time.Duration
 	blockHeight      int
 }

And protect access in handshakeConnectPeer() and the error handler.


92-99: blockHeight tracking may be inaccurate after reconnection.

The counter starts at 0 and increments per block, but:

  1. It doesn't reflect the actual chain height (just blocks synced since startup).
  2. On reconnection, sync restarts from genesis (nil locator), but blockHeight isn't reset.

If this is intended as a debug counter, consider renaming to syncedBlockCount. If it should track actual height, use the block's height from the chain data.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26ec6e8 and 2407685.

📒 Files selected for processing (6)
  • internal/config/config.go (1 hunks)
  • internal/handshake/covenant.go (3 hunks)
  • internal/handshake/covenant_test.go (0 hunks)
  • internal/handshake/peer.go (3 hunks)
  • internal/indexer/handshake.go (1 hunks)
  • internal/indexer/indexer.go (2 hunks)
💤 Files with no reviewable changes (1)
  • internal/handshake/covenant_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/config/config.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
internal/handshake/covenant.go (1)

120-150: LGTM! Consistent removal of BlockHash from UpdateCovenant.

The changes are coherent: the struct definition, item count validation (4 → 3), and initialization logic are all updated consistently. The improved panic messages with fmt.Sprintf aid debugging.

internal/handshake/peer.go (1)

192-201: Guard pattern is correct, but depends on errorCh not being closed.

The non-blocking select on doneCh before sending to errorCh is the right pattern. However, this only works safely if errorCh is never closed (see comment on Close()).

internal/indexer/indexer.go (1)

79-87: Clean separation of startup concerns.

Splitting Start() into startCardano() and startHandshake() improves code organization and allows each subsystem to be initialized independently. The sequential error handling is appropriate.

internal/indexer/handshake.go (1)

100-112: LGTM! Transaction processing and covenant logging.

The sync handler correctly iterates transactions and outputs, logging domain registrations and updates via structured logging. The type switch on covenant types is clean.

@agaffney agaffney force-pushed the feat/handshake-indexer branch from 2407685 to 8800a47 Compare November 26, 2025 22:10
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
internal/handshake/covenant.go (1)

65-82: Consider avoiding panics on malformed covenants.

GenericCovenant.Covenant() still panics if NewRegisterCovenantFromGeneric/NewUpdateCovenantFromGeneric fail, now with clearer messages. Given this runs on external chain data, a single bad or unexpected covenant will crash the process. Consider propagating an error (or returning the generic covenant) instead of panicking so the indexer can log-and-skip bad items without taking down the service.

internal/indexer/indexer.go (1)

89-129: Optional: guard against duplicate watched addresses on repeated Start().

startCardano() appends to i.watched without resetting it. If Start() were ever invoked more than once on the same Indexer (tests, hot-restart, etc.), watched entries and discovered addresses would be duplicated. If repeated starts are a possibility, consider clearing i.watched at the top of startCardano().

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2407685 and 8800a47.

📒 Files selected for processing (6)
  • internal/config/config.go (1 hunks)
  • internal/handshake/covenant.go (3 hunks)
  • internal/handshake/covenant_test.go (0 hunks)
  • internal/handshake/peer.go (4 hunks)
  • internal/indexer/handshake.go (1 hunks)
  • internal/indexer/indexer.go (2 hunks)
💤 Files with no reviewable changes (1)
  • internal/handshake/covenant_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/handshake/peer.go
  • internal/config/config.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
internal/handshake/covenant.go (1)

120-150: UpdateCovenant item-layout change looks consistent.

Switching UpdateCovenant to 3 items (name hash, height, resource data) and dropping BlockHash is internally consistent: length check matches fields used and indices 0/1/2 are consumed as expected.

internal/indexer/indexer.go (1)

57-87: Split of Start into startCardano/startHandshake looks fine.

The new handshakeState field and the refactor of Start() into startCardano() + startHandshake() preserve the previous Cardano startup behavior while cleanly layering handshake initialization on top.

internal/indexer/handshake.go (2)

25-35: Handshake startup gating is reasonable.

Conditionally starting handshake based on Indexer.HandshakeAddress and storing the address in handshakeState is straightforward and keeps handshake optional.


95-116: Handshake sync logging and covenant inspection look good for initial visibility.

Incrementing a local blockHeight counter and emitting slog.Debug with block hash/prev-hash plus ResourceData for Register/Update covenants provides useful initial observability and matches the debug logs you shared.

Comment on lines 37 to 85
func (i *Indexer) handshakeConnectPeer() error {
slog.Info("connecting to Handshake peer", "address", i.handshakeState.peerAddress)
p, err := handshake.NewPeer(nil, handshake.NetworkMainnet)
if err != nil {
return err
}
i.handshakeState.peer = p
if err := i.handshakeState.peer.Connect(i.handshakeState.peerAddress); err != nil {
return err
}
// Async error handler
go func() {
select {
case err := <-i.handshakeState.peer.ErrorChan():
slog.Error(
"peer disconnected",
"error",
err,
)
case <-i.handshakeState.peer.DoneChan():
// Stop waiting on connection shutdown
}
// Try reconnecting to peer until we are successful
for {
err = i.handshakeConnectPeer()
if err == nil {
i.handshakeState.peerBackoffDelay = 0
return
}
if i.handshakeState.peerBackoffDelay == 0 {
// Set initial backoff delay
i.handshakeState.peerBackoffDelay = 1 * time.Second
} else {
// Double backoff delay
i.handshakeState.peerBackoffDelay *= 2
}
// Don't delay longer than 2m
if i.handshakeState.peerBackoffDelay > 120*time.Second {
i.handshakeState.peerBackoffDelay = 120 * time.Second
}
slog.Error(
"connection to Handshake peer failed",
"error",
err,
"delay",
i.handshakeState.peerBackoffDelay.String(),
)
time.Sleep(i.handshakeState.peerBackoffDelay)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: reconnect loop can spawn overlapping goroutines and race on state.

handshakeConnectPeer both:

  • Spawns an async goroutine that, after a disconnect, loops calling handshakeConnectPeer() in a for {} (lines 47–85), and
  • Starts sync with peer.Sync(...) and closes the peer on error (lines 88–90).

Because every successful connect creates a new goroutine with its own reconnect loop, and reconnect attempts also call handshakeConnectPeer() (which again spawns a goroutine), you can end up with multiple reconnect loops running concurrently after successive failures. Those loops:

  • All call handshakeConnectPeer() in parallel, and
  • Mutate shared fields like i.handshakeState.peer and i.handshakeState.peerBackoffDelay without synchronization.

This is a data-race risk and can also lead to duplicated connection attempts/backoff behavior.

Consider restructuring so there is exactly one goroutine responsible for reconnection, for example:

  • Factor out a helper that does a single “connect + sync” attempt without starting another error-handler goroutine.
  • Have a single long-lived error-handler / reconnect goroutine that:
    • Waits on the current peer’s ErrorChan/DoneChan.
    • Runs the single-attempt helper in a backoff loop until it succeeds.
    • Updates handshakeState.peer and peerBackoffDelay exclusively.

This keeps reconnection logic centralized and avoids overlapping loops writing shared state.

🤖 Prompt for AI Agents
internal/indexer/handshake.go lines 37-85: the current implementation spawns a
new goroutine on every successful connect which itself calls
handshakeConnectPeer in a loop, producing overlapping reconnection goroutines
and unsynchronized mutations of handshakeState (peer, peerBackoffDelay).
Refactor so handshakeConnectPeer performs a single connect+sync attempt and does
NOT spawn a reconnect goroutine; create one dedicated, long-lived reconnect
goroutine that exclusively waits on the current peer's ErrorChan/DoneChan and
performs reconnect attempts in a backoff loop using a helper (e.g.,
connectAttempt) that returns the new peer or an error — update
handshakeState.peer and peerBackoffDelay only from that single goroutine (or
protect shared state with a mutex) to eliminate races and duplicated loops.

Signed-off-by: Aurora Gaffney <aurora@blinklabs.io>
@agaffney agaffney force-pushed the feat/handshake-indexer branch from 8800a47 to f2d6cee Compare November 26, 2025 22:27
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
internal/indexer/handshake.go (1)

36-65: Reconnection design is sound.

The past review concern about overlapping goroutines has been addressed. There's exactly one long-lived reconnect loop (handshakeReconnectPeer), and each connection spawns one short-lived error handler that exits when DoneChan closes. Both the error handler and the wait at line 76 unblock on the same channel, preventing overlap.

🧹 Nitpick comments (1)
internal/indexer/handshake.go (1)

67-99: No graceful shutdown mechanism for the reconnect loop.

The for loop runs indefinitely with no exit condition. When the indexer shuts down, this goroutine will be leaked or abruptly terminated. Consider adding a context or done channel that can be closed to signal shutdown.

Example approach:

+func (i *Indexer) handshakeReconnectPeer(ctx context.Context) {
 	var err error
 	// Try reconnecting to peer until we are successful
 	for {
+		select {
+		case <-ctx.Done():
+			return
+		default:
+		}
 		err = i.handshakeConnectPeer()
 		if err == nil {
 			// Reset backoff delay
 			i.handshakeState.peerBackoffDelay = 0
 			// Wait for connection close
-			<-i.handshakeState.peer.DoneChan()
+			select {
+			case <-i.handshakeState.peer.DoneChan():
+			case <-ctx.Done():
+				_ = i.handshakeState.peer.Close()
+				return
+			}
 			continue
 		}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8800a47 and f2d6cee.

📒 Files selected for processing (6)
  • internal/config/config.go (1 hunks)
  • internal/handshake/covenant.go (3 hunks)
  • internal/handshake/covenant_test.go (0 hunks)
  • internal/handshake/peer.go (4 hunks)
  • internal/indexer/handshake.go (1 hunks)
  • internal/indexer/indexer.go (2 hunks)
💤 Files with no reviewable changes (1)
  • internal/handshake/covenant_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • internal/config/config.go
  • internal/handshake/peer.go
  • internal/indexer/indexer.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/indexer/handshake.go (5)
internal/handshake/peer.go (2)
  • Peer (30-44)
  • NewPeer (48-65)
internal/config/config.go (1)
  • GetConfig (177-179)
internal/handshake/network.go (1)
  • NetworkMainnet (15-19)
internal/handshake/block.go (1)
  • Block (22-25)
internal/handshake/covenant.go (5)
  • Covenant (32-34)
  • RegisterCovenant (84-89)
  • RegisterCovenant (91-91)
  • UpdateCovenant (120-124)
  • UpdateCovenant (126-126)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
internal/handshake/covenant.go (2)

65-82: Improved error context in panic messages.

The use of fmt.Sprintf to include the underlying error in panic messages provides better debugging context when covenant conversion fails.


128-151: Correct fix for UpdateCovenant parsing.

The change to expect 3 items and the removal of BlockHash handling aligns with the protocol fix described in the PR objectives.

internal/indexer/handshake.go (2)

25-34: LGTM: Async startup pattern.

Starting the reconnect loop in a goroutine allows the indexer to continue initialization while handshake connection happens asynchronously.


101-123: LGTM: Initial sync handler implementation.

The handler correctly processes blocks and logs covenant data at DEBUG level. This provides the foundation for the handshake indexer as described in the PR objectives.

@agaffney agaffney merged commit 206e4df into main Nov 27, 2025
11 checks passed
@agaffney agaffney deleted the feat/handshake-indexer branch November 27, 2025 18:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Demonstrate the ability to synchronize blocks and process domain ownership updates

3 participants