Skip to content

Conversation

@alltheseas
Copy link

@alltheseas alltheseas commented Dec 17, 2025

Summary

Native C implementation of the negentropy set reconciliation protocol (NIP-77) for nostrdb.

Key features:

  • MSB-first varint encoding/decoding
  • Bound and Range encoding with delta compression
  • Storage with binary search and fingerprint computation
  • Full reconciliation state machine
  • Filter-based storage initialization
  • Overflow guards for untrusted input

Commits:

  1. Core data structures and encoding
  2. Unit tests for core encoding functions
  3. Range encoding and decoding
  4. Message encoding and decoding
  5. Storage for sorted items
  6. Reconciliation state machine
  7. Config struct and is_complete API
  8. Filter-based storage initialization
  9. Critical NIP-77 protocol bug fixes
  10. Overflow guards for untrusted input

Test plan

  • All 43 unit tests pass (./test_negentropy)
  • Varint, hex, accumulator, bound, fingerprint, range, message, storage, and reconciliation tests
  • Integration test with reference negentropy implementation

closes #3

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added Negentropy protocol (NIP-77) implementation for efficient peer-to-peer data reconciliation with configurable synchronization parameters and reduced bandwidth usage.
  • Tests

    • Added comprehensive test suite covering encoding, storage management, and multi-round synchronization scenarios for the reconciliation module.

✏️ Tip: You can customize this high-level summary in your review settings.

alltheseas and others added 11 commits December 17, 2025 10:08
Implement the foundational components for the negentropy set
reconciliation protocol (NIP-77):

- MSB-first varint encoding/decoding (differs from LEB128)
- Bound structure for range boundaries with delta timestamp encoding
- 256-bit accumulator for fingerprint computation (mod 2^256 addition)
- Fingerprint generation using SHA256(sum || varint(count))[:16]
- Hex encoding utilities for NIP-77 wire format

The implementation uses early returns and guard clauses throughout
for clean control flow (nevernesting pattern).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Comprehensive tests for:
- Varint size calculation and roundtrip encoding
- Specific encoding values matching negentropy spec
- Buffer overflow protection
- Hex encoding/decoding with edge cases
- 256-bit accumulator arithmetic including carry
- Bound encoding with delta timestamps and infinity
- Fingerprint determinism

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add the Range structure and wire format encoding for all four modes:
- SKIP: no payload
- FINGERPRINT: 16-byte fingerprint
- IDLIST: count + array of 32-byte IDs
- IDLIST_RESPONSE: have_ids list + bitfield

The decode functions use zero-copy pointers into the input buffer
for IDLIST and IDLIST_RESPONSE payloads, avoiding unnecessary memcpy.

Includes comprehensive unit tests for all range modes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add message-level encoding/decoding for complete negentropy wire format:
- ndb_negentropy_message_encode(): Encode version + array of ranges
- ndb_negentropy_message_version(): Extract protocol version byte
- ndb_negentropy_message_count_ranges(): Count ranges without full decode

Messages start with version byte (0x61 for V1) followed by delta-encoded
ranges. Includes DOS protection limit of 256 ranges per message.

This completes the core wire format layer (Feature 1), enabling
encoding/decoding of all negentropy protocol messages as specified
in NIP-77.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add ndb_negentropy_storage structure for managing sorted (timestamp, id)
item arrays used in negentropy reconciliation:

- storage_init/destroy: Lifecycle management
- storage_add/add_many: Add items (any order)
- storage_seal: Sort items by (timestamp, id) and lock
- storage_get/size: Access sorted items
- storage_lower_bound: Binary search for range queries
- storage_fingerprint: Compute fingerprint for item ranges

The storage uses a dynamic array with exponential growth and qsort
for sorting. Items are sorted first by timestamp, then lexicographically
by ID to match the negentropy protocol's ordering requirements.

Fingerprint computation streams through items using the accumulator,
enabling efficient range-based reconciliation without buffering.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add the core negentropy reconciliation engine that processes messages
and determines set differences:

- ndb_negentropy_init/destroy: Context lifecycle
- ndb_negentropy_initiate: Create initial message with full fingerprint
- ndb_negentropy_reconcile: Process message and generate response
- ndb_negentropy_get_have/need_ids: Retrieve discovered differences

The reconcile() function implements the full protocol logic:
- SKIP: No action needed, ranges are identical
- FINGERPRINT: Compare and either SKIP (match) or split (differ)
- IDLIST: Respond with IdListResponse containing differences
- IDLIST_RESPONSE: Extract have/need IDs, respond with SKIP

Range splitting uses NDB_NEGENTROPY_IDLIST_THRESHOLD (16) to decide
between sending full IdLists vs sub-range Fingerprints, and
NDB_NEGENTROPY_SPLIT_COUNT (16) for the number of sub-ranges.

This enables efficient set reconciliation in O(log n) round trips
with minimal bandwidth usage via fingerprint comparison.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ndb_negentropy_config struct with frame_size_limit, idlist_threshold,
  and split_count parameters for tuning reconciliation behavior
- Add ndb_negentropy_is_complete() to check if reconciliation is done
- Update ndb_negentropy_init() to accept optional config (NULL = defaults)
- Use configurable thresholds in reconcile() instead of hardcoded constants
- Set is_complete flag when reconcile() produces empty response

The config struct allows callers to customize:
- frame_size_limit: constrain message sizes for memory-limited devices
- idlist_threshold: when to switch from fingerprint to idlist mode
- split_count: how many sub-ranges to create when fingerprints differ

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add ndb_negentropy_storage_from_filter() to populate negentropy storage
directly from NostrDB filter queries. This enables sync workflows where
a client can:

1. Create a filter (e.g., kind:1, authors:[pubkey])
2. Query local nostrdb with the filter
3. Initialize negentropy storage from results
4. Sync with a relay using the same filter

The function:
- Queries nostrdb using ndb_query()
- Adds each matching note's (timestamp, id) to storage
- Automatically seals storage after population
- Returns count of items added, or -1 on error

Wrapped in NDB_NEGENTROPY_STANDALONE conditional to allow standalone
testing of core negentropy functions without full nostrdb dependencies.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix IDLIST encoding: copy IDs to contiguous buffer since storage
  items have timestamps interleaved (struct layout issue)
- Fix SKIP response: echo SKIP ranges to maintain space coverage,
  return empty message when all incoming ranges are SKIP
- Fix Bound comparison: treat omitted bytes as implicit zeros per spec
- Change IDLIST response from mode 3 to SKIP per NIP-77 (mode 3 not
  in specification)

These fixes address interoperability issues that would prevent
correct reconciliation with other NIP-77 implementations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add bounds checking in decode paths to prevent integer overflows
and DOS attacks when processing untrusted negentropy messages:

- Add NDB_NEGENTROPY_MAX_IDS_PER_RANGE (100K) constant to limit
  ID count per IDLIST range, preventing overflow in id_count * 32
- Check timestamp delta for overflow before adding to prev_timestamp
- Add 1MB limit for bitfield length in IDLIST_RESPONSE

These guards protect against malformed messages that could cause
memory corruption or excessive resource consumption.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@alltheseas
Copy link
Author

I'm working on the integration testing with iOS Damus

@alltheseas
Copy link
Author

alltheseas commented Dec 17, 2025

Confirmed integration with iOS Damus works. Will link iOS Damus PR shortly. Working on final bits.

image

@alltheseas
Copy link
Author

iOS damus timeline integration here: damus-io/damus#3444

Add stderr debug output to ndb_negentropy_reconcile() to aid debugging
when reconciliation fails. Each failure point now logs:

- Which range number caused the failure (1-indexed)
- The specific operation that failed (decode, encode, malloc)
- Relevant context (buffer offset, item count, remaining bytes)

This helps identify whether failures are due to:
- Malformed input messages from relays
- Buffer size exhaustion during output encoding
- Memory allocation failures under load
- Protocol violations (unknown modes)

The logging follows early-return patterns - each error case logs
immediately before returning, keeping the diagnostic close to
the failure site for easier debugging.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@alltheseas
Copy link
Author

alltheseas commented Dec 17, 2025

updated with debug logging in 0b1c113. This was not only useful, but absolutely necessary in troubleshooting iOS Damus nostrdb native negentropy sync.

negentropy: add diagnostic logging to reconcile function

  Changes (+46/-8 lines)

  1. Added #include <stdio.h> for fprintf
  2. Added range_count tracking - 1-indexed counter to identify which range caused failure
  3. Added 10 diagnostic fprintf statements at each early-return failure point:
    - Decode failure (malformed relay message)
    - SKIP encode failure
    - FP match SKIP encode failure
    - malloc failure (memory exhaustion)
    - IDLIST encode failure
    - FP split encode failure
    - IDLIST response SKIP encode failure
    - IDLIST_RESPONSE SKIP encode failure
    - Unknown mode (protocol violation)

  Why this helps

  When negentropy sync fails silently, these logs tell you:
  - Where it failed (range index, split index)
  - What failed (decode vs encode vs malloc)
  - How much buffer was consumed (offset, remaining bytes)

  This follows the nevernesting principle - each failure logs immediately before its early return, keeping diagnostics close to failure sites.

@alltheseas
Copy link
Author

@jb55 👀

@alltheseas
Copy link
Author

@jb55

Theres a lot of duplicate code in here from functions we already have, like varint and hex encoding

@alltheseas
Copy link
Author

@jb55

Theres a lot of duplicate code in here from functions we already have, like varint and hex encoding

@jb55 here is the assessment:

Valid criticism: Hex encoding is duplicated                                                                                                                                                
                                                                                                                                                                                             
  ndb_negentropy.c (lines 147-225) reimplements hex encoding/decoding when hex.h already provides this:                                                                                      
                                                                                                                                                                                             
  | ndb_negentropy.c          | hex.h (existing)         |                                                                                                                                   
  |---------------------------|--------------------------|                                                                                                                                   
  | hex_chars[] lookup table  | hex_table[] lookup table |                                                                                                                                   
  | ndb_negentropy_to_hex()   | hex_encode()             |                                                                                                                                   
  | hex_char_value()          | char_to_hex()            |                                                                                                                                   
  | ndb_negentropy_from_hex() | hex_decode()             |                                                                                                                                   
                                                                                                                                                                                             
  Not duplicate: Varint encoding                                                                                                                                                             
                                                                                                                                                                                             
  The varint functions are intentionally different:                                                                                                                                          
  - ndb_negentropy_varint_* = MSB-first (required by negentropy protocol)                                                                                                                    
  - cursor_push/pull_varint = LEB128/LSB-first                                                                                                                                               
                                                                                                                                                                                             
  Encoding 300:                                                                                                                                                                              
  - Negentropy MSB-first: 0x82 0x2C                                                                                                                                                          
  - LEB128: 0xAC 0x02                                                                                                  

Remove duplicate hex encoding functions and use the existing hex.h
utilities. The public API is preserved as thin wrappers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@alltheseas
Copy link
Author

@pablof7z can you review? Ser

@alltheseas
Copy link
Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Jan 5, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link

coderabbitai bot commented Jan 5, 2026

📝 Walkthrough

Walkthrough

A new negentropy-based synchronization module implementing NIP-77 protocol for efficient bidirectional reconciliation between peers. Includes complete C API for varint encoding, fingerprinting, range-based message handling, storage management, and multi-round reconciliation logic with NostrDB integration hooks and comprehensive test coverage.

Changes

Cohort / File(s) Summary
Build Configuration
Makefile
Added new module ndb_negentropy to build system by including header src/ndb_negentropy.h in HEADERS and source src/ndb_negentropy.c in SRCS without removing or reordering existing entries.
Negentropy Protocol Implementation
src/ndb_negentropy.h, src/ndb_negentropy.c
Comprehensive native implementation of Negentropy protocol with: varint encoding/decoding (MSB-first), 256-bit fingerprint accumulation and computation, bound/range/message frame encoding-decoding with multiple payload modes (SKIP, FINGERPRINT, IDLIST, IDLIST_RESPONSE), hex utilities, storage abstraction supporting add/seal/fingerprint/lower_bound operations, optional NostrDB filter integration, and reconciliation state machine with initiate/reconcile phases, have/need ID tracking, and DOS protection against buffer overruns and excessive range counts.
Test Suite
test_negentropy.c
Comprehensive unit test suite covering varint encoding, hex utilities, fingerprint accumulation, bound encoding, range encoding across all modes, message framing, storage operations, and multi-round reconciliation behavior with assertions on determinism, edge cases, and protocol conformance.

Sequence Diagram(s)

sequenceDiagram
    actor Initiator as Peer A<br/>(Initiator)
    participant ReconcileA as Reconciliation<br/>Engine A
    participant StorageA as Storage A
    actor Responder as Peer B<br/>(Responder)
    participant ReconcileB as Reconciliation<br/>Engine B
    participant StorageB as Storage B

    Note over Initiator,StorageB: Round 1: Initiation
    Initiator->>ReconcileA: initiate()
    ReconcileA->>StorageA: compute fingerprint<br/>over full range
    StorageA-->>ReconcileA: fingerprint + bounds
    ReconcileA-->>Initiator: initial message<br/>(FINGERPRINT range)
    Initiator->>Responder: send initial message

    Note over Initiator,StorageB: Round 2: Response
    Responder->>ReconcileB: reconcile(incoming msg)
    ReconcileB->>StorageB: check fingerprint<br/>against stored items
    alt Fingerprints match
        StorageB-->>ReconcileB: complete
        ReconcileB-->>Responder: empty response
    else Fingerprints differ
        StorageB-->>ReconcileB: have/need IDs
        ReconcileB->>ReconcileB: split range or<br/>switch mode
        ReconcileB-->>Responder: response ranges<br/>(SKIP/FINGERPRINT/IDLIST)
    end
    Responder->>Initiator: send response

    Note over Initiator,StorageB: Round 3+: Reconcile Loop
    Initiator->>ReconcileA: reconcile(response)
    ReconcileA->>StorageA: extract have/need IDs
    StorageA-->>ReconcileA: ID sets
    ReconcileA->>ReconcileA: decide next ranges
    alt Complete
        ReconcileA-->>Initiator: is_complete() = true
    else Continue
        ReconcileA-->>Initiator: next message
    end
    Initiator->>Responder: send next message
    
    Note over Initiator,Responder: Loop repeats until both<br/>reach completion
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


🐰 A negentropy bloom unfolds,
Fingerprints and splits, stories told,
Two peers now sync in bounded grace,
With varints swift and sorted space—
Where reconciliation hops and springs,
Efficient as the wind's soft wings!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 41.77% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'negentropy: native NIP-77 set reconciliation' accurately and concisely summarizes the main change—implementing a native C implementation of the negentropy NIP-77 protocol for set reconciliation.
Linked Issues check ✅ Passed The PR successfully implements the primary objective from issue #3 by providing a complete native C implementation of negentropy for efficient set reconciliation with nostrdb queries.
Out of Scope Changes check ✅ Passed All changes are directly aligned with the negentropy implementation objective: new negentropy module (header, implementation, tests), Makefile integration, and no unrelated modifications.
✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Fix all issues with AI Agents 🤖
In @Makefile:
- Around line 6-9: Add a Makefile target named test_negentropy that builds the
new test_negentropy.c into an executable (use existing variables like CC,
CFLAGS, LDS, LDFLAGS and DEPS/testdata/db/.dir as needed) and ensure it links
the same SRCS/dependencies as other tests; then wire that target into the check
(or check-all) target so make check runs test_negentropy as part of the test
suite. Reference symbols: create a target "test_negentropy", compile source
"test_negentropy.c", and add it to the "check" target dependency list.

In @src/ndb_negentropy.c:
- Around line 964-981: In storage_add_many, the addition needed = storage->count
+ count can overflow; before computing needed check that count <= SIZE_MAX -
storage->count (or equivalently if (count > SIZE_MAX - storage->count) return 0)
to detect overflow and fail safely; also compute realloc_size as size_t
new_capacity * sizeof(struct ndb_negentropy_item) and verify new_capacity <=
SIZE_MAX / sizeof(...) before calling realloc, returning 0 on overflow, then
proceed to realloc and update storage->items and storage->capacity only if
realloc succeeds.

In @src/ndb_negentropy.h:
- Around line 600-604: Update the misleading comment to reflect that IDs are
copied, not referenced: change the note in the block above storage_from_filter
to state that ndb_negentropy_storage_add copies the 32-byte ID (see memcpy to
item->id) so the transaction does not need to remain valid after
storage_from_filter returns; reference the functions storage_from_filter and
ndb_negentropy_storage_add and remove the claim about storing only references to
event data.
🧹 Nitpick comments (4)
test_negentropy.c (2)

138-153: Verify case-insensitive hex decoding.

The test uses uppercase hex input "deadbeef" (actually lowercase here) but the underlying hex_decode from hex.h should handle both cases. Consider adding a test with mixed-case input to verify case-insensitivity.


1196-1206: Test should verify the reconciliation results.

The test runs the reconciliation loop but doesn't verify the actual have_ids and need_ids at the end. After reconciliation:

  • Client should have need_ids containing id3 (0x03)
  • Client should have have_ids containing id1 (0x01)

Without these assertions, the test only verifies that reconciliation doesn't crash, not that it produces correct results.

🔎 Suggested verification
 	/* After reconciliation, check results */
-	/* Note: The exact IDs depend on protocol details */
+	const unsigned char *have, *need;
+	size_t have_count, need_count;
+	
+	have_count = ndb_negentropy_get_have_ids(&neg1, &have);
+	need_count = ndb_negentropy_get_need_ids(&neg1, &need);
+	
+	/* Client should need id3 (server has, client doesn't) */
+	/* Client should have id1 (client has, server doesn't) */
+	/* Note: verify at least one of these is non-zero */
+	assert(have_count > 0 || need_count > 0);

 	ndb_negentropy_destroy(&neg1);
src/ndb_negentropy.c (2)

1483-1486: Consider removing or gating debug logging for production.

The fprintf(stderr, ...) calls throughout ndb_negentropy_reconcile will produce output during error conditions. While useful for debugging, this may not be desired in production builds. Consider:

  1. Removing them before merge
  2. Gating behind a NDEBUG or similar compile-time flag
  3. Using a logging callback that can be disabled

1287-1311: storage_has_id is unused and should be either removed or integrated into IDLIST processing.

This function implements O(log n) binary search lookup for IDs in storage, but it's never called. The IDLIST processing at lines 1686-1722 uses nested linear searches with memcmp instead, resulting in O(n*m) performance. Either integrate this function into the IDLIST logic to improve performance, or remove it if it's not planned for future use.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 87a239e and ed0f8fb.

📒 Files selected for processing (4)
  • Makefile
  • src/ndb_negentropy.c
  • src/ndb_negentropy.h
  • test_negentropy.c
🧰 Additional context used
🧬 Code graph analysis (1)
src/ndb_negentropy.c (3)
src/hex.h (2)
  • hex_encode (53-65)
  • hex_decode (25-41)
ccan/ccan/crypto/sha256/sha256.c (1)
  • sha256 (243-250)
src/nostrdb.c (3)
  • ndb_query (5087-5112)
  • ndb_note_created_at (8857-8860)
  • ndb_note_id (8842-8845)
🔇 Additional comments (17)
test_negentropy.c (4)

1-11: LGTM!

Clean test file structure with proper includes and organization into logical test groups.


16-118: LGTM!

Varint tests are thorough, covering size calculation, roundtrip encoding/decoding, specific byte patterns for MSB-first format, and buffer overflow edge cases.


875-878: LGTM!

The lower_bound test correctly validates that searching for timestamp 2000 with an empty prefix returns index 1, which is the exact match for the item at that timestamp.


1212-1274: LGTM!

Good test organization with clear section headers and a final success message. The test suite provides solid coverage of the negentropy module's core functionality.

src/ndb_negentropy.c (9)

1-13: LGTM!

Clean file header and necessary includes. The hex.h include indicates the PR addressed the earlier code duplication concern.


103-141: LGTM!

The varint decode implementation correctly handles edge cases: empty buffer, null output, 10-byte maximum, and unterminated sequences. The MSB-first encoding matches the negentropy protocol specification.


201-218: LGTM!

The 256-bit accumulator addition with carry propagation is correctly implemented. Little-endian byte order and modular arithmetic (discarding final carry) matches the negentropy spec.


344-355: LGTM!

Good overflow protection when decoding timestamps. The check delta > UINT64_MAX - *prev_timestamp prevents integer overflow from malicious input.


583-597: LGTM!

Good DOS protection with NDB_NEGENTROPY_MAX_IDS_PER_RANGE limit preventing multiplication overflow when computing ids_size = id_count * 32. The zero-copy design is efficient.


820-855: LGTM!

The item_bound_compare function correctly implements the negentropy bound comparison semantics, including infinity handling and the rule that omitted prefix bytes are implicitly zero.


1151-1213: LGTM!

Good integration with NostrDB's query system. Memory management is correct with proper cleanup on error paths. The automatic sealing after population is a convenient design.


1686-1722: O(n*m) ID comparison is acceptable given threshold constraint.

The nested loops for comparing ID lists have O(nm) complexity, but since ID lists are only used for ranges smaller than idlist_threshold (default 16), the practical performance impact is limited to 1616 = 256 comparisons per range at most.


1611-1664: LGTM!

The range splitting logic correctly handles edge cases: ensures minimum 1 item per split, last split takes remainder, and preserves the original upper bound for protocol correctness.

src/ndb_negentropy.h (4)

1-29: LGTM!

Excellent documentation in the header. The protocol description, algorithm overview, and NIP-77 reference provide good context for users of the API.


52-55: Good documentation of non-standard mode.

The comment clarifies that IDLIST_RESPONSE is from the reference implementation, not NIP-77. This helps maintainers understand protocol compatibility.


784-791: LGTM!

The documentation correctly warns that the returned pointer may become invalid after the next reconcile() call due to potential reallocation of the ID arrays.


376-385: LGTM!

The DOS protection limits are well-documented with clear rationale. MAX_RANGES at 256 and MAX_IDS_PER_RANGE at 100,000 (3.2MB) are reasonable for production use.

Comment on lines +6 to +9
HEADERS = deps/lmdb/lmdb.h deps/secp256k1/include/secp256k1.h src/nostrdb.h src/cursor.h src/hex.h src/jsmn.h src/config.h src/random.h src/memchr.h src/cpu.h src/nostr_bech32.h src/block.h src/str_block.h src/print_util.h src/ndb_negentropy.h $(C_BINDINGS) $(CCAN_HDRS) $(BOLT11_HDRS)
FLATCC_SRCS=deps/flatcc/src/runtime/json_parser.c deps/flatcc/src/runtime/verifier.c deps/flatcc/src/runtime/builder.c deps/flatcc/src/runtime/emitter.c deps/flatcc/src/runtime/refmap.c
BOLT11_SRCS = src/bolt11/bolt11.c src/bolt11/bech32.c src/bolt11/amount.c src/bolt11/hash_u5.c
SRCS = src/base64.c src/hmac_sha256.c src/hkdf_sha256.c src/nip44.c src/nostrdb.c src/invoice.c src/nostr_bech32.c src/content_parser.c src/block.c src/binmoji.c src/metadata.c $(BOLT11_SRCS) $(FLATCC_SRCS) $(CCAN_SRCS)
SRCS = src/base64.c src/hmac_sha256.c src/hkdf_sha256.c src/nip44.c src/nostrdb.c src/invoice.c src/nostr_bech32.c src/content_parser.c src/block.c src/binmoji.c src/metadata.c src/ndb_negentropy.c $(BOLT11_SRCS) $(FLATCC_SRCS) $(CCAN_SRCS)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing build target for test_negentropy.

The new test file test_negentropy.c is mentioned in the PR (43 unit tests) but there's no corresponding Makefile target to compile and run it. The PR description says to run make test_negentropy && ./test_negentropy, but this target doesn't exist.

🔎 Suggested addition for test target
test_negentropy: test_negentropy.c $(DEPS) testdata/db/.dir
	$(CC) $(CFLAGS) test_negentropy.c $(LDS) $(LDFLAGS) -o $@

Also consider adding it to the check target or creating a check-all target.

🤖 Prompt for AI Agents
In @Makefile around lines 6-9, Add a Makefile target named test_negentropy that
builds the new test_negentropy.c into an executable (use existing variables like
CC, CFLAGS, LDS, LDFLAGS and DEPS/testdata/db/.dir as needed) and ensure it
links the same SRCS/dependencies as other tests; then wire that target into the
check (or check-all) target so make check runs test_negentropy as part of the
test suite. Reference symbols: create a target "test_negentropy", compile source
"test_negentropy.c", and add it to the "check" target dependency list.

Comment on lines +964 to +981
/* Ensure capacity for all items */
needed = storage->count + count;
if (needed > storage->capacity) {
new_capacity = storage->capacity;
if (new_capacity < STORAGE_INITIAL_CAPACITY)
new_capacity = STORAGE_INITIAL_CAPACITY;

while (new_capacity < needed)
new_capacity *= 2;

new_items = realloc(storage->items,
new_capacity * sizeof(struct ndb_negentropy_item));
if (new_items == NULL)
return 0;

storage->items = new_items;
storage->capacity = new_capacity;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential integer overflow in storage_add_many.

The calculation needed = storage->count + count at line 965 could overflow if both values are large. While unlikely in practice, this could cause under-allocation.

🔎 Suggested fix
 	/* Ensure capacity for all items */
 	needed = storage->count + count;
+	
+	/* Guard: check for overflow */
+	if (needed < storage->count)
+		return 0;
+	
 	if (needed > storage->capacity) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/* Ensure capacity for all items */
needed = storage->count + count;
if (needed > storage->capacity) {
new_capacity = storage->capacity;
if (new_capacity < STORAGE_INITIAL_CAPACITY)
new_capacity = STORAGE_INITIAL_CAPACITY;
while (new_capacity < needed)
new_capacity *= 2;
new_items = realloc(storage->items,
new_capacity * sizeof(struct ndb_negentropy_item));
if (new_items == NULL)
return 0;
storage->items = new_items;
storage->capacity = new_capacity;
}
/* Ensure capacity for all items */
needed = storage->count + count;
/* Guard: check for overflow */
if (needed < storage->count)
return 0;
if (needed > storage->capacity) {
new_capacity = storage->capacity;
if (new_capacity < STORAGE_INITIAL_CAPACITY)
new_capacity = STORAGE_INITIAL_CAPACITY;
while (new_capacity < needed)
new_capacity *= 2;
new_items = realloc(storage->items,
new_capacity * sizeof(struct ndb_negentropy_item));
if (new_items == NULL)
return 0;
storage->items = new_items;
storage->capacity = new_capacity;
}
🤖 Prompt for AI Agents
In @src/ndb_negentropy.c around lines 964-981, In storage_add_many, the addition
needed = storage->count + count can overflow; before computing needed check that
count <= SIZE_MAX - storage->count (or equivalently if (count > SIZE_MAX -
storage->count) return 0) to detect overflow and fail safely; also compute
realloc_size as size_t new_capacity * sizeof(struct ndb_negentropy_item) and
verify new_capacity <= SIZE_MAX / sizeof(...) before calling realloc, returning
0 on overflow, then proceed to realloc and update storage->items and
storage->capacity only if realloc succeeds.

Comment on lines +600 to +604
* Returns: Number of items added, or -1 on error.
*
* Note: The transaction must remain valid for the lifetime of the storage
* since we only store references to the event data.
*/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Documentation is misleading about reference vs. copy semantics.

The comment says "The transaction must remain valid for the lifetime of the storage since we only store references to the event data." However, the implementation in ndb_negentropy.c line 1198 calls ndb_negentropy_storage_add, which copies the 32-byte ID (see line 934: memcpy(item->id, id, 32)).

The transaction doesn't need to remain valid after storage_from_filter returns because IDs are copied, not referenced.

🔎 Suggested documentation fix
- * Note: The transaction must remain valid for the lifetime of the storage
- * since we only store references to the event data.
+ * Note: Event IDs are copied into storage, so the transaction only needs
+ * to remain valid during this function call.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
* Returns: Number of items added, or -1 on error.
*
* Note: The transaction must remain valid for the lifetime of the storage
* since we only store references to the event data.
*/
* Returns: Number of items added, or -1 on error.
*
* Note: Event IDs are copied into storage, so the transaction only needs
* to remain valid during this function call.
*/
🤖 Prompt for AI Agents
In @src/ndb_negentropy.h around lines 600-604, Update the misleading comment to
reflect that IDs are copied, not referenced: change the note in the block above
storage_from_filter to state that ndb_negentropy_storage_add copies the 32-byte
ID (see memcpy to item->id) so the transaction does not need to remain valid
after storage_from_filter returns; reference the functions storage_from_filter
and ndb_negentropy_storage_add and remove the claim about storing only
references to event data.

@jb55
Copy link
Contributor

jb55 commented Jan 5, 2026

@danieldaquino didn't you add negentropy syncing without changes to nostrdb?

@danieldaquino
Copy link
Contributor

@danieldaquino didn't you add negentropy syncing without changes to nostrdb?

@jb55 yes, I have a (somewhat untested) draft for negentropy support in Damus iOS I have done a little while ago. Here is where you can find it:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Negentropy sync

3 participants