Skip to content

Fix/redis scan iter optimization#823

Merged
PythonFZ merged 15 commits intomainfrom
fix/redis-scan-iter-optimization
Dec 22, 2025
Merged

Fix/redis scan iter optimization#823
PythonFZ merged 15 commits intomainfrom
fix/redis-scan-iter-optimization

Conversation

@PythonFZ
Copy link
Copy Markdown
Member

@PythonFZ PythonFZ commented Dec 22, 2025

Summary by CodeRabbit

  • Performance & Stability Improvements

    • Faster room/user/admin lookups via a global index for snappier UI.
    • More robust lock and resource cleanup with session-level lock tracking and clearer structured logs.
    • Targeted bookmark cache updates to avoid unnecessary full refreshes.
  • New Features

    • Room deletion endpoint to remove a room and its associated data.
  • Reliability

    • Atomic updates for user/admin changes to reduce inconsistent state.
  • Tests

    • Updated tests to reflect index-based room name collision checks.

✏️ Tip: You can customize this high-level summary in your review settings.

PythonFZ and others added 8 commits December 21, 2025 19:57
  - Add GlobalIndexKeys for rooms:index, users:index, admins:index
  - Use sismember for O(1) room existence checks
  - Use smembers for O(M) room/user/admin listing
  - Add rooms to index on creation via pipeline
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Dec 22, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds global Redis indices for rooms/users/admins, tracks locks per session and performs per-session lock cleanup on disconnect, replaces scan-based key enumerations with index lookups and pipelines across services, and adjusts tests and test-server IO redirection.

Changes

Cohort / File(s) Summary
Global Redis key helpers
src/zndraw/app/redis_keys.py
Add GlobalIndexKeys (ROOMS_INDEX, USERS_INDEX, ADMINS_INDEX) and accessors; add SessionKeys.session_locks(session_id) and RoomKeys.all_static_keys() for bulk room-key deletion.
Session lock tracking & cleanup
src/zndraw/app/lock_routes.py, src/zndraw/app/events.py
Record acquired lock_key in session:{id}:locks; remove on release. Rework disconnect cleanup to enumerate per-session locks, delete each lock and its metadata, then delete the session locks set; replace prints with structured logging.
Room index adoption & room deletion
src/zndraw/app/file_browser.py, src/zndraw/app/room_routes.py, src/zndraw/services/room_service.py, src/zndraw/utils.py
Replace scans over room:*:metadata with GlobalIndexKeys.rooms_index() lookups and sismember checks; decode bytes to strings; skip rooms missing metadata; add new rooms to global index on creation; add RoomService.delete_room(room_id) to remove from index and delete static room keys.
User & admin index adoption
src/zndraw/services/user_service.py, src/zndraw/services/admin_service.py
Convert user/admin writes and transitions to atomic Redis pipelines that update data and maintain users:index / admins:index; list_all_users() and get_all_admins() read from indices and prune missing entries.
Bookmarks cache optimization
src/zndraw/socket_manager.py
Replace full bookmark refresh with targeted cache updates for set/delete operations when index provided; add _refresh_all_bookmarks fallback and improved error handling (404 -> remove entry, others -> full refresh).
Tests & test setup
tests/test_room_naming.py, misc/conftest.py
Tests updated to populate rooms:index for collision scenarios and adjust a test signature; misc/conftest.py changes ZnDraw server startup stdout/stderr redirection from a pipe to DEVNULL to avoid blocking.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant Server
    participant Redis
    Note over Client,Server: Socket disconnect flow triggers cleanup
    Client-->>Server: disconnect event
    Server->>Server: extract/validate session_id
    alt session_id present
        Server->>Redis: SMEMBERS session:{session_id}:locks
        Redis-->>Server: [lock_key1, lock_key2, ...]
        loop for each lock_key
            Server->>Redis: DEL <lock metadata keys for lock_key>
            Redis-->>Server: OK
            Server->>Redis: DEL <lock_key>
            Redis-->>Server: OK
            Server->>Server: log debug cleaned lock_key
        end
        Server->>Redis: DEL session:{session_id}:locks
        Redis-->>Server: OK
        Server->>Server: notify room/presenter/workers (emit events / logs)
    else no session_id
        Server->>Server: skip lock cleanup (log)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Areas needing extra attention:
    • src/zndraw/app/events.py — correctness of per-session lock enumeration, deletion of associated metadata keys, and logging behavior.
    • src/zndraw/app/redis_keys.py — verify RoomKeys.all_static_keys() completeness and SessionKeys key naming conventions.
    • src/zndraw/services/user_service.py & src/zndraw/services/admin_service.py — pipeline atomicity and index maintenance, especially admin transfer logic in register_user.
    • file_browser.py / room_routes.py / utils.py — correct decoding of bytes, skipping rooms missing metadata, and consistent sismember usage.
    • tests/test_room_naming.py — ensure test semantics reflect index-based collision behavior.

Possibly related PRs

  • fix scrubbing issue #811 — modifies lock_routes.py and lock lifecycle behavior; likely related to session lock bookkeeping and acquire/release semantics.
  • Documentation Update #820 — updates misc/conftest.py ZnDraw server startup and IO redirection; directly related to misc/conftest.py change.
  • dynamic editing #804 — changes partial frame update endpoints and tests; may intersect with room/frame indexing and related services.

Poem

🐇 I hopped through sets and keys with cheer,

I kept each lock close, then swept them clear.
Indices planted, names no longer collide,
Redis hums tidy on every side.
A tiny hop — cleanup done, carrot near. 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Fix/redis scan iter optimization' directly relates to the PR's main objective of replacing Redis scan operations with more efficient index-based lookups throughout the codebase.
Docstring Coverage ✅ Passed Docstring coverage is 94.87% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/redis-scan-iter-optimization

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@PythonFZ PythonFZ marked this pull request as draft December 22, 2025 11:23
@PythonFZ PythonFZ marked this pull request as ready for review December 22, 2025 11:35
@PythonFZ PythonFZ requested a review from Copilot December 22, 2025 11:35
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/zndraw/app/events.py (1)

204-224: Good optimization from O(N) scan to O(M) session-scoped cleanup.

The refactored lock cleanup using per-session tracking is efficient. Consider using a pipeline for the batch deletion to reduce round-trips when cleaning up multiple locks:

🔎 Optional: Use pipeline for batch deletion
     if session_id:
         session_locks_key = SessionKeys.session_locks(session_id)
         lock_keys = r.smembers(session_locks_key)

         if lock_keys:
             log.info(
                 f"Cleaning up {len(lock_keys)} orphaned lock(s) for session {session_id}"
             )
-            for lock_key in lock_keys:
-                # Delete the lock and its metadata
-                r.delete(lock_key)
-                metadata_key = (
-                    f"{lock_key}:metadata"
-                    if isinstance(lock_key, str)
-                    else lock_key + b":metadata"
-                )
-                r.delete(metadata_key)
-                log.debug(f"Cleaned up orphaned lock '{lock_key}'")
-
-            # Delete the session locks set itself
-            r.delete(session_locks_key)
+            with r.pipeline() as pipe:
+                for lock_key in lock_keys:
+                    pipe.delete(lock_key)
+                    metadata_key = (
+                        f"{lock_key}:metadata"
+                        if isinstance(lock_key, str)
+                        else lock_key + b":metadata"
+                    )
+                    pipe.delete(metadata_key)
+                # Delete the session locks set itself
+                pipe.delete(session_locks_key)
+                pipe.execute()
+            for lock_key in lock_keys:
+                log.debug(f"Cleaned up orphaned lock '{lock_key}'")
src/zndraw/services/user_service.py (1)

231-258: Consider using WATCH for the admin status check to prevent race conditions.

The is_admin check at line 248 happens outside the pipeline. Between the check and the pipeline execution, another process could modify the admin status. For a "chill" review, this is unlikely to cause issues in practice, but worth noting for high-concurrency scenarios.

🔎 Alternative using WATCH for full atomicity
+        # Use WATCH for atomic check-and-update of admin status
+        with self.r.pipeline() as pipe:
+            pipe.watch(old_keys.admin_key())
+            is_admin = pipe.get(old_keys.admin_key())
+            pipe.multi()
             pipe.hset(
                 new_keys.hash_key(),
                 # ... existing mapping
             )
             # ... rest of pipeline operations
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 467ea97 and cbac2a5.

📒 Files selected for processing (10)
  • misc/conftest.py
  • src/zndraw/app/events.py
  • src/zndraw/app/file_browser.py
  • src/zndraw/app/lock_routes.py
  • src/zndraw/app/redis_keys.py
  • src/zndraw/app/room_routes.py
  • src/zndraw/services/admin_service.py
  • src/zndraw/services/room_service.py
  • src/zndraw/services/user_service.py
  • src/zndraw/utils.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: If sensible, implement collections.abc interfaces for classes, such as MutableMapping or MutableSequence
Use numpy style docstrings
Docstrings must be concise and to the point
Use type hints wherever possible. Import typing as t if necessary, but use list[int|float] | None instead of t.Optional[t.List[int|float]]
Imports should always be at the top of the file

Files:

  • misc/conftest.py
  • src/zndraw/app/events.py
  • src/zndraw/services/user_service.py
  • src/zndraw/services/admin_service.py
  • src/zndraw/app/file_browser.py
  • src/zndraw/app/room_routes.py
  • src/zndraw/app/lock_routes.py
  • src/zndraw/app/redis_keys.py
  • src/zndraw/utils.py
  • src/zndraw/services/room_service.py
🧬 Code graph analysis (7)
src/zndraw/app/events.py (1)
src/zndraw/app/redis_keys.py (2)
  • session_id (547-549)
  • session_locks (586-599)
src/zndraw/services/user_service.py (1)
src/zndraw/app/redis_keys.py (7)
  • GlobalIndexKeys (624-667)
  • UserKeys (484-530)
  • hash_key (500-502)
  • hash_key (608-610)
  • users_index (648-656)
  • admin_key (504-506)
  • admins_index (659-667)
src/zndraw/services/admin_service.py (1)
src/zndraw/app/redis_keys.py (3)
  • GlobalIndexKeys (624-667)
  • admin_key (504-506)
  • admins_index (659-667)
src/zndraw/app/file_browser.py (2)
src/zndraw/app/redis_keys.py (4)
  • GlobalIndexKeys (624-667)
  • RoomKeys (242-480)
  • rooms_index (635-645)
  • metadata (452-460)
src/zndraw/app/metadata_manager.py (2)
  • RoomMetadataManager (10-135)
  • get_all (40-50)
src/zndraw/app/room_routes.py (1)
src/zndraw/app/redis_keys.py (3)
  • GlobalIndexKeys (624-667)
  • RoomKeys (242-480)
  • rooms_index (635-645)
src/zndraw/app/lock_routes.py (1)
src/zndraw/app/redis_keys.py (3)
  • SessionKeys (534-599)
  • session_locks (586-599)
  • session_id (547-549)
src/zndraw/utils.py (1)
src/zndraw/app/redis_keys.py (2)
  • GlobalIndexKeys (624-667)
  • rooms_index (635-645)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Agent
  • GitHub Check: pytest (3.11, ubuntu-latest)
  • GitHub Check: pytest (3.12, ubuntu-latest)
  • GitHub Check: pytest (3.13, ubuntu-latest)
🔇 Additional comments (26)
misc/conftest.py (1)

44-57: LGTM!

Using subprocess.DEVNULL instead of piping stdout/stderr is the correct fix to prevent blocking when the subprocess buffer fills up. The comment accurately documents the rationale.

src/zndraw/app/events.py (1)

379-392: LGTM!

Consistent use of structured logging for extension cleanup notifications.

src/zndraw/utils.py (1)

109-121: LGTM! Good O(1) optimization.

Replacing the key scan with sismember on the global rooms index is the correct approach for efficient collision detection. The function-level import avoids potential circular imports.

src/zndraw/app/lock_routes.py (3)

13-13: LGTM!

Import addition for SessionKeys enables per-session lock tracking.


94-98: LGTM!

The session lock tracking is correctly added after successful acquisition. While not atomic with the lock SET, the TTL provides a safety net if the tracking fails or server crashes before sadd completes.


294-301: LGTM!

Lock release correctly removes the key from the session's tracking set. The order (delete lock → remove from set) is appropriate.

src/zndraw/services/room_service.py (2)

138-152: LGTM! Atomic index registration.

Adding the room to the global index within the pipeline ensures atomicity with room creation. This enables O(1) room existence checks.


189-228: Verify room deletion cleanup in global index

The review comment correctly raises a valid concern. Verification found that rooms are properly added to GlobalIndexKeys.rooms_index() at lines 152 and 228 in room_service.py (both _create_empty_room and _create_room_from_copy). However, no corresponding srem operation on the global index was found in the codebase for room deletion.

The docstring in redis_keys.py explicitly documents the expected behavior: "When a room is deleted, its ID is removed from this set" (lines 635-638), but the actual implementation appears to be missing. This pattern is correctly implemented elsewhere for user management (user_service.py, admin_service.py), confirming the cleanup pattern should exist.

No room deletion endpoint or method could be located in the routes or services, which suggests this may indicate incomplete implementation rather than a simple oversight in the code being reviewed.

src/zndraw/app/file_browser.py (3)

12-12: LGTM!

Import of GlobalIndexKeys and RoomKeys aligns with the new index-based architecture.


35-48: LGTM! Good migration to index-based room enumeration.

Using smembers(GlobalIndexKeys.rooms_index()) replaces O(N) key scanning with O(1) set retrieval. The bytes-to-string decoding and metadata existence check are appropriate safeguards.


157-168: LGTM! Consistent with the pattern in _get_all_rooms_metadata.

Same index-based approach applied correctly here with proper bytes decoding and metadata existence checks.

src/zndraw/services/admin_service.py (4)

10-10: LGTM!

Import addition is appropriate for the new index-based approach.


90-93: LGTM! Atomic pipeline for grant_admin.

Using a pipeline ensures the admin key and index are updated together, preventing inconsistency if one operation fails.


105-108: LGTM! Atomic pipeline for revoke_admin.

Consistent with grant_admin - ensures atomic removal from both the admin key and the global index.


144-145: No action required. The Redis client is configured with decode_responses=True in server.py, so smembers() returns set[str] as expected. The return type annotation is correct.

Likely an incorrect or invalid review comment.

src/zndraw/app/room_routes.py (5)

16-16: LGTM!

Import of GlobalIndexKeys alongside existing RoomKeys and SessionKeys imports.


75-76: LGTM! Efficient room enumeration from global index.

Using smembers(GlobalIndexKeys.rooms_index()) provides O(1) access to all room IDs instead of scanning keys.


151-152: LGTM! O(1) room existence check.

sismember is the correct choice for checking if a room exists in the global index.


331-332: LGTM!

Consistent with get_room - uses sismember for existence verification.


425-426: LGTM!

Same pattern applied correctly for default room validation.

src/zndraw/app/redis_keys.py (2)

585-599: LGTM! Session lock tracking key.

The session_locks static method follows the established pattern and provides the key for tracking locks held by a session, enabling efficient cleanup on disconnect.


622-667: LGTM! Well-structured GlobalIndexKeys class.

The class provides a clean, centralized source of truth for global index keys. The static methods with numpy-style docstrings follow the coding guidelines and maintain consistency with other key classes in this file.

src/zndraw/services/user_service.py (4)

14-14: LGTM!

Import of GlobalIndexKeys for index-based user management.


137-149: LGTM! Atomic user creation with index update.

Using a pipeline ensures the user hash and global index are updated together. This prevents orphaned index entries if the user creation fails.


407-413: LGTM! Atomic user deletion with index cleanup.

The pipeline ensures user data, admin key, and both indices are cleaned up atomically.


428-440: Good self-healing pattern for stale index entries.

Cleaning up stale entries when user data is missing prevents index bloat. However, this cleanup happens on read which could slow down list_all_users if there are many stale entries.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes Redis operations by replacing expensive scan_iter calls with O(1) set-based lookups using global indices. The optimization introduces a new GlobalIndexKeys class that maintains sets of room IDs, user names, and admin usernames, enabling constant-time existence checks and listings.

Key Changes:

  • Introduced GlobalIndexKeys class with three index sets: rooms, users, and admins
  • Replaced scan_iter patterns with sismember and smembers operations throughout the codebase
  • Added session-based lock tracking for efficient cleanup on disconnect
  • Improved logging by replacing print statements with proper log.info calls

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/zndraw/app/redis_keys.py Added GlobalIndexKeys class with static methods for rooms, users, and admins index keys; Added session_locks method to SessionKeys for tracking locks by session
src/zndraw/services/user_service.py Updated user operations to maintain users and admins indices; Added pipeline-based operations for atomicity; Added stale index entry cleanup in list_all_users
src/zndraw/services/room_service.py Added room ID to global index when creating rooms (both empty and copied)
src/zndraw/services/admin_service.py Replaced scan_iter pattern with direct smembers lookup in get_all_admins; Added index maintenance in grant_admin and revoke_admin
src/zndraw/app/room_routes.py Replaced scan_iter with sismember for room existence checks; Updated room listing to use smembers
src/zndraw/utils.py Updated generate_room_name to use sismember for collision detection
src/zndraw/app/lock_routes.py Added session-based lock tracking using a set to enable efficient cleanup
src/zndraw/app/events.py Optimized lock cleanup to use session locks set instead of scanning all locks; Improved logging by replacing print statements with log.info
src/zndraw/app/file_browser.py Updated file browser to iterate over room IDs from index instead of scanning metadata keys
misc/conftest.py Fixed subprocess pipe blocking issue by redirecting stdout/stderr to DEVNULL

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/zndraw/app/events.py
Comment on lines +151 to +152
# Add room to global index for O(1) room listing
pipe.sadd(GlobalIndexKeys.rooms_index(), room_id)
Copy link

Copilot AI Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rooms_index is being populated when rooms are created, but there's no corresponding cleanup when rooms are deleted. This will lead to stale entries in the index over time. When rooms are deleted, you should add:

pipe.srem(GlobalIndexKeys.rooms_index(), room_id)

This is important because the optimization relies on the index being accurate. Stale entries will cause sismember checks to return true for rooms that no longer exist.

Copilot uses AI. Check for mistakes.
Comment thread src/zndraw/app/room_routes.py
Comment thread src/zndraw/services/admin_service.py
Comment thread src/zndraw/app/lock_routes.py
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Dec 22, 2025

Codecov Report

❌ Patch coverage is 65.18519% with 47 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.37%. Comparing base (467ea97) to head (67bb710).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/zndraw/app/events.py 0.00% 12 Missing ⚠️
src/zndraw/services/room_service.py 40.00% 9 Missing ⚠️
src/zndraw/app/file_browser.py 38.46% 8 Missing ⚠️
src/zndraw/app/lock_routes.py 16.66% 5 Missing ⚠️
src/zndraw/socket_manager.py 77.27% 5 Missing ⚠️
src/zndraw/app/room_routes.py 20.00% 4 Missing ⚠️
src/zndraw/app/redis_keys.py 88.88% 2 Missing ⚠️
src/zndraw/services/user_service.py 92.85% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #823      +/-   ##
==========================================
- Coverage   78.41%   78.37%   -0.05%     
==========================================
  Files         153      153              
  Lines       18461    18524      +63     
==========================================
+ Hits        14477    14518      +41     
- Misses       3984     4006      +22     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/test_room_naming.py (2)

89-108: Critical: Multiple collision handling is broken.

Per the PR objectives, this test is failing because both result1 and result2 are returning "my_very_long_file_na" without hash suffixes, making them identical. The test expects distinct hash suffixes to differentiate between files that truncate to the same base name.

This collision handling failure could cause multiple files to map to the same room, resulting in data corruption.


148-158: Critical: Hash suffix format cannot be verified due to broken collision detection.

This test is failing because generate_room_name returns "test.xyz" without the expected "test.xyz_XXXX" format. This is the same root cause as the other failing tests.

🧹 Nitpick comments (2)
src/zndraw/app/redis_keys.py (2)

482-514: Consider including frame_selection("default") in the static keys list.

The method includes all static keys for bulk deletion, which looks comprehensive. However, frame_selection(group="default") might be worth including since it has a default parameter and the "default" group is likely used consistently.

🔎 Proposed enhancement
         self.jobs_active(),
         self.jobs_inactive(),
         self.jobs_by_time(),
+        self.frame_selection("default"),
     ]

658-701: Add ClassVar type hints to constants for consistency.

The GlobalIndexKeys class is well-structured and documented. However, for consistency with the UserKeys class (lines 528-530), consider adding ClassVar[str] type hints to the constants.

🔎 Proposed enhancement
+from typing import ClassVar
+
 class GlobalIndexKeys:
     """Redis keys for global indices."""
 
     # Room index: SET containing all room IDs
-    ROOMS_INDEX = "rooms:index"
+    ROOMS_INDEX: ClassVar[str] = "rooms:index"
 
     # User indices
-    USERS_INDEX = "users:index"  # SET of all usernames
-    ADMINS_INDEX = "admins:index"  # SET of all admin usernames
+    USERS_INDEX: ClassVar[str] = "users:index"  # SET of all usernames
+    ADMINS_INDEX: ClassVar[str] = "admins:index"  # SET of all admin usernames

Note: ClassVar is already imported at line 8, so no additional import is needed.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cbac2a5 and 4f68410.

📒 Files selected for processing (4)
  • src/zndraw/app/lock_routes.py
  • src/zndraw/app/redis_keys.py
  • src/zndraw/services/room_service.py
  • tests/test_room_naming.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/zndraw/app/lock_routes.py
  • src/zndraw/services/room_service.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: If sensible, implement collections.abc interfaces for classes, such as MutableMapping or MutableSequence
Use numpy style docstrings
Docstrings must be concise and to the point
Use type hints wherever possible. Import typing as t if necessary, but use list[int|float] | None instead of t.Optional[t.List[int|float]]
Imports should always be at the top of the file

Files:

  • tests/test_room_naming.py
  • src/zndraw/app/redis_keys.py
**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/test_*.py: Use pytest.mark.parametrize to avoid code duplication in tests
Tests should be very specific and test only one thing
Avoid complex test setups
Each test must be a function, not a method of a class

Files:

  • tests/test_room_naming.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: pytest (3.12, ubuntu-latest)
  • GitHub Check: pytest (3.13, ubuntu-latest)
  • GitHub Check: pytest (3.11, ubuntu-latest)
🔇 Additional comments (2)
src/zndraw/app/redis_keys.py (1)

619-633: LGTM!

Clean implementation for per-session lock tracking. The key format is consistent with other session-related keys, and the docstring clearly explains the purpose.

tests/test_room_naming.py (1)

22-32: Remove this comment - the implementation is correct and the test should pass.

The collision detection in generate_room_name is working as intended. The function correctly checks GlobalIndexKeys.rooms_index() using redis_client.sismember() and appends a UUID suffix (e.g., test_file.xyz_a1b2) when a collision is detected. The test setup adds the room to "rooms:index", and the function checks that same key, so the collision will be found and the UUID suffix will be appended as expected.

Likely an incorrect or invalid review comment.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/zndraw/app/events.py (1)

204-224: Consider using pipeline for atomic lock cleanup.

The lock cleanup logic iterates through locks individually, which is not atomic. If the cleanup process crashes or is interrupted, some locks may be deleted while others remain, and the session locks set might not be cleaned up.

Additionally, the type-checking logic for metadata key construction (lines 215-219) suggests Redis might return mixed types from smembers, which is unusual and could indicate a deeper issue.

🔎 Proposed refactor using pipeline for atomicity
 if session_id:
     session_locks_key = SessionKeys.session_locks(session_id)
     lock_keys = r.smembers(session_locks_key)
 
     if lock_keys:
         log.info(
             f"Cleaning up {len(lock_keys)} orphaned lock(s) for session {session_id}"
         )
-        for lock_key in lock_keys:
-            # Delete the lock and its metadata
-            r.delete(lock_key)
-            metadata_key = (
-                f"{lock_key}:metadata"
-                if isinstance(lock_key, str)
-                else lock_key + b":metadata"
-            )
-            r.delete(metadata_key)
-            log.debug(f"Cleaned up orphaned lock '{lock_key}'")
-
-        # Delete the session locks set itself
-        r.delete(session_locks_key)
+        
+        # Use pipeline for atomic cleanup
+        with r.pipeline() as pipe:
+            for lock_key in lock_keys:
+                # Delete the lock and its metadata
+                pipe.delete(lock_key)
+                metadata_key = (
+                    f"{lock_key}:metadata"
+                    if isinstance(lock_key, str)
+                    else lock_key + b":metadata"
+                )
+                pipe.delete(metadata_key)
+                log.debug(f"Cleaned up orphaned lock '{lock_key}'")
+            
+            # Delete the session locks set itself
+            pipe.delete(session_locks_key)
+            pipe.execute()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f68410 and 81eb0ed.

📒 Files selected for processing (2)
  • src/zndraw/app/events.py
  • src/zndraw/socket_manager.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: If sensible, implement collections.abc interfaces for classes, such as MutableMapping or MutableSequence
Use numpy style docstrings
Docstrings must be concise and to the point
Use type hints wherever possible. Import typing as t if necessary, but use list[int|float] | None instead of t.Optional[t.List[int|float]]
Imports should always be at the top of the file

Files:

  • src/zndraw/socket_manager.py
  • src/zndraw/app/events.py
🧬 Code graph analysis (2)
src/zndraw/socket_manager.py (4)
src/zndraw/frame_cache.py (2)
  • get (16-17)
  • pop (25-26)
src/zndraw/api_manager.py (2)
  • get_bookmark (1025-1053)
  • get_all_bookmarks (1008-1023)
src/zndraw/app/bookmark_routes.py (2)
  • get_bookmark (37-58)
  • get_all_bookmarks (22-33)
src/zndraw/app/redis_keys.py (1)
  • bookmarks (300-302)
src/zndraw/app/events.py (2)
src/zndraw/app/frame_index_manager.py (1)
  • delete (153-162)
src/zndraw/app/metadata_manager.py (1)
  • delete (107-121)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: pytest (3.11, ubuntu-latest)
  • GitHub Check: pytest (3.12, ubuntu-latest)
  • GitHub Check: pytest (3.13, ubuntu-latest)
🔇 Additional comments (3)
src/zndraw/app/events.py (3)

379-379: Good improvement to structured logging.

Replacing print statements with log.info improves observability and follows logging best practices.


384-387: LGTM - consistent structured logging.

The conversion to structured logging is consistent with best practices and improves observability.


539-539: Good resolution of previous feedback.

This addresses the past review comment by preserving the log entry (helpful for debugging connection issues) while using an appropriate debug level. The lazy evaluation with %s formatting is also a good practice.

Comment thread src/zndraw/socket_manager.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81eb0ed and 67bb710.

📒 Files selected for processing (1)
  • src/zndraw/socket_manager.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: If sensible, implement collections.abc interfaces for classes, such as MutableMapping or MutableSequence
Use numpy style docstrings
Docstrings must be concise and to the point
Use type hints wherever possible. Import typing as t if necessary, but use list[int|float] | None instead of t.Optional[t.List[int|float]]
Imports should always be at the top of the file

Files:

  • src/zndraw/socket_manager.py
🧬 Code graph analysis (1)
src/zndraw/socket_manager.py (6)
src/zndraw/zndraw.py (7)
  • get (198-211)
  • get (956-956)
  • get (958-960)
  • get (962-964)
  • get (993-1086)
  • log (1748-1754)
  • bookmarks (844-862)
src/zndraw/storage/asebytes_backend.py (1)
  • get (46-103)
src/zndraw/api_manager.py (2)
  • get_bookmark (1025-1053)
  • get_all_bookmarks (1008-1023)
src/zndraw/app/bookmark_routes.py (2)
  • get_bookmark (37-58)
  • get_all_bookmarks (22-33)
src/zndraw/bookmarks_manager.py (1)
  • pop (217-261)
src/zndraw/app/redis_keys.py (1)
  • bookmarks (300-302)
🪛 Ruff (0.14.8)
src/zndraw/socket_manager.py

139-139: Do not catch blind exception: Exception

(BLE001)


144-144: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


159-159: Do not catch blind exception: Exception

(BLE001)


160-160: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: pytest (3.11, ubuntu-latest)
  • GitHub Check: pytest (3.12, ubuntu-latest)
  • GitHub Check: pytest (3.13, ubuntu-latest)

Comment on lines 127 to +153
def _on_bookmarks_invalidate(self, data):
"""Handle bookmark invalidation by refetching from server."""
# Refetch all bookmarks from server to update local cache
bookmarks = self.zndraw.api.get_all_bookmarks()
self.zndraw._bookmarks = bookmarks
"""Handle bookmark invalidation by updating specific entry or refetching."""
index = data.get("index")
operation = data.get("operation")

if operation == "set" and index is not None:
# Targeted update - only fetch and update the specific bookmark
try:
response = self.zndraw.api.get_bookmark(index)
label = response.get("label")
if label is not None:
self.zndraw._bookmarks[index] = label
except Exception as e:
# Check for 404 - bookmark was deleted
if hasattr(e, "response") and e.response.status_code == 404:
self.zndraw._bookmarks.pop(index, None)
else:
log.error("Failed to fetch bookmark %s: %s", index, e)
# Fallback to full refresh
self._refresh_all_bookmarks()
elif operation == "delete" and index is not None:
# Remove the specific bookmark from cache
self.zndraw._bookmarks.pop(index, None)
else:
# Full refresh for bulk operations (clear, shift, etc.)
self._refresh_all_bookmarks()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Improve exception handling for bookmark updates.

The targeted update logic is sound, but the exception handling could be more robust:

  1. Fragile 404 detection (lines 141-142): Using hasattr(e, "response") is fragile because not all exceptions have a response attribute, and accessing e.response.status_code could fail if response exists but lacks status_code.

  2. Broad exception catch (line 139): Catching Exception is too broad. Based on the get_bookmark implementation, it raises requests.HTTPError on failures.

  3. Logging best practice (line 144): Use logging.exception instead of logging.error to include the traceback automatically.

🔎 Proposed fix with specific exception handling
+import requests
+
 def _on_bookmarks_invalidate(self, data):
     """Handle bookmark invalidation by updating specific entry or refetching."""
     index = data.get("index")
     operation = data.get("operation")
 
     if operation == "set" and index is not None:
         # Targeted update - only fetch and update the specific bookmark
         try:
             response = self.zndraw.api.get_bookmark(index)
             label = response.get("label")
             if label is not None:
                 self.zndraw._bookmarks[index] = label
-        except Exception as e:
-            # Check for 404 - bookmark was deleted
-            if hasattr(e, "response") and e.response.status_code == 404:
+        except requests.HTTPError as e:
+            # Check for 404 - bookmark was deleted
+            if e.response.status_code == 404:
                 self.zndraw._bookmarks.pop(index, None)
             else:
-                log.error("Failed to fetch bookmark %s: %s", index, e)
+                log.exception("Failed to fetch bookmark %s", index)
                 # Fallback to full refresh
                 self._refresh_all_bookmarks()
+        except Exception:
+            log.exception("Unexpected error fetching bookmark %s", index)
+            self._refresh_all_bookmarks()
     elif operation == "delete" and index is not None:
         # Remove the specific bookmark from cache
         self.zndraw._bookmarks.pop(index, None)
     else:
         # Full refresh for bulk operations (clear, shift, etc.)
         self._refresh_all_bookmarks()

Based on static analysis hints and best practices.

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.14.8)

139-139: Do not catch blind exception: Exception

(BLE001)


144-144: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
In src/zndraw/socket_manager.py around lines 127 to 153, the exception handling
in _on_bookmarks_invalidate is too broad and fragile: narrow the except to
requests.HTTPError (import requests if not already), and when inspecting a 404
use a safe attribute access pattern (e.g. resp = getattr(e, "response", None);
status = getattr(resp, "status_code", None)) and check status == 404 before
popping the index; for all other HTTPError or unexpected exceptions call
logging.exception(...) to log traceback and then call the existing
_refresh_all_bookmarks() fallback; keep the existing targeted-update and delete
logic unchanged.

Comment on lines +154 to +160
def _refresh_all_bookmarks(self):
"""Refresh all bookmarks from server, retaining cache on failure."""
try:
bookmarks = self.zndraw.api.get_all_bookmarks()
self.zndraw._bookmarks = bookmarks
except Exception as e:
log.error("Failed to refresh bookmarks: %s", e)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use logging.exception for better error diagnostics.

Line 160 should use logging.exception instead of logging.error to automatically include the exception traceback, which is essential for debugging bookmark refresh failures.

🔎 Proposed fix
 def _refresh_all_bookmarks(self):
     """Refresh all bookmarks from server, retaining cache on failure."""
     try:
         bookmarks = self.zndraw.api.get_all_bookmarks()
         self.zndraw._bookmarks = bookmarks
     except Exception as e:
-        log.error("Failed to refresh bookmarks: %s", e)
+        log.exception("Failed to refresh bookmarks")

Based on static analysis hints and best practices.

🧰 Tools
🪛 Ruff (0.14.8)

159-159: Do not catch blind exception: Exception

(BLE001)


160-160: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
In src/zndraw/socket_manager.py around lines 154 to 160, the exception handler
logs failures with log.error which omits the traceback; change the call to
log.exception("Failed to refresh bookmarks: %s", e) (or simply
log.exception("Failed to refresh bookmarks")) so the traceback is captured
automatically, keeping the existing message/context and leaving the rest of the
try/except logic unchanged.

@PythonFZ PythonFZ merged commit c9f9a7a into main Dec 22, 2025
6 checks passed
@PythonFZ PythonFZ deleted the fix/redis-scan-iter-optimization branch December 22, 2025 13:43
@coderabbitai coderabbitai Bot mentioned this pull request Jan 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants