Improve node_type caching #342
Conversation
- Save NodeType name to file (see the before/after example below)
- Read all previously used string-formats from cache-file
- Fix CIRCLE_PLUS not appearing in cache-file
- Make sure to rewrite the full contents at every write-to-file
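For illustration, a cache line before and after this change might look as follows (the `mac;value` layout matches the write examples later in this thread; the MAC is made up):

```
AAAAAAAAAAAAAAAA;NodeType.CIRCLE_PLUS   <- old full enum string
AAAAAAAAAAAAAAAA;CIRCLE_PLUS            <- new enum name
```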
Walkthrough

Cache serialization now writes enum names (e.g., "CIRCLE") instead of full enum strings. Cache restore accepts enum names, full enum strings (e.g., "NodeType.CIRCLE"), or numeric codes, and logs warnings for unknown values. Logging and the write_cache call signature were updated; tests and CHANGELOG adjusted accordingly.
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant App
participant Cache as Cache I/O
participant Parser as NodeType Parser
App->>Cache: save_cache(nodes)
Cache->>Cache: for each node -> use node_type.name
Cache->>Cache: write_cache(..., overwrite=True)
Cache-->>App: save complete
App->>Cache: restore_cache()
Cache->>Cache: read cached value for mac
alt value starts with "NodeType."
Cache->>Parser: strip prefix -> name
Parser-->>Cache: NodeType via __members__ lookup
else value is plain name
Parser-->>Cache: NodeType via __members__ lookup
else value is numeric
Parser-->>Cache: NodeType(int(value))
else unknown
Cache-->>App: log warning and skip
end
Cache-->>App: restored nodes
```
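Read as code, the branches in the diagram amount to a small parser. A minimal sketch (the function name is ours, not the library's; `NodeType` comes from `plugwise_usb/api.py` per the code graph later in this review):

```python
from plugwise_usb.api import NodeType

def parse_node_type(value: str) -> NodeType | None:
    """Sketch of the restore branches shown in the diagram above."""
    val = value.strip()
    # "NodeType.CIRCLE" -> "CIRCLE"; plain names pass through unchanged
    key = val.split(".", 1)[1] if val.startswith("NodeType.") else val
    if (node_type := NodeType.__members__.get(key)) is not None:
        return node_type
    try:
        return NodeType(int(val))  # numeric codes, e.g. "2"
    except ValueError:
        return None  # unknown value: caller logs a warning and skips
```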
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches: ✅ 3 checks passed.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #342      +/-   ##
==========================================
- Coverage   81.33%   81.25%   -0.08%
==========================================
  Files          36       36
  Lines        8107     8110       +3
==========================================
- Hits         6594     6590       -4
- Misses       1513     1520       +7
```

☔ View full report in Codecov by Sentry.
Actionable comments posted: 0
🧹 Nitpick comments (4)
plugwise_usb/network/cache.py (3)
31-35: Enum-name serialization + full rewrite: good; logging nit

Writing node_type.name is the right call and rewrite=True guarantees deterministic persistence. Minor nit: str() around len(...) is redundant; also consider lowercasing "nodes" in the log message.

Apply:

```diff
- _LOGGER.debug("Save NodeTypes for %s Nodes", str(len(cache_data_to_save)))
+ _LOGGER.debug("Save NodeTypes for %s nodes", len(cache_data_to_save))
```
48-57: Harden deserialization: case-insensitive names; simplify numeric parsing

Current logic works and is backward compatible. You can make it a bit more robust and simpler by:

- normalizing enum names case-insensitively
- relying on int(val) with try/except (covers non-digits without a pre-check)

```diff
- val = node_value.strip()
- key = val.split(".", 1)[1] if val.startswith("NodeType.") else val
- node_type = NodeType.__members__.get(key)  # e.g., "CIRCLE"
- if node_type is None and val.isdigit():  # e.g., "2"
-     try:
-         node_type = NodeType(int(val))
-     except ValueError:
-         node_type = None
+ val = node_value.strip()
+ key = (val.split(".", 1)[1] if val.startswith("NodeType.") else val).upper()
+ node_type = NodeType.__members__.get(key)  # e.g., "CIRCLE"
+ if node_type is None:
+     try:
+         node_type = NodeType(int(val))
+     except ValueError:
+         node_type = None
```
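As a quick sanity check of the suggested parsing, all of the following inputs resolve to the same member (assuming, per the test suggestion further below, that numeric code 2 maps to CIRCLE):

```python
from plugwise_usb.api import NodeType

for raw in ("CIRCLE", "circle", "NodeType.CIRCLE", "2"):
    val = raw.strip()
    key = (val.split(".", 1)[1] if val.startswith("NodeType.") else val).upper()
    node_type = NodeType.__members__.get(key)
    if node_type is None:
        try:
            node_type = NodeType(int(val))
        except ValueError:
            node_type = None
    assert node_type is NodeType.CIRCLE
```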
87-96: Skip disk I/O on no-op prune

prune_cache always rewrites the file even when nothing changes. Compare before saving to avoid unnecessary I/O (aligns with earlier guidance about saving only on updates).

```diff
 async def prune_cache(self, registry: list[str]) -> None:
     """Remove items from cache which are not found in registry scan."""
     new_nodetypes: dict[str, NodeType] = {}
     for mac in registry:
         if mac == "":
             continue
         if (node_type := self.get_nodetype(mac)) is not None:
             new_nodetypes[mac] = node_type
-    self._nodetypes = new_nodetypes
-    await self.save_cache()
+    if new_nodetypes == self._nodetypes:
+        _LOGGER.debug("prune_cache: no changes; skipping save")
+        return
+    self._nodetypes = new_nodetypes
+    await self.save_cache()
```

tests/test_usb.py (1)
1736-1739: Assertions updated for enum-name serialization: LGTM

Expecting plain names in the cache is correct. Consider adding a small extra read test to cover numeric values too, since restore now supports them.

Example to append near the existing restore tests:

```diff
+ # also accept plain names and numeric codes (back-compat)
+ mock_read_data = [
+     "AAAAAAAAAAAAAAAA;CIRCLE_PLUS",
+     "BBBBBBBBBBBBBBBB;2",
+ ]
+ file_chunks_iter = iter(mock_read_data)
+ mock_file_stream = MagicMock(readlines=lambda *args, **kwargs: file_chunks_iter)
+ with patch("aiofiles.threadpool.sync_open", return_value=mock_file_stream):
+     await pw_nw_cache.restore_cache()
+ assert pw_nw_cache.nodetypes["AAAAAAAAAAAAAAAA"] == pw_api.NodeType.CIRCLE_PLUS
+ assert pw_nw_cache.nodetypes["BBBBBBBBBBBBBBBB"] == pw_api.NodeType.CIRCLE
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- plugwise_usb/network/cache.py (2 hunks)
- tests/test_usb.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-30T10:46:42.649Z
Learnt from: bouwew
PR: plugwise/python-plugwise-usb#329
File: plugwise_usb/nodes/circle.py:0-0
Timestamp: 2025-08-30T10:46:42.649Z
Learning: In plugwise_usb/nodes/circle.py, the save_cache() method should only be called when both cache_updated AND save_cache parameters are True. If no cache updates occurred (cache_updated = False), there's no meaningful data to persist to disk, even if the caller requested saving (save_cache = True). This prevents unnecessary disk I/O operations.
Applied to files:
plugwise_usb/network/cache.py
🧬 Code graph analysis (1)
plugwise_usb/network/cache.py (3)
- plugwise_usb/api.py (2): name (328-329), NodeType (62-73)
- plugwise_usb/messages/responses.py (1): node_type (633-635)
- plugwise_usb/helpers/cache.py (1): write_cache (82-121)
dirixmjm left a comment:
LGTM!
Actionable comments posted: 0
🧹 Nitpick comments (2)
plugwise_usb/network/cache.py (2)
29-35: Prefer keyword arg for write_cache and tighten serialization loop

- Use a keyword for the boolean to avoid positional ambiguity.
- Replace the loop with a dict-comprehension for brevity.

Apply this diff:

```diff
- cache_data_to_save: dict[str, str] = {}
- for mac, node_type in self._nodetypes.items():
-     cache_data_to_save[mac] = node_type.name
- _LOGGER.debug("Save NodeTypes for %s Nodes", len(cache_data_to_save))
- await self.write_cache(
-     cache_data_to_save, True
- )  # rewrite set to True to make sure the cache-contents is actual
+ cache_data_to_save: dict[str, str] = {
+     mac: node_type.name for mac, node_type in self._nodetypes.items()
+ }
+ _LOGGER.debug("Save NodeTypes for %s Nodes", len(cache_data_to_save))
+ await self.write_cache(cache_data_to_save, rewrite=True)
```

Optional: consider making writes atomic in helpers.cache (write to a temp file then rename) to avoid truncated files on crashes during full rewrites.
58-60: Include MAC in invalid-node warning for easier triage

Adds key context to the log line.

Apply this diff:

```diff
- if node_type is None:
-     _LOGGER.warning("Invalid NodeType in cache: %s", node_value)
-     continue
+ if node_type is None:
+     _LOGGER.warning("Invalid NodeType in cache for mac %s: %s", mac, node_value)
+     continue
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- plugwise_usb/network/cache.py (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-30T10:46:42.649Z
Learnt from: bouwew
PR: plugwise/python-plugwise-usb#329
File: plugwise_usb/nodes/circle.py:0-0
Timestamp: 2025-08-30T10:46:42.649Z
Learning: In plugwise_usb/nodes/circle.py, the save_cache() method should only be called when both cache_updated AND save_cache parameters are True. If no cache updates occurred (cache_updated = False), there's no meaningful data to persist to disk, even if the caller requested saving (save_cache = True). This prevents unnecessary disk I/O operations.
Applied to files:
plugwise_usb/network/cache.py
🧬 Code graph analysis (1)
plugwise_usb/network/cache.py (3)
- plugwise_usb/api.py (2): name (328-329), NodeType (62-73)
- plugwise_usb/messages/responses.py (1): node_type (633-635)
- plugwise_usb/helpers/cache.py (1): write_cache (82-121)
🔇 Additional comments (2)
plugwise_usb/network/cache.py (2)
48-57: Backward-compatible parsing looks solid

Covers enum names, "NodeType.X" strings, and numeric values without raising. Nice.
96-99: Good early exit prevents unnecessary disk I/O

Skipping save when nothing changed aligns with previous guidance to avoid redundant writes.
@coderabbitai Please create a code-example for this suggestion: making writes atomic in helpers.cache (write to a temp file then rename)
@bouwew Here's a code example for implementing atomic writes in the cache functionality using the temporary file + rename pattern:

```python
import os
import tempfile
from pathlib import Path
from typing import Dict


def write_cache_atomic(cache_file: Path, data: Dict[str, str]) -> None:
    """
    Atomically write cache data to file using temporary file + rename pattern.

    This prevents corruption if the process is interrupted during writing.
    """
    # Ensure the parent directory exists
    cache_file.parent.mkdir(parents=True, exist_ok=True)

    # Create a temporary file in the same directory as the target file.
    # This ensures the temp file is on the same filesystem for atomic rename.
    temp_fd = None
    temp_path = None
    try:
        temp_fd, temp_path = tempfile.mkstemp(
            prefix=f".{cache_file.name}_tmp_",
            suffix=".tmp",
            dir=cache_file.parent,
        )
        # Write data to temporary file
        with os.fdopen(temp_fd, "w", encoding="utf-8") as temp_file:
            temp_fd = None  # File descriptor is now owned by temp_file
            for mac, node_type in data.items():
                temp_file.write(f"{mac};{node_type}\n")
            temp_file.flush()
            os.fsync(temp_file.fileno())  # Ensure data is written to disk

        # Atomic rename: move temp file to final location.
        # This is atomic on most filesystems.
        if os.name == "nt":  # Windows
            # Windows doesn't allow atomic replace if target exists
            if cache_file.exists():
                cache_file.unlink()
        Path(temp_path).replace(cache_file)
        temp_path = None  # Successfully renamed, don't cleanup
    except Exception as e:
        # Cleanup temporary file on error
        if temp_fd is not None:
            try:
                os.close(temp_fd)
            except OSError:
                pass
        if temp_path and Path(temp_path).exists():
            try:
                Path(temp_path).unlink()
            except OSError:
                pass
        raise e


# Modified save_cache method for the NetworkCache class
def save_cache(self) -> None:
    """Save node types to cache file atomically."""
    if not self._cache_file_path:
        return
    if not self._nodetypes:
        _LOGGER.debug("save_cache: no NodeTypes to save")
        return
    _LOGGER.debug("Save NodeTypes for %s Nodes", len(self._nodetypes))
    # Prepare data for atomic write
    cache_data = {}
    for mac, node_type in self._nodetypes.items():
        cache_data[mac] = node_type.name  # Use .name instead of str(node_type)
    try:
        write_cache_atomic(Path(self._cache_file_path), cache_data)
        _LOGGER.debug("Successfully saved cache to %s", self._cache_file_path)
    except Exception as e:
        _LOGGER.error("Failed to save cache to %s: %s", self._cache_file_path, e)
```

Key benefits of this atomic write approach:

- The cache file is never left partially written: readers see either the old contents or the complete new contents.
- The temporary file is created in the target directory, so the final rename stays on one filesystem and remains atomic.
- fsync before the rename ensures the new data is on disk before it becomes visible.
- The temporary file is cleaned up if anything goes wrong.
This replaces the direct file writing approach with a safer atomic operation that prevents cache corruption during write operations.

Based on the cache implementation I can see, here's a code example for implementing atomic writes as an async method on the existing cache class:

```python
import os
import tempfile
from pathlib import Path

import aiofiles
import aiofiles.os


async def write_cache_atomic(self, data: dict[str, str], rewrite: bool = False) -> None:
    """Save information to cache file atomically using temp file + rename."""
    if not self._initialized:
        raise CacheError(
            f"Unable to save cache. Initialize cache file '{self._file_name}' first."
        )
    current_data: dict[str, str] = {}
    if not rewrite:
        current_data = await self.read_cache()

    # Merge current data with new data
    processed_keys: list[str] = []
    for key, value in data.items():
        current_data[key] = value
        processed_keys.append(key)

    # Prepare cache file path
    cache_file_path = Path(self._cache_file_path)

    # Create temporary file in same directory for atomic rename
    temp_fd = None
    temp_path = None
    try:
        # Create temporary file in same directory as cache file
        temp_fd, temp_path = tempfile.mkstemp(
            prefix=f".{cache_file_path.name}_tmp_",
            suffix=".tmp",
            dir=cache_file_path.parent,
        )
        # Close the file descriptor since we'll use aiofiles
        os.close(temp_fd)
        temp_fd = None

        # Write data to temporary file using aiofiles for async compatibility
        async with aiofiles.open(temp_path, mode="w", encoding="utf-8") as temp_file:
            for key, value in current_data.items():
                await temp_file.write(f"{key};{value}\n")
            await temp_file.flush()
        # Note: aiofiles doesn't have fsync, but the OS will ensure consistency

        # Atomic rename: move temp file to final location.
        # This operation is atomic on most filesystems.
        if os.name == "nt":  # Windows compatibility
            # Windows doesn't allow atomic replace if target exists
            if cache_file_path.exists():
                await aiofiles.os.remove(cache_file_path)
        # Perform atomic rename
        await aiofiles.os.rename(temp_path, cache_file_path)
        temp_path = None  # Successfully renamed, don't cleanup

        _LOGGER.debug(
            "Cache file '%s' updated, %s keys processed",
            self._file_name,
            len(processed_keys),
        )
    except Exception as e:
        # Cleanup temporary file on error
        if temp_fd is not None:
            try:
                os.close(temp_fd)
            except OSError:
                pass
        if temp_path and Path(temp_path).exists():
            try:
                await aiofiles.os.remove(temp_path)
            except OSError:
                pass
        _LOGGER.error("Failed to write cache file '%s': %s", self._file_name, e)
        raise CacheError(f"Unable to save cache file '{self._file_name}': {e}") from e

# Alternative synchronous version if aiofiles.os operations are not available
async def write_cache_atomic_sync_rename(self, data: dict[str, str], rewrite: bool = False) -> None:
    """Save information to cache file atomically using sync rename for compatibility."""
    if not self._initialized:
        raise CacheError(
            f"Unable to save cache. Initialize cache file '{self._file_name}' first."
        )
    current_data: dict[str, str] = {}
    if not rewrite:
        current_data = await self.read_cache()

    # Merge current data with new data
    processed_keys: list[str] = []
    for key, value in data.items():
        current_data[key] = value
        processed_keys.append(key)

    # Prepare cache file path
    cache_file_path = Path(self._cache_file_path)

    # Create temporary file in same directory
    temp_path = cache_file_path.with_suffix(f"{cache_file_path.suffix}.tmp")
    try:
        # Write data to temporary file
        async with aiofiles.open(temp_path, mode="w", encoding="utf-8") as temp_file:
            for key, value in current_data.items():
                await temp_file.write(f"{key};{value}\n")
            await temp_file.flush()

        # Atomic rename using synchronous os.rename (most reliable cross-platform)
        if os.name == "nt" and cache_file_path.exists():
            # Windows: remove target file first
            cache_file_path.unlink()
        # Perform atomic rename using pathlib (internally uses os.rename)
        temp_path.replace(cache_file_path)

        _LOGGER.debug(
            "Cache file '%s' updated atomically, %s keys processed",
            self._file_name,
            len(processed_keys),
        )
    except Exception as e:
        # Cleanup temporary file on error
        if temp_path.exists():
            try:
                temp_path.unlink()
            except OSError:
                pass
        _LOGGER.error("Failed to write cache file '%s' atomically: %s", self._file_name, e)
        raise CacheError(f"Unable to save cache file '{self._file_name}': {e}") from e
```

Key benefits of this atomic approach:

- Same crash-safety guarantee as above, with the file write done through aiofiles to stay async-friendly.
- The synchronous Path.replace fallback is the most portable way to get an atomic rename.
- Failures raise CacheError and the temporary file is always removed.
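For intuition, the guarantee can be demonstrated with a minimal, self-contained version of the same temp-file + replace pattern (standalone sketch, not the plugwise_usb API; names are illustrative):

```python
import os
import tempfile
from pathlib import Path

def atomic_write(target: Path, text: str) -> None:
    # Temp file in the target's directory keeps the rename on one filesystem
    fd, tmp = tempfile.mkstemp(dir=target.parent)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # data is on disk before it becomes visible
        Path(tmp).replace(target)  # atomic on POSIX; replaces existing target
    except BaseException:
        Path(tmp).unlink(missing_ok=True)  # never leave the temp file behind
        raise

target = Path("demo.cache")
atomic_write(target, "AAAAAAAAAAAAAAAA;CIRCLE\n")
atomic_write(target, "AAAAAAAAAAAAAAAA;CIRCLE_PLUS\n")
print(target.read_text())  # readers only ever see a complete file
```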
You would replace the existing write_cache implementation with one of these atomic variants.

Now I have the complete picture of the current implementation. Here's an atomic write example that integrates with your existing write_cache in helpers/cache.py, preserving its merge behavior:

```python
import tempfile
import os
from pathlib import Path


async def write_cache(self, data: dict[str, str], rewrite: bool = False) -> None:
    """Save information to cache file atomically using temp file + rename."""
    if not self._initialized:
        raise CacheError(
            f"Unable to save cache. Initialize cache file '{self._file_name}' first."
        )
    current_data: dict[str, str] = {}
    if not rewrite:
        current_data = await self.read_cache()

    processed_keys: list[str] = []
    data_to_write: list[str] = []

    # Prepare data exactly as in original implementation
    for _cur_key, _cur_val in current_data.items():
        _write_val = _cur_val
        if _cur_key in data:
            _write_val = data[_cur_key]
            processed_keys.append(_cur_key)
        data_to_write.append(f"{_cur_key}{CACHE_KEY_SEPARATOR}{_write_val}\n")

    # Write remaining new data
    for _key, _value in data.items():
        if _key not in processed_keys:
            data_to_write.append(f"{_key}{CACHE_KEY_SEPARATOR}{_value}\n")

    # Atomic write implementation
    cache_file_path = Path(self._cache_file)
    temp_path = None
    try:
        # Create temporary file in same directory as cache file.
        # This ensures atomic rename works (same filesystem).
        with tempfile.NamedTemporaryFile(
            mode="w",
            encoding=UTF8,
            prefix=f".{cache_file_path.name}_tmp_",
            suffix=".tmp",
            dir=cache_file_path.parent,
            delete=False,
        ) as temp_file:
            temp_path = Path(temp_file.name)
            # Write all data to temporary file using regular file operations
            # (aiofiles doesn't work well with tempfile.NamedTemporaryFile)
            temp_file.writelines(data_to_write)
            temp_file.flush()
            os.fsync(temp_file.fileno())  # Ensure data is written to disk

        # Perform atomic rename
        if os.name == "nt":  # Windows compatibility
            # Windows doesn't allow rename over existing file
            if cache_file_path.exists():
                cache_file_path.unlink()
        temp_path.replace(cache_file_path)  # Atomic rename operation
        temp_path = None  # Successfully renamed, don't cleanup

        if not self._cache_file_exists:
            self._cache_file_exists = True
        _LOGGER.debug(
            "Saved %s lines to cache file %s (atomic write)",
            str(len(data)),
            self._cache_file,
        )
    except OSError as exc:
        _LOGGER.warning(
            "%s while writing data to cache file %s (atomic write)",
            exc,
            str(self._cache_file),
        )
        # Don't re-raise to maintain compatibility with original behavior
    except Exception as exc:
        _LOGGER.error("Unexpected error during atomic cache write: %s", exc)
        # Don't re-raise to maintain compatibility
    finally:
        # Cleanup temporary file if it still exists (error case)
        if temp_path and temp_path.exists():
            try:
                temp_path.unlink()
            except OSError:
                _LOGGER.warning("Failed to cleanup temporary cache file: %s", temp_path)

# Alternative version using aiofiles for consistency (if you prefer)
async def write_cache_aiofiles_atomic(self, data: dict[str, str], rewrite: bool = False) -> None:
    """Save information to cache file atomically using aiofiles + temp file."""
    if not self._initialized:
        raise CacheError(
            f"Unable to save cache. Initialize cache file '{self._file_name}' first."
        )
    current_data: dict[str, str] = {}
    if not rewrite:
        current_data = await self.read_cache()

    processed_keys: list[str] = []
    data_to_write: list[str] = []

    # Prepare data exactly as in original implementation
    for _cur_key, _cur_val in current_data.items():
        _write_val = _cur_val
        if _cur_key in data:
            _write_val = data[_cur_key]
            processed_keys.append(_cur_key)
        data_to_write.append(f"{_cur_key}{CACHE_KEY_SEPARATOR}{_write_val}\n")

    # Write remaining new data
    for _key, _value in data.items():
        if _key not in processed_keys:
            data_to_write.append(f"{_key}{CACHE_KEY_SEPARATOR}{_value}\n")

    # Atomic write using aiofiles with temporary file
    cache_file_path = Path(self._cache_file)
    temp_path = cache_file_path.with_name(f".{cache_file_path.name}.tmp.{os.getpid()}")
    try:
        # Write to temporary file using aiofiles
        async with aiofiles_open(
            file=str(temp_path),
            mode="w",
            encoding=UTF8,
        ) as temp_file:
            await temp_file.writelines(data_to_write)
            await temp_file.flush()

        # Atomic rename
        if os.name == "nt" and cache_file_path.exists():
            cache_file_path.unlink()
        temp_path.replace(cache_file_path)
        temp_path = None  # Successfully renamed

        if not self._cache_file_exists:
            self._cache_file_exists = True
        _LOGGER.debug(
            "Saved %s lines to cache file %s (aiofiles atomic write)",
            str(len(data)),
            self._cache_file,
        )
    except OSError as exc:
        _LOGGER.warning(
            "%s while writing data to cache file %s (aiofiles atomic write)",
            exc,
            str(self._cache_file),
        )
    finally:
        # Cleanup on error
        if temp_path and temp_path.exists():
            try:
                temp_path.unlink()
            except OSError:
                pass
```

Key improvements in the atomic approach:

- An interrupted write can no longer truncate or corrupt the cache file; the previous contents stay intact until the rename succeeds.
- The merge-then-write behavior of the original write_cache is preserved exactly.
- Error handling and logging mirror the original implementation, so callers see no behavioral change.
Integration: The atomic write ensures that if the process is interrupted during cache writing (power loss, process kill, etc.), you'll never end up with a corrupted cache file, which improves the reliability of your node type caching system.



Summary by CodeRabbit

- Bug Fixes
- Tests
- Documentation