Enhance error handling for journal v2 migration #21514
stelfrag merged 6 commits into netdata:master
Pull request overview
This PR enhances error handling throughout the journal v2 migration and metric collection paths to safely handle deleted or invalid metrics. The core change is making mrg_metric_dup return NULL when metric acquisition fails (due to deletion), with corresponding NULL checks and cleanup logic added across the storage engine, cache, and query paths.
Key changes:
- Modified `mrg_metric_dup` to return NULL on failed metric acquisition instead of assuming success
- Added NULL handle guards in storage engine operations to prevent crashes when metrics are deleted
- Implemented cleanup paths in migration, query preload, and collection initialization when encountering NULL metrics
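The pattern described in the key changes can be sketched as follows. This is a minimal, single-threaded illustration, not the code from the PR: every `demo_*` name and the struct layout are hypothetical stand-ins, and the real `mrg_metric_dup` works on the engine's own metric type with its own reference-counting rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a metric registry entry (not the real netdata type). */
typedef struct demo_metric {
    int  refcount;  /* number of live references */
    bool deleted;   /* set once the metric has been removed */
} demo_metric;

/* Try to take a reference; fails when the metric is missing or already deleted. */
static bool demo_metric_acquire(demo_metric *m) {
    if (m == NULL || m->deleted)
        return false;
    m->refcount++;
    return true;
}

/* Drop a reference; tolerates NULL so error paths can call it unconditionally. */
static void demo_metric_release(demo_metric *m) {
    if (m != NULL && m->refcount > 0)
        m->refcount--;
}

/* Duplicate a reference, returning NULL instead of assuming the acquire succeeded. */
static demo_metric *demo_metric_dup(demo_metric *m) {
    return demo_metric_acquire(m) ? m : NULL;
}

/* Caller pattern: skip the work when the duplicate is NULL.
 * Returns true if the page was processed, false if it was skipped. */
static bool demo_process_page(demo_metric *m) {
    demo_metric *ref = demo_metric_dup(m);
    if (ref == NULL)
        return false;           /* metric deleted: caller cleans up and moves on */
    /* ... work with the page would happen here ... */
    demo_metric_release(ref);   /* balance the reference taken by dup */
    return true;
}
```

The point of the sketch is that a failed acquire becomes visible to the caller as a NULL return, so each call site can decide how to clean up rather than continuing with a reference it never obtained.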
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/database/engine/mrg.c | Returns NULL from mrg_metric_dup when metric_acquire fails due to deleted metrics |
| src/database/engine/rrdengineapi.c | Handles NULL return from mrg_metric_dup by cleaning up writer state and returning NULL |
| src/database/engine/pdc.c | Guards metric release in PDC destructor to prevent NULL pointer dereference |
| src/database/engine/pagecache.c | Adds NULL metric handling in query preload with completion initialization and cleanup |
| src/database/engine/cache.c | Checks for NULL metric and NULL UUID during journal v2 migration with page cleanup |
| src/database/storage-engine.h | Adds NULL checks to storage engine API functions for missing handles |
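The NULL handle guards listed for the storage engine API and the PDC destructor might look roughly like the sketch below. The `demo_*` types and functions are assumptions made for illustration only; they do not reproduce the signatures in `storage-engine.h` or `pdc.c`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical collection handle; the real STORAGE_COLLECT_HANDLE is opaque here. */
typedef struct demo_collect_handle {
    int points_stored;  /* how many points were accepted through this handle */
    int flushes;        /* how many times the handle was flushed */
} demo_collect_handle;

/* Store one point; returns false and does nothing when the handle is missing. */
static bool demo_store_point(demo_collect_handle *h, double value) {
    if (h == NULL)
        return false;   /* handle creation failed earlier (e.g. metric deleted) */
    (void)value;        /* a real engine would write the value to a page */
    h->points_stored++;
    return true;
}

/* Flush the handle; a missing handle is a no-op rather than a crash. */
static void demo_flush(demo_collect_handle *h) {
    if (h == NULL)
        return;
    h->flushes++;
}

/* Hypothetical query context that may or may not hold a metric reference. */
typedef struct demo_query_ctx {
    int *metric_refcount;  /* NULL when the metric could not be acquired */
} demo_query_ctx;

/* Destructor-style cleanup: release the metric only if one was acquired. */
static void demo_query_ctx_destroy(demo_query_ctx *ctx) {
    if (ctx == NULL)
        return;
    if (ctx->metric_refcount != NULL) {
        (*ctx->metric_refcount)--;
        ctx->metric_refcount = NULL;  /* guard against a double release */
    }
}
```

Guarding at the API boundary keeps callers simple: code that received a NULL handle from initialization can keep calling the same functions without each call site repeating the check.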
- Add checks to safely handle deleted metrics during page migration and cache operations.
- Return NULL for invalid metric duplications in `mrg_metric_dup`.
- Ensure proper resource cleanup and state marking when encountering deleted metrics.

Enhance error handling for journal v2 migration: skip metrics with NULL UUID

…tialization logic
- Safely return when `STORAGE_COLLECT_HANDLE` is null in key storage engine methods.
- Reorder and streamline `pdc` initialization to avoid redundancy and improve clarity.

…error paths
- Add early UUID validity checks during metric migration to avoid unnecessary operations.
- Adjust inflight query counters in `pagecache` on query completion for accurate tracking.
7cc7418 to e70599e
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.
thiagoftsm left a comment
No issues found running for more than one hour. LGTM!
Summary by cubic
Hardened journal v2 migration, cache, and storage engine paths to safely handle deleted or invalid metrics and null handles, preventing crashes and stuck queries. mrg_metric_dup now returns NULL on failed acquire, and callers skip work and clean up resources.
Written for commit e70599e. Summary will update on new commits.