Add configurable size_layers and size_derived_metrics for layer-specific size metric handling #47
Merged
izzet merged 3 commits into llnl:develop on Mar 5, 2026
…te configuration for size layers
Contributor
Pull request overview
This PR makes “size” metric handling configurable per analyzer preset and propagates that configuration through the analyzer pipeline, while also improving image_size extraction in IO event parsing.
Changes:
- Added preset config fields `size_layers` and `size_derived_metrics` (with POSIX/DLIO defaults) and centralized POSIX size-derived metric names.
- Updated analyzer logic to keep/drop size columns by configured layers and to treat only configured derived metrics as “size” metrics.
- Adjusted IO parsing to populate `size` from `image_size` only for non-`open` operations.
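A minimal sketch of how the first two changes might fit together, under stated assumptions. `PresetSketch`, `is_size_column`, `keep_columns`, and `size_metrics` are hypothetical names, the metric names in the constant are placeholders, and the field shapes (a list of layer names, a dict of metric lists keyed by layer) are guesses; none of this is taken from the PR's diff.

```python
import dataclasses as dc
from typing import Dict, List

# Hypothetical metric names: the real contents of DERIVED_POSIX_SIZE_METRICS
# are not shown in this conversation.
DERIVED_POSIX_SIZE_METRICS: List[str] = ["read_size", "write_size"]


@dc.dataclass
class PresetSketch:
    """Illustrative stand-in for AnalyzerPresetConfig, not the real class."""

    # Layers whose size columns are kept (assumed to be a list of layer names).
    size_layers: List[str] = dc.field(default_factory=list)
    # Derived metrics treated as "size" metrics, keyed by layer (assumed shape).
    size_derived_metrics: Dict[str, List[str]] = dc.field(default_factory=dict)


def is_size_column(column: str) -> bool:
    # Assumed naming rule: "size" appears as an underscore-separated token.
    return "size" in column.split("_")


def keep_columns(columns: List[str], layer: str, preset: PresetSketch) -> List[str]:
    """Drop size columns unless `layer` is one of the preset's size layers."""
    if layer in preset.size_layers:
        return list(columns)
    return [c for c in columns if not is_size_column(c)]


def size_metrics(layer: str, preset: PresetSketch) -> List[str]:
    """Only the configured derived metrics count as size metrics for a layer."""
    return list(preset.size_derived_metrics.get(layer, []))


posix = PresetSketch(
    size_layers=["posix"],
    size_derived_metrics={"posix": list(DERIVED_POSIX_SIZE_METRICS)},
)
```

With this preset, `keep_columns(["count", "size", "time"], "app", posix)` drops the `size` column because `app` is not a configured size layer, while the same call with `"posix"` keeps every column.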
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| python/dftracer/analyzer/dftracer.py | Updates image_size → size attribution logic during IO event parsing. |
| python/dftracer/analyzer/config.py | Introduces configurable size layer/metric preset fields and a POSIX size metrics constant. |
| python/dftracer/analyzer/analyzer.py | Threads new size config into main view computation and derived metric column creation. |
Comments suppressed due to low confidence (1)
python/dftracer/analyzer/config.py:1
- These fields are typed as `Optional[...]` but default to empty containers via `default_factory`, so they’ll never be `None` unless explicitly passed as `None`. This makes downstream code more complex (extra `or {}`/`or []`) and makes the API ambiguous. Prefer either (a) making them non-optional (`Dict[...]`/`List[...]`) with the current defaults, or (b) defaulting to `None` without `default_factory` if `None` is a meaningful “unset” state.
import dataclasses as dc
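The two alternatives in the suppressed comment can be sketched as follows. The class names `NonOptionalFields` and `NoneAsUnset` are hypothetical, and the element types are assumptions, since the actual annotations are elided as `Optional[...]` in the comment.

```python
import dataclasses as dc
from typing import Dict, List, Optional


@dc.dataclass
class NonOptionalFields:
    # Option (a): never None; an empty container means "nothing configured".
    size_layers: List[str] = dc.field(default_factory=list)
    size_derived_metrics: Dict[str, List[str]] = dc.field(default_factory=dict)


@dc.dataclass
class NoneAsUnset:
    # Option (b): None is a meaningful "unset" state, so no default_factory.
    size_layers: Optional[List[str]] = None
    size_derived_metrics: Optional[Dict[str, List[str]]] = None


a = NonOptionalFields()
b = NoneAsUnset()
```

Under option (a), callers can iterate `a.size_layers` directly without an `or []` guard. Under option (b), callers must check `b.size_layers is None`, which only pays off if "unset" and "empty" need to behave differently.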
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Collaborator
@izzet let me know if you want to merge this first or you still need to test using Karim's new traces
Collaborator
Author
I am merging this @rayandrew thanks
This pull request introduces a more flexible system for handling "size" metrics and layers in the analyzer configuration and processing logic. The main improvements are the ability to specify which layers and metrics should be treated as "size" related, and to propagate these settings throughout the analyzer pipeline. This enhances configurability for different storage backends and analysis presets.
Configuration enhancements:
- Added `size_derived_metrics` and `size_layers` fields to the `AnalyzerPresetConfig` dataclass, with appropriate defaults for POSIX and DLIO presets, allowing each preset to specify which layers and metrics are considered "size"-related. [1] [2] [3]
- Added a `DERIVED_POSIX_SIZE_METRICS` constant to centralize the list of metrics considered size-related for POSIX.

Analyzer logic updates:
- Updated the `set_layer_metrics` method in `analyzer.py` to accept a `size_derived_metrics` argument and use it to determine which metrics should be treated as size-related, improving flexibility and reducing hardcoding.
- Updated the `_compute_main_view` method to use the new configuration fields, ensuring that size columns and metrics are handled correctly based on the current layer and preset.

Bug fix / data extraction improvement:
- Changed the handling of `image_size` in `io_function` to only assign the `size` field when the operation name does not include "open", ensuring more accurate data attribution.
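The `image_size` rule described in the bug fix could look roughly like the sketch below. The helper name `size_from_image_size` and the `name`/`args` event layout are assumptions for illustration; the real `io_function` signature is not shown in this conversation.

```python
from typing import Any, Dict, Optional


def size_from_image_size(name: str, args: Dict[str, Any]) -> Optional[int]:
    """Return a size for an IO event, skipping open-like operations."""
    image_size = args.get("image_size")
    # Operations whose name includes "open" do not transfer data, so
    # attributing image_size to them would inflate size totals.
    if image_size is None or "open" in name:
        return None
    return int(image_size)
```

For example, `size_from_image_size("read", {"image_size": 4096})` yields a size, while the same arguments with the name `"open64"` yield `None` because the substring check matches any open variant.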