Skip to content

fix superblock struct; fix extended attribute parse#8

Merged
Eeems merged 23 commits intoEeems:mainfrom
007gzs:main
Aug 16, 2025
Merged

fix superblock struct; fix extended attribute parse#8
Eeems merged 23 commits intoEeems:mainfrom
007gzs:main

Conversation

@007gzs
Copy link
Copy Markdown
Contributor

@007gzs 007gzs commented Aug 6, 2025

fix superblock struct;
fix extended attribute parse;

Summary by CodeRabbit

  • New Features

    • Enhanced extended attribute support, including values stored in separate inodes and safer value offset handling.
    • Added compatibility with 64-bit ext4 filesystems via dynamic descriptor sizing.
    • New option to ignore unknown attribute name indices for more tolerant parsing.
  • Bug Fixes

    • Corrected alignment and size calculations for extended attributes.
    • Fixed inline attribute size computation for accurate reads.
  • Tests

    • Expanded coverage to 32-bit and 64-bit images with 100 files and extensive xattr validation.
  • Chores

    • Updated test assets and CI artifacts to include both 32-bit and 64-bit images.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Aug 6, 2025

Warning

Rate limit exceeded

@Eeems has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 0 minutes and 53 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 112743e and 20f0586.

📒 Files selected for processing (3)
  • .github/workflows/build.yaml (2 hunks)
  • ext4/xattr.py (8 hunks)
  • pyproject.toml (1 hunks)

Walkthrough

Ext4 metadata handling was updated: superblock fields extended with a dynamic descriptor size, volume uses that size and a new constructor flag, inode exposes block_size and adjusts xattr inline size, and extended-attribute parsing/reading was reworked to support EA-inode values and tolerant name indices. Test images now build 32-bit and 64-bit variants; tests and CI updated accordingly.

Changes

Cohort / File(s) Summary
XAttr and inode integration
ext4/inode.py, ext4/xattr.py
Inode now exposes block_size; xattr inline-size computation uses inode offset; xattr headers unify; xattr parsing refactored with value_offset helpers, EA-inode support via value_inum, safer name index handling, corrected entry stepping, charset specified; imports adjusted.
Superblock and volume descriptor sizing
ext4/superblock.py, ext4/volume.py
Superblock adds s_mkfs_time, expands s_backup_bgs to 2 entries, and introduces desc_size (32 or s_desc_size if 64bit). Volume ctor adds ignore_attr_name_index flag; block descriptor offset computation switches to superblock.desc_size.
Test image generation and CI artifacts
_test_image.sh, test.sh, .github/workflows/build.yaml
Test image script now generates many files with xattrs and creates 32-bit/64-bit ext4 images; test.sh handles new files and cleanup; CI uploads four artifacts (32/64 and tmp variants).
Tests
test.py
Tests iterate over test32/test64 images, verify offset gating, read /test.txt behaviors, and validate 100 files’ contents and 20 user xattrs each, including EA reading paths.

Sequence Diagram(s)

sequenceDiagram
  participant T as Test
  participant V as Volume
  participant I as Inode
  participant XH as XAttr Header (IBody/Block)
  participant XE as XAttr Entry
  participant RV as Raw Volume Read

  T->>V: open(image, offset)
  T->>I: get_inode(path)
  T->>I: list/read xattrs
  I->>XH: parse header at offsets
  loop entries
    XH->>XE: parse entry
    alt XE.value_inum != 0
      I->>V: get inode by inum
      V-->>I: referenced inode
      I-->>T: value from referenced inode
    else XE.e_value_size > 0
      XH->>XH: compute value_offset(entry)
      I->>RV: read at value_offset, size
      RV-->>I: value bytes
      I-->>T: value bytes
    else
      I-->>T: b""
    end
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Unify test image creation logic #9 — Also modifies test image creation flow and related scripts/workflow, aligning with the new 32/64-bit image handling.
  • Fix #6 #7 — Adjusts test harness to use prefixed-byte images and offset logic, similar to the updated test and image generation approach in this PR.

Poem

A rabbit with paws in the inode’s lore,
Nibbles xattrs, now cleaner than before.
Two moons of images, 32 and 64,
Superblocks whisper their descriptor lore.
Hop, skip, seek—values fetched in tune,
Bytes like clover beneath a silver moon. 🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
ext4/xattr.py (1)

161-172: Fix Yoda condition and add stacklevel to warning.

The logic for handling unknown attribute indices is good, but there are two minor issues to address:

  1. The Yoda condition makes the code less readable
  2. The warning should include stacklevel=2 to point to the caller
     @property
     def name_str(self):
         name_index = self.e_name_index
-        if 0 > name_index or name_index >= len(
+        if name_index < 0 or name_index >= len(
             ExtendedAttributeEntry.NAME_INDICES
         ):
             warnings.warn(
-                f"Unknown attribute prefix {self.e_name_index:d}"
+                f"Unknown attribute prefix {self.e_name_index:d}",
+                stacklevel=2
             )
             name_index = 0
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd1b8e1 and 53860fd.

📒 Files selected for processing (6)
  • ext4/blockdescriptor.py (1 hunks)
  • ext4/directory.py (1 hunks)
  • ext4/inode.py (4 hunks)
  • ext4/superblock.py (3 hunks)
  • ext4/volume.py (1 hunks)
  • ext4/xattr.py (7 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
ext4/xattr.py

162-162: Yoda condition detected

Rewrite as name_index < 0

(SIM300)


165-165: No explicit stacklevel keyword argument found

Set stacklevel=2

(B028)

🔇 Additional comments (18)
ext4/superblock.py (3)

79-79: LGTM! Standard filesystem field addition.

The s_mkfs_time field addition aligns with EXT4 specification for tracking filesystem creation time.


188-193: Well-implemented conditional property.

The desc_size property correctly encapsulates the logic for determining descriptor size based on the IS64BIT feature flag, providing a clean interface for callers.


114-114: Please verify the EXT4 superblock layout for s_backup_bgs

The change from c_uint32 to c_uint32 * 2 alters the on-disk struct layout. We didn’t find any in-repo documentation to confirm this against the official EXT4 specification. According to the Linux kernel’s ext4_super_block (e.g. in include/uapi/linux/ext4_fs.h), s_backup_bgs is indeed defined as a two-element 32-bit array. Please cross-check with the upstream EXT4 spec or kernel headers to ensure this change remains binary-compatible with existing filesystem images.

• File: ext4/superblock.py, line 114

ext4/directory.py (1)

23-23: LGTM! Formatting standardization.

Removing extraneous spaces around slice indices improves code consistency across the codebase.

ext4/volume.py (1)

86-86: LGTM! Proper usage of encapsulated property.

Using self.superblock.desc_size instead of direct field access ensures the descriptor size calculation respects the 64-bit feature flag logic encapsulated in the Superblock class.

ext4/blockdescriptor.py (1)

123-123: LGTM! Consistent slice formatting.

Standardizing slice formatting across the codebase improves readability and consistency.

ext4/inode.py (5)

35-35: LGTM! Import addition for extended attribute improvements.

Adding ExtendedAttributeHeader import supports the enhanced extended attribute handling in this file.


236-236: LGTM! Consistent slice formatting.

Standardizing slice formatting across the checksum calculation improves code consistency.

Also applies to: 241-241


303-303: Critical fix for inline extended attribute size calculation.

The addition of self.offset to the inline_size calculation corrects the offset computation, ensuring proper extended attribute parsing within inode bodies.


315-315: LGTM! Consistent block size access pattern.

Using self.volume.block_size instead of self.block_size aligns with the volume-centric approach used throughout the codebase.


317-319: Enhanced extended attribute handling for block-based xattrs.

Switching from ExtendedAttributeIBodyHeader to ExtendedAttributeHeader for block-based extended attributes improves compatibility and parsing accuracy.

ext4/xattr.py (7)

10-10: LGTM! Import addition is necessary.

The import of EXT4_FEATURE_INCOMPAT is required for the new value_inum property implementation.


45-47: LGTM! Value offset calculation method added.

The value_offset method correctly calculates the offset for attribute values relative to the entry position.


52-83: Excellent improvements to attribute iteration logic.

The changes improve the robustness of extended attribute handling:

  • Correctly uses the new value_inum property which respects feature flags
  • Adds proper bounds checking for value reads (lines 74-78)
  • Fixes the iteration step size calculation (line 83)

94-94: LGTM! Reserved field size corrected.

The h_reserved field size has been corrected to 3 c_uint32 elements, aligning with the ext4 specification.


105-107: LGTM! Value offset method for block-based attributes.

The value_offset method correctly calculates the offset relative to the header position for block-based extended attributes.


134-134: LGTM! Added fallback for unknown attribute prefix.

Adding an empty string at index 5 provides a fallback for handling unknown attribute prefixes.


174-179: LGTM! Feature-aware value_inum property.

The value_inum property correctly checks the EA_INODE feature flag before returning the inode number, ensuring compatibility with the filesystem's capabilities.

Comment thread ext4/xattr.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
ext4/xattr.py (1)

159-170: Good error handling improvement, but fix style issues.

The change from raising exceptions to issuing warnings makes the code more resilient when encountering unknown attribute prefixes. The encoding change to iso-8859-2 likely aligns better with ext4 specifications.

However, address these style issues:

-        if 0 > name_index or name_index >= len(
+        if name_index < 0 or name_index >= len(
             ExtendedAttributeEntry.NAME_INDICES
         ):
             warnings.warn(
-                f"Unknown attribute prefix {self.e_name_index:d}"
+                f"Unknown attribute prefix {self.e_name_index:d}",
+                stacklevel=2
             )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 53860fd and 9151cb5.

📒 Files selected for processing (1)
  • ext4/xattr.py (7 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
ext4/xattr.py (2)
ext4/enum.py (2)
  • EXT4_FEATURE_INCOMPAT (141-157)
  • EXT4_FL (302-329)
ext4/inode.py (1)
  • Inode (121-326)
🪛 Ruff (0.12.2)
ext4/xattr.py

160-160: Yoda condition detected

Rewrite as name_index < 0

(SIM300)


163-163: No explicit stacklevel keyword argument found

Set stacklevel=2

(B028)

🔇 Additional comments (11)
ext4/xattr.py (11)

10-10: LGTM: Import addition supports new feature checking.

The addition of EXT4_FEATURE_INCOMPAT is necessary for the new value_inum property that checks EA_INODE feature compatibility.


18-21: Good fix: Resolves initialization order issue by requiring size parameter.

This change addresses the past review concern about accessing self.size before super().__init__(). Making the size parameter required eliminates the initialization order problem and simplifies the constructor logic.

Note: This is a breaking change for any code that relied on the optional size parameter.


43-44: LGTM: Proper abstraction for inline attribute value offsets.

The value_offset method correctly calculates the absolute offset for attribute values in inline storage by adding the entry's offset to its value offset.


103-104: LGTM: Proper abstraction for block-stored attribute value offsets.

The value_offset method correctly calculates the absolute offset for attribute values in block storage by adding the header's offset to the entry's value offset.


50-50: Good: Size-limited entry construction prevents buffer overruns.

Passing the remaining data size to the ExtendedAttributeEntry constructor helps prevent reading beyond the allocated attribute space.


55-55: Good: Using property respects feature flag checking.

Using the value_inum property instead of direct e_value_inum access ensures that EA_INODE feature compatibility is properly checked.


59-81: Excellent: Robust value reading logic with proper bounds checking.

The three-case value reading logic properly handles different attribute storage scenarios:

  1. EA_INODE case: Validates inode flags and reads from referenced inode
  2. Inline value case: Includes crucial bounds checking to prevent reading beyond allocated space
  3. Empty value case: Correctly handles zero-length attributes

The bounds checking on Line 72 is particularly important for security, preventing potential buffer overruns.


81-81: Critical fix: Corrected step size calculation prevents infinite loops.

The corrected step size calculation 4 * ((entry.size + 3) // 4) ensures proper 4-byte alignment and prevents potential infinite loops in the iterator. This is a crucial correctness fix.


92-92: Important: Structure correction aligns with ext4 specification.

Increasing the reserved field size from 2 to 3 c_uint32 elements corrects the header structure to match the actual ext4 extended attribute header format. This is essential for proper parsing of on-disk data.


132-132: Good: Fallback entry improves error handling robustness.

Adding an empty string at index 5 provides a safe fallback for unknown attribute name prefixes, preventing crashes when encountering unexpected or corrupted extended attribute data.


172-177: Excellent: Proper feature flag checking for EA_INODE functionality.

The value_inum property correctly checks the superblock's incompatible features before allowing access to e_value_inum. This ensures EA_INODE functionality is only used when the filesystem actually supports it, preventing incorrect behavior on older filesystems.

This aligns perfectly with the PR objectives to fix extended attribute parsing.

Copy link
Copy Markdown
Owner

@Eeems Eeems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update test.py to test these changes.

Comment thread ext4/xattr.py Outdated
Comment thread ext4/xattr.py Outdated
Comment thread ext4/xattr.py
Comment thread ext4/inode.py Outdated
Comment thread ext4/superblock.py
Comment thread ext4/superblock.py
Comment thread ext4/inode.py Outdated
007gzs and others added 4 commits August 6, 2025 22:15
Co-authored-by: Nathaniel van Diepen <Eeems@users.noreply.github.com>
Co-authored-by: Nathaniel van Diepen <Eeems@users.noreply.github.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
ext4/xattr.py (1)

160-174: Fix Yoda condition and add stacklevel to warning.

The code correctly handles unknown attribute name indices with proper fallback behavior, but has minor style issues flagged by static analysis.

Apply this diff to fix the style issues:

-        if 0 > name_index or name_index >= len(
+        if name_index < 0 or name_index >= len(
             ExtendedAttributeEntry.NAME_INDICES
         ):
             msg = f"Unknown attribute prefix {self.e_name_index:d}"
             if self.volume.ignore_attr_name_index:
-                warnings.warn(msg, RuntimeWarning)
+                warnings.warn(msg, RuntimeWarning, stacklevel=2)
                 name_index = 0
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30461bd and 9a6528f.

📒 Files selected for processing (2)
  • ext4/volume.py (3 hunks)
  • ext4/xattr.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • ext4/volume.py
🧰 Additional context used
🪛 Ruff (0.12.2)
ext4/xattr.py

161-161: Yoda condition detected

Rewrite as name_index < 0

(SIM300)


166-166: No explicit stacklevel keyword argument found

Set stacklevel=2

(B028)

🔇 Additional comments (11)
ext4/xattr.py (11)

11-11: LGTM: Import added for EA_INODE feature support.

The addition of EXT4_FEATURE_INCOMPAT import is necessary for the new value_inum property implementation that checks for the EA_INODE incompatibility feature.


44-46: LGTM: Value offset calculation method added.

The value_offset method correctly calculates the absolute offset for attribute values by adding the entry's relative offset (e_value_offs) to the entry's position in the inode body.


51-51: LGTM: Entry instantiation now includes remaining data size.

The change to pass self.data_size - i as the size parameter ensures that each entry is aware of the remaining available space, which helps with bounds checking during parsing.


56-56: LGTM: Using value_inum property instead of direct field access.

The change from entry.e_value_inum to entry.value_inum is correct as it now properly respects the EA_INODE feature flag before returning the inode number.


60-80: LGTM: Improved attribute value reading logic with bounds checking.

The restructured conditional logic properly handles all three cases:

  1. Value stored in external inode (when value_inum != 0)
  2. Value stored inline with bounds checking (when e_value_size != 0)
  3. Empty value (fallback case)

The bounds checking on Line 73-74 prevents reading beyond the available data, which is a good defensive programming practice.


82-82: LGTM: Corrected entry size calculation for iteration.

The multiplication by 4 after rounding up to a multiple of 4 bytes is the correct way to calculate the padded entry size for iteration. The previous logic was likely causing incorrect offset calculations.


93-93: LGTM: Extended reserved field size.

Increasing the reserved field from 2 to 3 c_uint32 elements aligns with the superblock struct fixes mentioned in the PR objectives and likely reflects the correct EXT4 specification.


104-106: LGTM: Consistent value offset calculation for header-based attributes.

The value_offset method for ExtendedAttributeHeader correctly calculates the absolute offset by adding e_value_offs to the header's offset, maintaining consistency with the inode body implementation.


133-133: LGTM: Added empty string for unknown attribute prefixes.

Based on the past review discussion, this change handles the case where name index 5 (reserved for Lustre filesystem) is encountered. The empty string provides a safe fallback for unknown indices.


173-173: LGTM: Proper charset specified for attribute name decoding.

Using "iso-8859-2" as the decoding charset is appropriate for extended attribute names and provides better compatibility than the default UTF-8.


175-181: LGTM: Feature-aware value_inum property implementation.

The value_inum property correctly checks for the EA_INODE incompatibility feature before returning the inode number. This prevents accessing invalid inode numbers when the feature is not enabled, addressing the core issue mentioned in the PR objectives.

@007gzs
Copy link
Copy Markdown
Contributor Author

007gzs commented Aug 7, 2025

Please update test.py to test these changes.

I found the problem while reading the Android file system, and I don't know how to build such an image.

  • xattrs: Previously, it was completely unavailable
  • desc_size: Previously, in the case of non 64 bit support, the information of the first group was always obtained
  • superblock: I think all fields after s_first_sta_bg have problems

@Eeems
Copy link
Copy Markdown
Owner

Eeems commented Aug 7, 2025

Please update test.py to test these changes.

I found the problem while reading the Android file system, and I don't know how to build such an image.

* `xattrs`: Previously, it was completely unavailable

I've just merged a change that unifies how the test image is created. You can just update the _create_image.sh script to create new files and set attributes on them before they get placed in the image for testing. test.py can then be updated to ensure that those attributes exist, as well as expected error handling works, etc.

* `desc_size`: Previously, in the case of non 64 bit support, the information of the first group was always obtained

My only use-case for this library at the moment has been system images for a 32bit arm device. You can adjust the image creation to create a 64bit and 32bit version of the image for the test.

* `superblock`: I think all fields after `s_first_sta_bg` have problems

Could you expand on this?

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
_test_image.sh (1)

6-11: Use > for initial file creation

>> appends; if the script is rerun without cleaning $tmp_dir, stale data could accumulate. Prefer > when writing the file the first time:

-  echo "hello world$i" >> "$tmp_dir"/test$i.txt
+  echo "hello world$i" > "$tmp_dir"/test$i.txt
ext4/xattr.py (1)

160-169: Minor style & diagnostics improvements

if name_index < 0 or name_index >= len(ExtendedAttributeEntry.NAME_INDICES):
    warnings.warn(msg, RuntimeWarning, stacklevel=2)

– Regular comparison reads clearer than the Yoda condition.
– Supplying stacklevel=2 points the warning at the caller.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a6528f and 023f802.

📒 Files selected for processing (4)
  • _test_image.sh (1 hunks)
  • ext4/xattr.py (7 hunks)
  • test.py (1 hunks)
  • test.sh (0 hunks)
💤 Files with no reviewable changes (1)
  • test.sh
🧰 Additional context used
🧬 Code Graph Analysis (2)
test.py (2)
ext4/volume.py (1)
  • inode_at (186-208)
ext4/inode.py (3)
  • xattrs (306-331)
  • open (302-303)
  • open (352-355)
ext4/xattr.py (5)
ext4/enum.py (1)
  • EXT4_FEATURE_INCOMPAT (141-157)
ext4/volume.py (4)
  • offset (43-48)
  • seek (144-158)
  • read (160-162)
  • superblock (29-30)
ext4/block.py (3)
  • volume (16-17)
  • seek (73-87)
  • read (94-103)
ext4/struct.py (1)
  • size (49-50)
ext4/inode.py (2)
  • superblock (193-194)
  • Inode (121-326)
🪛 Shellcheck (0.10.0)
_test_image.sh

[warning] 15-15: Use single quotes, otherwise this expands now rather than when signalled.

(SC2064)

🪛 Ruff (0.12.2)
ext4/xattr.py

161-161: Yoda condition detected

Rewrite as name_index < 0

(SIM300)


166-166: No explicit stacklevel keyword argument found

Set stacklevel=2

(B028)

🔇 Additional comments (2)
ext4/xattr.py (2)

71-78: Silently returning empty value hides truncated attributes

If value_offset + e_value_size exceeds the header bounds the code sets value = b"" and proceeds. This discards data corruption silently and the caller cannot distinguish between an empty attribute and a truncated one.

Consider raising ExtendedAttributeError (unless an ignore_truncation flag is set) so that filesystem inconsistencies surface during testing.


85-94: h_reserved size change requires on-disk compatibility check

The reserved field grew from 2 to 3 × uint32. Ensure this matches the ext4 spec for all supported kernel versions; otherwise parsing older images may fail.

Confirm with ext4 documentation or add fallback handling.

Comment thread _test_image.sh Outdated
Comment thread test.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
_test_image.sh (1)

12-17: Trap logic is broken: second trap overwrites the first, and final images are deleted on EXIT

  • The second trap (Line 17) overwrites the first (Line 12).
  • The first trap removes the final images test{32,64}.ext4, which are required by test.py post-run.
  • Also, use single quotes for trap to avoid premature expansion (SC2064).

Fix by:

  • Pre-cleaning any stale outputs before creation.
  • Using a single, single-quoted trap that removes only the temp dir and the .tmp images (keep final .ext4).
- trap "rm -r test{32,64}.ext4{,.tmp}" EXIT
+rm -f test{32,64}.ext4{,.tmp}
+trap 'rm -rf -- "$tmp_dir" test{32,64}.ext4.tmp' EXIT
@@
-mkfs.ext4 -g 1024 -O 64bit test64.ext4.tmp -d "$tmp_dir"
-mkfs.ext4 -g 1024 -O ^64bit test32.ext4.tmp -d "$tmp_dir"
+mkfs.ext4 -F -g 1024 -O 64bit test64.ext4.tmp -d "$tmp_dir"
+mkfs.ext4 -F -g 1024 -O ^64bit test32.ext4.tmp -d "$tmp_dir"
@@
-trap "rm -r \"$tmp_dir\"" EXIT

Notes:

  • -F ensures mkfs.ext4 will not prompt when formatting a regular file. Without it, the script may hang or fail non-interactively.
test.py (1)

49-49: Replace eval-based assertions with direct asserts; tighten the offset check

Using eval obscures failures and is unnecessary. Direct asserts are clearer and safer. Also, given the known single-byte prefix, assert offset == 1.

-    _assert("offset > 0")
+    assert offset == 1
@@
-        _assert("isinstance(inode, ext4.File)")
+        assert isinstance(inode, ext4.File)
-        b = inode.open()
-        _assert("isinstance(b, ext4.BlockIO)")
-        _assert("b.readable()")
-        _assert("b.seekable()")
-        _assert("b.seek(1) == 1")
-        _assert("b.seek(0) == 0")
-        _assert("b.seek(10) == 10")
+        b = inode.open()
+        assert isinstance(b, ext4.BlockIO)
+        assert b.readable()
+        assert b.seekable()
+        assert b.seek(1) == 1
+        assert b.seek(0) == 0
+        assert b.seek(10) == 10
@@
-            for j in range(1, 21):
-                _assert(f'attrs["user.name{j}"] == b"value{i}_{j}"')
+            for j in range(1, 21):
+                assert attrs[f"user.name{j}"] == f"value{i}_{j}".encode()
             data = inode.open().read()
-            _assert(f'data == b"hello world{i}\\n"')
+            assert data == f"hello world{i}\n".encode()

Also applies to: 69-76, 81-83

🧹 Nitpick comments (3)
_test_image.sh (2)

7-10: Use ‘>’ instead of ‘>>’ when creating per-file contents

In a fresh tmp_dir, redirection with ‘>’ is clearer and avoids accidental multiple lines if the loop is rerun in the same directory.

-  echo "hello world$i" >> "$tmp_dir"/test$i.txt
+  echo "hello world$i" > "$tmp_dir"/test$i.txt

18-21: Prefer printf over echo -n for portability

echo -n can be shell-dependent. printf is more predictable.

-echo -n F > test32.ext4
+printf 'F' > test32.ext4
@@
-echo -n F > test64.ext4
+printf 'F' > test64.ext4
test.py (1)

47-53: Optional: Verify the prefix byte explicitly for clarity

Since the image is prefixed with a single 'F' byte, consider asserting the prefix directly for readability (and it documents why offset is 1).

 for img_file in ("test32.ext4", "test64.ext4"):
     offset = os.path.getsize(img_file) - os.path.getsize(f"{img_file}.tmp")
-    _assert("offset > 0")
+    with open(img_file, "rb") as pf:
+        assert pf.read(1) == b"F"
+    assert offset == 1
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 023f802 and af2ca98.

📒 Files selected for processing (2)
  • _test_image.sh (1 hunks)
  • test.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
test.py (4)
ext4/volume.py (4)
  • offset (43-48)
  • Volume (56-208)
  • inode_at (186-208)
  • read (160-162)
ext4/inode.py (5)
  • open (302-303)
  • open (352-355)
  • File (350-355)
  • xattrs (306-331)
  • Inode (121-326)
ext4/struct.py (1)
  • MagicError (12-13)
ext4/block.py (2)
  • volume (16-17)
  • read (94-103)
🪛 Shellcheck (0.10.0)
_test_image.sh

[warning] 17-17: Use single quotes, otherwise this expands now rather than when signalled.

(SC2064)

@007gzs
Copy link
Copy Markdown
Contributor Author

007gzs commented Aug 8, 2025

Please update test.py to test these changes.

I found the problem while reading the Android file system, and I don't know how to build such an image.

* `xattrs`: Previously, it was completely unavailable

I've just merged a change that unifies how the test image is created. You can just update the _create_image.sh script to create new files and set attributes on them before they get placed in the image for testing. test.py can then be updated to ensure that those attributes exist, as well as expected error handling works, etc.

023f802

* `desc_size`: Previously, in the case of non 64 bit support, the information of the first group was always obtained

My only use-case for this library at the moment has been system images for a 32bit arm device. You can adjust the image creation to create a 64bit and 32bit version of the image for the test.

af2ca98

* `superblock`: I think all fields after `s_first_sta_bg` have problems

Could you expand on this?

This is a machine translation issue. What I want to talk about is the fields after s_mkfs_time

Comment thread test.sh
Comment thread _test_image.sh Outdated
Comment thread _test_image.sh
Comment thread test.sh Outdated
@Eeems
Copy link
Copy Markdown
Owner

Eeems commented Aug 8, 2025

* `superblock`: I think all fields after `s_first_sta_bg` have problems

Could you expand on this?

This is a machine translation issue. What I want to talk about is the fields after s_mkfs_time

What are the issue you expect them to have?

Comment thread _test_image.sh Outdated
Copy link
Copy Markdown
Owner

@Eeems Eeems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.github/workflows/build.yml will also need to be updated to use the new test file paths

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/build.yaml (1)

27-30: Artifact contents updated; consider step naming and .tmp necessity.

You now bundle both 32/64 images (and their .tmp) under a single artifact name "test.ext4". Functionally fine, but:

  • Step label “Generate test.ext4” is misleading now that multiple images are produced.
  • If .tmp files are not used by test.sh, exclude them to reduce artifact size.

Would you like to:

  • Rename the step titles to “Generate test images” / “Download test images”?
  • Confirm whether test.sh needs the .tmp files? If not, upload only test32.ext4 and test64.ext4.

If you decide to drop .tmp files, update just the path list:

           path: |
-            test32.ext4
-            test32.ext4.tmp
-            test64.ext4
-            test64.ext4.tmp
+            test32.ext4
+            test64.ext4
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 889e5f1 and 074f8c4.

📒 Files selected for processing (1)
  • .github/workflows/build.yaml (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: Eeems
PR: Eeems/python-ext4#8
File: test.py:0-0
Timestamp: 2025-08-08T14:41:18.073Z
Learning: Repository: Eeems/python-ext4 — In test.py, the test framework intentionally uses an eval-based _assert(source: str) to display the expression alongside pass/fail. Avoid suggesting replacements with direct assert statements in this file to preserve the intended output format.

Comment thread .github/workflows/build.yaml Outdated
@007gzs
Copy link
Copy Markdown
Contributor Author

007gzs commented Aug 11, 2025

* `superblock`: I think all fields after `s_first_sta_bg` have problems

Could you expand on this?

This is a machine translation issue. What I want to talk about is the fields after s_mkfs_time

What are the issue you expect them to have?

data

s_first_meta_bg: 0
s_mkfs_time: 1
s_jnl_blocks: [10,...,16,17]
s_blocks_count_hi: 3
s_r_blocks_count_hi: 4
s_free_blocks_count_hi: 5
s_min_extra_isize: 6

if use code

("s_first_meta_bg", c_uint32),
#("s_mkfs_time", c_uint32),
("s_jnl_blocks", c_uint32 * 17),
("s_blocks_count_hi", c_uint32),
("s_r_blocks_count_hi", c_uint32),
("s_free_blocks_count_hi", c_uint32),
("s_min_extra_isize", c_uint16),

will got

s_first_meta_bg: 0
s_jnl_blocks: [1,10,...,16]
s_blocks_count_hi: 17
s_r_blocks_count_hi: 3
s_free_blocks_count_hi: 4
s_min_extra_isize: 5

@Eeems
Copy link
Copy Markdown
Owner

Eeems commented Aug 11, 2025

Yes, but you've added s_mkfs_time to this PR, so I guess I'm not entirely following the concern.

Comment thread .github/workflows/build.yaml Outdated
@007gzs
Copy link
Copy Markdown
Contributor Author

007gzs commented Aug 11, 2025

Yes, but you've added s_mkfs_time to this PR, so I guess I'm not entirely following the concern.

What I mean is that there was a problem with the code before the modification

007gzs and others added 3 commits August 12, 2025 13:41
Co-authored-by: Nathaniel van Diepen <Eeems@users.noreply.github.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test.sh (1)

18-18: Set trap before image creation and only clean up files created by this run

As written, if _test_image.sh fails before the trap is set, partially created artifacts won’t be cleaned up. Also, the current trap removes all test artifacts even if some existed prior to the run (though it’s only set when artifacts were missing). Safer approach: compute which files are missing first, set the trap to delete only those, then run _test_image.sh. This preserves the intended behavior (“don’t delete pre-existing artifacts”) and guarantees cleanup on failure.

Apply this diff to the block to compute missing artifacts up front, set the trap, and then create images:

-if ! [ -f test32.ext4 ] || ! [ -f test32.ext4.tmp ] || ! [ -f test64.ext4 ] || ! [ -f test64.ext4.tmp ] ;then
-  ./_test_image.sh
-  trap "rm -f test{32,64}.ext4{,.tmp}" EXIT
-fi
+missing=()
+for f in test32.ext4 test32.ext4.tmp test64.ext4 test64.ext4.tmp; do
+  if ! [ -f "$f" ]; then
+    missing+=("$f")
+  fi
+done
+if [ ${#missing[@]} -gt 0 ]; then
+  trap 'rm -f "${missing[@]}"' EXIT
+  ./_test_image.sh
+fi
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 576457c and 96b2bc9.

📒 Files selected for processing (2)
  • .github/workflows/build.yaml (1 hunks)
  • test.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/build.yaml
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Eeems
PR: Eeems/python-ext4#8
File: test.py:0-0
Timestamp: 2025-08-08T14:41:18.095Z
Learning: Repository: Eeems/python-ext4 — In test.py, the test framework intentionally uses an eval-based _assert(source: str) to display the expression alongside pass/fail. Avoid suggesting replacements with direct assert statements in this file to preserve the intended output format.
📚 Learning: 2025-08-08T14:41:18.095Z
Learnt from: Eeems
PR: Eeems/python-ext4#8
File: test.py:0-0
Timestamp: 2025-08-08T14:41:18.095Z
Learning: Repository: Eeems/python-ext4 — In test.py, the test framework intentionally uses an eval-based _assert(source: str) to display the expression alongside pass/fail. Avoid suggesting replacements with direct assert statements in this file to preserve the intended output format.

Applied to files:

  • test.sh
🔇 Additional comments (1)
test.sh (1)

16-16: Existence check for 32/64-bit images looks correct

The OR chain ensures images are (re)generated if any of the four artifacts are missing. This matches the updated two-image test strategy.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
ext4/xattr.py (1)

60-67: Fix inverted EA_INODE flag check in ext4/xattr.py

The code currently raises or warns when the EA_INODE bit is set, but the message—and the intended behavior—apply when it’s not set. Also, the warning should include a stacklevel for proper tracebacks. Note that ignore_flags is defined on Volume (defaults to False), so no extra guard is needed.

Location: ext4/xattr.py lines 62–67
Change:

-                if (inode.i_flags & EXT4_FL.EA_INODE) != 0:
+                if (inode.i_flags & EXT4_FL.EA_INODE) == 0:
                     message = f"Inode {inode.i_no:d} is not marked as a large extended attribute value"
                     if not self.volume.ignore_flags:
                         raise ExtendedAttributeError(message)
-                    warnings.warn(message, RuntimeWarning)
+                    warnings.warn(message, RuntimeWarning, stacklevel=2)
🧹 Nitpick comments (3)
ext4/xattr.py (3)

71-79: Bounds check is good; add a warning for diagnostics

Silently returning an empty value on overflow can hide bad images/corruption. Emit a warning (with stacklevel) to aid debugging; keep the tolerant behavior.

-                if value_offset + entry.e_value_size > self.offset + self.data_size:
-                    value = b""
+                end = value_offset + entry.e_value_size
+                if end > self.offset + self.data_size:
+                    warnings.warn(
+                        f"xattr value overruns header bounds (offset=0x{value_offset:X}, size={entry.e_value_size})",
+                        RuntimeWarning,
+                        stacklevel=2,
+                    )
+                    value = b""
                 else:
                     self.volume.seek(value_offset)
                     value = self.volume.read(entry.e_value_size)
             else:
                 value = b""

133-133: Index 5 placeholder makes name-index table complete

Index 5 is reserved (used by Lustre in practice). Keeping it as an empty prefix is reasonable for general ext4 parsing. Consider adding a brief comment to document the reservation.

Apply this minimal doc tweak (optional):

-        "",
+        "",  # 5: reserved (e.g., Lustre)

160-171: Tighten bounds check, improve warning context, and prefer robust decoding

  • Fix Yoda condition.
  • Add stacklevel to warnings.
  • ext4 xattr names are byte strings; prefer UTF-8 (common) with a safe latin-1 fallback to preserve bytes one-to-one if needed. Decoding as ISO-8859-2 is surprising and may mangle names on non-Central/Eastern European images.
-        name_index = self.e_name_index
-        if 0 > name_index or name_index >= len(ExtendedAttributeEntry.NAME_INDICES):
+        name_index = self.e_name_index
+        if name_index < 0 or name_index >= len(ExtendedAttributeEntry.NAME_INDICES):
             msg = f"Unknown attribute prefix {self.e_name_index:d}"
             if self.volume.ignore_attr_name_index:
-                warnings.warn(msg, RuntimeWarning)
+                warnings.warn(msg, RuntimeWarning, stacklevel=2)
                 name_index = 0
             else:
                 raise ExtendedAttributeError(msg)
-
-        return ExtendedAttributeEntry.NAME_INDICES[name_index] + self.e_name.decode(
-            "iso-8859-2"
-        )
+        # Try UTF-8 first; fall back to latin-1 for lossless byte mapping
+        try:
+            suffix = self.e_name.decode("utf-8")
+        except UnicodeDecodeError:
+            suffix = self.e_name.decode("latin-1")
+        return ExtendedAttributeEntry.NAME_INDICES[name_index] + suffix

If you prefer a single codec, confirm kernel/userland expectations for xattr name encoding. Most tooling treats names as ASCII/UTF-8.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 96b2bc9 and 112743e.

📒 Files selected for processing (5)
  • .github/workflows/build.yaml (1 hunks)
  • ext4/inode.py (4 hunks)
  • ext4/superblock.py (3 hunks)
  • ext4/xattr.py (7 hunks)
  • test.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • .github/workflows/build.yaml
  • test.sh
  • ext4/superblock.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Eeems
PR: Eeems/python-ext4#8
File: test.py:0-0
Timestamp: 2025-08-08T14:41:18.095Z
Learning: Repository: Eeems/python-ext4 — In test.py, the test framework intentionally uses an eval-based _assert(source: str) to display the expression alongside pass/fail. Avoid suggesting replacements with direct assert statements in this file to preserve the intended output format.
📚 Learning: 2025-08-08T14:41:18.095Z
Learnt from: Eeems
PR: Eeems/python-ext4#8
File: test.py:0-0
Timestamp: 2025-08-08T14:41:18.095Z
Learning: Repository: Eeems/python-ext4 — In test.py, the test framework intentionally uses an eval-based _assert(source: str) to display the expression alongside pass/fail. Avoid suggesting replacements with direct assert statements in this file to preserve the intended output format.

Applied to files:

  • ext4/xattr.py
🧬 Code Graph Analysis (2)
ext4/inode.py (5)
ext4/xattr.py (1)
  • ExtendedAttributeHeader (85-123)
ext4/volume.py (4)
  • block_size (33-34)
  • block_size (141-142)
  • offset (43-48)
  • superblock (29-30)
ext4/block.py (3)
  • block_size (12-13)
  • block_size (61-62)
  • volume (16-17)
ext4/extent.py (5)
  • block_size (14-15)
  • block_size (195-196)
  • volume (18-19)
  • volume (237-238)
  • offset (241-242)
ext4/blockdescriptor.py (1)
  • superblock (113-114)
ext4/xattr.py (6)
ext4/enum.py (2)
  • EXT4_FEATURE_INCOMPAT (141-157)
  • EXT4_FL (302-329)
ext4/volume.py (4)
  • offset (43-48)
  • seek (144-158)
  • read (160-162)
  • superblock (29-30)
ext4/block.py (3)
  • volume (16-17)
  • seek (73-87)
  • read (94-103)
ext4/struct.py (1)
  • size (49-50)
ext4/blockdescriptor.py (1)
  • superblock (113-114)
ext4/inode.py (1)
  • superblock (193-194)
🪛 Ruff (0.12.2)
ext4/xattr.py

161-161: Yoda condition detected

Rewrite as name_index < 0

(SIM300)


164-164: No explicit stacklevel keyword argument found

Set stacklevel=2

(B028)

🔇 Additional comments (11)
ext4/inode.py (4)

36-36: Importing ExtendedAttributeHeader is correct and necessary

This aligns inode with the updated xattr header types and enables block-based xattr parsing. No issues.


196-199: Good addition: centralize block_size on Inode

Providing block_size on Inode reduces duplication (e.g., Directory no longer needs its own) and makes call sites cleaner.


308-308: Fix inline xattr size calculation

Using self.offset + self.superblock.s_inode_size - inline_offset correctly computes the remaining bytes in the inode after the inline xattr area’s start, fixing earlier off-by-offset errors for non-128B inode sizes.


320-323: Correct header type and block_size usage for block xattrs

  • Using self.block_size keeps logic local to Inode and consistent with other classes.
  • Switching to ExtendedAttributeHeader is correct for block-backed xattrs.
ext4/xattr.py (7)

11-11: Import of EXT4_FEATURE_INCOMPAT is appropriate

Required for EA_INODE handling in the entry logic. No concerns.


51-51: Constructing entries with remaining size is a good correctness improvement

Passing self.data_size - i helps keep bounds checks accurate when parsing entries.


56-57: Sentinel check includes value_inum — sensible guard

Using entry.value_inum (masked by EA_INODE feature) in the zero-entry sentinel is appropriate and avoids misinterpreting the terminator when only the inum would otherwise be non-zero.


82-82: Correct 4-byte alignment for entry stepping

Advancing by 4 * ((entry.size + 3) // 4) matches ext4 xattr entry alignment. Good fix.


93-93: Header reserved size matches kernel definition

h_reserved being 3×u32 is correct per ext4_xattr_header.


104-106: Block-header value_offset is correct

Value offset relative to header start matches ext4 semantics. This further highlights the ibody variant needs the same base (see earlier comment).


173-181: Feature-gated value_inum is correct

Gate usage of e_value_inum behind EA_INODE; otherwise return 0. This matches ext4 behavior and keeps parsing compatible with non-EA_INODE filesystems.

Comment thread ext4/xattr.py
@Eeems Eeems merged commit 9765766 into Eeems:main Aug 16, 2025
2 checks passed
This was referenced Mar 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants