This repository was archived by the owner on Jan 23, 2026. It is now read-only.

treewide: improved handling of storage device #356

Merged
NickCao merged 3 commits into main from fsync
Mar 18, 2025

Conversation

Collaborator

@NickCao NickCao commented Mar 18, 2025

Summary by CodeRabbit

  • Refactor
    • Streamlined the writing and reading processes for storage devices by removing unnecessary checks and utilizing new utility functions.
    • Introduced robust error handling and logging for storage operations in the new storage module.
  • New Features
    • Added new asynchronous functions for managing storage devices, including checking availability, writing, and reading data.
    • Enhanced configuration options for storage operations with new timeout parameters.

Contributor

coderabbitai Bot commented Mar 18, 2025

Walkthrough

The changes in this update involve the removal of the wait_for_storage_device method from two driver modules, simplifying the control flow for reading and writing operations. The write and read methods now directly call new utility functions from the jumpstarter.common.storage module, which handle the storage device's readiness and data operations. A new storage.py file has been introduced, containing these utility functions with robust error handling and logging. Overall, the modifications streamline file operations by eliminating unnecessary checks and intermediate steps.

Changes

  • packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py: Removed wait_for_storage_device method; updated write and read methods to call write_to_storage_device and read_from_storage_device directly. Added new attributes: storage_timeout, storage_leeway, storage_fsync_timeout.
  • packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py: Removed wait_for_storage_device method; refactored write and read methods to utilize write_to_storage_device and read_from_storage_device. Added new fields in DutlinkConfig: storage_timeout, storage_leeway, storage_fsync_timeout.
  • packages/jumpstarter/jumpstarter/common/storage.py: Introduced new file with wait_for_storage_device, write_to_storage_device, and read_from_storage_device functions, implementing error handling and logging.
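The control flow described above can be sketched in simplified, synchronous form. This is a stand-in, not the PR's code: the real functions live in jumpstarter.common.storage, are async (anyio-based), and poll with timeouts and retries; here the readiness check and helpers are reduced to minimal stdlib equivalents.

```python
import os


def wait_for_storage_device(storage_device: str, timeout: int = 10) -> str:
    # The real implementation polls until the device node appears, opens
    # successfully, and reports a nonzero size; this sketch only checks
    # existence and fails immediately instead of waiting.
    if not os.path.exists(storage_device):
        raise TimeoutError(f"{storage_device} did not appear within {timeout}s")
    return storage_device


def write_to_storage_device(storage_device: str, data: bytes, timeout: int = 10) -> None:
    # Drivers now delegate here instead of carrying their own readiness checks.
    path = wait_for_storage_device(storage_device, timeout=timeout)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # the PR adds fsync, with its own retry/timeout


def read_from_storage_device(storage_device: str, timeout: int = 10) -> bytes:
    path = wait_for_storage_device(storage_device, timeout=timeout)
    with open(path, "rb") as f:
        return f.read()
```

With this shape, a driver's write method reduces to a readiness-free call into the shared utility, which is the simplification the walkthrough describes.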

Possibly related PRs

Suggested reviewers

  • mangelajo

Poem

I'm a rabbit who loves to code,
Hopping through changes on this winding road.
With storage functions now in play,
Writing and reading just a hop away.
No more waiting, just a swift glide,
Celebrating clean code, with joy and pride! 🐇✨

Tip

⚡🧪 Multi-step agentic review comment chat (experimental)
  • We're introducing multi-step agentic chat in review comments. This experimental feature enhances review discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments.
    - To enable this feature, set early_access to true in the settings.
✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@netlify

netlify Bot commented Mar 18, 2025

Deploy Preview for jumpstarter-docs ready!

  • 🔨 Latest commit: 9c1b7d8
  • 🔍 Latest deploy log: https://app.netlify.com/sites/jumpstarter-docs/deploys/67d99e7d3308de00080d1ce1
  • 😎 Deploy Preview: https://deploy-preview-356--jumpstarter-docs.netlify.app

To edit notification comments on pull requests, go to your Netlify site configuration.

Comment thread packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py Outdated
@NickCao NickCao marked this pull request as draft March 18, 2025 13:49
@NickCao NickCao changed the title from "SDWire: call fsync after flashing" to "treewide: improved handling of storage device" Mar 18, 2025
@NickCao NickCao marked this pull request as ready for review March 18, 2025 15:20
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (6)
packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py (2)

105-106: Consider adding an info log before writing.
Logging a short message prior to invoking write_to_storage_device can help with debugging or progress tracking.

 async def write(self, src: str):
     self.host()
+    self.logger.info(f"Starting write operation to {self.effective_storage_device()}")
     async with self.resource(src) as res:
         await write_to_storage_device(self.effective_storage_device(), res, logger=self.logger)

111-112: Recommend validating read completion metrics.
Aligning the read procedure with a similar level of logging as the write operation could aid in tracking total bytes read or potential device issues.

 async def read(self, dst: str):
     self.host()
+    self.logger.info(f"Starting read operation from {self.effective_storage_device()}")
     async with self.resource(dst) as res:
         await read_from_storage_device(self.effective_storage_device(), res, logger=self.logger)
+    self.logger.info(f"Read operation complete from {self.effective_storage_device()}")
packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (2)

190-193: Optional: add a pre-write log statement.
Similar to the pattern suggested in the SDWire driver, logging helps with situational awareness.

    @export
    async def write(self, src: str):
        self.host()
+       self.logger.info(f"Starting write operation to {self.storage_device}")
        async with self.resource(src) as res:
            await write_to_storage_device(self.storage_device, res, logger=self.logger)

196-199: Add read-completion logging.
Aligns the read flow with the write flow for consistent instrumentation and debugging.

    @export
    async def read(self, dst: str):
        self.host()
+       self.logger.info(f"Starting read operation from {self.storage_device}")
        async with self.resource(dst) as res:
            await read_from_storage_device(self.storage_device, res, logger=self.logger)
+       self.logger.info(f"Read operation complete from {self.storage_device}")
packages/jumpstarter/jumpstarter/common/storage.py (2)

52-95: Extract logging increment (50 MB) into a named constant.
Centralizing this magic number in a single place can clarify intent and ease maintainability.

+LOG_INCREMENT_BYTES = 50 * 1024 * 1024

 async def write_to_storage_device(
     storage_device: os.PathLike,
     ...
 ):
     ...
     next_print = 0
     async for chunk in resource:
         await stream.send(chunk)
         if logger:
             total_bytes += len(chunk)
             if total_bytes > next_print:
                 logger.debug(
                     ...
                 )
-                next_print += 50 * 1024 * 1024
+                next_print += LOG_INCREMENT_BYTES

96-113: Mirror logging progress in read operations.
Currently, write_to_storage_device logs bytes transferred, but read_from_storage_device lacks tracking statements. Aligning both might simplify debugging large-file reads.

 async def read_from_storage_device(
     storage_device: os.PathLike,
     ...
 ):
     path = await wait_for_storage_device(
         ...
     )
     with os.fdopen(os.open(path, os.O_RDONLY), "rb") as file:
         async with FileReadStream(file) as stream:
+            total_bytes = 0
+            next_print = 0
             async for chunk in stream:
                 await resource.send(chunk)
+                if logger:
+                    total_bytes += len(chunk)
+                    if total_bytes > next_print:
+                        logger.debug(
+                            "read {} MB from {}".format(
+                                total_bytes / (1024 * 1024),
+                                storage_device,
+                            )
+                        )
+                        next_print += 50 * 1024 * 1024
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 299e17d and 6927223.

📒 Files selected for processing (3)
  • packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (2 hunks)
  • packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py (2 hunks)
  • packages/jumpstarter/jumpstarter/common/storage.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: pytest-matrix (3.11)
  • GitHub Check: pytest-matrix (3.13)
  • GitHub Check: pytest-matrix (3.12)
  • GitHub Check: e2e
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-dev .devfile/Containerfile)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-devspace .devfile/Containerfile.client)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-utils Dockerfile.utils)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter Dockerfile)
🔇 Additional comments (4)
packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py (1)

10-10: Good import for streamlined storage operations.
This direct import ensures that read/write calls are neatly handled by the new utility functions. No issues observed.

packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (1)

10-17: Imports look correct.
Both anyio.sleep and the storage utilities are properly referenced to facilitate asynchronous wait and storage interactions.

packages/jumpstarter/jumpstarter/common/storage.py (2)

1-10: Imports align well with the new storage approach.
The chosen libraries and imports set up a solid foundation for asynchronous file access and error handling.


11-49: Handle zero-sized but valid devices.
The current logic mandates a nonzero device size before considering it "ready." Consider whether zero-sized devices might be valid in some edge cases (e.g., newly formatted). If so, you may want to relax this condition or provide an alternative check.

Would you like a script to search references to zero-sized devices in your codebase to confirm if the assumption holds?
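A readiness check with an opt-out for zero-sized devices, as suggested, could look like this minimal sketch. device_ready and require_nonzero_size are hypothetical names, not part of the PR; note that real block devices report an st_size of 0, so the sketch sizes the target by seeking to its end, which is also what stat alone cannot tell you.

```python
import os


def device_ready(path: str, require_nonzero_size: bool = True) -> bool:
    # A device that has not appeared yet is never ready.
    if not os.path.exists(path):
        return False
    if not require_nonzero_size:
        # Caller has opted in to treating zero-sized devices as valid
        # (e.g. freshly formatted media).
        return True
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return False  # node exists but cannot be opened yet
    try:
        # Seeking to the end yields the size for both regular files and
        # block devices; the BLKGETSIZE64 ioctl is the other common option.
        size = os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)
    return size > 0
```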

Comment thread packages/jumpstarter/jumpstarter/common/storage.py
Comment thread packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
packages/jumpstarter/jumpstarter/common/storage.py (2)

11-49: Consider allowing zero-size storage device or making that check configurable.
Currently, the loop waits for a nonzero size. For brand-new or freshly formatted devices, the size might be zero, leading to a potential indefinite wait within the 10s timeout. If that is intentional, please confirm. Otherwise, you could make this behavior configurable.


52-97: Further refine fsync retry for large or slow operations.
As noted in a previous review comment (referencing mangelajo's feedback about fsync potentially taking minutes), the hard-coded 10s timeout plus 6s leeway may be too short for some SD cards or large writes. Consider making it more configurable or implementing a user-overridable timeout. Additionally, decide whether to break out of the fsync retry loop if the device repeatedly returns EIO.
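One way to address both concerns — a configurable overall bound and bailing out of persistent EIO — is a deadline-based retry loop. This is an illustrative sketch under those assumptions, not the PR's implementation; fsync_with_retry and its defaults are made up for the example.

```python
import errno
import os
import time


def fsync_with_retry(fd: int, fsync_timeout: int = 900, retry_interval: float = 1.0) -> None:
    # Retry transient EIO until an overall deadline expires, then re-raise
    # so a persistently failing device surfaces an error instead of looping.
    deadline = time.monotonic() + fsync_timeout
    while True:
        try:
            os.fsync(fd)
            return
        except OSError as e:
            if e.errno == errno.EIO and time.monotonic() < deadline:
                time.sleep(retry_interval)
                continue
            raise
```

The caller picks fsync_timeout, so slow SD cards can be given minutes while fast devices fail quickly.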

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5055db9 and d3ae30f.

📒 Files selected for processing (3)
  • packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (2 hunks)
  • packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py (2 hunks)
  • packages/jumpstarter/jumpstarter/common/storage.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: pytest-matrix (3.12)
  • GitHub Check: pytest-matrix (3.13)
  • GitHub Check: pytest-matrix (3.11)
  • GitHub Check: e2e
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-devspace .devfile/Containerfile.client)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-utils Dockerfile.utils)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-dev .devfile/Containerfile)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter Dockerfile)
🔇 Additional comments (7)
packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (4)

10-10: No issues with the new import statement.
The sleep function is indeed used within this file and elsewhere as needed.


17-17: Good introduction of storage utility imports.
This clearly reflects the reorganization and simplifies the read/write code in this driver.


192-193: Streamlined approach for write operations.
By delegating to write_to_storage_device, you remove redundant readiness checks and centralize error handling. This should enhance maintainability.


198-199: Consistent read logic.
Delegating the reading process to read_from_storage_device helps unify behavior for storage access and maintain a consistent interface.

packages/jumpstarter/jumpstarter/common/storage.py (3)

1-4: Initial imports look good.
The typed Literal and Logger usage provides clarity when specifying read/write modes and optional logging.


6-8: Imports from anyio are appropriate.
These are used consistently for async stream handling and concurrency control.


99-115: Robust reading logic.
Your approach to waiting for the device’s readiness before reading, then streaming chunks asynchronously, is clean and efficient.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
packages/jumpstarter/jumpstarter/common/storage.py (2)

52-100: Comprehensive write implementation with robust error handling

The write_to_storage_device function handles writing data to storage devices with proper error handling and progress reporting. The default 15-minute (900 seconds) fsync timeout is appropriate given that SD card sync operations can be very slow, as noted in previous comments.

Consider making the progress reporting chunk size (currently hardcoded at 50MB) a configurable parameter:

 async def write_to_storage_device(
     storage_device: os.PathLike,
     resource: AnyByteStream,
     timeout: int = 10,
     fsync_timeout: int = 900,
     leeway: int = 6,
+    progress_chunk_mb: int = 50,
     *,
     logger: Logger | None = None,
 ):
     # ...
             if total_bytes > next_print:
                 logger.info(
                     "written {} MB to storage device {}".format(
                         total_bytes / (1024 * 1024),
                         storage_device,
                     )
                 )
-                next_print += 50 * 1024 * 1024
+                next_print += progress_chunk_mb * 1024 * 1024

102-119: Simple and effective read implementation

The read_from_storage_device function is concise and follows the same pattern as the write function.

Consider adding progress reporting for read operations similar to what's provided in the write function. This would be helpful for large files:

 async def read_from_storage_device(
     storage_device: os.PathLike,
     resource: AnyByteStream,
     timeout: int = 10,
+    progress_chunk_mb: int = 50,
     *,
     logger: Logger | None = None,
 ):
     path = await wait_for_storage_device(
         storage_device,
         mode="rb",
         timeout=timeout,
         logger=logger,
     )
     with os.fdopen(os.open(path, os.O_RDONLY), "rb") as file:
         async with FileReadStream(file) as stream:
+            total_bytes = 0
+            next_print = 0
             async for chunk in stream:
                 await resource.send(chunk)
+                if logger:
+                    total_bytes += len(chunk)
+                    if total_bytes > next_print:
+                        logger.info(
+                            "read {} MB from storage device {}".format(
+                                total_bytes / (1024 * 1024),
+                                storage_device,
+                            )
+                        )
+                        next_print += progress_chunk_mb * 1024 * 1024
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3ae30f and 221d3ee.

📒 Files selected for processing (3)
  • packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (2 hunks)
  • packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py (2 hunks)
  • packages/jumpstarter/jumpstarter/common/storage.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py
🧰 Additional context used
🧬 Code Definitions (1)
packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (1)
packages/jumpstarter/jumpstarter/common/storage.py (2)
  • read_from_storage_device (102:118)
  • write_to_storage_device (52:99)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: e2e
  • GitHub Check: pytest-matrix (3.11)
  • GitHub Check: pytest-matrix (3.13)
  • GitHub Check: pytest-matrix (3.12)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-utils Dockerfile.utils)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter Dockerfile)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-devspace .devfile/Containerfile.client)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-dev .devfile/Containerfile)
🔇 Additional comments (5)
packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (4)

10-10: Import change aligned with refactoring

The sleep import from anyio replaces the previous asyncio-based sleep used in the code.


17-17: Clean import of new storage utilities

Good refactoring approach - importing the new utility functions that will replace the removed wait_for_storage_device method.


191-194: Streamlined write implementation

The code has been simplified by delegating the storage device handling to the new write_to_storage_device function, which eliminates the need for the previous manual checking and handling.


197-199: Streamlined read implementation

Similar to the write method, this simplifies the code by delegating the storage device handling to the new read_from_storage_device function.

packages/jumpstarter/jumpstarter/common/storage.py (1)

11-49:


✅ Verification successful

Robust implementation of storage device readiness check

The wait_for_storage_device function effectively centralizes the logic for checking storage device readiness with appropriate error handling.

A few observations:

  1. Using noqa: C901 acknowledges the complexity of the function, which is necessary given the different error conditions that must be handled.
  2. The pattern matching for OSError handling is a good use of Python 3.10+ features.
  3. The function properly prevents file descriptor leaks using the with statement.

🏁 Script executed:

#!/bin/bash
# Verify the error codes being handled in the wait_for_storage_device function

# Check if the ENOMEDIUM and EIO error codes are correctly defined
python -c "import errno; print(f'ENOMEDIUM: {errno.ENOMEDIUM}, EIO: {errno.EIO}')"

# Look for previous implementations of wait_for_storage_device to compare the error handling
rg -p "def wait_for_storage_device" --type py

Length of output: 264


Review Approved: Storage device readiness check implementation is robust

The code in packages/jumpstarter/jumpstarter/common/storage.py for the wait_for_storage_device function has been verified. The error codes, ENOMEDIUM (123) and EIO (5), are correctly referenced, and the logic properly manages file descriptor lifecycles. The use of pattern matching and appropriate logging for different error conditions is solid.

  • noqa: C901 is appropriately used to bypass complexity warnings.
  • Pattern matching for OSError is correctly implemented.
  • The function’s context management effectively prevents file descriptor leaks.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (4)
packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py (2)

21-23: Consider clarifying the meaning of storage timeouts
Document the rationale for storage_timeout, storage_leeway, and storage_fsync_timeout to help maintainers and future contributors. For instance, 900 seconds might be large for certain workflows.
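A documented configuration block along these lines might look like the following sketch. The field names match the PR; the dataclass shape, comments, and defaults are illustrative assumptions, not the driver's actual config class.

```python
from dataclasses import dataclass


@dataclass
class StorageConfig:
    # Seconds to wait for the storage device node to appear and open.
    storage_timeout: int = 10
    # Extra settle time (seconds) after the device reports ready, to
    # absorb slow mux switching or kernel probing.
    storage_leeway: int = 6
    # Upper bound (seconds) on fsync after writing; SD cards can take
    # several minutes to flush large images, hence the generous default.
    storage_fsync_timeout: int = 900
```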


108-116: Consider robust error handling or retries during writes
The direct usage of write_to_storage_device is correct. However, since storage operations can fail due to device instability, implementing retries or additional error handling could improve reliability.

packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (2)

25-27: Clarify default storage timeouts
Just like in SDWire, adding docstrings or comments explaining the choice of these timeouts helps maintainers understand expected usage and prevents misconfiguration.


195-203: Potential for transient write failures
While the new approach is simpler, you might consider a retry or additional error logging in write_to_storage_device to handle flaky connections or device unavailability.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 221d3ee and 44170c8.

📒 Files selected for processing (2)
  • packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (2 hunks)
  • packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: e2e
  • GitHub Check: pytest-matrix (3.12)
  • GitHub Check: pytest-matrix (3.11)
  • GitHub Check: pytest-matrix (3.13)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter Dockerfile)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-devspace .devfile/Containerfile.client)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-dev .devfile/Containerfile)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-utils Dockerfile.utils)
🔇 Additional comments (5)
packages/jumpstarter-driver-sdwire/jumpstarter_driver_sdwire/driver.py (2)

10-10: Import usage recognized
No issues found with the new import statement referencing the storage utility functions.


121-127: Straightforward read implementation
Reading from storage is consistent with the new approach and properly references the updated utility method.

packages/jumpstarter-driver-dutlink/jumpstarter_driver_dutlink/driver.py (3)

10-10: Async sleep import
Importing sleep from anyio is appropriate for asynchronous time-based functions.


17-17: Storage utilities import
Importing the storage utility methods aligns well with the new error handling and direct device read/write approach.


208-214: Proper read usage
Reading from storage is implemented reliably and matches the updated utility function workflow.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
packages/jumpstarter/jumpstarter/common/storage.py (3)

11-50: Robust implementation for device readiness detection

The wait_for_storage_device function handles multiple error conditions appropriately:

  • Properly handles FileNotFoundError when the device hasn't appeared yet
  • Correctly handles specific OSError types (ENOMEDIUM, EIO) with appropriate retries
  • Uses fail_after for timeout management
  • Ensures file descriptors don't leak by using context managers
  • Provides useful logging at different stages

One minor consideration:

Consider extracting the mode-specific file opening logic into a helper function to reduce the complexity that triggered the noqa: C901 directive.

-    match mode:
-        case "wb":
-            fd = os.open(storage_device, os.O_WRONLY)
-        case "rb":
-            fd = os.open(storage_device, os.O_RDONLY)
-        case _:
-            raise ValueError("invalid mode: {}".format(mode))
+    fd = _get_file_descriptor(storage_device, mode)

With a helper function:

def _get_file_descriptor(storage_device, mode):
    match mode:
        case "wb":
            return os.open(storage_device, os.O_WRONLY)
        case "rb":
            return os.open(storage_device, os.O_RDONLY)
        case _:
            raise ValueError("invalid mode: {}".format(mode))

84-97: Consider adding more detailed logging for fsync retries

While the error handling for fsync is robust, a user might benefit from knowing how many retries have occurred during a long fsync operation.

+               retry_count = 0
                while True:
                    try:
                        if logger:
-                           logger.info("fsyncing storage device {}, please wait".format(storage_device))
+                           logger.info("fsyncing storage device {}, attempt {}, please wait".format(storage_device, retry_count + 1))
                        os.fsync(file.fileno())
                    except OSError as e:
                        if e.errno == errno.EIO:
+                           retry_count += 1
                            await sleep(1)
                            continue
                        else:
                            raise
                    else:
                        break

102-131: Consider adding similar retry logic for read operations

The read implementation is well-structured but lacks the retry logic present in the write implementation for handling transient I/O errors during reading.

Consider implementing similar retry logic for handling OSError with errno.EIO during read operations, as storage devices can also experience transient read errors:

async def read_from_storage_device(
    storage_device: os.PathLike,
    resource: AnyByteStream,
    timeout: int = 10,
+   read_retry_timeout: int = 60,
    *,
    logger: Logger | None = None,
):
    # ... existing code ...
    
    with os.fdopen(os.open(path, os.O_RDONLY), "rb") as file:
        async with FileReadStream(file) as stream:
            total_bytes = 0
            next_print = 0
-           async for chunk in stream:
-               await resource.send(chunk)
+           with fail_after(read_retry_timeout):
+               while True:
+                   try:
+                       async for chunk in stream:
+                           await resource.send(chunk)
+                           if logger:
+                               total_bytes += len(chunk)
+                               if total_bytes > next_print:
+                                   logger.info(
+                                       "read {} MB from storage device {}".format(
+                                           total_bytes / (1024 * 1024),
+                                           storage_device,
+                                       )
+                                   )
+                                   next_print += 50 * 1024 * 1024
+                       break  # Exit loop if reading completes without errors
+                   except OSError as e:
+                       if e.errno == errno.EIO:
+                           if logger:
+                               logger.warning("I/O error while reading, retrying...")
+                           await sleep(1)
+                           continue
+                       else:
+                           raise
-           if logger:
-               total_bytes += len(chunk)
-               if total_bytes > next_print:
-                   logger.info(
-                       "read {} MB from storage device {}".format(
-                           total_bytes / (1024 * 1024),
-                           storage_device,
-                       )
-                   )
-                   next_print += 50 * 1024 * 1024

This provides consistency between read and write operations and makes the code more resilient to transient I/O errors.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 44170c8 and 9c1b7d8.

📒 Files selected for processing (1)
  • packages/jumpstarter/jumpstarter/common/storage.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (11)
  • GitHub Check: Redirect rules - jumpstarter-docs
  • GitHub Check: Header rules - jumpstarter-docs
  • GitHub Check: Pages changed - jumpstarter-docs
  • GitHub Check: pytest-matrix (3.11)
  • GitHub Check: pytest-matrix (3.13)
  • GitHub Check: pytest-matrix (3.12)
  • GitHub Check: e2e
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-devspace .devfile/Containerfile.client)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-dev .devfile/Containerfile)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter Dockerfile)
  • GitHub Check: build-and-push-image (jumpstarter-dev/jumpstarter-utils Dockerfile.utils)
🔇 Additional comments (3)
packages/jumpstarter/jumpstarter/common/storage.py (3)

1-9: Well-chosen imports for async I/O operations

The imports are appropriate for implementing non-blocking I/O operations using the anyio library. The code correctly imports the necessary components for file operations, error handling, and async control flow.


52-100: Well-implemented write operation with proper fsync handling

The implementation is thorough with:

  • Device readiness check before writing
  • Chunked writing with progress logging
  • Proper fsync with timeout and error handling
  • Leeway period after writing to allow device stabilization

The 15-minute fsync timeout (900 seconds) aligns with the reviewer comment about slow storage devices like SD cards potentially taking 5-10 minutes to sync, addressing the performance concerns identified previously.


1-131: Overall well-designed storage utility module

This new storage module effectively centralizes storage device handling with:

  • Clear separation of concerns (device readiness, reading, writing)
  • Consistent error handling and logging patterns
  • Proper resource management with context managers
  • Configurable timeouts for various operations
  • Progress reporting for long-running operations

The implementation aligns well with the PR objectives of improving storage device handling across the codebase.

@NickCao NickCao requested a review from mangelajo March 18, 2025 16:32
Member

@mangelajo mangelajo left a comment


Nice, we need to document the new config fields in the driver docs.

@NickCao NickCao merged commit 8410d19 into main Mar 18, 2025
16 checks passed
@NickCao NickCao deleted the fsync branch March 18, 2025 16:37
@mangelajo mangelajo added this to the 0.6.0 milestone May 8, 2025