-
Notifications
You must be signed in to change notification settings - Fork 0
feat: add download_file utility with resume and progress support #21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Caution Review failedThe pull request is closed. 📝 WalkthroughWalkthroughIntroduces a reusable async download_file utility with resumable and progress-aware streaming; refactors asset, download, and user API wrappers to use it (adds show_progress and play_asset_video_to_file); updates README and pre-commit bandit filter; adds tests for download behavior. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Wrapper as API Wrapper
participant Utils as download_file
participant HTTP as HTTP Client
participant Disk as File System
Client->>Wrapper: download_asset_to_file(id, out_dir, ...)
Wrapper->>Wrapper: build make_request(headers?)
Wrapper->>Utils: download_file(make_request, out_dir, resolve_filename, show_progress)
rect rgb(230, 240, 255)
Note over Utils,HTTP: Initial request (no Range)
Utils->>HTTP: make_request()
HTTP-->>Utils: Response (headers, Content-Length)
end
Utils->>Utils: resolve_filename(resp.headers) → filename
Utils->>Disk: check for existing .temp (partial)
alt Partial .temp exists
rect rgb(240, 230, 240)
Note over Utils,HTTP: Resume attempt with Range
Utils->>HTTP: make_request(Range header)
HTTP-->>Utils: 206 Partial / 200 Full
Utils->>Utils: set offset accordingly
end
end
rect rgb(230, 255, 230)
Note over Utils,Disk: Stream and write
loop while chunks
Utils->>HTTP: read chunk
Utils->>Disk: append to .temp
Utils->>Utils: update progress bar (if enabled)
end
end
Utils->>Disk: rename .temp → final filename
Utils-->>Wrapper: return Path
Wrapper-->>Client: Path to file
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
Comment |
|
@CodeRabbit update the README.md to mention the resumable downloads |
This comment was marked as spam.
This comment was marked as spam.
|
@coderabbitai analyze the download_file specifically for unhandled exceptions and resource leaks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @immich/utils.py:
- Around line 134-140: Initialize resp and temp_path to None before the try
block and change the out_dir validation to only error if out_dir.exists() and
not out_dir.is_dir(); otherwise create it with out_dir.mkdir(...). Update
exception handlers and the finally block to check for resp is not None and
temp_path is not None (and temp_path.exists() where appropriate) before using
them, and avoid referencing temp_path or resp directly when they may not have
been assigned. Ensure references to make_request, resp, temp_path, out_dir, and
the try/except/finally blocks are updated accordingly.
🧹 Nitpick comments (3)
bin/generate.py (1)
111-111: Consider validating therefparameter before use.While the
nosec: B603directive is correctly placed, the underlying security concern flagged by Ruff (S603) about untrusted input remains valid. Theargs.refparameter is user-controllable (via CLI or environment variable) and is incorporated into the URL without validation. Although the security risk is low for a build script, consider adding basic validation to ensure the ref contains only expected characters (e.g., alphanumerics, hyphens, underscores, dots, slashes).🔒 Proposed validation for the ref parameter
Add validation after parsing arguments:
import re def main() -> int: parser = argparse.ArgumentParser( description="Generate Immich OpenAPI client and rewrite imports." ) parser.add_argument( "--ref", default=os.environ.get("IMMICH_OPENAPI_REF", "main"), help="Immich git ref for OpenAPI spec (default: IMMICH_OPENAPI_REF or 'main')", ) args = parser.parse_args() # Validate ref to prevent injection attempts if not re.match(r'^[a-zA-Z0-9._/-]+$', args.ref): print(f"Invalid ref format: {args.ref}") return 1 root = project_root() # ... rest of the codeimmich/client_wrapper/download_api_wrapped.py (1)
69-99: Late-binding closure captures loop variables by reference.Static analysis (Ruff B023) correctly flags that
make_requestand theresolve_filenamelambda captureasset_ids_dtoandfilenameby reference. While this currently works because the closure is consumed before the next iteration, it's fragile and will break ifdownload_fileever defers the callback.Additionally, the
headersparameter in theresolve_filenamelambda is unused (Ruff ARG005) since the filename is pre-determined.♻️ Proposed fix: bind variables explicitly
for asset_ids_dto, expected_size in archive_requests: filename = f"archive-{uuid.uuid4()}.zip" - def make_request(extra_headers: Optional[dict[str, str]]): + def make_request( + extra_headers: Optional[dict[str, str]], + *, + _asset_ids_dto: AssetIdsDto = asset_ids_dto, + ): return self.download_archive_without_preload_content( - asset_ids_dto=asset_ids_dto, + asset_ids_dto=_asset_ids_dto, key=key, slug=slug, _headers=kwargs.get("_headers", {}) | (extra_headers or {}), **kwargs, ) pbar = tqdm.tqdm( total=expected_size or None, unit="B", unit_scale=True, desc=str(filename), position=1, leave=False, dynamic_ncols=True, disable=not show_progress, ) + # Bind filename at definition time; headers unused for archives + bound_filename = filename await download_file( make_request=make_request, out_dir=out_dir, - resolve_filename=lambda headers: filename, + resolve_filename=lambda _headers, fn=bound_filename: fn, show_progress=show_progress, pbar=pbar, resumeable=False, # zip files are not resumable )immich/utils.py (1)
206-213: Consider preserving caller-provided progress bars.When the caller passes their own
pbar, thefinallyblock closes it unconditionally. This may be intentional, but if the caller wants to reuse or manage the progress bar's lifecycle, this could be unexpected. The fix proposed above includes trackinguser_pbarto optionally skip closing caller-provided bars.Also,
temp_path.replace(out_path)on line 212 is outside the try block, so if this fails (e.g., cross-filesystem rename), no cleanup occurs. Consider wrapping this in error handling or usingshutil.movefor robustness.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
.pre-commit-config.yamlREADME.mdbin/generate.pyimmich/client_wrapper/assets_api_wrapped.pyimmich/client_wrapper/download_api_wrapped.pyimmich/client_wrapper/users_api_wrapped.pyimmich/utils.py
🧰 Additional context used
🧬 Code graph analysis (3)
immich/utils.py (4)
immich/client/rest.py (1)
headers(43-45)immich/client_wrapper/assets_api_wrapped.py (3)
make_request(41-48)make_request(87-94)make_request(135-143)immich/client_wrapper/download_api_wrapped.py (1)
make_request(72-79)immich/client_wrapper/users_api_wrapped.py (1)
make_request(35-40)
immich/client_wrapper/users_api_wrapped.py (4)
immich/utils.py (2)
download_file(114-213)resolve_output_filename(62-111)immich/client_wrapper/assets_api_wrapped.py (3)
make_request(41-48)make_request(87-94)make_request(135-143)immich/client_wrapper/download_api_wrapped.py (1)
make_request(72-79)immich/client/api/users_api.py (1)
get_profile_image_without_preload_content(1544-1601)
immich/client_wrapper/download_api_wrapped.py (3)
immich/utils.py (1)
download_file(114-213)immich/client_wrapper/assets_api_wrapped.py (3)
make_request(41-48)make_request(87-94)make_request(135-143)immich/client_wrapper/users_api_wrapped.py (1)
make_request(35-40)
🪛 LanguageTool
README.md
[style] ~57-~57: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... asset thumbnail directly to disk . - assets.play_asset_video_to_file: Download an...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🪛 Ruff (0.14.10)
bin/generate.py
111-111: subprocess call: check for execution of untrusted input
(S603)
immich/utils.py
139-139: Abstract raise to an inner function
(TRY301)
139-139: Avoid specifying long messages outside the exception class
(TRY003)
immich/client_wrapper/download_api_wrapped.py
74-74: Function definition does not bind loop variable asset_ids_dto
(B023)
94-94: Unused lambda argument: headers
(ARG005)
94-94: Function definition does not bind loop variable filename
(B023)
🔇 Additional comments (9)
bin/generate.py (1)
14-14: LGTM: Appropriate security linter suppression.The
nosec: B404directive correctly suppresses the Bandit warning for importing subprocess. The module is used legitimately in this build script to invoke openapi-generator-cli..pre-commit-config.yaml (1)
18-18: LGTM!Excluding the auto-generated client code (
immich/client/) from Bandit scans is appropriate. Excludingtests/is also a common practice, though ensure any custom test fixtures handling sensitive data are still reviewed manually.README.md (1)
57-57: LGTM!Documentation correctly reflects the new
play_asset_video_to_filemethod. The repetitive sentence structure flagged by static analysis is acceptable here—consistent formatting across API method descriptions aids readability.immich/client_wrapper/users_api_wrapped.py (1)
12-51: LGTM!Clean refactoring to leverage the shared
download_fileutility. The implementation correctly:
- Merges headers via the
make_requesthelper- Uses
resolve_output_filenamewith an appropriate default base- Propagates
show_progressto control the progress barMinor note:
out_dir.mkdir(parents=True, exist_ok=True)on line 33 is also called insidedownload_file(line 140 of utils.py), making the call here redundant but harmless.immich/client_wrapper/assets_api_wrapped.py (3)
61-105: LGTM!The new
play_asset_video_to_filemethod follows the established pattern used bydownload_asset_to_fileandview_asset_to_file. The implementation correctly:
- Uses
video-{id}as the default filename base- Delegates to
play_asset_video_without_preload_contentviamake_request- Propagates all relevant parameters including
show_progress
15-59: Clean refactoring to the shared download utility.The
download_asset_to_filemethod now consistently uses thedownload_fileabstraction with proper header merging and filename resolution.
107-154: Consistent implementation.
view_asset_to_filefollows the same pattern, correctly usingthumb-{id}as the default base and including thesizeparameter in the request.immich/utils.py (2)
16-20: LGTM!Simple and correct case-insensitive header lookup. Linear search is appropriate given typical header counts.
154-181: Well-implemented resume logic.The resumable download handling is solid:
- Properly attempts Range request for partial files
- Falls back to restart if server doesn't return 206
- Cleans up temp files larger than expected size
This comment was marked as resolved.
This comment was marked as resolved.
|
@coderabbitai review the latest commit |
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @immich/utils.py:
- Around line 216-217: The temp_path.replace(out_path) call is outside the
try/except that deletes temp_path on failure, so if replace raises
(cross-device, perms) the temp file is left behind; move the replace call into
the try block (or add a nested try/except around temp_path.replace in the same
function) so that any exception during rename is caught and the except/finally
cleanup still runs to remove temp_path; reference the temp_path and out_path
variables and ensure the cleanup code that removes temp_path executes on any
error from replace.
🧹 Nitpick comments (3)
immich/client_wrapper/download_api_wrapped.py (1)
69-99: Closures inside loop don't bind loop variables — fragile if refactored.The
make_requestfunction and theresolve_filenamelambda referenceasset_ids_dtoandfilenamewithout binding them as default arguments. While the current code awaits immediately within the same iteration (so it works), this is fragile and will break if these closures are ever stored or deferred.♻️ Proposed fix to bind loop variables
for asset_ids_dto, expected_size in archive_requests: filename = f"archive-{uuid.uuid4()}.zip" - def make_request(extra_headers: Optional[dict[str, str]]): + def make_request(extra_headers: Optional[dict[str, str]], _dto=asset_ids_dto): return self.download_archive_without_preload_content( - asset_ids_dto=asset_ids_dto, + asset_ids_dto=_dto, key=key, slug=slug, _headers=kwargs.get("_headers", {}) | (extra_headers or {}), **kwargs, ) pbar = tqdm.tqdm( # ... ) await download_file( make_request=make_request, out_dir=out_dir, - resolve_filename=lambda headers: filename, + resolve_filename=lambda headers, _fn=filename: _fn, show_progress=show_progress, pbar=pbar, resumeable=False, )tests/test_utils.py (2)
394-416: Consider adding test for valid non-existentout_dir.The test
test_download_file_raises_error_if_out_dir_is_not_directorycorrectly tests that a file raisesValueError. However, there's no test for the case whereout_diris a valid path that simply doesn't exist yet. Once the validation bug inutils.pyis fixed, this scenario should be tested.💡 Example test to add
@pytest.mark.asyncio async def test_download_file_creates_nonexistent_out_dir(tmp_path: Path) -> None: """Test that non-existent out_dir is created.""" out_dir = tmp_path / "new" / "nested" / "dir" assert not out_dir.exists() content_data = b"test" headers = {"Content-Length": str(len(content_data))} async def make_request(headers_arg): return MockResponse(headers, content_data=content_data) def resolve_filename(h): return "test.txt" result = await utils.download_file( make_request, out_dir, resolve_filename, show_progress=False ) assert out_dir.exists() assert result.read_bytes() == content_data
554-567: Consider verifying caller-providedpbaris not closed.The docstring states caller-provided progress bars "must be closed by the caller." The test verifies
updatewas called but doesn't assert thatclosewas NOT called. A quick addition would strengthen this test.# Should update with partial data size, then with remaining chunks assert mock_pbar.update.called assert result.read_bytes() == full_content + # Caller-provided pbar should not be closed by download_file + mock_pbar.close.assert_not_called()
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
README.mdimmich/client_wrapper/download_api_wrapped.pyimmich/utils.pytests/test_utils.py
🧰 Additional context used
🧬 Code graph analysis (2)
tests/test_utils.py (1)
immich/utils.py (2)
h(16-20)download_file(114-229)
immich/utils.py (4)
immich/client/rest.py (1)
headers(43-45)immich/client_wrapper/download_api_wrapped.py (1)
make_request(72-79)tests/test_utils.py (27)
make_request(174-175)make_request(204-208)make_request(238-242)make_request(273-281)make_request(311-318)make_request(349-350)make_request(377-380)make_request(404-405)make_request(461-462)make_request(496-497)make_request(520-521)make_request(546-549)resolve_filename(177-178)resolve_filename(210-211)resolve_filename(244-245)resolve_filename(283-284)resolve_filename(320-321)resolve_filename(352-353)resolve_filename(382-383)resolve_filename(407-408)resolve_filename(464-465)resolve_filename(499-500)resolve_filename(523-524)resolve_filename(551-552)content(147-162)content(448-459)content(485-494)immich/client_wrapper/assets_api_wrapped.py (3)
make_request(41-48)make_request(87-94)make_request(135-143)
🪛 LanguageTool
README.md
[style] ~57-~57: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... asset thumbnail directly to disk . - assets.play_asset_video_to_file: Download an...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🪛 Ruff (0.14.10)
immich/client_wrapper/download_api_wrapped.py
74-74: Function definition does not bind loop variable asset_ids_dto
(B023)
94-94: Unused lambda argument: headers
(ARG005)
94-94: Function definition does not bind loop variable filename
(B023)
tests/test_utils.py
174-174: Unused function argument: headers_arg
(ARG001)
177-177: Unused function argument: h
(ARG001)
204-204: Unused function argument: headers_arg
(ARG001)
210-210: Unused function argument: h
(ARG001)
238-238: Unused function argument: headers_arg
(ARG001)
244-244: Unused function argument: h
(ARG001)
283-283: Unused function argument: h
(ARG001)
320-320: Unused function argument: h
(ARG001)
349-349: Unused function argument: headers_arg
(ARG001)
352-352: Unused function argument: h
(ARG001)
382-382: Unused function argument: h
(ARG001)
404-404: Unused function argument: headers_arg
(ARG001)
407-407: Unused function argument: h
(ARG001)
450-450: Unused method argument: size
(ARG002)
461-461: Unused function argument: headers_arg
(ARG001)
464-464: Unused function argument: h
(ARG001)
487-487: Unused method argument: size
(ARG002)
490-490: Avoid specifying long messages outside the exception class
(TRY003)
496-496: Unused function argument: headers_arg
(ARG001)
499-499: Unused function argument: h
(ARG001)
520-520: Unused function argument: headers_arg
(ARG001)
523-523: Unused function argument: h
(ARG001)
551-551: Unused function argument: h
(ARG001)
immich/utils.py
143-143: Abstract raise to an inner function
(TRY301)
143-143: Avoid specifying long messages outside the exception class
(TRY003)
217-217: Consider moving this statement to an else block
(TRY300)
🔇 Additional comments (6)
immich/utils.py (4)
152-166: Early return paths correctly close response — looks good.Both early-return branches (size match and size mismatch) properly close
respbefore returning, preventing resource leaks. The security-conscious approach of not re-downloading mismatched files is documented appropriately.
168-194: Resume logic handles Range request fallback correctly.The code properly handles servers that don't support Range requests by checking for status 206 and falling back to a full restart. Response objects are closed before making new requests, preventing leaks.
219-229: Exception and cleanup handling properly guards againstNameError.The code correctly initializes
resp,temp_path, andpbar_createdbefore the try block, preventing theNameErrorissues flagged in past reviews. Caller-provided progress bars are respected via thepbar_createdflag.
16-20: Clean helper for case-insensitive header lookup.Simple and correct implementation. This standardizes header access across the download flow.
tests/test_utils.py (1)
117-162: Well-structured mock response for async testing.The
MockResponseclass properly implements the async context manager protocol and simulates chunked content iteration. This provides a solid foundation for testing the download_file behavior.README.md (1)
53-66: Documentation clearly explains resumable downloads and limitations.The new documentation accurately describes the added
play_asset_video_to_filemethod, the automatic resume behavior via HTTP Range requests, and the limitation for ZIP archives. The.tempfile naming convention is well documented.
@coderabbitai review
Summary by CodeRabbit
New Features
Improvements
Documentation
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.