MNT: Modernize export-archive#6745
Conversation
|
meh, shit, this seems to break something in the metadata code. edit: found the cause. |
This change contains maintenance work on export-archive. This includes
a complete switch from os.path to Pathlib, as well as abandoning old
almost never used and @normalize_path-polluted repo methods
get_indexed_files() and is_under_annex(), and replacing them with
get_content_{annex,}info(). This allows us to simplify the logic in
the for loop over all files further.
Coincidentally, because the switch from normlize_path functions
to get_content_{annex,}info() prevents faulty path handling when
export-archive is called from subdirectories inside of datasets,
this change fixes datalad#6741
mih
left a comment
There was a problem hiding this comment.
Awesome! LGTM!
From my POV this calls for an immediate move of GitRepo.get_indexed_files() into datalad/tests/utils.py only leaving a deprecated shim in GitRepo -- but no TODO for this PR.
Moreover, it seems that there is only a single user for file_has_content()left (in metadata.py) -- an important milestone for #4595.
Likewise time is running out for is_under_annex() only download_url(), add_archive_content() and that same metadata.py line are left TODO.
Wonderful contribution, thx much!
| filename = path.join(filename, default_filename) # under given directory | ||
| if not filename.endswith(file_extension): | ||
| filename += file_extension | ||
| filename = Path(default_filename) # in current directory |
There was a problem hiding this comment.
FTR: Was wondering whether default_filename should become Path already on line 112, but this rest of the logic seems to encourage not making this change.
Codecov Report
@@ Coverage Diff @@
## master #6745 +/- ##
==========================================
+ Coverage 88.82% 90.41% +1.58%
==========================================
Files 353 353
Lines 45761 45747 -14
==========================================
+ Hits 40647 41360 +713
+ Misses 5114 4387 -727
Continue to review full report at Codecov.
|
yarikoptic
left a comment
There was a problem hiding this comment.
looks good! one notable suggestion to keep behavior consistent and logic "centralized" on "what is basename" when working with archives.
Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
|
committed your suggestions, @yarikoptic |
Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
|
Code Climate has analyzed commit df977b6 and detected 0 issues on this pull request. View more on Code Climate. |
|
one appveyor run |
This change contains maintenance work on export-archive. This includes
a complete switch from os.path to Pathlib, as well as abandoning old
almost never used and @normalize_path-polluted repo methods
get_indexed_files() and is_under_annex(), and replacing them with
get_content_{annex,}info(). This allows us to simplify the logic in
the for loop over all files further.
Coincidentally, because the switch from normlize_path functions
to get_content_{annex,}info() prevents faulty path handling when
export-archive is called from subdirectories inside of datasets,
this change fixes #6741
Changelog
🐛 Bug Fixes
export-archivedoesn't rely onnormalize_path()methods anymore and became more robust when called from subdirectories (fixesexport-archivefails on dataset with subdatasets, only if run from a subdirectory #6741)