Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate ArchiveOperations tests, remove duplicated code #431

Conversation

christian-monch
Copy link
Contributor

@christian-monch christian-monch commented Jun 22, 2023

This PR factors out common test code for archive operations, i.e. for tar-archive operations and zip-archive operations.

This PR is in draft mode, because:

  1. The relevant change is in commit: 9ca4923. Everything else is either merged or cherry-picked to get around not yet merged PRs).
  2. To keep the common code clean, this PR relies on PR Fix directory name handling in zip-iterator #430. Because PR 430 has not been merged yet, the commit of PR 430 has been cherry-picked into the branch of this PR. Once PR 430 is merged, this PR will be rebased on the main-branch and the cherry-picked commit will be removed. That is the reason for putting this PR in draft mode.
  3. The consolidation of tests requires the archive code to be available. Since the respective PRs, i.e. PR Add the class ZipArchiveOperations, which implements archive operations on zip-files #407, and PR Fix directory name handling in zip-iterator #430, are not merged into main yet, I merged them into this branch to be able to run the tests. This branch will be rebased and fixed, once PR 407, and PR 430 are merged.

@mih
Copy link
Member

mih commented Jul 27, 2023

This PR has been in draft mode for a long time. I am closing it now as part of a cleanup operation. Feel free to reopen, once work on it restarts. Sorry.

@mih mih closed this Jul 27, 2023
@christian-monch christian-monch self-assigned this Aug 25, 2023
This commit adds `ZipArchiveOperations`,
a class that implements `ArchiveOperations`
for zip files. It makes use of
`datalad_next.iter_collections.zipfile.iter_zip`
to implement iteration.

The commit comes with tests. The test
code includes a pytest-fixture called `sample_zip`,
which creates a sample zip-file with known content,
much like the `sample_tar_xz`-fixture.
This commit fixes a bug in the tests
where Unix path-separators were always
used to identify directory entries in
a zip-archive. That is wrong on
Windows, where Windows-path separators
have to be used.
This commit ensures that all zip-file names
that are used as parameters in methods of
ZipArchiveOperations use '/' in names
of zip-archive items.

That is due to the fact that zip enforces
the use '/' in the names of its items
(zip-APPNOTE.TXT 6.3.3 from September 1st,
2012).

Until PR datalad#409
is merged this will ensure that the proper
file names are used to check for items in
zip-archives. The commit can be reverted
once PR  datalad#409
is merged.
This commit removes duplicated code for the
sample_zip-fixture. It makes the sample_zip-
fixture avaiable through conftest.py and
re-uses it on the ZipArchiveOperations tests.
This commit uses PurePosixPath to
properly read the path of a zip-archive
on Windows in test_open()
This commit renames the file
`datalad_next/archive_operations/tests/test_zipfile_archive_operations.py`
to `datalad_next/archive_operations/tests/test_zipfile.py`. This
is better readable and in sync with the name of the tests for
`datalad_next/archive_operations/tarfile.py`
This commit adapts the ZipArchiveOperations
code to the current type of `ZipfileItem.name`,
i.e. to `PurePosixPath`.

The commit also adapts the tests for
ZipArchiveOperations to the updated types,
and aligns the tests to the test code
for TarArchiveOperations. This is in
preparation for a test code consolidation
This commit factors out the common logic for
tests on archive operations and used them in
the tests of tar-archive operations and of
zip-archive operations.

The code relies on the ZipFileDirPath to
properly work. Therefore the commit that
adds ZipFileDirPath will be cherry picked
into this branch
This commit adds the class `ZipFileDirPath`. The
class is a direct subclass of `PurePosixPath`.
Instances of the class represent directory names
in zip-archives. The difference to `PurePosixPath`
is that rendering appends a '/' to the name.

This allows us to use the generic way of testing
for containment in ZipArchiveOperations, i.e.

  if item.name in archive:

Otherwise we would have to discriminate
between directory- and file-names and manually
add a '/' to directory names, e.g. something
like this:

  if (item.name + '/' if item.type == FileSystemItemType.directory else item.name) in archive:
@christian-monch
Copy link
Contributor Author

christian-monch commented Aug 25, 2023

Reopening and adapting it to the current status of main

@codecov
Copy link

codecov bot commented Sep 28, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (0cb44b0) 92.04% compared to head (bbee58e) 91.35%.
Report is 29 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #431      +/-   ##
==========================================
- Coverage   92.04%   91.35%   -0.69%     
==========================================
  Files         122      125       +3     
  Lines        9100     9442     +342     
==========================================
+ Hits         8376     8626     +250     
- Misses        724      816      +92     
Files Coverage Δ
datalad_next/archive_operations/tests/common.py 100.00% <100.00%> (ø)
...alad_next/archive_operations/tests/test_tarfile.py 100.00% <100.00%> (ø)
...alad_next/archive_operations/tests/test_zipfile.py 100.00% <100.00%> (ø)
datalad_next/conftest.py 100.00% <100.00%> (ø)
...atalad_next/iter_collections/tests/test_iterzip.py 100.00% <100.00%> (ø)
datalad_next/archive_operations/zipfile.py 97.72% <97.72%> (ø)
datalad_next/iter_collections/zipfile.py 96.77% <85.71%> (-3.23%) ⬇️

... and 7 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mih
Copy link
Member

mih commented Dec 18, 2023

The PR is still in draft mode, hence closing it again after some months have passed. #407 seems to have most of these changes, and is not in draft mode. I will look into merging that on instead.

@mih mih closed this Dec 18, 2023
@christian-monch christian-monch deleted the enh-consolidate-archiveops-tests branch July 16, 2024 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants