Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

* Add support for reading and extracting encrypted archives (memory a… #34

Closed
wants to merge 4 commits into from

Conversation

udgover
Copy link

@udgover udgover commented Apr 19, 2019

Hello, this PR adds support to read and extract encrypted archives. Tests were successful for each libarchive.public reader.

Decrypting such archives is as simple as providing passphrase argument to functions.

@dsoprea
Copy link
Owner

dsoprea commented Apr 22, 2019

That API wasn't added until libarchive 3.1.900 (libarchive/libarchive@6c222e5) and Ubuntu didn't move past 3.1.2 until 3.2.1-2 was backported into Xenial, and, even then, it's in scope but not the default installed version. So, I updated the distribution, specifically installed the correct version of the library, and added a bunch of version and API verbosity in the CI output for quicker triaging in the future: b4f19a0

I also had to drop Python 3.3 support (since it's not available for Xenial), add support for Python 3.6 and 3.7, and fix an encoding issue with some follow-up commits. Now everything should be green.

I need you to rebase your change on top of latest upstream.

@udgover
Copy link
Author

udgover commented Apr 22, 2019

Hello,

Thanks for your feedback. I've just rebased my commit.

@coveralls
Copy link

Pull Request Test Coverage Report for Build 37

  • 18 of 25 (72.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.01%) to 70.455%

Changes Missing Coverage Covered Lines Changed/Added Lines %
libarchive/adapters/archive_read_add_passphrase.py 2 3 66.67%
libarchive/adapters/archive_read.py 6 9 66.67%
libarchive/calls/archive_read_add_passphrase.py 7 10 70.0%
Totals Coverage Status
Change from base Build 36: 0.01%
Covered Lines: 682
Relevant Lines: 968

💛 - Coveralls

3 similar comments
@coveralls
Copy link

Pull Request Test Coverage Report for Build 37

  • 18 of 25 (72.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.01%) to 70.455%

Changes Missing Coverage Covered Lines Changed/Added Lines %
libarchive/adapters/archive_read_add_passphrase.py 2 3 66.67%
libarchive/adapters/archive_read.py 6 9 66.67%
libarchive/calls/archive_read_add_passphrase.py 7 10 70.0%
Totals Coverage Status
Change from base Build 36: 0.01%
Covered Lines: 682
Relevant Lines: 968

💛 - Coveralls

@coveralls
Copy link

Pull Request Test Coverage Report for Build 37

  • 18 of 25 (72.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.01%) to 70.455%

Changes Missing Coverage Covered Lines Changed/Added Lines %
libarchive/adapters/archive_read_add_passphrase.py 2 3 66.67%
libarchive/adapters/archive_read.py 6 9 66.67%
libarchive/calls/archive_read_add_passphrase.py 7 10 70.0%
Totals Coverage Status
Change from base Build 36: 0.01%
Covered Lines: 682
Relevant Lines: 968

💛 - Coveralls

@coveralls
Copy link

Pull Request Test Coverage Report for Build 37

  • 18 of 25 (72.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.01%) to 70.455%

Changes Missing Coverage Covered Lines Changed/Added Lines %
libarchive/adapters/archive_read_add_passphrase.py 2 3 66.67%
libarchive/adapters/archive_read.py 6 9 66.67%
libarchive/calls/archive_read_add_passphrase.py 7 10 70.0%
Totals Coverage Status
Change from base Build 36: 0.01%
Covered Lines: 682
Relevant Lines: 968

💛 - Coveralls

@coveralls
Copy link

coveralls commented Apr 22, 2019

Pull Request Test Coverage Report for Build 44

  • 54 of 71 (76.06%) changed or added relevant lines in 10 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.6%) to 71.045%

Changes Missing Coverage Covered Lines Changed/Added Lines %
libarchive/adapters/archive_read_add_passphrase.py 2 3 66.67%
libarchive/adapters/archive_write_set_passphrase.py 2 3 66.67%
libarchive/adapters/archive_read.py 8 11 72.73%
libarchive/calls/archive_read_add_passphrase.py 7 10 70.0%
libarchive/calls/archive_write_set_passphrase.py 7 10 70.0%
libarchive/adapters/archive_write.py 12 18 66.67%
Totals Coverage Status
Change from base Build 36: 0.6%
Covered Lines: 1428
Relevant Lines: 2010

💛 - Coveralls

Copy link
Owner

@dsoprea dsoprea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make the changes requested, and then add the interfaces for adding a passphrase while creating/updating, and then add a unit-test that adds a passphrase during the write and then checks it by reading it with a passphrase. Put this test in test_archive_write.py and name it "test_create_file__with_passphrase". Then, duplicate it into test_archive_read.py and name it "test_enumerate_from_file__with_passphrase". Assert that exactly the right filenames are returned.

@@ -0,0 +1,14 @@
from ctypes import *
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is not necessary. It's already defined as an interface in calls/archive_read.py (where the read interfaces are all defined).

@@ -0,0 +1,5 @@
import libarchive.calls.archive_read_add_passphrase
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is not necessary. It's already integrated in adapters/archive_read.py (where higher-level read operations are integrated).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just mimicked your original code for other API call. Let me know how to handle it.

"""Return an archive enumerator from a user-defined source, using a user-
defined entry type.
"""

archive_res = _archive_read_new()

try:
r = _archive_read_add_passphrase(archive_res, passphrase)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only call this if not None.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first, I write a condition to check if None but after doing some tests, providing None to the API call does not raise an error.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments:

  • By saying "only call this if not "None", do you really mean if passphrase:?
  • 'r' is a success/failure code, but it isn't checked for failure. But I'm not sure how a failure would be handled here.
  • The call below this (which was not modified) never returns a value, so this 'r' should be omitted:
    r = _set_read_context(archive_res, format_code, filter_code)

@@ -49,6 +49,14 @@ def _archive_read_support_format_all(archive):
message = c_archive_error_string(archive)
raise libarchive.exception.ArchiveError(message)

def _archive_read_add_passphrase(archive, passphrase):
try:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encode passphrase to bytes first: passphrase.encode('utf-8')

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will cause Python 3 to error ...

@@ -12,7 +12,7 @@
import libarchive.adapters.archive_read_support_filter_all
import libarchive.adapters.archive_read_set_format
import libarchive.adapters.archive_read_support_format_all

import libarchive.adapters.archive_read_add_passphrase
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary. See below.

@udgover
Copy link
Author

udgover commented Apr 23, 2019

I will add the interfaces for adding a passphrase while creating/updating and provide some tests.

@kbalk
Copy link

kbalk commented Jun 27, 2019

The C libarchive permits multiple passphrases - could you add support for more than one passphrase, i.e., instead of passing a single passphrase, allow for a list of passphrases?

@dsoprea
Copy link
Owner

dsoprea commented Jul 1, 2019

@kbalk Interesting idea.
@udgover Bump.

@udgover
Copy link
Author

udgover commented Dec 27, 2019

Hello,

Long time no see, sorry... I've finally got some time to implement write support with archive_write_set_passphrase. It works like a charm and I just need to write tests and I'll submit the PR.

…nd file based)

* Add bytes conversion to passphrase
* Check if passphrase is not None before calling add_passphrase

* Add encryption write support for Zip files (traditional, aes128 and aes256)

* Add test for writing with set_passphrase for Zip files
@udgover udgover requested a review from dsoprea January 3, 2020 08:38
@udgover
Copy link
Author

udgover commented May 28, 2020

I'm about to clean the mess in this PR, add all necessary tests and provide examples.

Currently, I've also added support to provide several passphrases at once. I've replaced passphrase with passphrases in archive_read._enumerator function which then iterate on the provided list of passphrases and call _archive_read_add_passphrase for each element.

I wonder what is the best implementation between completely dropping the possibility to provide only one passphrase as a string and / or providing a list of passphrases.

Both solutions could also coexist but the prototype of the function would be weird:

def _enumerator(opener, entry_cls, passphrase=None, passphrases=None, format_code=None, filter_code=None):
How to handle the case where both are provided? Take into account both? Raise an error?

Thanks for your feedback.

@dsoprea
Copy link
Owner

dsoprea commented May 28, 2020 via email

@udgover
Copy link
Author

udgover commented May 29, 2020

I'm not a fan with the single parameter for which I have to check type, I could implement an object Passphrases with setter and getter but like you, I think it's over-programming. The list solution looks like to be the best one and I've already done this way for testing.

Concerning the list for encrypted archives, the first match wins.

You could provide a single parameter that accepts both a scalar or a list, but I'm not a fan of that. You could provide a single parameter that accepts an object with a "get_passphrases" method, but that's over-prohramming the solution. Better to always just require a list. That works for 0, 1, or N passphrases. Is the use-case to attempt to match one of the passphrases to the archive? The first match wins?

@udgover
Copy link
Author

udgover commented Jun 2, 2020

Closing this old messy PR to create another one based on latest commits and taking into account comments of this PR.

@udgover udgover closed this Jun 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants