Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use libarchive instead of calling out to tar #327

Merged
merged 2 commits into from Dec 1, 2017

Conversation

matthewrsj
Copy link
Contributor

Use libarchive in order to make use of its security features and avoid
calling out to tar via a shell. The TAR_COMMAND is still used in
staging.c to complete the copy when a hardlink fails.

Signed-off-by: Matthew Johnson matthew.johnson@intel.com

src/archives.c Outdated

r = archive_write_header(ext, entry);
if (r < ARCHIVE_OK) {
fprintf(stderr, "%s\n", archive_error_string(ext));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error handling looks wrong here, We get an error from archive_write_header, which we print out, skip the copying, then call archive_write_finish_entry and carry on....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it is missing a if (r < ARCHIVE_WARN) return r;

src/archives.c Outdated
}
}

archive_read_close(a);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like resource leaks whenever we do a 'return r' above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, I'll fix that.

src/archives.c Outdated
}
}

int extract(const char *filename, const char *target)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very least this need some documentation to say what the parameters are, and that passing NULL as target is valid (but magic).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I'll add a docstring

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of the name "extract", given it has two parameters, the tarfile and the directory to unpack into. Perhaps "untarto". I would expect that something called "extract" would have the name of the thing to extract as a parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'm thinking extract_to(...) then

src/archives.c Outdated
fprintf(stderr, "%s\n", archive_error_string(ext));
} else if (archive_entry_size(entry) > 0) {
r = copy_data(a, ext);
if (r < ARCHIVE_OK) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy_data already prints an error in the write case. Probably should make copy_data silent and leave this message.

@matthewrsj
Copy link
Contributor Author

Having trouble getting libarchive installed in the Travis environment.

src/archives.c Outdated
r = archive_write_header(ext, entry);
if (r < ARCHIVE_OK) {
fprintf(stderr, "%s\n", archive_error_string(ext));
} else if (archive_entry_size(entry) > 0) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this code, which skips copy_data for zero length mean we can't create zero length files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick test shows me we can create zero length files. From the man page:

The function archive_entry_size() returns the file size, if it has been set, and 0 otherwise. archive_entry_size() can be used to query that status.

When I created an archive with several empty files I was able to extract them using this code.

@matthewrsj matthewrsj force-pushed the libarchive branch 4 times, most recently from c32398b to ec14482 Compare October 31, 2017 23:17
@matthewrsj matthewrsj closed this Oct 31, 2017
@matthewrsj matthewrsj reopened this Oct 31, 2017
@matthewrsj
Copy link
Contributor Author

*closed by mistake, oops

Fixes #285

Copy link

@icarus-sparry icarus-sparry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

src/archives.c Outdated
int r;
const void *buffer;
size_t size;
la_int64_t offset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be pedantic, from the man pages the function below is defined archive_read_data_block(struct archive *, const void **buff, size_t *len, off_t *offset);
I know off_t 'cause it's a standard type, so it would make sense to match them up and make it: off_t offset IMHO, then all the types are obvious and matching.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@tmarcu
Copy link
Contributor

tmarcu commented Nov 1, 2017

+1

@matthewrsj
Copy link
Contributor Author

Resolves #280 as well.

char *outputdir;
string_or_die(&outputdir, "%s/staged", state_dir);
err = extract_to(tarfile, outputdir);
free(outputdir);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a large FIXME in the if block below that relates to using GNU tar. That should probably be updated to reflect the new use of libarchive, or removed if it's no longer relevant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, other comments addressed as well.

src/archives.c Outdated
/* set up read */
a = archive_read_new();
archive_read_support_format_all(a);
archive_read_support_filter_all(a);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two functions return int, so consider adding error handling if appropriate, or cast to void if they should be ignored.

src/archives.c Outdated
/* set up write */
ext = archive_write_disk_new();
archive_write_disk_set_options(ext, flags);
archive_write_disk_set_standard_lookup(ext);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comments for these two functions as for archive_read_*.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they can technically fail with ARCHIVE_FATAL, so might as well check that since I have no idea how often that can happen.

src/archives.c Outdated
r = archive_read_open_filename(a, tarfile, 10240);
if (r) {
/* could not open archive for read */
fprintf(stderr, "%s\n", archive_error_string(a));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe prefix the string with Error: ? Unless libarchive does that already.

src/archives.c Outdated
/* archive_check_err(ar, ret)
*
* Check and print archive errors */
static int archive_check_err(struct archive *ar, int ret)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function name confused me at first because I thought it was part of libarchive. I would prefer a different name, or at least a leading underscore.

src/archives.c Outdated
}

/* just a warning */
return 0;
Copy link
Contributor

@phmccarty phmccarty Nov 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is any special handling needed for ARCHIVE_RETRY?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, since ARCHIVE_RETRY indicates the operation failed, but is -10 while ARCHIVE_WARN is -20, so would not be caught by this.

I think we should just treat ARCHIVE_RETRY as an extraction failure rather than trying to figure out how many times we want to continue retrying for each separate operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds fine to me. Implementing extraction retries based on this status code sounds overkill.

src/archives.c Outdated
/* set which attributes we want to restore */
flags = ARCHIVE_EXTRACT_TIME;
flags |= ARCHIVE_EXTRACT_PERM;
flags |= ARCHIVE_EXTRACT_ACL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering why this flag is set. swupd doesn't handle extended ACLs (afaik).

src/archives.c Outdated
flags = ARCHIVE_EXTRACT_TIME;
flags |= ARCHIVE_EXTRACT_PERM;
flags |= ARCHIVE_EXTRACT_ACL;
flags |= ARCHIVE_EXTRACT_FFLAGS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure about this flag... doesn't look relevant for swupd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the flags the libarchive documentation (on their wiki) provided as an example to preserve file attributes. I'll look at the flags closer and modify.

src/archives.c Outdated
flags |= ARCHIVE_EXTRACT_PERM;
flags |= ARCHIVE_EXTRACT_ACL;
flags |= ARCHIVE_EXTRACT_FFLAGS;
flags |= ARCHIVE_EXTRACT_XATTR;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through the flag list, I think ARCHIVE_EXTRACT_OWNER is notably missing.

src/hash.c Outdated
string_or_die(&tar, TAR_COMMAND " -C %s/%i -xf %s/%i/Manifest.%s.tar 2> /dev/null",
state_dir, current->last_change, state_dir,
current->last_change, current->filename);
ret = extract_to(filename, state_dir);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manifest tars live in the versioned directories under statedir, not directly under statedir.

src/archives.c Outdated

/* set up read */
a = archive_read_new();
r = archive_read_support_format_all(a);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can save some CPU cycles by limiting the format to what will actually be present in the archive, i.e. tar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, you're right. I'll do that.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I thought it was deliberate that you were being flexible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the flexible part is archive_read_support_filter_all, so we can support whatever compression is sent to it.

src/archives.c Outdated
}

out:
archive_write_close(ext);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no error handling here. archive_write_close() sets deferred directory attributes, so it can still fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should we be doing if this fails? Just set r and continue trying to free the write and close and free the read? That way the calling function will see the error but we will still be attempting to clean up after ourselves.

@pohly
Copy link
Contributor

pohly commented Nov 20, 2017 via email

Copy link
Contributor

@phmccarty phmccarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the libarchive usage here, and everything looks fine to me. Just one comment regarding the new copyright header.

src/archives.c Outdated
/*
* Software Updater - client side
*
* Copyright © 2012-2017 Intel Corporation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only list 2017 for the copyright year, since it's brand new code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Use libarchive in order to make use of its security features and avoid
calling out to tar via a shell. The TAR_COMMAND is still used in
staging.c to complete the copy when a hardlink fails.

Signed-off-by: Matthew Johnson <matthew.johnson@intel.com>
Signed-off-by: Matthew Johnson <matthew.johnson@intel.com>
@matthewrsj matthewrsj merged commit 08f8614 into clearlinux:master Dec 1, 2017
@matthewrsj matthewrsj deleted the libarchive branch December 1, 2017 01:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants