Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a method to write a link name without canonicalization #274

Merged
merged 1 commit into from Dec 14, 2021

Conversation

cgwalters
Copy link
Contributor

@cgwalters cgwalters commented Dec 10, 2021

In https://github.com/ostreedev/ostree we generate a cryptographic
checksum over files and symlinks, and directories.

ostree does not currently perform any canonicalization on symlinks;
we'll respect and honor whatever bytes we're provided as input,
and replicate that on the target.

I'm using this crate to do tar serialization, which has so far worked
fine...except, I hit this corner case:

[root@cosa-devsh ~]# rpm -qf /usr/lib/systemd/systemd-sysv-install
chkconfig-1.13-2.el8.x86_64
[root@cosa-devsh ~]# ll /usr/lib/systemd/systemd-sysv-install
lrwxrwxrwx. 2 root root 24 Nov 29 18:08 /usr/lib/systemd/systemd-sysv-install -> ../../..//sbin/chkconfig
[root@cosa-devsh ~]#

But, using set_link_name to write the tarball, we end up with
the canonicalized path ../../../sbin/chkconfig - i.e. without the
double //. This breaks the checksum.

Now, I am a bit tempted to change ostree to do canonicalization. But
even if we did, I'd need to exactly match what this crate is doing.

In the end, what I think is useful here is a method which skips
the canonicalization and internal error checking and just writes
out into the tar stream what's requested.

(I may of course also try to change the rhel8 systemd package, but
that's going to take a while to propagate and this corner case isn't
the only one I'm sure)

cgwalters added a commit to cgwalters/ostree-rs-ext that referenced this pull request Dec 10, 2021
Requires: alexcrichton/tar-rs#274

And I'll just copy/paste the commit message from there, lightly edited:

In https://github.com/ostreedev/ostree we generate a cryptographic
checksum over files and symlinks, and directories.

ostree does not currently perform any canonicalization on symlinks;
we'll respect and honor whatever bytes we're provided as input,
and replicate that on the target.

We're using the Rust tar crate to do tar serialization,
which has so far worked fine...except, I hit this corner case:

```
[root@cosa-devsh ~]# rpm -qf /usr/lib/systemd/systemd-sysv-install
chkconfig-1.13-2.el8.x86_64
[root@cosa-devsh ~]# ll /usr/lib/systemd/systemd-sysv-install
lrwxrwxrwx. 2 root root 24 Nov 29 18:08 /usr/lib/systemd/systemd-sysv-install -> ../../..//sbin/chkconfig
[root@cosa-devsh ~]#
```

But, using `set_link_name` to write the tarball, we end up with
the canonicalized path `../../../sbin/chkconfig` - i.e. without the
double `//`.  This breaks the checksum.

Now, I am a bit tempted to change ostree to do canonicalization.  But
even if we did, I'd need to *exactly* match what tar-rs is doing.

(I may of course also try to change the rhel8 systemd package, but
 that's going to take a while to propagate and this corner case isn't
 the only one I'm sure)
@cgwalters
Copy link
Contributor Author

cgwalters commented Dec 10, 2021

At a higher level, there's obviously a spectrum here around:

  • Friendly high level library that provides useful errors if you e.g. try to add obviously broken things into a tarball, and cleans up paths etc.
  • Medium level library that will allow writing valid tarballs; e.g. I can use the API to exactly replicate something that e.g. GNU tar would unpack
  • Low level library that would allow writing even corrupted tarballs

I'd say probably this crate doesn't need to go to the low level, but unfortunately because tar is so pervasive and so old, I think there's a need for the "medium level" to ensure we can handle all sorts of (valid but not necessarily canonical) cases.

@cgwalters
Copy link
Contributor Author

Also, the astute reader here will note that I actually just wrote a different PR around symlinks: #273

And so the question arises - do we need to handle long and non-canonical symlink targets? As of right now, I don't need it. But it could make sense.

@alexcrichton
Copy link
Owner

Seems reasonable to me! Given that this is a raw thing though perhaps this method could take T: AsRef<[u8]> instead to be clear it's just shipping bytes from one location to another?

In https://github.com/ostreedev/ostree we generate a cryptographic
checksum over files and symlinks, and directories.

ostree does not currently perform any canonicalization on symlinks;
we'll respect and honor whatever bytes we're provided as input,
and replicate that on the target.

I'm using this crate to do tar serialization, which has so far worked
fine...except, I hit this corner case:

```
[root@cosa-devsh ~]# rpm -qf /usr/lib/systemd/systemd-sysv-install
chkconfig-1.13-2.el8.x86_64
[root@cosa-devsh ~]# ll /usr/lib/systemd/systemd-sysv-install
lrwxrwxrwx. 2 root root 24 Nov 29 18:08 /usr/lib/systemd/systemd-sysv-install -> ../../..//sbin/chkconfig
[root@cosa-devsh ~]#
```

But, using `set_link_name` to write the tarball, we end up with
the canonicalized path `../../../sbin/chkconfig` - i.e. without the
double `//`.  This breaks the checksum.

Now, I am a bit tempted to change ostree to do canonicalization.  But
even if we did, I'd need to *exactly* match what this crate is doing.

In the end, what I think is useful here is a method which skips
the canonicalization and internal error checking and just writes
out into the tar stream what's requested.

(I may of course also try to change the rhel8 systemd package, but
 that's going to take a while to propagate and this corner case isn't
 the only one I'm sure)
@cgwalters
Copy link
Contributor Author

Given that this is a raw thing though perhaps this method could take T: AsRef<[u8]> instead to be clear it's just shipping bytes from one location to another?

Good point; done!

@cgwalters
Copy link
Contributor Author

After this merges; when you have a chance a new release of this crate would be nice!

@alexcrichton alexcrichton merged commit de72a30 into alexcrichton:master Dec 14, 2021
7 checks passed
cgwalters added a commit to cgwalters/ostree-rs-ext that referenced this pull request Dec 14, 2021
Requires: alexcrichton/tar-rs#274

And I'll just copy/paste the commit message from there, lightly edited:

In https://github.com/ostreedev/ostree we generate a cryptographic
checksum over files and symlinks, and directories.

ostree does not currently perform any canonicalization on symlinks;
we'll respect and honor whatever bytes we're provided as input,
and replicate that on the target.

We're using the Rust tar crate to do tar serialization,
which has so far worked fine...except, I hit this corner case:

```
[root@cosa-devsh ~]# rpm -qf /usr/lib/systemd/systemd-sysv-install
chkconfig-1.13-2.el8.x86_64
[root@cosa-devsh ~]# ll /usr/lib/systemd/systemd-sysv-install
lrwxrwxrwx. 2 root root 24 Nov 29 18:08 /usr/lib/systemd/systemd-sysv-install -> ../../..//sbin/chkconfig
[root@cosa-devsh ~]#
```

But, using `set_link_name` to write the tarball, we end up with
the canonicalized path `../../../sbin/chkconfig` - i.e. without the
double `//`.  This breaks the checksum.

Now, I am a bit tempted to change ostree to do canonicalization.  But
even if we did, I'd need to *exactly* match what tar-rs is doing.

(I may of course also try to change the rhel8 systemd package, but
 that's going to take a while to propagate and this corner case isn't
 the only one I'm sure)
cgwalters added a commit to cgwalters/ostree-rs-ext that referenced this pull request Dec 14, 2021
Requires: alexcrichton/tar-rs#274

And I'll just copy/paste the commit message from there, lightly edited:

In https://github.com/ostreedev/ostree we generate a cryptographic
checksum over files and symlinks, and directories.

ostree does not currently perform any canonicalization on symlinks;
we'll respect and honor whatever bytes we're provided as input,
and replicate that on the target.

We're using the Rust tar crate to do tar serialization,
which has so far worked fine...except, I hit this corner case:

```
[root@cosa-devsh ~]# rpm -qf /usr/lib/systemd/systemd-sysv-install
chkconfig-1.13-2.el8.x86_64
[root@cosa-devsh ~]# ll /usr/lib/systemd/systemd-sysv-install
lrwxrwxrwx. 2 root root 24 Nov 29 18:08 /usr/lib/systemd/systemd-sysv-install -> ../../..//sbin/chkconfig
[root@cosa-devsh ~]#
```

But, using `set_link_name` to write the tarball, we end up with
the canonicalized path `../../../sbin/chkconfig` - i.e. without the
double `//`.  This breaks the checksum.

Now, I am a bit tempted to change ostree to do canonicalization.  But
even if we did, I'd need to *exactly* match what tar-rs is doing.

(I may of course also try to change the rhel8 systemd package, but
 that's going to take a while to propagate and this corner case isn't
 the only one I'm sure)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants