Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow having broken symlinks in the packaged cargo archive #11203

Open
Ekleog opened this issue Oct 9, 2022 · 6 comments
Open

Allow having broken symlinks in the packaged cargo archive #11203

Ekleog opened this issue Oct 9, 2022 · 6 comments
Labels
C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-package

Comments

@Ekleog
Copy link
Contributor

Ekleog commented Oct 9, 2022

Problem

In https://github.com/Ekleog/kannader/tree/main/smtp-queue-fs, I have a res/ folder that intentionally contains a broken symlink. This folder is properly tracked by git, and used by tests to validate that smtp-queue-fs behaves correctly when faced with a queue that has a broken symlink

Proposed Solution

It would be nice if cargo were able to store the same types of files as git, so including broken symlinks. This would make it possible to upload this kind of crates with its tests on crates.io

Notes

I guess an alternative would be to not have the broken symlink be checked-in, and instead handle the test creation and validation differently. However, it makes it harder to just add tests, given that it means manual care needs to be given to each test that has a broken symlink.

I would understand if you think that'd be a non-feature and you didn't want to allow including broken symlinks in the crate on crates.io, but I thought it was worth opening this issue to gather feedback over whether it'd be of interest. Sorry if it has already been discussed elsewhere!

@Ekleog Ekleog added the C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` label Oct 9, 2022
@weihanglo
Copy link
Member

Thank you for the proposal!

I totally understand what you're trying to solve. Also have no idea why Cargo implemented like this, as the logic dates back to Cargo v0.0.1-pre😆. However, I do have concerns on this being implemented.

Cargo tries its best to make every .crate file always compile. Based on that fact, Cargo resolves all symlinks, and copies target contents over into tarball. A common use case is sharing root README and LICENSE file within a workspace, so that each member gets a real copy of that file and published to crates.io with a sense of integrity.

If Cargo now switches the behaviour to align with git, some projects will start failing as they cannot find the actual contents during compiling. If we keep the current behaviour and only treat broken symlinks as you proposed, the inconsistency sounds even more confusing, at least to me.

We may loose the restriction a bit, such like when Cargo.toml includes explicitly includes a broken symlink, just includes it and emits a warning. I am not entirely sure what it might cause if we go this route. People might find their files missing after publish and that is really unfortunate.

@Ekleog
Copy link
Contributor Author

Ekleog commented Oct 12, 2022

Hmmm… as you're saying, I guess silently changing the behavior or reusing include wouldn't make much sense.

Maybe introducing an as-symlink = ["file1", ..., "fileN"] option, that would copy the symlink verbatim without dereferencing, would make sense? And then it could error out if file* ever were to not be a symlink, or to be excluded.

@weihanglo
Copy link
Member

Hmm… It could be a solution but not optimal in my opinion. There are some concerns around it.

  • The manifest has an already-too-long list for each field. Personally, I will avoid adding more fields onto it if possible. Sorry for saying so but as-symlink seems not a field everybody needs.
  • The real file walk happens in PathSource::list_files, which then determines to walk through either git index or heuristic filesystem walk. If a symlink is in git index it may be fine to include. However, the caller doesn't know what kind of walk they did, so we need to return something telling that, and doing so like leaking some implementation details. In addition, git-walk also follows symlinks, so it may end up duplicating filter algorithm in several places.

Though I don't have any better idea than what you propose so far. 😞

@Ekleog
Copy link
Contributor Author

Ekleog commented Oct 14, 2022

Hmm you're right, but I don't see any better option that'd avoid backwards-compatibility hazards either :/ guess I'll leave this open hoping for other people to come in with ideas!

@sourcefrog
Copy link
Contributor

Would you be open to a PR that mentions in the cargo package and cargo publish docs that symlinks are followed and symlinks are never(?) included in the tarball?

@weihanglo
Copy link
Member

Would you be open to a PR that mentions in the cargo package and cargo publish docs that symlinks are followed

Yeah, I think that is appreciated! Perhaps adding one simple sentence to the second step of cargo package is good enough?

Just checked the implementation again. I believe File::open does follow symlinks, so tar sees real contents and copies over during archiving. So yeah symlinks are followed and files will be duplicated.

let mut file = File::open(&disk_path).with_context(|| {
format!("failed to open for archiving: `{}`", disk_path.display())
})?;
let metadata = file.metadata().with_context(|| {
format!("could not learn metadata for: `{}`", disk_path.display())
})?;
header.set_metadata_in_mode(&metadata, HeaderMode::Deterministic);
header.set_cksum();
ar.append_data(&mut header, &ar_path, &mut file)
.with_context(|| {
format!("could not archive source file `{}`", disk_path.display())
})?;
uncompressed_size += metadata.len() as u64;

and symlinks are never(?) included in the tarball?

I would personally avoid this kind of statement, as I may miss some details or there are workarounds to bypass how Cargo follows symlinks.

bors added a commit that referenced this issue Oct 15, 2024
docs: More information on what is and isn't included by cargo package

<!-- homu-ignore:start -->
<!--
Thanks for submitting a pull request 🎉! Here are some tips for you:

* If this is your first contribution, read "Cargo Contribution Guide" first:
  https://doc.crates.io/contrib/
* Run `cargo fmt --all` to format your code changes.
* Small commits and pull requests are always preferable and easy to review.
* If your idea is large and needs feedback from the community, read how:
  https://doc.crates.io/contrib/process/#working-on-large-features
* Cargo takes care of compatibility. Read our design principles:
  https://doc.crates.io/contrib/design.html
* When changing help text of cargo commands, follow the steps to generate docs:
  https://github.com/rust-lang/cargo/tree/master/src/doc#building-the-man-pages
* If your PR is not finished, set it as "draft" PR or add "WIP" in its title.
* It's ok to use the CI resources to test your PR, but please don't abuse them.
-->

### What does this PR try to resolve?

Add more detail to `cargo package` explaining what is and isn't included.

This doesn't add the feature requested by #11203 (shipping symlinks) but it does at least make it more clear that it's not supported.

### How should we test and review this PR?

Docs only change.

I wrote this by reading [src/cargo/sources/path.rs](https://github.com/rust-lang/cargo/blob/82c489f1c612096c2dffc48a4091d21af4b8e046/src/cargo/sources/path.rs), in addition to my observations of cargo behavior, so it might be helpful for reviewers to check the new docs against the code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-package
Projects
None yet
Development

No branches or pull requests

3 participants