Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why fmt must use internet to fmt code? #3811

Closed
driftluo opened this issue Sep 27, 2019 · 9 comments · Fixed by #3813
Closed

Why fmt must use internet to fmt code? #3811

driftluo opened this issue Sep 27, 2019 · 9 comments · Fixed by #3813

Comments

@driftluo
Copy link

driftluo commented Sep 27, 2019

rustfmt use cargo-metadata to get all kinds of metadata, but default use online mode and can not change to offline.

But it is well known that because of various network problems, cargo added the --offline option, metadata also supports this option, but fmt does not support.

I strongly recommend that fmt also have the offline parameter or default with offline.

fn get_cargo_metadata(
manifest_path: Option<&Path>,
include_deps: bool,
) -> Result<cargo_metadata::Metadata, io::Error> {
let mut cmd = cargo_metadata::MetadataCommand::new();
if !include_deps {
cmd.no_deps();
}
if let Some(manifest_path) = manifest_path {
cmd.manifest_path(manifest_path);
}
match cmd.exec() {
Ok(metadata) => Ok(metadata),
Err(error) => Err(io::Error::new(io::ErrorKind::Other, error.to_string())),
}
}

@calebcartwright
Copy link
Member

Sounds reasonable to me. I'll work on adding this.

FWIW, rustfmt itself doesn't use cargo_metadata, so as a work around in the short term you could use rustfmt directly instead of cargo fmt if you're having network issues

@driftluo
Copy link
Author

driftluo commented Oct 3, 2019

@calebcartwright @topecongiro I think you don't understand what a bad network state is.

I have seen the final solution of pr, but from the results, it can only solve the situation that there is no network at all.

However, in China, a bad network does not mean that there is no network. In this case, the current #3813 cannot be solved.(Unless the network cable is disconnected or WiFi is turned off before calling cargo fmt, this is very unhumanized and stupid.)

The use of cargo metadata in China is as follows:

client request send -> gfw -> server response -> gfw -- some data can't send to client -> client

Therefore, in the case of an appeal, the tcp access does not end, the client is waiting for a response, the server has sent a response message, but the message is intercepted by gfw, the client attempts to re-send, gfw continues to intercept。On Linux, default timeout seems 180 second,so client will wait util tcp shutdown(Maybe, waiting here is more than three minutes, it may have received some news, but not all), during this time, the user can't do anything.

Fallback offline #3813 is no practical help for the appeal. Metadata does not return an error immediately, but waits for the message to return, while fmt is waiting for a response from metadata.

To simulate this situation is very simple, you can use the firewall to intercept the return message or start a server, sleep for more than 3 minutes for each request. This way, you may understand why I need the default offline or provide options.


Let's talk a little bit about the security of cargo not being connected to the Internet.

I admit that under certain commands, if there is no default networking, there will be security issues, such as cargo build.

However, we must be clear that fmt is a lint tool that does not modify the source logic itself and does not output compilation results. Its function is limited to adjusting the code format.

The source file is local, it does not require any resources on the network. I think the reason why fmt calls metadata may be that it need to confirm the scope of the current project, such as workspace, and this work does not need to understand the source code of deps. I can't imagine that there will be security problems due to offline.

In summary, I think there is no problem with offline as the default choice.

@calebcartwright
Copy link
Member

calebcartwright commented Oct 3, 2019

@calebcartwright @topecongiro I think you don't understand what a bad network state is.

Hi @driftluo! Thanks for following up. First I just want to note that with any changes, it's really important to make sure that those changes don't break anything. The decision to stick with online mode as the default in #3813 was done out of caution for that very reason.

I can definitely understand how this issue is frustrating for you, and I would certainly like to help get this resolved! However, we do need to be careful that the solution for this particular use case doesn't cause different problems for other users.

IMO, using offline by default would not be ideal. The call to cargo_metadata has always used the default online mode, and AFAICT this is the first time an issue has been reported. Cargo/other cargo subcommands also use online by default, and it might be a little confusing for folks if cargo fmt did the exact opposite (regardless of the differences between the subcommands). In my experience, users typically expect to find similar behavior/options across cargo subcommands.

However, we must be clear that fmt is a lint tool that does not modify the source logic itself and does not output compilation results.

Yup, we're aware 😉

I think the reason why fmt calls metadata may be that it need to confirm the scope of the current project, such as workspace,

The cargo fmt subcommand does so to help simplify the process of running rustfmt, which is especially helpful in larger projects that have crates with multiple targets, multiple crates/workspace members, detecting editions, etc. As I mentioned in my original response, rustfmt itself does not

and this work does not need to understand the source code of deps. I can't imagine that there will be security problems due to offline.

I noted this over in #3183, but I don't know the answer to this. I don't know enough about the cargo internals to know if for example, whether running cargo fmt --all ... in offline mode in a new environment (like a CI server) would be problematic, as the dep tree is inspected when the --all flag is included.

I don't personally have any objections to switching back to adding the --offline flag that I originally submitted in #3813, though I do understand why @topecongiro wants be mindful about adding more and more CLI args.

I'd be more than happy to make any additional changes to resolve this (whether that's offline by default or adding the --offline flag), but as always I defer to @topecongiro on which path to take.

@driftluo
Copy link
Author

driftluo commented Oct 3, 2019

running cargo fmt --all in offline mode in a new environment

I don't consider this, this may be have a problem. The possibilities should be carefully considered.

offline default fallback to inline?

@upsuper
Copy link

upsuper commented Oct 3, 2019

It's indeed a bit surprised that cargo fmt requires network by default, since it isn't supposed to need any build information at all. Even cargo fmt --all mentioned above doesn't seem to need network either, since its help message mentions "workspaces", which, I suppose, means formatting all packages under the current workspaces, which is still local.

@calebcartwright, do you think we should open a separate issue for further discussion of switching the default?

I can understand your concern that we may be breaking things if we don't take care. Are you aware of some way we can have people who have more knowledge around this involve in the discussion?

@calebcartwright
Copy link
Member

@calebcartwright, do you think we should open a separate issue for further discussion of switching the default?

I think it's fine to keep the discussion on this thread, at least for the time being

It's indeed a bit surprised that cargo fmt requires network by default

The only reason cargo fmt has historically operated in online mode is because it needs to invoke cargo metadata ... commands (which it does via the cargo_metadata crate) in order to get the data it needs to pass as input to rustfmt, and cargo uses online mode by default

So the real connectivity question is about the cargo metadata ... commands. It is entirely possible (perhaps even probable) that cargo metadata ... will always work in every situation in offline mode, but I just personally don't know whether that is true 100% of the time.

IMHO (for whatever it's worth 😄), going with an approach to solve this one particular use case that would change the default behavior for every cargo fmt invocation for every user when there's still some unknowns/questions with that approach carries some degree of risk of unintended/undesireable side effects.

Even cargo fmt --all mentioned above doesn't seem to need network either, since its help message mentions "workspaces", which, I suppose, means formatting all packages under the current workspaces, which is still local

When the --all flag is included, there's a recursive stepping through the dep tree with multiple cargo metadata invocations. Again I'm not saying that cargo metadata needs network connectivity here, it was just one example of an instance where I don't know whether connectivity is needed, or what, if any, the impact of offline would be 🤷‍♂

fn get_targets_recursive(
manifest_path: Option<&Path>,
mut targets: &mut BTreeSet<Target>,
visited: &mut BTreeSet<String>,
) -> Result<(), io::Error> {
let metadata = get_cargo_metadata(manifest_path, false)?;
let metadata_with_deps = get_cargo_metadata(manifest_path, true)?;
for package in metadata.packages {
add_targets(&package.targets, &mut targets);
// Look for local dependencies.
for dependency in package.dependencies {
if dependency.source.is_some() || visited.contains(&dependency.name) {
continue;
}
let dependency_package = metadata_with_deps
.packages
.iter()
.find(|p| p.name == dependency.name && p.source.is_none());
let manifest_path = if dependency_package.is_some() {
PathBuf::from(&dependency_package.unwrap().manifest_path)
} else {
let mut package_manifest_path = PathBuf::from(&package.manifest_path);
package_manifest_path.pop();
package_manifest_path.push(&dependency.name);
package_manifest_path.push("Cargo.toml");
package_manifest_path
};
if manifest_path.exists() {
visited.insert(dependency.name);
get_targets_recursive(Some(&manifest_path), &mut targets, visited)?;
}
}
}
Ok(())
}

Are you aware of some way we can have people who have more knowledge around this involve in the discussion

I imagine the folks on the cargo team would have the most insight into whether the cargo metadata command will always work in offline mode and/or whether there'd ever be any missing data elements in the cargo metadata results when using offline mode.

All that being said, how we proceed from here is up to @topecongiro, and I'm willing to help any way I can!

@calebcartwright
Copy link
Member

calebcartwright commented Oct 3, 2019

I also went ahead and opened #3830 and #3831 with implementations of the alternative approaches to make it easier to proceed with one of those: one with offline by default with online fallback, and another that adds the --offline flag to cargo fmt to allow users to opt into offline mode.

@topecongiro
Copy link
Contributor

@driftluo Thank you for elaborating on your concern.

I checked the source code of cargo. AFAICT, cargo metadata --no-deps won't require any network connection whereas a plain cargo metadata may lead to a network connection. As in the @calebcartwright's comment, the only usage of cargo metadata without the --no-deps option is when we run cargo fmt --all.

The usage of cargo metadata without the --no-deps option was introduced in #3664. Though I haven't confirmed, judging from the comment, we can remove this usage by updating cargo_metadata. I prefer to take this route, if possible, as it does not require adding a command line option while keeping cargo-fmt completely offline.

@calebcartwright Thank you for creating #3830 and #3831. As I noted above, I prefer to try removing the usage of cargo metadata without the --no-deps first, but if that turns out to be infeasible, then I am happy to merge one of two of your PRs.

@calebcartwright
Copy link
Member

calebcartwright commented Oct 4, 2019

Though I haven't confirmed, judging from the comment, we can remove this usage by updating cargo_metadata

That would also require a patch to cargo itself wouldn't it? AFACIT the cargo_metadata crate is primarily just running cargo metadata .. commands and deserializing the command output.

We'd need cargo to be updated so that running cargo metadata --no-deps would include the manifest_path value for each dependency (or a value for source that we could use to locate the manifest) to prevent a recurrence of the bug in #3643.

I'll look into the viability of that, although I'm not sure how long it would it take to deliver such a patch for cargo and then for cargo_metadata

Edit: Removed some items I originally thought we'd need cargo to do, but that I realized we could handle in cargo fmt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants