Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better pattern argument naming in rustdoc #97048

Open
conradludgate opened this issue May 14, 2022 · 28 comments
Open

Better pattern argument naming in rustdoc #97048

conradludgate opened this issue May 14, 2022 · 28 comments
Labels
T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@conradludgate
Copy link
Contributor

I noticed with function pattern arguments that rustdoc makes some interesting choices regarding argument names.

For instance:

fn func(Foo(bar): Foo) {}

gets rustdoc'd into

fn func(__arg0: Foo) {}

(see
https://docs.rs/atuin-server/0.9.1/atuin_server/handlers/history/fn.calendar.html for a real life version)

It's hard to say what rustdoc should do in the general case, but there are some simple cases that could be improved.

For the single capture pattern, the argument could inherit the name. Eg, in the example above, it could be rendered as bar: Foo.

However, this single capture rule should maybe not apply if in a case like

fn func(Struct { bar, .. }: Struct) {}

Since bar might be a small subset of what the type is representing, and might be misleading (or implementation detail).

In the general case, the argument name could be a snake case version of the type name, if no conflicts. This is the best heuristic I can come up with for a sensible name

@GuillaumeGomez GuillaumeGomez added the T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. label May 20, 2022
@GuillaumeGomez
Copy link
Member

In such case, I think generating something like:

fn func(_: Foo) {}

would work out nicely enough.

@Urgau
Copy link
Member

Urgau commented May 20, 2022

In such case, I think generating something like:

fn func(_: Foo) {}

would work out nicely enough.

Except that someone might think that because it's _ it doesn't matter and isn't used at all, which is certainly not the case.

@GuillaumeGomez
Copy link
Member

I'm not sure that printing Foo(bar): or Struct { bar, .. }: would be better though. What useful information would it provide for the doc reader?

@Urgau
Copy link
Member

Urgau commented May 20, 2022

I'm not sure that printing Foo(bar): or Struct { bar, .. }: would be better though. What useful information would it provide for the doc reader?

Agree, printing Foo(bar): or Struct { bar, .. }: wouldn't be useful either.
Maybe simply printing arg0 instead of __arg0 might already be better.

@GuillaumeGomez
Copy link
Member

It's not great either... Any opinion on this @rust-lang/rustdoc?

@jsha
Copy link
Contributor

jsha commented May 20, 2022

Can we pass through exactly what was written in the source code by the author?

@GuillaumeGomez
Copy link
Member

I suppose so, but once again: would this destructuring information be useful for the doc reader?

@jsha
Copy link
Contributor

jsha commented May 20, 2022

Yes. For instance, if a destructuring only cares about a single field of a big struct, it's useful for the caller to know that only that one field matters.

Also, passing through the original improves predictability. People's general mental model of what shows up in rustdoc is "the function signature as written, plus some hyperlinks and formatting." It's confusing to deviate from that.

@GuillaumeGomez
Copy link
Member

This is a good point. My main worry was the complexification of the doc but I guess it's secondary.

@Urgau
Copy link
Member

Urgau commented May 20, 2022

Yes. For instance, if a destructuring only cares about a single field of a big struct, it's useful for the caller to know that only that one field matters.

But that's a implementation detail not a promise. The field might be private and this would be even more confusing for the reader.

@jsha
Copy link
Contributor

jsha commented May 20, 2022

This is true - but I think it's more confusing (for both the author and the reader) for the documented function signature to be different from what's written in the code.

If the author wants to make sure users can't see the destructuring, they can put it inside the function. If the author does want users to see the destructuring, they can put it in the function signature. This approach gives both predictability and flexibility for the doc author.

@GuillaumeGomez
Copy link
Member

GuillaumeGomez commented May 20, 2022

But the problem about private fields remains though. And it's quite a big one.

@jsha
Copy link
Contributor

jsha commented May 20, 2022

If the function signature includes a destructuring that names a private field, we should go ahead and list that name, with a doc lint saying "this function signature exposes the name of a private field."

@GuillaumeGomez
Copy link
Member

That seems like way too much useless information. Why would this be useful for the reader? For me it looks like explanations about what they're reading. It seems more confusing than useful.

I didn't expect this issue to be so complicated. 😄

@Urgau
Copy link
Member

Urgau commented May 20, 2022

If the function signature includes a destructuring that names a private field, we should go ahead and list that name, with a doc lint saying "this function signature exposes the name of a private field."

It isn't invalid. People should not rely on this. Doing this would change the semantics of this de-structuring from just being a sugar for the dev to something that is important and that API dev must take into account. That would be a major change.

And that doesn't even take into account the useless information for the reader:

That seems like way too much useless information. Why would this be useful for the reader? For me it looks like explanations about what they're reading. It seems more confusing than useful.

@jsha
Copy link
Contributor

jsha commented May 20, 2022

Why would this be useful for the reader? For me it looks like explanations about what they're reading. It seems more confusing than useful.

That's a decision for the doc author to make, not rustdoc. Rustdoc's role is to provide a powerful, predictable tool. The doc author's role is to structure their code so that it provides useful information when rendered by rustdoc.

Here's an analogy: the names of function parameters are not used by the caller. But we include them in rustdoc output because they are useful in giving a hint about what that parameter will do. Since doc authors know the parameter names become part of the public documentation, they don't name parameters supercalifragilisticexpialidocious, they name them something useful like user. Similarly, a doc author can choose to use, or not use, a destructuring in their public API function signatures.

@GuillaumeGomez
Copy link
Member

I understand your point, but don't agree with all of it (and also don't completely agree with your analogy as you're mixing two different things). As soon as a destructured parameter is using a private field, then it doesn't provide useful information to the reader. So unless document-private is used, I think it shouldn't be displayed.

For me rustdoc has two different kind of usage:

  • generating docs for developers working on the crate (so with document-private)
  • generating docs for developers using the crate

@Urgau
Copy link
Member

Urgau commented May 20, 2022

This could be even a security/logic issue. Example, a dev write this:

pub struct Struct {
    pub bar: u8,
    pub security_bit: u8
}

pub fn func(Struct { bar, .. }: Struct) {}

But than change it to:

pub struct Struct {
    pub bar: u8,
    pub security_bit: u8
}

pub fn func(Struct { bar, security_bit }: Struct) {}

Someone looking at the first version realize that security_bit doesn't matter but than the dev change it to the second code where it now matter. Nothing as changed from the API definition, but if rustdoc decided to show it would be relevant despite being a implementation detail.

@conradludgate
Copy link
Contributor Author

conradludgate commented May 21, 2022

Regarding the private fields, I raised an issue a year ago showing that public functions can return 'private' types. rustdoc includes the type name in that return position, just it doesn't include any link to documentation for that type. Whether a function argument destructuring contains private fields or not I think is irrelevant.

I also don't know what kind of security problems it would cause since the source code is all publicly available anyhow.

The main reason I raised this issue was that I felt that without a proper name, function arguments can be very difficult to understand. __arg0 or _ doesn't help. Path(user_id): Path or user_id: Path would help a lot more in comparison.

@Urgau
Copy link
Member

Urgau commented May 21, 2022

Regarding the private fields, I raised an issue a year ago showing that public functions can return 'private' types. rustdoc includes the type name in that return position, just it doesn't include any link to documentation for that type. Whether a function argument destructuring contains private fields or not I think is irrelevant.

I disagree showing private would not be helpful and would just add more distraction.
Concerning your example with Self as the return type, I agree that only showing Self without any other direct indication is not very helpful; rustdoc should maybe add a link to the impl<L> Router<L> part, but that's another issue.

I also don't know what kind of security problems it would cause since the source code is all publicly available anyhow.

Imaging that func does a manipulation between bar and security_bit, if security_bit was not set properly because when the programmer watch the function definition he saw that if wasn't used, it could lead to unwanted manipulation.

The main reason I raised this issue was that I felt that without a proper name, function arguments can be very difficult to understand. __arg0 or _ doesn't help. Path(user_id): Path or user_id: Path would help a lot more in comparison.

I think we all agree on this and we're trying to find the best solution for it.


As my final words on this issue I agree with @GuillaumeGomez: "So unless document-private is used, I think it shouldn't be displayed." and I don't know what should be displayed in the non document-private case.

@jsha
Copy link
Contributor

jsha commented May 22, 2022

There is a theme that comes up frequently for rustdoc: we want to produce the most useful docs for the reader. Are we doing that in spite of the doc author, or in collaboration with the doc author? In other words, should we expect that the doc author is writing their public API with an eye towards documentation? I think rustdoc has to be in collaboration with the doc author. There are some ways a doc author can write code such that we can't make good docs out of them, and the solution is to fix the code, not try to work around it in rustdoc.

The main reason I raised this issue was that I felt that without a proper name, function arguments can be very difficult to understand.

Agreed. The names of function parameters are an important part of the documentation, and they're chosen by the doc author.

When the doc author uses a destructuring in a public function signature, they haven't given us a meaningful name we can show in the docs. There few options:

  1. Rustdoc displays a meaningless name (__arg0).
  2. Rustdoc displays a meaningless name that looks more normal by repeating the type name (e.g. path: Path<String>).
  3. Rustdoc displays exactly what the doc author wrote, destructing and all.
  4. The doc author moves the destructuring inside the function body and gives a meaningful name to the parameter in the public function signature.

(4) is the only approach that can possibly yield good docs. So we should encourage doc authors to do (4). In the cases where they don't, rustdoc should do whatever is simplest (implemenation-wise, or conceptually simplest).

@conradludgate
Copy link
Contributor Author

I think I agree with this conclusion. It's just a shame this feature would be unrecommended in lib code beyond the use of something like mut foo: Foo

@Manishearth
Copy link
Member

I think rustdoc has to be in collaboration with the doc author. There are some ways a doc author can write code such that we can't make good docs out of them, and the solution is to fix the code, not try to work around it in rustdoc.

Strong agree with this btw

@notriddle
Copy link
Contributor

I'd choose between these two options for rendering Foo(bar): Foo

  • Just truncate it entirely, resulting in _: Foo
  • Just show the source code, resulting in Foo(bar): Foo

Both options are defensible. The first one is less noisy, and is probably fine given how often the type really is enough and you don't actually need a name. The second is defensible on "show the source code as written" grounds.

@ratmice
Copy link
Contributor

ratmice commented May 23, 2022

I don't think private field naming being leaked is a problem, it may happen to also be the name of a private field, but it is just a variable binding which can be seen by renaming it...

fn foo(Foo{bar: baz}: Foo) {
    baz
}

Edit: I would personally rather read pattern matches than e.g. __arg0

@Manishearth
Copy link
Member

I slightly prefer truncation because the pattern destructuring IME is more for the internal code's benefit than for the external API benefit

@passcod
Copy link
Contributor

passcod commented Oct 9, 2022

There's three contexts in which a function signature is displayed in docs:

  • freestanding functions and impl methods. In that case a pattern is internal use only, ack'ing the point that partial use could be interesting information.
  • required methods in traits. In that case a pattern is not allowed.
  • provided methods in traits. In that case, a person using the method would 'only need to know the type', but a person overriding the method would benefit from having the default usage's destructure

There's also, as mentioned briefly above, two cases for patterns in signatures:

  • complete patterns, like fn foo(Arg { a, b, c }: Arg), where the destructure is useless information for all but the internals
  • partial pattern, like fn bar(Arg { a, .. }: Arg), where there is some information that could be useful to an outside observer

I feel like there could be a design where the pattern is surfaced in some of these cases but not all.


Additionally, it's been pointed out you can influence the doc's autonaming using @:

pub fn function(_name @ Arg { field, .. }: Arg) {}

which renders to:

pub fn function(_name: Arg) {}

so the code's author can already avoid having the docs be confusing due to the autogenerated name, at the slight cost of some extra noise in the pattern.

@passcod
Copy link
Contributor

passcod commented Oct 9, 2022

There is also some minor precedent in not showing pattern details in that mut modifiers are excluded from signatures.

Tangentially, #35203 may contain interesting context to this discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants