`#[doc(hidden)]` result in false positives #120

mqudsi · 2022-09-01T19:17:41Z

Steps to reproduce the bug with the above code

It is possible to use #[doc(hidden)] to implement seemingly-breaking changes without actually breaking semver compatibility.

Since cargo-semver-checks only uses the json rustdoc output to form its analysis, the claim that cargo-semver-checks has no false positives should probably have this shortcoming noted in the readme.

e.g. these two versions of the same code below are technically semver compatible:

pub struct Foo {}
pub struct Bar {}

#[doc(hidden)]
pub struct Foo {}
pub struct Bar {}

as merely hiding a type (or as is more often, a method) from the documentation doesn't break the semver contract.

or, for a more interesting case, the following:

pub struct Fo {}

pub struct Foo {}
// The previous release included a typo in the type name so the following is kept for backwards-compatibility:
#[doc(hidden)]
pub type Fo = Foo;

There is a lot more dark magic that can done that would break docs but preserve backwards compatibility. Relying solely on the docs is a great and efficient way of implementing straightforward semver breakage cases, but I don't think it's fair to claim that any tool using this approach has no false positives.

Actual Behaviour

A breaking change is reported.

Expected Behaviour

No breaking change should be reported (though I realize this is fundamentally not possible for this crate as it is written).
As such, the "no false positives" claim should be removed or at least the caveats thoroughly mentioned.

Additional Context

No response

Debug Output

No response

The text was updated successfully, but these errors were encountered:

obi1kenobi · 2022-09-01T21:43:26Z

Thanks for reporting. I don't think it's fundamentally impossible to fix. We just need more data from rustdoc and then to tweak some queries to act on that data appropriately.

Action items:

Suitably update README wording on false positives. For cargo-semver-checks, false positives are bugs — we avoid them as best we can and squash them when we find them, but few things are truly 100% bug-free and we're all doing our best here.
We currently run rustdoc with --document-private-items, and we should also run it with --document-hidden-items as well to get the additional data we need (ref: Distinguish between private items and hidden items in rustdoc rust-lang/rust#67875 and Add --document-hidden-items flag to cargo doc rust-lang/cargo#7766 )
For the various deleted / renamed / moved lints, we should only check items that were not hidden in the baseline, and match them regardless of whether they are hidden in the current version. Should also add test cases for these behaviors.
For all other lints (I believe, let's double-check), we should filter only to items that were not, and continue to not be, hidden. Also need test cases.
For future semver-minor lints, we should consider how #[doc(hidden)] interacts with "new functionality added to the public API" kinds of lints. We can afford to punt this for a little while since we still have plenty of semver-major lints to write first.

Related to #58 since users may occasionally want to explicitly include a #[doc(hidden)] item for checking, or exempt non-hidden items from checking.

epage · 2022-09-02T00:01:13Z

We currently run rustdoc with --document-private-items, and we should also run it with --document-hidden-items as well to get the additional data we need

I don't even remember why we do this. Can we get rid of private items?

obi1kenobi · 2022-09-02T01:34:47Z

I don't even remember why we do this. Can we get rid of private items?

TL;DR: We can, but I'd wait a bit first just to be safe.

It's because of a bug in rustdoc that caused a bug in cargo-semver-checks: #32

The bug has since been fixed, but I don't remember whether the fix is before or after the last rustdoc JSON format was introduced. If it was after the last format change, then removing private items is equivalent to tightening the minimum nightly version we support, which might be a bit more trouble than it's worth right now. The next time we need a newer nightly bound, we should remove it then.

obi1kenobi · 2022-12-16T23:42:24Z

Update: --document-private-items is necessary for one additional reason at the moment. It lets us determine the existence of private fields in structs, which affects whether they are externally-constructible, which is relevant when determining e.g. whether adding #[non_exhaustive] is a breaking change or not. For structs, it's only breaking if the struct was previously externally-constructible.

We could work around this requirement if need be, but that additional effort at the moment is perhaps better spent elsewhere.

aDotInTheVoid · 2023-01-07T09:46:29Z

It lets us determine the existence of private fields in structs.

We could work around this requirement if need be

This is reported directly in fields_stripped, so theirs a relativly easy "native" way to do it.

Bromeon · 2023-01-17T22:05:37Z

#[doc(hidden)] is often used precisely to exclude something from the public API, i.e. exclude it from semantic versioning guarantees. Many times, #[doc(hidden)] is the only option, when there is no technical alternative in Rust (e.g. macros polluting the top-level namespace, internal dependencies between crates, generated code, etc.)

As such, it seems wrong to me that changes in hidden items are detected as "breaking semver".
Shouldn't hidden items be treated as non-existing from a public API perspective?

As such, I disagree with this conclusion:

e.g. these two versions of the same code below are technically semver compatible:
pub struct Foo {}
pub struct Bar {}
#[doc(hidden)]
pub struct Foo {}
pub struct Bar {}
as merely hiding a type (or as is more often, a method) from the documentation doesn't break the semver contract.

Foo was part of the public API and is now removed from it. This is a semver-breaking change in my opinion.

If you want to communicate that a previously existing symbol should no longer be used, #[deprecated] is the better choice.

Deprecating is better than hiding, as it clearly communicates the migration path to the user and doesn't let them wonder why a symbol seemingly disappeared but still works.
This is btw exactly what should be done in the Fo/Foo typo example.

If on the other hand the presence of the symbol is a deal-breaking bug, the symbol should be removed, and the next semver-incompatible version should be released.

Leaving it accessible but hidden does not prevent any harm. A compiler will not inform about usage of hidden symbols (as opposed to deprecated symbols or straight compile errors).

Some popular crates follow this understanding, for example syn.

This view is further backed by the Rust API guidelines:

Rustdoc is supposed to include everything users need to use the crate fully and nothing more. It is fine to explain relevant implementation details in prose but they should not be real entries in the documentation.

Especially be selective about which impls are visible in rustdoc -- all the ones that users would need for using the crate fully, but no others. In the following code the rustdoc of PublicError by default would show the From<PrivateError> impl. We choose to hide it with #[doc(hidden)] because users can never have a PrivateError in their code so this impl would never be relevant to them.

obi1kenobi · 2023-02-04T21:43:15Z

Another nuance, posted without prejudice: some clap versions in the 3.x family include items and renames that are deprecated and doc-hidden, presumably with the intention of allowing code that depends on the old names to continue working while encouraging users to use the new names: https://github.com/clap-rs/clap/blob/v3.2.0/src/lib.rs#L65-L68

If we stopped using --document-hidden-items then the above would be reported as a removal and a breaking change, and that crate's maintainers would have a reasonable argument that the error is false-positive as no code would actually be broken.

It's clear that this issue is both important and not straightforward to solve. More research is needed to come up with a good solution.

davidhewitt · 2023-03-04T10:40:24Z

If there are different usages of #[doc(hidden)], perhaps there may need to be some configuration to help the crate authors indicate what is acceptable in their case?

PyO3, for example, has a lot of #[doc(hidden)] items exported from the pyo3::impl_ module. This module is explicitly documented as implementation details which are semver exempt.

It might be good enough for PyO3's case to have an option --allow-semver-exempt=pyo3::impl_ or something like this, which the crate authors could use to specify semver-exempt namespaces.

obi1kenobi · 2023-03-04T15:53:18Z

Thanks for the suggestion. I'd love to flesh this out a bit more and understand how a few edge cases here should behave:

Say a type Foo was defined in pyo3::impl_ and then re-exported by another module which was not marked as semver-exempt. The re-export is not #[doc(hidden)]. Is Foo semver-exempt or not?
Say that pyo3::impl_::Foo type is re-exported by another crate, and the re-export is not #[doc(hidden)]. Is Foo semver-exempt or not?
Say that pyo3::impl_::Foo type is used in a private field in struct pyo3::Bar, which is not semver-exempt. Both pyo3::impl_::Foo and pyo3::Bar used to be Send + Sync. As of the most recent version, the semver-exempt pyo3::impl_::Foo stopped being Send + Sync. However, this made pyo3::Bar not Send + Sync any more either. Should this be reported as a semver problem?

Edge cases like these are tricky and difficult to reason about. I'd love your help in getting them right!

adamreichold · 2023-03-04T16:03:39Z

Is Foo semver-exempt or not?

At least for our use case, the re-export itself is most likely an error and we would benefit from it being reported, independent of any changes. As a corollary, changes could be reported as well.

Is Foo semver-exempt or not?

I think just publically exposing these items outside of the crate and allowing to list them could be reported as a potential semver hazard. Thereby, I would say changes can be reported as this is strictly less of a requirement.

Should this be reported as a semver problem?

Definitely because it affects types we expect our uses to access.

adamreichold · 2023-03-04T16:12:22Z

Maybe as proposal for a mechanistic definition: Allow listing a path prefix means that these items are completely removed before any checks are performed. Anything publically visible referencing them (and not removed itself in the previous step) is an error by itself.

obi1kenobi · 2023-03-04T22:24:58Z

Ultimately this will be some form of query tweak equivalent to a mechanistic definition, since all the lints are written as queries. Two things we have to be careful of, though:

Items exempt from semver are allowed to fulfill semver obligations (i.e. prevent semver violations). They just aren't allowed to break semver obligations and cause violations.
The item should remain in existence but any import paths in semver-exempt modules should be filtered out subject to the above constraint. This is so that e.g. moving an item into a semver-exempt module but re-exporting it in its original non-semver-exempt location correctly tracks that item as subject to semver due to the re-export.

Probing more edge cases, imagine we have a type defined in a semver-exempt module and not re-exported elsewhere.

Are there cases where some of the methods on that type are not semver-exempt?
Are the traits that type implements ever not semver-exempt?

I'm worried about situations like macro code producing something the user is expected to use or observe directly in some limited fashion.

obi1kenobi · 2023-04-05T00:30:00Z

An observation I made recently gave me an idea for solving this problem.

#[doc(hidden)]-ness is a property of the import path.

Consider the following situation:

#[doc(hidden)]
pub mod inner {
    pub struct Foo;

    pub struct Bar;
}

pub use inner::Foo;

Here, Bar is #[doc(hidden)] but Foo is not. crate::Foo is a valid import path for Foo that does not go through any #[doc(hidden)] items, whereas Bar is only accessible as crate::inner::Bar which is #[doc(hidden)].

We already have infrastructure in place for computing all possible import paths. This was a hard problem to solve, but we've solved it — and with a slight extension we can reuse that solution to solve #[doc(hidden)] too.

Proposal

I propose we add a new edge to the Importable interface to signify importable paths that are not #[doc(hidden)]. Name subject to bike-shedding (please share your suggestions!), tentatively calling it non_hidden_importable_path.

Then we can adjust queries to use non_hidden_importable_path in the place of importable_path in appropriate places. For example:

Removing a function (or its re-export) is major+breaking only if the function previously had a non_hidden_importable_path.
Adding a variant to an exhaustive enum is major+breaking only if the enum previously had a non_hidden_importable_path.

We must consider two more factors: deprecations, and transitions from non-hidden to hidden and vice versa.

Handling deprecations

When maintainers deprecate items, they also make them #[doc(hidden)] to discourage their use. This is a minor change, and must not be reported as major+breaking.

However, such hidden + deprecated items may still be in use by crates in the ecosystem, so the APIs should still be considered non-hidden for semver-checking purposes.

Perhaps we should implement this along the lines of "deprecation undoes #[doc(hidden)]" in the importable path computations?

Handling transitions between hidden and non-hidden status

Items that were initially added as #[doc(hidden)] and have remained thus are exempt from semver-checking.

A hidden item becoming non-hidden should be a minor change, as if the item was just newly added.

A non-hidden item becoming hidden without being deprecated must be reported as major+breaking. Here's why: say we don't do this, then we can have a situation where v1.0 adds a non-hidden pub API, v1.1 hides part of that pub API, then v1.2 deletes the hidden portion of the API. This is a breaking change between v1.0 and v1.2 — but where could we have detected this? The only sensible place to detect and report this change is v1.1, since the v1.1 -> v1.2 change is just a deletion of a #[doc(hidden)] type and we already said this isn't major+breaking.

davidhewitt · 2023-04-18T07:14:01Z

The above proposal sounds correct to me; I can't think of any uses in PyO3 which would not fit in the above.

obi1kenobi · 2023-04-19T14:32:09Z

@tonowak pointed out another edge case to consider: auto-traits in #[doc(hidden)] items.

Normally it's fine for macro internals to be hidden and not public API even though their types are technically pub.

But consider a use case like the ouroboros crate where those macros generate new types containing their own types. The auto-traits on the macro-internal types affect the auto-traits on the user's generated type. If the user's generated type today is Send / Sync / UnwindSafe, a change in the macro-internal types can break it even though the changed type was not part of the public API.

Are we stuck picking between "always report auto-trait changes on hidden items, and suffer false-positives" and "never report auto-trait changes on hidden items, and suffer false-negatives"? Is there a better solution?

obi1kenobi · 2023-10-30T14:46:04Z

Another interesting edge case: if a non-sealed trait has a #[doc(hidden)] method without a default implementation, that method is still public API despite the #[doc(hidden)] since downstream implementors have to implement it, and may be broken otherwise.

If the #[doc(hidden)] method has a default impl but isn't itself sealed,[0] it could be argued that it isn't public API ("users don't have to implement it") or that it is ("users may choose to implement it anyway"). I'm currently leaning toward the former, but I need to think about this some more.

Brought up here: https://hachyderm.io/@jsbarretto@social.coop/111323903224194053

[0]: My post "A definitive guide to sealed traits in Rust" shows how to make a trait method not allowed to be called or overridden outside of the defining crate. If that method has a default impl, then the trait itself is not sealed because of it but that method's default impl may not be overridden. In such a case, #[doc(hidden)] on that method definitely makes it not public API, since it arguably already wasn't part of the public API anyway — it couldn't be called or overridden outside of its defining crate.

epage · 2023-10-30T15:13:43Z

I feel like doc(hidden) should be a tool of last resort for indicating something is private and, where possible, sealing traits or trait methods should be used.

obi1kenobi · 2023-10-30T15:15:41Z

FWIW I agree. The linked thread has more info, but TL;DR there was a concern about the quality of error messages being degraded for some reason if the trait is sealed. It might be worth filing a diagnostics issue on rustc for that as well.

adamreichold · 2023-10-30T15:21:16Z

I feel like doc(hidden) should be a tool of last resort for indicating something is private and, where possible, sealing traits or trait methods should be used.

The main use case for PyO3 is types which are supposed to be implemented, i.e. cannot be sealed, but never manually, i.e. only via proc-macro generated code.

obi1kenobi · 2023-11-15T18:19:22Z

Resolved by #576. Headline feature of cargo-semver-checks v0.25, planned for release shortly after Rust 1.74.

Official announcement blog post coming soon on my blog: https://predr.ag/blog

Thanks to everyone for your patience! Thanks to @davidhewitt in particular for bouncing ideas and prototypes back and forth until we got to something viable.

Also thanks to everyone that sponsors my work — the best way to help cargo-semver-checks progress faster is to sponsor its development, which helps me dedicate more time to it instead of doing consulting gigs to pay my rent.

mqudsi added the C-bug Category: doesn't meet expectations label Sep 1, 2022

mqudsi changed the title ~~doc(hidden) types result in false positives~~ #[doc(hidden)] result in false positives Sep 1, 2022

davidhewitt mentioned this issue Mar 4, 2023

Enable cargo-semver-checks to catch accidental breakage of backwards compatibility. PyO3/pyo3#2984

Closed

This was referenced Aug 29, 2023

Implement properties around participation in public APIs on importable paths and items. obi1kenobi/trustfall-rustdoc-adapter#260

Merged

add item_removed_from_public_api lint #525

Closed

obi1kenobi mentioned this issue Sep 1, 2023

Lint: Changing the type of a pub struct's pub field #148

Open

joshlf mentioned this issue Oct 4, 2023

[macro_util] Rename from derive_util, include more google/zerocopy#455

Merged

obi1kenobi closed this as completed Nov 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`#[doc(hidden)]` result in false positives #120

`#[doc(hidden)]` result in false positives #120

mqudsi commented Sep 1, 2022 •

edited

obi1kenobi commented Sep 1, 2022 •

edited

epage commented Sep 2, 2022

obi1kenobi commented Sep 2, 2022

obi1kenobi commented Dec 16, 2022

aDotInTheVoid commented Jan 7, 2023

Bromeon commented Jan 17, 2023

obi1kenobi commented Feb 4, 2023 •

edited

davidhewitt commented Mar 4, 2023

obi1kenobi commented Mar 4, 2023

adamreichold commented Mar 4, 2023 •

edited

adamreichold commented Mar 4, 2023 •

edited

obi1kenobi commented Mar 4, 2023

obi1kenobi commented Apr 5, 2023

davidhewitt commented Apr 18, 2023

obi1kenobi commented Apr 19, 2023

obi1kenobi commented Oct 30, 2023 •

edited

epage commented Oct 30, 2023

obi1kenobi commented Oct 30, 2023

adamreichold commented Oct 30, 2023

obi1kenobi commented Nov 15, 2023

#[doc(hidden)] result in false positives #120

#[doc(hidden)] result in false positives #120

Comments

mqudsi commented Sep 1, 2022 • edited

Steps to reproduce the bug with the above code

Actual Behaviour

Expected Behaviour

Additional Context

Debug Output

obi1kenobi commented Sep 1, 2022 • edited

epage commented Sep 2, 2022

obi1kenobi commented Sep 2, 2022

obi1kenobi commented Dec 16, 2022

aDotInTheVoid commented Jan 7, 2023

Bromeon commented Jan 17, 2023

obi1kenobi commented Feb 4, 2023 • edited

davidhewitt commented Mar 4, 2023

obi1kenobi commented Mar 4, 2023

adamreichold commented Mar 4, 2023 • edited

adamreichold commented Mar 4, 2023 • edited

obi1kenobi commented Mar 4, 2023

obi1kenobi commented Apr 5, 2023

Proposal

Handling deprecations

Handling transitions between hidden and non-hidden status

davidhewitt commented Apr 18, 2023

obi1kenobi commented Apr 19, 2023

obi1kenobi commented Oct 30, 2023 • edited

epage commented Oct 30, 2023

obi1kenobi commented Oct 30, 2023

adamreichold commented Oct 30, 2023

obi1kenobi commented Nov 15, 2023

`#[doc(hidden)]` result in false positives #120

`#[doc(hidden)]` result in false positives #120

mqudsi commented Sep 1, 2022 •

edited

obi1kenobi commented Sep 1, 2022 •

edited

obi1kenobi commented Feb 4, 2023 •

edited

adamreichold commented Mar 4, 2023 •

edited

adamreichold commented Mar 4, 2023 •

edited

obi1kenobi commented Oct 30, 2023 •

edited