New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement basic input validation for built-in attributes #57321

Merged
merged 2 commits into from Jan 16, 2019

Conversation

Projects
None yet
5 participants
@petrochenkov
Copy link
Contributor

petrochenkov commented Jan 4, 2019

Correct top-level shape (#[attr] vs #[attr(...)] vs #[attr = ...]) is enforced for built-in attributes, built-in attributes must also fit into the "meta-item" syntax (aka the "classic attribute syntax").

For some subset of attributes (found by crater run), errors are lowered to deprecation warnings.

NOTE: This PR previously included #57367 as well.

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 4, 2019

@bors try

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jan 4, 2019

⌛️ Trying commit 3b37f15 with merge b9139d2...

bors added a commit that referenced this pull request Jan 4, 2019

Auto merge of #57321 - petrochenkov:atokens, r=<try>
Implement basic input validation for built-in attributes + Stabilize `unrestricted_attribute_tokens`

Based on #57272

---

In accordance with the plan in https://internals.rust-lang.org/t/unrestricted-attribute-tokens-feature-status/8561:
- The correct top-level shape (`#[attr]` vs `#[attr(...)]` vs `#[attr = ...]`) is enforced for built-in attributes.
- For non-built-in non-macro attributes:
    - The key-value form is restricted to bare minimum required to support what we support on stable - unsuffixed literals (#34981).
    - Arbitrary token streams in remaining forms (`#[attr(token_stream)]`, `#[attr{token_stream}]`, `#[attr[token_stream]]` ) are now allowed without a feature gate, like in macro attributes.

This will close #55208 once completed.
Need to go through crater first.
@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 4, 2019

cc @CAD97

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jan 4, 2019

☀️ Test successful - status-travis
State: approved= try=True


#[derive] //~ WARNING empty trait list in `derive`
struct Foo;
// compile-pass

#[derive()] //~ WARNING empty trait list in `derive`

This comment has been minimized.

@Centril

Centril Jan 4, 2019

Contributor

Btw... I tried to look for the reason why this happens instead of unused_attributes but got lost... (cc #54651).

This comment has been minimized.

@petrochenkov

petrochenkov Jan 4, 2019

Contributor

This is a hardcoded warning in libsyntax/ext/derive.rs.
derive is removed during expansion currently (

let item = self.fully_configure(item)
.map_attrs(|mut attrs| { attrs.retain(|a| a.path != "derive"); attrs });
) so it doesn't survive until unused attribute checking, IIRC.

Expansion should degrade #[derive(Macro1, Macro2, ...)] into #[derive()] instead with derive attribute itself marked as used when it applies a macro.

Show resolved Hide resolved src/test/ui/proc-macro/attribute.rs Outdated
Show resolved Hide resolved src/test/ui/stability-attribute/stability-attribute-sanity-4.rs Outdated
@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jan 4, 2019

  • The correct top-level shape (#[attr] vs #[attr(...)] vs #[attr = ...]) is enforced for built-in attributes, built-in attributes also must fit into the "meta-item" syntax (aka the "classic attribute syntax").

I really enjoyed this and the improved error messages as well as the improvements to feature_gate.rs.

  • For non-built-in non-macro attributes:
    • The key-value form is restricted to bare minimum required to support what we support on stable - unsuffixed literals (#34981).
    • Arbitrary token streams in remaining forms (#[attr(token_stream)], #[attr{token_stream}], #[attr[token_stream]] ) are now allowed without a feature gate, like in macro attributes.

My understanding is that the stable grammar of attributes then are:

OuterAttr = "#" attr:Attr;
InnerAttr = "#" "!" attr:Attr;
Attr = "[" path:Path input:AttrInput "]";
AttrInput =
  | {}
  | "(" TOKEN_STREAM ")"
  | "[" TOKEN_STREAM "]"
  | "{" TOKEN_STREAM "}"
  | "=" LITERAL
  ;

Is that correct?

This will close #55208 once completed.

I have some thoughts on this:

  • The restrictions and improvements to error messages you are making I think we should make right now since they are fantastic and correct.
  • However, it seems that the more liberal attribute grammar (i.e. the stabilization) is quite a nontrivial change that, while I entirely agree with it, should probably go through an RFC.
  • But I understand that you don't enjoy writing prose so maybe T-Lang (e.g. me or some other member) or someone else can assist with the RFC authorship.
  • I do think | "=" TOKEN_TREE should be allowed at the same time as the other liberalizations take effect as I think #[foo = 1 + 1] is often more idiomatic and we should not stabilize the unidiomatic variant first and have the community rush to use that one instead of the idiomatic variant.

Need to go through crater first.

Regardless of what I've written here let's do that.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jan 4, 2019

Assigning to r? @nikomatsakis for technical review
(I did review everything and it looked great but I'm not an expert...).

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 4, 2019

@craterbot run start=master#c0bbc3927e28c22edefe6a1353b5ecc95ea9a104 end=try#b9139d2caca1db46014a9c302d5c47cfae0d8ae0 mode=check-only

@craterbot

This comment has been minimized.

Copy link
Collaborator

craterbot commented Jan 4, 2019

👌 Experiment pr-57321 created and queued.
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot

This comment has been minimized.

Copy link
Collaborator

craterbot commented Jan 4, 2019

🚧 Experiment pr-57321 is now running on agent aws-1.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 4, 2019

@Centril

My understanding is that the stable grammar of attributes then are:

Correct.

But I understand that you don't enjoy writing prose so maybe T-Lang (e.g. me or some other member) or someone else can assist with the RFC authorship.

In this case the prose is already kind of written actually, in https://internals.rust-lang.org/t/unrestricted-attribute-tokens-feature-status/8561/3.

However, it seems that the more liberal attribute grammar (i.e. the stabilization) is quite a nontrivial change that, while I entirely agree with it, should probably go through an RFC.

Hm, I feel exactly oppositely, this is the most trivial part of the proposal.
The summary of the stabilization change is that we allow inert proc macro helpers to use same syntax as proc macro attributes themselves.
That's actually already true de-facto for non-derive helper attributes, because proc macros must remove them and feature gating is done after expansion (oops), so the relaxed syntax is already usable on stable.
FCP would be enough, I think.

The restrictions and improvements to error messages you are making I think we should make right now since they are fantastic and correct.

This also needs some attention and FCP, since this is some change of direction.
Previously attribute checking was pretty relaxed in most cases (garbage input is either recognized and used or ignored), but for some attributes input was randomly checked and errors reported.
With this work the intent changes to checking the input grammar (and eventually the attribute target) for all built-in attributes.
I think this is an improvement, but who knows what others think.

I do think | "=" TOKEN_TREE should be allowed at the same time as the other liberalizations take effect as I think #[foo = 1 + 1] is often more idiomatic and we should not stabilize the unidiomatic variant first and have the community rush to use that one instead of the idiomatic variant.

I wouldn't want to delay simple and already semi-stable delimited attributes on this.
There are so many options for expanding from #[foo = LITERAL].

  • We can decide that key-value attribute form is legacy, and do not extend it.
  • We can make key-value attributes "expression-oriented" and extend the grammar to #[foo = EXPR], leaving other things like #[foo = TYPE] unsupported.
  • We can make key-value attributes "token-oriented" and extend the grammar to #[foo = TOKEN_TREE], leaving key-value attributes with multiple tokens #[foo = a b c] unsupported (including multi-token expressions).
  • We can make key-value attributes accept almost arbitrary token streams with single special token , (comma) acting as a terminator (#[foo = TOKEN_STREAM, bar = TOKEN_STREAM]).

I think this does requires an RFC and some thoughts about the future, but I wouldn't want to want to work on it myself, since it's not high priority.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jan 4, 2019

Hm, I feel exactly oppositely, this is the most trivial part of the proposal.
The summary of the stabilization change is that we allow inert proc macro helpers to use same syntax as proc macros themselves.
That's actually already true de-facto for non-derive helper attributes, because proc macros must remove them and feature gating is done after expansion (oops), so the relaxed syntax is already usable on stable.
FCP would be enough, I think.

Fair.

This also needs some attention and FCP, since this is some change of direction.

Hah; that was implied, sorry... :)
Let's wait for crater and then I'll fcp-merge.

(I propose that we do what the PR description suggests; tho I would split off the #[$path = $token_tree] form to another feature gate rather than remove it from nightly...)

I think this is an improvement, but who knows what others think.

It definitely is an improvement; the improved diagnostics are particularly great.

I wouldn't want to delay simple and already semi-stable delimited attributes on this.
There are so many options for expanding from #[foo = LITERAL].

  • We can decide that key-value attribute form is legacy, and do not extend it.
  • We can make key-value attributes "expression-oriented" and extend the grammar to #[foo = EXPR], leaving other things like #[foo = TYPE] unsupported.
  • We can make key-value attributes accept almost arbitrary token streams with single special token , (comma) acting as a terminator (#[foo = TOKEN_STREAM, bar = TOKEN_STREAM]).

I think this does requires an RFC and some thoughts about the future, but I wouldn't want to want to work on it myself, since it's not high priority.

Also fair... I might have some time to work on this, or @CAD97 can do it if they prefer (we can collaborate ofc...).

@craterbot

This comment has been minimized.

Copy link
Collaborator

craterbot commented Jan 5, 2019

🎉 Experiment pr-57321 is completed!
📊 985 regressed and 0 fixed (50551 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 5, 2019

985 regressed

O_O
I guess I should have used a forbid-by-default lint instead of an error after all.
Ok, activating plan B and downloading/grepping the logs would probably be faster than running crater again.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jan 5, 2019

I guess I should have used a forbid-by-default lint instead of an error after all.
Ok, activating plan B and downloading/grepping the logs would probably be faster than running crater again.

We might want to analyze these to categorize the regressions and see what the roots are before going for plan B... After doing that we might want to allow some new forms (and give them correct semantics rather than just ignoring...) and interpret them correctly or not based on the analysis. From a quick look I saw a lot of #[link = "dl"] originating with a single root crate.

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 5, 2019

We might want to analyze these to categorize the regressions and see what the roots

That's exactly the purpose of the plan B.
To categorize the root regressions I need to download and grep the logs :)

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jan 5, 2019

That's exactly the purpose of the plan B.
To categorize the root regressions I need to download and grep the logs :)

Oh, OK =P
Do your thing ^.^

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 5, 2019

First of all - no regressions from restricting key-value attributes to literals.

Beside that, here's the list of offending attributes with some statistics (statistics include non-root regressions):

126 #[deprecated = "Deprecated by RFC3445"]
 21 #[deprecated = "For Zone signing DNSKEY should be used"]
  3 #[deprecated = "just use Client::query"]
  2 #[deprecated = "just use query(...) from `Client`"]
 21 #[deprecated = "obsolete, transform_from instead"]
  2 #[deprecated = "use `error` instead."]
  6 #[deprecated = "Use blacklist_type instead"]
 69 #[deprecated = "Use upgrade instead"]
  6 #[deprecated = "use whitelist_function instead"]
  6 #[deprecated = "use whitelist_type instead"]
  6 #[deprecated = "use whitelist_var instead"]
 25 #[deprecated = "will be removed post 0.9.x"]
  2 #[deprecated = "You probably want the `Gcode::major_number()` and `Gcode::minor_number()` methods instead"]
  5 #[deprecated = "use SIG0 or DNSSec constructors"]
  5 #[deprecated = "will be removed post 0.9.x"]
  2 #[deprecated = "This was a failed attempt at finding a suitable abstraction. The async-codec crate might be what you need instead."]
 21 #[deprecated = "Deprecated by RFC3007"]
 84 #[deprecated = "Magically disappeared from the Vulkan specs"]
  3 #[deprecated = "Moved to spirit::utils"]
 21 #[deprecated = "obsolete, transform_from instead"]
 21 #[deprecated = "Renamed to COLOR_SPACE_SRGB_NONLINEAR_KHR"]
 18 #[deprecated = "Renamed to DEBUG_REPORT_OBJECT_TYPE_DEBUG_REPORT_CALLBACK_EXT_EXT"]
  1 #[deprecated = "see trust_dns::client::ClientFuture"]
  6 #[deprecated = "will be removed post 0.9.x, use RecordSet"]
  5 #[derive(::EnumSetType, Debug)]
  8 #[ignore(dead_code)]
  1 #[ignore(note = "compile-only test")]
  2 #[ignore = "Unimplemented: https://github.com/alexcrichton/toml-rs/pull/264#issuecomment-431707209"]
  4 #[inline = "always"]
  6 #[link]
861 #[link = "dl"]
  3 #[should_panic = "panicked safely"]
  3 #[should_panic = "the desired length"]
  3 #[should_panic = "the desired offset"]

Here's the list of root regressions:

altbitflags
appendbuf
arc-swap
async-serialization
bindgen
cvar
enumset
frunk_core
gcode
glutin
libc-extra
pcap-rs
rlbot
servo-glutin
spirit
systemfd
throw
toml
trust-dns
trust-dns-proto
vk-sys
winit
@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 5, 2019

Conclusions:

#[derive(::EnumSetType)] is a plain bug, absolute paths are not supported in meta-items, needs fixing.
There's also a suspicious case of #[doc] without an argument apparently coming from a desugared doc comment and macro (?) in pcap-rs, needs investigating.

It seems quite reasonable to support #[deprecated = "reason"] and #[ignore = "reason"] without warnings.

ignore(...), inline = "...", link, link = "...", should_panic = "..." need to be lowered from errors to deprecation lints.

#[link = "dl"] never worked, ha-ha.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jan 5, 2019

#[derive(::EnumSetType)] is a plain bug, absolute paths are not supported in meta-items, needs fixing.
There's also a suspicious case of #[doc] without an argument apparently coming from a desugared doc comment and macro (?) in pcap-rs, needs investigating.

👍

It seems quite reasonable to support #[deprecated = "reason"] and #[ignore = "reason"] without warnings.

Agreed; I suggested the same in #56896 (comment).

ignore(...), inline = "...", link, link = "...", should_panic = "..." need to be lowered from errors to deprecation lints.

re. should_panic = "..." we could maybe make that synonymous with should_panic(expected = "...") similarly to deprecated as above?

As for deprecation lints, do you mean C-future-compatibility or that we should keep this forever?

#[link = "dl"] never worked, ha-ha.

😂

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 5, 2019

?re. should_panic = "..." we could maybe make that synonymous with should_panic(expected = "...") similarly to deprecated as above?

Yes, didn't notice that. If should_panic already supports specifying the reason, then adding shortcut is ok.

As for deprecation lints, do you mean C-future-compatibility or that we should keep this forever?

What's the difference? Even if the warning ends up being kept forever, it's going to be framed as a deprecation lint anyway.

@nikomatsakis
Copy link
Contributor

nikomatsakis left a comment

Looks great! Just needs a few comments I think

Show resolved Hide resolved src/libsyntax/feature_gate.rs
Show resolved Hide resolved src/libsyntax/feature_gate.rs Outdated
Show resolved Hide resolved src/libsyntax/feature_gate.rs
Show resolved Hide resolved src/libsyntax/feature_gate.rs
Show resolved Hide resolved src/test/ui/proc-macro/attribute.stderr Outdated

@petrochenkov petrochenkov force-pushed the petrochenkov:atokens branch from ce4be55 to 2ad8cf7 Jan 12, 2019

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jan 12, 2019

Updated.

Show resolved Hide resolved src/librustc_lint/lib.rs Outdated

@petrochenkov petrochenkov force-pushed the petrochenkov:atokens branch from 2ad8cf7 to d3411d3 Jan 13, 2019

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 15, 2019

@bors r+

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jan 15, 2019

📌 Commit d3411d3 has been approved by nikomatsakis

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jan 15, 2019

@bors p=7

Rollup fairness.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jan 16, 2019

⌛️ Testing commit d3411d3 with merge ceb2512...

bors added a commit that referenced this pull request Jan 16, 2019

Auto merge of #57321 - petrochenkov:atokens, r=nikomatsakis
Implement basic input validation for built-in attributes

Correct top-level shape (`#[attr]` vs `#[attr(...)]` vs `#[attr = ...]`) is enforced for built-in attributes, built-in attributes must also fit into the "meta-item" syntax (aka the "classic attribute syntax").

For some subset of attributes (found by crater run), errors are lowered to deprecation warnings.

NOTE: This PR previously included #57367 as well.
@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jan 16, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: nikomatsakis
Pushing ceb2512 to master...

@bors bors merged commit d3411d3 into rust-lang:master Jan 16, 2019

1 check passed

homu Test successful
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment