Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework warning 53 (misplaced attributes) to work on all attributes in all places #12451

Merged
merged 13 commits into from
Nov 10, 2023

Conversation

ccasin
Copy link
Contributor

@ccasin ccasin commented Aug 1, 2023

This PR reworks warning 53 (misplaced attribute) so that it works for all attributes and is issued in a uniform way.

Presently, there are many attributes known to the compiler for which this warning is never issued (even though there are many places one can put them that they will be ignored). Those for which the warning can be issued are handled by just looking for them in specific places we think someone might mistakenly write them. The goal of this PR is to fix that situation.

The main idea is to take an approach like ppxlib: When the parser sees an attribute, it adds it to a table. When the compiler uses the attribute, it marks it as used by removing it from the table. When compilation is done, we check whether any attributes remain in the table, and issue warnings for them.

We only track known compiler built-in attributes for the warning. There is now a list of such attributes in Builtin_attributes. While most ppxes now remove their attributes after processing them, not all do, and other attributes may be in the program for various reasons. To deal with the problem that the compiler can't know whether non-built-in attributes are misplaced, we only track those attributes which appear in the list of built-ins.

When we deployed this within Jane Street, we found more than 200 misplaced attributes in our code base. And this PR even cleans up a few misplaced attributes in the stdlib. So, I believe this is a useful addition.

Reviewing

This PR isn't massive, but I've broken it into several commits to create an opportunity to explain a few subtleties:

Commit 1: Add attr tracking mechanism, but don't whitelist any attrs yet

This adds all the new attribute tracking code, but leaves the list of tracked attributes empty and the old w53 stuff in place. So all the code to add attributes to the table is there and fires, but no attributes are on the whitelist so they aren't added.

A couple notes about when attributes are added to the table:

When to add attributes: It's not sufficient to just add attributes when they are parsed in the parser, because not all parsetrees come from the parser. The -pp flag will read a marshalled parsetree instead, and ppxes may be used to modify the parsetree.

We want to track only attributes in the final version of the parsetree. We can accomplish this by adding attributes to the table in two cases:

  • If no -ppx or -pp flags are passed, the parser will be run and the parsetree won't be modified, so we add attributes during parsing.
  • If either of those flags is passed, the Ast_invariants check will be run on the final parsetree, so we add attributes during that check instead.

Attributes in attributes: Attribute payloads can contain more attributes. This happens in practice, if rarely.

These inner attributes should not be tracked for w53 - we can't know the meaning of the outer attribute, so we have no way to know what to do with them and whether or not they were used appropriately.

If we're adding attributes during the Ast_invariants check, that's easy - just update the iterator when descending into an attribute payload. In the parser, it's harder. I've done the dumb thing of just removing any nested attributes from the table after parsing an attribute. This wastes a little effort, but attribute payloads are typically small and nested attributes are rare, so it shouldn't be too bad. This strategy seemed much simpler and easier to review than trying to thread some state through the parser, but I'm open to other suggestions.

Commit 2: Rework w53 tests

This replaces the w53 test with a new one that tests all compiler built-in attributes. For each attribute, I tried to include examples of where it is allowed to appear and where it is not and w53 is issued (if such a place exists). The corresponding reference file is updated over the next two commits as attributes are added to the whitelist.

Commit 3: Handle the basic attributes

This adds most attributes for the whitelist, leaving off those related to warnings (which need to be handled a little differently).

The bulk of the diff here is in lambda/translattribute.ml. There are several related changes:

  • We should use the accessors in Builtin_attributes to look for specific attributes, ensuring they are marked as used.
  • A few functions, like add_local_attribute, are slightly rearranged so that attributes aren't checked for until we're sure we're in a context where they can be used (since the act of looking for the attribute marks it as used).
  • It's no longer necessary to remove these attributes from the terms when they are used. The old implementation of w53 for those attributes, which relied on checking whether attributes remained on terms after translation, is removed.
  • Care must be taken to mark an attribute as used even if we're compiling with a configuration that does not use it. E.g., the unrolled attribute only has an effect on the program if compiling with flambda, but should not issue w53 when compiling with closure.

This commit fixes a bug in find_attribute, I think: It used to be that if an attribute was written twice on the same term (e.g., [@inline] [@inline]), it was removed with a warning. Now one of them is used, with a warning.

It also removes a few misplaced [@inline] attributes in the stdlib (these attributes do nothing in mli files).

Commit 5: Handle attributes like "warning" and "alert"

Attributes like [@alert] are "used" by placing them in the environment, where they may be noticed later. So this commit adds code to env.ml to mark such attributes used when they are added to the environment as part of some form that is allowed to use them. The special case of a file-level [@@@alert] is handled in typemod.ml.

In the stdlib there is an [@@@alert] attribute in the hashtable template file which is particularly annoying: That template is included in one place at the top level of a file (so the attribute is legal) and another place inside a module (where the attribute has no effect). I've dealt with this by using warning attributes to disable w53 here.

Commit 6: [@ppwarning] with a bad payload should give w47, not w53

This just fixes a minor issue I noticed in the final state of this PR. [@ppwarning] with a bad payload used to be silently ignored, but after this PR it would issue w53. But warning 47 (bad attribute payload) is more appropriate.

@ccasin
Copy link
Contributor Author

ccasin commented Aug 1, 2023

There is a problem here with the stdlib template mechanism (hashtbl.template.mli has an attribute that works fine in hashtbl.mli but is incorrect when that signature is inlined in moreLabels.mli, because [@@@alert] only works at the file level).

I don't know much about the stdlib template mechanism, so I'm not immediately sure what the right fix is. Happy for suggestions, or I'll look closer tomorrow.

(Now fixed, see below)

@ccasin ccasin force-pushed the warning-53 branch 2 times, most recently from 1593c39 to 4d037f1 Compare August 5, 2023 17:30
@ccasin
Copy link
Contributor Author

ccasin commented Aug 5, 2023

OK, I've (hopefully) fixed the stdlib / templates issue by making warning 53 for top-level alerts controllable with warning attributes, and turning it off around the relevant attribute in the template file. It would be nice to make it fully controllable with warning attributes, but that seems a bit tricky and best left for a subsequent PR. Though I'm open to other suggestions regarding the template file issue.

I've rebased, adjusted the history, and force pushed, putting this fix as part of the commit that dealt with alert attributes and cleaning the bootstrapping into its own commits. If anyone wants to see just the fix added for the template issue, I put it here. I've adjusted the original PR description to reflect this.

@Octachron Octachron self-requested a review September 20, 2023 13:10
@Octachron Octachron self-assigned this Sep 20, 2023
Copy link
Member

@Octachron Octachron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some minor remarks on the Builtin_attributes interface with respect to the ocaml namespace for attributes, but otherwise that PRs feels like a clear improvement to me.

Marking built-in attributes as we use them should be far less error prone than trying to remember to check if some builtin attributes were left unused somewhere or somewhen.

parsing/builtin_attributes.ml Show resolved Hide resolved
; "warnerror"; "ocaml.warnerror"
; "warning"; "ocaml.warning"
; "warn_on_literal_pattern"; "ocaml.warn_on_literal_pattern"
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than listing each built-in in its prefixed and non-prefixed form, I propose to move the prefixing logic to the lookup functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made this change, and the similar suggested changes below. This resulted in some additional simplifications (various functions that used to take lists can now just take a single attribute name).

I was on the fence, after reading the suggestion, because a consequence of this move is that we'll needlessly allocate additional strings when we strip ocaml. prefixes. However, after seeing the simplifications, I'm happy with the change. (Also, for small programs or programs with no attributes, this is probably just a win.)

@@ -67,6 +133,40 @@ let error_of_extension ext =
| ({txt; loc}, _) ->
Location.errorf ~loc "Uninterpreted extension '%s'." txt

let mark_alert_used a =
match a.attr_name.txt with
| "ocaml.deprecated"|"deprecated"|"ocaml.alert"|"alert" ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, do we want to add an unprefix function that removes the ocaml. prefix to avoid the duplication here?

@@ -243,34 +351,36 @@ let warning_scope ?ppwarning attrs f =
Warnings.restore prev;
raise exn


let warn_on_literal_pattern =
let has_attribute nms attrs =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, all arguments of has_attribute have the form ["ocaml." ^ attribute, attribute] which suggest to move the namespacing logic inside the has_attribute function.

parsing/builtin_attributes.mli Show resolved Hide resolved
parsing/builtin_attributes.ml Show resolved Hide resolved
parsing/builtin_attributes.mli Outdated Show resolved Hide resolved
parsing/builtin_attributes.ml Outdated Show resolved Hide resolved
parsing/builtin_attributes.ml Outdated Show resolved Hide resolved
parsing/builtin_attributes.mli Show resolved Hide resolved
@ccasin
Copy link
Contributor Author

ccasin commented Oct 29, 2023

@Octachron Thanks very much for the thoughtful review, and apologies for the delay in getting back to this. I've responded to a couple points above. I meant to rebase and integrate the review feedback today as well, but got distracted by tracking down #12697 and am now out of time. I will return later in the week to integrate your suggestions.

ccasin and others added 12 commits October 31, 2023 13:52
The reference file is not correct yet - it will be updated in subsequent commits
as we track attributes with the new mechanism.
Needs a bootstrap due to removing misplaced attrs in the stdlib.  That's done in
next commit.
This also makes w53 controllable with warning attributes, but only for top-level
alerts.  This is used to handle the fact that in the stdlib there's a template
file with a top-level alert attribute, and it's included in two different
places, one where that attribute is legal, and one where it is not

A boostrap is needed, and performed in the next commit.
This also simplifies the types of several functions, which no longer need to
take lists.
@ccasin
Copy link
Contributor Author

ccasin commented Oct 31, 2023

@Octachron I believe I have addressed all your comments above with four new commits, and this is ready for further review. I've also rebased it on to the latest trunk. Thanks again!

if String.starts_with ~prefix:"ocaml." s && len > 6 then
String.sub s 6 (len - 6)
else
s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it seems that in a handful of cases, we could avoid the allocation by having a attr_equal function rather than using the String.equal (drop_ocaml_attr_prefix x) y pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea - I've made this change, leaving only the allocation in is_builtin_attr. This seems like a fine compromise.

end

module TestDeprecatedStruct = struct (* CJC XXX THIS IS ALL BUGGY *)
let x = 5 [@deprecated] (* rejected *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is buggy in this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sorry, nice catch! This was a stray comment from an intermediate state, now deleted.

Copy link
Member

@Octachron Octachron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good to merge to me (except for some minor remarks).
I really like the fact that with this PR the misplaced attribute warning will only require care from attribute authors.

@ccasin
Copy link
Contributor Author

ccasin commented Nov 9, 2023

I've integrated the last round of minor feedback, and believe this is good to go when the tests pass.

@gasche
Copy link
Member

gasche commented Nov 9, 2023

What should we do with the git history? Squash everything, something else?

@ccasin
Copy link
Contributor Author

ccasin commented Nov 9, 2023

What should we do with the git history? Squash everything, something else?

I think squashing is reasonable - it should read fine as one commit. But I'm open to suggestions if you'd like me to reorganize it.

@Octachron
Copy link
Member

The commit history seems mostly linear (no back-and-forth) and contains some bootstrap, thus I think it is clearer to merge the PR as it is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants