Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize the "deprecated" alert #1804

Merged
merged 36 commits into from Nov 15, 2018
Merged

Generalize the "deprecated" alert #1804

merged 36 commits into from Nov 15, 2018

Conversation

alainfrisch
Copy link
Contributor

@alainfrisch alainfrisch commented May 28, 2018

We can currently mark components of libraries as being "deprecated", thus triggering warning 3 when they are referenced. It was proposed in #730 to add a notion of "unsafe" components, which would behave similarly, and I suggested to generalize this even further to allow other kind of alerts. For instance, one might want to mark components that perform I/O, rely on generic marshaling, or can
possibly raise exceptions.

This PR is an early proposal of such a generalization, in order to discuss the concept. I'll add add tests if there is a good reception and an agreement on the user interface.

In the same way as one can currently declare in an interface:

 val foo: int -> int
    [@@ocaml.deprecated "Please use bar instead!"]

one can now declare other kinds of alerts:

 val foo: int -> int
    [@@ocaml.alert unsafe "Please use bar instead!"]

where unsafe is the kind of alert (an arbitrary lowercase identifier). The former becomes sugar for:

 val foo: int -> int
    [@@ocaml.alert deprecated "Please use bar instead!"]

One can of course attach multiple alerts to a given component:

 val foo: int -> int
    [@@ocaml.alert deprecated "Please use bar instead!"]
    [@@ocaml.alert unsafe "Please use bar instead!"]

When an identifier is referenced, an instance of warning 3 is raised for each attached alert:

File "../foo.ml", line 11, characters 8-13:
Warning 3: deprecated: X.foo
Please use bar instead!
File "../foo.ml", line 11, characters 8-13:
Warning 3: unsafe: X.foo
Please use bar instead!

The same warning is also raised when a signature coercion "forgets" some alert kinds (however, one can freely change the associated message without triggering the warning).

We also provide more fine-grained control over which specific alerts are enabled, to further restrict warning 3 to specific kinds. Currently this can only be done by attributes (but doing it from the command-line would be useful as well), for individual kinds or for "all" kinds:

  ...
  [@@@ocaml.alert -unsafe]   (* disable warning 3 for the "unsafe" kind *)
  ...
  [@@@ocaml.alert +unsafe]   (* disable warning 3 for the "unsafe" kind *)
  ...
  [@@@ocaml.alert +all]   (* enable all alert kinds *)
  ...
  [@@@ocaml.alert -all]   (* disable all alert kinds *)

Not currently included:

  • Control alert kinds from the command-line.
  • Agree on some alert names and annotate some stdlib components.
  • Also mark some language constructs (e.g. field assignments or reads from mutable fields).
  • Control "warnerror" for each specific kinds?
  • A construct to mark a component implementation with a given alert and at the same time disable reports on that alert while type-checking that component (for instance, to mark a function "unsafe" for external callers and allow inner uses of unsafe features without getting a warning)?

Opinions?

@Drup
Copy link
Contributor

Drup commented May 28, 2018

As a user, I regularly wish for an unsafe annotation and this was also on my list of things to implement. I like the overall design, although I'm not sure about bundling everything under warning 3 (but then, again, the whole warning system needs a bit of a remake).

One thing that I think is essential is the ability to do something like

begin[@alert -unsafe]
 ...
end

I'll let the inevitable bikeshedding happen and review the implementation once the design has stabilized.

@alainfrisch
Copy link
Contributor Author

Once thing that I think is essential is the ability to do something like

Just to be clear: this is already supported. I gave examples only with "floating" attributes, but in all places where warning-control attributes can be used, it is possible also to control alerts.

@bobot
Copy link
Contributor

bobot commented May 29, 2018

One additional feature that I think is worth considering here for control over the alert, is the possibility to put an ocaml version as argument of ocaml.alert, and to have the possibility on the command line to specify a compatibility version. Every alert with a number greater than the version given on the command-line would be hidden.

@alainfrisch
Copy link
Contributor Author

Every alert with a number greater than the version given on the command-line would be hidden.

You mean the opposite, right? (Alerting when a "new" feature is used but we want to enforce compatibility with an "old" version of OCaml.)

@bobot
Copy link
Contributor

bobot commented May 30, 2018

You mean the opposite, right?

Perhaps there are different needs, but for mine I stay with this side: hide alert > cmdline. A specific use case:

  1. In development (not for release) emitting an alert is an error
  2. You want to support 4.02.3
  3. String.capitalize has been deprecated in 4.03.0
  4. A developer that uses ocaml > 4.04.0 has an error:
    • if it fixes it by using String.capitalize_ascii, the code doesn't work for 4.02.3.
    • if it silence the deprecation warning, the deprecation present in 4.02.3 are not used

Alerting when a "new" feature is used but we want to enforce compatibility with an "old" version of OCaml.

For that, what is needed is the use of the @since by the compiler for another warning. It is interesting and complementary, but I don't see the direct link with alert and deprecation warning. Do you propose [@@ocaml.alert since 4.03.0]? This part is less critical because CI are able to catch it. In the first case of this message except using conditional compilation there is no good way to write the code.

@alainfrisch
Copy link
Contributor Author

Ok, understood; you want to specify a target version number and have the compiler flag only accesses to component that were already deprecated on that version. This is indeed complementary to flagging accesses to component that were not available on that version. (But I think both situation can be detected using CI, by considering deprecation warnings as errors while compiling against that target version.)

Do you think we should aim at a general mechanism, with version constraints, and not restricted to the version of OCaml? A library could also want to expose since/deprecated attributes (with version numbers) to help client code work against multiple versions of the library.

@alainfrisch
Copy link
Contributor Author

alainfrisch commented Jul 12, 2018

About version constraints, one could imagine declaring:

 val foo: ... [@@ocaml.alert since "4.07"]
 val foo: ... [@@ocaml.alert deprecated_since "4.06"]

(this is already possible) and then specifying constraints on the command or in attributes:

 [@@@ocaml.alert +since > "4.07"]
 [@@@ocaml.alert +deprecated_since < "4.07"]

I believe this can come later, on top of the current design. Since it is not completely straightforward to compare version strings, I'd rather leave it outside this PR, if you (@bobot) agree.

@bobot
Copy link
Contributor

bobot commented Jul 13, 2018

@alainfrisch I agree, I'm happy that we agreed on the design and goal of this future feature.

@alainfrisch
Copy link
Contributor Author

@damiendoligez Does the high-priority tag mean that you'd like this to be ready for the next version (if so, is it different from milestone = 4.08)?

I think the only points which are really required for inclusion are:

  • Support for controlling alerts from the command-line.
  • Documentation and tests.

Other extensions listed in the original description can come later, I believe. Do you agree?

@alainfrisch
Copy link
Contributor Author

I'm planning to add a distinction similar to -w vs -warn-error, in order to control which alerts are fatal error. Currently, alerts which are enabled trigger warning 3, and this warning can be globally turned into a fatal error or not, but this might not be fine-grained enough.

Design question: Is it good to rely on the warning infrastructure for alerts, or should this be just a new notion (perhaps keeping warning 3 only for the deprecated alert)?

@lpw25
Copy link
Contributor

lpw25 commented Jul 17, 2018

I think it would probably be better to use a new mechanism. Possibly warning 3 should just be used for deprecated language features, with deprecated values generating a new "alert" instead.

On the implementation side, could you switch to using a concrete representation of the alerts attached to a value, rather than re-parsing the attributes at every use. I know it is how deprecated warnings are currently implemented, but this kind of "stringly-typed" approach just makes things harder to maintain.

@alainfrisch
Copy link
Contributor Author

On the implementation side, could you switch to using a concrete representation of the alerts attached to a value, rather than re-parsing the attributes at every use. I know it is how deprecated warnings are currently implemented, but this kind of "stringly-typed" approach just makes things harder to maintain.

This would require changing many record types (value_description, type_declaration, constructor_description, etc), adding one more field next to the existing xxx_attributes. Since those alerts are supposed to flow exactly as attributes (and since they are specified as attributes, one expects them to be kept in the xxx_attributes fields in .cmt files as well), I'm not sure about the benefits of adding those extra fields. Surely, any external tool that parses .cmt files can just call Builtin_attributes.alerts_of_attrs to synthesize the "virtual" alert field, if needed. What exactly becomes harder to maintain?

@alainfrisch
Copy link
Contributor Author

Moreover, it's not just "deprecated" which is currently handled directly through attributes internally, but other attributes understood by the compiler as well: ocaml.warning, ocaml.ppwarning, ocaml.warn_on_literal_pattern, ocaml.deprecated_mutable, ocaml.inline, etc. (ocaml.inline gets a dedicated representation, but only at the lambda level, i.e. they are kept as general attributes in cmt files). One could also make the argument for docstrings, which have special support in the compiler and could thus deserve custom representation. If we start adding fields for all of them, this will become quite heavy, and I don't see a compelling reason to do that.

@lpw25
Copy link
Contributor

lpw25 commented Jul 17, 2018

Since those alerts are supposed to flow exactly as attributes

They do not flow exactly as attributes, because attributes do not flow -- they are nodes attached to the AST and should not be propagated by the type-system. The occurrences of Parsetree.attributes in types.ml are basically just a mistake. Taking this approach leads to issues like the those mentioned in #1822 with people suggesting awful hacks like "cascading" attributes in an ad-hoc fashion to handle specific use cases.

What exactly becomes harder to maintain?

We are currently building a variety of extra information, that is meaningfully used in the type-checking and compilation of code, and then obfuscating it by hiding it as a AST in an associative list keyed by strings. This is just bad software engineering. It makes the compiler harder to maintain. When adding new operations on the types in the Types module there is no help from the type-system to indicate that there is semantically meaningful information that may also need to be handled hidden inside the _attributes field, or any indication of what the valid forms of this information are. When we convert the AST to a more meaningful representation we should convert the attributes understood by the compiler as well.

This would require changing many record types ...

Sure, but the size of the diff isn't nearly as important as the quality of the resulting code. Although to be fair, it is not like my suggested change needs to go in this particular PR. I'm just becoming steadily more frustrated with the situation each time we put additional meaningful information into the attributes.

If we start adding fields for all of them, this will become quite heavy

Well, as you pointed out, a number of these fields flow in the same ways and are basically used for the same things. I think you could easily group them into their own record type and just have a single (possibly optional) field for them in the original record.

@gasche
Copy link
Member

gasche commented Jul 17, 2018

I must say that I'm rather inclined to go in Leo's directly, of having specific representation for structured data, and turning it back into something unstructured at serialization points only. That said, this is more important for information that needs to travel a long way towards the runtime (the natural way to elaborate into more-structured representations is the (parsetree -> typedtree) step; it is less clear-cut for mostly-frontend that happens before the typedtree, or during typedtree production, so I see where Alain's perception comes from.

@alainfrisch
Copy link
Contributor Author

Ok, I see your point. IIUC, you are ok with keeping attributes in the Typedtree representation, but not in Types. Is that right?

I almost agree that at some point, it could make sense, indeed, to replace those occurrences in Types with a more typed version, keeping a structured representation of attributes that we need at this level. This would not include for instance ocaml.inline, since it is not used by the type checker, and found directly in Typedtree.attributes field. Is this what you have in mind?

I said "almost agree", because of references to Types records from Typedtree. For instance, Texp_ident keeps a snapshot of the Types.value_description from the definition of that identifier, and this record currently has all syntactic attributes on the definition. This can be used by an external tool, which can react on attributes unknown to the compiler (the "alert" stuff could almost be implemented as such an external tool, also thanks to Cmt_format.value_dependencies; and it is straightforward for an editor mode to show the "docstring" of a referenced function, just by reading the .cmt file). If we remove the syntactic attributes, we prevent such usages. If we keep them (in addition to the structured form), we have duplication of information, which is not nice either.

Anyway, as you kindly open the door to it: I suggest to keep this discussion for later. It would not really make sense to keep other similar properties in attributes, and introduce a new field just for alerts.

@alainfrisch
Copy link
Contributor Author

Coming back to the UI for the suggested feature:

  • No longer depends on warnings. I've dropped warning 3 (Deprecated). Leo suggested to keep it just for the "built-in" deprecation of linguistic features, but I'm not sure it's worth making a distinction. I think that linguistic features could trigger other kinds of alerts at some point (unsafe primitives, side effects, etc), so the distinction would remain purely historical.

  • Allow to control both the enabled/disabled flag, and the error/notice mode (to make it possible to declare some alerts as fatal, other as informative only).

  • Support for controlling alerts from the command-line and OCAMLPARAM, and changed the syntax for the corresponding attributes, so that all use the same syntax.

@lpw25
Copy link
Contributor

lpw25 commented Jul 17, 2018

Is that right?

Yeah, that's the idea.

This can be used by an external tool, which can react on attributes unknown to the compiler

I don't think any serious tools uses this information because it is very unreliable. Certainly, odoc and merlin ignore it and reconstruct the relationship between uses and definitions themselves, because it is the only way to make it work properly in the general case. Personally, I would prefer if we tried to store something that helped tools move from the uses in the typedtree to the corresponding definitions . This would be sufficient to easily extract the attributes if that is what your tool wants to do.

Even if we do have to keep both the unstructured form for tools, and the structured form for the compiler, I think that would be a large improvement over the current situation.

I suggest to keep this discussion for later

Fair enough, that's my last word on the topic in this thread.

I think that linguistic features could trigger other kinds of alerts at some point

That's a good point. I'm sad that we're left with a warning that is completely unused, but I guess it can't be helped.

@alainfrisch
Copy link
Contributor Author

I don't think any serious tools uses this information because it is very unreliable

Can you elaborate? If we look at values only, I think that implementing something like "deprecated" or "alerts" based on Texp_ident and value_dependencies would be as robust as the current implementation.

I'm sad that we're left with a warning that is completely unused

That will make two of them actually (25 was also killed).

@lpw25
Copy link
Contributor

lpw25 commented Jul 17, 2018

Can you elaborate? If we look at values only, I think that implementing something like "deprecated" or "alerts" based on Texp_ident and value_dependencies would be as robust as the current implementation.

Actually, it probably would be fine for the simple uses you had in mind. It had been a while since I'd last thought about this issue, and I think I got my point somewhat back-to-front. My point would be better stated as: more complex tools require a way to go from a use to a definition in typedtrees reliably anyway -- and if we had that then the simple tools could use it instead and we wouldn't need to keep Types.value_description on uses of identifiers.

@alainfrisch
Copy link
Contributor Author

more complex tools require a way to go from a use to a definition in typedtrees reliably anyway -- and if we had that then the simple tools could use it instead and we wouldn't need to keep Types.value_description on uses of identifiers.

I fully agree with that and I wished we had such a way to "point" to some inner nodes of a Typedtree (and support fast lookup of the target). This doesn't sound hard, but a bit tedious in practice (e.g. some "syntactic path", but this requires to keep track of that while type-checking).

@alainfrisch
Copy link
Contributor Author

About warning 3: I've made it a synonym for alert "deprecated", meaning that attributes such as @@@ocaml.warning "-3" or command-line options such as -w @3 keep their meaning.

@alainfrisch
Copy link
Contributor Author

The doc has been written. Removing the wip label. This is now ready for review.

Future extensions listed in the original description and during the discussion can be left for later. I think the current proposal is already useful.

@damiendoligez damiendoligez changed the title [RFC] Generalize the "deprecated" alert Generalize the "deprecated" alert Jul 25, 2018
Copy link
Member

@damiendoligez damiendoligez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments.

I haven't thoroughly reviewed the code, so a second review would be good.

parsing/builtin_attributes.ml Outdated Show resolved Hide resolved
parsing/builtin_attributes.ml Outdated Show resolved Hide resolved
@alainfrisch
Copy link
Contributor Author

@Drup : have you had a chance to give a close look to this PR?

@Drup
Copy link
Contributor

Drup commented Oct 8, 2018

@alainfrisch Not yet, and I won't have time before at best next week.

@alainfrisch
Copy link
Contributor Author

@damiendoligez kinda volunteered to also review this PR.

Copy link
Member

@damiendoligez damiendoligez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good, but I have a few remarks.

  • I would like to add the ++/-- syntax to warnings for consistency (but that's probably for another PR).
  • We should think hard about what the default should be. Currently it's "all alerts enabled and not errors" but if we want to add alerts to the standard library (like unsafe, mutable, I/O, exceptions) that's probably too disruptive: for every alert we add, people will get lots of warnings that are irrelevant to them.

manual/manual/refman/exten.etex Outdated Show resolved Hide resolved
parsing/builtin_attributes.ml Show resolved Hide resolved
manual/manual/refman/exten.etex Outdated Show resolved Hide resolved
@alainfrisch
Copy link
Contributor Author

Currently it's "all alerts enabled and not errors" but if we want to add alerts to the standard library (like unsafe, mutable, I/O, exceptions) that's probably too disruptive: for every alert we add, people will get lots of warnings that are irrelevant to them.

When we add such "built-in" alerts, one could decide at this point to disable some of them by default (as we do when we add new warnings). The alternative would be to disable everything by default, but my concern is that if libraries start using alerts, nothing will encourage their "clients" to enable alerts; having everything enabled by default will make alerts more "visible" and thus useful.

@alainfrisch
Copy link
Contributor Author

@damiendoligez : does my note above address your concern?

Since this PR is rather mechanical but invasive, it is a pain to rebase it, so if the code looks good, could we try to have it merged soon?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants