Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warning 40 ("Constructor or label name out of scope") should be an error #6000

Closed
vicuna opened this issue Apr 25, 2013 · 18 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link

commented Apr 25, 2013

Original bug ID: 6000
Reporter: @lpw25
Assigned to: @lpw25
Status: closed (set by @xavierleroy on 2015-12-11T18:26:28Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 4.01.0+dev
Target version: 4.02.0+dev
Category: typing
Related to: #5759
Monitored by: @gasche @lpw25 @hcarty yminsky

Bug description

We have had record disambiguation on trunk now for a few months,
and I think that it would be wise to look again at the issue of
out of scope labels and constructors before the next major
release.

The issue is around the new ability to use a constructor or label
even though that constructor or label is not in scope, as long as
the type information is available.

For example, the following is allowed:

module M = struct
type foo = Foo
end

let x: M.foo = Foo

Note that previously this would be rejected, forcing the user to
write:

let x = M.Foo

Currently, using this feature produces a warning (40 "Constructor
or label name used out of scope"), however I think that it should
be upgraded to a full error.

This was discussed in #5759, however more issues have emerged
since that discussion, and I think now is a good time to
re-evaluate this descision.

My arguments against allowing out of scope labels and
constructors are as follows:

  • It allows references to constructors to be used in files that
    contain no reference to the file defining those
    constructors. For example, I have already come across the
    following code in the wild:

    (* types.ml *)
    ...
    module Project = struct
    ...
    type project = {
    project_id: string;
    ...
    }
    ...
    end
    ...

    (* data.ml *)
    ...
    open Types
    ...
    module Projects = struct
    open Project
    module Platform = struct

        let project =
          { project_id="platform"; 
            ...
          }
        ...
      end
      ...
      let all = [ Platform.project; ... ]
      ...
    end
    

    (* gantt.ml *)
    ...
    open Data
    ...
    List.map Projects.all
    ~f:(fun proj -> ... proj.project_id ...)
    ...

    Note that gantt.ml uses project_id but contains no reference to
    types.ml where that label is defined. When a bug occurred, it
    was non-trivial to fix due to the difficulty of finding the
    appropriate definition of project_id. This bug caused a lot of
    confusion among less experienced OCaml programmers, and I think
    that it is a very bad idea to allow it.

  • It only works "in the presence of type information". This is an
    unspecifed criteria that most users will probably find
    confusing.

  • It doesn't work with GADTs. If the constructor in question is
    declared using GADT syntax then it cannot be used out of scope.

  • It can't work with exceptions.

  • It won't work with "open types" (#5584) or "pattern synonyms"
    which may be added to the language in future.

  • It will confuse users by adding a small element of structural
    typing to an otherwise nominative system. I think that most
    users picture variant constructors as named values, and all
    named values in OCaml obey the same scoping rules. To suddenly
    allow these values to be used as if they were not named values,
    but structural values (like the methods in an object) will
    confuse people.

In return for these issues, all that we gain is the ability to
refer to out of scope constructors and labels without using a
module name, but only under certain difficult to define
conditions.

Last time that this issue was discussed, it only appeared to be
discussed by 4 people. I think perhaps it would be a good idea
to get wider feedback before including a potentially risky
feature like this in a release.

It is also worth pointing out that it would be easier to add this
feature in the future than to remove it later if it causes
problems.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 25, 2013

Comment author: @alainfrisch

It allows references to constructors to be used in files that contain no reference to the file defining those constructors.

This is not really new, the same happens e.g. with objects. The problem, for me, is more general: ocamlc should not be allowed to open foo.cmi when compiling bar.ml if bar.ml does not somehow mention Foo. At LexiFi, we have a local patch to do that, and it ensures that "ocamldep" returns (by construction) a safe upper-bound of the dependencies.

Also, I guess that proper tooling (based on -annot or -binannot) should mitigate the bad user experience you mention.


What will happen if we remove the feature? Assume we have two modules Ast and TypedAst which define two variant types with the same constructors. In the type checker, we map from one to the other, and it's quite nice to be able to write:

open TypedAst

let type_check env (e : Ast.t) =
let loc = e.loc in
match e.desc with
| Foo -> {desc=Foo; loc; typ=...}
| ....

How would you write this kind of code?


Also, it we turn warning 40 into an error, people will be strongly encouraged to use "open" more often. I consider "open" as a dangerous (even though quite useful) feature in itself, and being able to avoid it as much as possible is an advantage. In particular, the "out of scope" styles permit nice library APIs, like:

type flags = {visible: bool; enabled: bool}
type mode = Foo | Bar
val do_something: flags:flags -> mode:mode -> ...

...

TheLib.do_something ~flags:{visible=true; enabled=false} ~mode:Foo ...

If we disallow the "out of scope" style, either library authors will be encouraged to use structural types (and then we don't gain any readability), or users will use "open" more often. The local open mitigates the problems with "open", but it is not a panacea.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 25, 2013

Comment author: yminsky

I do agree with Alain on this point. The whole approach of doing constructor disambiguation that is based on having multiple conflicting constructors dumped into the same scope strikes me as an unscaleable mess. Leo's point that this is all very hard to reason about is perhaps right. But without this feature, the whole thing of having constructor disambiguation also strikes me as fairly low-value. From my perspective, it will change how we write programs at Jane Street almost not at all.

And, if this feature doesn't make it, we should figure out a way of doing a constructor-only local open. What I typically want is to get access to the constructors and only the constructors for a given type. There's no clean way of doing this now.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 25, 2013

Comment author: @lpw25

This is not really new, the same happens e.g. with objects.

That isn't really the same. Objects are structurally typed and
methods are essentially constants. So for example:

let f o = o#m + 3

always means the same thing and always has a type. Whereas

let f r = r.m + 3

only has a type in the presence of a definition of the m label
with type int.

Also, I guess that proper tooling (based on -annot or -binannot) should mitigate the bad user experience you mention.

Only if the bug has left the correct type information in place,
otherwise they will not be able to help. For instance, changing a
type annotation in one file can cause an "Unbound label" error in
another file, with no easy way to work out which definition the
label originally referred to.

open TypedAst

let type_check env (e : Ast.t) =
let loc = e.loc in
match e.desc with
| Foo -> {desc=Foo; loc; typ=...}
| ....

How would you write this kind of code?

Depending on the circumstances, I would use one of:


open Ast
open TypedAst

let type_check env (e : Ast.t) =
let loc = e.loc in
match e.desc with
| Foo -> {desc=Foo; loc; typ=...}
| ....

open TypedAst

let type_check env (e : Ast.t): t =
let open module Ast in
let loc = e.loc in
match e.desc with
| Foo -> {desc=Foo; loc; typ=...}
| ....

open TypedAst

let type_check env (e : Ast.t) =
let loc = e.Ast.loc in
match e.desc with
| Ast.Foo -> {desc=Foo; loc; typ=...}
| ....

Also, it we turn warning 40 into an error, people will be strongly encouraged to use "open" more often. I consider "open" as a dangerous (even though quite useful) feature in itself, and being able to avoid it as much as possible is an advantage.

I certainly agree that "open" needs improvements to make it
safer (allowing explicit signatures on them would be a good
start). However, the problems with open should be addressed by
solutions that apply to all values, not to a subset of values
that (to the average user) will appear completely arbitrary.

In particular, the "out of scope" styles permit nice library APIs, like:

type flags = {visible: bool; enabled: bool}
type mode = Foo | Bar
val do_something: flags:flags -> mode:mode -> ...

...

TheLib.do_something ~flags:{visible=true; enabled=false} ~mode:Foo ...

For these APIs what you want is structural types, and I see no
problem with using a polymorphic variant for the "mode" argument
in your example.

For the "flags" argument, you could use one of the following:

type mode = [Foo | Bar];;
val do_something: visible:bool -> enabled:bool -> mode:mode -> ...
...
TheLib.do_something ~visible:true ~enabled:false ~mode:`Foo ...

or

type mode = [Foo | Bar];;
type flags = [Visible | Enabled];;
val do_something_else: flags:flags list -> mode:mode -> ...
...
TheLib.do_something ~flags:[Visible] ~mode:Foo ...

The second example behaving much like the flags used in C APIs.

However, what your example really shows is a desire for a
light-weight object syntax. That would allow this code:

type flags = <visible: bool; enabled: bool>
type mode = [Foo | Bar];;
val do_something: flags:flags -> mode:mode -> ...
...
TheLib.do_something ~flags:{{ visible=true; enabled=false }} ~mode:`Foo ...

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 25, 2013

Comment author: @lpw25

The whole approach of doing constructor disambiguation that is based on having multiple conflicting constructors dumped into the same scope strikes me as an unscaleable mess.

This is essentially what we do with regular values. The only difference for labels and constructors is that you cannot rename them, hence the need for record disambiguation.

And, if this feature doesn't make it, we should figure out a way of doing a constructor-only local open. What I typically want is to get access to the constructors and only the constructors for a given type. There's no clean way of doing this now.

I agree, allowing something like:

open Foo.t

to get access to local access to t and all of its constructors would be very convenient.

In addition, providing:

open (M: S)

that only adds the elements of M mentioned in the signature S into the environment would make local opens much safer.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 26, 2013

Comment author: @garrigue

Leo, I don't understand your point.
The warning is already enabled by default.
It means that using this feature is a deliberate choice.
You may have to take extra care, but this is because you decided to do so.

If what you ask for is a better warning (one that tells you where the constructor comes from for instance), I think this is perfectly doable.
Note that it is already pretty easy to obtain this information by adding a type annotation.

On the different points you mention, I see nothing very convincing.

  • On your concrete exemple, why couldn't you just get the type of proj?
    This is enough to know everything needed.
    Was the warning disabled? If this is not the case, Ignoring warnings is bad coding.
  • "presence of type information" is well-defined if you use -principal; even otherwise it is relatively predictable.
    Moreover this argument is against type-based ambiguity resolution as a whole, it is not directly related with scope.
  • it doesn't work with GADT only by choice; it could work if needed
  • it doesn't work for exceptions, but exceptions have a different semantics
  • we already had the discussion on nominal vs. structural, and personally I don't buy it, but again nobody forces anybody to use this feature
@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 26, 2013

Comment author: @lpw25

The warning is already enabled by default.
It means that using this feature is a deliberate choice.

Programmers tend to just ignore warnings that they don't understand. Which is what had happened in the case I described.

If what you ask for is a better warning (one that tells you where the constructor comes from for instance), I think this is perfectly doable.

This would certainly help. The message should also be clearer.

On your concrete exemple, why couldn't you just get the type of proj?
This is enough to know everything needed.

It is unusual to have to solve an "unbound value" error by looking at the types. It did not occur to the programmers in question to even consider this. (It barely occurred to me).

This problem was also exacerbated because the program was being edited by some people using trunk and some others using 4.00.1. The improved error messages from trunk might have made the problem clearer. I think Hongbo mentioned similar problems in another bug report.

"presence of type information" is well-defined if you use -principal; even
otherwise it is relatively predictable.

Only to those of us who understand the type system. Currently, we are saying:

"If your value is a constructor and there is type information available and
it is not a GADT and it is not an exception, then you don't need to write
the module name"

I think that most programmers will just approximate this to

"Sometimes you don't need to write the module name"

it doesn't work with GADT only by choice; it could work if needed

Then it definitely should. The type information criteria is confusing enough without taking into account the syntax used to define the constructor.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 26, 2013

Comment author: @lpw25

If what you ask for is a better warning (one that tells you where the constructor comes from for instance), I think this is perfectly doable.

This would certainly help. The message should also be clearer.

How about something like:

"Constructor M.Foo is not available as Foo in the current environment"

or

"Constructor M.Foo is not available as Foo in the current scope"

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 26, 2013

Comment author: @garrigue

I improved the message in commit 13612.

Warning 40: x was selected from type M.t.
It is not visible in the current scope, and will not
be selected if the type becomes unknown.

Note that if you do not believe in using type information,
you might want to enable warning 42 too.

@vicuna

This comment has been minimized.

Copy link
Author

commented May 1, 2013

Comment author: @hcarty

It feels surprising that a value of Types.Project.project can not be created without type annotations and/or explicit pathing of the record fields but one can sometimes access record fields with no annotation or pathing.

The reasoning makes sense but the rules seem difficult to explain, particularly to someone new to the language:

module M = struct type t = { x : float; y : float } end;;

(* pre-4.01.0 method which still works )
let p_pathed_ok = { M.x = 1.0; y = 1.0 };;
(
Does not work, no type information available )
let p_not_ok = { x = 1.0; y = 1.0 };;
(
Specifying the type allows this definition to work *)
let p_annotation_ok : M.t = { x = 1.0; y = 1.0 };;

(* Accessing p_*_ok.x is fine here because we know the type already... *)
print_float p_annotation_ok.x;;

(* ... but this does not work (no type information) even though it
looks very similar to the 'print_float' line above... )
let f_not_ok p = p.x;;
(
... except that we were missing an annotation for this 'p' *)
let f_annotation_ok (p : M.t) = p.x;;

@vicuna

This comment has been minimized.

Copy link
Author

commented May 1, 2013

Comment author: @hcarty

This example is more surprising:

module M = struct type t = { x : float; y : float } end;;
type t = { x : float; y : float };;

(* OK - val f : M.t -> float *)
let f p = p.M.x +. p.y;;

(* Not OK -
Error: The field M.y belongs to the record type M.t
but a field was expected belonging to the record type t *)
let f p = p.x +. p.M.y;;

@vicuna

This comment has been minimized.

Copy link
Author

commented May 2, 2013

Comment author: @gasche

This example is the sign of the ambivalent situation we are in.

  • if out-of-scope label access was not possible, there would be no such confusion
  • if out-of-scope label access is considered the correct, usual case (without the warning 40), then "p.x" in the second program below is ambiguous, and the type-checker should at least warn when selecting the candidate "t"; the presence of this warning would make it clear what the problem is
  • currently out-of-scope label access has an intermediary status, so it doesn't really makes sense to mark an ambiguity when there is only one choice in scope (and more outside), and that would result in too much warnings to pay for a feature that the user may not use

I would personally support marking the out-of-scope access warning (40) as an error by default, and enlarging the set of ambiguity candidates (for warning 41) with out-of-scope candidates when warning 40 is not an error.

@vicuna

This comment has been minimized.

Copy link
Author

commented May 2, 2013

Comment author: @garrigue

Just for you information:

$ ocaml -principal
....

let f p = p.M.x +. p.y;;

Warning 40: y was selected from type M.t.
It is not visible in the current scope, and will not
be selected if the type becomes unknown.
Warning 18: this type-based field disambiguation is not principal.
val f : M.t -> float =

I don't see that much asymmetry between the two cases...

I may be repeating myself, but ignoring warnings, even for beginners,
is not the right attitude. I hope that the new message is self explaining.
Warnings are not errors because they are warnings. This does not mean
you should ignore them.

If you really want to use labels out of scope, the right options are
ocaml -principal -w -40 -warn-error +18

Your other comment about hard to understand type propagation makes
sense (yet your examples don't look hard to understand).
But then the only solution is to disable all use of inferred type
information. All you examples can be rewritten without out of scope
access, and exhibiting exactly the same problem.

type t = { x : float; y : float };;
type u = { x : float; y : float; z:float };;

let f (p:t) = p.x;;

Does this mean that warning 42 should be enabled by default?
Or something limited to type disambiguation (42 also covers field set
disambiguation) ? This could be useful to remind that -principal
is needed for a precise definition.

@vicuna

This comment has been minimized.

Copy link
Author

commented May 17, 2013

Comment author: @damiendoligez

Jacques,

Does all this mean that we should have -principal as the default? And warning 40 as an error by default?

As for Gabriel's suggestion of making the behaviour of the type-checker dependent on whether a warning is error-enabled, that will be over my dead body. If you want to do something like that, make up a new option.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 28, 2013

Comment author: @damiendoligez

A side question for Jacques about -principal:

fun b -> if b then format_of_string "x" else "y";;

  • : bool -> ('a, 'b, 'c, 'd, 'd, 'a) format6 =

fun b -> if b then "x" else format_of_string "y";;

Error: This expression has type ('a, 'b, 'c, 'd, 'd, 'a) format6
but an expression was expected of type string

The typing of string constants is not principal because of the overloading with formats. Is there an easy way to make the compiler output a warning for the first expression with -principal?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 16, 2013

Comment author: @lpw25

And warning 40 as an error by default?

The much improved wording of the warning is probably sufficient.

I still have some concerns about using constructors from files that are not explicitly referenced. Alain mentioned that Lexifi's version of OCaml does not allow an ml file to have an unmentioned module as one of its imports. Maybe we could provide a warning for such cases.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 22, 2013

Comment author: @alainfrisch

Alain mentioned that Lexifi's version of OCaml does not allow an ml file to have an unmentioned module as one of its imports. Maybe we could provide a warning for such cases.

A warning detecting exactly those cases where our modification would fail is difficult to do. In many cases, disallowing a .cmi to be loaded (because its module is not mentioned in the source code being compiled) does not lead to an error: the compiler tries to expand a type abbreviation and since the .cmi cannot be loaded, it considers it as an abstract type, but this works fine (the code doesn't actually require the expansion to be type-checked). This is not to say that the warning would not be useful as well...

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 2, 2014

Comment author: @lpw25

I think my original objections were satisfied by Jacques rewording of the warning message, so I'm closing this issue.

@vicuna

This comment has been minimized.

Copy link
Author

commented Apr 3, 2014

Comment author: @garrigue

The typing of string constants is not principal because of the overloading with formats. Is there an easy way to make the compiler output a warning for the first expression with -principal?

Yes, basically we just need to check that the expected type "format6" is generalizable.
This is done in trunk at revision 14523.

fun b -> if b then format_of_string "x" else "y";;

Warning 18: this coercion to format6 is not principal.

Sorry for not answering earlier.

As for how much -principal is important, the performance obstacle seems the main problem.
Alain tells me that he couldn't use it on the Lexifi code base because it was too slow.
In general I would say that you don't need to use it, but if you start to get unstable behavior in you code, it can help you to make it more robust, and give you a better intuition of where annotations are needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.