New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow ``[]`` as a user-defined constructor #234

Merged
merged 6 commits into from Jan 29, 2016

Conversation

Projects
None yet
@objmagic
Contributor

objmagic commented Aug 29, 2015

This patch allows empty list as a user-defined constructor name. Together with pull request #144 , one can write the following heterogeneous list.

module H = struct
  type 'a hlist =
   [] : unit hlist
   | :: : 'a * 'b hlist -> ('a * 'b) hlist
end

let h = H.[1; "two"; false]
module H :
  sig
    type 'a hlist = [] : unit hlist | :: : 'a * 'b hlist -> ('a * 'b) hlist
  end
val h : (int * (string * (bool * unit))) H.hlist =
  H.:: (1, H.:: ("two", H.:: (false, H.[])))
@Drup

This comment has been minimized.

Contributor

Drup commented Aug 29, 2015

I'm all in favor of that. It's basically a little piece of #76 but It should be enough for interesting uses.

I personally think the syntax, for uniformity, should be:

module H = struct
  type 'a hlist =
   [] : unit hlist
   | (::) : 'a * 'b hlist -> ('a * 'b) hlist
end
@objmagic

This comment has been minimized.

Contributor

objmagic commented Aug 30, 2015

I'm OK with both of the syntaxes.

@damiendoligez

This comment has been minimized.

Member

damiendoligez commented Sep 23, 2015

For consistency, definitely demand (::) instead of :: whenever the constructor is not surrounded by its arguments.

@objmagic

This comment has been minimized.

Contributor

objmagic commented Sep 24, 2015

I uncommented another line to allow (::) to be used as constructor name. I still leave :: as valid constructor name

@chambart

This comment has been minimized.

Contributor

chambart commented Oct 7, 2015

Notice a nice use case for that:

type (_,_) t =
  | [] : ('a, unit) t
  | (::) : 'a * ('a, 'b) t -> ('a, unit -> 'b) t

let rec map: type a b l. (a -> b) -> (a, l) t -> (b, l) t = fun f -> function
  | [] -> []
  | h :: t -> f h :: map f t

let [a; b; c] = map succ [1;2;3]

It would allow to safely write that kind of mapping on a set of values.

@lefessan

This comment has been minimized.

Contributor

lefessan commented Oct 7, 2015

I don't see the point of allowing overriding of all the of basic syntax. For every OCaml user, [] is expected to be the empty list, idem for :: and [a;b;c]; breaking this assumption makes the code unreadable, unmaintenable by other developers, incompatible with other libraries, etc. So, the only consequence that I can see of accepting such patch will be a strong recommendation in the user manual to never use that possibility...

So, why add features just to recomment later not to use them ???

@Drup

This comment has been minimized.

Contributor

Drup commented Oct 7, 2015

It's not possible to have confusion for the unaware user without getting a warning about disambiguation, at the very least, and it makes some pattern much easier to express. I don't see how it's different from various other recent additions to the language.

@lpw25

This comment has been minimized.

Contributor

lpw25 commented Oct 7, 2015

I don't see the point of allowing overriding of all the of basic syntax.

For the same reason OCaml allows overriding of operators.

@lefessan

This comment has been minimized.

Contributor

lefessan commented Oct 7, 2015

@Drup: the warning about disambiguation is going to disappear at some point in the future. At that point, there will be nothing to save the "unaware user".
@lpw25: yes, we made that mistake. Why make it worse now ? Those few people who use it do it for some time, and then just stop because they discover it is not good. It makes nice 5 lines programs, but rarely maintainable 10 klines programs.

I think we have a choice: either we want to make the language simple and easy, to encourage new users without a PhD in type theory to join us, or we want to make the most complex language in the world, with features that nobody use but so that we can say: oh, it's cool, I can write type 'a option = () of 'a | false and it works ! and it works... :-(

@lpw25

This comment has been minimized.

Contributor

lpw25 commented Oct 7, 2015

yes, we made that mistake. Why make it worse now ?

It's not a mistake its a useful feature. + is "normally" addition on integers but in some contexts it makes much more sense for it to mean something else. You might as well argue that every value in every program must be completely unique because obviously all identifiers only have one sensible meaning.

@Drup

This comment has been minimized.

Contributor

Drup commented Oct 7, 2015

Those few people who use it do it for some time, and then just stop because they discover it is not good.

Because Int32.add (Int32.mul x y) z is obviously so much better and easier to understand to the beginner (and the non-beginner) than Int32.( x * y + z )

Please.

@lefessan

This comment has been minimized.

Contributor

lefessan commented Oct 7, 2015

@lpw25: indeed, one feature I really enjoy in OCaml is having the possibility to make "every value in every program" "completely unique": the recommended coding style should be to never open modules except for constructors and labels, and to always prefix identifiers by the module providing them (List.map,etc.), thus making them "completely unique" in the program.

@Drup: Yes, I am wondering why the beginner does not understand simple error messages:

# let sum t x = Int32.(t.(x) + t.(x+1) );;
Error: This expression has type int but an expression was expected of type
         int32

I like compositionnality...

@lpw25

This comment has been minimized.

Contributor

lpw25 commented Oct 7, 2015

"every value in every program" "completely unique"

I wasn't suggesting your argument is equivalent to saying that there should only be one thing called map (or +) in scope, I was suggesting your argument is equivalent to saying there should only be one thing called map (or +) in the whole program. No Array.map since there is already a List.map. No Int32.(+) because there is already a Pervasives.(+).

@lefessan

This comment has been minimized.

Contributor

lefessan commented Oct 7, 2015

My argument is that things that cannot be uniquely identified should not exist in a program: so, map can be both in List and Map, but the user should never open both of them (and actually, they should never be opened). For (+) and other operators, there is no way to elegantly uniquely qualify them : x Int32.(+) y is not better than Int32.add x y.

Anyway, as the example with arrays was showing, you reach the limits of overriding (+) so fast that it's really not worth allowing it in the first place.

@lpw25

This comment has been minimized.

Contributor

lpw25 commented Oct 7, 2015

Anyway, as the example with arrays was showing, you reach the limits of overriding (+) so fast that it's really not worth allowing it in the first place.

That example is completely fabricated, I can't really see how that constitutes "so fast". In practice it is fine to use such overloading and it can lead to much clearer code.

@objmagic

This comment has been minimized.

Contributor

objmagic commented Oct 7, 2015

so, map can be both in List and Map, but the user should never open both of them

I agree.

module H = struct
  type 'a hlist =
   [] : unit hlist
   | :: : 'a * 'b hlist -> ('a * 'b) hlist
end

Say user writes this module. We can assume that user understands what bad thing is going to happen if he opens his module later. It is analogous to the case in which user opens module Plus = struct let ( + ) = ( * ) end. No one wants to open this module to pollute outside scope.

On the other hand, if user wants to open such module which is written by someone else, he should somehow understand the possibility of polluting outer environment. For example, open Core.Std will make all arguments of List.map become labelled arguments. However, I do agree overriding semantics of basic syntax could leave user confused.

@diml

This comment has been minimized.

Member

diml commented Oct 7, 2015

I think this feature is nice as it removes a bit of magic around the list type. Now you can say "this is how the list type is defined", instead of saying that it is a normal sum type except that you can't redefine it yourself.

And it's impossible to prevent people from writing unreadable code, so I don't think we should use this as an argument against this specific feature.

@lefessan

This comment has been minimized.

Contributor

lefessan commented Oct 7, 2015

@lpw25: "In practice, it is fine..." I have written alone projects of nearly 100klines of OCaml code, so I think I have quite an experience in the kind of code that can be written in such projects. It is not "fabricated", it is exactly what appears when you move from short examples to real life code.

@lpw25

This comment has been minimized.

Contributor

lpw25 commented Oct 7, 2015

It is not "fabricated"

Are you saying that the example was taken from a real piece of code, or are you objecting to my use of the word "fabricated"? Perhaps it is too strong. My point is drawing a very general conclusion "using operator overloading is always a mistake" from a small sample of code is not particularly persuasive.

@Octachron

This comment has been minimized.

Contributor

Octachron commented Oct 7, 2015

@lefessan:

Anyway, as the example with arrays was showing, you reach the limits of overriding (+) so fast that it's really not worth allowing it in the first place.

I don't think your example is particularly illustrative on this point. It could be easily rewritten as

let sum t x = Int32.(t.(x) + t.(succ x) )
 (* or *) 
let sum t x = Int32.( t.(x) + t.( Int.add 1 x ) )

At the same time, for numerical and mathematics oriented applications, the
ability to bind (+) to the right associative commutative binary operator is
a true blessing and can increase readability a lot.

I think the same thing is true for the list literal syntax; there is quite a few
constructions that are essentially a list plus some supplementary type
information that could benefit from having access to the list literal syntax.

@lefessan

This comment has been minimized.

Contributor

lefessan commented Oct 7, 2015

Anyway, whatever, I gave my opinion. PR are nice to improve the distribution tools and libraries. I think they are a bad way to improve the language itself, by small inconsistent increments and no long term vision.

@bluddy

This comment has been minimized.

bluddy commented Oct 7, 2015

I, for one, am supportive of this PR. Less magic is a good thing.

I also acknowledge the danger inherent in local opens. I think the real problem is the inability to restrict local opens easily. For example,

(* Ideal syntax *)
module Int32 = Int32((+), (-), (/), (*))
(* current syntax: 

module Int32 : sig val (+):int32 -> int32 -> int32 val (-):int32 -> int32 -> int32 val (/): int32 -> int32 -> int32 val (*): int32 -> int32 -> int32 end = Int32 *)

...
Int32.(x + (2 * 3))

Would be much safer than using Int32 directly.

Currently, to do this, we'd have to write a whole module type, which involves the type of every function we want (as is shown above), and is tedious. A quicker selective syntax as suggested in the first alternative would really help, but I don't think this particular syntax works because it clashes with functor application. Anyone have any ideas?

@ibnfirnas

This comment has been minimized.

ibnfirnas commented Oct 7, 2015

@bluddy

module Int32 = struct
  let (+) = Int32.add
  ...
end
@bluddy

This comment has been minimized.

bluddy commented Oct 7, 2015

@ibnfirnas: Good point, you could either restrict the type view of a module or build another module altogether -- I didn't think of that. I agree that building a new module is cleaner syntactically, but it's still not concise enough that people would use it in every file. I guess stating the point of this is important: you want to make it such that the included functions/types/etc can be concisely specified in every file. This prevents the problem you open yourself up to by opening a file either locally or globally, whereby a change in one file causes a possibly invisible semantic change in another file.

In other words, suppose module Foo has no (+), and I locally open it in another file, bar.ml. Many years later, I add (+) to Foo, not realizing that it's opened locally in bar.ml. Calls to (+) in Bar now go to Foo's version of (+) rather than the default library's version. Hilarity and painful debugging ensues.

If Foo was first constrained to only the used functions, any addition to Foo would not have been reflected in bar.ml. However, to be feasible as a common practice, this must be as painless a process as possible, which means it should really be a one line thing with minimal syntactic footprint IMO.

@objmagic

This comment has been minimized.

Contributor

objmagic commented Oct 7, 2015

@damiendoligez Right now, :: can be used as constructor. I agree that we should keep uniformity and demand (::). Should I disallow :: in this patch? Will this change cause any backward incompatibility?

@objmagic

This comment has been minimized.

Contributor

objmagic commented Oct 8, 2015

@chambart I guess the meaning of this map is to guarantee returned list has the same length? It is cool but I would like to know in which case it will be useful?

@lefessan

This comment has been minimized.

Contributor

lefessan commented Oct 8, 2015

@lpw25: Yesterday, I was not completely satisfied by my reply, but I couldn't put the finger on the problem. And then, I realized why: in fact, my example is completely real, it is indeed code that I have written in the past, and now, I remember where! In try.ocamlpro.com, from the beginning, we added an extension to locally rewrite constants. For example, you can type:

#   I.(4494999 * 48484848);;
- : Big_int.big_int = 217939343275152I

As you can see, integers and operators are overriden in the I module into big integers. And indeed, as soon as I tried to make a simple program using that extension, I fell on the example I gave you.

In the OCamlPro compiler (the one used by Memprof), we have tested such features for a long time, and our conclusion was often: once you have made the patch, you usually play with it on small examples for a few days; but then, in bigger projects, the new feature is never used. Finally, we usually decide to drop them after one or two years.

In conclusion, I would argue that people here should refrain from saying "it's worderful, I want it !", instead, people should take such PRs, merge them in their own local version of OCaml, and use them for one year in their projects. Then, after one year, they should tell us whether it was the wonderful feature that solved all their problems, or just a nice idea, but with little interest.

@damiendoligez

This comment has been minimized.

Member

damiendoligez commented Oct 8, 2015

@marklrh

Right now, :: can be used as constructor. I agree that we should keep uniformity and demand (::). Should I disallow :: in this patch? Will this change cause any backward incompatibility?

:: is on my list of things to remove anyway (see PR#5936). As far as I know, the incompatibility is entirely theoretical because no known program uses this misfeature.

@bobzhang

This comment has been minimized.

Member

bobzhang commented Oct 8, 2015

I agree with @lefessan , local open (overloading) makes code hard to reason about
Think twice, I think it might be interesting to providing a qualified syntax which only allow the module functions in that scope, for example

M.(f a b)  (* all functions inside M can only be imported from M*)

Edit: typo

@objmagic

This comment has been minimized.

Contributor

objmagic commented Oct 8, 2015

@bluddy yes, it is heterogenous association list.

@objmagic

This comment has been minimized.

Contributor

objmagic commented Oct 8, 2015

@damiendoligez I commented out :: and parser now only accepts ( :: ) as valid type constructor

@bluddy

This comment has been minimized.

bluddy commented Oct 8, 2015

@marklrh et al: isn't it still magical to make [] and ( :: ) definable as infix constructors when all other infix constructors are not definable?

In some ways, this makes the problem of programmer expectation and readability worse. Because these are the only infix constructors that can be defined, in cases where I would like to use infix constructors, I'm steered towards these two, which will potentially cause a huge amount of aliasing for the poor list constructors. It would make more sense to me to first add infix constructors and then include these two. Am I way off here?

@objmagic

This comment has been minimized.

Contributor

objmagic commented Oct 8, 2015

@bluddy I'm not sure what other infix constructors are good choices. I don't know if custom infix constructors is a good idea. I do think basic infix constructors [] and ( :: ) are necessary.

@yallop

This comment has been minimized.

Member

yallop commented Oct 8, 2015

isn't it still magical to make [] and ( :: ) definable as infix constructors when all other infix constructors are not definable?

There are two (very vaguely related) issues:

  1. The nil constructor [] can only be defined in the compiler, not within the language. That issue is addressed by this PR (#234).
  2. There's only one infix constructor, ::. That issue is addressed by PR #76.

The two issues are independent: each issue is (arguably) useful without the other, and it doesn't matter which order they're merged in, if they're merged at all. I think separating the discussions is a good idea.

@bluddy

This comment has been minimized.

bluddy commented Oct 8, 2015

@yallop, thanks for the clarification and cross-reference. LGTM now, but I'd also like to see #76 merged.

@damiendoligez

This comment has been minimized.

Member

damiendoligez commented Oct 13, 2015

@marklrh

Also, notice that false and true also can be used as type constructor and it seems no one complains about this.

That's not exactly right: as evidenced by Mantis #5936, most people expect true and false to be keywords whose meaning is fixed. If we wanted them to be ordinary constructors, we'd write them capitalized.

@objmagic

This comment has been minimized.

Contributor

objmagic commented Nov 19, 2015

@gasche any opinion on this issue during developer meeting?

@gasche

This comment has been minimized.

Member

gasche commented Nov 19, 2015

No, we didn't have time to cover all pull requests and open bugs, and this one was not discussed, sorry.

@damiendoligez

This comment has been minimized.

Member

damiendoligez commented Nov 24, 2015

I think this boils down to the mental representation of [] and ::.
Are they keywords or constructors? @lefessan says keywords, everyone else says constructors.

At any rate, it's really strange to have :: redefinable but not []. I move that we merge this PR for 4.03.

@alainfrisch

This comment has been minimized.

Contributor

alainfrisch commented Nov 24, 2015

Generally speaking, overloading built-in operators/syntax/constructors/... seems a rather bad idea in terms of readability, but I don't really object to the PR.

FWIW, I consider [] as some piece of syntactic sugar, the degenerate form of the n-ary construction [e1;...en] (and it allows an arbitrary number of whitespaces between [ and ]). Without this PR, we could change the internal constructor name to which the empty list is desugared, this is purely an implementation detail.

@objmagic

This comment has been minimized.

Contributor

objmagic commented Dec 9, 2015

ping again. any consensus on getting this merged? more discussion?

@objmagic

This comment has been minimized.

Contributor

objmagic commented Dec 17, 2015

@damiendoligez @gasche any plan on merging this in 4.03.0? Is there any concern? I would love to fix any problem.

Changes Outdated
@@ -23,6 +23,9 @@ Language features:
- GPR#88: allow field punning in object copying expressions:
{< x; y; >} is sugar for {< x = x; y = y; >}
(Jeremy Yallop)
* GPR#234: allow "[]" as a user-defined type constructor. Demand parenthesis
around "::" when using "::" as type constructor.

This comment has been minimized.

@damiendoligez

damiendoligez Dec 23, 2015

Member

"type constructor" is the wrong phrase here. You should just say "user-defined constructor" (in both instances).

@damiendoligez damiendoligez added this to the 4.03.0 milestone Dec 23, 2015

@objmagic

This comment has been minimized.

Contributor

objmagic commented Jan 28, 2016

@damiendoligez Rebased against trunk, and made Change entry better

@mshinwell

This comment has been minimized.

Contributor

mshinwell commented Jan 29, 2016

I am going to merge this, since @damiendoligez has approved it, and there have been no further objections for more than one month.

mshinwell added a commit that referenced this pull request Jan 29, 2016

Merge pull request #234 from marklrh/empty_list_user_defined_constructor
GPR#234: Allow ``[]`` as a user-defined constructor

@mshinwell mshinwell merged commit 209139a into ocaml:trunk Jan 29, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@SanderSpies SanderSpies referenced this pull request Jul 3, 2016

Closed

4.03.0 #604

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment