New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "enum" construct #1958

Merged
merged 37 commits into from Apr 6, 2017

Conversation

Projects
None yet
5 participants
@odersky
Contributor

odersky commented Feb 8, 2017

This is a prototype implementation to add an "enum" construct to Scala. Scala enums give a more concise notation for

  • enums as in Java
  • ADTs
  • GADTs

A specification of what's implemented is provided by #1970.

Current status:

  • First implementation
  • Some test cases
  • A specification
  • A decision whether we want to go ahead with this

If we want to go ahead with this, the following todos should be broken out as separate issues.

  • An implementation of generic programming in the style of SYB. We need to clarify first exactly
    what we want from the compiler
  • A way to interop with Java enumerations
  • Update the exhaustiveness checking in the pattern matcher
  • Optimize pattern matcher to use enumTags
@odersky

This comment has been minimized.

Show comment
Hide comment
@odersky

odersky Feb 8, 2017

Contributor

@felixmulder The build failure comes from my changes to how doc comments are treated in Parser. The changes are in b20c1be. I noted a lot of docstring parameters passed, but they were all in perfect sync with the start position that we also pass, and they were all computed as in.getDocString(start). I foolishly assumed that getDocString was referentially transparent and could be called at the point where we need the docstring instead. But that seems to be a wrong assumption. My attempted fix in 4e0dc0e did not help.

So, we could back out, which is slightly messy but doable. But maybe it's best to go forward instead. Can you make getDocString referentially transparent, so that the timing when we call it does not matter? I would simply manage a map from a token start offset to the docstring that immediately precedes it. Then, since we have exact start positions of all trees we automatically have their docstrings as well, and a lot of redundancy and passing things around becomes unnecessary. I also don't see a need in this case to maintain a stack of docstrings that follows the block structure. WDYT?

Contributor

odersky commented Feb 8, 2017

@felixmulder The build failure comes from my changes to how doc comments are treated in Parser. The changes are in b20c1be. I noted a lot of docstring parameters passed, but they were all in perfect sync with the start position that we also pass, and they were all computed as in.getDocString(start). I foolishly assumed that getDocString was referentially transparent and could be called at the point where we need the docstring instead. But that seems to be a wrong assumption. My attempted fix in 4e0dc0e did not help.

So, we could back out, which is slightly messy but doable. But maybe it's best to go forward instead. Can you make getDocString referentially transparent, so that the timing when we call it does not matter? I would simply manage a map from a token start offset to the docstring that immediately precedes it. Then, since we have exact start positions of all trees we automatically have their docstrings as well, and a lot of redundancy and passing things around becomes unnecessary. I also don't see a need in this case to maintain a stack of docstrings that follows the block structure. WDYT?

@felixmulder

This comment has been minimized.

Show comment
Hide comment
@felixmulder

felixmulder Feb 9, 2017

Contributor

I think that originally I wanted to protect from all cases of misuse. E.g:

trait T {
  /** Cheeky docstring that shouldn't appear in `C` */
}
class C

ce40517 is a bit more permissive and allows this misuse...

Another alternative to this fix could be to make docstring its own token and attach it as part of the AST, but I guess that would be added complexity to the parser.

Update: failing test is compileStdLib

Contributor

felixmulder commented Feb 9, 2017

I think that originally I wanted to protect from all cases of misuse. E.g:

trait T {
  /** Cheeky docstring that shouldn't appear in `C` */
}
class C

ce40517 is a bit more permissive and allows this misuse...

Another alternative to this fix could be to make docstring its own token and attach it as part of the AST, but I guess that would be added complexity to the parser.

Update: failing test is compileStdLib

@odersky

This comment has been minimized.

Show comment
Hide comment
@odersky

odersky Feb 9, 2017

Contributor

I think that originally I wanted to protect from all cases of misuse. E.g:

trait T {
   /** Cheeky docstring that shouldn't appear in `C` */
}
class C

But if you associate doc comments to start positions of tokens following them, you can do that. The cheeky doc comment would be associated with }' not withclass`. So it would not qualify.

Re: stdlib I forgot to push a new one. (the old one uses enum in one place, but that works no longer since it is now a keyword).

Contributor

odersky commented Feb 9, 2017

I think that originally I wanted to protect from all cases of misuse. E.g:

trait T {
   /** Cheeky docstring that shouldn't appear in `C` */
}
class C

But if you associate doc comments to start positions of tokens following them, you can do that. The cheeky doc comment would be associated with }' not withclass`. So it would not qualify.

Re: stdlib I forgot to push a new one. (the old one uses enum in one place, but that works no longer since it is now a keyword).

@odersky odersky referenced this pull request Feb 13, 2017

Closed

Add enum construct #1970

@viktorklang

This comment has been minimized.

Show comment
Hide comment
@viktorklang

viktorklang Feb 13, 2017

viktorklang commented Feb 13, 2017

@odersky

This comment has been minimized.

Show comment
Hide comment
@odersky

odersky Feb 14, 2017

Contributor

Regarding shorthand for simple enums: Instead of

enum Color { case Red, Green, Blue }

should it rather be:

enum Color { case Red | Green | Blue }

? That would align it with pattern syntax.

Contributor

odersky commented Feb 14, 2017

Regarding shorthand for simple enums: Instead of

enum Color { case Red, Green, Blue }

should it rather be:

enum Color { case Red | Green | Blue }

? That would align it with pattern syntax.

@odersky

This comment has been minimized.

Show comment
Hide comment
@odersky

odersky Feb 27, 2017

Contributor

We now have the syntax

enum Color { case Red, Green, Blue }

supported.

I propose we defer all other todos to separate issues and skip straight to the decision. Do we want to go ahead with this concept? If everyone is still positive, let's review the code and get this in.

Contributor

odersky commented Feb 27, 2017

We now have the syntax

enum Color { case Red, Green, Blue }

supported.

I propose we defer all other todos to separate issues and skip straight to the decision. Do we want to go ahead with this concept? If everyone is still positive, let's review the code and get this in.

@odersky odersky requested a review from liufengyun Feb 28, 2017

ModuleDef(modName, Template(emptyConstructor, Nil, EmptyValDef, body))
.withMods(mods)
}
Thicket(clsDef :: modDef :: Nil)

This comment has been minimized.

@liufengyun

liufengyun Feb 28, 2017

Contributor

I'm not sure if it's a good idea to do this inside parser.

When we introduce macros, we do have multiple level desugaring (expansion) in Namer. And as we already have all the facility to support multiple level desugaring, I think we probably should use that. I've done some changes in Namer to support multiple-level desugaring can be seen here. It seems a flatten method is all that's needed.

@liufengyun

liufengyun Feb 28, 2017

Contributor

I'm not sure if it's a good idea to do this inside parser.

When we introduce macros, we do have multiple level desugaring (expansion) in Namer. And as we already have all the facility to support multiple level desugaring, I think we probably should use that. I've done some changes in Namer to support multiple-level desugaring can be seen here. It seems a flatten method is all that's needed.

This comment has been minimized.

@odersky

odersky Feb 28, 2017

Contributor

I tried multi-level expansion first, including flattening, but it did not work. Essentially the compiler got confused with symbol attachments and I did not find a good way around it.

@odersky

odersky Feb 28, 2017

Contributor

I tried multi-level expansion first, including flattening, but it did not work. Essentially the compiler got confused with symbol attachments and I did not find a good way around it.

Show outdated Hide outdated compiler/src/dotty/tools/dotc/ast/Desugar.scala
// todo: also use anyRef if constructor has a dependent method type (or rule that out)!
else (constrVparamss :\ classTypeRef) ((vparams, restpe) => Function(vparams map (_.tpt), restpe))
(constrVparamss :\ (if (isEnumCase) applyResultTpt else classTypeRef)) (
(vparams, restpe) => Function(vparams map (_.tpt), restpe))

This comment has been minimized.

@liufengyun

liufengyun Feb 28, 2017

Contributor

I may miss some point here, it seems to me the fold is useless here, because constrVparamss can only be of size 0?

@liufengyun

liufengyun Feb 28, 2017

Contributor

I may miss some point here, it seems to me the fold is useless here, because constrVparamss can only be of size 0?

This comment has been minimized.

@odersky

odersky Feb 28, 2017

Contributor

size 0 or 1

@odersky

odersky Feb 28, 2017

Contributor

size 0 or 1

This comment has been minimized.

@odersky

odersky Feb 28, 2017

Contributor

size 0 or 1, actually.

@odersky

odersky Feb 28, 2017

Contributor

size 0 or 1, actually.

This comment has been minimized.

@liufengyun

liufengyun Feb 28, 2017

Contributor

Ah, I see, thanks, obviously I've some problems with basic math :)

@liufengyun

liufengyun Feb 28, 2017

Contributor

Ah, I see, thanks, obviously I've some problems with basic math :)

Show outdated Hide outdated compiler/src/dotty/tools/dotc/ast/Desugar.scala
@odersky

This comment has been minimized.

Show comment
Hide comment
@odersky

odersky Mar 2, 2017

Contributor

Rebased to master

Contributor

odersky commented Mar 2, 2017

Rebased to master

@odersky odersky changed the title from WIP Add "enum" construct to Add "enum" construct Mar 3, 2017

odersky and others added some commits Feb 5, 2017

Add enum syntax
Modify syntax.md and Tokens/Parser/untpd to support enums.
Don't pass docstring as a parameter.
It's completely redundant, docstring is just the comment found
at the `start` offset, which is passed anyway.
Fix mal-formatting.
Insert an empty line before "where" in an explanation.
Simplify syntax
`enum' only allowed as a prefix of classes, dropped from traits and objects.
Change handling of enum defs.
The previous scheme did not work because desugaring cannot deal with
repeated expansions. We now sidestep the issue by doing the expansion in the parser. Luckily,
positions work out perfectly, so that one can reconstruct the source precisely from the parsed untyped
trees.
Improvement to REPL test
In case of difference, dump transcript to file for easier comparisons.
Be more systematic about result of apply method
The same type needs to be used as a result type for
the copy method, and the function type implemented by
the companion module.
More fine-grained distinctions when flags are defined.
Flags like Trait are in fact not always defined when a symbol
is created. For symbols loaded from class files, this flag, and
some other is defined only once the classfile has been loaded.
But this happens in general before the symbol is completed.

We model this distinction by separating from the `FromStartFlags` set
a new set `AfterLoadFlags` and distinguishing between the two sets
in `SymDenotations#is`.

Test case is enum-Option.scala. This erroneously complained before
that `Enum` was not a trait.
Don't change the return type of the `copy` method
`copy` should always return the type of it's rhs. The discussion of
#1970 concluded that no special treatment for enums is needed.
Change return type of `apply`.
In an enum case like

     case C() extends P1 with ... with Pn ...

apply now returns `P1 & ... & Pn`, where before
it was just P1.

Also, add to Option test.
Change enumeration members.
Based on the discussion in #1970, enumeration objects now
have three public members:

 - valueOf: Map[Int, E]
 - withName: Map[String, E]
 - values: Iterable[E]

Also, the variance of case type parameters is now
the same as in the corresponding type parameter of
the enum class.

odersky added some commits Feb 28, 2017

Support cases with type parameters that extend a non-parameterized base
Support cases with type parameters that implicitly extend a non-parameterized base
without needing their own extends clause. The proposal has been updated to make clear
that this is supported.

Also address other reviewers comments.
Fix neg test error count
Previous expansion caused 3 spurious follow-on errors which are now
avoided.
Avoid assertion failure on neg test
This commit can hopefully be reverted once #2121 is in.
Infer enum type args from type parameter bounds
Infer type arguments for enum paraments from corresponding type
parameter bounds. This only works if the type parameter in question
is variant and its bound is ground.
Implementation of proposal changes
 - rename utility methods
 - generate utility methods also for object cases
@odersky

This comment has been minimized.

Show comment
Hide comment
@odersky

odersky Apr 5, 2017

Contributor

As far is I'm concerned we are done for this part. We already had a review by @liufengyun but if people want to review either everything or just the later commits please assign to yourself as reviewer.

Contributor

odersky commented Apr 5, 2017

As far is I'm concerned we are done for this part. We already had a review by @liufengyun but if people want to review either everything or just the later commits please assign to yourself as reviewer.

odersky added some commits Apr 5, 2017

@liufengyun

This comment has been minimized.

Show comment
Hide comment
@liufengyun

liufengyun Apr 5, 2017

Contributor

One thing missing is exhaustivity check, which can be done in another PR.

According to our current algorithm, as long as the parent class symbol carries a list of children symbols (case vals and case classes), the algorithm will be able to handle.

My concern here is that in parser, it's clear what the children are, and it's easy to get them. But after desugaring, it seems to require a lot of assumptions about desugaring in order to recover that information.

Contributor

liufengyun commented Apr 5, 2017

One thing missing is exhaustivity check, which can be done in another PR.

According to our current algorithm, as long as the parent class symbol carries a list of children symbols (case vals and case classes), the algorithm will be able to handle.

My concern here is that in parser, it's clear what the children are, and it's easy to get them. But after desugaring, it seems to require a lot of assumptions about desugaring in order to recover that information.

@odersky odersky merged commit 62c2a1e into lampepfl:master Apr 6, 2017

2 checks passed

CLA User signed CLA
Details
continuous-integration/drone/pr the build was successful
Details

@allanrenucci allanrenucci deleted the dotty-staging:add-enum branch Dec 14, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment