Flow Enums

Flow is working on adding an optional (and by default off) enums language feature, for improved developer experience, performance, and safety.

Motivation

  • Developer experience
    • There are currently a variety of patterns in use for modeling enums in Flow. We will replace (most of) these with a construct that is more ergonomic to write and use.
  • Flow performance
    • Type-checking arbitrary unions is expensive. While we have heuristics which optimize type-checking unions when they conform to certain patterns, enums are designed in such a way that they are guaranteed to have good type-checking performance, ensuring the programmer cannot accidentally make a modification which will result in poor performance.
  • Safety
    • While the type-checking behavior is still being designed, we will probably disallow some behaviors that are currently possible with the “union of strings” model, in the spirit of preventing bugs in the programmer’s code.

Design Principles

  1. The design should be “JavaScripty” - creating and using enums should feel natural when compared to using existing JavaScript constructs and builtins.
  2. The grammar and AST specs should fit with the existing ECMAScript and ESTree specifications.
  3. While runtime optimizations (constant inlining of enum usage) should be separate from the language, enums should be designed in such a way to make doing these optimizations possible.
  4. The initial design should be as restrictive as possible, while still being useful. It is much easier to start off with something restrictive and make it less so as compelling use-cases come up, than the other way around.
  5. All else being equal, compatibility with TypeScript in terms of syntax is nice to have.

Examples

Basic enum with defaulted member values (the values are strings that mirror the member names):

enum E {
  A,
  B,
}

Explicitly set member values:

enum E {
  A = 1,
  B = 2,
}

Explicit representation type:

enum E of symbol {
  A,
  B,
}

All at once:

enum E of string {
  A = "a",
  B = "b",
}

Why did we choose this syntax? See the Rationale section below.
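
This document does not specify how enums are represented at runtime. Purely as a mental model of the member values, the four declarations above could be approximated with frozen plain objects. The names Defaulted, Numbered, Symbolic and Explicit are hypothetical, and this is not the compiled output of Flow enums:

```javascript
// Hypothetical plain-JavaScript stand-ins for the four examples above.
// They only illustrate which value each member carries.

// enum E {A, B}: defaulted members mirror their names as strings.
const Defaulted = Object.freeze({A: "A", B: "B"});

// enum E {A = 1, B = 2}
const Numbered = Object.freeze({A: 1, B: 2});

// enum E of symbol {A, B}: each member is a distinct symbol.
const Symbolic = Object.freeze({A: Symbol("A"), B: Symbol("B")});

// enum E of string {A = "a", B = "b"}
const Explicit = Object.freeze({A: "a", B: "b"});

console.log(Defaulted.A, Numbered.B, typeof Symbolic.A, Explicit.B);
// prints: A 2 symbol b
```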

Grammar Specification

Based on http://ecma-international.org/ecma-262/9.0/

Properties

The grammar captures:

  • Mixing defaulted and initialized members is not allowed.
  • Only literals of type boolean, number, or string are allowed as member initializers.
  • All member initializers must be of the same primitive type.
  • If a representation type is specified, the type of the member initializers must match it (the typeof of each member value must match).
  • Member names must not start with lowercase ‘a’ through ‘z’.
  • If the representation type is specified as boolean or number, then all enum members must be initialized with a value (no defaulting is allowed).
  • When you do not specify an explicit representation type or member initializers, string is used as the representation type.

Static semantics capture:

  • No duplicate member names are allowed.
  • No duplicate member values are allowed (as defined by the Strict Equality Comparison in the spec).

(Static semantics to be fully specified as per ECMAScript language spec conventions)

In short, we allow for completely initialized number or boolean enums, completely initialized or completely defaulted string enums, or completely defaulted symbol enums.
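
The rules above can be sketched as a small checker. This is only an illustration: the input shape ({explicitType, members: [{name, init}]}, with init left undefined for a defaulted member) and the function name checkEnum are invented here and are not part of the grammar or of Flow.

```javascript
// Hypothetical checker for the properties listed above.
// `explicitType` is null or one of "boolean", "number", "string", "symbol".
function checkEnum({explicitType = null, members}) {
  const errors = [];
  const initialized = members.filter((m) => m.init !== undefined);

  // Mixing defaulted and initialized members is not allowed.
  if (initialized.length > 0 && initialized.length < members.length) {
    errors.push("mixing defaulted and initialized members");
  }

  // All initializers share one primitive type, which matches the explicit type.
  const types = new Set(initialized.map((m) => typeof m.init));
  if (types.size > 1) errors.push("member initializers have mixed types");
  for (const t of types) {
    if (!["boolean", "number", "string"].includes(t)) {
      errors.push(`initializer type '${t}' is not allowed`);
    } else if (explicitType !== null && t !== explicitType) {
      errors.push(`initializer type '${t}' does not match '${explicitType}'`);
    }
  }

  // boolean and number enums may not use defaulted members.
  if ((explicitType === "boolean" || explicitType === "number") &&
      initialized.length < members.length) {
    errors.push(`'${explicitType}' enums must initialize every member`);
  }

  // Name and value checks. Set membership uses SameValueZero, which agrees
  // with strict equality for every value a literal can produce.
  const names = new Set();
  const values = new Set();
  for (const {name, init} of members) {
    if (/^[a-z]/.test(name)) errors.push(`'${name}' starts with a lowercase letter`);
    if (names.has(name)) errors.push(`duplicate member name '${name}'`);
    names.add(name);
    if (init !== undefined) {
      if (values.has(init)) errors.push(`duplicate member value for '${name}'`);
      values.add(init);
    }
  }
  return errors;
}

// enum E {A, B} is accepted; enum E {A = 1, B = 1} is rejected.
console.log(checkEnum({members: [{name: "A"}, {name: "B"}]}));
console.log(checkEnum({members: [{name: "A", init: 1}, {name: "B", init: 1}]}));
```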

Grammar

  • Declaration (modified):
    • ...
    • EnumDeclaration
  • EnumDeclaration:
    • enum BindingIdentifier EnumBody
  • EnumBody:
    • EnumBooleanExplicitType[opt] { EnumBooleanBody[opt] }
    • EnumNumberExplicitType[opt] { EnumNumberBody[opt] }
    • EnumStringExplicitType[opt] { EnumDefaultedBody[opt] }
    • EnumStringExplicitType[opt] { EnumStringBody[opt] }
    • EnumSymbolExplicitType { EnumDefaultedBody[opt] }
  • EnumBooleanExplicitType:
    • of boolean
  • EnumNumberExplicitType:
    • of number
  • EnumStringExplicitType:
    • of string
  • EnumSymbolExplicitType:
    • of symbol
  • EnumDefaultedBody:
    • EnumDefaultedMemberList ,[opt]
  • EnumBooleanBody:
    • EnumBooleanMemberList ,[opt]
  • EnumNumberBody:
    • EnumNumberMemberList ,[opt]
  • EnumStringBody:
    • EnumStringMemberList ,[opt]
  • EnumDefaultedMemberList:
    • EnumDefaultedMember
    • EnumDefaultedMemberList , EnumDefaultedMember
  • EnumDefaultedMember:
    • EnumMemberName
  • EnumBooleanMemberList:
    • EnumBooleanMember
    • EnumBooleanMemberList , EnumBooleanMember
  • EnumBooleanMember:
    • EnumMemberName = BooleanLiteral
  • EnumNumberMemberList:
    • EnumNumberMember
    • EnumNumberMemberList , EnumNumberMember
  • EnumNumberMember:
    • EnumMemberName = NumericLiteral
  • EnumStringMemberList:
    • EnumStringMember
    • EnumStringMemberList , EnumStringMember
  • EnumStringMember:
    • EnumMemberName = StringLiteral
  • EnumMemberName:
    • EnumMemberNameStart
    • EnumMemberName IdentifierPart
  • EnumMemberNameStart:
    • $
    • _
    • EnumMemberNameUnicodeIDStart
  • EnumMemberNameUnicodeIDStart:
    • any Unicode code point with the Unicode property “ID_Start”, except for a through z

Notes: EnumMemberName is IdentifierName, but without allowing the lowercase characters a through z as the first character.
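
As a rough illustration of the EnumMemberName production, the check can be approximated with a Unicode-aware regular expression. This sketch assumes names are written without Unicode escape sequences (which IdentifierPart also permits), and the helper name isEnumMemberName is hypothetical:

```javascript
// EnumMemberNameStart: `$`, `_`, or any ID_Start code point except a-z.
// IdentifierPart: `$`, ID_Continue, ZWNJ (U+200C) or ZWJ (U+200D).
// The `u` flag is required for the \p{...} property escapes.
const ENUM_MEMBER_NAME =
  /^(?![a-z])[$_\p{ID_Start}][$\u200C\u200D\p{ID_Continue}]*$/u;

function isEnumMemberName(name) {
  return ENUM_MEMBER_NAME.test(name);
}

console.log(isEnumMemberName("Active")); // true
console.log(isEnumMemberName("active")); // false
console.log(isEnumMemberName("_hidden")); // true
```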

AST Specification

Based on https://github.com/estree/estree

interface EnumDeclaration <: Declaration {
    type: "EnumDeclaration";
    id: Identifier;
    body: EnumBooleanBody | EnumNumberBody | EnumStringBody | EnumSymbolBody;
}

interface EnumBooleanBody {
    type: "EnumBooleanBody";
    members: [ EnumBooleanMember ];
    explicitType: boolean;
}

interface EnumNumberBody {
    type: "EnumNumberBody";
    members: [ EnumNumberMember ];
    explicitType: boolean;
}

interface EnumStringBody {
    type: "EnumStringBody";
    members: [ EnumStringMember ] | [ EnumDefaultedMember ];
    explicitType: boolean;
}

interface EnumSymbolBody {
    type: "EnumSymbolBody";
    members: [ EnumDefaultedMember ];
}

interface EnumBooleanMember {
    type: "EnumBooleanMember";
    id: Identifier;
    init: Literal;
}

interface EnumNumberMember {
    type: "EnumNumberMember";
    id: Identifier;
    init: Literal;
}

interface EnumStringMember {
    type: "EnumStringMember";
    id: Identifier;
    init: Literal;
}

interface EnumDefaultedMember {
    type: "EnumDefaultedMember";
    id: Identifier;
}
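
To make the interfaces concrete, here is what a tree for enum E of string {A = "a", B = "b"} might look like under the interfaces above. The Literal nodes follow the ESTree shape (value and raw); the helper memberValues is hypothetical and only handles string bodies:

```javascript
// Sketch of the AST for: enum E of string {A = "a", B = "b"}
const ast = {
  type: "EnumDeclaration",
  id: {type: "Identifier", name: "E"},
  body: {
    type: "EnumStringBody",
    explicitType: true, // `of string` was written out
    members: [
      {
        type: "EnumStringMember",
        id: {type: "Identifier", name: "A"},
        init: {type: "Literal", value: "a", raw: '"a"'},
      },
      {
        type: "EnumStringMember",
        id: {type: "Identifier", name: "B"},
        init: {type: "Literal", value: "b", raw: '"b"'},
      },
    ],
  },
};

// Hypothetical helper: map member names to values for a string enum.
// Defaulted members take their own name as the value.
function memberValues(decl) {
  if (decl.body.type !== "EnumStringBody") {
    throw new Error("this sketch only handles string enums");
  }
  const out = {};
  for (const m of decl.body.members) {
    out[m.id.name] = m.type === "EnumDefaultedMember" ? m.id.name : m.init.value;
  }
  return out;
}

console.log(memberValues(ast)); // { A: 'a', B: 'b' }
```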

Syntax Rationale

At the top level, there are two possible syntaxes, c-style and ml-style, i.e.

  • c-style: enum E {A, B, C}
  • ml-style: enum E = A | B | C;

We use the c-style syntax, because the syntax is more consistent with existing language constructs in JavaScript (principle 1).

Rationale for punctuation:

  • Keyword
    • enum has significant prior art - additionally, it is already a reserved word in JavaScript (see Future Reserved Words).
    • E.g. enum is used by: C, C++, C#, D, Hack, Java, Nim, Rust, Swift, TypeScript
  • Separator between enum members
    • , (comma) - this is the most common separator in use for this purpose.
    • E.g. , is used by: Ada, C, C++, C#, D, Java, Nim, Rust, TypeScript
  • Separator between member name and value
    • = (equals) - this is the most common separator in use for this purpose.
    • E.g. = is used by: C, C++, C#, D, Go, Hack, Nim, Rust, Swift, TypeScript
  • Optional representation-type definition syntax
    • of as a punctuator is already used in for-of statements.
    • A possible alternative, :, may be used at some point later to denote that the enum is a subtype of the RHS of : (a different property than what we're describing here, and one for which : is already used in opaque types), so we don't want to use it to denote the representation type.
    • Using of makes sense grammatically in English.
    • The RHS of of is one of boolean, string, number, symbol - which follows the output of typeof.
    • undefined is a possible output of typeof, but we do not allow it as a representation type: enums usually represent a choice between more than one element, whereas this would define an enum with only one possible member. Additionally, there is no undefined literal value which we could use as the member value.

Rationale for having an optionally explicit representation-type definition:

  • Allows us to have enums of symbols (there is no symbol literal for us to specify as the value).
  • Improves error messages when member values of mixed types are given (e.g. enum E {A = "a", B = 1}). Without the explicit definition, we can only report that the types are mixed; with it, we can be more specific (e.g. for enum E of string {A = "a", B = 1}: "Error: The value of 'B' is not a 'string'").
  • However, it is not required, as in many cases it is not necessary, and duplicates information already provided by providing explicit enum member values.

Rationale for properties of the syntax:

  • Mixing defaulted and initialized members is not allowed.

    • Mixing defaulted and explicitly initialized members can cause confusion, and can lead to unintuitive behavior regarding what the defaulted values should be. E.g. with enum E {A = "a", B = "b", C}, a person reading quickly could easily think that the value of the member C is "c", when it would in fact be "C".
  • Only allow literals as values.

    • We need to know the values of each member statically, to allow for them to be inlined (principle 3) [note: symbol based enums cannot be inlined]. We could allow arbitrary expressions if those expressions can be statically evaluated, however this adds complexity to the first version of enums, with unclear benefits.
  • The type of the member initializers must all be of the same type.

    • We don't allow for mixed types of member values (e.g. enum E {A = "foo", B = 1}). This adds complexity for unclear benefit, and also complicates our rule that member values should be unique - would we allow enum E {A = 1, B = "1"}?
  • Member names must not start with lowercase ‘a’ through ‘z’.

    • Enum methods start with lowercase ‘a’ through ‘z’. This way an enum member name can never shadow a method, and there is never any confusion on the part of the user.
  • If the representation type is specified as boolean then all enum members must be initialized with a value.

    • We do not allow defaulting of boolean enums, because it is unclear which case should be true, and which case should be false (e.g. enum E of boolean {A, B}). English usually orders true before false, but commonly 0 evaluates to false and 1 to true, which is the opposite ordering.
  • If the representation type is specified as number, then all enum members must be initialized with a value.

    • We do not allow defaulting of numeric enums, because if a member from the middle of such an enum is removed or added, all subsequent member values would be changed. This can be unsafe (e.g. push safety, serialization, logging). Requiring the user to be explicit about the renumbering makes them think about the consequences of doing so.
  • When you do not specify an explicit representation type or initialized member values, string is used as the representation type.

    • E.g. for enum E {A, B}, the representation type is string, with the member values being strings mirrored from the member names.
    • Given that we don't allow enums of boolean and number to use defaulted members, the choice is between string and symbol as the default representation type. Since one of the goals is to enable inlining of enum member values, and symbol enums cannot be inlined, we choose string as the default.
  • No duplicate member names are allowed.

    • Having duplicate names does not construct a valid enum. A single enum member name would point to multiple values. Not only is this confusing, but could cause errors when we compile the enum syntax, and would cause errors when attempting to inline enum members.
    • We additionally have precedent from some other cases where duplicate identifiers are not allowed:
      • Duplicate parameter names (under strict mode): "use strict"; function f(a, a) { } gives "Uncaught SyntaxError: Duplicate parameter name not allowed in this context".
      • Duplicate labels: a: while (false) { a: while (false) {}} gives "Uncaught SyntaxError: Label 'a' has already been declared".
      • Duplicate __proto__ properties: ({__proto__: {}, __proto__: {}}) gives "Uncaught SyntaxError: Duplicate __proto__ fields are not allowed in object literals".
  • No duplicate member values are allowed.

    • We have this rule in an attempt to prevent bugs in the programmer's logic.
    • Motivating example: suppose there is an enum with duplicate values, enum E {A = 1, B = 1}. Later in the code, there is a switch over an enum value with the intention of having different logic run for each case: const e: E = ...; switch (e) { case E.A: return executeA(); case E.B: return executeB(); }. However, this code will never reach the case that runs executeB(), because E.A and E.B have the same underlying value, so both will hit case E.A.
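
The duplicate-value problem can be reproduced in plain JavaScript today. In this sketch a frozen object stands in for enum E {A = 1, B = 1} (an assumed representation, used only for illustration), and the function name run is hypothetical:

```javascript
// Plain-object stand-in for `enum E {A = 1, B = 1}`.
const E = Object.freeze({A: 1, B: 1});

function run(e) {
  switch (e) {
    case E.A:
      return "executeA";
    case E.B: // never reached: E.B === E.A, so the first case always wins
      return "executeB";
    default:
      return "none";
  }
}

console.log(run(E.A)); // prints "executeA"
console.log(run(E.B)); // prints "executeA" as well
```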

Rationale for supporting boolean and symbol based enums:

  • Enums based on numbers and strings are common in other languages. We allow two additional variants: enums of booleans and symbols.
  • Boolean enums
    • Booleans, like numbers and strings, are primitive values with literal representations. In this they fulfill the requirements for being used as enum member initializers.
    • But why would they be useful? Boolean enums provide a lightweight way to make boolean flags safer. Rather than passing in raw boolean values which don't describe their purpose, a self-documenting enum member can be used. E.g. with enum Status {Enable = true, Disable = false}, compare f(x, y, z, true) before with f(x, y, z, Status.Enable) after. While we don't go into details here, we will provide a mechanism to safely get the underlying value (in this case a boolean) from an enum.
  • Symbol enums
    • There is a trade-off between using primitive values like numbers or strings, and using symbols:
      • We can inline the usage of primitive values that have literals, like numbers and strings, but cannot inline the usage of symbols.
      • At runtime, if you purposefully ignore any type-checking that we do, it is possible for two different enums whose representation type is boolean, number, or string to have members that are equal at runtime. However, this is not possible for symbol based enums.
      • Allowing symbol based enums allows the programmer to make this trade-off.
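
The runtime difference can be seen with plain objects standing in for enums. All four names below are hypothetical, and the object representation is an assumption made only for this illustration:

```javascript
// Two unrelated string-based "enums" whose members happen to share values.
const Status = Object.freeze({Active: "Active", Off: "Off"});
const Switch = Object.freeze({Active: "Active", Off: "Off"});

// The same two "enums" backed by symbols instead.
const SymStatus = Object.freeze({Active: Symbol("Active"), Off: Symbol("Off")});
const SymSwitch = Object.freeze({Active: Symbol("Active"), Off: Symbol("Off")});

// Strings from different enums are indistinguishable at runtime...
console.log(Status.Active === Switch.Active); // true

// ...while every Symbol() call creates a unique value.
console.log(SymStatus.Active === SymSwitch.Active); // false
```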