What do you mean C#'s generic syntax is ambiguous?

Blokyk edited this page Aug 19, 2023 · 2 revisions

Note: We might also want to work on generic declarations (e.g. generic functions, traits, etc.) in addition to uses/instantiations.

TL;DR

C#'s generic syntax is ambiguous. Resolving that ambiguity requires arbitrary lookahead (or lookback) in theory, and probably some cursed parslet in practice, so we need a different syntax.

Current proposals:

Simple bars Foo|T| (#1)

let l: List|int| = Foo();

let d = Dict|string, string|();

func allOk|T|(ICollection|Result|T|| coll): bool {}

impl From|T| for Stuff|T, U| {}

Pros:

  • Mostly unused
  • Short
  • Not visually overpowering

Cons:

  • Can be hard to scope visually when nested
  • Possible conflict with union/composite types (although we could use "||" or an "or" keyword for those, I guess)
  • Rust uses |x, y| for lambdas, and since we haven't done the design for those yet, we might want to keep it just in case
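To make the nesting concern a bit more concrete, here's a minimal Python sketch of a greedy (maximal-munch) lexer run on the nested type from the example above. It assumes the language also ends up with a two-character || token (for logical-or or unions), which hasn't been decided anywhere on this page; the token set and the lex helper are made up purely for illustration.

```python
import re

# Hypothetical token set. Assumption: the lexer knows a two-character "||"
# token and, like most lexers, tries the longest alternative first.
TOKEN_RE = re.compile(r"\|\||\||[A-Za-z_]\w*|[(),:]")

def lex(src: str) -> list[str]:
    """Greedy lexer: '||' always wins over two separate '|' tokens."""
    return TOKEN_RE.findall(src)

nested = lex("ICollection|Result|T||")
print(nested)
```

Under that assumption, the two closing bars of the nested type come out as a single || token, so the parser would have to split it back up. This is the same kind of trouble that >> causes with nested angle brackets.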

Bars+classic Foo|<T>| (#2)

let l: List|<int>| = Foo();

let d = Dict|<string, string>|();

func allEqual|<T>|(ICollection|<T>| coll): bool {}

impl From|<T>| for Stuff|<T, U>| {}

Pros:

  • Unused
  • Kinda similar to most other generic syntax -> familiarity
  • Visible but not overwhelming
  • Unlike #1, will not clash with composite types OR bitwise or

Cons:

  • It can be kinda annoying to type repeatedly, and text editors will not auto-insert the closing delimiter
  • If we do go with |x, y| for lambdas, I'm not very comfortable with the syntactic/visual similarity between two features that are so unrelated

Brackets Foo[T] (#3)

let l: List[int] = Foo();

let d = Dict[string, string]();

func allEqual[T](ICollection[T] coll): bool {}

impl From[T] for Stuff[T, U] {}

Pros:

  • Best ratio of clarity/visibility
  • Similar to the Foo<T> notation
  • IDEs will complete the opening '[' with ']' automatically
  • I just like it :)

Cons:

  • Clashes with current array declaration syntax (e.g. List[1, 4] vs int[1, 4])

    • On top of being a problem for the parser, this also means that the human reading the code always has to be aware of whether a name refers to a type or to a variable, which can become pretty taxing or challenging in some situations
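Here's a small Python sketch of why the reader (human or parser) needs to know what the name is. It erases identifiers to compare the bare syntactic shape of two expressions, then classifies them with a symbol table. The names items and i, the KNOWN_TYPES set, and both helpers are hypothetical and only there for illustration.

```python
import re

IDENT = r"[A-Za-z_]\w*"

def shape(src: str) -> str:
    """Replace every identifier with ID, keeping only the syntactic shape."""
    return re.sub(IDENT, "ID", src)

# Hypothetical symbol table: names already known to be types.
KNOWN_TYPES = {"List", "Dict", "int", "string"}

def classify(src: str) -> str:
    """Decide what 'name[...]' means by looking the leading name up."""
    name = re.match(IDENT, src).group(0)
    return "type arguments" if name in KNOWN_TYPES else "indexing"

same_shape = shape("List[T]") == shape("items[i]")
print(same_shape, classify("List[T]"), classify("items[i]"))
```

Both inputs reduce to the same shape, so nothing in the syntax alone tells them apart; the answer only comes from knowing whether the leading name is a type.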

Parentheses Foo(T) (#4)

let l: List(int) = Foo();

let d = Dict(string, string)();

func allEqual(T)(ICollection(T) coll): bool {}

impl From(T) for Stuff(T, U) {}

Pros:

Cons:

Alternative: marker character (Foo!<T> or Foo'[T])

The clash in proposal #3 or the ambiguity in the original syntax could be resolved by using a marker character right after the type name, like ! or '.

This was one of my first ideas, but I'm afraid it might be a bit heavy. It can also be pretty challenging to find a character that wouldn't clash with any other construct in the language. For example, Foo! means "unwrap variable Foo", while Foo!<T> means "Type Foo with type arg T." However, I'm afraid this is the route we'll have to go down, since the above proposals all have some pretty heavy drawbacks.
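As a rough Python sketch of what the marker buys us: if the lexer treats "!<" as one dedicated token (an assumption, nothing here is decided), the parser can tell a type-argument list from an unwrap by looking at just the next token. The lex and starts_type_args helpers are hypothetical.

```python
import re

# Assumption: "!<" is a single token and is tried before "!" and "<".
TOKEN_RE = re.compile(r"!<|[A-Za-z_]\w*|[!<>(),]")

def lex(src: str) -> list[str]:
    """Split the source into tokens, skipping whitespace."""
    return TOKEN_RE.findall(src)

def starts_type_args(tokens: list[str], i: int) -> bool:
    """True if the name at tokens[i] is immediately followed by the marker."""
    return i + 1 < len(tokens) and tokens[i + 1] == "!<"

generic = lex("Stuff!<Foo, Bar>()")
unwrap = lex("Stuff! < Foo")
print(starts_type_args(generic, 0), starts_type_args(unwrap, 0))
```

The decision is local, which is the whole point, but note that in this sketch it hinges on whitespace: "Stuff!<Foo" with no space would also be lexed as the marker, even if the author meant "unwrap, then compare". That's exactly the kind of clash mentioned above.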

Bang + Classic

let l: List!<int> = Foo();

let d = Dict!<string, string>();

func allEqual!<T>(ICollection!<T> coll): bool {}

impl From!<T> for Stuff!<T, U> {}

Quote + Classic

let l: List'<int> = Foo();

let d = Dict'<string, string>();

func allEqual'<T>(ICollection'<T> coll): bool {}

impl From'<T> for Stuff'<T, U> {}

Quote + Bracket

Trust me, bang+bracket is just ugly

let l: List'[int] = Foo();

let d = Dict'[string, string]();

func allEqual'[T](ICollection'[T] coll): bool {}

impl From'[T] for Stuff'[T, U] {}

Ramblings

So, it turns out, I'm a bit dumb.

Although I know that generics will probably not be supported in the first few functioning versions of the compiler, I do want to support them in the final version of lotus, especially because we don't have mechanisms like boxing: there's no "base type" for everything (like System.Object in .NET).

However, one thing I hadn't really considered deeply is the syntax for them. I mean, I like the one in C#, and it looks like most languages take the same route, so let's just use the same one, right?

But.

But it turns out that this syntax is actually ambiguous. For example, given no semantic context, how should the following expression be parsed:

Check(Stuff<Foo, Bar>())

Obviously it's a call to Check() with the argument being a call to the function Stuff with two type arguments!

Check
└── Params
    └── Stuff
        ├── TypeArgs
        │   ├── Foo
        │   └── Bar
        └── Params
            └── ...

...or is it? (vsauce music intensifies)

I mean, how does the parser know that it's not just two comparisons?

Check
└── Params
    ├── Stuff < Foo
    └── Bar > ()
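To see both readings come out of the exact same tokens, here's a toy recursive-descent parser in Python. It is only a sketch: the grammar is made up for this example, the generics flag stands in for whatever disambiguation rule a real parser would use, and it assumes "()" on its own is a valid (unit-like) value so that the comparison reading can be built at all.

```python
import re

def lex(src):
    return re.findall(r"[A-Za-z_]\w*|[<>(),]", src)

class Parser:
    def __init__(self, tokens, generics):
        self.toks, self.pos, self.generics = tokens, 0, generics

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def eat(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def ident(self):
        tok = self.eat()
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected a name, got {tok!r}")
        return tok

    def args(self):
        # '(' [expr (',' expr)*] ')'
        self.eat("(")
        items = []
        while self.peek() != ")":
            items.append(self.expr())
            if self.peek() == ",":
                self.eat(",")
        self.eat(")")
        return items

    def primary(self):
        if self.peek() == "(":
            # Assumption: a bare "()" is a unit-like value.
            self.eat("(")
            self.eat(")")
            return "()"
        name, type_args = self.ident(), []
        if self.generics and self.peek() == "<":
            # Reading 1: '<' opens a type-argument list.
            self.eat("<")
            type_args.append(self.ident())
            while self.peek() == ",":
                self.eat(",")
                type_args.append(self.ident())
            self.eat(">")
        if self.peek() == "(":
            return ("call", name, type_args, self.args())
        return name

    def expr(self):
        # Reading 2: '<' and '>' are plain binary comparison operators.
        left = self.primary()
        while self.peek() in ("<", ">"):
            op = self.eat()
            left = (op, left, self.primary())
        return left

tokens = lex("Check(Stuff<Foo, Bar>())")
with_generics = Parser(tokens, generics=True).expr()
as_comparisons = Parser(tokens, generics=False).expr()
print(with_generics)
print(as_comparisons)
```

Both runs consume every token without error, so without some extra rule (or semantic information) the grammar alone can't pick one reading over the other.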

Well, that's because C# has a rule that