Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIP-56 - Proper Specification for Match Types. #65

Merged
merged 8 commits into from
Dec 11, 2023

Conversation

sjrd
Copy link
Member

@sjrd sjrd commented Aug 11, 2023

Currently, match type reduction is not specified, and its implementation is by nature not specifiable.
This is an issue because match type reduction spans across TASTy files (unlike, for example, type inference or GADTs), which can and will lead to old TASTy files not to be linked again in future versions of the compiler.

This SIP proposes a proper specification for match types, which does not involve type inference.
It is based on baseType computations and subtype tests involving only fully-defined types.
That is future-proof, because baseType and subtyping are defined in the specification of the language.

The proposed specification defines a subset of current match types that are considered legal.
Legal match types use the new, specified reduction rules.
Illegal match types are rejected, which is a breaking change, and can be recovered under -source:3.3.

@sjrd sjrd force-pushed the match-types-spec branch 2 times, most recently from e0266f2 to 78254ae Compare August 11, 2023 09:37
content/match-types-spec.md Outdated Show resolved Hide resolved
content/match-types-spec.md Outdated Show resolved Hide resolved
content/match-types-spec.md Outdated Show resolved Hide resolved
content/match-types-spec.md Outdated Show resolved Hide resolved
content/match-types-spec.md Outdated Show resolved Hide resolved
content/match-types-spec.md Outdated Show resolved Hide resolved
content/match-types-spec.md Outdated Show resolved Hide resolved
content/match-types-spec.md Outdated Show resolved Hide resolved
Their specification is complicated, and the implementation as well.
Our quantitative study showed that they were however "often" used (10 occurrences spread over 4 libraries).
In each case, they seem to be a way to express what Scala 2 type projections (`A#T`) could express.
While not quite as powerful as type projections (which were shown to be unsound), match types with type member extractors delay things enough for actual use cases to be meaningful.
Copy link
Member

@SethTisue SethTisue Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dwijnand you might have something to say about this? I know it's come up repeatedly in our discussions about match types that it's possible to use them to smuggle in something akin to abstract type projections.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something akin to abstract type projections

With the proposed spec, only concrete type projections actually make it. As long as it's abstract, ExtractTOf[A] cannot reduce to T like A#T does. The difference is that A#T has the "best" bounds of T in A, whereas ExtractTOf[A] has only the match type's bound as long as T is not concrete (through subsequent substitution of A by something more precise in which T is a type alias whose rhs does not depend on its prefix).

For example,

class Base {
  type T
  def someT: T = ???
}

type ExtractTOfAux[A] = Base { type T = A }
type ExtractTOf[X <: Base] = X match
  case ExtractTOfAux[t] => t

// Scala 2 abstract type projection, works
def fooOfBaseProj[B <: Base](b: B): B#T = b.someT

// Scala 3 match type extractor: does not work because the match type does not reduce
// as T is not concrete in B.
def fooOfBaseMT[B <: Base](b: B): ExtractTOf[B] = b.someT

The match-type-based type member extractor still features enough power because you can "carry around" values of that type, and at the edge your B becomes concrete enough that you can reduce. This is what happens with scala.Enumerations.

IMO, people should not use that. It's all a hack. Instead, we should use path-dependent types with actual values of the enclosing class. For example the proper definition of fooOfBase is:

def fooOfBasePathDep(b: Base): b.T = b.someT

Unfortunately, people have been using type projections/type member extractors instead. Yes, sometimes it avoids one extra layer of nesting: if you need a class that extends Foo[b.type], you would b to be declared one level up, and so that class needs to be nested in another class that has a b field. This is why the json4s workaround is what it is, instead of turning it into path-dependent types. The correct solution would be to have

class EnumSerializers(val enumeration: Enumeration) {
  class EnumSerializer extends Serializer(enumeration.Value) {
    ...
  }
}

but that would have changed the public API, so it was not an acceptable PR to make.

content/match-types-spec.md Outdated Show resolved Hide resolved
@soronpo
Copy link
Contributor

soronpo commented Aug 12, 2023

I gave only the specification part to GPT-4 for review. Here is what it mentioned:
The provided specification for the proposed SIP regarding Match Types in Scala 3 is thorough and quite detailed. However, based on the text you've provided, here are a few observations and potential issues:

  1. Clarity on Syntactical Components:

    • When you mention various constructs like TypeCapture, TypeWithoutCapture, etc., it would be helpful to provide some simple examples or even a brief definition for each. This ensures that the reader is clear on what constitutes each of these syntactical elements.
  2. Nested Captures in Legal Patterns:

    • The specification mentions that the following is not a legal pattern: case Inv[Cov[t]] => because a type capture is nested two levels below a non-covariant type constructor. However, it's not immediately obvious why this is illegal based solely on the legal pattern criteria provided. An elaboration on this could be beneficial.
  3. Concrete Vs. Abstract:

    • There's mention of checking whether a type is 'concrete' versus 'abstract', especially in the matching section. It might be helpful to clarify how the proposal differentiates between these two.
  4. The Matching Process:

    • The matching procedure is explained well in terms of steps and conditions. Still, due to its complexity, adding a visual representation, like a flowchart, might help in better understanding the flow of the process.
  5. Refined Type Constructs:

    • There's a mention of refined types, like Base { type Y = t }. However, an explanation of the intricacies of these refined types, especially within the context of match types, would be beneficial.
  6. Caveats or Limitations:

    • If there are any caveats or limitations associated with this proposal, it would be beneficial to outline them explicitly. This helps in setting clear expectations for developers using this feature.
  7. Examples:

    • While there are ample examples given, providing more complex scenarios and edge cases might be useful. This can help in highlighting the intricacies and power of the new match type system. Moreover, a practical use-case of where such a feature would be beneficial in real-world code can be a useful addition.
  8. Performance Implications:

    • Given the complexity of the match types and the specified algorithm, what are the expected performance implications? Would matching against these types result in a considerable slowdown in compile time?
  9. Error Handling & Messages:

    • The proposal doesn't talk about the quality or detail of error messages that would be provided when developers write illegal patterns or mismatched types. For a feature of this complexity, clear error messages are crucial.
  10. Disjointness:

    • The section on disjointness is quite brief. Elaborating more on how provably disjoint types are handled might provide clarity, especially in edge cases.

In summary, while the proposal is detailed and gives a good overview of the changes to Match Types, addressing the above points could make it even clearer and more actionable. It's essential that when proposing such changes, developers at all levels of expertise can understand and work with the feature effectively.

@anatoliykmetyuk
Copy link
Contributor

As per the recent changes to the SIP process, each SIP now needs to have a SIP Manager assigned. I propose @bjornregnell to be a manager of this SIP - is that OK with you?

As per the process spec, this role entails responsibility for all the communications around the SIP during its lifecycle, including requesting a vote on the SIP from the whole Committee, presenting the SIP to the Committee at the plenary meetings, merging or closing the corresponding PR, reporting to the community on the vote outcome, and announcing when it is available for testing.

@bjornregnell
Copy link

@anatoliykmetyuk I'm not sure I'm the best pic for this as I have never seriously tried Match types or any other Scala meta-programming feature so I have some catch up + learning to do in order to write intelligibly about this in a community post etc.

But if there is no one more knowledgeable in Match types available, I can do it iff @sjrd is willing to give me a beginner-level personal tutorial in Zoom when possible ;) (It will have to be after the start of the fall semester as I'm currently swamped with teaching preparations...)

This enables a workaround for the `scala.Enumeration#Value`
extractor use case, found notably in json4s.
@sjrd
Copy link
Member Author

sjrd commented Aug 14, 2023

I have addressed Seth's comments.

As well as some comments of ChatGPT 🙄.

In each case, they seem to be a way to express what Scala 2 type projections (`A#T`) could express.
While not quite as powerful as type projections (which were shown to be unsound), match types with type member extractors delay things enough for actual use cases to be meaningful.

As far as we know, those use cases have no workaround if we make type member extractors illegal.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I understand correctly, the currently unspecified Match-types brings some expressive power from Scala 2's type projections T#A with respect to type member extraction as in the use case of computing the type of a member in an enum. This seems useful. Currently the proposal seems to state that this expressive power should be removed with no workaround. I think that the proposal needs a discussion on what can be done about this or why this particular case is not included among the legal cases as a special case or something similar. Or is it unsound to ask for the type of enum members? (sorry if this is a nonsensical question)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum no, the sentence says: if we remove type member extractors from the proposed spec, then we will have no workaround anymore. With the proposed spec, we can extract type members.

The originally submitted SIP/spec did not allow to extract the type member of SomeEnumeration#Value because Value is a class member (not a type alias), but the commit "Allow class members to be matched by type member extractors." made that actually possible.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK thanks for the clarification!

@anatoliykmetyuk
Copy link
Contributor

@bjornregnell got it, thanks for the heads-up!

@soronpo has agreed to be the SIP Manager, so, in that case, he'll manage this SIP.

@sjrd
Copy link
Member Author

sjrd commented Aug 23, 2023

Well, it took me quite a while to get to the bottom of it, but I added a proposed specification for the provably-disjoint test, as requested. Note that in the end, I actually do recommend that we change it compared to the current implementation. I have a PR for those additional changes at scala/scala3#18416

@sjrd sjrd force-pushed the match-types-spec branch 2 times, most recently from d53fa51 to 88bd807 Compare August 23, 2023 13:22
I discovered that it is necessary for some existing, reasonable
use cases of `provablyDisjoint`.
@sjrd
Copy link
Member Author

sjrd commented Aug 30, 2023

I have polished the proposal a bit further with type lambda support in provablyDisjoint, and with a spec for the full reduction algorithm as well.

* `Ai ⋔ Bi` and it is in invariant position, or
* `Ai ⋔ Bi` and it is in covariant position and there exists a field of that type parameter in `E`

It is worth noting that this definition disregards prefixes entirely.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disregards differing prefixes? Just to be clear that not all prefixes are disregarded.

@sjrd
Copy link
Member Author

sjrd commented Sep 4, 2023

@sjrd or @bishabosha could you add AnyKind to this, please? I.e. lampepfl/dotty#18510

What do you think is missing? It's already in the spec and in the implementation for this SIP. (Note that in the spec, Nothing and AnyKind are not classes; they are special types.)

@dwijnand
Copy link
Member

dwijnand commented Sep 4, 2023

@sjrd or @bishabosha could you add AnyKind to this, please? I.e. lampepfl/dotty#18510

What do you think is missing? It's already in the spec and in the implementation for this SIP. (Note that in the spec, Nothing and AnyKind are not classes; they are special types.)

That any type and AnyKind are never disjoint, which it is in the current implementation and that I'm fixing in my PR. If that's what's already said in this spec, then nothing more.

@sjrd
Copy link
Member Author

sjrd commented Sep 4, 2023

The provablyDisjoint () relation is defined in terms of all the possible cases that make it true. When one of the sides is AnyKind, the only possible rules that can apply are Nothing ⋔ T and S ⋔ Nothing, as well as rules on union/intersection types, which recurse. So AnyKind ⋔ Nothing and conversely, but no other type can ever be provably disjoint from AnyKind. There is no need for a rule that says something about when provablyDisjoint is not true.

@soronpo
Copy link
Contributor

soronpo commented Oct 17, 2023

@bjornregnell @Kordyjan Do you agree to request a vote in our next SIP meeting? If yes, what would be your recommendation?

@bjornregnell
Copy link

bjornregnell commented Oct 17, 2023

Do you agree to request a vote in our next SIP meeting? If yes, what would be your recommendation?

Yes, I recommend to accept iff as @sjrd is happy with the review feedback given so far and feel fairly certain that the spec does not contain major problems that cannot be fixed later. I guess we are voting for it to "experimental" and that we later vote for it be be accepted as "stable" when a tested implementation is available.

I have read through the proposal and find the motivation for the need of a spec very well written and understandable. However, my knowledge in the underlying theory, type systems. meta-programming, etc, is too limited to give deeper feedback to the actual specification, esp. the section "Matching".

If I mark the entire natural language bullet list after "matchPattern behaves according to what kind is P:" and copy-paste it into a val s = """...""" and calculate

scala> s.split("\n").map(_.trim).filter(_.startsWith("If ")).length

then I get 21 (sic!) "If "-clauses that are up for review. I humbly wonder if there is some other human with at least the knowledge of @sjrd that have double-checked each rule for omissions or bugs?

I have read all 21 of them and "LGTM", but I cannot say that I am able to check them, although they look reasonable to me on a surface-understanding level.

Also, perhaps it could be explicitly stated in the SIP preamble that this proposed spec form the requirements of a complete re-implementation from scratch of this language feature in a coming compiler version (if that is the case (?) - just to make it clear upfront).

(And then I wonder how to verify that the new implementation complies with this spec - but that is a matter of how to implement this and how to accept is as "stable" in a later stage...)

Anyway, thank you very much indeed @sjrd for embarking on this seemingly very intricate and tedious but also very important work.

@bjornregnell
Copy link

(And I guess that when implementing this in the compiler based on the spec, omissions and bugs in the spec are likely to be found, if any.)

@sjrd
Copy link
Member Author

sjrd commented Oct 17, 2023

I humbly wonder if there is some other human with at least the knowledge of @sjrd that have double-checked each rule for omissions or bugs?

FWIW, there is at least one other human: Olivier Blanvillain, who created match types in the first place.

I hoped there would be fewer "If"s. Unfortunately, it seems each one of them is necessary to have something complete enough. 🤷‍♂️

(And I guess that when implementing this in the compiler based on the spec, omissions and bugs in the spec are likely to be found, if any.)

There is already an implementation at scala/scala3#18262, which indeed, discovered omissions in the early versions of the spec. The latest spec is up-to-date with that implementation, and conversely. The implementation passes the (small) community build, but we have not tried it yet on the open community build.

@bjornregnell
Copy link

The implementation passes the (small) community build, but we have not tried it yet on the open community build.

OK great. Perhaps each "If" can be formulated as a test case in the code base?

@bjornregnell
Copy link

I looked at scala/scala3#18262 and indeed there are tests, but I could not trace them to the rules in the spec - would that kind of traceability be helpful? @sjrd

@bjornregnell
Copy link

bjornregnell commented Oct 17, 2023

Perhaps the rules in the spec should be numbered or named?

@soronpo
Copy link
Contributor

soronpo commented Nov 17, 2023

The committee voted today and accepted this SIP to move into the implementation stage.
A few comments:

  • It would be great to have a "running proof" disjointness example against a basic Scala match type example. One possible candidate is Match type regression scala3#18448
  • To move it to official from experimental, it's important to at least go through a green community build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants