-
Notifications
You must be signed in to change notification settings - Fork 39
Tuple projections #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,251 @@ | ||
# Tuple Projections | ||
|
||
## Overview | ||
|
||
This RFC proposes a quality-of-life improvement to OCaml's tuples, adding | ||
support for labeled and unlabeled tuple projections. | ||
|
||
## Proposed change | ||
|
||
The idea is to allow users to directly project elements from a tuple using | ||
labels or indicies, as opposed to patterns: | ||
|
||
```ocaml | ||
# let x = (~tuple:42, ~proj:1337, "is", 'c', 00, 1);; | ||
val x : (tuple: int * proj:int * string * char * int * int) = | ||
(~tuple:42, ~proj:1337, "is", 'c', 0, 1) | ||
# x.tuple;; | ||
- : int = 42 | ||
# x.3;; | ||
- : char = 'c' | ||
``` | ||
|
||
Here, we're able to project out of a 6-tuple (containing both labeled and | ||
unlabeled components) simply by writing `x.j` (for an index `j`) or `x.l` (for | ||
a label `l`). Tuple indices are 0-indexed, to match the existing indexing | ||
convention for arrays, lists, etc. | ||
|
||
This is useful for a couple reasons: | ||
- Conciseness: avoids unnecessary boilerplate pattern matches / projection | ||
functions e.g. `Tuple2.fst, Tuple3.fst, ...`. Additionally, with the advent | ||
of labeled tuples, such functions are less useful[^1]. | ||
- Clarity: `x.1` is more readable than `let _, x1, _ = x in ...` | ||
- Parity with records: complements record field projection and aligns tuple | ||
uses with other ML-like languages. | ||
|
||
Occurrences of explicit pattern matching for tuple projection are | ||
reasonably frequent where tuple projections could otherwise be used | ||
(found by doing a quick Sherlocode[^2]): | ||
- ~6k uses of projections on pairs, | ||
- ~1.15k for triples, | ||
- ~210 for quadruples. | ||
|
||
|
||
## Previous work | ||
|
||
Many other strongly-typed languages support built-in tuple projections. | ||
|
||
### SML | ||
|
||
SML models tuples as records with integer field names (1-indexed), so projection uses | ||
record selection syntax: | ||
```sml | ||
> val x = (1, "hi", 42);; | ||
val x = (1, "hi", 42): int * string * int | ||
> val y = #1 x;; | ||
val y = 1: int | ||
> val z = #3 x | ||
val z = 42: int | ||
``` | ||
|
||
### Rust | ||
|
||
Rust supports tuple projections (0-indexed) for ordinary tuples (and tuple structs): | ||
```rust | ||
let x = (42, "is", 'c'); | ||
let y = x.0; // 42 | ||
let z = x.1; // "is" | ||
|
||
struct Point(i32, i32); | ||
let p = Point(3, 4); | ||
let x_coord = p.0; // 3 | ||
``` | ||
|
||
Record structs also use the same syntax for projection: | ||
```rust | ||
struct Point { x : i32, y : i32 }; | ||
let p = Point { x = 3, y = 4 }; | ||
let x_coord = p.x; // 3 | ||
``` | ||
|
||
|
||
### Swift | ||
|
||
Swift supports tuple projections via both positional indicies and labels (as in this proposal): | ||
```swift | ||
import Foundation | ||
|
||
let x = (tuple: 42, proj: 1337, "is", 'c', 00, 1); | ||
|
||
print(x.tuple) // 42 | ||
print(x.5) // 1 | ||
``` | ||
|
||
## Implementation | ||
|
||
An experimental implementation is available at [PR 14257](https://github.com/ocaml/ocaml/pull/14257). | ||
|
||
### Parsetree changes | ||
|
||
Given the syntax for labeled tuple projection is overloaded with record field | ||
projection, i.e. there is no syntactic distinction between the projections in: | ||
```ocaml | ||
let x = { foo = 1; bar = 2 } in x.foo;; | ||
``` | ||
and | ||
```ocaml | ||
let x = ~foo:1, ~bar:2 in x.foo;; | ||
``` | ||
|
||
|
||
The proposed parsetree additions will represent all projections using `Pexp_field`: | ||
```ocaml | ||
type field = | ||
| Pfield_record_or_tuple_label of Longident.t loc | ||
| Pfield_tuple_index of int loc | ||
|
||
and expression_desc = | ||
... | ||
| Pexp_field of expression * field | ||
... | ||
``` | ||
|
||
|
||
### Typechecking | ||
|
||
While typechecking, when encountering a field projection in expressions, | ||
|
||
- If the field is a tuple index `j`, type as a unlabeled tuple projection. | ||
|
||
Check to see whether the expected type is known (principally known, if in `-principal` mode). | ||
Then: | ||
* If the type is not known: raise an error stating that the projection is ambiguous. | ||
* If the type is known to be `(?l0:ty0 * ... * tyj * ... * ?ln:tyn)`: type the projection as `tyj` | ||
|
||
- If the field is a record or tuple label `l`. | ||
|
||
Check to see whether the expected type is known: | ||
- If the type is not known: typecheck the projection `e.l` as a record projection | ||
- If the type is known to be `(ty0, ..., tyn) t`: ditto | ||
- If the type is known to be `(?l0:ty0 * ... * l:tyl * ... * ?ln:tyn)`: type the projection | ||
as `tyl`. | ||
|
||
## Considerations | ||
|
||
### Limitations of type-based disambiguation | ||
|
||
OCaml's current type-based disambiguation mechanism is relatively weak. As a result, | ||
many of the patterns that tuple projections are intended to replace would be ill-typed under | ||
today's implementation. For instance: | ||
```ocaml | ||
# List.map (fun x -> x.1) [42, "Hello"; 1337, "World"];; | ||
Error: The type of the tuple expression is ambiguous. | ||
Could not determine the type of the tuple projection. | ||
``` | ||
|
||
That said, this limitation does not arise from the feature itself, but from the | ||
weaknesses in OCaml's type propagation. Improving type propagation (separately) | ||
would benefit not only tuple projections, but other features that rely on | ||
type-based disambiguation (e.g. constructors and record fields). As such, we | ||
argue that tuple projections should not be rejected on this point alone, and | ||
that the broader issues of type propagation and disambiguation be addressed | ||
separately. | ||
|
||
### Syntactic overloading | ||
|
||
This proposal reuses the existing projection syntax `e.l` for both record | ||
fields and labeled tuples. The primary motivator behind this is to avoid | ||
introducing new operators and keeps projection syntax uniform. | ||
|
||
The downside is that it increases reliance on type-based disambiguation. | ||
|
||
### Diagnostic quality of error messages | ||
|
||
Type errors surrounding unknown fields will need to be refined. | ||
In particular, when the compiler defaults a labeled projection to a record | ||
field (even though it might also have been a labeled tuple projection), | ||
the diagnostic report ought to make this clear. | ||
|
||
Otherwise, programs like the following may yield cryptic messages: | ||
```ocaml | ||
# let is_ill_typed_due_to_defaults x = | ||
let y = x.tuple_label_a in | ||
ignore (x : (tuple_label_a:int * string * bool)); | ||
(y, x.2) | ||
Error: Unbound record field `tuple_label_a` | ||
``` | ||
|
||
A clearer diagnostic could be: | ||
``` | ||
Error: The field `tuple_label_a` is unknown. | ||
The projection `x.tuple_label_a` was interpreted as a record field, | ||
but no such record field exists. | ||
|
||
Hint: Did you mean to project from a labeled tuple instead? | ||
If so, add an annotation to disambiguate the projection. | ||
``` | ||
|
||
Other problematic examples include conflicts with existing records: | ||
```ocaml | ||
# type discombobulating_record = { tuple_label_a : int };; | ||
type discombobulating_record = { tuple_label_a : int } | ||
# let is_ill_typed_due_to_defaults x = | ||
let y = x.tuple_label_a in | ||
ignore (x : (tuple_label_a:int * string * bool)); | ||
(y, x.2) | ||
Error: The value `x` has type `discombobulating_record` but an expression was | ||
expected of type `tuple_label_a:int * string * bool` | ||
``` | ||
Here the error conflates record and tuple typing, which is misleading. | ||
A more informative report could combine a warning with the final error: | ||
```ocaml | ||
Warning: The projection `x.tuple_label_a` could refer either to a record field | ||
or a labeled tuple component. It was resolved as a record field of | ||
`discombobulating_record`. | ||
Please disambiguate if this is wrong. | ||
Error: The value `x` has type `discombobulating_record` but an expression | ||
was expected of type | ||
`tuple_label_a:int * string * bool` | ||
``` | ||
|
||
|
||
### Row-polymorphic tuples | ||
|
||
Unlabeled tuple projections can naturally (and efficiently) be typed using row polymorphism, in | ||
the same way object fields are typed today: | ||
```ocaml | ||
# let snd x = x.2;; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here we might want to have the same semantic as with let get_y t =
let ~y, .. = t in
y which is rejected if I remember correctly. We would also need to check that the back-end supports such polymorphism (I remember talking about a usefull feature similar to this one for modules with @lthls) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, this is rejected currently since the expected type of the pattern (i.e. the type of |
||
val snd : 'a * 'b * .. -> 'b | ||
``` | ||
|
||
This generalises to tuples of arbitrary arity (and is strictly more powerful than this proposal). | ||
However, extending the same mechanism to labeled tuples is significantly more difficult (without incurring | ||
runtime overhead or using monomorphisation). | ||
|
||
From a language design perspective, though, we would ideally want the typing of projections | ||
to behave uniformly across both labeled and unlabeled tuples. Moreover, the typing behaviour | ||
discussed in this RFC is compatible with this use of row polymorphism. | ||
|
||
|
||
### Non-principal defaults | ||
|
||
For type-based disambiguation, OCaml usually implements a 'default' behaviour when | ||
the type is unknown. For instance, in record field / variant overloading, if the type is | ||
not known to be a nominal type `(ty0, ..., tyn) t`, the lexically-closed matching record field / variant | ||
is used. | ||
|
||
We could have a similar default rule for tuple projections `e.j`, if the type is unknown: type `e` as `min 2 (j + 1)`-ary tuple. | ||
|
||
[^1]: Without relying on coercions between labeled and unlabeled tuples, as in [PR 14180](https://github.com/ocaml/ocaml/pull/14180). And even then, it would be shorter to write a pattern than the types for `:>`. | ||
|
||
[^2]: The following family of patterns were used to derive estimates: [`fun (\w\+, _)`](https://sherlocode.com/?q=fun%20(%5Cw%5C%2B%2C%20_)), [`fun (\w\+, _, _)`](https://sherlocode.com/?q=fun%20(%5Cw%5C%2B%2C%20_%2C%20_)), [`fun (\w\+, _, _, _)`](https://sherlocode.com/?q=fun%20(%5Cw%5C%2B%2C%20_%2C%20_%2C%20_)), [`fun (_, \w\+)`](https://sherlocode.com/?q=fun%20(_%2C%20%5Cw%5C%2B)), [`fun (_, \w\+, _)`](https://sherlocode.com/?q=fun%20(_%2C%20%5Cw%5C%2B%2C%20_)), [`fun (_, \w\+, _, _)`](https://sherlocode.com/?q=fun%20(_%2C%20%5Cw%5C%2B%2C%20_%2C%20_)), etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens in the case :
Is the example rejected because only
x.a
andx.b
are accepted and notx.0
/x.1
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be rejected, as you correctly surmise, the only well-typed projections for
x
would bex.a
,x.b
.This is consistent with the current pattern matching semantics i.e. the following is ill-typed: