-
Notifications
You must be signed in to change notification settings - Fork 1
/
unnamed.bs
619 lines (459 loc) · 18.7 KB
/
unnamed.bs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
<pre class='metadata'>
Title: A placeholder with no name
Shortname: D1110
Revision: 0
Audience: EWG
Status: D
Group: WG21
URL: http://wg21.link/P1110R0
!Source: <a href="https://github.com/jyasskin/cxx-unnamed-placeholder/blob/master/unnamed.bs">https://github.com/jyasskin/cxx-unnamed-placeholder/blob/master/unnamed.bs</a>
Editor: Jeffrey Yasskin, Google, jyasskin@google.com
Editor: JF Bastien, Apple, jfbastien@apple.com
Abstract: A C++ token for locations that would normally introduce a name, that says not to introduce the name.
Date: 2018-06-07
Markup Shorthands: markdown yes
Complain About: missing-example-ids yes
</pre>
> ❝I've been through the compiler on a var with no name<br>
> It felt good to be out of `main`<br>
> In the compiler you can remember your name<br>
> 'Cause there ain't no diagnostic for to give you no pain❞
>
> — A Var with no Name
# Introduction and motivation # {#intro}
We have several locations in C++ where a name is expected, but programmers may
not need to use that name ever again. Right now, they have to pick a different
name for each location, and readers can't immediately tell that the name isn't
used without an attribute or a naming convention. It would be nice to have a
clear, enforceable way to state that a name is ignored.
To enforce that the name is actually ignored, and that people reading the code
don't think that an unused variable is actually used, using it in an expression
should be an error instead of having a tag type like `nullptr_t`.
One of the authors of this paper filed [[EWG35]] on this subject years ago and
then forgot about it. Unlike that issue, this paper does not expect the
semantics of the placeholder name to be defined in terms of a generated unique
name.
# Examples # {#examples}
The goal of this section is to provide an example of using the placeholder in
all declaration contexts where it makes sense. These examples use `__` as the
placeholder, but see [[#spelling]] for an analysis of several options.
## Variables ## {#ex-variables}
To lock a mutex for a scope, without worrying that the variable name is already
taken:
```c++
std::lock_guard<std::mutex> __(a_mutex);
std::lock_guard<std::mutex> __(a_second_mutex);
```
To hold some data alive that's used via internal pointers:
```c++
std::unique_ptr<T> ptr = ...;
auto& field1 = ptr->field1;
auto& field2 = ptr->field2
[__=std::move(ptr), &field1=field1, &field2=field2](){...};
```
Note that the lambda capture case syntactically allows `[&__=init, ...](){}`, but
this is probably not useful.
Classes can also have data members named `__`, which have the marginal use of
reserving space for the future addition of new ABI-compatible fields. These
fields cannot be used for the same purpose as in lambda captures because they
can't be initialized.
```c++
class Foo {
Foo(std::unique_ptr<T> ptr)
: __(std::move(ptr)) // Doesn't work.
{}
std::unique_ptr<T> __; // Not effective.
int32_t __[4]; // Reserve 4 words of space for future fields.
};
```
## Structured Bindings ## {#ex-structured-binding}
[[P0144R2]] §3.8 mentions that structured bindings could potentially use a
syntax to ignore some fields from a bound structure:
> ```c++
> tuple<T1,T2,T3> f();
> auto [x, std::ignore, z] = f(); // NOT proposed: ignore second element
> ```
[[P0144R2]] suggests waiting until a pattern matching proposal, at which point
the right token will fall out. I suggest that this proposal also provides a
reasonable token, making this:
```c++
tuple<T1,T2,T3> f();
auto [x, __, z] = f(); // Ignore second element.
```
## Enumerators ## {#ex-enumerator}
This makes it very slightly easier to skip a value:
```c++
enum MmapBits {
Shared,
Private,
__,
__,
Fixed,
Rename,
...
};
```
I suspect that this is less clear than explicitly assigning the values, so it
would make sense to not support this.
## Speculative: Concept-constrained declarations ## {#ex-concept}
To define a deduced-type non-type template parameter where the type matches a
concept but the type isn't given a name in the body of the class or function:
```c++
template<Integral __ N> class integral_constant { ... };
```
To declare a deduced-type parameter or local variable whose type is constrained
by a concept:
```c++
Numeric multiplyAdd(Numeric __ x, Numeric __ y, Numeric __ z) {
Numeric __ multiplied = x * y;
return multiplied + z;
}
```
## Less useful examples ## {#less-useful-ex}
### Class, Struct, Enum ### {#ex-class}
These aren't necessary because developers can simply omit the name, but might
still be useful to simplify the declaration syntax:
```c++
class __ : Base { ... };
struct __ : Base { ... };
enum __ { Enumerator };
```
We couldn't actually require a name there for several more versions of C++, if
ever, but local style guides could require it in order to simplify what they
need to teach new C++ developers.
### Unnamed parameters ### {#ex-unnamed-param}
Like unused class names, parameter names can simply be omitted instead of
needing a placeholder. Again, requiring an explicitly-unused name, instead of
allowing it to be omitted, simplifies the declaration grammar.
```c++
class C : public Base {
void Func(int i, long __) override;
};
template<typename __>
class UsedInTemplateTemplateParam { ... };
```
### Function names ### {#ex-function}
Using `__` to name a function is a similar case to declaring a variable, except
that function definitions don't have the kind of side-effects that variable
definitions have.
```c++
void __(int arg) { ... }
void __(long arg) { ... }
```
Since the use of `__` prevents code from referring to these functions after
they're declared, the above code creates two independent functions rather than
an overload set (although I can't think of a way to distinguish the two).
Virtual functions can also be named `__`. Having an unnamed virtual function
take up a vtable slot could be useful to preserve ABI compatibility when a
future function is added, but that's a matter for platform ABIs rather than the
standard. This kind of virtual function cannot be overridden, of course, since
it can't be named again after its declaration.
### Using ### {#ex-using}
To use a type name, without being able to use the alias. I don't think using a
type name has side-effects, so there's probably not much point.
```c++
using __ = typename Base::value_type;
```
Similarly non-useful:
```c++
typedef typename Base::value_type __;
namespace __ = std::ranges;
```
### Concept introduction ### {#ex-concept-intro}
Like name aliases, a `requires` clause doesn't have side-effects, so doesn't
need to be assigned to an unnamed concept.
```c++
concept __ = requires {...};
```
## Discouraged examples ## {#discouraged-ex}
### Anonymous Namespaces ### {#ex-anon-namespace}
It would be possible to use `__` as the name of an unnamed namespace:
```c++
namespace __ { ... };
```
However, the meaning of an anonymous namespace isn't just the meaning of a
namespace with a name you never use again. Instead, [**namespace.unnamed**]
says,
> An *unnamed-namespace-definition* behaves as if it were replaced by
>
> <pre>inline<sub><i>opt</i></sub> namespace <i>unique</i> { /* empty body */ }
> using namespace <i>unique</i> ;
> namespace <i>unique</i> { <i>namespace-body</i> }</pre>
Because of this divergence in meaning from the other places `__` can appear, I
recommend disallowing it as a namespace name.
### Anonymous Unions ### {#ex-anon-union}
[**class.union.anon**] says that
```c++
struct MyStruct {
union {
int i;
double d;
};
};
```
Makes the union's members' names available as direct members of `MyStruct`.
Because of that extra behavior, I also don't think we should allow
```c++
struct MyStruct {
union __ {
int i;
double d;
} __;
};
```
We could say that naming either the union or the variable allows `__` in the
other position, or disallow `__` in either position.
# Prior Art # {#prior}
## Haskell ## {#prior-haskell}
Haskell uses a `_` token to indicate unused things:
```haskell
head (x:_) = x
tail (_:xs) = xs
```
Note that the _ can be "initialized" multiple times, unlike a named but unused
variable:
```haskell
data Colour = Colour { red::Int, green::Int, blue::Int, opacity::Int}
isOpaqueColour :: Colour -> Bool
isOpaqueColour (Colour _ _ _ opacity) = opacity == 255
```
## Python ## {#prior-python}
`_` is conventionally used to name unused variables. For example:
```python
label, has_label, _ = text.partition(':')
for _ in range(10):
call_ten_times()
```
https://stackoverflow.com/a/5893946/943619 describes this use, but there is no
mention in the official Python documentation.
## Scala ## {#prior-scala}
Scala uses `_` for many kinds of placeholders:
### [Pattern matching](https://docs.scala-lang.org/tour/pattern-matching.html#pattern-guards) ### {#scala-patterns}
```scala
def showImportantNotification(notification: Notification, importantPeopleInfo: Seq[String]): String = {
notification match {
case Email(email, _, _) if importantPeopleInfo.contains(email) =>
"You got an email from special someone!"
case SMS(number, _) if importantPeopleInfo.contains(number) =>
"You got an SMS from special someone!"
case other =>
showNotification(other) // nothing special, delegate to our original showNotification function
}
}
```
Variable declarations can be patterns as well, which has the side-effect that
```scala
var _ = ...
```
defines an un-named variable like the subject of this proposal. This is less
useful in Scala than in C++ because Scala doesn't have destrutors.
### [Defaulted definitions](https://www.scala-lang.org/files/archive/spec/2.12/04-basic-declarations-and-definitions.html#variable-declarations-and-definitions) ### {#scala-defaults}
The following initializes `x` to a default value depending on its type. For
example, integral types get `0`, and reference types get `null`.
```scala
var x: T = _
```
### [Imports](https://www.scala-lang.org/files/archive/spec/2.12/04-basic-declarations-and-definitions.html#import-clauses) ### {#scala-imports}
> The import clause `import p._` … makes available without qualification all
members of `p` (this is analogous to `import p.*` in Java).
### Wildcards ### {#scala-type-wildcards}
Scala uses the `_` for wildcards in both types and functions. It's possible to
write an anonymous function [just by using an _ in an
expression](https://www.scala-lang.org/files/archive/spec/2.12/06-expressions.html#placeholder-syntax-for-anonymous-functions).
The first column in each of the rows in the following table is equivalent to the
second.
<table highlight="scala">
<tr><td>`_ + 1`</td><td>`x => x + 1`</td></tr>
<tr><td>`_ * _`</td><td>`(x1, x2) => x1 * x2`</td></tr>
<tr><td>`(_: Int) * 2`</td><td>`(x: Int) => (x: Int) * 2`</td></tr>
<tr><td>`if (_) x else y`</td><td>`z => if (z) x else y`</td></tr>
<tr><td>`_.map(f)`</td><td>`x => x.map(f)`</td></tr>
<tr><td>`_.map(_ + 1)`</td><td>`x => x.map(y => y + 1)`</td></tr>
</table>
Similarly, [the following two types are equivalent](https://www.scala-lang.org/files/archive/spec/2.12/03-types.html#placeholder-syntax-for-existential-types):
```scala
Ref[_ <: java.lang.Number]
Ref\[T] forSome { type T <: java.lang.Number }
```
## Rust ## {#prior-rust}
Rust provides a couple ways to [ignore values in
patterns](https://doc.rust-lang.org/book/second-edition/ch18-03-pattern-syntax.html#ignoring-values-in-a-pattern),
which include variable and function parameter declarations:
`_`:
```rust
fn foo(_: i32, y: i32) {
println!("This code only uses the y parameter: {}", y);
}
```
Variables prefixed with `_`:
The compiler will usually give an error for an unused variable, but if it's
prefixed with `_`, the error is suppressed:
```rust
let s = Some(String::from("Hello!"));
if let Some(_s) = s {
println!("found a string");
}
```
However, because of Rust's borrow checker, this is semantically different from a
simple `_`: the `_s` takes ownership of the object it's bound to, where a `_`
wouldn't.
`..` to ignore several parts of a value:
```rust
let numbers = (2, 4, 8, 16, 32);
match numbers {
(first, .., last) => {
println!("Some numbers: {}, {}", first, last);
},
}
```
## C# ## {#prior-csharp}
C# defines the `_` variable as a
["discard"](https://docs.microsoft.com/en-us/dotnet/csharp/discards), and allows
their use in assignments in the following contexts:
* Tuple and object deconstruction.
* Pattern matching with is and switch.
* Calls to methods with `out` parameters.
* A standalone `_` when no `_` is in scope.
C# had to deal with `_` being a pre-existing valid variable name, which causes
[certain uses of _ to be errors or
bugs](https://docs.microsoft.com/en-us/dotnet/csharp/discards#a-standalone-discard)
if a `_` variable already exists in the same scope.
## Java ## {#prior-java}
Java uses a `?` token to represent a "wildcard" generic argument.
```java
void printCollection(Collection<?> c) {
for (Object e : c) {
System.out.println(e);
}
}
```
Java generic arguments usually need to be declared, but wildcards imply that a
function is generic:
```java
class Collections {
public static <T, S extends T> void copy(List<T> dest, List<S> src) {
...
}
class Collections {
public static <T> void copy(List<T> dest, List<? extends T> src) {
...
}
```
## Googlemock ## {#prior-googlemock}
Googlemock defines
[_](https://github.com/google/googletest/blob/master/googlemock/docs/CheatSheet.md#wildcard)
to match anything. It doesn't use it for declarations, but it's still an "ignore
this" token.
```c++
// Expects the turtle to move forward by 100 units.
EXPECT_CALL(turtle, Forward(100));
// Expects the turtle to move forward.
EXPECT_CALL(turtle, Forward(_));
```
## Halide ## {#prior-halide}
From Halide's
[documentation](http://halide-lang.org/docs/class_halide_1_1_var.html#a333e72cf9af6339530cb3544b7fe1324):
> For example, consider the definition:
>
> ```c++
> Func f, g;
> Var x, y;
> f(x, y) = 3;
> ```
>
> A call to f with the placeholder symbol _ will have implicit arguments injected automatically, so f(2, _) is equivalent to f(2, _0), where _0 = Var::implicit(0), and f(_) (and indeed f when cast to an Expr) is equivalent to f(_0, _1). The following definitions are all equivalent, differing only in the variable names.
>
> ```c++
> g(_) = f*3;
> g(_) = f(_)*3;
> g(x, _) = f(x, _)*3;
> g(x, y) = f(x, y)*3;
> ```
## Others ## {#prior-others}
I'm told that Swift, Erlang, Elixir, Prolog, OCaml, and F# also use the `_`
for purposes related to ignoring variables, but I ran out of time to investigate
and describe them all.
# Spelling # {#spelling}
How should we spell the token?
## `_` ## {#spell-underscore}
I believe `_` is the ideal spelling, but [**lex.name**] ¶3.2 only reserves it in
the global namespace, and it's now used by libraries:
* [testing::_ in
Googlemock](https://github.com/google/googletest/blob/master/googlemock/docs/CheatSheet.md#wildcard)
* [Halide::_ in Halide](http://halide-lang.org/docs/_var_8h.html)
* [The _() macro in Gnu
Gettext](https://www.gnu.org/software/gettext/manual/html_node/Mark-Keywords.html)
To avoid making these existing uses of `_` ill-formed, see [[#qs-second-use]].
## `__` ## {#spell-double-under}
This is already a reserved identifier per [**lex.name**] ¶3.1.
> Each identifier that contains a double underscore `__` or begins with an
> underscore followed by an uppercase letter is reserved to the implementation
> for any use.
It takes a little longer to type than `_`, and in many fonts the difference
between `__` and `_` may be unclear.
This paper uses `__` in the [examples](#examples).
The `__` identifier is used in some codebases, such as [the V8 JavaScript
engine](https://github.com/v8/v8/search?q=%22define+__%22&unscoped_q=%22define+__%22_).
Using a macro named `__` as V8 does means that breakage will only occur if that
codebase also attempts to use this new feature.
## `?` ## {#spell-question}
`?` might be confusable with the ternary `?:` operator, but developers can
distinguish based on it appearing in a declaration rather than expression
context.
`?` may also be confused with a wildcard like `*`, to mean "union all the
possible values in this position", for example in a `nested::name::specifier`.
Fortunately, I don't see a reason for `?` to be valid in contexts where that
interpretation is plausible.
Using `?` for the placeholder identifier might prevent its use in the syntax for
a forwarding reference. e.g. `ForwardingRef&&?`.
## Repeated `?` ## {#spell-rep-question}
Multiple `?` could be used to disambiguate some of the issues with a single `?`.
Using `??` or `???` is more typing but is fairly obvious, though it would likely
be better to make this a token so whitespace cannot be used between question marks.
## `auto` ## {#spell-auto}
The "attribute" proposal for concept declarations uses `auto` in a place a type
might appear, which one might consider precedent for using it as a more general
placeholder.
## Omit the name ## {#spell-omit}
We could just allow omitted names in more places:
```c++
auto = std::lock_guard(mx);
```
This option «mov[es] closer to "all syntax is valid and means something, but
maybe not what you meant".» — [Tony Van
Eerd](http://lists.isocpp.org/ext/2018/06/4518.php)
## Various Unicode characters ## {#💩}
We could use various Unicode characters, such as the replacement character
`�`, the empty set `∅`, the interrobang `‽`, a trash can `🗑`, one of the
recycle signs `♻️`, and countless other characters.
These all have the shortcoming that not all toolchains and editors support
them well or uniformly.
# Semantic questions # {#semantic-qs}
## Only on second use ## {#qs-second-use}
We could declare that our placeholder makes variables anonymous only once it's
used for the second time in a scope. That is:
```c++
int _ = 10; // fine
_ = 11; // fine, can use it
int _ = 12; // another _, fine
// at this point, both _ variables exist, but can no longer be accessed:
_ = 13; // error - which one?
```
Within this option, we could decide whether or not a second declaration is
ill-formed if `_` has already been used within the scope.
Allowing this option at namespace scope, as [Googlemock](#prior-googlemock)
would need, seems to make ODR violations more likely. A file with only 1 unused
variable (perhaps used to register a static initializer) would be likely to
forget that variable's `static` and so generate an external symbol, which would
collide with a similar symbol from another file.
# Wording # {#wording}
TBD
# Acknowledgements # {#ack}
Thanks to:
* Daveed Vandevoorde for suggesting using `?`.
* Thomas Köppe for suggesting `auto`.
* Mathias Stearn for suggesting omitting the name.
* Tony Van Eerd for providing an argument against omitting the name, and for
suggesting that `_` only become magic when it's re-used.
* Richard Smith for pointing out Halide.
* Michael Park for a long list of languages that use `_`.