save me from ASCII
This page is still evolving.
This page was moved from this gist. See the original gist for comments and previous edit history.
Perhaps rakudo wiki is not the best place for this page to live. It will be moved elsewhere once a better home is found.
You can edit this page.
Raku supports various Unicode characters (½ ¹ ∞ × ÷; see this link for a full list), but there are other things we could add. Here is a list of things to think about.
The idea behind this page is to store all ideas. In other words, this page is not a TODO list, but a blackboard for brainstorming.
To keep us all on the same page, here are some things to note. Good reasons to add a Unicode operator are:
- When an established math/compsci notation exists for a Raku feature (and is represented in Unicode).
  Example: × for multiplication, ∞ for Inf, etc.
- When a Raku operator or syntax is basically ASCII art trying to paint a larger “glyph”, and there's a Unicode character for that exact glyph (not just resembling it, but specifically intended for it).
  Example: → for -> (-> is trying to paint a rightwards arrow, which is exactly what the Unicode char U+2192 RIGHTWARDS ARROW is for)
- When it is too hard to implement it in a module (e.g. it is so deep in the parser that trying to reimplement it in a module is completely unreasonable).
  Example: You can define your own ≤ ≥ ≠ ops, but you cannot get them to chain correctly (at least you couldn't at the time)
- When a particular sequence is often auto-corrected by software.
  Example: LibreOffice changes '' to ‘’, ... to …, -> to →, etc.
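For reference, here is a small sketch of Unicode spellings that Rakudo already accepts for built-in operators and terms, each compared against its ASCII counterpart. The variable name `@checks` is only for illustration.

```raku
# Each entry compares a Unicode spelling with the ASCII one.
my @checks =
    6 × 7  == 6 * 7,      # multiplication
    84 ÷ 2 == 84 / 2,     # division
    10 − 3 == 10 - 3,     # U+2212 MINUS SIGN
    (1 ≤ 2 ≤ 3),          # built-in comparison ops chain correctly
    ∞ == Inf,             # infinity term
    4² == 4 ** 2,         # superscript power
    ½ + ¼ == 0.75;        # vulgar fraction literals

.say for @checks;
```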
This section has things that we might want to add (… maybe!).
β in https://github.com/rakudo/rakudo/pull/1032
These are more or less obvious and a lot of people have wondered why Raku does not support them yet.
See also: https://irclogs.raku.org/perl6/2016-01-09.html#11:44
Why not allow → in pointy blocks?

    for ^5 -> $i { say $i } # noo!
    for ^5 → $i { say $i }  # yeaah!
Same goes for lambdas with rw signatures:

    for @values <-> $even, $odd { $even ÷= 2 } # noo!
    for @values ↔ $even, $odd { $even ÷= 2 }   # yeaah!
We can also do the same thing with fat arrow:

    my %h = 42 => 62 # noo!
    my %h = 42 ⇒ 62  # yeaah!
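
As a rough illustration of how far a user-space definition gets, the sketch below defines a hypothetical `infix:<⇒>` that simply builds a Pair. It is not part of Rakudo, and it is only an approximation of the real fat arrow: it does not auto-quote a bareword key, and it does not create named arguments in a call.

```raku
# Hypothetical user-space operator; NOT a Rakudo built-in.
# It gets the default precedence of a user-defined infix operator.
sub infix:<⇒>($key, $value) { $key => $value }

my %h = 42 ⇒ 62;
say %h;

# Compare with the ASCII fat arrow.
say (42 ⇒ 62) eqv (42 => 62);
```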
¿? , ¡! and β β as Spanish-inspired quoting delimiters (to act the same as "")

    say ¿foo $*IN bar?            # foo <STDIN> bar
    say ¡foo %*ENV<USER> bar!     # foo liz bar
    say βfoo { Date.today } barβ  # foo 2019-10-06 bar
Available in https://github.com/rakudo/rakudo/pull/3218 .
This section is for things that we probably don't want to add, or at least not in the near future. Still, we will keep all our ideas written down.
So that this:

    #| This subroutine does the real work
    sub do_raw_magic (
        Spell $s,    #= Which spell to invoke
        *%options    #= How to invoke it
    ) {...}
Could be written as:
#β This subroutine does the real work sub do_raw_magic ( Spell $s, #β Which spell to invoke *%options #β How to invoke it ) {...}
While it makes sense in some cases, #β and #β can be misleading in others. See speculations for more examples. Basically, #| is used for the βnext thingβ which can be #β or #β, and same for #= where it can mean #β or #β.
- ¬ for logical not
- ∧ for and
- ∨ for or
- ⊻ for xor

√, ∛ and ∜. Probably as prefix operators. Some argue that we should only add √, given that square root is much more common. But if we are adding √, then we should add ∛ and ∜ for consistency. The argument against adding any of the roots is variant precedence: sqrt 4+5 is 3, but √4+5 would need to be 7 if you're following the standard mathematical precedence.
⁇ ‼ as a non-ASCII version of ?? !!.
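
A minimal user-space sketch of these four operators follows. The definitions are hypothetical and not part of Rakudo. Because they are ordinary subs, both operands are evaluated before the call, so unlike `&&` and `||` they do not short-circuit.

```raku
# Hypothetical definitions; NOT Rakudo built-ins.
# Ordinary subs receive already-evaluated arguments, so there is no short-circuiting.
sub prefix:<¬>($a)    { !$a }
sub infix:<∧>($a, $b) { $a && $b }
sub infix:<∨>($a, $b) { $a || $b }
sub infix:<⊻>($a, $b) { $a ^^ $b }

say ¬True;
say True ∧ False;
say True ∨ False;
say so True ⊻ False;
```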
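
The precedence difference can be seen with a hypothetical user-defined `prefix:<√>` (not a Rakudo built-in). This sketch assumes Rakudo's default precedence for user-defined prefix operators, which binds tighter than infix `+`.

```raku
# Hypothetical operator; NOT a Rakudo built-in.
sub prefix:<√>(Numeric $x) { sqrt $x }

say sqrt 4+5;   # named unary function: the argument is the whole expression 4+5
say √4+5;       # symbolic prefix binds tighter: parsed as (√4)+5
```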
Done in https://github.com/rakudo/rakudo/pull/1029, then reverted.
See this ticket for more information: https://rt.perl.org/Ticket/Display.html?id=131002
TL;DR: it fails to satisfy the criteria listed at the top of this page. That is, there is no reason to add it (and we are not adding Unicode ops just for fun).
∕ as an alternative to / and ÷. Why? Because we already support −.
U+2215 DIVISION SLASH [Sm] (∕)
(And for those wondering: U+2212 MINUS SIGN [Sm] (−))
Note that U+2044 FRACTION SLASH [Sm] (⁄) is also listed on this page below.
@Zoffix: supporting `−` has been a nightmare, with conditionals littered all over the codebase. And even after all that work, it's still not fully supported (can't use it in `sprintf` formats for example, as those are handled in NQP and I'm unsure we want to leak all these fancy ops to NQP). So adding an op just-because is a bad idea and we shouldn't add any more slashes.
Raku already supports Unicode fractions, like ½ and β . A logical extension would be to also support literals like 1½.
Triangular reduce can have its own unicode character too.

    .say for [\+] ^10
    .say for [◺+] ^10
Other possible candidates: ◿, ◥, etc. (which one represents it best?)
U+2301 ELECTRIC ARROW ⌁ for the ~~ operator.
By the way, we can't use ≈ for anything, because it would be unclear whether it means approximation or smartmatch. In that sense ≅ (already supported by Raku) and ⌁ will play well together.
≔ as a non-ASCII version of :=. Pretty obvious.
However, if we are going to add that, then we cannot just leave out ::=, which also has a corresponding unicode character:
⩴ for the ::= operator
The problem is that neither is rendered very nicely by current fonts. ⩴ is also very wide.
β for the || operator
Good, but there is no corresponding non-ASCII version of &&. So I guess that there is no reason to add that right now.
∣ for %% (U+2223 DIVIDES [Sm] (∣))
∤ for !%% (U+2224 DOES NOT DIVIDE [Sm] (∤))
- …
While ⩵ can possibly fit into the width of one character, ⩶ definitely won't. Normally, characters of this kind are full-width (they take double the width of a narrow character). Does it prevent us from adding it? Probably not, but it is something to think about.
It's called DECIMAL EXPONENT SYMBOL after all; that's its *job*. We might as well permit it as a synonym for e in floating-point scientific notation, so that 6.02⏨23 is the same as 6.02e23. It's at least as prominent a non-digit as e is (and rather more so than E), and it reads naturally: the subscript 10 places the exponent as its, well, exponent. It's a one-character change to the parser, of course.
Note that this is not the same as discussions of Subscripts in general, below.
Rakudo PR: https://github.com/rakudo/rakudo/pull/1348
We already have superscripts 4Β² # Woohoo!, but what about subscripts (*β)? The choice of unicode characters is pretty much obvious, but what should be the meaning?
There are *four* options:
- Subscripts could be allowed in variable names at the end, so that you can write my $xβ = 5. This has been implemented as [Slang::Subscripts](https://modules.perl6.org/dist/Slang::Subscripts)
- Subscripts could be allowed in variable names at the same places where we allow digits, so that you can write my $H₂SO₄ = 5. Implemented in https://github.com/rakudo/rakudo/pull/3219
- Subscripts could act like array subscripts so that you can write @x₁ which will be equivalent to @x[1]
- Numbers in other bases: HU08ββ
@AlexDaniel insists on the third option, but most people strongly want the first or second one.
If we go for the third option, then there are some other interesting possibilities:
- We can use low asterisk to act like a subscript whatever star: @xβββ
- Or we can use β as a last index of the array. Like: @xβ. Perhaps this would make mathematicians happy.
Or we can support both β and β (because why not).
Another option is to use Unicode subscripts as array subscripts if the @ sigil is used, but allow subscripts in variable names that have the $ sigil. This will probably make all of us happy (but it is going to be so weird… what a horrible idea).
The most problematic case was with code like (* * *)(4, 2). It got better when school-grade math ops were implemented: (* × *)(4, 2), but still, it would be great to have a Unicode equivalent to whatever star.
The problem is, there is no obvious character for that.
There are several classes of proposed characters:
- Star-like symbols: β β Ω βͺ βΆ β β° π and so on and so forth, unicode has so many of those it's not even funnyβ¦
- Asterisks: β β§ β β οΌ β
- Chars that look like an empty field to fill: π β― β πΎ β
- Other: β° β£
The problem with stars is that they perfectly represent the “star” part, but not so much the “whatever” part. Circles are just circles; they don't carry enough meaning, in my opinion. ⍰ is an APL char, which we'd much rather leave alone.
There is one more thing: besides Whatever (*) there is also HyperWhatever(**) which perhaps should also get a unicode symbol. This means that not only we have to find one good single character, it would be better if we had a pair of similar characters (e.g. something like β and β but better).
- ⊞ → [+]
- ⊟ → [-]
- ⊠ → [×] (or something else?)
- ⧆ → [*]
- ⧄ → [/]

But of course there are many other operators that people use all the time. To solve this we can use U+20DE COMBINING ENCLOSING SQUARE [Me] (◌⃞):

    ÷⃞ → [÷]

However, this has a limit of one character per operator, which means that in some cases you will be forced to fall back to ASCII […].
A better idea for [+] and [×] is to add ∑ and ∏.

    ∑ 1..10 == [+] 1..10 == 55
    ∏ 1..10 == [×] 1..10 == 3628800
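
A user-space approximation is sketched below with hypothetical `prefix:<∑>` and `prefix:<∏>` subs (not Rakudo built-ins). Note the parentheses: a user-defined prefix operator binds tighter than `..`, so without them `∑ 1..10` would be parsed as `(∑ 1)..10`. Getting the list-operator behaviour of `[+]` is part of what a core implementation would have to provide.

```raku
# Hypothetical operators; NOT Rakudo built-ins.
sub prefix:<∑>(@values) { [+] @values }
sub prefix:<∏>(@values) { [×] @values }

# Parentheses are required here because prefix ops bind tighter than the range operator.
say ∑(1..10);
say ∏(1..10);
```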
β can be used as a spaceship operator (<=>). But there are other candidates as well:
- U+22DA LESS-THAN EQUAL TO OR GREATER-THAN ⋚
- U+22DB GREATER-THAN EQUAL TO OR LESS-THAN ⋛
- U+1F680 ROCKET 🚀
The last one is a joke, of course.
β’ β―β
say 2β₯5
U+2044 FRACTION SLASH [Sm] (⁄)
We may want to support ⁄ in addition to ÷, / and ∕… but this one is a little bit special because it is meant for creating fractions (like β , but with any other numbers). What is supposed to happen if you have a variable in there?
What about using β for return?
This section is for ideas that have no ASCII equivalents. That is, addition of these things will also require addition of ASCII versions.
± can be used to create ranges. Example:

    sub infix:<±> { Range.new: $^a - $^b, $a + $b };
    say 5 ± 2;      # OUTPUT: «3..7␤»
    say 4 ~~ 5 ± 2; # OUTPUT: «True␤»
    say 0 ~~ 5 ± 2; # OUTPUT: «False␤»

Alternatively, ± could create junctions. In this case, it'd be both a prefix and an infix operator.

    ±1 == 1 | -1
    5 ± 2 == 3 | 7
    $x = (-$b ± √($b² - 4×$a×$c)) / (2×$a);
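
The junction variant can be sketched in user space as follows. Both subs are hypothetical, not Rakudo built-ins, and the quadratic uses the built-in `sqrt` rather than the proposed √. The coefficients are made-up sample values for x² − 3x + 2 = 0, whose roots are 1 and 2.

```raku
# Hypothetical operators; NOT Rakudo built-ins.
sub prefix:<±>(Numeric $n)            { $n | -$n }
sub infix:<±>(Numeric $l, Numeric $r) { ($l - $r) | ($l + $r) }

# Sample coefficients for x² − 3x + 2 = 0 (an assumption for this example).
my ($a, $b, $c) = 1, -3, 2;
my $x = (-$b ± sqrt($b² - 4×$a×$c)) / (2×$a);

say $x;            # a junction holding both roots
say so $x == 1;
say so $x == 2;
```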
- ⌊…⌋ for floor(…)
- ⌈…⌉ for ceil(…)

U+230A LEFT FLOOR [Ps] (⌊)
U+230B RIGHT FLOOR [Pe] (⌋)
U+2308 LEFT CEILING [Ps] (⌈)
U+2309 RIGHT CEILING [Pe] (⌉)
What would the ASCII variants for this be? |_…_| and |^…^|? These are probably better off without ASCII equivalents…
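
A possible user-space version uses circumfix operators, as sketched here. The two subs are hypothetical and not part of Rakudo; they delegate to the built-in `floor` and `ceiling` routines.

```raku
# Hypothetical circumfix operators; NOT Rakudo built-ins.
sub circumfix:<⌊ ⌋>(Numeric $x) { floor $x }
sub circumfix:<⌈ ⌉>(Numeric $x) { ceiling $x }

say ⌊3.7⌋;
say ⌈3.2⌉;
say ⌊-0.5⌋;
```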
There is an idea that 🙼 (VERY HEAVY SOLIDUS) can produce a FatRat, as in:

    sub infix:<🙼> { FatRat.new: $^a, $^b }

We can use ⟅ ⟆ for creating bags.
U+27C5 LEFT S-SHAPED BAG DELIMITER [Ps] (⟅)
U+27C6 RIGHT S-SHAPED BAG DELIMITER [Pe] (⟆)