# perl6/doc

=begin pod :tag

=TITLE Entering unicode characters

=SUBTITLE Input methods for unicode characters in editors and the shell

Perl 6 allows the use of unicode characters as variable names. Many operators are defined with unicode symbols (in particular the L) as well as some quoting constructs. Hence it is good to know how to enter these symbols into editors, the Perl 6 shell and the command line, especially if the symbols aren't available as actual characters on a keyboard.

General information about entering unicode under various operating systems and environments can be found on the Wikipedia L.

=head1 XCompose (Linux)
X<|XCompose>

Xorg includes digraph support using a L<Compose key|https://en.wikipedia.org/wiki/Compose_key#GNU.2FLinux>. The default of C can be remapped to something easier such as C. In I and I this can be set up under C. So, for example, to input C<»+«> you could type C<< CAPSLOCK > > + CAPSLOCK < < >>.

I allows customising the digraph sequences using a C<.XCompose> file and L is an extremely complete one.

In I, I was overridden and replaced with a hardcoded list, but it is possible to restore I by setting C in your environment. It might be necessary to install an xim bridge as well, such as C.

=head2 Getting compose working in all programs

You may have issues using the compose key in all programs. In that case you can try C.

=for code :lang
input_module=xim
export GTK_IM_MODULE=$input_module
export XMODIFIERS=@im=$input_module
export QT_IM_MODULE=$input_module

If you want this to apply to all users, you can put this in a file C. This is the easiest way, since you won't have to deal with how different GUI environments set up their environment variables.

If you use KDE you can put this file in C<~/.config/plasma-workspace/env/compose.sh> and that should work. Other desktop environments will be different. Look up how to set environment variables in yours or use the system-wide option above.
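After logging in again, it can be useful to confirm that the three variables above actually reached your session. The following is a minimal sketch and not part of the original instructions; the function name C<check_im_env> is hypothetical, and it only assumes a POSIX shell with C<printenv> available.

```shell
#!/bin/sh
# Hypothetical helper (not from the Perl 6 docs): report whether the
# input-method variables discussed above are set in this session.
check_im_env() {
    status=0
    for var in GTK_IM_MODULE XMODIFIERS QT_IM_MODULE; do
        # printenv exits non-zero for an unset variable, so guard it
        value=$(printenv "$var" || true)
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$var" "$value"
        else
            printf '%s is not set\n' "$var"
            status=1
        fi
    done
    return "$status"
}

# Exit status is 0 only when all three variables are set and non-empty.
check_im_env || echo "Some input-method variables are missing; check your environment file."
```

If any variable is reported as missing, the environment file is probably not being read by your desktop environment, and the system-wide option may be the simpler route.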
=head2 ibus
X<|ibus>

If you have problems entering high codepoint symbols such as B<🐧> using the C input module, you can instead use ibus. You will have to install the ibus package for your distribution. Then you will have to set it to start when your desktop environment loads. The command that needs to be run is:

=for code :lang
ibus-daemon --xim --verbose --daemonize --replace

Setting C<--xim> should also allow programs not using ibus to still use the xim input method and be backward compatible.

=head3 KDE

If you are using KDE, open the start menu, type in “Autostart” and click B, which should be the first result. In the settings window that opens, click B, type in C and click OK. Then go into the Application tab of the window that pops up. In the C field, enter the full ibus-daemon command as shown above, with the C<--desktop> option set to C<--desktop=plasma>. Click OK. It should now launch automatically when you log in again.

=head1 WinCompose (Windows)
X<|WinCompose>

L adds L functionality to Windows. It can be installed either via the L page on GitHub, or with the L.

Once the program is installed and running, right click the tray icon and select C to set your desired key.

WinCompose has multiple sources to choose from in C. It is recommended to enable C and disable C, as there are a handful of operators which C does not provide sequences for, and C also has sequences which conflict with operator sequences present in C.

Sequences can be viewed by right clicking the tray icon and selecting C. If you wish to add your own sequences, you can do so by either adding/modifying C<.XCompose> in C<%USERPROFILE%>, or editing user-defined sequences in the options menu.

=head1 Editors and shells

=head2 Vim
X<|Vim>

In L, unicode characters are entered (in insert-mode) by pressing first C (also denoted C<^V>), then C and then the hexadecimal value of the unicode character to be entered.
For example, the Greek letter λ (lambda) is entered via the key combination:

=begin code :lang
^Vu03BB
=end code

You can also use C/C<^K> along with a digraph to type in some characters. So an alternative to the above using digraphs looks like this:

=begin code :lang
^Kl*
=end code

The list of digraphs Vim provides is documented L; you can add your own with the C<:digraph> command.

Further information about entering special characters in Vim can be found on the Vim Wikia page about L.

=head3 vim-perl6

The L plugin for Vim can be configured to optionally replace ASCII based ops with their Unicode based equivalents. This will convert the ASCII based ops on the fly while typing them.

=head2 Emacs
X<|Emacs>

In L, unicode characters are entered by first entering the chord C at which point the text C appears in the minibuffer. One then enters the unicode code point hexadecimal number followed by the enter key. The unicode character will now appear in the document. Thus, to enter the Greek letter λ (lambda), one uses the following key combination:

=begin code :lang
C-x 8 RET 3bb RET
=end code

Further information about unicode and its entry into Emacs can be found on the L.

You can also use L character mnemonics by typing:

=begin code :lang
C-x RET C-\ rfc1345 RET
=end code

Or C.

To type special characters, type C<&> followed by a mnemonic. Emacs will show the possible characters in the echo area. For example, Greek letter λ (lambda) can be entered by typing:

=begin code :lang
&l*
=end code

You can use C to toggle L.

Another L you can use to insert special characters is L. Select it by typing C. You can enter a special character by using a prefix such as C<\>.
For example, to enter λ, type:

=begin code :lang
\lambda
=end code

To view characters and sequences provided by an input method, run the C command:

=begin code :lang
C-h I TeX
=end code

=head2 Unix shell

At the bash shell, one enters unicode characters by entering C, then the unicode code point value followed by enter. For instance, to enter the character for the element-of operator (∈) use the following key combination (whitespace has been added for clarity):

=begin code :lang
Ctrl-Shift-u 2208 Enter
=end code

This is also the method one would use to enter unicode characters into the C REPL, if one has started the REPL inside a Unix shell.

=head2 Screen

L does sport a B command but with a rather limited digraph table. Thanks to B and B an external program can be used to insert characters into the current screen window.

=begin code :lang
bindkey ^K exec .! digraphs
=end code

This will bind control-k to the shell command digraphs. You can use L if you prefer a Perl 6 friendly digraph table over L or change it to your needs.

=head1 Some characters useful in Perl 6

=head2 L

These characters are used in different languages as quotation marks. In Perl 6 they are used as L.

Constructs such as these are now possible:

    say ｢What?!｣;
    say ”Whoa!“;
    say „This works too!”;
    say „There are just too many ways“;
    say “here: “no problem” at all!”; # You can nest them!

This is very useful in shell:

=begin code :lang
perl6 -e 'say ‘hello world’'
=end code

since you can just copy and paste some piece of code and not worry about quotes.

=head2 L

These characters are used in French and German as quotation marks. In Perl 6 they are used as L, L and as bracket alternative in POD6.

=begin table

    symbol  unicode code point  ascii equivalent
    ------  ------------------  ----------------
    «       U+00AB              <<
    »       U+00BB              >>

=end table

Thus constructs such as these are now possible:

    say (1, 2) »+« (3, 4);     # OUTPUT: «(4 6)␤» - element-wise add
    [1, 2, 3] »+=» 42;         # add 42 to each element of the array
    say «moo»;                 # OUTPUT: «moo␤»

    my $baa = "foo bar";
    say «$baa $baa ber».perl;  # OUTPUT: «("foo", "bar", "foo", "bar", "ber")␤»

=head2 Set/bag operators

The L all have set-theory-related symbols; the unicode code points and their ascii equivalents are listed below. To compose such a character, it is merely necessary to enter the character composition chord (e.g. C in Vim; C in Bash) then the unicode code point hexadecimal number.

=begin table

    operator  unicode code point  ascii equivalent
    --------  ------------------  ----------------
    ∈         U+2208              (elem)
    ∉         U+2209              !(elem)
    ∋         U+220B              (cont)
    ∌         U+220C              !(cont)
    ⊆         U+2286              (<=)
    ⊈         U+2288              !(<=)
    ⊂         U+2282              (<)
    ⊄         U+2284              !(<)
    ⊇         U+2287              (>=)
    ⊉         U+2289              !(>=)
    ⊃         U+2283              (>)
    ⊅         U+2285              !(>)
    ∪         U+222A              (|)
    ∩         U+2229              (&)
    ∖         U+2216              (-)
    ⊖         U+2296              (^)
    ⊍         U+228D              (.)
    ⊎         U+228E              (+)

=end table

=head2 Mathematical symbols

Wikipedia contains a full list of L as well as links to their mathematical meaning.

=head2 Greek characters

Greek characters may be used as variable names. For a list of Greek and Coptic characters and their unicode code points see the L.

For example, to assign the value 3 to π, enter the following in Vim (whitespace added to the compose sequences for clarity):

=begin code :allow :lang
my $B = 3;  # same as: my $π = 3;
say $B;     # 3    same as: say $π;
=end code

=head2 Superscripts and subscripts

A limited set of L can be created directly in unicode by using the C, C and (less often) the C ranges. However, to produce a value squared (to the power of 2) or cubed (to the power of 3), one needs to use C and C since these are defined in the L. Thus, to write the L expansion around zero of the function C one would input into e.g.
Vim the following:

=begin code :allow :lang
exp(x) = 1 + x + xB/2! + xB/3! + ... + xB/n!
# which would appear as
exp(x) = 1 + x + x²/2! + x³/3! + ... + xⁿ/n!
=end code

Or to specify the elements in a list from C<1> up to C:

=begin code :allow :lang
AB, AB, ..., AB
# which would appear as
A₁, A₂, ..., Aₖ
=end code

=end pod

# vim: expandtab softtabstop=4 shiftwidth=4 ft=perl6