Perl 6 Supply that takes a supply of bytes as input and returns a supply of Unicode characters


Rakudo's built-in UTF-8 decoder waits for a possible combining character before getc() returns. With this module, you can read raw bytes from $*IN yourself and turn them into Unicode characters:

use Unicode::UTF8-Parser;

my $stdin = supply { while (my $b = $*IN.read(1)[0]).defined { emit($b) } };
my $utf8 = parse-utf8-bytes($stdin);

$utf8.tap({ say "got $_" });

The module exports the sub parse-utf8-bytes, which takes a supply of bytes as input and returns a new supply of characters.

If the stream contains non-UTF-8 bytes, they are emitted as Int values. You have to handle these yourself, for example by checking $val ~~ Int.
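
A minimal sketch of one way to separate decoded characters from invalid bytes in a tap. The input supply built with Supply.from-list and the sample byte values are assumptions chosen for illustration; only parse-utf8-bytes and the Int convention come from this module.

```perl6
use Unicode::UTF8-Parser;

# Illustrative input: the bytes for "h" and "i", followed by 0xFF,
# which can never appear in valid UTF-8.
my $bytes = Supply.from-list(0x68, 0x69, 0xFF);
my $utf8  = parse-utf8-bytes($bytes);

$utf8.tap(-> $val {
    given $val {
        # Non-UTF-8 bytes arrive as Int, so report them separately.
        when Int { note "invalid byte: 0x{ .base(16) }" }
        # Everything else is a decoded character.
        default  { say "got $val" }
    }
});
```

Whether to skip, replace, or abort on an invalid byte is left to the caller; the sketch above only reports it on standard error.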