Unicode::Security - Check strings for confusables and mixed scripts
    use v6.d;
    use Unicode::Security;

    say "Nope" if mixed-script( "abcdef" );
    say "Yea" if mixed-script( "aαbβ" );

    # Cyrillic "а" (U+0430) in place of Latin "a"
    say "Looks fake, Rick" if confusable('paypal', "p\x[0430]yp\x[0430]l");
    say "No problem" unless confusable('Paypal', 'paypal');

    say "Not confusing" unless whole-script-confusable("Latin", "DFRVz");
    say "Confusing" if whole-script-confusable("Cyrillic", "scope");
Unicode::Security is a (partial) port of its namesake Perl 5 module. It uses the confusables tables published by the Unicode Consortium to detect which graphemes can be mistaken for one another across two scripts, and which strings could be confused across two or more scripts. It also detects strings that mix scripts, a trick that can be used to slip past literal-string detectors.
The list of confusables is generated from the list published by the Unicode Consortium, using the scripts in resources/script. You should have received a copy along with this distribution. The scripts generate two JSON files, which must be moved by hand to the resources/data directory. You only need to do this if you are certain that the supplied copy is out of sync with the published list.
sub confusable( $string-a, $string-b )
Returns True if one string could be mistaken for the other.
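To illustrate the idea behind a table-driven confusable check, here is a minimal sketch that does not use Unicode::Security at all. It assumes a tiny hand-written table of four Cyrillic letters; the `%prototype` hash and the `skeleton` and `looks-alike` subs are hypothetical names made up for this example, and the module's actual implementation and tables may differ.

```raku
use v6.d;

# Tiny hand-written sample of confusable mappings (an assumption for
# illustration; the real tables come from the Unicode Consortium data).
my %prototype =
    "\x[0430]" => 'a',   # CYRILLIC SMALL LETTER A
    "\x[043E]" => 'o',   # CYRILLIC SMALL LETTER O
    "\x[0440]" => 'p',   # CYRILLIC SMALL LETTER ER
    "\x[0435]" => 'e';   # CYRILLIC SMALL LETTER IE

# Hypothetical helper: replace every character by its look-alike
# prototype when the table has one, otherwise keep it unchanged.
sub skeleton( Str $str --> Str ) {
    $str.comb.map( { %prototype{$_} // $_ } ).join
}

# In this sketch, two *different* strings count as confusable
# when they collapse to the same skeleton.
sub looks-alike( Str $a, Str $b --> Bool ) {
    $a ne $b && skeleton($a) eq skeleton($b)
}

say looks-alike( 'paypal', "p\x[0430]yp\x[0430]l" );  # Cyrillic "а" swapped in
say looks-alike( 'Paypal', 'paypal' );                # differs only in case
```

Comparing skeletons rather than raw strings keeps the check independent of which of the two strings contains the look-alike characters.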
sub whole-script-confusable( $script, $string )
Returns True if the string could be mistaken for another string written entirely in the indicated script.
sub mixed-script( $str )
Returns True if the string includes characters from more than one script.
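As a rough picture of what a mixed-script check involves, the sketch below uses only Raku's built-in `uniprop` to collect the scripts present in a string. The `scripts-in` sub is a hypothetical helper, not part of Unicode::Security, and ignoring the `Common` and `Inherited` scripts is an assumption made here so that digits, spaces and punctuation do not count as a second script.

```raku
use v6.d;

# Hypothetical helper (not part of Unicode::Security):
# returns the set of Unicode scripts used by the characters of a string.
sub scripts-in( Str $str --> Set ) {
    # Look up the Script property of each character, e.g. "Latin" or "Greek".
    my @scripts = $str.comb.map( { .uniprop('Script') } );

    # Assumption: "Common" (digits, punctuation) and "Inherited"
    # (combining marks) do not make a string mixed-script.
    @scripts.grep( { $_ ne 'Common' && $_ ne 'Inherited' } ).Set
}

say scripts-in("abcdef");              # scripts found in a plain Latin string
say scripts-in("aαbβ").elems > 1;      # does the string mix scripts?
```

A string is then treated as mixed-script when the resulting set has more than one element.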
sub mixed-script-confusable( $str )
Returns True if the scripts present in the string could make it confusable with a string written in any one of them.
JJ Merelo email@example.com
COPYRIGHT AND LICENSE
Copyright 2019 JJ Merelo
This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.