This repository has been archived by the owner on Jun 1, 2023. It is now read-only.
To guard against the confusable-character attacks described in TR39, we add the following Unicode rules for identifiers and literals:
1. The first script in an identifier that is neither Latin nor Common is the only such script allowed. Any other script leads to a parser error.
2. Additional Unicode scripts can and should be declared via `use utf8 'Greek', 'script-name2', ...;` to prevent mixed-script errors. This allows more scripts than rule 1 does, and the declaration can be scoped to a block.
3. The 'Common' and 'Latin' scripts are always enabled and need not be declared.
This holds for all identifiers (all names: package, gv, sub, variables) and literal numbers.
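The rules above can be sketched roughly as follows, in Python for brevity. This is only an illustration, not the parser's actual implementation: `guess_script` is a crude, hypothetical stand-in for a real Script property lookup (it guesses from the first word of the character name), `check_identifier` is a made-up helper name, and the way rules 1 and 2 combine here (declared scripts are always allowed, plus one undeclared script per identifier) is an assumed reading of the text.

```python
import unicodedata

# Crude stand-in for a real Script property lookup: guess the script from
# the first word of the character name. Assumption for illustration only;
# real code should use the Script property from Scripts.txt.
KNOWN_SCRIPTS = {"LATIN", "GREEK", "CYRILLIC", "ARMENIAN", "HEBREW", "ARABIC"}

def guess_script(ch):
    first_word = unicodedata.name(ch, "").split(" ")[0]
    return first_word.capitalize() if first_word in KNOWN_SCRIPTS else "Common"

# Rule 3: these two scripts never need to be declared.
ALWAYS_ALLOWED = {"Latin", "Common"}

def check_identifier(name, declared=()):
    """Return None if the name passes, else an error string (hypothetical helper)."""
    allowed = ALWAYS_ALLOWED | set(declared)  # rule 2: declared scripts
    first_script = None  # rule 1: first undeclared non-Latin, non-Common script
    for ch in name:
        script = guess_script(ch)
        if script in allowed:
            continue
        if first_script is None:
            first_script = script  # tolerated, and now the only one allowed
        elif script != first_script:
            return f"mixed script: {script} after {first_script}"
    return None

# Greek alpha followed by Cyrillic a: rejected unless Greek is declared.
print(check_identifier("\u03b1\u0430"))
print(check_identifier("\u03b1\u0430", declared=("Greek",)))
```

Note that under these rules a Latin identifier containing a single other script still passes, because Latin is always enabled; only a second undeclared script triggers the error.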
The script name is returned by `Unicode::UCD::charscript($codepoint_as_uv)`.
There are currently 131 scripts; the following command lists them:
perl -alne'/; (\w+) #/ && print $1' lib/unicore/Scripts.txt | sort -u > scripts.lst
The remaining question is whether certain languages need aliases for sets of scripts, because they use multiple scripts by default: for example Japanese for Hiragana and Katakana (what about Kanji, i.e. Han?) and Korean for Hangul and Han (Chinese).
Document the new unicode mixed script confusable security
restriction. Declare valid unicode scripts via use utf8 arguments.
This bug was introduced with 5.16.
See #229.
let them be generated by regen/regcharclass.pl
Change the API from UV cp to U8 *s. This spares a costly utf8n_to_uvchr
conversion in hot code; the conversion is now done only when throwing the error.
Slowdown <1%
See GH #229.
See http://www.unicode.org/reports/tr39/#Mixed_Script_Detection