
Commit 3e4710b

Test that bitwise string ops maintain normalization
This adds tests for MoarVM/MoarVM#867. I fixed this in MoarVM recently and had already added tests to make sure that synthetics are handled properly and that the bitwise ops operate on a per-codepoint basis. The new tests use operand codepoints whose bitwise result is a codepoint that normalizes to multiple codepoints: for example, 2940 +& 2910 = 2908, but 2908.chr.ords gives (2849, 2876). The AND, OR and XOR string ops are all tested.
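
The arithmetic behind the three test cases can be checked with a short Raku sketch. It is only an illustration, not part of the commit: the operand values are taken from the tests, and the final three comparisons assume a MoarVM build that includes the fix for MoarVM/MoarVM#867 (no output is claimed for them here).

```raku
# Naive per-codepoint results of the integer bitwise operators.
say 2940 +& 2910;   # 2908
say 2910 +^ 2;      # 2908
say 2904 +| 4;      # 2908

# Codepoint 2908 (U+0B5C) is not kept as a single codepoint after normalization.
say 2908.chr.ords;  # (2849 2876)

# The string bitwise operators are expected to return the normalized form.
# Assumption: running on a MoarVM that contains the #867 fix.
my $expected = (2849, 2876).chrs;
say (2940.chr ~& 2910.chr) eq $expected;
say (2910.chr ~^ 2.chr)    eq $expected;
say (2904.chr ~| 4.chr)    eq $expected;
```

Each case reaches 2908 through a different operator, so all three string ops are exercised against the same normalized result.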
1 parent cb49456 commit 3e4710b

1 file changed: +23 -5 lines changed

S03-operators/bit.t

Lines changed: 23 additions & 5 deletions
@@ -4,7 +4,7 @@ use Test;
 
 # Mostly copied from Perl 5.8.4's t/op/bop.t
 
-plan 60;
+plan 63;
 
 # test the bit operators '&', '|', '^', '+<', and '+>'
 

@@ -63,10 +63,6 @@ plan 60;
 is( "ok \xFF\xFF\n" ~& "ok 19\n", "ok 19\n", 'stringwise ~&, arbitrary string' );
 is( "ok 20\n" ~| "ok \0\0\n", "ok 20\n", 'stringwise ~|, arbitrary string' );
 
-# MoarVM/MoarVM#867
-# TODO: Also test to ensure string is returned normalized. i.e. constructing
-# a bitwise operation whose naive result (doing the op on each codepoint individually)
-# would be different than those individual codepoints normalized.
 sub check_string_bitop (Str:D $a, Str:D $b) {
     my @a = $a.ords;
     my @b = $b.ords;
@@ -98,6 +94,28 @@ sub check_string_bitop (Str:D $a, Str:D $b) {
 #?DOES 3
 check_string_bitop("\c[united states]", "\c[canada, semicolon]");
 check_string_bitop("P" ~ ("\c[BRAHMI VOWEL SIGN VOCALIC RR]" x 5), 'zzzzzzz');
+#?rakudo.jvm 3 todo "JVM does not support NFG strings and normalization"
+# Test that normalization is retained
+# MoarVM/MoarVM#867 (currently fixed)
+# Test to ensure string is returned normalized. i.e. constructing
+# a bitwise operation whose naive result (doing the op on each codepoint individually)
+# would be different than those individual codepoints normalized.
+{
+    my $a = 2940; # 2940 +& 2910 = 2908; But 2908 normalizes to (2849, 2876)
+    my $b = 2910;
+    is-deeply $a.chr ~& $b.chr, (2849, 2876).chrs, "Normalization is retained after string bitwise AND";
+}
+{
+    my $a = 2910; # 2910 +^ 2 = 2908; But 2908 normalizes to (2849, 2876)
+    my $b = 2;
+    is-deeply $a.chr ~^ $b.chr, (2849, 2876).chrs, "Normalization is retained after string bitwise XOR";
+}
+{
+    my $a = 2904; # 2904 +| 4 = 2908; But 2908 normalizes to (2849, 2876)
+    my $b = 4;
+    is-deeply $a.chr ~| $b.chr, (2849, 2876).chrs, "Normalization is retained after string bitwise OR";
+}
+
 # bit shifting
 is( 32 +< 1, 64, 'shift one bit left' );
 is( 32 +> 1, 16, 'shift one bit right' );
