RT #122341 treat all of 0x0a-0x0d,\x85,\x2028,\x2029 as newlines #1087

Closed

FROGGS commented Jul 22, 2014

The script tools/dev/gen_charset_tables.pl was not used because it removes
the character properties of chars in the range 0x81 to 0xFF. Additionally,
u_iscclass now checks for characters with the enum_cclass_newline property,
which it did not do at all before.
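
For illustration, the set of code points named in the title can be written as a single predicate. This is only a sketch: `is_newline_codepoint` is a hypothetical helper, not Parrot's `u_iscclass`, and it hard-codes the list from the title instead of consulting any type table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of Parrot): true for the code points
 * that the PR title asks to be treated as newlines. */
static bool
is_newline_codepoint(uint32_t c)
{
    return (c >= 0x0a && c <= 0x0d)   /* LF, VT, FF, CR */
        || c == 0x85                  /* NEL (next line) */
        || c == 0x2028                /* LINE SEPARATOR */
        || c == 0x2029;               /* PARAGRAPH SEPARATOR */
}
```

Tab (0x09), space (0x20) and the non-breaking space (0xa0) are whitespace but fall outside this set, so the predicate rejects them.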

rurban commented Jul 22, 2014

But we need to keep tools/dev/gen_charset_tables.pl up-to-date also.

rurban pushed a commit that referenced this pull request Oct 4, 2014

Reini Urban
[tools] typetables: fix gen_charset_tables.pl and regenerate
Removes defunct and since 2010 unused Parrot_ascii_typetable.
Adds \v to CCLASS_NEWLINE manually (confirmed),
\x85\xa0 already is in the whitespace cclass.
Several chars 160..191 are not in the [[:punct:]] class anymore.

Added bootstrap-tables make target
Improved src/string/encoding/tables.c pod.
Closes PR #1087

Beware: My new up-to-date libc removed the [[:punct:]] class
of several chars 160..191.
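
The commit message ties the [[:punct:]] change to the libc in use, which suggests the tables are derived from the host's character classification. A rough sketch of that idea in C follows. The real generator is a Perl script and may work quite differently; the flag names and `build_typetable` are hypothetical, and Parrot's actual enum_cclass_* values are not reproduced here. Because ctype has no newline class, the newline bit has to be set by hand, which is in the spirit of "Adds \v to CCLASS_NEWLINE manually".

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch only. */
enum {
    SKETCH_WHITESPACE = 1u << 0,
    SKETCH_PUNCT      = 1u << 1,
    SKETCH_NEWLINE    = 1u << 2
};

/* Fill a 256-entry table from the current locale's ctype classes,
 * then add the newline flag manually. */
static void
build_typetable(uint32_t table[256])
{
    for (int c = 0; c < 256; c++) {
        uint32_t flags = 0;
        if (isspace(c)) flags |= SKETCH_WHITESPACE;
        if (ispunct(c)) flags |= SKETCH_PUNCT;
        table[c] = flags;
    }

    /* No ctype predicate exists for newlines: set LF, VT, FF, CR and NEL by hand. */
    for (int c = 0x0a; c <= 0x0d; c++)
        table[c] |= SKETCH_NEWLINE;
    table[0x85] |= SKETCH_NEWLINE;
}
```

Entries above 0x7f depend on the locale and libc version the sketch runs under, which is the kind of drift the warning above describes.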

rurban commented Oct 4, 2014

smoking as smoke-me/newlines2-gh1087 failed 2 tests: https://travis-ci.org/parrot/parrot/jobs/37042852#L1919

@rurban rurban self-assigned this Oct 4, 2014

@rurban rurban added this to the 6.9.0 milestone Oct 4, 2014

rurban commented Oct 4, 2014

Remaining errors on Travis (link above), not reproducible on my systems:

t/op/string_cclass.t ................ 1/11 
#   Failed test 'unicode is_cclass whitespace'
#   at t/op/string_cclass.t line 319.
#          got: '110
# char 0xa0 not reported as ws
# 11110
# char 0x85 not reported as ws
# 11111111111111111
# '
#     expected: '1111111111111111111111111
# '
#   Failed test 'unicode find_not_ccclass whitespace'
#   at t/op/string_cclass.t line 362.
#          got: '28 2
# '
#     expected: '28 25
# '
# Looks like you failed 2 tests of 11.
t/op/string_cs.t .................... 1/58 
#   Failed test 'is_whitespace'
#   at t/op/string_cs.t line 109.
#          got: '01110
# 01110
# '
#     expected: '01111
# 01110
# '
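
The find_not_cclass failure looks like a consequence of the is_cclass one. Assuming find_not_cclass returns the index of the first character outside the class (or the end of the string if there is none), a single misclassified character moves the result forward. The sketch below uses hypothetical names and a made-up five-character input, not the string from t/op/string_cclass.t:

```c
#include <stddef.h>
#include <stdint.h>

/* Predicate type: non-zero if the code point belongs to the class. */
typedef int (*cclass_pred)(uint32_t);

/* Hypothetical stand-in for a find_not_cclass-style scan: index of the
 * first element at or after offset that is NOT in the class, or len. */
static size_t
find_not_in_class(const uint32_t *s, size_t offset, size_t len,
                  cclass_pred in_class)
{
    size_t i = offset;
    while (i < len && in_class(s[i]))
        i++;
    return i;
}

/* Whitespace as seen by a table that lacks 0x85 and 0xa0. */
static int
ws_old(uint32_t c)
{
    return c == 0x20 || (c >= 0x09 && c <= 0x0d);
}

/* Whitespace as seen by a table that includes 0x85 and 0xa0. */
static int
ws_new(uint32_t c)
{
    return ws_old(c) || c == 0x85 || c == 0xa0;
}
```

For the input `{0x20, 0x09, 0xa0, 0x0a, 0x85}`, the scan with `ws_old` stops at index 2 (the 0xa0), while the scan with `ws_new` runs to the end and returns 5. That is the same shape as the reported '28 2' versus '28 25'.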

rurban pushed a commit that referenced this pull request Oct 4, 2014

Reini Urban
[tools] typetables: fix gen_charset_tables.pl and regenerate
Removes defunct and since 2010 unused Parrot_ascii_typetable.
Adds \v to CCLASS_NEWLINE manually (confirmed),
\x85 and \xa0 are confirmed to be in the whitespace cclass now, but
several old systems fail the whitespace test for \xa0
(non-breaking space).
Several chars 160..191 are not in the [[:punct:]] class anymore.

Added bootstrap-tables make target, update the tables automatically.
Improved src/string/encoding/tables.c pod.
Closes PR #1087

rurban pushed a commit that referenced this pull request Oct 5, 2014

Reini Urban
[tools] typetables: fix gen_charset_tables.pl and regenerate

@rurban rurban added the Patch label Oct 14, 2014

rurban pushed a commit that referenced this pull request Oct 14, 2014

Reini Urban
[tools] typetables: fix gen_charset_tables.pl and regenerate

rurban commented Oct 14, 2014

Merged into 6.9.0 with f62bc76

@rurban rurban closed this Oct 14, 2014

rurban commented Oct 15, 2014

This fixed

t/spec/S05-metachars/newline.rakudo.parrot                    (Wstat: 0 Tests: 15 Failed: 0)
  TODO passed:   7, 14