endless loop in perl unicode converter #12345

Closed
p5pRT opened this issue Aug 23, 2012 · 7 comments

p5pRT commented Aug 23, 2012

Migrated from rt.perl.org#114558 (status was 'resolved')

Searchable as RT114558$

p5pRT commented Aug 23, 2012

From wosch@FreeBSD.org

Hi,

I run a script which fetches data from a web site. The site changed its
character encoding from UTF-8 to Latin-1. I didn't notice this until I
saw some perl scripts running in an endless loop.

How to repeat:

perl utf8.pl < utf8.html
utf8 "\xFF" does not map to Unicode at utf8.pl line 11, <> line 1.
Cro

[ endless loop ]

My guess is that this is an off-by-one error, or an overflow of a
1024-byte buffer. The input file utf8.html is 1028 bytes long: the 3
bytes "Cro", followed by \377, then 1023 times "x", followed by "X". If
you remove the "X", the script does not hang.
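
The input described above can be regenerated without the attachment. This is a minimal sketch that assumes the byte layout is exactly as stated in the paragraph above; the output file name utf8.txt is an arbitrary choice, not something the report prescribes.

```perl
use strict;
use warnings;

# Byte layout taken from the description above: "Cro", one 0xFF byte
# (octal \377), 1023 times "x", then a final "X".
# 3 + 1 + 1023 + 1 = 1028 bytes.
my $data = "Cro" . "\xFF" . ("x" x 1023) . "X";

# Assumption: any file name works; utf8.txt is used here for illustration.
my $file = "utf8.txt";

# Write in raw mode so that 0xFF is stored as a single byte.
open(my $out, '>:raw', $file) or die "cannot open $file: $!";
print {$out} $data;
close($out) or die "cannot close $file: $!";

printf "%s: %d bytes\n", $file, -s $file;
```

Dropping the final "X" from $data should give the 1027-byte variant that, according to the report, does not hang.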

Affected OS: Linux, MacOS, FreeBSD
Affected perl version: 5.8.9 - 5.14.2

perl 5.16.1 and later seem to work fine.

See the attachments for the test case.

-Wolfram

--
Wolfram Schneider <wosch@FreeBSD.org> http://wolfram.schneider.org

p5pRT commented Aug 23, 2012

From wosch@FreeBSD.org

utf8.pl

p5pRT commented Aug 24, 2012

From @jkeenan

On Thu Aug 23 14:55:28 2012, wosch@FreeBSD.org wrote:

> Hi,
>
> I run a script which fetches data from a web site. The site changed its
> character encoding from UTF-8 to Latin-1. I didn't notice this until I
> saw some perl scripts running in an endless loop.
>
> How to repeat:
>
> perl utf8.pl < utf8.html
> utf8 "\xFF" does not map to Unicode at utf8.pl line 11, <> line 1.
> Cro
>
> [ endless loop ]
>
> My guess is that this is an off-by-one error, or an overflow of a
> 1024-byte buffer. The input file utf8.html is 1028 bytes long: the 3
> bytes "Cro", followed by \377, then 1023 times "x", followed by "X". If
> you remove the "X", the script does not hang.
>
> Affected OS: Linux, MacOS, FreeBSD
> Affected perl version: 5.8.9 - 5.14.2
>
> perl 5.16.1 and later seem to work fine.
>
> See the attachments for the test case.
>
> -Wolfram

I reproduced this problem with 5.12.0 and 5.14.2; 5.16.0 ran fine. So
there is a bug in pre-5.16 versions of perl.

However, to reproduce this I had to save your sample data as 'utf8.txt'
-- not 'utf8.html'. The latter included all the JavaScript found on the
rt.perl.org web page and so was much larger than 1028 bytes.
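
For checking a given perl binary without leaving a hung process behind, a small harness along the following lines may help. It is only a sketch under stated assumptions: the attached utf8.pl is not shown in this ticket, so the child below simply reads the test input through an `:encoding(utf8)` layer, which is a guess at what triggers the reported message. The 10-second limit and the variable names are arbitrary.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Test input as described in the report: "Cro", one 0xFF byte,
# 1023 times "x", then "X" (1028 bytes in total).
my ($tmp, $path) = tempfile();
binmode($tmp);
print {$tmp} "Cro\xFF" . ("x" x 1023) . "X";
close($tmp) or die "cannot close $path: $!";

my $timeout = 10;    # seconds; an arbitrary limit
my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: decode the file through an encoding layer. Assumption: this
    # approximates the attached utf8.pl, which is not reproduced here.
    open(my $in, '<:encoding(utf8)', $path) or die "cannot open $path: $!";
    while (defined(my $line = <$in>)) {
        # Nothing to do; reading is enough to exercise the decoder.
    }
    close($in);
    exit 0;
}

# Parent: wait for the child, but give up after $timeout seconds.
my $hung = 0;
my $done = eval {
    local $SIG{ALRM} = sub { die "timeout\n" };
    alarm($timeout);
    waitpid($pid, 0);
    alarm(0);
    1;
};
if (!$done) {
    die $@ unless $@ eq "timeout\n";
    $hung = 1;
    kill 'KILL', $pid;    # the child is stuck, so terminate it
    waitpid($pid, 0);
}
unlink $path;

my $verdict = $hung ? "hung on the test input" : "finished without hanging";
printf "perl %s %s\n", $], $verdict;
```

The read runs in a forked child rather than under a plain `alarm` in the same process, because a loop inside the decoding code might never return control to the interpreter so that a deferred signal handler could run. A "does not map to Unicode" warning on STDERR is expected either way.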

Thank you very much.
Jim Keenan

p5pRT commented Aug 24, 2012

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Aug 24, 2012

From @jkeenan

On Fri Aug 24 16:23:01 2012, jkeenan wrote:

> perl 5.16.1 and later seem to work fine.

If someone can explain how we fixed this problem in Perl 5.16, then we
can close this ticket.

jimk

p5pRT commented Aug 26, 2012

From @andk

"James E Keenan via RT" <perlbug-followup@perl.org> writes:

> If someone can explain how we fixed this problem in Perl 5.16, then we
> can close this ticket.

v5.15.9-259-geb83ed8

commit eb83ed8
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Apr 18 17:36:01 2012 -0600

  utf8.c: refactor utf8n_to_uvuni()

--
andreas

p5pRT commented Aug 26, 2012

@cpansprout - Status changed from 'open' to 'resolved'
