Re: [LONG] Possible utf8 implementation #558

Closed
p5pRT opened this issue Sep 20, 1999 · 4 comments

@p5pRT p5pRT commented Sep 20, 1999

Migrated from rt.perl.org#1408 (status was 'resolved')

Searchable as RT1408$

@p5pRT p5pRT commented Sep 20, 1999

From The RT System itself

I doubt it. Bear in mind \x{beef} is a placeholder for whatever it is
we choose to send out by default. Could just as easily be U+BEEF or
\uBEEF or \N{DEAD COW MEAT} or &#xBEEF; or whatever.

: I don't mind the \x behaviour for error messages, but I'd really hate
: for it to happen when I'm writing what I think is raw data onto a raw socket,
: and the data happens to contain unicode characters.
:
: Especially if I send out '"Content-length: ". length($var) ."\r\n"' before.
:
: If the socket is in raw data mode, and I don't have "use bytes;" in effect,
: it had _better_ die if I try to send a UTF8 string on the wire....

Death is not good. I reject death. I will stay away from trucks today.

If you're trying to send out Content-length without "use bytes" or its
equivalent then you'll get what you deserve. No guarantees will be
made about the length of the resulting string if you make Perl guess
how to translate utf8 to 8-bit. Contrariwise, if you do tell it how to
translate, then it'll do what you expect, up to and including dying
and losing track of some portion of the data you were trying to output.
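
A minimal sketch of the length question, assuming the Encode module that shipped with later Perl releases (it did not exist when this was written). The helper name `http_payload` and the sample string are hypothetical. The point is only that the character count and the octet count differ, so the header has to be computed on the explicitly encoded octets:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: build a header plus body where the length is
# measured on the encoded octets, not on the character string.
sub http_payload {
    my ($text) = @_;
    my $octets = encode('UTF-8', $text);    # explicit translation to bytes
    return "Content-length: " . length($octets) . "\r\n\r\n" . $octets;
}

my $var = "caf\x{e9} \x{beef}";             # six characters
printf "characters: %d\n", length($var);                    # 6
printf "octets:     %d\n", length(encode('UTF-8', $var));   # 9
```

Sending `length($var)` as the Content-length here would under-report the body by three octets once the string is written out as UTF-8.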

Larry

@p5pRT p5pRT commented Sep 20, 1999

From The RT System itself

:-)

> If you're trying to send out Content-length without "use bytes" or its
> equivalent then you'll get what you deserve. No guarantees will be
> made about the length of the resulting string if you make Perl guess
> how to translate utf8 to 8-bit. Contrariwise, if you do tell it how to
> translate, then it'll do what you expect, up to and including dying
> and losing track of some portion of the data you were trying to output.

"use bytes;" is inaccurate in many situations. And death isn't _that_
bad, especially seeing as you can throw an "eval{}" around it...

I, as a module author, do not want to be concerned with always testing
every string that gets passed to me. This goes for being compatible with
old code as well, as the old code is DEFINITELY not aware of any utf-8
issues.

At least a -w warning would be desirable. Similar to the warning
generated by:

  $ perl -we '$a = "happy"; $b = "sad"; $c = $a + $b'
  Name "main::c" used only once: possible typo at -e line 1.
-> Argument "sad" isn't numeric in add at -e line 1.
-> Argument "happy" isn't numeric in add at -e line 1.

Personally, I still think that if "use bytes;" is not used, and "use utf8"
is not used in a module, then it must be assumed that this module may not
know anything about utf8, and if it happens to write a utf8 string to
a socket not marked as being able to write utf8, well then, an error seems
the only *proper* thing to do.
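
A rough sketch of what that stricter behaviour could look like at user level. The helper `print_raw` is hypothetical, not an existing Perl API, and it assumes "raw" means every character must fit in one byte. It dies on anything wider, and the caller can wrap the call in `eval {}`:

```perl
use strict;
use warnings;

# Hypothetical helper: refuse to write any string that contains a
# character above 0xFF to a handle that is being treated as raw bytes.
sub print_raw {
    my ($fh, $data) = @_;
    die "wide character in raw write\n" if $data =~ /[^\x00-\xFF]/;
    print {$fh} $data or die "write failed: $!\n";
    return 1;
}

# An in-memory handle stands in for the socket in this sketch.
open my $fh, '>', \my $buffer or die "cannot open in-memory handle: $!\n";

print_raw($fh, "plain bytes\r\n");                  # accepted
my $ok = eval { print_raw($fh, "\x{beef}"); 1 };    # dies inside the eval
print "rejected: $@" unless $ok;

close $fh;
```

Later Perl releases ended up closer to the -w suggestion: printing such a string to a handle without an encoding layer produces a "Wide character in print" warning in the `utf8` category.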

mark

--
markm@nortelnetworks.com/mark@mielke.cc/markm@ncf.ca __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | CUE Development (4Y21)
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ | Nortel Networks
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
  and in the darkness bind them...

  http://mark.mielke.cc/

@p5pRT p5pRT commented Apr 22, 2003

@iabyn - Status changed from 'stalled' to 'resolved'

@p5pRT p5pRT closed this Apr 22, 2003