utf8-bracket support #11271
From perl-diddler@tlinx.org
Created by perl-diddler@tlinx.org
I was trying to quote a block of code. Thing is, to do that, you have to It's a shame actually, that someone confused less-than and greater-than It wouldn't be too hard, I wouldn't think, to pair up "Right & Left" At the very least, the manpage (and any other references to the Perl Info
|
From @iabyn
On Wed, Apr 20, 2011 at 10:02:45PM -0700, Linda Walsh wrote:
I think you meant *U+232A* for the right bracket. But having said that,
$ cat /tmp/p
#!/usr/bin/perl
$ ./perl /tmp/p > /tmp/pp
$ cat /tmp/pp
use utf8;
$x = q〈abc〉;
print qq{x=[$x]\n};
$ ./perl /tmp/pp
Can't find string terminator "�" anywhere before EOF at /tmp/pp line 1.
-- |
The RT System itself - Status changed from 'new' to 'open' |
From @khwilliamson
On 04/20/2011 11:02 PM, Linda Walsh (via RT) wrote:
If we were to do this, the criteria should probably be members of the |
From @khwilliamson
On 04/21/2011 08:32 PM, Karl Williamson wrote:
But perhaps I should have included the initial and final quotes, of Note that some of the first set have the name QUOTATION, but aren't |
From tchrist@perl.com
Except that the pair she had suggested, LEFT- and RIGHT-POINTING ANGLE « 00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK Of those, these four are *not* Bidi Mirrored: ‘ 2018 LEFT SINGLE QUOTATION MARK I do agree that of the BidiM Symbols, probably only "<" and ">" ﹤ FE64 SMALL LESS-THAN SIGN But I dunno. Here are the only BidiM full/halfwidth code points: ( FF08 GC=Ps FULLWIDTH LEFT PARENTHESIS I don't know whether you really want to include the verticals: ⸠ 2E20 GC=Pi LEFT VERTICAL BAR WITH QUILL --tom |
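The property checks discussed above can be tried from Perl itself. The following is only a minimal sketch and not part of the original thread: the helper names `open_category` and `is_mirrored` are hypothetical, the four sample code points are chosen purely for illustration, and the results depend on the Unicode version bundled with the perl that runs it.

```perl
use strict;
use warnings;

# Illustrative sample only: an ASCII bracket, two quotation marks, and the
# left angle bracket from the original request.
my @samples = (0x0028, 0x00AB, 0x2018, 0x2329);

# Hypothetical helper: report whether a character is opening punctuation
# (General Category Ps) or an initial quote (Pi); anything else is 'other'.
sub open_category {
    my ($ch) = @_;
    utf8::upgrade($ch);    # force character semantics below U+0100
    return 'Ps' if $ch =~ /\p{Ps}/;
    return 'Pi' if $ch =~ /\p{Pi}/;
    return 'other';
}

# Hypothetical helper: 1 if the character has the Bidi_Mirrored property.
sub is_mirrored {
    my ($ch) = @_;
    utf8::upgrade($ch);
    return $ch =~ /\p{Bidi_Mirrored}/ ? 1 : 0;
}

for my $cp (@samples) {
    my $ch = chr $cp;
    printf "U+%04X GC=%s Bidi_Mirrored=%s\n",
        $cp, open_category($ch), is_mirrored($ch) ? 'Y' : 'N';
}
```

Checking both properties side by side shows why neither one alone settles the question: U+2018 is an initial quote yet is not Bidi mirrored, which is the kind of mismatch raised in this comment.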
From tchrist@perl.com
I get only two: % unichars -c '\pP' '\P{QMark}' 'NAME =~ /QUOT/' And they aren't from the Pi/Pf set. --tom |
From perl-diddler@tlinx.org
tchrist1 via RT wrote:
By she are you meaning me? I do like the double angle brackets that are called Honestly, when I submitted this, I thought the easiest thing Is there something wrong in that 'simple' approach? It would seem |
From @Tux
On Fri, 22 Apr 2011 00:30:56 -0700, Linda Walsh
That would give you more than you ask for: LEFT has ± 328 entries, and RIGHT has ± 331 The list only gets interesting when the "LEFT" code point has a 000028 ( LEFT PARENTHESIS
000028 ( LEFT PARENTHESIS -- |
From jwkrahn@shaw.ca
Linda Walsh wrote:
Have you thought about using a here-document to quote your code:
my $code = <<CODE_BLOCK;
# your code here
CODE_BLOCK
And if you use single quotes it won't be interpolated:
my $code = <<'CODE_BLOCK';
# your code here
CODE_BLOCK
John |
From @Abigail
On Wed, Apr 20, 2011 at 10:02:45PM -0700, Linda Walsh wrote:
If in your block of code, your '{}', '[]' or '()' are balanced, you can And if your code uses "real angle brackets", it fails to work. Abigail |
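The point about balanced delimiters can be sketched as follows. This is an illustration only, with made-up variable names and contents: the ASCII bracket delimiters nest, so balanced code can be quoted unchanged, while an unbalanced closing delimiter has to be backslash-escaped.

```perl
use strict;
use warnings;

# Balanced braces inside q{} nest, so this code needs no escaping.
my $balanced = q{if ($x) { print "yes\n" }};

# A lone closing brace would end the string early, so it is escaped;
# the backslash is dropped because it precedes the delimiter.
my $unbalanced = q{a \} b};

print "$balanced\n";
print "$unbalanced\n";
```

Because `q{}` does not interpolate, `$x` and `\n` stay in the quoted text exactly as written.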
From tchrist@perl.com
That's putting it mildly. Another area where I find things have been
Is there an easier way to pull those out of BidiMirroring.txt than --tom |
From @Abigail
On Fri, Apr 22, 2011 at 09:16:47AM -0600, Karl Williamson wrote:
No 'd' => 'b' or 'p' => 'q' ? ;-) And then there are '|' => '|', '!' => '!', and other symmetric glyphs - (I'd pick d/b and p/q over any of the non-ASCII mirrored glyphs; my I sometimes wish that Perl would do delimiters as POD does. So one could say qq<<< a > b >>>; Print "a > b" Abigail |
From perl-diddler@tlinx.org
karl williamson via RT wrote:
Something broken with your system? They don't change RtL semantics anywhere I used them. Where are you seeing this behavior?
Unless there is a "RIGHT BAGGAGE" to match it up with, I |
From @khwilliamson
On 04/22/2011 10:10 AM, Tom Christiansen wrote:
It is somewhat easier to use lib/unicore/To/Bmg.pl |
From @khwilliamson
On 04/22/2011 10:37 AM, Linda Walsh wrote:
On the email I received from H. Merijn Brand, the RIGHT-TO-LEFT OVERRIDE
|
From perl-diddler@tlinx.org
karl williamson via RT wrote:
What email program do you use? FF doesn't display that behavior ... Oh, you mean after U+200E/U+200F I thought you meant inherent the characters for the 2nd part of the I'd rule out those characters because they contain A lot of these objections are trivial details that would be What I gave was a general concept -- not a tested algorithm. |
From @obra
On Wed 20.Apr'11 at 22:02:45 -0700, Linda Walsh wrote:
I'd be curious to know if the Perl 6 community can offer us any useful -Jesse |
From vadim.konovalov@alcatel-lucent.com
STD.pm and STD.pm6 have This list is useful, but I think even more complete list have I see "\x{2329}" => "\x{232A}", rather than \x{2330}, though. Regards, |
From @cpansprout
On Aug 2, 2011, at 10:15 PM, Brian Fraser wrote:
Are we sure it’s even a good idea to allow Unicode paired delimiters? I know we already allow for Unicode identifiers, but it has proven to be problematic, simply because Unicode is a moving target. Every Unicode upgrade changes Perl syntax just slightly. If we allow Unicode paired brackets, that will just aggravate the problem. Also, it would not be backward-compatible, as these currently work: $ perl -Mutf8 -le 'print q «foo«' perlop states that it is only the four ASCII brackets that are treated specially. That implies that my example works. Since it’s documented, we can’t easily change it without a deprecation cycle, can we? |
From perl-diddler@tlinx.org
Father Chrysostomos via RT wrote:
use unicode_brackets; |
From @Hugmeir
On 8/7/11, Father Chrysostomos <sprout@cpan.org> wrote:
I think this is a valid concern, but I don't think the decision should |
From @nwc10
On Sun, Aug 07, 2011 at 03:05:38PM -0700, Linda Walsh wrote:
Fails to address the valid concern that Unicode is a moving target - what's Nicholas Clark |
From @Hugmeir
On Fri, Aug 12, 2011 at 6:46 AM, Nicholas Clark <nick@ccl4.org> wrote:
And in any case, I think that, if you want to change the syntax, you should use charnames qw( :full ); or use unicode_brackets Unicode => v6; or somesuch. Though all of this would need a new API -- Might be less wrong |
From zefram@fysh.org
Brian Fraser wrote:
Ah, that's reasonably nice.
I've been thinking about how to handle delimiters in plugged-in syntax. So syntax plugins aren't a solution here, they're another source of -zefram |
From perl-diddler@tlinx.org
Nicholas Clark via RT wrote:
That's not exactly true. It shouldn't happen. Characters can't be changed once they are created. They can be That 'not deleting anything that has been published', rule was required New delimiters may come, but they'll come out of what are now, invalid |
From @nwc10
On Fri, Aug 12, 2011 at 07:50:18AM -0700, Linda Walsh wrote:
Character properties can change. U-00B5 was Greek once. It isn't now. So the concern is that if we drive parsing using Unicode properties, then Nicholas Clark |
From perl-diddler@tlinx.org
Nicholas Clark via RT wrote:
Could you find a better example? As it is still is. Unless you can come up with a more firm example, I'm only willing Unwritten principle #11: permanent stability. We have taken the liberty of adding an eleventh principle the official I'm probably as much an outsider as you (my largest claim to I don't see any evidence to support the type changes you express concern
They may have happened, but the cited example is not one of those cases.
Hey -- we can always blame it on them!.. ;-) |
From @nwc10
On Sat, Aug 13, 2011 at 02:02:32AM -0700, Linda Walsh wrote:
$ ~/Sandpit/583/bin/perl -le '$_ = chr 0xB5; utf8::upgrade $_; print /\p{isGreek}/ ? "Greek!" : "not :-("'
Not FUD. See above. I don't know *why* Perl's implementation changed, but it Nicholas Clark |
From perl-diddler@tlinx.org
chromatic via RT wrote:
=== |
From perl-diddler@tlinx.org
Dave Mitchell wrote:
No... You didn't read what I wrote... The beauty of them is they are not perl operators, so they could safely
They are for bug fixes....not just 'important post RC0 unless things Since « », are in the non-unicode region, it could be argued that it
|
From chromatic@wgz.org
On Friday, July 13, 2012 06:35:00 PM Linda W wrote:
That's why my version works. Later, the same document (perldoc perlop) says: A backslash represents a backslash unless followed by the delimiter or -- c |
From @ikegami
On Fri, Jul 13, 2012 at 9:43 PM, Linda W <perl-diddler@tlinx.org> wrote:
|
From perl-diddler@tlinx.org
Eric Brine wrote:
Except what you wrote starts out with a false statement. "Totally I think I mentioned they could occur in strings, aren't those string What is with your attitude about my posts that causes you to respond Let me stress this point...and think before claiming it is false. Unlike the current paired operators: <>{}()[], «» would not They'd even work for qr« » and you'd not have to worry about perl I did use single quote for my usage, as someone else suggested as
|
From @arc
Linda W <perl-diddler@tlinx.org> wrote:
Consider this code: my $s = q«a«;#»»; In every current release of Perl 5, that sets $s to "a". Under your So your proposal would change the meaning of valid programs. That's -- |
From perl-diddler@tlinx.org
Aaron Crane via RT wrote:
So did changes in 5.16, 5.14, 5.12... It wouldn't be the first time...but no notice would be no fun... so "use curquotes" Then set a deprecation schedule for curquotes (or not) -- as those are I think the number of people affected by something like the above would be |
From @doy
On Sun, Jul 15, 2012 at 07:41:43PM -0700, Linda W wrote:
If you read the paragraph directly following the one you just quoted, -doy |
From perl-diddler@tlinx.org
Jesse Luehrs via RT wrote:
You mean that addressed by the part you elided?: It wouldn't be the first time... but no notice would be no fun... so "use curquotes" Then set a deprecation schedule for curquotes (or not) -- as those are Adding something with a use feature latinquotes or curquotes, in a minor |
From @doy
On Mon, Jul 16, 2012 at 02:20:41PM -0700, Linda W wrote:
Regardless of how little impact it should have, we don't add new -doy |
From perl-diddler@tlinx.org
Jesse Luehrs via RT wrote:
==== |
From @ikegami
On Mon, Jul 16, 2012 at 5:20 PM, Linda W <perl-diddler@tlinx.org> wrote:
You mean UTF-8 or UTF-16 or something similar, not Unicode. Unicode is not |
From perl-diddler@tlinx.org
Eric Brine wrote:
The [Perl] "Unicode Bug" In character semantics they are interpreted as Unicode code In byte semantics, they are considered to be unassigned characters, I.e. by default they are illegal as characters and due to the mixed By default, /I would submit/ that this range should be elevated, at Not doing so broke basic perl functionality from Perl4 days in its Most of email comes through looking like: But I understand change doesn't come overnight as well... |
From tchrist@perl.com
I'm afraid that you're really rather horribly confused about all this. You have managed to get yourself into a snit because you've unwittingly --tom |
From perl-diddler@tlinx.org
tchrist1 via RT wrote:
So you are saying that no matter if my terminology wasn't exactly you Normally, have PERL5OPT set to -CSA, "use utf8" in my source and a I'll have to think of a different way to explain this ... |
From @Leont
On Wed, Jul 18, 2012 at 12:07 PM, Linda W <perl-diddler@tlinx.org> wrote:
Because you're making no sense that way.
Code examples would be helpful. -CSA manages @ARGV and Leon |
From tchrist@perl.com
Linda W <perl-diddler@tlinx.org> wrote
I have no idea what the answers to those particular questions are
That's still vague. Are you using unicode_strings in your source? And are you reading Unicode data?
Good idea, that. --tom |
From perl-diddler@tlinx.org
On Wed Jul 18 05:01:47 2012, tom christiansen wrote:
Really. Perhaps my perceptions are not always correct, are you really so
I filed a bug, that more clearly elucidates what I am seeing as a problem. You can call it confusion, but, if such exists, it's because someone Example. I have "use utf8" in my code and have a sub name using the script 'f': Now Perl -- it seems confused, as it thinks the UTF-8 encoding of U+192 When it prints out I see: "�Register_FStype" So Please tell me, who doesn't understand the difference between code Is this clear enough for you? |
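The distinction argued over in the last few comments, between a code point and its encoded bytes, can be sketched as below. This is a hedged illustration and does not attempt to reproduce the reported output: the string is built from U+0192 plus the ASCII name quoted above, and the use of the core `Encode` module and an explicit output layer are choices made for this example, not something taken from the reporter's code.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+0192 followed by the ASCII part of the name quoted in the report.
my $name  = "\x{192}Register_FStype";

# The same text as a byte string in UTF-8.
my $bytes = encode('UTF-8', $name);

printf "characters: %d\n", length $name;
printf "bytes:      %d\n", length $bytes;
printf "U+0192 as UTF-8 bytes: %s\n",
    join ' ', map { sprintf '%02X', ord($_) } split //, encode('UTF-8', "\x{192}");

# Printing a character string needs an encoding layer on the handle;
# without one, Perl warns about a wide character.
binmode STDOUT, ':encoding(UTF-8)';
print "$name\n";
```

The character string and the byte string differ in length because U+0192 is one character but takes two bytes in UTF-8.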
Fixed by 0b6e3da and preceding commits |
Migrated from rt.perl.org#89032 (status was 'open')
Searchable as RT89032$