Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible typos in src/HLL/CommandLine.nqp #346

Closed
nanis opened this issue Feb 7, 2017 · 18 comments

Comments

Projects
None yet
5 participants
@nanis
Copy link

commented Feb 7, 2017

It is entirely possible that I am missing something obvious, but while trying to figure out what happens between typing

C:\> perl6 -e "say 'yağmur'"

and getting the output

yagmur

on Windows, I was struck by some spelling variations in src/HLL/CommandLine.nqp such as $paser.add-stopper('-e') and my @args := $pasre.arguments .

These look like typos to me, but, not being familiar with the ins and outs of the source, I can't be sure if they are not referencing some globals $paser and $pasre .

So, either HLL::CommandLine has been broken for a while or there are some poorly named global variables.

If these are spelling errors, I am a bit confused by why I haven't been able to spot any repercussions.

zoffixznet added a commit that referenced this issue Feb 7, 2017

Fix typos in POD
Part of fixing #346
@stmuk

This comment has been minimized.

Copy link
Contributor

commented Feb 7, 2017

It's in pod documentation rather than code. A pull request to fix it would be useful except Zoffix got there first!

@zoffixznet

This comment has been minimized.

Copy link
Contributor

commented Feb 7, 2017

If these are spelling errors, I am a bit confused by why I haven't been able to spot any repercussions.

Thanks. Fixed in 4e7eee0. No repercussion because that area of the file is the POD (documentation), so it doesn't actually get executed.

but while trying to figure out what happens between typing

Confirmed on 2016.11 on default Windows 7 command prompt with and without chcp 65001. Looks like there's a ticket for it already, so I'll close this in favor of that: https://rt.perl.org/Ticket/Display.html?id=127925#ticket-history

@samcv perhaps you would have any idea what makes Windows command line unhappy with Unicode?

@zoffixznet zoffixznet closed this Feb 7, 2017

@stmuk

This comment has been minimized.

Copy link
Contributor

commented Feb 7, 2017

I don't think cmd.exe supports unicode. Maybe powershell does (?)

@zoffixznet

This comment has been minimized.

Copy link
Contributor

commented Feb 7, 2017

No idea. If I use perl6 -e "say qq|\x[11f]|" then it prints the proper ğ char.

nqp -e "say('yağmur')" and perl -C -wlE "use utf8; say('yağmur')" have the same issue, so maybe you're right.

perl6 -e "@*ARGS.ords.say" ğ gives (103)

@vendethiel

This comment has been minimized.

Copy link
Member

commented Feb 7, 2017

My guess then would be that it's an input issue?

@nanis

This comment has been minimized.

Copy link
Author

commented Feb 7, 2017

No repercussion because that area of the file is the POD

Argh! I got fooled by GitHub syntax highlighting.

I don't think cmd.exe supports unicode.

Of course it does. The problem is that moar.exe uses main rather than wmain, so it has no hope of getting the right characters if anything outside of the code page is used.

However, I wrote a wrapper around moar.exe which basically does what perl6.bat does but UTF-8 encodes the command line arguments which moar.exe relays to whatever executes Perl 6 code. In that case, I get incorrectly encoded output, and I was trying to locate where that happened.

Am I correct in assuming the culprit is

$nqpcomp.command_line(@ARGS, :encoding('utf8'), :transcode('ascii iso-8859-1'));

in NQP::Compiler or am I barking up the wrong tree?

@zoffixznet zoffixznet reopened this Feb 7, 2017

@zoffixznet

This comment has been minimized.

Copy link
Contributor

commented Feb 7, 2017

Personally, I have no idea. But if that isn't it, another area to check would probably be: https://github.com/rakudo/rakudo/blob/6990133410d5b9c11c991b00751cdaead3ca04da/src/Perl6/Compiler.nqp#L31-L43

@nanis

This comment has been minimized.

Copy link
Author

commented Feb 7, 2017

I am on a roll today ... I meant to reply to RT #127925 but accidentally created RT #130736. I would appreciate it if you could merge the information in the erroneously created ticket to the correct one.

FWIW, my wrapper seems to work with perl in that I get:

C:\> p5run -CS -Mutf8 -E "say 'yağmur'"
yağmur

C:\> p5run -CS -Mutf8 -E "say 'yağmur'" | xxd
00000000: 7961 c49f 6d75 720d 0a                   ya..mur..
@nanis

This comment has been minimized.

Copy link
Author

commented Feb 7, 2017

perl6 -e "@*ARGS.ords.say" ğ gives (103)

C:\> p5run -E "say ord for @ARGV" ğ
287

which is correct.

@zoffixznet

This comment has been minimized.

Copy link
Contributor

commented Feb 7, 2017

appreciate it if you could merge

Merged.

my wrapper seems to work with perl

Sweet. Hopefully the same method can be applied to Perl 6.

@nanis

This comment has been minimized.

Copy link
Author

commented Feb 7, 2017

Hopefully the same method can be applied to Perl 6.

Well, that's the motivation for my search ... When I do the same thing with moar, I get yaÄŸmur instead of yağmur, so I want to find out where UTF-8 encoded command line arguments are getting mangled and I'd rather avoid a wild goose chase.

@timo

This comment has been minimized.

Copy link
Member

commented Feb 7, 2017

I think the right place to look is moarvm's src/io/procops.c in function MVM_proc_clargs where it uses ANSIToUTF8 on all commandline arguments. That function uses MultiByteToWideChar and then WideCharToMultiByte to get the result... i think.

@timo

This comment has been minimized.

Copy link
Member

commented Feb 7, 2017

oh, also: we can totally #ifdef how our main function works to use wmain and change the other places to use utf16 instead of ... whatever ANSI codepage we currently seem to be using? I myself can't write that code because i have no windows to test it on, but we'd definitely accept a pull request that does that.

@nanis

This comment has been minimized.

Copy link
Author

commented Feb 7, 2017

@timo Thank you, I'll look. My wrapper also uses WideCharToMultiByte ... it works with wrapping perl but not moar. Anyway, that's neither here or there. I'll see what I can figure out a little later.

I have to pay attention to something else right now. I'll let you know if I figure it out.

@zoffixznet

This comment has been minimized.

Copy link
Contributor

commented Feb 7, 2017

Any chance you could provide the current wrapper that you have? This way if you won't have time to finish this, at least another volunteer won't have to start from scratch.

@nanis

This comment has been minimized.

Copy link
Author

commented Feb 7, 2017

Not yet.

@nanis

This comment has been minimized.

Copy link
Author

commented Feb 7, 2017

@timo I am fairly certain I know how to fix this. In principle it is a small change, but it's a fiddly change.

Now, I know it annoys people, but I think through things by writing about them, and I may write a blog post or two before I can formulate an actual pull request.

Rest assured, it will be coming.

@nanis

This comment has been minimized.

Copy link
Author

commented Feb 8, 2017

I've made some progress and explained my process on my blog. I still need to touch the code that sets up the environment, but once I do that, I'll have a pull request. Hope it works on others' machines, too ;-)

I don't this ticket is the right place to continue the discussion, so I am going to close this, and open an issue with MoarVM.

Thank you @timo for pointing me in the right direction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.