Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to show utf-8 debug messages correctly(noconsole mode)/convert to… #3477

Merged
merged 9 commits into from Aug 25, 2018

Conversation

Projects
None yet
2 participants
@Suzumizaki
Copy link
Contributor

commented Apr 23, 2018

… mbcs strings before write to stderr(console mode) on Windows.

Until this patch applied, all non-ASCII messages are failed to show(mojibake).
For no-console(windowed) mode, uses MessageBoxW instead of MessageBoxA.
For console mode, strings encoded with user console encodings.

Note1: waf build system has a bug to detect MSVC with Japanese (or DBCS localized) version of Windows.
Note2: Because of C/C++ specification, we can only use one of printf or wprintf for each process. C/C++ stream can have 8bit mode or 16bit mode which is selected by first (w)printf, but cannot switch after first output. That's the reason I use MBCS for console mode.

@htgoebel htgoebel added the bootloader label Apr 24, 2018

@htgoebel
Copy link
Member

left a comment

Thanks for this pull-request. I added some comments and requests for enhancement.

Please also rework the commits and the commit message following the https://pyinstaller.readthedocs.io/en/latest/development/commit-messages.html. I think Note 2 is worth going into the commit message body. Thanks.

show_message_box(
msg,
L"Fatal error detected",
L"pyi_win32_utils_from_utf8 failed during making fatal error message",

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member

This test is the same in all cases, thus it should be moved into show_message_box. Also the caller does not know pyi_win32_utils_from_utf8 is used there, thus it must not pass such a message.

Additionally this message is much to technical for being of any use the some user being confronted with it. I suggest using "Fatal error detected! Also an error occurred when formatting the error message" (with "Fatal error detected" being the msg).

Now this long text in gone, this call can be formatted on less lines, .g.

    show_message_box(msg, L"Fatal error detected",
                     MB_ICONEXCLAMATION);

Same for the other cases below.

@@ -180,12 +239,21 @@ pyi_global_printf(const char *fmt, ...)
{
va_list v;

fprintf(stderr, "[%d] ", getpid());

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member

This line is the same in both cases. Please keep it outside of the #if defined

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 24, 2018

Author Contributor

I agree. Thanks.

fprintf(stderr, "[%d] ", getpid());
/* Sent 'LOADER text' messages to stderr. */
#if defined(_WIN32)
char utf8_formatted[1024];

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member
  • Where do these 1024 come from? Is it MBTXTLEN? Is this large enough?
  • Inconsistent intention here and below. Please sue spaces, not tabs.

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 24, 2018

Author Contributor

OK, I will use vsnprintf() twice, malloc() and free(). Indentation Issue also I Agree.

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 25, 2018

Member

I would not go for malloc and free since these can fail, too. Using the stack is less error prone.

Just reason (in a commend) why 1024 and not 512 or 2048?

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 26, 2018

Author Contributor

Sorry, I can't explain the logical reason why 1024 I used. No source exists the value can rely on.
Do you have any suggestion about another buffer size or the other techniques?

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 26, 2018

Member

How did you choose 1024 then? Is this from some MS example code?

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 26, 2018

Author Contributor

No. Just MBTXTLEN. That's because I said not logical.

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 26, 2018

Member

So what about MBTXTLEN * 2?

pyi_print_encoded_if_possible(utf8_formatted);

vsprintf(utf8_formatted, "%s: %s", funcname, GetWinErrorString(error_code));
pyi_print_encoded_if_possible(utf8_formatted);

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member

Why is this printed twice?

wchar_t wmsg[MBTXTLEN];
wchar_t * result;
result = pyi_win32_utils_from_utf8(wmsg, msg, MBTXTLEN);
if (!result) {

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member

Why not use if pyi_win32_utils_from_utf8(…) ... like you did in pyi_print_encoded_if_possible?

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 24, 2018

Author Contributor

Simply misunderstading. I agree with you.

@@ -203,12 +271,19 @@ pyi_global_printf(const char *fmt, ...)
*/
void pyi_global_perror(const char *funcname, const char *fmt, ...) {
va_list v;

#ifdef _WIN32
char utf8_formatted[1024];

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member
  • Where do these 1024 come from? Is it MBTXTLEN? Is this large enough?
/* TODO improve following for windows. */
#if defined(_WIN32)
void pyi_print_encoded_if_possible(const char* utf8_msg)

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member

What do you think about making this function into vsprintf_encoded_if_possible, accepting variable arguments? When looking a the it is alays used like

    va_start(v, fmt);
    vsprintf(utf8_formatted, fmt, v);
    va_end(v);
    pyi_print_encoded_if_possible(utf8_formatted); 

When we could using it like this

    va_start(v, fmt);
    vsprintf_encoded_if_possibel(stderr, fmt, v);
    va_end(v);

we would save defining utf8_formatted all over the places, reduce the lines changed and reduce the line in 'if defined` clauses.

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 24, 2018

Author Contributor

That's good idea. And May I name the vprintf_to_stderr to erase #ifdef _WIN32, #else and #endif from the callers?

I mean like below:

void vprintf_to_stderr(const char *fmt, ...) {
    va_list v;
#if defined(_WIN32)
    int count;
    char *utf8_buffer;
    char *mbcs_buffer;

    va_start(v, fmt);
    count = vsnprintf(NULL, 0, fmt, v);
    va_end(v);
    if (count == -1) {
        vfprintf(stderr, "PyInstaller: Invalid format string \"%s\" ", v);
        return;
    }
    utf8_buffer = (char*)malloc( (count + 1) * sizeof(char) ); 
    va_start(v, fmt);
    vsnprintf(utf8_buffer, count + 1, fmt, v);
    va_end(v);

    /* We can use count here. UTF-8 is more redundant than local encodings. */
    mbcs_buffer = (char*)malloc( (count + 1) * sizeof(char) ); 
    
    if (pyi_win32_utf8_to_mbs(mbcs_buffer, utf8_buffer, count)) {
        fprintf(stderr, mbcs_buffer);
    }
    else {
        fprintf(stderr, utf8_buffer);
    }
    free(mbcs_buffer);
    free(utf8_buffer);
#else
    va_start(v, fmt);
    vfprintf(stderr, fmt, v);
    va_end(v);
#endif /* if defined(_WIN32) */
}

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 25, 2018

Member

And May I name the vprintf_to_stderr

This is a great idea! (Please mind that I did not yet review the code you posted above. Beside thsi I would try to avoid malloc/free, see comment above).

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 26, 2018

Author Contributor

Ok, I'll continue after the issue about {1024/malloc/free} resolved.

);

if (NULL == errorString) {
wchar_t local_buffer[ERROR_STRING_MAX / 3];

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member

Why / 3? Where does this come from? Is this large enough?

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 24, 2018

Author Contributor

As you may know, I fixed next patch about this. Sorry!

return "FormatMessage failed.";
}
result2 = pyi_win32_utils_to_utf8(errorString, local_buffer, ERROR_STRING_MAX);
if (!result2) {

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member

if pyi_win32_utils_to_utf8() ?

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 24, 2018

Author Contributor

OK, I agree. Thanks

return "FormatMessage failed.";
}
result2 = pyi_win32_utils_to_utf8(errorString, local_buffer, ERROR_STRING_MAX);
if (!result2) {
return "pyi_win32_utils_to_utf8 failed.";

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 24, 2018

Member

Same as above: An user can not gain any useful information from this. Please think about another message. (The same applies for the (unchanged) "FormatMessage failed" a few lines above.

This comment has been minimized.

Copy link
@Suzumizaki

Suzumizaki Apr 24, 2018

Author Contributor
  • Hmmm. But the code below from here:

    if (0 == ret) {
        FATAL_WINERROR("WideCharToMultiByte", "Failed to encode filename as ANSI.\n");
        return NULL;
    }
    

    You mean just use FATAL_WINERROR ?

  • When pyi_win32_utils_to_utf8 fails, FATAL_WINERROR called before returned from the function and no string returns. I think GetWinErrorString should not return empty string, how about it do you think?

  • I think this is usable information(even not 'useful' though), because the true end users don't know the .exe file is built with PyInstaller. S/he will report to developers this message, and the developers find where the accident occurs. Of course that's very rare case and maybe no way to fix, how do you think besides them?

This comment has been minimized.

Copy link
@htgoebel

htgoebel Apr 25, 2018

Member

I agree, let's keep the messages as you proposed. I did not seen we have more messages like this.

@htgoebel

This comment has been minimized.

Copy link
Member

commented Apr 25, 2018

Thanks for the comments. I'm looking forward to your update :-)

When updating a pull-request, you can simply (force) push the updated branch to github again. This will automatically update the pull-request (which follows the branch, not the commit). So you do not need to close the pull-request and open a new one. This also has the benefit that the discussion history is kept. For detailed instructions please read Updating a Pull-Request in the manual.

@Suzumizaki

This comment has been minimized.

Copy link
Contributor Author

commented May 5, 2018

Sorry to many try and errors. But now the work is completed, I believe.

@Suzumizaki

This comment has been minimized.

Copy link
Contributor Author

commented May 6, 2018

Sorry again, I found the error at Line 270 of pyi_global.c... fix later.
[Update: 2018-05-07 11:23JST] Fixed this problem as following commits.

@htgoebel
Copy link
Member

left a comment

Thanks you very much for your updates. I'm sorry I have not answered earlier, I'm busy in my day job.

I added some minor comments to the new code, which overall looks good.

May I ask you to rebase and squash your changes. Also please add a not about what the change for dwLanguageID does.

}
else {
pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN);
MessageBoxW(NULL, wmsg, wcaption, MB_OK | uType);

This comment has been minimized.

Copy link
@htgoebel

htgoebel Jun 28, 2018

Member

Huh, this code tricked me, thus I suggest changing it into (scratch):

  if (pyi_win32_utils_from_utf8(wmsg, msg, MBTXTLEN)) {
        // converting the caption is expected to pass since  .... ADD EXPLANATION HERE
        pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN);
        MessageBoxW(NULL, wmsg, wcaption, MB_OK | uType);
   }
   else {
        MessageBoxA(NULL, msg, caption, MB_OK | uType);
  }

And I'm curious:

  • Does pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN); never fail?
  • If converting message fails, will MessageBoxA(NULL, msg, caption, …) show some reasonable message? Or will it be garbage? If garbage: is it better to show garbage instead of somethinkg like "Also an error occurred when formatting the error message"? What do you think?

This comment has been minimized.

Copy link
@htgoebel

htgoebel Jul 8, 2018

Member

Currently, all caption parameters with calling show_message_box are written in hard-coded US-ASCII.

Okay then.

It will be 'patterned' garbage. That would be the hint of what the real message is. I think this garbage is better than the messages which don't have the given string.

You are right, maybe add a comment to the code.

This comment has been minimized.

Copy link
@htgoebel

htgoebel Jul 8, 2018

Member

Please add a short comment about the captions being ASCII, too.

fprintf(stderr, "%s", mbcs_buffer);
}
else {
fprintf(stderr, "%s", utf8_buffer);

This comment has been minimized.

Copy link
@htgoebel

htgoebel Jun 28, 2018

Member

Like above: If converting utf8_buffer fails, will fprintf(stderr, "%s", mbcs_buffer) print some reasonable message? Or will it be garbage? Which one is better? What do you think?

This comment has been minimized.

Copy link
@htgoebel

htgoebel Jul 8, 2018

Member

Like above: Okay then.


perror(funcname); // perror() writes to stderr

This comment has been minimized.

Copy link
@htgoebel

htgoebel Jun 28, 2018

Member

Please move back in front of the blank line to reduce the number of changes. Non-functional changes shall be avoided.

/*
* Wrap printing debug messages to console.
*/
void
pyi_global_printf(const char *fmt, ...)
{
pyi_global_printf(const char *fmt, ...) {

This comment has been minimized.

Copy link
@htgoebel

htgoebel Jun 28, 2018

Member

Please restore old formatting to reduce the number of changes. Non-functional changes shall be avoided. Sorry for be picky about this.

const char *msg,
const char *caption,
UINT uType)
{

This comment has been minimized.

Copy link
@htgoebel

htgoebel Jun 28, 2018

Member

Please keep the arguments in the same line as the func name, like it is done in the other parts of this file.

@@ -113,7 +131,8 @@ mbothererror(const char *fmt, ...)

msg[MBTXTLEN-1] = '\0';

MessageBoxA(NULL, msg, "Fatal Error!", MB_OK | MB_ICONEXCLAMATION);
show_message_box(msg, "Fatal error detected", MB_ICONEXCLAMATION);

This comment has been minimized.

Copy link
@htgoebel

htgoebel Jun 28, 2018

Member

Useless blank line.

@Suzumizaki

This comment has been minimized.

Copy link
Contributor Author

commented Jun 30, 2018

Does pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN); never fail?

Currently, all caption parameters with calling show_message_box are written in hard-coded US-ASCII. But should I better like below?:

wchar_t wmsg[MBTXTLEN];
wchar_t wcaption[MBTXTLEN];
if (pyi_win32_utils_from_utf8(wmsg, msg, MBTXTLEN)) {
    if (pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN)) {
        MessageBoxW(NULL, wmsg, wcaption, MB_OK | uType);
    }
    else {
        MessageBoxW(NULL, wmsg, L"PyInstaller (failed)", MB_OK | uType);
    }
else {
    MessageBoxA(NULL, msg, caption, MB_OK | uType);
}

If converting message fails, will MessageBoxA(NULL, msg, caption, …) show some reasonable message? Or will it be garbage? If garbage: is it better to show garbage instead of somethinkg like "Also an error occurred when formatting the error message"? What do you think?

It will be 'patterned' garbage. That would be the hint of what the real message is. I think this garbage is better than the messages which don't have the given string.

Like above: If converting utf8_buffer fails, will fprintf(stderr, "%s", mbcs_buffer) print some reasonable message? Or will it be garbage? Which one is better? What do you think?

Because I told above, I won't fix this fprintf part.

May I ask you to rebase and squash your changes.

Ok, I'll do after your answer to the my asks at the first and last of this post.

Also please add a not about what the change for dwLanguageID does.

See Language Identifier Constants and Strings on MSDN. We should use SUBLANG_DEFAULT instead of SUBLANG_NEUTRAL. Giving 0 to dwLanguageID means MAKELANGID(LANG_NEUTRAL, SUBLANG_NEUTRAL), I think that's simply wrong. Should I wrote into pyi_win32_utils.c as comment?

@htgoebel

This comment has been minimized.

Copy link
Member

commented Jul 8, 2018

Also please add a not about what the change for dwLanguageID does.

[...]. Should I wrote into pyi_win32_utils.c as comment?

Yes please.

Suzumizaki added some commits Apr 23, 2018

Fix to show utf-8 debug messages correctly(noconsole mode)/convert to…
… mbcs strings before write to stderr(console mode) on Windows.
Fix: Use MessageBoxW and FormatMessageW functions instead of -A postf…
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of the bug of Microsoft's vswhere.exe, we cannot run waf under Japanese version of Windows. See below:

waf-project/waf#2155
microsoft/vswhere#146
Fix: Use MessageBoxW and FormatMessageW functions instead of -A postf…
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of the bug of Microsoft's vswhere.exe, we cannot run waf under Japanese version of Windows. See below:

waf-project/waf#2155
microsoft/vswhere#146
Fix: Use MessageBoxW and FormatMessageW functions instead of -A postf…
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of the bug of Microsoft's vswhere.exe, we cannot run waf under Japanese version of Windows. See below:

waf-project/waf#2155
microsoft/vswhere#146
Fix: Use MessageBoxW and FormatMessageW functions instead of -A postf…
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of C/C++ specification, we can only use one of printf or wprintf for each process. C/C++ stream can have 8bit mode or 16bit mode which is selected by first (w)printf, but cannot switch after first output. That's the reason I use MBCS for console mode.
Note 3: Because of the bug of Microsoft's vswhere.exe and waf, we cannot run waf under Japanese version of Windows. See below:

waf-project/waf#2155
microsoft/vswhere#146
Fix: Use MessageBoxW and FormatMessageW functions instead of -A postf…
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of C/C++ specification, we can only use one of printf or wprintf for each process. C/C++ stream can have 8bit mode or 16bit mode which is selected by first (w)printf, but cannot switch after first output. That's the reason I use MBCS for console mode.
Note 3: Because of the bug of Microsoft's vswhere.exe and waf, we cannot run waf under Japanese version of Windows. See below:

waf-project/waf#2155
microsoft/vswhere#146
@htgoebel

This comment has been minimized.

Copy link
Member

commented Jul 11, 2018

Please abstain from merging other branches into the branch this pull-request is based on. This will make it impossible to review. If you really need to check if your code works with current development head, please rebase on develop. But be aware that his will destroy the discussion history, So typically this is done after the code-review passed.

Add comments about the 'caption' and garbaged 'msg' when failed.(boot…
…loader/src/pyi_global.c)

Add comments about MAKELANGID.(bootloader/src/pyi_win32_utils.c)
Revert some python formatting.(bootloader/src/pyi_global.c)
@Suzumizaki

This comment has been minimized.

Copy link
Contributor Author

commented Jul 12, 2018

Sorry again, I reverted the problematic one and add new commit.

@htgoebel htgoebel merged commit df91baf into pyinstaller:develop Aug 25, 2018

1 of 2 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@Suzumizaki

This comment has been minimized.

Copy link
Contributor Author

commented Aug 30, 2018

Thank you!

@htgoebel htgoebel added this to the PyInstaller 3.4 milestone Sep 2, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.