Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to show utf-8 debug messages correctly(noconsole mode)/convert to… #3477

Merged
merged 9 commits into from
Aug 25, 2018
Merged

Fix to show utf-8 debug messages correctly(noconsole mode)/convert to… #3477

merged 9 commits into from
Aug 25, 2018

Conversation

Suzumizaki
Copy link
Contributor

… mbcs strings before write to stderr(console mode) on Windows.

Until this patch applied, all non-ASCII messages are failed to show(mojibake).
For no-console(windowed) mode, uses MessageBoxW instead of MessageBoxA.
For console mode, strings encoded with user console encodings.

Note1: waf build system has a bug to detect MSVC with Japanese (or DBCS localized) version of Windows.
Note2: Because of C/C++ specification, we can only use one of printf or wprintf for each process. C/C++ stream can have 8bit mode or 16bit mode which is selected by first (w)printf, but cannot switch after first output. That's the reason I use MBCS for console mode.

@htgoebel htgoebel added the area:bootloader Caused be or effecting the bootloader label Apr 24, 2018
Copy link
Member

@htgoebel htgoebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this pull-request. I added some comments and requests for enhancement.

Please also rework the commits and the commit message following the https://pyinstaller.readthedocs.io/en/latest/development/commit-messages.html. I think Note 2 is worth going into the commit message body. Thanks.

show_message_box(
msg,
L"Fatal error detected",
L"pyi_win32_utils_from_utf8 failed during making fatal error message",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is the same in all cases, thus it should be moved into show_message_box. Also the caller does not know pyi_win32_utils_from_utf8 is used there, thus it must not pass such a message.

Additionally this message is much to technical for being of any use the some user being confronted with it. I suggest using "Fatal error detected! Also an error occurred when formatting the error message" (with "Fatal error detected" being the msg).

Now this long text in gone, this call can be formatted on less lines, .g.

    show_message_box(msg, L"Fatal error detected",
                     MB_ICONEXCLAMATION);

Same for the other cases below.

@@ -180,12 +239,21 @@ pyi_global_printf(const char *fmt, ...)
{
va_list v;

fprintf(stderr, "[%d] ", getpid());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is the same in both cases. Please keep it outside of the #if defined

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Thanks.

fprintf(stderr, "[%d] ", getpid());
/* Sent 'LOADER text' messages to stderr. */
#if defined(_WIN32)
char utf8_formatted[1024];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Where do these 1024 come from? Is it MBTXTLEN? Is this large enough?
  • Inconsistent intention here and below. Please sue spaces, not tabs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will use vsnprintf() twice, malloc() and free(). Indentation Issue also I Agree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not go for malloc and free since these can fail, too. Using the stack is less error prone.

Just reason (in a commend) why 1024 and not 512 or 2048?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I can't explain the logical reason why 1024 I used. No source exists the value can rely on.
Do you have any suggestion about another buffer size or the other techniques?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you choose 1024 then? Is this from some MS example code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Just MBTXTLEN. That's because I said not logical.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what about MBTXTLEN * 2?

pyi_print_encoded_if_possible(utf8_formatted);

vsprintf(utf8_formatted, "%s: %s", funcname, GetWinErrorString(error_code));
pyi_print_encoded_if_possible(utf8_formatted);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this printed twice?

wchar_t wmsg[MBTXTLEN];
wchar_t * result;
result = pyi_win32_utils_from_utf8(wmsg, msg, MBTXTLEN);
if (!result) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use if pyi_win32_utils_from_utf8(…) ... like you did in pyi_print_encoded_if_possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply misunderstading. I agree with you.

@@ -203,12 +271,19 @@ pyi_global_printf(const char *fmt, ...)
*/
void pyi_global_perror(const char *funcname, const char *fmt, ...) {
va_list v;

#ifdef _WIN32
char utf8_formatted[1024];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Where do these 1024 come from? Is it MBTXTLEN? Is this large enough?

/* TODO improve following for windows. */
#if defined(_WIN32)
void pyi_print_encoded_if_possible(const char* utf8_msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about making this function into vsprintf_encoded_if_possible, accepting variable arguments? When looking a the it is alays used like

    va_start(v, fmt);
    vsprintf(utf8_formatted, fmt, v);
    va_end(v);
    pyi_print_encoded_if_possible(utf8_formatted); 

When we could using it like this

    va_start(v, fmt);
    vsprintf_encoded_if_possibel(stderr, fmt, v);
    va_end(v);

we would save defining utf8_formatted all over the places, reduce the lines changed and reduce the line in 'if defined` clauses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good idea. And May I name the vprintf_to_stderr to erase #ifdef _WIN32, #else and #endif from the callers?

I mean like below:

void vprintf_to_stderr(const char *fmt, ...) {
    va_list v;
#if defined(_WIN32)
    int count;
    char *utf8_buffer;
    char *mbcs_buffer;

    va_start(v, fmt);
    count = vsnprintf(NULL, 0, fmt, v);
    va_end(v);
    if (count == -1) {
        vfprintf(stderr, "PyInstaller: Invalid format string \"%s\" ", v);
        return;
    }
    utf8_buffer = (char*)malloc( (count + 1) * sizeof(char) ); 
    va_start(v, fmt);
    vsnprintf(utf8_buffer, count + 1, fmt, v);
    va_end(v);

    /* We can use count here. UTF-8 is more redundant than local encodings. */
    mbcs_buffer = (char*)malloc( (count + 1) * sizeof(char) ); 
    
    if (pyi_win32_utf8_to_mbs(mbcs_buffer, utf8_buffer, count)) {
        fprintf(stderr, mbcs_buffer);
    }
    else {
        fprintf(stderr, utf8_buffer);
    }
    free(mbcs_buffer);
    free(utf8_buffer);
#else
    va_start(v, fmt);
    vfprintf(stderr, fmt, v);
    va_end(v);
#endif /* if defined(_WIN32) */
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And May I name the vprintf_to_stderr

This is a great idea! (Please mind that I did not yet review the code you posted above. Beside thsi I would try to avoid malloc/free, see comment above).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll continue after the issue about {1024/malloc/free} resolved.

);

if (NULL == errorString) {
wchar_t local_buffer[ERROR_STRING_MAX / 3];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why / 3? Where does this come from? Is this large enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you may know, I fixed next patch about this. Sorry!

return "FormatMessage failed.";
}
result2 = pyi_win32_utils_to_utf8(errorString, local_buffer, ERROR_STRING_MAX);
if (!result2) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if pyi_win32_utils_to_utf8() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I agree. Thanks

return "FormatMessage failed.";
}
result2 = pyi_win32_utils_to_utf8(errorString, local_buffer, ERROR_STRING_MAX);
if (!result2) {
return "pyi_win32_utils_to_utf8 failed.";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: An user can not gain any useful information from this. Please think about another message. (The same applies for the (unchanged) "FormatMessage failed" a few lines above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Hmmm. But the code below from here:

    if (0 == ret) {
        FATAL_WINERROR("WideCharToMultiByte", "Failed to encode filename as ANSI.\n");
        return NULL;
    }
    

    You mean just use FATAL_WINERROR ?

  • When pyi_win32_utils_to_utf8 fails, FATAL_WINERROR called before returned from the function and no string returns. I think GetWinErrorString should not return empty string, how about it do you think?

  • I think this is usable information(even not 'useful' though), because the true end users don't know the .exe file is built with PyInstaller. S/he will report to developers this message, and the developers find where the accident occurs. Of course that's very rare case and maybe no way to fix, how do you think besides them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, let's keep the messages as you proposed. I did not seen we have more messages like this.

@htgoebel
Copy link
Member

Thanks for the comments. I'm looking forward to your update :-)

When updating a pull-request, you can simply (force) push the updated branch to github again. This will automatically update the pull-request (which follows the branch, not the commit). So you do not need to close the pull-request and open a new one. This also has the benefit that the discussion history is kept. For detailed instructions please read Updating a Pull-Request in the manual.

@Suzumizaki
Copy link
Contributor Author

Sorry to many try and errors. But now the work is completed, I believe.

@Suzumizaki
Copy link
Contributor Author

Suzumizaki commented May 6, 2018

Sorry again, I found the error at Line 270 of pyi_global.c... fix later.
[Update: 2018-05-07 11:23JST] Fixed this problem as following commits.

Copy link
Member

@htgoebel htgoebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you very much for your updates. I'm sorry I have not answered earlier, I'm busy in my day job.

I added some minor comments to the new code, which overall looks good.

May I ask you to rebase and squash your changes. Also please add a not about what the change for dwLanguageID does.

}
else {
pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN);
MessageBoxW(NULL, wmsg, wcaption, MB_OK | uType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, this code tricked me, thus I suggest changing it into (scratch):

  if (pyi_win32_utils_from_utf8(wmsg, msg, MBTXTLEN)) {
        // converting the caption is expected to pass since  .... ADD EXPLANATION HERE
        pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN);
        MessageBoxW(NULL, wmsg, wcaption, MB_OK | uType);
   }
   else {
        MessageBoxA(NULL, msg, caption, MB_OK | uType);
  }

And I'm curious:

  • Does pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN); never fail?
  • If converting message fails, will MessageBoxA(NULL, msg, caption, …) show some reasonable message? Or will it be garbage? If garbage: is it better to show garbage instead of somethinkg like "Also an error occurred when formatting the error message"? What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, all caption parameters with calling show_message_box are written in hard-coded US-ASCII.

Okay then.

It will be 'patterned' garbage. That would be the hint of what the real message is. I think this garbage is better than the messages which don't have the given string.

You are right, maybe add a comment to the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a short comment about the captions being ASCII, too.

fprintf(stderr, "%s", mbcs_buffer);
}
else {
fprintf(stderr, "%s", utf8_buffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like above: If converting utf8_buffer fails, will fprintf(stderr, "%s", mbcs_buffer) print some reasonable message? Or will it be garbage? Which one is better? What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like above: Okay then.


perror(funcname); // perror() writes to stderr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move back in front of the blank line to reduce the number of changes. Non-functional changes shall be avoided.

/*
* Wrap printing debug messages to console.
*/
void
pyi_global_printf(const char *fmt, ...)
{
pyi_global_printf(const char *fmt, ...) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please restore old formatting to reduce the number of changes. Non-functional changes shall be avoided. Sorry for be picky about this.

const char *msg,
const char *caption,
UINT uType)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the arguments in the same line as the func name, like it is done in the other parts of this file.

@@ -113,7 +131,8 @@ mbothererror(const char *fmt, ...)

msg[MBTXTLEN-1] = '\0';

MessageBoxA(NULL, msg, "Fatal Error!", MB_OK | MB_ICONEXCLAMATION);
show_message_box(msg, "Fatal error detected", MB_ICONEXCLAMATION);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useless blank line.

@Suzumizaki
Copy link
Contributor Author

Does pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN); never fail?

Currently, all caption parameters with calling show_message_box are written in hard-coded US-ASCII. But should I better like below?:

wchar_t wmsg[MBTXTLEN];
wchar_t wcaption[MBTXTLEN];
if (pyi_win32_utils_from_utf8(wmsg, msg, MBTXTLEN)) {
    if (pyi_win32_utils_from_utf8(wcaption, caption, MBTXTLEN)) {
        MessageBoxW(NULL, wmsg, wcaption, MB_OK | uType);
    }
    else {
        MessageBoxW(NULL, wmsg, L"PyInstaller (failed)", MB_OK | uType);
    }
else {
    MessageBoxA(NULL, msg, caption, MB_OK | uType);
}

If converting message fails, will MessageBoxA(NULL, msg, caption, …) show some reasonable message? Or will it be garbage? If garbage: is it better to show garbage instead of somethinkg like "Also an error occurred when formatting the error message"? What do you think?

It will be 'patterned' garbage. That would be the hint of what the real message is. I think this garbage is better than the messages which don't have the given string.

Like above: If converting utf8_buffer fails, will fprintf(stderr, "%s", mbcs_buffer) print some reasonable message? Or will it be garbage? Which one is better? What do you think?

Because I told above, I won't fix this fprintf part.

May I ask you to rebase and squash your changes.

Ok, I'll do after your answer to the my asks at the first and last of this post.

Also please add a not about what the change for dwLanguageID does.

See Language Identifier Constants and Strings on MSDN. We should use SUBLANG_DEFAULT instead of SUBLANG_NEUTRAL. Giving 0 to dwLanguageID means MAKELANGID(LANG_NEUTRAL, SUBLANG_NEUTRAL), I think that's simply wrong. Should I wrote into pyi_win32_utils.c as comment?

@htgoebel
Copy link
Member

htgoebel commented Jul 8, 2018

Also please add a not about what the change for dwLanguageID does.

[...]. Should I wrote into pyi_win32_utils.c as comment?

Yes please.

… mbcs strings before write to stderr(console mode) on Windows.
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of the bug of Microsoft's vswhere.exe, we cannot run waf under Japanese version of Windows. See below:

https://github.com/waf-project/waf/pull/2155
microsoft/vswhere#146
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of the bug of Microsoft's vswhere.exe, we cannot run waf under Japanese version of Windows. See below:

https://github.com/waf-project/waf/pull/2155
microsoft/vswhere#146
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of the bug of Microsoft's vswhere.exe, we cannot run waf under Japanese version of Windows. See below:

https://github.com/waf-project/waf/pull/2155
microsoft/vswhere#146
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of C/C++ specification, we can only use one of printf or wprintf for each process. C/C++ stream can have 8bit mode or 16bit mode which is selected by first (w)printf, but cannot switch after first output. That's the reason I use MBCS for console mode.
Note 3: Because of the bug of Microsoft's vswhere.exe and waf, we cannot run waf under Japanese version of Windows. See below:

https://github.com/waf-project/waf/pull/2155
microsoft/vswhere#146
…ixed ones on Windows.

Note 1: We should run waf and rebuild run/run-d/runw/runw_d.exe before distributing.
Note 2: Because of C/C++ specification, we can only use one of printf or wprintf for each process. C/C++ stream can have 8bit mode or 16bit mode which is selected by first (w)printf, but cannot switch after first output. That's the reason I use MBCS for console mode.
Note 3: Because of the bug of Microsoft's vswhere.exe and waf, we cannot run waf under Japanese version of Windows. See below:

https://github.com/waf-project/waf/pull/2155
microsoft/vswhere#146
@htgoebel
Copy link
Member

Please abstain from merging other branches into the branch this pull-request is based on. This will make it impossible to review. If you really need to check if your code works with current development head, please rebase on develop. But be aware that his will destroy the discussion history, So typically this is done after the code-review passed.

…loader/src/pyi_global.c)

Add comments about MAKELANGID.(bootloader/src/pyi_win32_utils.c)
Revert some python formatting.(bootloader/src/pyi_global.c)
@Suzumizaki
Copy link
Contributor Author

Sorry again, I reverted the problematic one and add new commit.

@htgoebel htgoebel merged commit df91baf into pyinstaller:develop Aug 25, 2018
@Suzumizaki
Copy link
Contributor Author

Thank you!

@htgoebel htgoebel added this to the PyInstaller 3.4 milestone Sep 2, 2018
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area:bootloader Caused be or effecting the bootloader
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants