Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows Unicode runtime: rename _T macro, do not use it for character literals #2075

Merged
merged 4 commits into from Apr 14, 2019

Conversation

@nojb
Copy link
Contributor

commented Sep 28, 2018

See MPR#7729.

The issue is as follows: in order to support Unicode under Windows, we define a macro _T in misc.h to be used with string literals so that they have the right type under both Windows (WCHAR *) and Unix (char *). The problems are:

  • Names starting with underscores are reserved and should not be used by user code; this can trigger a compiler warning.
  • The Windows headers define a similarly named macro; when compiling C code that includes the OCaml headers this can trigger a redefinition warning; worse: if _UNICODE is not defined, both macros have different meanings.

Thus, this PR:

  • Renames the _T macro simply to T. This solves both issues above. Suggestions welcome if someone prefers a different name.
  • Removes the use of _T for character literals. Character literals in C have type char and using the macro converts them into values of type WCHAR. However, this conversion is a no-op and in fact the compiler will silently make this conversion without using the macro. Thus, it is simpler to not to use the macro at all in this case.
@moosotc

This comment has been minimized.

Copy link

commented Sep 28, 2018

In the name of pedantry let me quote the mantis report that quotes the standard:
'''
7.1.3 of ISO/IEC 9899:201x in part reads:

All identifiers that begin with an underscore and either an uppercase letter or another
underscore are always reserved for any use.
'''

@nojb

This comment has been minimized.

Copy link
Contributor Author

commented Nov 6, 2018

Could someone review this PR? Maybe @dra27?

@damiendoligez damiendoligez added this to the 4.08 milestone Nov 19, 2018
@XVilka

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

Is this going to be a part of 4.08 as well?

@gasche

This comment has been minimized.

Copy link
Member

commented Mar 29, 2019

@nojb: My understanding is that @dra27 is working on dune/opam stuff for now, so maybe we should try to do without our Windows guru.

I can confirm that the only changes in this PR touch the _T macro.

Not knowing anything about wchar, I don't know about the justification for removing the macro on character literals (Does the implicit conversion always happen quietly or do some compiler emit warnings in that case that will break the build in annoying ways later?)

@nojb nojb force-pushed the nojb:rename_T_macro branch from a986e78 to d98a2c1 Mar 29, 2019
@nojb

This comment has been minimized.

Copy link
Contributor Author

commented Mar 29, 2019

Not knowing anything about wchar, I don't know about the justification for removing the macro on character literals (Does the implicit conversion always happen quietly or do some compiler emit warnings in that case that will break the build in annoying ways later?)

The rationale for removing the macro in this case is that it is easy to forget and the compiler will not complain, leading to a false sense of safety. For example, we had forgotten to use it in

if (*opt++ == ',') break;

This fact that char -> wchar_t is the identity is specified by C99 (probably earlier as well): https://port70.net/~nsz/c/c99/n1256.html#6.3.1.3p1. Whether a (7-bit) ASCII character literal is represented the same way as a char or wchar_t is probably implementation-defined, but on Windows, where wchar_t represents UTF-16 code units, that is always the case.

@dra27

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

@xavierleroy

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

Please don't rush this into 4.08, it is not a serious bug and it could break 3rd-party libraries that use _T in C stub code. Please remove the "consider for release" label and the 4.08 milestone.

@nojb nojb removed this from the 4.08 milestone Mar 29, 2019
Copy link
Contributor

left a comment

I agree with the removal of _T from the character literals (sorry - I think I may have been responsible for adding quite a few of them!).

I think that either the new name for _T should either be CAML_OS_TEXT (or something like) or the definition of T should be protected by #ifdef CAML_INTERNALS. I think the latter is better, as the macro is only used by runtime/ - if users want it, they can include tchar.h themselves. There has also been __T since the dawn of time, which I believe is compatible (but switching to that wouldn't solve the problem of misc.h and tchar.h defining it).

There is a risk of a compatibility break with any of these approaches - @stedolan recently had a trick for grepping opam-repository sources, I think?

Arguably, including <tchar.h> in something also using OCaml headers is an error if you haven't already defined UNICODE and _UNICODE (in fact, I'm not totally sure why -DUNICODE -D_UNICODE is only in the internal C compiler flags, but I haven't checked the discussions from when we did it).

@dra27

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

Ah, @xavierleroy - snap!

@dra27

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

A possible compatibility story on the Windows side would be for us to always include <tchar.h> (but still use our own macro name internally - i.e. T or CAML_OS_TEXT or whatever) … but that doesn't solve the non-Windows compatibility story.

@nojb

This comment has been minimized.

Copy link
Contributor Author

commented Mar 29, 2019

Made the macro internal as suggested.

@xavierleroy

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

Ah, @xavierleroy - snap!

My English is not good enough to know whether this is an expression of mild disappointment or of strong irritation :-)

At any rate, I agree we could use a more descriptive macro name than T, especially now that there are fewer uses of this macro (no longer on character literals).

@dra27

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

Ah, @xavierleroy - snap!

My apologies - it's definitely not irritation, it refers to the card game "snap" and as an expression means that we said the same thing at roughly the same time (see https://en.wikipedia.org/wiki/Snap_(card_game)), in this case because I didn't see that you'd commented before I hit submit on my review!

@dra27

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

@xavierleroy - @nojb has made the definition T internal to OCaml. Is that OK, or would you also prefer it to be more descriptive? Forgetting to use it is guarded by CI - using "" instead of L"" is an error on the Windows builds.

@xavierleroy

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

I can certainly live with the name T if the macro is not exposed.

I was thinking that for 64-bit integer literals our macro is called INT64_LITERAL, but maybe you'll find WSTRING_LITERAL or OS_TEXT_LITERAL too verbose.

Now I'll go learn playing snap :-) Thanks for the pointer.

@nojb

This comment has been minimized.

Copy link
Contributor Author

commented Apr 10, 2019

@dra27 Do you care to approve this PR if you think it is good to go?

@dra27 dra27 merged commit 8afe2db into ocaml:trunk Apr 14, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@dra27

This comment has been minimized.

Copy link
Contributor

commented Apr 14, 2019

Done! If when 4.09 is being tested we find _T was being used, we may want to mark the issue as breaking and advertise this code, which I think is the compatible way to do it:

#ifndef _T
#ifdef _WIN32
#include <tchar.h>
#else
#define _T(x) x
#endif
#endif
@nojb

This comment has been minimized.

Copy link
Contributor Author

commented Apr 15, 2019

Thanks!

@nojb nojb deleted the nojb:rename_T_macro branch Apr 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.