Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

locale.c: fix format string for __LINE__ output (debugging) #21956

Merged
merged 1 commit into from
Feb 11, 2024

Conversation

mauke
Copy link
Contributor

@mauke mauke commented Feb 8, 2024

On some platforms, building a -DDEBUGGING perl triggers the following compiler warnings:

In file included from locale.c:385:
locale.c: In function ‘S_bool_setlocale_2008_i’:
locale.c:2494:29: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 7 has type ‘int’ [-Wformat=]
                             "bool_setlocale_2008_i: creating new object"    \
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
perl.h:4983:33: note: in definition of macro ‘DEBUG__’
                 DEBUG_PRE_STMTS a; DEBUG_POST_STMTS                     \
                                 ^
locale.c:2493:7: note: in expansion of macro ‘DEBUG_L’
       DEBUG_L(PerlIO_printf(Perl_debug_log,                                 \
       ^~~~~~~
locale.c:2523:17: note: in expansion of macro ‘DEBUG_NEW_OBJECT_FAILED’
                 DEBUG_NEW_OBJECT_FAILED(category_names[index], new_locale,
                 ^~~~~~~~~~~~~~~~~~~~~~~
In file included from locale.c:322:
config.h:4052:18: note: format string is defined here
 #define U32uf  "lu"  /**/

This is because the code tries to format __LINE__ with a varargs function using %"LINE_Tf".

Things are slightly tricky here because in a varargs function, no type context is available, so the format string absolutely has to match the intrinsic type of each argument. The __LINE__ macro expands to a simple (decimal) integer constant. According to C, such a constant has type int if its value fits, otherwise unsigned int if it fits, otherwise long int, etc. None of the *.c files in the perl distribution exceed 32767 lines (the minimum INT_MAX required by C), so even on ancient 16-bit systems, our __LINE__ will always be of type int.

The %"LINE_Tf" format is designed to match a line_t argument, not int. (On some platforms, line_t is defined as unsigned long and incompatible with int for formatting purposes.) Therefore it is an error to use %"LINE_Tf" with __LINE__.

One way to fix this is to convert the argument to match the format string: ... %"LINE_Tf" ...", (line_t)__LINE__.

The other way is to change the format string to match the (int) argument: "... %d ...", __LINE__.

I chose option #2 because it is by far the most common way to output __LINE__ elsewhere in the perl source.

@khwilliamson
Copy link
Contributor

I prefer the opposite approach, to cast to (line_t).

One reason is LINE_Tf is already used in the statements this commit changes. And using a cast is the current convention in this file, which should be a more important precedent than the conventions in other files. I believe these four cases are the only ones in this file that there isn't such a cast.

The other reason is that it is generally preferable to not rely on sizes across platforms. line_t has been defined for 20+ years to be U32. Heaven forbid that we should exceed this number of lines in a file, and have to convert to a long, but I could see someone more likely wanting to define line_t to be PERL_UINT_FAST16_T as a micro optimization. And then everything that used the cast would just work.

I get that LINE_Tf is uglier than %d, and that an argument can be made that the ugliness cost outweighs the benefit of the generality that likely will never be needed. But this file and these two statements already were intended to have the generality, and I argue that it's better to keep it.

On some platforms, building a -DDEBUGGING perl triggers the following
compiler warnings:

    In file included from locale.c:385:
    locale.c: In function ‘S_bool_setlocale_2008_i’:
    locale.c:2494:29: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 7 has type ‘int’ [-Wformat=]
                                 "bool_setlocale_2008_i: creating new object"    \
                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    perl.h:4983:33: note: in definition of macro ‘DEBUG__’
                     DEBUG_PRE_STMTS a; DEBUG_POST_STMTS                     \
                                     ^
    locale.c:2493:7: note: in expansion of macro ‘DEBUG_L’
           DEBUG_L(PerlIO_printf(Perl_debug_log,                                 \
           ^~~~~~~
    locale.c:2523:17: note: in expansion of macro ‘DEBUG_NEW_OBJECT_FAILED’
                     DEBUG_NEW_OBJECT_FAILED(category_names[index], new_locale,
                     ^~~~~~~~~~~~~~~~~~~~~~~
    In file included from locale.c:322:
    config.h:4052:18: note: format string is defined here
     #define U32uf  "lu"  /**/

This is because the code tries to format __LINE__ with a varargs
function using %"LINE_Tf".

Things are slightly tricky here because in a varargs function, no type
context is available, so the format string absolutely has to match the
intrinsic type of each argument. The __LINE__ macro expands to a simple
(decimal) integer constant. According to C, such a constant has type int
if its value fits, otherwise unsigned int if it fits, otherwise long
int, etc. None of the *.c files in the perl distribution exceed 32767
lines (the minimum INT_MAX required by C), so even on ancient 16-bit
systems, our __LINE__ will always be of type int.

The %"LINE_Tf" format is designed to match a line_t argument, not int.
(On some platforms, line_t is defined as unsigned long and incompatible
with int for formatting purposes.) Therefore it is an error to use
%"LINE_Tf" with __LINE__.

One way to fix this is to convert the argument to match the format
string: ... %"LINE_Tf" ...", (line_t)__LINE__.

The other way is to change the format string to match the (int)
argument: "... %d ...", __LINE__.

Option 1 is what is used elsewhere in locale.c, so use line_t
consistently for all line numbers.
@mauke
Copy link
Contributor Author

mauke commented Feb 11, 2024

@khwilliamson I've pushed a commit that standardizes on line_t everywhere in locale.c.

@mauke mauke merged commit 49d95ca into Perl:blead Feb 11, 2024
28 checks passed
@mauke mauke deleted the fix-line-type-mismatch branch February 11, 2024 22:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants