Doxygen won't document functions after #define'd numerical literals with ' separator #9824

Open
SagaraBattousai opened this issue Jan 30, 2023 · 3 comments

Comments

@SagaraBattousai

SagaraBattousai commented Jan 30, 2023

Describe the bug
With preprocessing enabled (the exceptions are described below), if a #define of a numeric literal that contains ' digit separators is followed by a documented function, that function will not be documented.

Example:

#define INTEGRAL_MASK          0x3F'FF'FF'FF


/**
 * @brief This function will not be documented due to the above
 * @pre shift >= 0
 * @param shift
 * @return The bits that were shifted out
****************************************************/
int f (int shift);

Mitigating Circumstances:
If ENABLE_PREPROCESSING = NO, this works. If it is set to YES, it does not, even with MACRO_EXPANSION = YES, EXPAND_ONLY_PREDEF = YES and PREDEFINED = .......

There is also a weird case where commenting out the #define helps (when there are other #defines above it with ' separators), as long as the commented-out line looks like a doc comment; a random doc comment in between does not have the same effect. However, this is difficult to reproduce and so hacky that I can't imagine it's very relevant.

Expected behavior
The function should be documented as if the #define did not use the ' separator, since the separator has been legal since C++14. From cppreference.com: "Optional single quotes (') may be inserted between the digits as a separator. They are ignored by the compiler." (see: https://en.cppreference.com/w/cpp/language/integer_literal)

Version
doxygen (via CMAKE) on Windows 10, 64 bit

Additional context
Ordinarily I dislike using ' in integer literals; however, there are cases (such as aligning hexadecimal digits for masks and shifts) where it can be useful.
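
To illustrate the alignment use case, here is a minimal sketch that groups hexadecimal masks byte by byte with ' separators. Only INTEGRAL_MASK comes from the report; the two EXAMPLE_ names are hypothetical and exist purely for illustration.

```cpp
// INTEGRAL_MASK is the macro from the report. The EXAMPLE_ macros are
// hypothetical and only show how the separators help align the digits.
#define INTEGRAL_MASK          0x3F'FF'FF'FF
#define EXAMPLE_FLAG_MASK      0x40'00'00'00
#define EXAMPLE_SIGN_MASK      0x80'00'00'00

// The compiler ignores the separators, so the value is unchanged.
static_assert(INTEGRAL_MASK == 0x3FFFFFFF,
              "separators do not alter the value");

// Together the three masks cover a 32-bit word without overlapping.
static_assert((INTEGRAL_MASK | EXAMPLE_FLAG_MASK | EXAMPLE_SIGN_MASK) == 0xFFFFFFFF,
              "masks cover all 32 bits");
static_assert((INTEGRAL_MASK & EXAMPLE_FLAG_MASK) == 0,
              "masks do not overlap");
```

This compiles as C++14 or later. The behaviour of doxygen on such a file is what the issue is about and is not demonstrated by the sketch.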

@albert-github
Collaborator

  • You wrote "doxygen (via CMAKE) on Windows 10, 64 bit", but which version are you using (i.e. the output of doxygen -v)?
  • What do you mean by PREDEFINED = .......? Is this just shorthand for some predefines (they don't seem to be relevant here)?

With the current master (1.9.7 (aed5276)) I can reproduce the problem as described.

@SagaraBattousai
Author

My apologies, I forgot to add the output of ${DOXYGEN_VERSION} from CMAKE which was: 1.9.3.

By PREDEFINED = ....... I did indeed mean any relevant predefines. In my case they were:

PREDEFINED = __declspec(x)= \
                         __attribute__(x)= \
                         XYZ_EXPORT_SYMBOL \
                         XYZ_LOCAL_SYMBOL

where XYZ_EXPORT_SYMBOL and XYZ_LOCAL_SYMBOL are #defined as __declspec or __attribute__, depending on the compiler etc.

They are irrelevant but I wanted to show all the fixes I tried on the client side to see if there was a simple fix.
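
For context, export macros of this kind are often defined along the following lines. The actual definitions are not shown in this thread, so the compiler checks and attributes below are assumptions; only the two macro names come from the PREDEFINED list above, and helper is a hypothetical function.

```cpp
// Assumed definitions: the real project's macros are not shown in the thread.
#if defined(_MSC_VER)
#  define XYZ_EXPORT_SYMBOL __declspec(dllexport)
#  define XYZ_LOCAL_SYMBOL
#elif defined(__GNUC__)
#  define XYZ_EXPORT_SYMBOL __attribute__((visibility("default")))
#  define XYZ_LOCAL_SYMBOL  __attribute__((visibility("hidden")))
#else
#  define XYZ_EXPORT_SYMBOL
#  define XYZ_LOCAL_SYMBOL
#endif

// Declarations decorated with the macros. The PREDEFINED entries in the
// report aim to hide this decoration from doxygen.
XYZ_EXPORT_SYMBOL int f(int shift);
XYZ_LOCAL_SYMBOL  int helper(int value);  // hypothetical internal function
```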

@albert-github
Collaborator

@doxygen
It looks like the problem in this case is caused by an odd number of single or double quotes (' / "). The relevant rule in pre.l is:

<DefineText>\'                          {
                                          outputChar(yyscanner,' ');
                                          yyextra->defText += *yytext;
                                          yyextra->defLitText+=yytext;
                                          if (!yyextra->insideComment)
                                          {
                                            BEGIN(SkipSingleQuote);
                                          }
                                        }

In a draft 2017 standard (N4659.pdf) I see the "explanation":

An integer literal is a sequence of digits that has no period or exponent part, with optional separating single
quotes that are ignored when determining its value. An integer literal may have a prefix that specifies its base
and a suffix that specifies its type. The lexically first digit of the sequence of digits is the most significant. A
binary integer literal (base two) begins with 0b or 0B and consists of a sequence of binary digits. An octal
integer literal (base eight) begins with the digit 0 and consists of a sequence of octal digits. A decimal
integer literal (base ten) begins with a digit other than 0 and consists of a sequence of decimal digits. A
hexadecimal integer literal (base sixteen) begins with 0x or 0X and consists of a sequence of hexadecimal
digits, which include the decimal digits and the letters a through f and A through F with decimal values ten
through fifteen. [ Example: The number twelve can be written 12, 014, 0XC, or 0b1100. The integer literals
1048576, 1'048'576, 0X100000, 0x10'0000, and 0'004'000'000 all have the same value. —end example ]
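
The example from the quoted passage can be checked at compile time. The sketch below is only an illustration: allEqual, twelve and million are hypothetical names, and the literals are copied from the standard's example.

```cpp
#include <cstddef>

// Hypothetical helper: true when every element equals the first one.
template <typename T, std::size_t N>
constexpr bool allEqual(const T (&values)[N])
{
  for (std::size_t i = 1; i < N; ++i)
  {
    if (values[i] != values[0]) return false;
  }
  return true;
}

// The literals from the example in the quoted draft.
constexpr int twelve[]  = {12, 014, 0XC, 0b1100};
constexpr int million[] = {1048576, 1'048'576, 0X100000, 0x10'0000, 0'004'000'000};

static_assert(allEqual(twelve),  "all spellings of twelve agree");
static_assert(allEqual(million), "separators are ignored when determining the value");
```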

In the draft 20xx (N4901.pdf) I see:

In an integer-literal, the sequence of binary-digits, octal-digits, digits, or hexadecimal-digits is interpreted as
a base N integer as shown in Table 7; the lexically first digit of the sequence of digits is the most
significant.
[Note 1 : The prefix and any optional separating single quotes are ignored when determining the value. —end note]

An integer literal can, of course, occur in many places in a source file, and in (probably) many of those cases an odd number of single quotes can lead to problems.
The question is what the best way to handle this would be (also thinking of single quotes inside a string and of a double quote as a character definition, like '"').

(With ENABLE_PREPROCESSING = NO the problem does not occur, because in scanner.l the characters of the #define are ignored.)
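
One possible direction, shown purely as a sketch and not as doxygen's implementation: treat a single quote as a digit separator when it sits between two alphanumeric characters inside a token that starts with a digit, and only count the remaining quotes as character-literal delimiters. All function names below are hypothetical, the code assumes C++17, and it deliberately ignores string literals and escape sequences such as '\''.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers for illustration; none of these exist in doxygen.
constexpr bool isDigit(char c) { return c >= '0' && c <= '9'; }
constexpr bool isAlnum(char c)
{
  return isDigit(c) || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// True when the single quote at text[pos] looks like a digit separator:
// it sits between two alphanumeric characters and the surrounding token
// starts with a digit (so u8'a' or L'a' do not qualify).
constexpr bool isDigitSeparator(std::string_view text, std::size_t pos)
{
  if (pos == 0 || pos + 1 >= text.size() || text[pos] != '\'') return false;
  if (!isAlnum(text[pos - 1]) || !isAlnum(text[pos + 1])) return false;

  // Walk back to the start of the token, stepping over earlier separators.
  std::size_t start = pos;
  while (start > 0 &&
         (isAlnum(text[start - 1]) || text[start - 1] == '_' || text[start - 1] == '\''))
  {
    --start;
  }
  return isDigit(text[start]);
}

// Counts the single quotes that would still open or close a character literal.
constexpr std::size_t countLiteralQuotes(std::string_view text)
{
  std::size_t count = 0;
  for (std::size_t i = 0; i < text.size(); ++i)
  {
    if (text[i] == '\'' && !isDigitSeparator(text, i)) ++count;
  }
  return count;
}
```

With this heuristic, countLiteralQuotes("0x3F'FF'FF'FF") yields 0, so the define text from the report would no longer leave an unmatched quote, while countLiteralQuotes("'a'") still yields 2.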
