Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NoNewGlobals for HttpHdrCc:ccLookupTable #1750

Closed
wants to merge 14 commits into from

Conversation

kinkie
Copy link
Contributor

@kinkie kinkie commented Mar 19, 2024

Detected by Coverity. CID 1554655: Initialization or destruction
ordering is unspecified (GLOBAL_INIT_ORDER).

Also switched to compile-time checks for table initialization records.

@kinkie
Copy link
Contributor Author

kinkie commented Mar 19, 2024

As an implementation note: the CcAttrs table needs to be accessible,
as there are several lookup operations done from ID to SBuf.

There is an abandoned branch where I tried to do a number of extra things, which might make sense to extract and do their own PR on:

  1. add a const_iterator to LookupTable (done)
  2. make LookupTable bidirectional (I've implemented a scan-based reverse_lookup function, but it would make sense to use a std::vector one instead)
  3. make HttpHdrCc::packInto based on PackableStream

Do you think it's worth investing any time in these efforts?

@kinkie kinkie changed the title NoNewGlobals for HttpHdrCc NoNewGlobals for HttpHdrCc:ccLookupTable Mar 19, 2024
Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's worth investing any time in [the following] efforts?

  1. add a const_iterator to LookupTable (done)

The table is not ordered by directive name or ID. Creating a useful iterator for such a table would not be trivial. Do it only if there are multiple uses for such an iterator.

FWIW, the current attempt to replace trivial CcAttrs[id].name with ccLookupTable().reverse_lookup(static_cast<HttpHdrCcType>(id)) seems like a step in the wrong direction to me.

  1. make LookupTable bidirectional

The current hash table is not "directional" at all, so I am not sure what you mean by bidirectional. What use cases are you targeting?

  1. make HttpHdrCc::packInto based on PackableStream

Converting HttpHdrCc::packInto() to use std::ostream instead of Packable would be a useful step towards similar packInto() upgrades for Http::Message and friends.

src/HttpHdrCc.cc Outdated Show resolved Hide resolved
src/HttpHdrCc.cc Outdated Show resolved Hide resolved
src/HttpHdrCc.cc Outdated Show resolved Hide resolved
@rousskov rousskov added the S-waiting-for-author author action is expected (and usually required) label Mar 19, 2024
@kinkie kinkie requested a review from rousskov March 19, 2024 15:50
@kinkie kinkie added S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box and removed S-waiting-for-author author action is expected (and usually required) labels Mar 19, 2024
@kinkie
Copy link
Contributor Author

kinkie commented Mar 19, 2024

It's hopefully ready for review now

Branch commit 1f283fa broke httpHdrCcStatDumper(): The new code assumed
that the given value is always positive while the caller calls that
function with a -1.0 value. That value cannot be (safely) casted to
HttpHdrCcType because the latter is based on an _unsigned_ type.

Official code did not have this bug because it casted to int instead of
HttpHdrCcType. The new branch code also does not convert out-of-range
integers into HttpHdrCcType values.

This change also places the burden of handling invalid entries (and the
choice of an "invalid" spelling, where applicable) on the caller.
Official code did not use the same spelling and, arguably, different
callers may prefer different spelling (or not to deal with invalid
entries at all because they cannot see them).
rousskov
rousskov previously approved these changes Mar 20, 2024
src/base/EnumIterator.h Show resolved Hide resolved
src/HttpHdrCc.cc Outdated Show resolved Hide resolved
static std::optional<const char *>
ccNameByType(const RawId rawId)
{
// TODO: Store a by-ID index in (and move this functionality into) LookupTable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other LookupTable code can benefit from this PR changes. Moreover, I suspect that many LookupTable users can be significantly improved by addressing this TODO and modernizing that code. However, it is probably best to do that refactoring in a dedicated PR rather than this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.
This was exactly what I meant with my attempt for a reverse_lookup functionality

src/HttpHdrCc.cc Outdated Show resolved Hide resolved
#include <ostream>

// invariant: row[j].id == j
static LookupTable<HttpHdrCcType>::Record CcAttrs[] = {
constexpr LookupTable<HttpHdrCcType>::Record attrsList[] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, this compile-time constant is no longer a "global" in "No New Globals" sense. It cannot be localized inside CcAttrs() AFAICT because the compiler does not like us returning a reference to a local constexpr. Making this constexpr allows us to check the invariant at compile time, which is a significant improvement. Long-term, I suspect that this initialization-only code will be significantly refactored to avoid nil names and CC_ENUM_END hacks. However, if constexpr is not enough to make Coverity or others happy, we can go back to local statics and runtime assertions. I do not insist on trying constexpr here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree to try it. It might become a useful pattern, as you say.
I'm not sure we should strive to make Coverity happy; either we are solving real (if small) problems or we should just tell Coverity to be quiet

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning a reference to a variable which exists only in compiler memory makes no sense. But leaving it a static member like all our other NoNewGlobals functions have should be fine to return and also more compatible with future conversion of the array to storing SBuf

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLDR: @yadij, if you think that the current PR code does not resolve these and other tradeoffs well and should go back to runtime checks, please say so explicitly, and I will adjust the code accordingly. I do not think we should merge this PR unless there is consensus regarding these compile-time checks. If you do not think these compile-time checks are worth their associated risks and limitations, we should not merge.


Returning a reference to a variable which exists only in compiler memory makes no sense.

Why not?! IMHO, it may make sense to return a reference to a variable used by the compiler.

Moreover, constexpr does not imply that the variable exists only in compiler memory: The specifier means that the value of that variable can be computed at compile time. The variable existence duration is governed by other considerations. For example, in the current PR code, attrsList exists at runtime in my tests:

$ nm src/squid | c++filt | grep attrsList
0000000000b241a0 d attrsList
(gdb) p attrsList
$1 = {{name = 0x555555e990a8 "public", id = CC_PUBLIC}, ...}

If we were to add extern ... attrList[] declaration, then attrList would become visible in other translation units as well (while still being constexpr).

leaving it a static member ... should be fine to return

Yes, but, as I tried to explain, doing so will kill compile-time checks because static variables are prohibited inside constexpr functions (until C++23):

HttpHdrCc.cc:34:53: error: ‘attrsList’ declared ‘static’ in ‘constexpr’ function

and also more compatible with future conversion of the array to storing SBuf

I am not sure we would want to convert this particular array, but if we do start storing SBuf in it, then we would not be able to check its invariants at compile time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know whether ASAN correctly detects the limits of a loop scanning across the constexpr when only that virtual-reference is provided (ie before the expression is calculated?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know whether ASAN correctly detects the limits of a loop scanning across the constexpr when only that virtual-reference is provided (ie before the expression is calculated?).

I assume that by "ASAN" you mean "address sanitizer" (more commonly abbreviated as ASan AFAICT).

I do not know much about address sanitizers, but I would expect their limitations to depend on the address sanitizer implementation. I can speculate that if a sanitizer is reusing initial processing stages from the "compiler", then it may not even have to deal with constexpr computations -- it would rely on their computed values received from those earlier processing stages.

I also cannot guess what "virtual-reference" means in this context.

@rousskov rousskov added S-waiting-for-author author action is expected (and usually required) and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Mar 20, 2024
Co-authored-by: Alex Rousskov <rousskov@measurement-factory.com>
static std::optional<const char *>
ccNameByType(const RawId rawId)
{
// TODO: Store a by-ID index in (and move this functionality into) LookupTable.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.
This was exactly what I meant with my attempt for a reverse_lookup functionality

src/HttpHdrCc.cc Outdated Show resolved Hide resolved
src/base/EnumIterator.h Show resolved Hide resolved
src/HttpHdrCc.cc Show resolved Hide resolved
@kinkie
Copy link
Contributor Author

kinkie commented Mar 21, 2024

I don't think there's anything to add here (maybe some unit tests, but that's a separate topic)
Thanks for taking the time to explain so much @rousskov !

@kinkie kinkie added S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box and removed S-waiting-for-author author action is expected (and usually required) labels Mar 21, 2024
@rousskov rousskov requested a review from yadij March 21, 2024 14:57
// invariant: each row has a name except the last one
static_assert(!attrsList[idx].name == (ev == HttpHdrCcType::CC_ENUM_END));
// invariant: row[idx].id == idx
static_assert(attrsList[idx].id == ev);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to share how this static_assertion failure looks like (after injecting an incorrect attrsList entry):

HttpHdrCc.cc: In instantiation of ‘CcAttrs()::<lambda(auto:1)> [with auto:1 = std::integral_constant<HttpHdrCcType, CC_PUBLIC>]’:
HttpHdrCc.cc:62:39: error: static assertion failed
   62 |         static_assert(attrsList[j].id == j);

IMO, this is fairly readable and more informative than a runtime assertion would be (without using a debugger).

I wish I had tested the error message after injecting a bad value at the end of the table!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did. Entries past HttpHdrCcType::CC_ENUM_END are ignored

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant a bad entry before the end marker. I am worried about error message readability due to deep recursion...

We can add more static assertions to check for the presence of "extra" entries after the end marker as well, but, unlike duplicate or out-of-order entries before the end marker, such bugs would be pretty visible in the diff, so I am not particularly worried about them.

Copy link
Contributor Author

@kinkie kinkie Mar 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On MacOS (clang):

/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:237:5: note: in instantiation of function template specialization 'CcAttrs()::(anonymous class)::operator()<std::integral_constant<HttpHdrCcType, CC_ENUM_END>>' requestediff --git a/src/HttpHdrCc.cc b/src/HttpHdrCc.cc
index 2bd9303208..179be7de24 100644
--- a/src/HttpHdrCc.cc
+++ b/src/HttpHdrCc.cc
@@ -44,8 +44,8 @@ constexpr LookupTable<HttpHdrCcType>::Record attrsList[] = {
     {"stale-if-error", HttpHdrCcType::CC_STALE_IF_ERROR},
     {"immutable", HttpHdrCcType::CC_IMMUTABLE},
     {"Other,", HttpHdrCcType::CC_OTHER}, /* ',' will protect from matches */
-    {nullptr, HttpHdrCcType::CC_ENUM_END}
-};
+    {nullptr, HttpHdrCcType::CC_OTHER},
+    {nullptr, HttpHdrCcType::CC_ENUM_END}};

 constexpr const auto &
 CcAttrs() {

results in

/Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:58:9: error: static assertion failed due to requirement 'attrsList[idx].id == ev'
        static_assert(attrsList[idx].id == ev);
        ^             ~~~~~~~~~~~~~~~~~~~~~~~
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:237:5: note: in instantiation of function template specialization 'CcAttrs()::(anonymous class)::operator()<std::integral_constant<HttpHdrCcType, CC_ENUM_END>>' requested here
    f(std::integral_constant<enumType, First>()); // including when First is Last
    ^
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:239:9: note: in instantiation of function template specialization 'ConstexprForEnum<CC_ENUM_END, CC_ENUM_END, (lambda at /Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:76) &>' requested here
        ConstexprForEnum<enumType(static_cast<underlyingType>(First) + 1), Last>(f);
        ^
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:239:9: note: in instantiation of function template specialization 'ConstexprForEnum<CC_OTHER, CC_ENUM_END, (lambda at /Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:76) &>' requested here
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:239:9: note: in instantiation of function template specialization 'ConstexprForEnum<CC_IMMUTABLE, CC_ENUM_END, (lambda at /Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:76) &>' requested here
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:239:9: note: in instantiation of function template specialization 'ConstexprForEnum<CC_STALE_IF_ERROR, CC_ENUM_END, (lambda at /Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:76) &>' requested here
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:239:9: note: (skipping 7 contexts in backtrace; use -ftemplate-backtrace-limit=0 to see all)
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:239:9: note: in instantiation of function template specialization 'ConstexprForEnum<CC_NO_TRANSFORM, CC_ENUM_END, (lambda at /Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:76) &>' requested here
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:239:9: note: in instantiation of function template specialization 'ConstexprForEnum<CC_NO_STORE, CC_ENUM_END, (lambda at /Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:76) &>' requested here
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:239:9: note: in instantiation of function template specialization 'ConstexprForEnum<CC_NO_CACHE, CC_ENUM_END, (lambda at /Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:76) &>' requested here
/Users/kinkie/src/squid/coverity-cid-1554655/src/base/EnumIterator.h:239:9: note: in instantiation of function template specialization 'ConstexprForEnum<CC_PRIVATE, CC_ENUM_END, (lambda at /Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:76) &>' requested here
/Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:5: note: in instantiation of function template specialization 'ConstexprForEnum<CC_PUBLIC, CC_ENUM_END, (lambda at /Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:53:76)>' requested here
    ConstexprForEnum<HttpHdrCcType::CC_PUBLIC, HttpHdrCcType::CC_ENUM_END>([](const auto ev) {
    ^
/Users/kinkie/src/squid/coverity-cid-1554655/src/HttpHdrCc.cc:58:41: note: expression evaluates to '14 == 15'
        static_assert(attrsList[idx].id == ev);
                      ~~~~~~~~~~~~~~~~~~^~~~~
1 error generated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is exactly the test I was thinking about, thank you!

As I feared, it is not trivial to discover the enum value that violated the static assertion. The 14 == 15 error detail is probably enough to (eventually) figure out what is going. IMHO, this is not much worse than running a debugger after a runtime assertion (or even adding more code to report values that would lead to the assertion).

Overall, I still think that this static assertion experiment is worth merging, but I certainly do not insist on merging it.

@squid-anubis squid-anubis added M-failed-other https://github.com/measurement-factory/anubis#pull-request-labels and removed M-failed-other https://github.com/measurement-factory/anubis#pull-request-labels labels Apr 5, 2024
@kinkie kinkie requested a review from rousskov April 5, 2024 10:18
@rousskov rousskov removed their request for review April 5, 2024 16:54
Copy link
Contributor

@yadij yadij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I objections from me, though I have enough doubts about the constexpr not to sign off explicitly.

@yadij yadij requested a review from rousskov April 8, 2024 05:23
@kinkie
Copy link
Contributor Author

kinkie commented Apr 8, 2024

I objections from me, though I have enough doubts about the constexpr not to sign off explicitly.

Fair enough. This is a stepping stone towards improving LookupTable so that callers will be relieved of some of the invariant checks duty.

@kinkie kinkie added M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Apr 8, 2024
@kinkie
Copy link
Contributor Author

kinkie commented Apr 8, 2024

Merging on slow burner

@rousskov rousskov removed their request for review April 8, 2024 12:38
squid-anubis pushed a commit that referenced this pull request Apr 8, 2024
Detected by Coverity. CID 1554655: Initialization or destruction
ordering is unspecified (GLOBAL_INIT_ORDER).

Also switched to compile-time checks for table initialization records.
@squid-anubis squid-anubis added the M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels label Apr 8, 2024
@squid-anubis squid-anubis added M-merged https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels labels Apr 8, 2024
kinkie added a commit to kinkie/squid that referenced this pull request Apr 9, 2024
Detected by Coverity. CID 1554655: Initialization or destruction
ordering is unspecified (GLOBAL_INIT_ORDER).

Also switched to compile-time checks for table initialization records.
kinkie added a commit to kinkie/squid that referenced this pull request Apr 9, 2024
Detected by Coverity. CID 1554655: Initialization or destruction
ordering is unspecified (GLOBAL_INIT_ORDER).

Also switched to compile-time checks for table initialization records.
kinkie added a commit to kinkie/squid that referenced this pull request Apr 14, 2024
Detected by Coverity. CID 1554655: Initialization or destruction
ordering is unspecified (GLOBAL_INIT_ORDER).

Also switched to compile-time checks for table initialization records.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
M-merged https://github.com/measurement-factory/anubis#pull-request-labels
Projects
None yet
4 participants