Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply clang-tidy autofixes #15894

Merged
merged 17 commits into from
Jun 12, 2024
Merged

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented May 30, 2024

Description

This changeset is large, but it's not very substantial. It's all the automated fixes produced by clang-tidy using our script. The bulk of the changes are either adding [[nodiscard]] to many functions or changing const ref args to pass by value and then move in cases where the parameter is only used to set a value. There are also some places where clang-tidy preferred either more or less namespacing of objects depending on the current namespace. The goal is to enable clang-tidy in CI, which we made progress towards in #9860 but stalled in #10064. This PR contains the first set of changes that will required for such a check to pass.

I've marked this PR as breaking because some of the functions now marked as [[nodiscard]] are public APIs, so if consumers were ignoring the return values they will now see warnings, and if they are compiling with warnings as errors then the builds will break.

Contributes to #584

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vyasr vyasr added libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function breaking Breaking change labels May 30, 2024
@vyasr vyasr self-assigned this May 30, 2024
@vyasr vyasr requested a review from a team as a code owner May 30, 2024 23:48
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few initial comments -- I got through 55 / 235 files and decided to share my comments before going further, since some apply throughout the PR.

cpp/include/cudf/detail/structs/utilities.hpp Show resolved Hide resolved
cpp/include/cudf/interop.hpp Show resolved Hide resolved
cpp/include/cudf/io/types.hpp Show resolved Hide resolved
cpp/include/cudf/join.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/utilities/span.hpp Show resolved Hide resolved
cpp/src/io/parquet/predicate_pushdown.cpp Outdated Show resolved Hide resolved
cpp/src/io/parquet/predicate_pushdown.cpp Outdated Show resolved Hide resolved
find_schema_child(schema_elem, col_name_info->children[idx].name),
output_col.children,
has_list_parent || col_type == type_id::LIST);
for (const auto& idx : col_name_info->children) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of west-const in this PR... can we autoformat to east const to match the rest of the library? (We don't have this option enabled in clang-format because it's supposedly bug-prone, but I don't think we observed any actual bugs when we last ran a manual formatting pass.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah at minimum I can do a manual pass on this PR and fix anything that goes wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manual pass worked fine. The warning is still present in the clang-format docs, though, so I've left the option out of our configuration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dug up the original reviews for the PR where this feature and that warning were added. https://reviews.llvm.org/D69764#inline-1055542

Basically the breaking cases are things like:

#define INTPTR int *

const INTPTR a;

where a macro would mess up the semantics of const and * by swapping the order.

I'd be okay with accepting that edge case and enabling this setting across the code base.

Copy link
Contributor Author

@vyasr vyasr Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss this more in a follow-up. The rule applies to all qualifiers, not just const, so if we turn it on we'll also either need to accept "east volatile" or we will need to see if there are additional settings we can tweak to avoid that. Personally I would be fine with moving volatile as well, but I'd rather not do that in this PR since I'm sure that will be debated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

east const formatting does not work 100% with any tool. Each tool misses some. Each tool takes different approach to convert to east const. We can run a few tools to minimize it and some manual tweaks. I used westerly tool, and clang-tidy.

@vyasr
Copy link
Contributor Author

vyasr commented May 31, 2024

I'm not sure what the right thing to do is for generated and vendored files. There are two in particular that are a bit problematic, cxxopts.hpp and Schema_generated.h. The problem is that the way clang-tidy works is recursive, i.e. you point it to a source file and it checks all headers included by that file. As a result, it's not sufficient to use something like pre-commit filtering of files beforehand to tell it what files to ignore. The way to specify what files to include is using the -header-filter argument to clang-tidy itself. However, this argument does not support lookahead/lookbehind in regexes, which is the only way I can imagine filtering out a specific set of files right now. The as-yet-unreleased version 19 includes a new feature to exclude headers, though. Our options are to either go ahead and apply changes to these files, or to wait until clang-tidy 19 is released and update to using that and assume that the new option will allow us to do the necessary filtering.

I should note that while we do currently have pre-commit filtering out the cxxopts.hpp file, the Schema_generated.h file has not been filtered, so we are already applying clang-format to it. That being the case, I would be fine with just removing the filters and applying clang-tidy and clang-format uniformly to both files, but I wanted to see what others thought first.

@davidwendt
Copy link
Contributor

I should note that while we do currently have pre-commit filtering out the cxxopts.hpp file, the Schema_generated.h file has not been filtered, so we are already applying clang-format to it. That being the case, I would be fine with just removing the filters and applying clang-tidy and clang-format uniformly to both files, but I wanted to see what others thought first.

I'd rather add the Schema_generated.h to the pre-commit filter if that is possible.
I'm ok with updating the files manually using this PR and we can revisit again once version 19 is out.

I'm more concerned that we have some cpp-tests failing with these changes.

@vyasr vyasr requested a review from a team as a code owner May 31, 2024 18:22
@github-actions github-actions bot added the Java Affects Java cuDF API. label May 31, 2024
@vyasr
Copy link
Contributor Author

vyasr commented May 31, 2024

@davidwendt apologies, I had fixed the failing test but just realized I was pushing to the wrong branch. It was just an issue of a missed NOLINT; clang-tidy doesn't understand all of the extra braces that we sometimes need in tests in order to make things like nested list columns work.

As I mentioned, we can't use a pre-commit filter:

The problem is that the way clang-tidy works is recursive, i.e. you point it to a source file and it checks all headers included by that file. As a result, it's not sufficient to use something like pre-commit filtering of files beforehand to tell it what files to ignore.

It's not really about pre-commit, it's about clang-tidy itself. If you run clang-tidy on any source file, it will also detect errors in any of the headers that file includes.

@davidwendt
Copy link
Contributor

Ok, this part sounded like we have a mistake. I'm proposing we fix this:

I should note that while we do currently have pre-commit filtering out the cxxopts.hpp file, the Schema_generated.h file has not been filtered, so we are already applying clang-format to it.

And this part sounded like we can solve the clang-tidy issue with version 19:

The as-yet-unreleased version 19 includes a new feature to exclude headers, though.

I'm proposing we merge this PR (once it passes) and wait for 19.

@vyasr
Copy link
Contributor Author

vyasr commented May 31, 2024

Got it. Yes, I can fix the exclusion for clang-format for Schema_generated.h

As far as waiting, I'm observing a biannual release cadence in March and September. I would (admittedly selfishly) like to get this work done before then personally, but I understand the desire to not have that behavior, so I'll defer to you on that if you think it's worth avoiding the modifications to the generated file.

@vyasr vyasr requested a review from a team as a code owner May 31, 2024 19:05
@vyasr vyasr requested a review from KyleFromNVIDIA May 31, 2024 19:05
cpp/src/io/parquet/predicate_pushdown.cpp Outdated Show resolved Hide resolved
cpp/src/io/parquet/predicate_pushdown.cpp Outdated Show resolved Hide resolved
cpp/include/cudf/detail/structs/utilities.hpp Show resolved Hide resolved
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have read enough of this that I think it can be merged. I didn't read the changes to cpp/tests but I read everything else.

Aside from the other comments I just left, I would like one more spot-check for things like indices -> indice in the for-loop simplifications. I'm not sure how to check for those. Maybe grep the diff for for (.*:.* or something like that.

@vyasr
Copy link
Contributor Author

vyasr commented Jun 3, 2024

I did the spot-check and didn't find anything. Thanks for being so diligent!

I'll do a follow-up PR where I fix the clang-tidy errors that don't get autofixed. I figured this changeset was large enough as is.

cpp/include/cudf/detail/structs/utilities.hpp Show resolved Hide resolved
chrono_scalar() = delete;
~chrono_scalar() = default;
chrono_scalar() = delete;
~chrono_scalar() override = default;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: This is interesting. Why is override required here?

Copy link
Contributor Author

@vyasr vyasr Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is coming from this rule. In this particular case (a virtual, no-op destructor) the arguments on that page don't really apply. However, it does help in one way: it ensures a compiler error if the parent class's destructor is not declared virtual (which can result in UB when you have inheritance).

cpp/include/cudf/scalar/scalar_factories.hpp Outdated Show resolved Hide resolved
@@ -237,7 +237,7 @@ inline void throw_cuda_error(cudaError_t error, char const* file, unsigned int l
// Calls cudaGetLastError to clear the error status. It is nearly certain that a fatal error
// occurred if it still returns the same error after a cleanup.
cudaGetLastError();
auto const last = cudaFree(0);
auto const last = cudaFree(nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be detected using compiler warnings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC this can be detected with host compilers but not nvcc. I'd have to double-check that, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, yes, this can be enforced with -Wzero-as-null-pointer-constant. However, that leads to warnings from cuda_runtime_api.h. I'm not sure if there is a good way to avoid that.

@PointKernel
Copy link
Member

Question: do we enforce this linting in CI?

@vyasr
Copy link
Contributor Author

vyasr commented Jun 5, 2024

Question: do we enforce this linting in CI?

Not yet. The plan is to get things fixed first, then add a CI check. However, that may have to wait a few months, see #15894 (comment).

Copy link
Contributor

@srinivasyadav18 srinivasyadav18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes looks good to me!

Copy link
Contributor

@ttnghia ttnghia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java approval.

@vyasr
Copy link
Contributor Author

vyasr commented Jun 11, 2024

/merge

cudf::test::fixed_width_column_wrapper<int32_t> col1_1{{5, 4, 3, 5, 8, 5, 6},
{0, 1, 1, 1, 1, 1, 1}};
cudf::test::fixed_width_column_wrapper<int32_t> col1_1{
{5, 4, 3, 5, 8, 5, 6}, {false, true, true, true, true, true, true}};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: my personal preference is 0, 1 style for validity mask, it is easy to read quickly, aligns well with data in above line.

@@ -124,8 +125,8 @@ TEST_F(GatherTestStr, GatherDontCheckOutOfBounds)
rmm::mr::get_current_device_resource());

std::vector<char const*> h_expected;
for (auto itr = h_map.begin(); itr != h_map.end(); ++itr) {
h_expected.push_back(h_strings[*itr]);
for (int itr : h_map) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not auto here?

Comment on lines +484 to +502
{true,
true,
true,
true,
true,
true,
true,
true,
true,
true,
true,
true,
true,
true,
true,
true,
false,
false,
false}};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same comment about validity mask. readability is better, if the each data aligns with its validity too.

@@ -139,13 +139,13 @@ inline const MetadataVersion (&EnumValuesMetadataVersion())[5]
return values;
}

inline const char* const* EnumNamesMetadataVersion()
inline char const* const* EnumNamesMetadataVersion()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file has this comment on the top.
// automatically generated by the FlatBuffers compiler, do not modify
if it is possible to enforce east const with FlatBuffers compiler, please add that as comment in top of the file too.

@rapids-bot rapids-bot bot merged commit 2b10299 into rapidsai:branch-24.08 Jun 12, 2024
74 checks passed
wence- added a commit to wence-/cudf that referenced this pull request Jun 14, 2024
The clang-tidy changes in rapidsai#15894 introduce a new exclude regex list to
the pre-commit clang-format hook. However, it was a single character
too long, ending with a |. Consequently, the exclude regex matched the
empty string, and hence excluded every C++ file.

Fix this, and apply formatting changes to the files that were modified
in the interim and were not clang-format compatible.
rapids-bot bot pushed a commit that referenced this pull request Jun 14, 2024
The clang-tidy changes in #15894 introduce a new exclude regex list to the pre-commit clang-format hook. However, it was a single character too long, ending with a |. Consequently, the exclude regex matched the empty string, and hence excluded every C++ file.

Fix this, and apply formatting changes to the files that were modified in the interim and were not clang-format compatible.

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - David Wendt (https://github.com/davidwendt)
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Bradley Dice (https://github.com/bdice)

URL: #16030
@KyleFromNVIDIA
Copy link
Contributor

#15894 (comment)

which is the only way I can imagine filtering out a specific set of files right now.

CMake handles this by putting vendored files in their own directory and then adding a .clang-tidy file to that directory. For example: https://gitlab.kitware.com/cmake/cmake/-/blob/72607d3402b91c3a9f4579c86175e4ca1ce6aa4e/Utilities/.clang-tidy

@vyasr vyasr deleted the chore/clang_tidy_fixes branch August 27, 2024 19:09
@vyasr
Copy link
Contributor Author

vyasr commented Aug 27, 2024

That could definitely work too, although we'd potentially need a separate such directory for tests and cudf source. At this point I'm more inclined to wait for clang 19 since it's around the corner, but if for some reason the tooling that provides isn't sufficient I'd be happy to see the reorg. Alternatively, if you're motivated to get this done sooner feel free to try your approach. I agree that it ought to work and it should be pretty easy.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change improvement Improvement / enhancement to an existing function Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants