Automate C++ include file grouping and ordering using clang-format #4185

Closed
Tracked by #20
harrism opened this issue Feb 21, 2024 · 0 comments · Fixed by #4205
harrism commented Feb 21, 2024

Currently the C++ include files in the RAPIDS codebase are not consistently ordered and grouped.
Inconsistent ordering makes it harder to maintain the codebase, to understand the dependencies,
and to automate changes.

Using clang-format to automate the ordering and grouping of include files will not only add
consistent style, but it will ease automation of changes. For example, we are undertaking a
refactoring of RMM that will replace rmm::mr::device_memory_resource* with
rmm::device_async_resource_ref everywhere in RAPIDS (not just cuDF). This requires adding an
include to MANY files across multiple RAPIDS repos. Getting the location of this include correct
everywhere is very difficult without automatic grouping of headers.
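
As a rough illustration of the kind of automation this enables, the sketch below appends a new include after the last existing `#include` line of a source file and leaves final placement to a later clang-format run. The function name `add_include` and the header `<rmm/resource_ref.hpp>` are assumptions made for the example, not something specified in this issue, and the "insert after the last include" rule is a deliberate simplification.

```python
import re

# Matches any preprocessor include line, e.g. '#include <vector>'.
INCLUDE_LINE = re.compile(r'^\s*#\s*include\b')


def add_include(source, header):
    """Return `source` with '#include <header>' added after the last include.

    `header` must carry its own delimiters, e.g. '<rmm/resource_ref.hpp>'.
    The new line is NOT placed in its final position; the assumption is that
    clang-format (with include regrouping enabled) moves it afterwards.
    """
    new_line = f"#include {header}"
    lines = source.splitlines()
    # Do nothing if the include is already present.
    if any(line.strip() == new_line for line in lines):
        return source
    # Index of the last include line, or -1 if the file has none
    # (in which case the include is simply inserted at the top).
    last = max(
        (i for i, line in enumerate(lines) if INCLUDE_LINE.match(line)),
        default=-1,
    )
    lines.insert(last + 1, new_line)
    trailing_newline = "\n" if source.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline


if __name__ == "__main__":
    src = "#pragma once\n\n#include <cudf/types.hpp>\n#include <vector>\n\nvoid f();\n"
    # Hypothetical header name used only for illustration.
    print(add_include(src, "<rmm/resource_ref.hpp>"))
```

Without automatic grouping, a script like this would have to work out the correct block and position for the new include in every file by itself.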

I propose to use clang-format's IncludeCategories settings to automate the ordering and
grouping of include files in the C++ codebase consistently.

Proof-of-concept PRs have been created for RMM and cuDF.

The ordering used for cuDF is given by the following settings.

IncludeBlocks: Regroup
IncludeCategories:
  - Regex:           '^"' # quoted includes
    Priority:        1
  - Regex:           '^<(benchmarks|tests)/' # benchmark and test includes
    Priority:        2
  - Regex:           '^<cudf_test/' # cuDF test includes
    Priority:        3
  - Regex:           '^<cudf/' # cuDF includes
    Priority:        4
  - Regex:           '^<(nvtext|cudf_kafka)' # other libcudf includes
    Priority:        5
  - Regex:           '^<(cugraph|cuml|cuspatial|raft|kvikio)' # Other RAPIDS includes
    Priority:        6
  - Regex:           '^<rmm/' # RMM includes
    Priority:        7
  - Regex:           '^<(thrust|cub|cuda)/' # CCCL includes
    Priority:        8
  - Regex:           '^<(cooperative_groups|cuco|cuda.h|cuda_runtime|device_types|math_constants|nvtx3)' # CUDA includes
    Priority:        8
  - Regex:           '^<.*\..*' # other system includes (e.g. with a '.')
    Priority:        9
  - Regex:           '^<[^.]+' # STL includes (no '.')
    Priority:        10
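
To make the effect of these settings easier to follow, here is a minimal Python sketch that mimics how the categories above assign a priority to an include: the regexes are tried in order and the first match wins. The helper names `priority` and `regroup` are made up for this illustration, Python's `re` module stands in for clang-format's regex engine, and the sorting within a group is a plain string sort, so this only approximates what clang-format actually does.

```python
import re
import sys

# (regex, priority) pairs transcribed from the IncludeCategories above, in order.
INCLUDE_CATEGORIES = [
    (r'^"', 1),
    (r'^<(benchmarks|tests)/', 2),
    (r'^<cudf_test/', 3),
    (r'^<cudf/', 4),
    (r'^<(nvtext|cudf_kafka)', 5),
    (r'^<(cugraph|cuml|cuspatial|raft|kvikio)', 6),
    (r'^<rmm/', 7),
    (r'^<(thrust|cub|cuda)/', 8),
    (r'^<(cooperative_groups|cuco|cuda.h|cuda_runtime|device_types|math_constants|nvtx3)', 8),
    (r'^<.*\..*', 9),
    (r'^<[^.]+', 10),
]

# Captures the include target with its delimiters, e.g. '<vector>' or '"a.hpp"'.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*(["<][^">]+[">])')


def priority(include_line):
    """Return the priority of the first category whose regex matches."""
    m = INCLUDE_RE.match(include_line)
    if m is None:
        raise ValueError(f"not an include line: {include_line!r}")
    target = m.group(1)
    for pattern, prio in INCLUDE_CATEGORIES:
        if re.search(pattern, target):
            return prio
    # Simplification: sort anything unmatched last.
    return sys.maxsize


def regroup(include_lines):
    """Sort includes by (priority, text) and separate groups with blank lines."""
    keyed = sorted((priority(line), line.strip()) for line in include_lines)
    out, prev = [], None
    for prio, line in keyed:
        if prev is not None and prio != prev:
            out.append("")  # blank line between priority groups
        out.append(line)
        prev = prio
    return out


if __name__ == "__main__":
    # "detail.hpp" is a hypothetical local header used only for illustration.
    sample = [
        '#include <vector>',
        '#include <rmm/device_buffer.hpp>',
        '#include "detail.hpp"',
        '#include <cudf/types.hpp>',
    ]
    print("\n".join(regroup(sample)))
```

Because matching is first-match-wins, the order of the entries matters: for example, `<cudf_test/...>` is listed before `<cudf/...>`, and the two catch-all patterns for system and STL headers come last.
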
@harrism harrism added the improvement Improvement / enhancement to an existing function label Feb 21, 2024
rapids-bot bot pushed a commit that referenced this issue Feb 29, 2024
…4205)

This uses the `IncludeCategories` settings in `.clang-format` to automate include ordering and grouping and to make include ordering more consistent with the rest of RAPIDS. For discussion, see rapidsai/cudf#15063. This PR uses a set of header grouping categories similar to those used in that PR, adapted for cuGraph.

One purpose of this is to make it easier to automate injection of a header change with an upcoming RMM refactoring (and in the future).

Note that this PR also updates all of cuGraph to use quotes ("") when including local *source* headers. That is, anything included from `cugraph/src/*` should be included with quotes rather than angle brackets. This makes it much easier to order includes from "nearest to farthest" and matches the approach now taken in other RAPIDS repos. Without switching to quotes, keeping these includes at the top requires special-casing each subdirectory of `src` in the clang-format group regular expressions.

Includes from the `include` directory stay as angle brackets, because they always use `#include <cugraph/...>`.

cuGraph tests include a LOT of internal source headers. I think we should generally avoid this, but it depends on the testing philosophy. If one believes that tests should only cover the external interfaces of the library, then there is no need to include internal headers. But if we want to test internal functionality, then the tests need to be more "internal" to the library.

The header reordering in this PR also uncovered some places where headers were not included where they are used, which I have fixed.

Closes #4185

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Seunghwa Kang (https://github.com/seunghwak)
  - Naim (https://github.com/naimnv)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #4205