Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[c/c++] Formatting tool for c/c++ Includes repo-wide #12441

Closed
electronjoe opened this issue Apr 12, 2022 · 13 comments
Closed

[c/c++] Formatting tool for c/c++ Includes repo-wide #12441

electronjoe opened this issue Apr 12, 2022 · 13 comments
Labels
wontfix This will not be worked on

Comments

@electronjoe
Copy link
Member

electronjoe commented Apr 12, 2022

What and the Why

The purpose of this tooling is to automate the checking and formatting of #include directives across our c/c++ codebase. Existing tooling appears insufficient due to our mixture of C/C++ and frequent extern "C" annotations (e.g. clang-format must be told not to format includes in today's codebase).

Why do this? Primarily - today there is no standard applied across the code-base, and even blocks of includes are not sorted (in part because sorting them sometimes breaks unfortunate inter-header-file-sequence assumptions). This has severe implications for anybody moving code around our repository (as one must either manually touch 100+ files to fix up the order of new #include "new/path/to/header.h" or leave the now-modified-path out of sorted order.

The goal is to author a tool that:

  1. Enables automated fix-up of the #includes in the repo
  2. Can be used by CI to enforce this formatting specification

Specification

Top level Ordering

Where possible we adhere to the Google Style for include ordering. The only modification is to enable the use of extern "C" blocks.

The includes formatter will res-sort includes such that they adhere to the following.

  1. If present, this c/cpp's interface e.g. lte/gateway/.../foo.h.
  2. A blank line
  3. C system headers (more precisely: headers in angle brackets with the .h extension), e.g., <unistd.h>, <stdlib.h>.
  4. A blank line
  5. C++ standard library headers (without file extension), e.g., , .
  6. A blank line
  7. If existing, extern "C" guarded includes
  8. A blank line
  9. Other libraries' .h files.
  10. A blank line
  11. Your project's .h files. Note that this includes dynamically generated files (e.g. asn1c or protoc generated).

Sub-block level ordering

Where the blocks of #include statements are sorted alphabetically within 3, 5, 7, 9, 11 above.

Per-Include Formatting

  1. System headers will be of the form #include <header.h>
  2. In-magma headers will be full path of the form #include "lte/gateway/path/to/header.h". This includes generated headers, e.g. protobuf and grpc.
  3. Third-party headers will be relative to their source root e.g. #include "json.hpp"

Example

For example in made-up file lte/gateway/c/core/common/log.cpp:

#include "lte/gateway/c/core/common/log.h"  // Header for this c/cpp resource

#include <sys/types.h>  // c system includes
#include <unistd.h>

#include <string>  // c++ system includes
#include <vector>

extern "C" {  // C headers requiring special c-style linkage (if any)
#include "lte/gateway/c/core/oai/common/one_c_lib.h"
#include "lte/gateway/c/core/oai/common/some_c_lib.h"
}

#include "json.hpp"  // Third party libraries

#include "lte/gateway/c/common/special_types.h"  // cpp includes from Magma repo
#include "lte/gateway/c/core/oai/tasks/ngap/custom_flags.h"

Corner Cases

There are some challenges that must be handled by any automated formatting tool.

  1. There exist (unfortunate) #ifdef guarded #include statements which must be preserved by the tool (ideally these would be removed from teh codebase).
  2. Protobuf includes - some are system protobuf includes (e.g. Any message type), others are in-repo but dynamically generated.
  3. ASN1 compiler generated headers
@nstng
Copy link
Contributor

nstng commented Apr 12, 2022

Big fan of standardized import formatting. Couple thoghts/questions:

  • From java projects I know that import sorting can be hard to achieve because different IDEs have different opinions about the correct order. Do you suggest that IDE sorting is turned off and only CI or a script takes action?
  • What about pre compiler statements with includes? Should they be excluded from the order or split up?

@ajahl
Copy link
Contributor

ajahl commented Apr 12, 2022

Looks good to me. I think it is a good idea to set the header file (1.) corresponding to the current cpp file in the first portion, if applicable. It ensures that the header file is self-contained. Currently, a couple of cpp source files in the repo don't follow this concept.

@themarwhal
Copy link
Member

Looks great! Will this be compatible with IWYU? As in, would running IWYU undo some of the formatting or vice versa? But since we don't really rely on IWYU in this project, it's probably not a blocker for this.

@ssanadhya
Copy link
Collaborator

Thanks for creating this issue @electronjoe ! This will be really useful in maintaining code quality without burdening the code-owners. I have two questions:

  1. Before we deploy any such tool, should we attempt to break the inter-header-file-sequence assumptions leading to the unsorted include blocks in the code base today? We will have to live with a mixed C/C++ codebase for the foreseeable future, but if there are some easy clean-ups for include ordering, fixing those will make it easier to build the above tool.
  2. The cpp-lint job posts comments on PRs suggesting re-ordering of include statements. Will this tool be an enhancement to that same job?

@electronjoe
Copy link
Member Author

  • From java projects I know that import sorting can be hard to achieve because different IDEs have different opinions about the correct order. Do you suggest that IDE sorting is turned off and only CI or a script takes action?

This is an excellent point @nstng and I'm already concerned about this. I considered adding what "alphabetical" means - Golang's sort of these imports is for example putting capitals as topologically lesser than lowercase (e.g. I think Zebrah.h would be before abbot.h). I would estimate right now that turning off IDE sorting might be necessary (falling back to this tool to do any sorting if a human is confused).

What about pre compiler statements with includes? Should they be excluded from the order or split up?

Do you mean include statements that are ifdef guarded / conditionally imported? Good question... I do think we may need to leave them be (possibly sort within if multiple are included in the ifdef block?). Maybe we need to revise the above specification to handle these / give them their own "region" in the top level ordering?

@electronjoe
Copy link
Member Author

I think it is a good idea to set the header file (1.) corresponding to the current cpp file in the first portion, if applicable. It ensures that the header file is self-contained. Currently, a couple of cpp source files in the repo don't follow this concept.

@ajahl I'm not quite following this - can you clarify? Are you saying that a couple cpp source files do not include their header file interface anywhere? I could believe this.

@electronjoe
Copy link
Member Author

Looks great! Will this be compatible with IWYU? As in, would running IWYU undo some of the formatting or vice versa? But since we don't really rely on IWYU in this project, it's probably not a blocker for this.

@themarwhal I'm fearful that our codebase is not going to be IWYU compatible (or at least will require custom munging of IWYU outputs) due at least to our mixed c/c++ codebase (I do not believe IWYU automation supports C header includes in cpp with extern "C" annotations).

@electronjoe
Copy link
Member Author

  1. Before we deploy any such tool, should we attempt to break the inter-header-file-sequence assumptions leading to the unsorted include blocks in the code base today? We will have to live with a mixed C/C++ codebase for the foreseeable future, but if there are some easy clean-ups for include ordering, fixing those will make it easier to build the above tool.

@ssanadhya I'm not sure how to find these, aside from randomization of the header includes and looking for failures (which is how these have come up in the past - e.g. sorting header includes). So my plan was:

  1. To author the tool that sorts and fixes headers
  2. Run the tool locally and then fix any uncovered bad-practice dependencies on ordering of headers
  3. Commit those fixes that prevent the tool from success
  4. Commit the formatting changes suggested by the tool
  5. Commit the tool and turn it up in CI to enforce it's specification

How does that sound?

@electronjoe
Copy link
Member Author

2. The cpp-lint job posts comments on PRs suggesting re-ordering of include statements. Will this tool be an enhancement to that same job?

@ssanadhya I think the cpp-lint advice will likely be insufficient due to our mixed c/cpp codebase, but I'll go explore that now. In which case I would disable cpp-lint header findings in that linting tool, at the time of replacement with this CI tool discussed in this GH Issue.

@ssanadhya
Copy link
Collaborator

  1. Before we deploy any such tool, should we attempt to break the inter-header-file-sequence assumptions leading to the unsorted include blocks in the code base today? We will have to live with a mixed C/C++ codebase for the foreseeable future, but if there are some easy clean-ups for include ordering, fixing those will make it easier to build the above tool.

@ssanadhya I'm not sure how to find these, aside from randomization of the header includes and looking for failures (which is how these have come up in the past - e.g. sorting header includes). So my plan was:

  1. To author the tool that sorts and fixes headers
  2. Run the tool locally and then fix any uncovered bad-practice dependencies on ordering of headers
  3. Commit those fixes that prevent the tool from success
  4. Commit the formatting changes suggested by the tool
  5. Commit the tool and turn it up in CI to enforce it's specification

How does that sound?

@electronjoe , that sounds good! Thanks for the clarification!

@ajahl
Copy link
Contributor

ajahl commented Apr 18, 2022

I think it is a good idea to set the header file (1.) corresponding to the current cpp file in the first portion, if applicable. It ensures that the header file is self-contained. Currently, a couple of cpp source files in the repo don't follow this concept.

@ajahl I'm not quite following this - can you clarify? Are you saying that a couple cpp source files do not include their header file interface anywhere? I could believe this.

In some cpp source files, their header file interfaces are not at the top of the #include list but in between.

@stale
Copy link

stale bot commented Jul 19, 2022

This issue or pull request has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs in the next 14 days.

@stale stale bot added the wontfix This will not be worked on label Jul 19, 2022
@stale
Copy link

stale bot commented Aug 2, 2022

This issue or pull request has exceeded its lifecylce and was automatically closed. Should you wish to continue working on this, please reopen it.

@stale stale bot closed this as completed Aug 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

5 participants