Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc][clang-tidy] Add llvm-header-guard to get consistant naming and prevent file copy/paste issues. #66477

Merged
merged 3 commits into from
Sep 21, 2023

Conversation

gchatelet
Copy link
Contributor

No description provided.

@gchatelet
Copy link
Contributor Author

I'll fix the bad headers as well if you agree with this patch.

@sivachandra
Copy link
Collaborator

This SGTM. We could have done this earlier but there were a few exceptions that we had to deal with so we punted on this, example the include directory. But, what you are proposing is of a much lower scope (restricted to the src directory) so go ahead.

@gchatelet
Copy link
Contributor Author

This SGTM. We could have done this earlier but there were a few exceptions that we had to deal with so we punted on this, example the include directory. But, what you are proposing is of a much lower scope (restricted to the src directory) so go ahead.

Thx for the review @sivachandra . Unfortunately some of the files under src do pull headers in include and so the clang-tidy tries to fix them...
I've had a look at the clang-tidy itself and it already supports customizations for subprojects so I think the correct way forward is to add a customization for llvm-libc as well.

Two questions:

  • What are the rules for formatting headers in include? (I couldn't find the documentation)
  • What should we do for directories starting with two underscores? Currently llvm-header-guard suggests the following edits but I think it doesn't really add any value.
>>>
#ifndef LLVM_LIBC_SRC_SUPPORT_CPP_TYPE_TRAITS_IS_NULL_POINTER_H
---
#ifndef LLVM_LIBC_SRC___SUPPORT_CPP_TYPE_TRAITS_IS_NULL_POINTER_H
<<<

@gchatelet gchatelet marked this pull request as draft September 18, 2023 07:36
@gchatelet
Copy link
Contributor Author

What should we do for directories starting with two underscores?

It seems like libcxx is fine with multiple underscores.

@sivachandra
Copy link
Collaborator

  • What are the rules for formatting headers in include? (I couldn't find the documentation)

We are not uniform, but I think we should follow this:

__LLVM_LIBC_<FILE_NAME_IN_UPPER_SNAKE_CASE>_H__

For header files in nested directories:

__LLVM_LIBC_<RELATIVE_PATH_TO_DIR_IN_UPPER_SNAKE_CASE>_<FILE_NAME_IN_UPPER_SNAKE_CASE>_H__

  • What should we do for directories starting with two underscores? Currently llvm-header-guard suggests the following edits but I think it doesn't really add any value.

Just putting the file name as is, including the double-underscores, seems fine to me. The header guards don't really come in the way of almost anything.

@gchatelet
Copy link
Contributor Author

So llvm-header-guard should not normally flag headers coming from isystem but there were a lot of false positives coming from clang and llvm libc so I took a look at what's going on.

Looking at the compile_commands.json file that is generated by CMake and which drives the clang-tidy it seems that all the headers are appended with the -I option. I believe this is because the headers are added manually via a bunch of

set(include_dirs ${LIBC_SOURCE_DIR} ${LIBC_INCLUDE_DIR})
target_include_directories(${gpu_target_name} PRIVATE ${include_dirs})

instead of using the proper libc-headers target which defines the isystem includes.

Example command line from compile_commands.json shows that all includes are done via -I and none via -isystem

/home/gchatelet/clang_head_install/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_18_0_0_git -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_LIBCPP_ENABLE_HARDENED_MODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/gchatelet/build/libc_x86/projects/libc/src/fenv -I/home/gchatelet/git/llvm-project/libc/src/fenv -I/home/gchatelet/build/libc_x86/include -I/home/gchatelet/git/llvm-project/llvm/include -I/home/gchatelet/git/llvm-project/libc -I/home/gchatelet/build/libc_x86/projects/libc/include  -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -march=znver2 -O2 -fpie -ffreestanding -fno-builtin -fno-exceptions -fno-lax-vector-conversions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -Wall -Wextra -Wconversion -Wno-sign-conversion -Wimplicit-fallthrough -Wwrite-strings -Wextra-semi -Wnewline-eof -Wnonportable-system-include-path -Wstrict-prototypes -Wthread-safety -Wglobal-constructors -UNDEBUG -std=c++17 -o projects/libc/src/fenv/CMakeFiles/libc.src.fenv.feclearexcept.__internal__.dir/feclearexcept.cpp.o -c /home/gchatelet/git/llvm-project/libc/src/fenv/feclearexcept.cpp

I may not fully grasp how the header inclusion mechanism works but I believe that -for the purpose of llvm-header-guard - some of the headers should be declared isystem.

Also looking more closely at the above example command line, it seems to me that there are too many includes

  • -I/home/gchatelet/build/libc_x86/projects/libc/src/fenv
  • -I/home/gchatelet/git/llvm-project/libc/src/fenv
  • -I/home/gchatelet/build/libc_x86/include
  • -I/home/gchatelet/git/llvm-project/llvm/include
  • -I/home/gchatelet/git/llvm-project/libc
  • -I/home/gchatelet/build/libc_x86/projects/libc/include

@sivachandra do we expect that many includes?

On the other hand, the Bazel generated command line seems much more reasonable to me

/usr/lib/llvm-14/bin/clang -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics -fno-omit-frame-pointer '-std=c++0x' -MD -MF bazel-out/k8-fastbuild/bin/external/llvm-project/libc/_objs/feclearexcept/feclearexcept.pic.d '-frandom-seed=bazel-out/k8-fastbuild/bin/external/llvm-project/libc/_objs/feclearexcept/feclearexcept.pic.o' -fPIC '-DLIBC_NAMESPACE=__llvm_libc_18_0_0_git' -DLIBC_COPT_PUBLIC_PACKAGING '-DLLVM_LIBC_FUNCTION_ATTR=__attribute__((visibility("default")))' '-DBAZEL_CURRENT_REPOSITORY="llvm-project"' -iquote external/llvm-project -iquote bazel-out/k8-fastbuild/bin/external/llvm-project -isystem external/llvm-project/libc -isystem bazel-out/k8-fastbuild/bin/external/llvm-project/libc -O1 -Wall -Wno-deprecated '-std=c++17' -Wno-range-loop-analysis -O3 -fno-builtin -fno-lax-vector-conversions '-fvisibility=hidden' -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/llvm-project/libc/src/fenv/feclearexcept.cpp -o bazel-out/k8-fastbuild/bin/external/llvm-project/libc/_objs/feclearexcept/feclearexcept.pic.o

@sivachandra
Copy link
Collaborator

I may not fully grasp how the header inclusion mechanism works but I believe that -for the purpose of llvm-header-guard - some of the headers should be declared isystem.

Your conclusion is correct! We should specify libc/include and/or <build-dir>/projects/libc/include via -isystem.

Also looking more closely at the above example command line, it seems to me that there are too many includes

  • -I/home/gchatelet/build/libc_x86/projects/libc/src/fenv
  • -I/home/gchatelet/git/llvm-project/libc/src/fenv
  • -I/home/gchatelet/build/libc_x86/include
  • -I/home/gchatelet/git/llvm-project/llvm/include
  • -I/home/gchatelet/git/llvm-project/libc
  • -I/home/gchatelet/build/libc_x86/projects/libc/include

@sivachandra do we expect that many includes?

A lot of it is not required. We should only require:

  • <build-dir>/projects/libc/include
  • llvm-project/libc/

Anything else seems like unnecessary at this point, but there could some land-mines hiding somewhere. I feel we can take them out with some cleanup.

@gchatelet
Copy link
Contributor Author

gchatelet commented Sep 19, 2023

Alright so it took me a while but some of these includes come from the main llvm CMakeLists llvm-project/llvm/CMakeLists.txt#L1115.

  • /home/gchatelet/build/libc_x86/include
  • /home/gchatelet/git/llvm-project/llvm/include

I don't quite know how to deal with this. We could remove them globally when entering llvm-project/libc/CMakeLists.txt ? 🤷

EDIT: I'm working on a fix, I'll report when I have something that works.

gchatelet added a commit that referenced this pull request Sep 19, 2023
We want to activate `llvm-header-guard` (#66477) but the current CMake
configuration includes paths that should be `isystem`. This PR restricts
the number of `-I` passed to the clang command line and correctly marks
the llvm libc include path as `isystem`.
gchatelet added a commit to gchatelet/llvm-project that referenced this pull request Sep 20, 2023
We want to activate `llvm-header-guard` (llvm#66477) but the current CMake
configuration includes paths that should be `isystem`. This PR restricts
the number of `-I` passed to the clang command line and correctly marks
the llvm libc include path as `isystem`.

This is a reland of llvm#66783 a35a3b7 fixing the benchmark breakage.
@gchatelet
Copy link
Contributor Author

gchatelet commented Sep 20, 2023

We are not uniform, but I think we should follow this:

__LLVM_LIBC_<FILE_NAME_IN_UPPER_SNAKE_CASE>_H__

llvm-header-guard derives from header-guard and forbids header guards starting with underscores (original bug).

std::string HeaderGuardCheck::sanitizeHeaderGuard(StringRef Guard) {
  // Only reserved identifiers are allowed to start with an '_'.
  return Guard.drop_while([](char C) { return C == '_'; }).str();
}

If we drop this requirement, llvm-header-guard can already be used as is and doesn't need to be adapted to llvm libc.

@@ -6,8 +6,8 @@
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_LIBC_SRC_SUPPORT_CPP_BYTE_H
#define LLVM_LIBC_SRC_SUPPORT_CPP_BYTE_H
#ifndef LLVM_LIBC_SRC___SUPPORT_CPP_CSTDDEF_H
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a real error for instance

@@ -6,8 +6,8 @@
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_LIBC_SRC_STDLIB_ABS_UTILS_H
#define LLVM_LIBC_SRC_STDLIB_ABS_UTILS_H
#ifndef LLVM_LIBC_SRC___SUPPORT_INTEGER_OPERATIONS_H
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a real error as well.

@@ -6,6 +6,9 @@
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_LIBC_SRC___SUPPORT_THREADS_CALLONCE_H
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one was missing a header

@@ -6,6 +6,9 @@
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_LIBC_SRC_STDLIB_BSEARCH_H
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing guard as well

#ifndef LLVM_LIBC_SRC_STDLIB_DIV_H
#define LLVM_LIBC_SRC_STDLIB_DIV_H

#include <stdlib.h>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guard not at the correct location, same for the following ones.

@@ -6,13 +6,13 @@
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_LIBC_SRC_STRING_STRCASECMP_H
#define LLVM_LIBC_SRC_STRING_STRCASECMP_H
#ifndef LLVM_LIBC_SRC_STRING_STRCASESTR_H
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header is a copy/paste error from libc/src/string/strncasecmp.h

@gchatelet gchatelet marked this pull request as ready for review September 20, 2023 12:38
@sivachandra
Copy link
Collaborator

We are not uniform, but I think we should follow this:
__LLVM_LIBC_<FILE_NAME_IN_UPPER_SNAKE_CASE>_H__

llvm-header-guard derives from header-guard and forbids header guards starting with underscores (original bug).

std::string HeaderGuardCheck::sanitizeHeaderGuard(StringRef Guard) {
  // Only reserved identifiers are allowed to start with an '_'.
  return Guard.drop_while([](char C) { return C == '_'; }).str();
}

If we drop this requirement, llvm-header-guard can already be used as is and doesn't need to be adapted to llvm libc.

Clang's freestanding headers use a guard with __CLANG prefix so libc is not being special in that sense. The __ prefix is required so that we do not pollute user namespace.

What is the goal here? Is to use clang-tidy to catch header-guard errors in include/`${LIBC_INCLUDE_DIR}?

@gchatelet
Copy link
Contributor Author

gchatelet commented Sep 20, 2023

What is the goal here? Is to use clang-tidy to catch header-guard errors in include/`${LIBC_INCLUDE_DIR}?

The goal of this particular patch is to enable clang-tidy to make sure sources under libc/src have properly formatted header guards. Let me know if you believe they should start with __ if so I'll create a new clang-tidy to enforce this rule.

EDIT: Question, the __ prefix is mandatory for headers in both include/* and src/* because the libc can be compiled as part of a wider binary? Or only for include/*?

@sivachandra
Copy link
Collaborator

The goal of this particular patch is to enable clang-tidy to make sure sources under libc/src have properly formatted header guards. Let me know if you believe they should start with __ if so I'll create a new clang-tidy to enforce this rule.

If include and/or ${LIBC_INCLUDE_DIR} are beyond the scope of this change, then we can discuss that topic separately.

EDIT: Question, the __ prefix is mandatory for headers in both include/* and src/* because the libc can be compiled as part of a wider binary? Or only for include/*?

This is only for the include directory (which means, for ${LIBC_INCLUDE_DIR} directory). Even if libc is compiled as part of the wider binary, the user source code for that binary will not include libc-internal headers and hence we do not have to worry about user code pollution.

@gchatelet gchatelet merged commit 270547f into llvm:main Sep 21, 2023
2 checks passed
@gchatelet gchatelet deleted the add_llvm-header-guard_clang_tidy branch September 21, 2023 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants