-
Notifications
You must be signed in to change notification settings - Fork 305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clang support #1156
clang support #1156
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! A few follow up questions:
By adding the integration tests, I assume they all pass (x86_64 and ppc64le)? And are the integration tests just for testing this branch (ie, not for merging)?
Do you think we should set a minimum kernel release that we will build with clang?
I had added a comment on renaming the --skip-gcc-check
but I see you covered that later. In similar fashion, would it be worth renaming kpatch-gcc
to kpatch-cc
as well? A small detail, but I didn't know how far you wanted to go abstracting the compiler name.
# $TARGETS used as list, no quotes. | ||
# shellcheck disable=SC2086 | ||
CROSS_COMPILE="$TOOLSDIR/kpatch-gcc " make "-j$CPUS" $TARGETS 2>&1 | logger || die | ||
make "${MAKEVARS[@]}" "-j$CPUS" $TARGETS 2>&1 | logger || die |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the use of the array rather than remembering to += a leading or trailing space to a plain string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also spent almost an hour to figure out how to properly quote these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep CROSS_COMPILE?, which would be the reason why we do a partial match in kpatch-cc script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, CROSS_COMPILE with clang means much more than with gcc
Just x86_64. RHEL/CentOS don't ship clang, so I used Fedora 33 for testing and there is no ppc64le Fedora.
I plan to add upstream kernel testing (both gcc and clang, although again x86 only) for all future prs/merges after we get clang support in.
No, I would assume that if kernel's config contains CONFIG_CC_IS_CLANG that that particular kernel version supports building with clang. Similarly, minimal clang version should be handled by compiler checks, i.e. the version used to build the kernel should be good enough to build kpatch.
Good idea, I'll rename it. |
Superb!. I did not get a chance to review/try out the series. I run one of my VM's with rawhide, ppc64le Fedora is available at https://alt.fedoraproject.org/alt/. |
Oh it is, neat. I'll try running on ppc64le as well and will get back here with the results. |
I think on RHEL7 you have to install software collections and then the devtoolset. On RHEL8 it should just be llvm-toolset. (Or are you talking about newer versions of clang?)
Ok. I didn't know if there were some versions of clang that could build minimally configured kernels. For example, maybe they didn't support ftrace or other livepatch requirements until a later version. I suppose kpatch-build's CONFIG checks would probably catch anything like that. |
As @kamalesh-babulal pointed out ppc64le Fedora is available after all. Since we are not likely to support clang kernel builds on RHEL7/8 I'd rather focus on a newer and fully supported versions of clang provided by fedora. |
FWIW, upstream kernel only supports clang 10+. |
So there are (implied) minimum supported versions after all :) How about we add a "Supported Compilers" section to the README.md and make note of this for clang? |
In linux-next there's a new |
Well I still don't see why should we. kpatch-build itself does not have a set minimum supported clang version (at least for now). Unless the user specifies '--skip-compiler-check' he won't be able to use a compiler version unsupported by the kernel. And using that option implies that the user knows what he is doing. |
@kamalesh-babulal @joe-lawrence clang on ppc64le is missing -mprofile-kernel option that gcc has. And it looks like this is currently required to build a kernel with livepatch support. |
Thinking more on this, perhaps this is only a matter of semantics: this PR adds clang support to kpatch-build, but not necessarily kpatch at large. The latter would require us blessing a distro (as done in README.md) and that would imply regression and integration tests running cleanly for that kernel + compiler + distro combination. My only reservation was with an upstream hat, ie, "these Red Hat guys say kpatch supports X, but have never even tested X on the distro listed on their page". So until the supported distro list is updated, consider my reservations moot. |
The changes in this PR look pretty reasonable to me. I can approve if you want to handle ppc64le support in the future? |
I was able to build the kernel with clang-11 on ppc64le, previous attempts a while a go failed to build kernels. Looking at the function prologue built by clang, it generates the calling sequence like one built by gcc without
We need clang to support a similar prologue sequence, where the |
Wasn't planning on doing it. Clang support is supposed to be experimental and will likely stay so for some time. I thought not mentioning it in any docs would hint on that, but maybe we need to state this explicitly somewhere, i.e. mention it during kpatch build when CONFIG_CC_IS_CLANG is detected.
Thank you for clarifying that. If no compiler changes are needed I'll try to update |
@sm00th Sorry for not being clear here. What I meant is, we need the change to the compiler and once the clang generates the expected sequence only change required in kernel in to teach the |
My fault, I read that wrong. As agreed on IRC let's hold off ppc64le support and integration tests for now. I've opened #1157 to track this issue. |
kpatch-build/kpatch-cc
Outdated
@@ -61,7 +61,8 @@ if [[ "$TOOLCHAINCMD" =~ "gcc" ]] ; then | |||
fi | |||
shift | |||
done | |||
elif [[ "$TOOLCHAINCMD" = "ld" ]] ; then | |||
elif [[ "$TOOLCHAINCMD" = "ld" || "$TOOLCHAINCMD" = "lld" || \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ever actually named "lld"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least on fedora ld.lld is a symlink to lld, so I added it to be sure we catch it. But you are right, kernel build never calls it as 'lld'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you try to invoke it directly you get the following error:
lld is a generic driver.
Invoke ld.lld (Unix), ld64.lld (macOS), lld-link (Windows), wasm-ld (WebAssembly) instead
So I guess we can safely remove lld from the list of possibilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for kernel builds, we use make LD=ld.lld
.
kpatch-build/kpatch-build
Outdated
MAKEVARS+=("LD=$TOOLSDIR/kpatch-cc ld") | ||
fi | ||
else | ||
MAKEVARS+=("CROSS_COMPILE=$TOOLSDIR/kpatch-cc ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the CROSS_COMPILE hack doesn't work with clang, maybe we should just do the CC= and LD= trick for both compilers for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only issue I found w/ CROSS_COMPILE for clang is that $kdir/Makefile tries to get the CLANG_FLAGS, --target, from the CROSS_COMPILE. CC and LD could have other names rather than gcc and clang (e.g., x86-linux-gnu-gcc). which makes CROSS_COMPILE provide better flexibility. Let's keep the CROSS_COMPILE here. As we know, kaptch-cc expects regex matching for tools as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that using CROSS_COMPILE would be more robust, however it sets --target
, --prefix
and --gcc-toolchain
. And it also mangles them so here's how clang is called when using CROSS_COMPILE:
clang -Wp,-MMD,scripts/mod/.devicetable-offsets.s.d -nostdinc -isystem /usr/lib64/clang/10.0.1/include
-I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi
-I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h
-include ./include/linux/compiler_types.h -D__KERNEL__ -Qunused-arguments -Wall -Wundef
-Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE
-Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security
-std=gnu89
--target=kpatch-gcc --prefix=/usr/libexec/kpatch/ /usr/bin/kpatch-gcc --gcc-toolchain=/usr/libexec/kpatch /usr
-no-integrated-as -Werror=unknown-warning-option -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -m64
-mno-red-zone -mcmodel=kernel -Wno-sign-compare -fno-asynchronous-unwind-tables -O2
-Wframe-larger-than=1024 -fno-stack-protector -Wno-format-invalid-specifier -Wno-gnu -mno-global-merge
-pg -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -fno-strict-overflow -fno-stack-check -Werror=date-time
-Wno-initializer-overrides -Wno-format -Wno-sign-compare -Wno-format-zero-length
-Wno-tautological-constant-out-of-range-compare -DKBUILD_MODFILE='"scripts/mod/devicetable-offsets"'
-DKBUILD_BASENAME='"devicetable_offsets"' -DKBUILD_MODNAME='"devicetable_offsets"' -fverbose-asm
-S -o scripts/mod/devicetable-offsets.s scripts/mod/devicetable-offsets.c
The interesing part is: --target=kpatch-gcc --prefix=/usr/libexec/kpatch/ /usr/bin/kpatch-gcc --gcc-toolchain=/usr/libexec/kpatch /usr
The build ultimately fails with clang-10: error: no such file or directory: '/usr/bin/kpatch-gcc'
So I don't think we can keep using CROSS_COMPILE without changes to how kbuild treats it in clang's case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. We do have an issue w/ the Makefiles for kbuild w/ clang even though the fix in the Makefile is simple. The question is "can we (do we want to) update the Makefile not to parse CROSS_COMPILE?" Actually, CROSS_COMPILE is a prefix to CC, LD, ... So, those should be atomic rather than be parsed.
One concern on not using CROSS_COMPILE is that CC and LD needs to be explicitly defined in kpatch-build. This requires an assumption where CC and LD needs to be on $PATH. In addition, if someone wants to use something like, x86_64-linux-gnu-gcc, explicit definition for CC and LD, could be an another issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our usage of CROSS_COMPILE="kpatch-cc " is a hack, so it didn't surprise me too much when it broke clang. Anyway I'd be glad to get rid of it. Also, eventually we may want to support the use of CROSS_COMPILE for real cross compiling.
If we allow the user to optionally specify CC and LD in the environment and/or kpatch-build command-line, does that resolve your concerns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly, to remove the hack, the only way seems to optionally specify CC and LD.
Pushed a new version addressing @jpoimboe's concerns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably also have a way for the user to compile kpatch itself with clang.
kpatch-build/create-diff-object.c
Outdated
@@ -1340,6 +1340,9 @@ static void kpatch_correlate_static_local_variables(struct kpatch_elf *base, | |||
target_sec = parent->sec->rela; | |||
} | |||
|
|||
if (is_special_static(target_sec->base->secsym)) | |||
continue; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still confused by this, can you give a more specific example of why it's needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With macro-callback.patch I am hitting diff_fatal on line 1347, because target_sec->twin is null. The section is '__dyndbg' (thats why it has no twin), the symbol is 'L.str'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.L.str
isn't actually a static local variable (static local means function scope). kpatch_is_normal_static_local() is getting confused by the "." in the variable name.
Also, looking at a clang binary, static locals are named differently compared to GCC: they have the function name prepended to them, like:
13: 0000000000000007 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once.13
14: 0000000000000009 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once.17
15: 0000000000000008 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once.15
16: 000000000000000a 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once.19
17: 000000000000000b 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once.21
18: 0000000000000005 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once.9
19: 0000000000000004 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once.7
20: 0000000000000006 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once.11
21: 0000000000000003 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once
22: 000000000000000c 1 OBJECT LOCAL DEFAULT 40 unwind_next_frame.__print_once.23
This explains the need for the other _UNIQUE_ID change: those were also static locals.
Also notice that in clang, we no longer can rely on static locals having a numbered suffix. In fact they usually don't have a numbered suffix, unless there's more than one of them (with the same name) in a function.
However, we can rely on them having a "." in the name, thanks to the function name being prepended.
So looking at the code I think we need some tweaks:
-
(not clang-specific, just something I found as shown above)
__print_once
should be added to the special static list. It's similar to__warned
in functionality. -
is_special_static() needs to look for both GCC and Clang variants, e.g.
__print_once.
(for GCC) and.__print_once
(for Clang). The prefix searching logic can use strstr() instead. -
kpatch_is_normal_static_local() should consider a ".L" symbol to be not a static local
-
kpatch_mangled_strcmp() needs to consider
unwind_next_frame.__print_once
andunwind_next_frame.__print_once.1
as equal, even though the former doesn't have a period at the end. -
it would probably be a good idea to add a unit test or two in this area
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also while you're changing that code, I think kpatch_is_normal_static_local() and is_special_static() should really return bool :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this explaination. I've covered all of the mentioned changes in my last push. I am having trouble with unit tests though. Any hints on how I can get create-diff-object fail without a proper clang-style special static detection in is_special_static()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make a patch which removes one of the special statics. Before your patch it thinks they're just normal static locals, so it tries to correlate them and errors out when one of them goes missing. For example, looking at the vmlinux.o symbol table for ideas:
25232: 0000000000000000 1246 FUNC LOCAL DEFAULT 18984 init_srcu_struct_fields
25233: 0000000000000000 0 OBJECT LOCAL DEFAULT 19034 init_srcu_struct_fields.__key
25234: 0000000000000000 0 OBJECT LOCAL DEFAULT 19035 init_srcu_struct_fields.__key.7
init_srcu_struct_fields() has two static locals named __key
(from the mutex_init macro). So remove one of those mutex_init() calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including __print_once
in special static list broke gcc-static-local-var-2.patch
on rhel8.1 and 8.2 on ppc64le. There are 2 __print_once
symbols that are placed in .data..read_mostly
section which is not in the list of data sections allowed for inclusion. I thought maybe this can be treated the same way as .data.once but it doesn't seem to be the case as it contains other symbols which are not selected for inclusion so create-diff-object fails in kpatch_create_intermediate_sections()
->need_dynrela()
trying to work with symbols that were already torn-down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of checking the variable name for __print_once
, maybe just check if the symbol lives in .data.once
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented by checking sym->sec->name in is_special_static(): https://github.com/dynup/kpatch/pull/1156/files#diff-b11891a551552359c664b4475a7ed9f84b1d569d4c649a775c9ba23e53f4bbc8R347
kpatch-build/create-diff-object.c
Outdated
@@ -1340,6 +1340,9 @@ static void kpatch_correlate_static_local_variables(struct kpatch_elf *base, | |||
target_sec = parent->sec->rela; | |||
} | |||
|
|||
if (is_special_static(target_sec->base->secsym)) | |||
continue; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make a patch which removes one of the special statics. Before your patch it thinks they're just normal static locals, so it tries to correlate them and errors out when one of them goes missing. For example, looking at the vmlinux.o symbol table for ideas:
25232: 0000000000000000 1246 FUNC LOCAL DEFAULT 18984 init_srcu_struct_fields
25233: 0000000000000000 0 OBJECT LOCAL DEFAULT 19034 init_srcu_struct_fields.__key
25234: 0000000000000000 0 OBJECT LOCAL DEFAULT 19035 init_srcu_struct_fields.__key.7
init_srcu_struct_fields() has two static locals named __key
(from the mutex_init macro). So remove one of those mutex_init() calls.
I will also later add a commit that will include dynup/kpatch-unit-test-objs#28 after that is merged. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Add support for clang-built kernels. This is completely automatic, we check if the kernel was built with clang by looking for CONFIG_CC_IS_CLANG in config. This has almost no effect on non-clang built kernels with one exception: we now do compliler checks _after_ we download kernel sources which is a waste of resources in case when compilers don't match. Signed-off-by: Artem Savkov <asavkov@redhat.com>
Clang adds .L.str* symbols to .rodata.str sections which are used for __FILE__ references. These are discarded during linking so add them to maybe_discarded_sym(). Signed-off-by: Artem Savkov <asavkov@redhat.com>
clang does not always use __UNIQUE_ID as prefix and can generate symbols with names like this: trace_nfsd_exp_get_by_name.__UNIQUE_ID___addressable___SCK__tp_func_nfsd_exp_get_by_name645 Signed-off-by: Artem Savkov <asavkov@redhat.com>
Add patches rebased on top of upstream 5.10.11 tarball. Integration tests for these can be ran as this: make PATCH_DIR="linux-5.10.11" KPATCHBUILD_OPTS="-s /path/to/src/linux-5.10.11" integration-slow Signed-off-by: Artem Savkov <asavkov@redhat.com>
gcc-plugin-devel needs to be installed on ppc64le fedora. Signed-off-by: Artem Savkov <asavkov@redhat.com>
New version contains better regexes in kpatch-cc |
gcc-generated static variables always have a numbered suffix, while clang-generated static variables are always prepended with a function name. Change is_special_static() so that it detects both cases. Signed-off-by: Artem Savkov <asavkov@redhat.com>
Make kpatch_is_normal_static_local() treat all symbols prefixed by '.L' as not static locals. Signed-off-by: Artem Savkov <asavkov@redhat.com>
Make kpatch_mangled_strcmp treat two strings as the same in case when one has a digit tail and the other one doesn't. Signed-off-by: Artem Savkov <asavkov@redhat.com>
Bump submodule pointer to include clang unit-tests. Signed-off-by: Artem Savkov <asavkov@redhat.com>
Support for building kpatches for kernels built with clang.