Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clang builds don't enable additional config options #136

Closed
broonie opened this issue Aug 7, 2019 · 15 comments · Fixed by #486
Closed

clang builds don't enable additional config options #136

broonie opened this issue Aug 7, 2019 · 15 comments · Fixed by #486
Assignees
Labels
bug Something isn't working

Comments

@broonie
Copy link
Member

broonie commented Aug 7, 2019

It appears that clang builds for configurations which enable additional configuration options don't actually enable those options. For example:

https://storage.kernelci.org/next/master/next-20190807/arm64/defconfig+CONFIG_CPU_BIG_ENDIAN=y/clang-8/build.log

at no point enables CONFIG_CPU_BIG_ENDIAN resulting in a config:

https://storage.kernelci.org/next/master/next-20190807/arm64/defconfig+CONFIG_CPU_BIG_ENDIAN=y/clang-8/kernel.config

where CPU_BIG_ENDIAN is not set. In contrast the equivalent GCC build does have it set in the generated config. I'm not seeing any visible difference in the start of the build logs.

@mattface

@broonie
Copy link
Member Author

broonie commented Aug 7, 2019

@gctucker Potentially this is something in kci_build?

@gctucker
Copy link
Contributor

gctucker commented Aug 8, 2019

I'll take a look but it would be surprising that only CPU_BIG_ENDIAN doesn't work while there hasn't been any issues with the other extra config options in the past.

@gctucker
Copy link
Contributor

gctucker commented Aug 8, 2019

I've reproduced this problem with a build on staging, this defconfig doesn't have CPU_BIG_ENDIAN enabled:
https://staging-storage.kernelci.org/gtucker/kernelci-local/kernelci-local-snapshot-048/arm64/defconfig+CONFIG_CPU_BIG_ENDIAN=y/clang-8/kernel.config

# CONFIG_CPU_BIG_ENDIAN is not set

even though it's in the frag.config:
https://staging-storage.kernelci.org/gtucker/kernelci-local/kernelci-local-snapshot-048/arm64/defconfig+CONFIG_CPU_BIG_ENDIAN=y/clang-8/frag.config

When I run it locally it does enable it in the .config so either this gets changed later during the build process with clang for some obscure reason or there's a bug in kci_build that only shows on the Jenkins jobs. I'll add some more debug to narrow the issue down.

@broonie
Copy link
Member Author

broonie commented Aug 8, 2019

I suspect it's not only CPU_BIG_ENDIAN, it's just very visible on CPU_BIG_ENDIAN due to the fact that failing to set that option causes us to fail to boot userspace whereas many other options could reasonably be off and won't stop things booting.

@gctucker gctucker self-assigned this Aug 8, 2019
@gctucker gctucker added the bug Something isn't working label Aug 8, 2019
@gctucker
Copy link
Contributor

gctucker commented Aug 8, 2019

OK I've found several different issues in various places:

The first one is that running merge_config.sh calls make to build some host tools such as scripts/basic/fixdep and it needs to be told that the host compiler is clang otherwise it defaults to gcc which is not present in the Docker image used for Clang builds in Jenkins. So I've made a PR to fix that and tested it on staging, it should be good to be deployed tomorrow: #141.

The second issue is that the kernel Makefile doesn't take into account the HOSTCC set in the environment when building some host tools such as scripts/basic/fixdep. So I've sent an RFC patch about that as I don't see how it can possibly work otherwise but at the same time it seems to be such an obvious fix that I don't see how it hasn't been done yet:
https://marc.info/?l=linux-kbuild&m=156529846108691&w=2 (cover letter)
https://marc.info/?l=linux-kbuild&m=156529847208693&w=2 (patch)

The third issue which I haven't addressed is that merge_config.sh doesn't propagate errors from the make call it does to build the host tools. In this case, make failed because it couldn't find gcc (without the fixes described above):

make[1]: Entering directory '/home/gtucker/project/kernelci/kernelci-core/linux/build'
  HOSTCC  scripts/basic/fixdep
/bin/sh: 1: gcc: not found

But then merge_config.sh continued and silently failed to actually merge the defconfigs, and we know what happened after that. So another patch is probably required to fix merge_config.sh and propagate the error (adding set -e at the top might be enough...). Up for grabs...

@broonie
Copy link
Member Author

broonie commented Aug 8, 2019

set -e sounds like it's enough (it tends to be my default thing for shell scripts if I CBA writing error handling anyway and I'm not sure what much more could be done).

@broonie
Copy link
Member Author

broonie commented Aug 8, 2019

Sent https://lore.kernel.org/r/20190808222705.35973-1-broonie@kernel.org

@broonie
Copy link
Member Author

broonie commented Aug 13, 2019

Some pushback from Yamada-san, he has a version of my patch which he prefers (and which I prefer as well TBH, I wrote it the way I did since it seemed more idiomatic) and he thinks Guillame's fix should be done in KernelCI.

@gctucker
Copy link
Contributor

I guess we can first fix this in KernelCI (with a Makefile or something) and then see if the fix can get merged upstream in the kernel.

@gctucker gctucker added this to In Progress in KernelCI project board Aug 28, 2019
@gctucker
Copy link
Contributor

gctucker commented Sep 6, 2019

The merge_config.sh script now propagates errors when failing to merge configs with clang due to gcc not being present. So at least we shouldn't see any false big-endian boots with little-endian kernel builds.

However a fix is still needed as per the previous comment (i.e. call the script from make, or implement something in kci_build).

@broonie
Copy link
Member Author

broonie commented Nov 5, 2019

I think everything is now sorted here, at least for mainline and -next so we can close this?

@gctucker
Copy link
Contributor

gctucker commented Nov 5, 2019

@broonie Well, no it's still not working. We still can't build say, defconfig + CONFIG_CPU_BIG_ENDIAN=y with clang when gcc is not present because merge_config.sh calls make and relies on gcc to be there by default.

@broonie
Copy link
Member Author

broonie commented Nov 5, 2019

I thought we'd fixed that by ensuring that GCC would in fact be there? A bit of a workaround but effective.

@gctucker gctucker mentioned this issue Feb 28, 2020
6 tasks
@gctucker
Copy link
Contributor

gctucker commented Jul 6, 2020

Unfortunately, adding GCC to the Clang Docker image is not a way to fix the problem.

@nickdesaulniers Now that LLVM=1 is available as per #104, can we fix merge_config.sh by adding explicit support for it?

@gctucker
Copy link
Contributor

I've verified locally that LLVM=1 does fix the issue with merge_config.sh in mainline. We should be able to verify that on staging with the big-endian builds soon with #481.

@gctucker gctucker linked a pull request Sep 1, 2020 that will close this issue
KernelCI project board automation moved this from In Progress to Done Sep 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Development

Successfully merging a pull request may close this issue.

2 participants