Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel: version: option to enable GNU build ID #51532

Closed

Conversation

JordanYates
Copy link
Collaborator

Adds a Kconfig option to enable the generation of the GNU build ID from the final zephyr.elf file. This build ID can be used to uniquely identify a firmware image.

The --build-id option is available for the ld and ldd linkers, which are used by all in-tree compilers except arcmwdt and armclang.

Inspired by the following blog post: https://interrupt.memfault.com/blog/gnu-build-id-for-firmware

Signed-off-by: Jordan Yates jordan.yates@data61.csiro.au

@Laczen
Copy link
Collaborator

Laczen commented Oct 24, 2022

@JordanYates, would it be possible to make the build-id available outside of the image? E.g. could it be read during/before firmware upgrade?

@JordanYates
Copy link
Collaborator Author

@JordanYates, would it be possible to make the build-id available outside of the image? E.g. could it be read during/before firmware upgrade?

This would require either the section is placed at a constant memory address (which is probably unworkable) or being part of the image metadata of whatever upgrade mechanism is being used.

From a quick look MCUBoot image trailers would probably be suitable for this purpose. This would require the signing tool to be updated to add the trailer (And defining the TLV type). I imagine MCUBoot would make the argument that they already have major.minor.path versioning though, so why have another version number.

@Laczen
Copy link
Collaborator

Laczen commented Oct 24, 2022

@JordanYates, would it be possible to make the build-id available outside of the image? E.g. could it be read during/before firmware upgrade?

This would require either the section is placed at a constant memory address (which is probably unworkable) or being part of the image metadata of whatever upgrade mechanism is being used.

It could be added to the front of the image reclaming part of the 512 byte header that is added to support e.g. mcuboot.

From a quick look MCUBoot image trailers would probably be suitable for this purpose. This would require the signing tool to be updated to add the trailer (And defining the TLV type). I imagine MCUBoot would make the argument that they already have major.minor.path versioning though, so why have another version number.

Adding it to the header could remove any need of TLV processing for the image to access the information. A solution could be to add it just after the 32 byte header that MCUBoot uses (if mcuboot is used). One could think of creating a zephyr header that leaves space for a MCUBoot header if required and after that placing extra zephyr image information.

@nordicjm
Copy link
Collaborator

Adding it to the header could remove any need of TLV processing for the image to access the information

I was under the impression that the header was the same TLV system as the footer, the difference being about encryption and if it's included in the final signature or not?

Copy link
Collaborator

@tejlmand tejlmand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed ?

We already create version.h which looks like this:

cat build/zephyr/include/generated/version.h
#ifndef _KERNEL_VERSION_H_
#define _KERNEL_VERSION_H_

/* KERNEL and ZEPHYR_VERSION  values come from cmake/version.cmake
 * BUILD_VERSION  will be 'git describe', alternatively user defined BUILD_VERSION.
 */

#define ZEPHYR_VERSION_CODE 197219
#define ZEPHYR_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))

#define KERNELVERSION          0x3026300
#define KERNEL_VERSION_NUMBER  0x30263
#define KERNEL_VERSION_MAJOR   3
#define KERNEL_VERSION_MINOR   2
#define KERNEL_PATCHLEVEL      99
#define KERNEL_VERSION_STRING  "3.2.99"

#define BUILD_VERSION          zephyr-v3.2.0-1066-g217528f2dea8

#endif /* _KERNEL_VERSION_H_ */

which uses the VERSION file and git describe:
https://github.com/zephyrproject-rtos/zephyr/blob/main/cmake/modules/version.cmake

COMMAND ${GIT_EXECUTABLE} describe --abbrev=12 --always

but also provide users with the possibility of specifying a custom VERSION.

zephyr/CMakeLists.txt

Lines 474 to 475 in 217528f

if(DEFINED BUILD_VERSION)
set(build_version_argument "-DBUILD_VERSION=${BUILD_VERSION}")

All of which is toolchain independent.

Why is a GNU specific feature needed, and even one that is default y.
What are the benefit of one more way of achieving the same ?

@JordanYates
Copy link
Collaborator Author

JordanYates commented Oct 31, 2022

What are the benefit of one more way of achieving the same ?

Mostly because none of the methods you mention can guarantee that an image you are communicating with is the image you actually think it is.

git describe is dependent on the commit ID changing, you can build an infinite number of variations of hello_world and git will happily tell you they are all the same.

Same deal with -DBUILD_VERSION. Maybe you're under pressure to release something and forget to change the define, or you fat finger it, or any number of other potential reasons.

The GNU build ID tells you, without any doubt, the binary running on this device is the same as the binary I am inspecting on my computer.

@nordicjm
Copy link
Collaborator

Adding @stephanosio for comment. I don't think we want to introduce anything that is limited to a GCC-based compiler only, that is far too limiting.

@JordanYates
Copy link
Collaborator Author

JordanYates commented Oct 31, 2022

Adding @stephanosio for comment. I don't think we want to introduce anything that is limited to a GCC-based compiler only, that is far too limiting.

The limitation is on the linker, not the compiler. The option is supported by ld and lld, which are used by every in-tree compiler except arcmwdt and armclang.

@nordicjm
Copy link
Collaborator

The limitation is on the linker, not the compiler. The option is supported by ld and lld, which are used by every in-tree compiler except arcmwdt and armclang.

And IAR?

@JordanYates
Copy link
Collaborator Author

JordanYates commented Oct 31, 2022

The limitation is on the linker, not the compiler. The option is supported by ld and lld, which are used by every in-tree compiler except arcmwdt and armclang.

And IAR?

I don't see it under cmake/compiler or in the Zephyr SDK, so I'm not sure its fair to call that in-tree. Honestly I have never used it so I have no idea what options it may or may not support.

That is the point of the TOOLCHAIN_SUPPORTS_GNU_BUILD_ID option however.

Copy link
Member

@stephanosio stephanosio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GNU build ID tells you, without any doubt, the binary running on this device is the same as the binary I am inspecting on my computer.

I can see that is a very useful feature, but I am not convinced GNU build ID is the right solution for it.

As you and others pointed out, this feature only works with GNU and LLVM toolchains and will not work with any other toolchains in its current form.

IMHO, we should try to provide a more generic solution to this problem. For example, a Python script that injects the hash of the pre-stage ELF file to the final ELF file (basically the same thing that the ld does for GNU build ID).

@JordanYates
Copy link
Collaborator Author

IMHO, we should try to provide a more generic solution to this problem. For example, a Python script that injects the hash of the pre-stage ELF file to the final ELF file (basically the same thing that the ld does for GNU build ID).

I suspect this is more complicated than just calculating a hash of the .elf file, and also raises the question of which link stage are you calculating the hash over? Can you guarantee that having a hash of one of the "pre" stages is sufficient? Or do we add yet-another-link-stage?

My personal opinion is that ld and lld covers a sufficiently large segment of the userbase that adding the option, which only requires modifying a linker flag, provides sufficient value until a user of the non-supported toolchains is motivated to write the (probably complex) python parser/injecter. Or asks their friendly neighbourhood compiler vendor to implement a --build-id flag.

@ceolin
Copy link
Member

ceolin commented Dec 2, 2022

I liked it that is a good feature, and as @JordanYates mentioned, the current versioning does not give a whole picture of what was built, as Zephyr (almost by lucky) has reproducible builds, this feature becomes really powerful. I'm in favor in the proposed implementation, it is simple enough and probably covers most builds. To make it generic we problably would need to reserve an area and add one additional final step in the build that overrides it with the unique ID.

For me this feature is just simple and good enough, if one needs something more robust probably will go to a solution that involves sign the image.

Adds a Kconfig option to enable the generation of the GNU build ID from
the final `zephyr.elf` file. This build ID can be used to uniquely
identify a firmware image.

The `--build-id` option is available for the `ld` and `ldd` linkers,
which are used by all in-tree compilers except `arcmwdt` and `armclang`.

Inspired by the following blog post:
  https://interrupt.memfault.com/blog/gnu-build-id-for-firmware

Signed-off-by: Jordan Yates <jordan.yates@data61.csiro.au>
Jordan Yates added 2 commits January 8, 2023 14:53
Adds a helper to `zephyr/kernel_version.h` to retrieve the GNU build ID.

Signed-off-by: Jordan Yates <jordan.yates@data61.csiro.au>
Test that `sys_gnu_build_id_get` returns a valid pointer, and the entire
hash is not 0. The expected structure of the .elf note is validated by
an assert inside the function implementation.

Signed-off-by: Jordan Yates <jordan.yates@data61.csiro.au>
@JordanYates
Copy link
Collaborator Author

Looking to close this out with either accepted or rejected.
@tejlmand do you have any comments on my response to your question?

@nordicjm
Copy link
Collaborator

nordicjm commented Jan 9, 2023

I don't think should go in being limited to GCC only, does the xtensa compiler support it? ARC? For most devices we have the GCC zephyr toolchain but there was (not sure if there still is) talk about supporting other compilers and this just introduces too much of headache in my mind, and I don't see this being essential as per the requirements listed on #52226

@nordicjm nordicjm added this to In progress in Build system Jan 11, 2023
@nordicjm
Copy link
Collaborator

@felipe-iar can you comment on this?

@felipe-iar
Copy link

@felipe-iar can you comment on this?

Hi @nordicjm and thanks for mentioning, as this might be important for the portability efforts. At the moment, the IAR ILINK does not provide any option corresponding to the GCC's --build-id. Possibly would require an acceptable workaround in case this feature becomes mandatory for Zephyr.

Copy link
Member

@attie-argentum attie-argentum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following on from #54522 and comments above... I'd suggest that this is suitably "robust, low effort and inclusive" to be worth considering seriously - but I fully take the point regarding "GNUisms".

In my opinion, the alternative to this small patch would be a larger and potentially fragile solution, which may still not suit 100% of users.

@@ -200,6 +200,18 @@ config KERNEL_ENTRY
help
Code entry symbol, to be set at linking phase.

config TOOLCHAIN_SUPPORTS_GNU_BUILD_ID
bool
default y
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd make this default n, and explicitly set it y for those that support it.

/* Variable instantiated by the linker */
extern const struct elf_note_section __g_note_build_id;

const uint8_t *sys_gnu_build_id_get(void)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the event that this expands out into supporting other toolchains, the build ID may be differently sized... I'd suggest an interface more like this.

Suggested change
const uint8_t *sys_gnu_build_id_get(void)
void sys_build_id_get(const uint8_t **buf, size_t *len)

@Laczen
Copy link
Collaborator

Laczen commented Feb 7, 2023

@attie-argentum, have you seen the recent proposal from @yonsch (#54464) ?

@attie-argentum
Copy link
Member

@Laczen - I had not, definitely worth tying these all together, thanks!

@github-actions
Copy link

github-actions bot commented Apr 9, 2023

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@yonsch
Copy link
Contributor

yonsch commented Oct 1, 2023

For anyone still interested, another approach to implement build ID: #63350

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done in 3.4
Build system
In progress
Development

Successfully merging this pull request may close these issues.

None yet

10 participants