wip: POC of checksums using bindesc #63350

Closed
wants to merge 1 commit into from

yonsch (Contributor) commented Oct 1, 2023

This is a proof of concept demonstrating how binary descriptors could be used to store a checksum/build-id of the executable.
This was first requested here: #51532.
Essentially the steps taken are:

  1. Create binary descriptors for the checksums, filling them with zeros.
  2. Compile normally, until the final zephyr.elf is linked.
  3. Run a script on the ELF that collects all the raw data into an intermediate image.
  4. Scan the symbols to see what hashes the user is interested in.
  5. Calculate those hashes over the intermediate image, and modify the symbols inside zephyr.elf to those values.
  6. Continue regularly and create bin, hex and uf2 files.

Another important note is that the rom_start section is not included in the checksum calculation, because that section contains the binary descriptors. This ensures that the build time is not included in the calculation, producing consistent checksums on consecutive builds.
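
As a rough illustration of steps 3 to 5 and of the rom_start exclusion, here is a minimal sketch that works on an in-memory model rather than a real ELF. The section list, the placeholder offset and the helper names (compute_build_hash, patch_placeholder) are all hypothetical and are not taken from the PR's script.

```python
import hashlib

EXCLUDED_SECTION = "rom_start"  # holds the descriptors, so it is skipped


def compute_build_hash(sections):
    """Return the MD5 digest over every section except the excluded one."""
    md5 = hashlib.md5()
    for name, data in sections:
        if name != EXCLUDED_SECTION:
            md5.update(data)
    return md5.digest()


def patch_placeholder(section_data, offset, digest):
    """Overwrite a zero-filled placeholder of len(digest) bytes at offset."""
    end = offset + len(digest)
    if any(section_data[offset:end]):
        raise ValueError("placeholder is not zero-filled")
    patched = bytearray(section_data)
    patched[offset:end] = digest
    return bytes(patched)


# Hypothetical image: 8 bytes of build time, then a 16-byte zeroed placeholder.
rom_start = b"17:22:43" + bytes(16)
sections = [
    ("rom_start", rom_start),
    ("text", b"\x01\x02\x03\x04"),
    ("rodata", b"Hello world!"),
]

digest = compute_build_hash(sections)
patched_rom_start = patch_placeholder(rom_start, 8, digest)
```

Because rom_start never enters the hash, a rebuild that only changes the build time yields the same digest in this model.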

This was tested on native_posix and rpi_pico.

I'm submitting this as a draft, as there are still some questions to be answered:

  1. Should we modify zephyr.elf directly, or create a new ELF file?
  2. Should the build system itself set the checksums, or should it be done by the user like west sign?
  3. Does the method I use to calculate the hash capture everything that should be checksummed, and exclude everything that shouldn't? How portable is this? Should we just use objcopy?

Example output:

APP_VERSION_STRING "1.0.0"
BUILD_DATE_TIME_STRING "2023/10/01 17:22:43"
C_COMPILER_NAME "GNU"
C_COMPILER_VERSION "12.2.0"
CHECKSUM_MD5_BYTES b'&\\\r\x9d-|-\x10\xa2\xb1VC6Y\xf4;'
CHECKSUM_MD5_STRING "265c0d9d2d7c2d10a2b156433659f43b"
KERNEL_VERSION_MAJOR 3
KERNEL_VERSION_STRING "3.4.99"
0x2003 b'\x01\x02\x03\x04'
0x0002 5
0x1001 "Hello world!"
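
The two MD5 descriptors in this output carry the same 16 bytes, once raw and once as hex text. A quick way to confirm that, using only the values printed above (the variable names are just for illustration):

```python
# Values copied from the example output above.
md5_bytes = b'&\\\r\x9d-|-\x10\xa2\xb1VC6Y\xf4;'
md5_string = "265c0d9d2d7c2d10a2b156433659f43b"

# The string descriptor is the lowercase hex encoding of the bytes descriptor.
same_value = md5_bytes.hex() == md5_string
```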

yonsch (Contributor, Author) commented Oct 1, 2023

Extra feature: added the option to give the descriptor space a constant size, leaving the descriptors room to change size without affecting the checksum. For example, these two images have the same checksum, even though the my_string descriptor has a different length in each:

CHECKSUM_MD5_STRING "9a4e796a0530ae18d28099122c849ce9"
0x1001 "Hello world!"
CHECKSUM_MD5_STRING "9a4e796a0530ae18d28099122c849ce9"
0x1001 "Hello world!!!!!"
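
The PR does not show how the constant-size space is implemented, so the following is only a toy model of the idea: if the descriptor area is padded to a fixed size, everything after it stays at the same offset, and a hash taken outside that area does not change when a descriptor grows. DESCRIPTOR_AREA_SIZE and the helper names are assumptions made for this sketch.

```python
import hashlib

DESCRIPTOR_AREA_SIZE = 64  # hypothetical fixed size of the descriptor area


def pad_descriptor_area(descriptors, size=DESCRIPTOR_AREA_SIZE):
    """Zero-pad the descriptor blob so the area always occupies `size` bytes."""
    if len(descriptors) > size:
        raise ValueError("descriptors do not fit in the reserved area")
    return descriptors + bytes(size - len(descriptors))


def build_image(descriptors, code):
    """Toy image layout: fixed-size descriptor area followed by the code."""
    return pad_descriptor_area(descriptors) + code


def hash_outside_descriptors(image):
    """Hash everything after the fixed-size descriptor area."""
    return hashlib.md5(image[DESCRIPTOR_AREA_SIZE:]).hexdigest()


code = b"\x01\x02\x03\x04"
image_a = build_image(b"Hello world!", code)
image_b = build_image(b"Hello world!!!!!", code)
```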

Signed-off-by: Yonatan Schachter <yonatan.schachter@gmail.com>
if segment.section_in_segment(section):
if not (section.name == "rom_start" and segment.section_in_segment(section)):
    # Don't include rom_start in the checksum calculation
    image_data += section.data()
Member

You'd probably want to include the load address of the chunk in the hash somehow?

As you suggest, objcopy could be a good approach, and hashing an ihex file would give you an equivalent - where the hash operates on a "sparse but gaps are still encoded" basis.

... and possibly sort the chunks by their load address? (unless they are presented in-order already / is that guaranteed)
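
One possible shape for this suggestion, as a sketch only: sort the chunks by load address and feed each chunk's address and length into the hash ahead of its data. The (address, data) tuple layout and the fixed little-endian 64-bit encoding are assumptions of this example, not something the PR or objcopy prescribes.

```python
import hashlib
import struct


def hash_chunks(chunks):
    """Hash (load_address, data) chunks in address order.

    The address and length of each chunk are mixed into the hash, so moving
    a chunk or shifting a boundary between chunks changes the result.
    """
    md5 = hashlib.md5()
    for address, data in sorted(chunks, key=lambda chunk: chunk[0]):
        md5.update(struct.pack("<QQ", address, len(data)))
        md5.update(data)
    return md5.hexdigest()


# Hypothetical chunks, listed out of order on purpose.
chunks = [
    (0x10001000, b"rodata bytes"),
    (0x10000100, b"text bytes"),
]
build_id = hash_chunks(chunks)
```

Sorting makes the result independent of the order in which the ELF presents its sections, which sidesteps the question of whether that order is guaranteed.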


def set_checksums(filename):
    with open(filename, 'rb+') as file_stream:
        elffile = ELFFile(file_stream)
Collaborator

I'm not sure this is really that useful. If the image is for MCUboot, the hex and bin files will have headers and footers added, so an MD5 hash (not a checksum) is not useful here. And if an ELF file is split up, e.g. into different memory regions, this is also of little use, because the MCU would never really be able to verify that hash.

Contributor Author

This is not intended as a cryptographic hash for the MCU to verify (although it can be). It's a method for creating a "build ID". The use case, as phrased by @JordanYates, is:

The GNU build ID tells you, without any doubt, the binary running on this device is the same as the binary I am inspecting on my computer.

It's harder to do with the hash given by imgtool: that only associates the binary running on a device with a binary on your computer, not an ELF, and it only works if your image is signed - it won't work on the bootloader itself or on a native_posix image. Also, if the image includes a build time, you will get a different hash for different builds.

Collaborator

Ah, I see what you are trying to do here. In that case, let's remove "checksum" from this, because it is not a checksum. "Build ID" is a good way to refer to it.

github-actions bot commented Dec 2, 2023

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

github-actions bot commented Feb 1, 2024

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Feb 1, 2024
@github-actions github-actions bot closed this Feb 16, 2024
4 participants