Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-objcopy] Segment based binary output format for objcopy #68569

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

quic-subhkedi
Copy link

@quic-subhkedi quic-subhkedi commented Oct 9, 2023

[llvm-objcopy] Segment based binary output format for objcopy

  1. Adds support for a new output format
    for objcopy that is segments based binary output
    formats. In this format the multiple output files
    are created based on the number of loadable program
    header count in the input elf with each files having
    the content from the sections mapped into it.

  2. The patch uses the segement count from elf file to
    determine the number of output files and append the segment
    index to the passed output file name for output emission.

  3. Adds a new tool called llvm-fromelf where the required
    options/features of ARM's fromelf can be added to.

@quic-subhkedi quic-subhkedi marked this pull request as draft October 9, 2023 09:29
@quic-subhkedi quic-subhkedi force-pushed the SegmentBasedObjcopy branch 4 times, most recently from 5daea5d to 75c431a Compare October 11, 2023 11:58
1. Adds support for a new output format
for objcopy that is segments based binary output
formats. In this format the multiple output files
are created based on the number of loadable program
header count in the input elf with each files having
the content from the sections mapped into it.

2. Adds a new tool called llvm-fromelf where the required
options/features of ARM's fromelf can be added to.
@quic-subhkedi quic-subhkedi marked this pull request as ready for review October 12, 2023 09:26
@llvmbot llvmbot added the cmake Build system in general and CMake in particular label Oct 12, 2023
@quic-subhkedi quic-subhkedi changed the title Segment based binary output format for objcopy [llvm-objcopy] Segment based binary output format for objcopy Oct 13, 2023
Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally not keen on having a llvm-fromelf alias for a few reasons, some might be resolved but some I don't think can be resolved.

The first is a comparison with llvm-readelf. The llvm-readelf tries to be a drop in replacement for readelf and tries to follow the same syntax for all options. This can't be done for fromelf in total as fromelf is a mix of objcopy, objdump, dwarfdump, strip et-al. Unless there is a commitment to mimic fromelf I think this sets expectations that cannot be matched. Even if we only mimic the binary/hex writing parts there are a lot more options that fromelf has and behaviour that are not implemented.

The second, and something that would need to be checked, is whether closely using the name of a commercial tool may bring down legal/trademark problems. Possible that there is no problem, but I would personally not want to take the risk.

Lastly I think the options can be added to llvm-objcopy with descriptive names and developers could easily find them, or be told that these were functionally equivalent to the fromelf options.

With that in mind it looks like the options that have been added are a new output format -O sbin.

If it is possible I personally would not add this as a separate format, but as an additional option modifying -O bin. In other words the output is still bin, it is just that the tool splits up the output into multiple parts, much like --only-section

. This can be combined with other output formats like hex without having to additional segement based outputs for all the other hex formats. It looks like in the original proposal it was --dump-segments.

It may be best to go back to discourse with the proposed interface as it will be easier to have a conversation with multiple people there. This is just my own opinion and I'm not a maintainer.

Some other thoughts:

  • The fromelf interface https://developer.arm.com/documentation/101754/0621/fromelf-Reference/fromelf-Command-line-Options/--bin?lang=en will output a single file with the desired output filename if there is only one segment, but a directory with the desired output filename containing the segments otherwise. Fromelf uses a convention that looks up the name of the first output section in the segment, but as mentioned on discourse this is something we should change to something like segment1 segment2 etc. as existing fromelf customers get confused with this convention. I can't see tests for both of these cases.
  • How does this combine with --only-section=
    ? In theory this is well defined and only the sections in the list could be written. Can we have tests added showing the interaction, in particular what happens when some segements are partially written and some not written at all.
  • Is there scope for --only-segment=? Could be added later though.

In summary:

  • I'd prefer that the llvm-fromelf alias is dropped, or at least renamed not to say fromelf.
  • Would a modifier like --dump-segments be preferable to a new output format? That would combine with hex output formats as well, which can also be written out per segment. That would require more tests to show that it worked with hex too.
  • It looks like some more test cases are required.

I've not gone through the code in detail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cmake Build system in general and CMake in particular llvm:binary-utilities
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants