New tool 'llvm-elf2bin'. (NOT READY FOR REVIEW – NO TESTS) #73625

Closed

statham-arm
Collaborator

This implements conversion from ELF images to binary and hex file formats, in a similar way to some of llvm-objcopy's output modes.

The biggest difference is that it works exclusively with the ELF loadable segment view, that is, program header table entries with the PT_LOAD type. Thus, it will output the whole of every loadable segment, whether or not a section covers the same area of the file. In particular, it can still operate on an ELF image which has no section header table at all. This difference is large enough that it would be much more difficult to put the same functionality into llvm-objcopy: it would involve working out the interactions with all of llvm-objcopy's other options, particularly the section-oriented ones.
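To illustrate what working from the segment view alone means, here is a minimal sketch that lists the PT_LOAD entries of an in-memory image using only the ELF header and program header table. It is not the tool's code: `LoadSegment`, `loadSegments`, and `makeExampleImage` are hypothetical names, and the sketch assumes a little-endian ELF64 image to keep the field offsets fixed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record type for this sketch, not the tool's own Segment class.
struct LoadSegment {
  uint64_t fileOffset; // p_offset
  uint64_t vaddr;      // p_vaddr
  uint64_t paddr;      // p_paddr
  uint64_t fileSize;   // p_filesz
  uint64_t memSize;    // p_memsz
};

// Read a little-endian integer of `size` bytes at `off`; 0 if out of range.
static uint64_t readLE(const std::vector<uint8_t> &img, size_t off,
                       size_t size) {
  if (off > img.size() || size > img.size() - off)
    return 0;
  uint64_t v = 0;
  for (size_t i = 0; i < size; ++i)
    v |= uint64_t(img[off + i]) << (8 * i);
  return v;
}

// Collect every PT_LOAD program header. The section header table is never
// consulted, so an image without one is handled the same way.
std::vector<LoadSegment> loadSegments(const std::vector<uint8_t> &img) {
  std::vector<LoadSegment> out;
  // Assumption: only ELFCLASS64 (2) with ELFDATA2LSB (1) is accepted here.
  if (img.size() < 64 || img[0] != 0x7f || img[1] != 'E' || img[2] != 'L' ||
      img[3] != 'F' || img[4] != 2 || img[5] != 1)
    return out;
  uint64_t phoff = readLE(img, 32, 8);     // e_phoff
  uint64_t phentsize = readLE(img, 54, 2); // e_phentsize
  uint64_t phnum = readLE(img, 56, 2);     // e_phnum
  if (phoff > img.size())
    return out;
  for (uint64_t i = 0; i < phnum; ++i) {
    uint64_t ph = phoff + i * phentsize;
    if (ph > img.size() || img.size() - ph < 56)
      break; // truncated program header table
    if (readLE(img, ph, 4) != 1)
      continue; // p_type is not PT_LOAD
    out.push_back({readLE(img, ph + 8, 8), readLE(img, ph + 16, 8),
                   readLE(img, ph + 24, 8), readLE(img, ph + 32, 8),
                   readLE(img, ph + 40, 8)});
  }
  return out;
}

static void writeLE(std::vector<uint8_t> &img, size_t off, uint64_t v,
                    size_t size) {
  for (size_t i = 0; i < size; ++i)
    img[off + i] = uint8_t(v >> (8 * i));
}

// Demonstration input: a header-only image with one PT_NOTE entry, one
// PT_LOAD entry, and no section header table (e_shoff and e_shnum are 0).
std::vector<uint8_t> makeExampleImage() {
  std::vector<uint8_t> img(64 + 2 * 56, 0);
  img[0] = 0x7f; img[1] = 'E'; img[2] = 'L'; img[3] = 'F';
  img[4] = 2; // ELFCLASS64
  img[5] = 1; // ELFDATA2LSB
  writeLE(img, 32, 64, 2 * 4);       // e_phoff
  writeLE(img, 54, 56, 2);           // e_phentsize
  writeLE(img, 56, 2, 2);            // e_phnum
  writeLE(img, 64, 4, 4);            // entry 0: PT_NOTE, skipped
  size_t ph = 64 + 56;               // entry 1
  writeLE(img, ph, 1, 4);            // PT_LOAD
  writeLE(img, ph + 16, 0x8000, 8);  // p_vaddr
  writeLE(img, ph + 24, 0x10000, 8); // p_paddr
  writeLE(img, ph + 32, 176, 8);     // p_filesz
  writeLE(img, ph + 40, 0x200, 8);   // p_memsz
  return img;
}
```

Calling `loadSegments(makeExampleImage())` yields a single entry whose `vaddr` and `paddr` differ, which is the situation the p_paddr/p_vaddr option below is meant to address.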

Other features:

  • supports both Intel hex and Motorola S-records

  • supports output of a binary file per ELF segment, or a single one with padding between the segments to place them at their correct offsets

  • option to use either p_paddr or p_vaddr from each segment

  • allows distributing the output binary data into multiple 'banks', to support hardware configurations in which each 32-bit word of memory is stored as 16 bits in each of two ROMs, or 8 in each of four, or similar
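The bank feature in the last bullet can be pictured as a round-robin split of the byte stream. The following is a sketch under assumptions rather than the tool's implementation: `splitIntoBanks` is a hypothetical name, and the parameters (bytes per bank per step, number of banks) are my reading of the description, not the tool's actual options.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Distribute `data` over `numBanks` output streams, `width` bytes at a time.
// With width = 2 and numBanks = 2, each 32-bit word contributes 16 bits to
// each of two ROM images; with width = 1 and numBanks = 4, 8 bits to each of
// four.
std::vector<std::vector<uint8_t>>
splitIntoBanks(const std::vector<uint8_t> &data, size_t width,
               size_t numBanks) {
  std::vector<std::vector<uint8_t>> banks(numBanks);
  if (width == 0 || numBanks == 0)
    return banks; // nothing sensible to do; return empty banks
  for (size_t i = 0; i < data.size(); ++i)
    banks[(i / width) % numBanks].push_back(data[i]);
  return banks;
}
```

For the eight bytes 0 to 7 with `width = 2` and `numBanks = 2`, bank 0 receives 0, 1, 4, 5 and bank 1 receives 2, 3, 6, 7.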

This PR is not really ready for review yet, because there are no tests in it. elf2bin was developed as a standalone tool, and it's only just been integrated into the LLVM code base. It was requested by quic-subhkedi in
https://discourse.llvm.org/t/bring-features-of-fromelf-of-arm-to-llvm-objcopy/73229/12
and I'll do the extra work of porting the tests into LLVM if there's agreement that this is acceptable as a separate tool.

This was developed as a standalone tool, and it's only just been
integrated into the LLVM code base. Until now, it's been tested using
a Python test harness that constructs test ELF files with no section
view. In order to bring the tests into the LLVM infrastructure, we
would probably need to start by enhancing yaml2obj to be able to write
out ELF files of that type.


⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 1459c627f0bc1a4938b5b6a7f4c2bdc1f3ec6a2c 32eded2b00685a7468dad496d31e5533828cf853 -- llvm/tools/llvm-elf2bin/bin.cpp llvm/tools/llvm-elf2bin/elf.cpp llvm/tools/llvm-elf2bin/hex.cpp llvm/tools/llvm-elf2bin/llvm-elf2bin.cpp llvm/tools/llvm-elf2bin/llvm-elf2bin.h
View the diff from clang-format here.
diff --git a/llvm/tools/llvm-elf2bin/bin.cpp b/llvm/tools/llvm-elf2bin/bin.cpp
index 15044613aa..850939ea67 100644
--- a/llvm/tools/llvm-elf2bin/bin.cpp
+++ b/llvm/tools/llvm-elf2bin/bin.cpp
@@ -245,9 +245,9 @@ combined_prepare(InputObject &inobj, const std::vector<Segment> &segments_orig,
     for (; it != end; ++it) {
       const auto &prev = nonoverlapping.back(), curr = *it;
       if (curr.baseaddr - prev.baseaddr < prev.memsize)
-        fatal(inobj, Twine("segments at addresses 0x")
-              + Twine::utohexstr(prev.baseaddr) + " and 0x"
-              + Twine::utohexstr(curr.baseaddr) + " overlap");
+        fatal(inobj, Twine("segments at addresses 0x") +
+                         Twine::utohexstr(prev.baseaddr) + " and 0x" +
+                         Twine::utohexstr(curr.baseaddr) + " overlap");
       nonoverlapping.push_back(curr);
     }
   }
diff --git a/llvm/tools/llvm-elf2bin/llvm-elf2bin.h b/llvm/tools/llvm-elf2bin/llvm-elf2bin.h
index daebd8640b..73221e574d 100644
--- a/llvm/tools/llvm-elf2bin/llvm-elf2bin.h
+++ b/llvm/tools/llvm-elf2bin/llvm-elf2bin.h
@@ -162,7 +162,8 @@ void srec_write(InputObject &inobj, const std::string &outfile,
 /*
  * Error-reporting functions. These are all fatal.
  */
-[[noreturn]] void fatal(llvm::StringRef filename, llvm::Twine message, llvm::Error err);
+[[noreturn]] void fatal(llvm::StringRef filename, llvm::Twine message,
+                        llvm::Error err);
 [[noreturn]] void fatal(llvm::StringRef filename, llvm::Twine message);
 [[noreturn]] void fatal(InputObject &inobj, llvm::Twine message,
                         llvm::Error err);

@jh7370 jh7370 requested review from MaskRay and jh7370 November 28, 2023 10:43
@jrtc27
Collaborator

jrtc27 commented Nov 28, 2023

  • supports both Intel hex and Motorola S-records

GNU objcopy can do this, IIRC, and U-Boot relies on that. Having llvm-objcopy be able to do this, rather than have it in a standalone tool, would be useful. In general I'd suggest having llvm-elf2bin be as integrated into the libraries and existing tools as possible rather than its own thing on the side (e.g. llvm-objcopy and llvm-elf2bin could be frontends to the same tool that expose different sets of options).


using namespace llvm;

/*
Member

I think in LLVM we prefer // for block comments.

* segment in an ELF file. In more complex cases there might also be
* zero-byte padding, or one of these stream objects filtering out the
* even-index bytes of another.
*/
Member

anonymous namespace

(Otherwise, if there is another BinaryDataStream in the global namespace, their vtables would conflict.)
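A small sketch of what this suggestion looks like in practice. Only the class name `BinaryDataStream` comes from the PR; the `read` member, the `ZeroPadding` subclass, and `readPadding` are hypothetical stand-ins chosen to keep the example self-contained.

```cpp
#include <cstdint>
#include <string>

namespace {
// Everything declared here has internal linkage, so the class and its vtable
// cannot collide with a same-named class in another translation unit.
class BinaryDataStream {
public:
  virtual ~BinaryDataStream() = default;
  virtual std::string read(uint64_t offset, uint64_t size) = 0;
};

// Hypothetical subclass standing in for the zero-byte padding case.
class ZeroPadding : public BinaryDataStream {
public:
  std::string read(uint64_t, uint64_t size) override {
    return std::string(size, '\0');
  }
};
} // namespace

// Externally visible entry point that uses the file-local classes.
std::string readPadding(uint64_t size) {
  ZeroPadding padding;
  BinaryDataStream &stream = padding;
  return stream.read(0, size);
}
```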

hexstream << ':';
for (uint8_t c : binstring) {
static const char hexdigits[] = "0123456789ABCDEF";
hexstream << hexdigits[c >> 4] << hexdigits[c & 0xF];
Member

llvm::hexdigit from StringExtras.h
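For context on what the loop above is emitting, here is a hedged sketch of encoding one complete Intel HEX record. It does not use LLVM headers: the local `hexDigit` is a stand-in with the same upper-case digit mapping as the quoted code, and `ihexRecord` is a hypothetical name rather than a function from the PR. It assumes at most 255 data bytes per record.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Local stand-in for a hex-digit helper (upper-case digits).
static char hexDigit(unsigned x) { return "0123456789ABCDEF"[x & 0xF]; }

// Encode one Intel HEX record:
//   ':' byte-count, 16-bit big-endian address, record type, data, checksum.
// The checksum is the two's complement of the sum of all preceding bytes.
std::string ihexRecord(uint16_t addr, uint8_t type,
                       const std::vector<uint8_t> &data) {
  std::vector<uint8_t> bytes;
  bytes.push_back(uint8_t(data.size())); // assumes data.size() <= 255
  bytes.push_back(uint8_t(addr >> 8));
  bytes.push_back(uint8_t(addr & 0xFF));
  bytes.push_back(type);
  bytes.insert(bytes.end(), data.begin(), data.end());

  uint8_t sum = 0;
  for (uint8_t b : bytes)
    sum = uint8_t(sum + b);
  bytes.push_back(uint8_t(0u - sum));

  std::string out = ":";
  for (uint8_t b : bytes) {
    out += hexDigit(b >> 4);
    out += hexDigit(b & 0xF);
  }
  return out;
}
```

With no data, address 0, and record type 1 (end of file), this produces the familiar `:00000001FF` terminator line.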

//
// Written as (1 + ~firstres) to avoid Visual Studio
// complaining about negating an unsigned.
pos = (1 + ~firstres) & mask;
Member

There are many places where we would cause MSVC warning C4146: unary minus operator applied to unsigned type, result still unsigned. We have /wd4146 in compiler-rt/cmake/config-ix.cmake and should just use - without needing a comment.
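The two spellings are equivalent because unsigned arithmetic is modular, so `-x` and `1 + ~x` are the same value. A minimal sketch, with hypothetical function names and under the assumption that the quoted expression computes the distance to the next power-of-two boundary (the meaning of `firstres` is not shown in this excerpt):

```cpp
#include <cstdint>

// Bytes needed to advance `addr` to the next multiple of `align`.
// `align` is assumed to be a power of two. Unary minus on an unsigned value
// is well defined: it yields 2^64 - addr (mod 2^64).
uint64_t paddingToAlign(uint64_t addr, uint64_t align) {
  uint64_t mask = align - 1;
  return -addr & mask;
}

// The form used in the PR, which avoids unary minus on an unsigned operand.
uint64_t paddingToAlignWorkaround(uint64_t addr, uint64_t align) {
  uint64_t mask = align - 1;
  return (1 + ~addr) & mask;
}
```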

def : Separate<["--"], longname>,
Alias<!cast<Joined>(NAME # "_EQ")>;

if !not(!empty(shortname)) then {
Member

@MaskRay MaskRay Nov 28, 2023

For new utilities, we prefer supporting only --foo x and --foo=x, not -foo x. The -- prefix is what the majority of modern programs use, and this new utility has no compatibility goal that would require supporting -foo.

@statham-arm
Collaborator Author

Closing this draft PR as thoroughly out of date. There was never consensus that it belonged in LLVM in the first place, and the code in this branch is doubly stale – it's against an old version of the LLVM code, and the elf2bin tool itself has had bug fixes since then.

The up-to-date version can be found in Arm Toolchain: https://github.com/arm/arm-toolchain/tree/arm-software/arm-software/shared/elf2bin

@statham-arm statham-arm closed this Aug 5, 2025
@statham-arm statham-arm deleted the elf2bin branch August 5, 2025 13:47