Using some tools that are usually distributed with Linux distros, we can inspect the binaries generated by our assembly code (and other binaries too).

hexdump

The first tool we are going to use is hexdump. If you are in the project's root directory and have assembled and linked our first.asm program, then type:

$ hexdump bin/first
0000000 457f 464c 0102 0001 0000 0000 0000 0000
0000010 0002 003e 0001 0000 0080 0040 0000 0000
0000020 0040 0000 0000 0000 0190 0000 0000 0000
0000030 0000 0000 0040 0038 0001 0040 0005 0004
0000040 0001 0000 0005 0000 0000 0000 0000 0000
0000050 0000 0040 0000 0000 0000 0040 0000 0000
0000060 008c 0000 0000 0000 008c 0000 0000 0000
0000070 0000 0020 0000 0000 0000 0000 0000 0000
0000080 3cb8 0000 bf00 0000 0000 050f 0000 0000
0000090 0000 0000 0000 0000 0000 0000 0000 0000
00000a0 0000 0000 0000 0000 0000 0000 0003 0001

(I truncated part of the output because it was too large.)

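What we see here is the contents of our binary executable file, in hexadecimal format. The first number in each line is the offset of that line from the beginning of the file (also in hexadecimal). Each character after that represents 4 bits, so each group of 4 characters is a group of 2 bytes. Note that by default hexdump prints each 2-byte group as a 16-bit word in the machine's byte order, so on a little-endian machine such as x86-64 the two bytes of every group appear swapped: the file actually begins with the bytes 7f 45 4c 46 (0x7f followed by "ELF"), not 45 7f 46 4c. Running hexdump -C prints the bytes in file order instead.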
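To make the byte order concrete, here is a minimal Python sketch that turns one line of default hexdump output back into bytes in file order. The function name is made up for illustration, and the sketch assumes the dump was produced on a little-endian machine (as on x86-64), where hexdump shows each 2-byte group with its bytes swapped.

```python
def hexdump_words_to_bytes(line):
    """Convert one line of default `hexdump` output into file-order bytes.

    Assumption: the dump comes from a little-endian host, so every
    4-character group is a 16-bit word whose low byte comes first in the file.
    """
    fields = line.split()
    words = fields[1:]  # drop the leading offset column
    out = bytearray()
    for word in words:
        out += int(word, 16).to_bytes(2, "little")
    return bytes(out)


# Two lines copied from the dump of bin/first shown above.
first_line = "0000000 457f 464c 0102 0001 0000 0000 0000 0000"
code_line = "0000080 3cb8 0000 bf00 0000 0000 050f 0000 0000"

print(hexdump_words_to_bytes(first_line)[:4])           # b'\x7fELF'
print(hexdump_words_to_bytes(code_line)[:12].hex(" "))  # b8 3c 00 00 00 bf 00 00 00 00 0f 05
```

The second line of output is the machine code of our program, byte for byte, as it sits at offset 0x80 in the file.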

In this way, we can see the exact output that nasm and ld generated for our assembly code. You will also notice that a given combination of instruction and operands always produces the same bytes.
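As one example of reading this output, the first 32 bytes of the dump are the start of the ELF header. The sketch below is only an illustration: the bytes are copied from the first two lines of the dump and written in file order (each 2-byte group un-swapped, assuming a little-endian host), the function name is hypothetical, and the field layout follows the standard ELF64 header.

```python
import struct

# First 32 bytes of bin/first, in file order, taken from the dump above.
header = bytes.fromhex(
    "7f454c46020101000000000000000000"  # offset 0x00: e_ident
    "02003e00010000008000400000000000"  # offset 0x10: e_type .. e_entry
)


def parse_elf64_header_start(data):
    """Decode the first few fields of a 64-bit little-endian ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # e_type (2 bytes), e_machine (2), e_version (4), e_entry (8)
    e_type, e_machine, e_version, e_entry = struct.unpack_from("<HHIQ", data, 16)
    return {
        "e_type": e_type,        # 2 means executable file
        "e_machine": e_machine,  # 0x3e (62) means x86-64
        "e_version": e_version,
        "e_entry": e_entry,      # virtual address where execution starts
    }


info = parse_elf64_header_start(header)
print(hex(info["e_machine"]))  # 0x3e
print(hex(info["e_entry"]))    # 0x400080
```

The entry address 0x400080 is where the loader starts executing our program.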

objdump

We can also "disassemble" our binary, which means to generate the assembly code from the binary. To do this, type:

$ objdump -D bin/first

bin/first:     file format elf64-x86-64


Disassembly of section .text:

0000000000400080 <_start>:
  400080:	b8 3c 00 00 00       	mov    $0x3c,%eax
  400085:	bf 00 00 00 00       	mov    $0x0,%edi
  40008a:	0f 05                	syscall

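We can see some interesting things here. First, our mov rax, 60 instruction from first.asm was assembled to b8 3c 00 00 00. More importantly, the disassembly shows that these bytes encode mov $0x3c,%eax (objdump prints AT&T syntax by default; objdump -D -M intel prints Intel syntax), which in the syntax that we are writing (called "Intel" syntax) would be mov eax, 60. Recall from earlier that EAX is just the lower 32-bit portion of RAX. The result is the same as for the instruction we wrote because, in 64-bit mode, writing to a 32-bit register zeroes the upper 32 bits of the full register, so the shorter 32-bit encoding has the same effect.

B8 is the opcode for mov eax with a 32-bit immediate value, and 3C is just 60 in hexadecimal. Because the immediate is 32 bits wide, four bytes follow the opcode, stored least significant byte first: 3c 00 00 00. The next line follows the same pattern: BF is the opcode for mov edi with a 32-bit immediate, followed by four zero bytes.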
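The decoding described above can be sketched in a few lines of Python. This is a toy decoder, not a real disassembler: it only understands the two encodings that appear in first (opcodes B8 through BF for mov into a 32-bit register with a 32-bit immediate, and 0F 05 for syscall), and the function and table names are made up for illustration.

```python
import struct

# 32-bit register names in encoding order: opcode 0xB8 + index selects the register.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode(code):
    """Decode a byte string made only of `mov r32, imm32` and `syscall`."""
    result = []
    i = 0
    while i < len(code):
        opcode = code[i]
        if 0xB8 <= opcode <= 0xBF:
            # The 4 bytes after the opcode are a little-endian 32-bit immediate.
            (imm,) = struct.unpack_from("<I", code, i + 1)
            result.append("mov %s, %d" % (REG32[opcode - 0xB8], imm))
            i += 5
        elif code[i:i + 2] == b"\x0f\x05":
            result.append("syscall")
            i += 2
        else:
            raise ValueError("unsupported byte 0x%02x at offset %d" % (opcode, i))
    return result


# The .text bytes of bin/first, as listed by objdump above.
text = bytes.fromhex("b83c000000" "bf00000000" "0f05")
print(decode(text))  # ['mov eax, 60', 'mov edi, 0', 'syscall']
```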

xxd

Another cool tool for looking into binaries is xxd (it came pre-installed on my Linux Mint system). Look at the output when we run it against our hello binary:

$ xxd bin/hello
000000a0: 0d00 0000 0000 0000 0000 2000 0000 0000  .......... .....
000000b0: b801 0000 00bf 0100 0000 48be d800 6000  ..........H...`.
000000c0: 0000 0000 ba0d 0000 000f 05b8 3c00 0000  ............<...
000000d0: 4831 ff0f 0500 0000 4865 6c6c 6f2c 2077  H1......Hello, w
000000e0: 6f72 6c64 0a00 0000 0000 0000 0000 0000  orld............
000000f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000100: 0000 0000 0300 0100 b000 4000 0000 0000  ..........@.....

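(This is just part of the output; I truncated it above and below.) As you can see, xxd also shows, in the right-hand column, the ASCII character for each byte that is printable, and a dot for each byte that is not. That makes our "Hello, world" string easy to spot. Unlike the default hexdump output, xxd prints the bytes in file order.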
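To show how such a line is put together, here is a small Python sketch that formats 16 bytes the way the xxd output above looks. It is an imitation written for illustration (the function name is hypothetical), not xxd's actual implementation.

```python
def xxd_line(offset, chunk):
    """Format up to 16 bytes like one line of default xxd output."""
    # Hex column: bytes in file order, grouped two at a time.
    groups = [chunk[i:i + 2].hex() for i in range(0, len(chunk), 2)]
    hex_column = " ".join(groups)
    # ASCII column: printable characters as themselves, everything else as a dot.
    ascii_column = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    # 8 groups of 4 hex digits plus 7 separating spaces is 39 characters wide.
    return "%08x: %-39s  %s" % (offset, hex_column, ascii_column)


# The 16 bytes at offset 0xd0 of bin/hello, copied from the output above.
chunk = bytes.fromhex("4831ff0f05000000" "48656c6c6f2c2077")
print(xxd_line(0xD0, chunk))
# 000000d0: 4831 ff0f 0500 0000 4865 6c6c 6f2c 2077  H1......Hello, w
```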

C versus Assembly binaries

Another interesting thing to compare, even without these tools, is the size of the executables generated when we made standalone binaries (with a _start label) versus when we linked with the C program. You can use the ls -lh command to check the sizes of files in a directory.

Because the programs linked with our caller.c helper (using the caller_c script) are much larger than "pure" assembly programs, their disassembly is correspondingly long. You can instead run objdump on the .o object files, which are much smaller because they don't contain the code that gcc adds when linking.