Error when running on big endian host (such as s390x) #1931

tmfink · 2022-11-07T09:30:57Z

When built and run on a big endian host (such as s390x), Capstone produces incorrect disassembly output.

Expected

When running on an amd64 Linux host (little endian):

$ ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
 0  f2 3c 44 22 40 49 0e 56  fadd.s     #3.141500, fp0

Actual

When running on an s390x Linux host (big endian):

$ ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
 0  f2 3c 44 22 40 49 0e 56  fadd.s     #0.000000, fp0

This was originally discovered by @plugwash in capstone-rust/capstone-rs#137, in the Debian testing CI tests for rust-capstone (the Rust bindings).

Reproducing/Testing

I was able to run a virtualized s390x environment using the multiarch/qemu-user-static container, as described in these docs:
https://docs.gitlab.com/omnibus/development/s390x.html

It looks like the upstream C library has a bug when running on a big endian host:

$ uname -a
Linux d2dad0ba076b 5.19.0-76051900-generic #202207312230~1663791054~22.04~28340d4~dev-Ubuntu SMP PREEMPT_DY s390x s390x s390x GNU/Linux
$ ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
 0  f2 3c 44 22 40 49 0e 56  fadd.s     #0.000000, fp0

This is just one example of a failing test; there are many others. More testing is required to find the remaining error cases.
Ideally, a big endian architecture would also be tested in CI.


XVilka · 2023-06-29T04:45:48Z

In Rizin we use Travis CI for s390x, and we use Capstone as the disassembly engine. If you send a PR with a test, we can make sure there is no regression once this is fixed:
https://github.com/rizinorg/rizin/blob/dev/.travis.yml#L22

It would be nice to have big-endian CI for Capstone too, but Travis CI is expensive.
Spinning up QEMU in GitHub Actions might be easier, if you want to send a PR here.

huth · 2023-11-23T17:41:54Z

I think this is likely a dup of #1710 ... @michalsc provides a hint there how this could be fixed.

I think something like this should do the job:

diff a/include/capstone/m68k.h b/include/capstone/m68k.h
--- a/include/capstone/m68k.h
+++ b/include/capstone/m68k.h
@@ -161,7 +161,12 @@ typedef struct cs_m68k_op {
 	union {
 		uint64_t imm;               ///< immediate value for IMM operand
 		double dimm; 		    ///< double imm
-		float simm; 		    ///< float imm
+		struct {
+#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
+			float pad_simm;
+#endif
+			float simm; 	    ///< float imm
+		};
 		m68k_reg reg;		    ///< register value for REG operand
 		cs_m68k_op_reg_pair reg_pair; ///< register pair in one operand
 	};

... maybe not the nicest solution on earth, but at least it is not very intrusive ...?

huth · 2023-12-18T16:23:52Z

I'm pretty confident that this is the same issue as #1710, so I'd suggest closing this one and tracking the issue in the older ticket instead.

tmfink · 2023-12-19T05:09:17Z

Duplicate of #1710

…ts (#2222)

Disassembling single floating points with immediate values currently gives wrong results on big endian hosts (like s390x), e.g.:

./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
 0  f2 3c 44 22 40 49 0e 56  fadd.s     #0.000000, fp0

While it should be (like on x86):

./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
 0  f2 3c 44 22 40 49 0e 56  fadd.s     #3.141500, fp0

The problem is that these single float values are supposed to be stored in the 32-bit "simm" field of struct cs_m68k_op (see e.g. the printing of M68K_FPU_SIZE_SINGLE in printAddressingMode() in M68KInstPrinter.c), but currently the immediate is only written to the 64-bit "imm" field of the union in cs_m68k_op. This works on little endian systems, since the least significant bytes overlap in the union there. For example, let's assume that the value 0x01020304 gets written to "imm":

04 03 02 01 00 00 00 00   uint64_t imm
xx xx xx xx xx xx xx xx   double dimm;
xx xx xx xx .. .. .. ..   float simm;

But on big endian hosts, the important bytes do not overlap, so "simm" is always zero there:

00 00 00 00 01 02 03 04   uint64_t imm
xx xx xx xx xx xx xx xx   double dimm;
xx xx xx xx .. .. .. ..   float simm;

To fix the problem, let's always set "simm" explicitly, this works on both, big endian and little endian hosts.

Thanks to Michal Schulz for his initial analysis of the problem (in #1710) and to Travis Finkenauer for providing an easy example to reproduce the issue (in #1931).

Closes: #1710

tmfink closed this as completed Dec 19, 2023

huth mentioned this issue Dec 20, 2023

Fix broken disassembly of floating point immediates on big endian hosts #2222

Merged
