Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when running on big endian host (such as s390x) #1931

Closed
tmfink opened this issue Nov 7, 2022 · 4 comments
Closed

Error when running on big endian host (such as s390x) #1931

tmfink opened this issue Nov 7, 2022 · 4 comments

Comments

@tmfink
Copy link
Contributor

tmfink commented Nov 7, 2022

When built/run on a big endian host (such as s390x), capstone has unexpected output.

Expected

When running on an amd64 Linux host (little endian):

./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
 0  f2 3c 44 22 40 49 0e 56  fadd.s     #3.141500, fp0

Actual

When running on a s390x Linux host (big endian):

$ ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
 0  f2 3c 44 22 40 49 0e 56  fadd.s     #0.000000, fp0

This was originally discovered by @plugwash in capstone-rust/capstone-rs#137 for debian testing CI tests for rust-capstone (rust-lang bindings).

Reproducing/Testing

I was able to get a s390x virtualized using multiarch/qemu-user-static container as mentioned in these docs:
https://docs.gitlab.com/omnibus/development/s390x.html

It looks like the upstream C library has a bug when running on a big endian host:

$ uname -a
Linux d2dad0ba076b 5.19.0-76051900-generic #202207312230~1663791054~22.04~28340d4~dev-Ubuntu SMP PREEMPT_DY s390x s390x s390x GNU/Linux
$ ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
 0  f2 3c 44 22 40 49 0e 56  fadd.s     #0.000000, fp0

This is just one example test that failed--there are many. More testing is required to find more error cases.
Also, ideally a big endian architecture would be tested in CI.

@XVilka
Copy link
Contributor

XVilka commented Jun 29, 2023

In Rizin we use Travis CI for S390, and we use capstone as the engine. If you send a PR with a test, it could be ensured there is no regression once fixed:
https://github.com/rizinorg/rizin/blob/dev/.travis.yml#L22

It would be nice to have somehow big-endian CI for Capstone too but Travis CI is expensive.
Spinning up QEMU in GitHub Actions might be easier, if you want to send a PR here

@huth
Copy link
Contributor

huth commented Nov 23, 2023

I think this is likely a dup of #1710 ... @michalsc provides a hint there how this could be fixed.

I think something like this should do the job:

diff a/include/capstone/m68k.h b/include/capstone/m68k.h
--- a/include/capstone/m68k.h
+++ b/include/capstone/m68k.h
@@ -161,7 +161,12 @@ typedef struct cs_m68k_op {
 	union {
 		uint64_t imm;               ///< immediate value for IMM operand
 		double dimm; 		    ///< double imm
-		float simm; 		    ///< float imm
+		struct {
+#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
+			float pad_simm;
+#endif
+			float simm; 	    ///< float imm
+		};
 		m68k_reg reg;		    ///< register value for REG operand
 		cs_m68k_op_reg_pair reg_pair; ///< register pair in one operand
 	};

... maybe not the nicest solution on earth, but at least it is not very intrusive ...?

@huth
Copy link
Contributor

huth commented Dec 18, 2023

I'm pretty confident that this is the same issue as #1710 ... so I'd like to suggest to close this one here and track the issue in the older ticket instead?

@tmfink
Copy link
Contributor Author

tmfink commented Dec 19, 2023

duplicate of #1710

@tmfink tmfink closed this as completed Dec 19, 2023
kabeor pushed a commit that referenced this issue Dec 21, 2023
…ts (#2222)

Disassembling single floating points with immediate values currently
gives wrong results on big endian hosts (like s390x), e.g.:

 ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
  0  f2 3c 44 22 40 49 0e 56  fadd.s     #0.000000, fp0

While it should be (like on x86):

 ./cstool/cstool m68k40 'f2 3c 44 22 40 49 0e 56'
  0  f2 3c 44 22 40 49 0e 56  fadd.s     #3.141500, fp0

The problem is that these single float values are supposed to be stored
in the 32-bit "simm" field of struct cs_m68k_op (see e.g. the printing
of M68K_FPU_SIZE_SINGLE in printAddressingMode() in M68KInstPrinter.c),
but currently the immediate is only written to the 64-bit "imm" field
of the union in cs_m68k_op. This works on little endian systems, since
the least significant bytes overlap in the union there. For example,
let's assume that the value 0x01020304 gets written to "imm":

 04 03 02 01 00 00 00 00    uint64_t imm
 xx xx xx xx xx xx xx xx    double dimm;
 xx xx xx xx .. .. .. ..    float simm;

But on big endian hosts, the important bytes do not overlap, so "simm"
is always zero there:

 00 00 00 00 01 02 03 04    uint64_t imm
 xx xx xx xx xx xx xx xx    double dimm;
 xx xx xx xx .. .. .. ..    float simm;

To fix the problem, let's always set "simm" explicitly, this works on
both, big endian and little endian hosts.

Thanks to Michal Schulz for his initial analysis of the problem
(in #1710) and to Travis Finkenauer for providing an easy example
to reproduce the issue (in #1931).

Closes: #1710
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants