Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test92: refactor and fix SLJIT_MOV_U16 atomics in higher offsets for s390x #173

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

carenas
Copy link
Contributor

@carenas carenas commented Jun 18, 2023

while this doesn't fully fix higher offset issues with atomics in s390x (with the emulation) it is worth doing and provides enough data to discuss the different approaches that could be taken to do so long term.

@zherczeg
Copy link
Owner

What is the problem (bug?) exactly?

@carenas
Copy link
Contributor Author

carenas commented Jun 18, 2023

What is the problem (bug?) exactly?

to begin it truncates the memory address to a 4 byte alignment but reads 8, so it bases its calculation in invalid data, past the first integer.

indeed, the reason why I couldn't come with a "fix" is also why I decided for a full rewrite (only for SLJIT_MOV_U16 case) so we could at least compare approaches and maybe come back with a better one at the end.

@zherczeg
Copy link
Owner

It uses cs not csg so it should be a 4 byte read. It is a good idea to compare multiple approaches though.

allow for the option (currently unused) of adding non packed
atomics for testing.

add test on higher half offset, and fix it for s390x, a similar
problem with SLJIT_MOV_U8 (currently not tested) and its fix
punted for now.
@carenas
Copy link
Contributor Author

carenas commented Jun 18, 2023

to help debugging here is my ad-hoc test code:

#include "sljitLir.h"
#include <stdio.h>

#if T == 16
typedef sljit_u16 test_t;
#define M SLJIT_MOV_U16
#define D 0x1337
#else
typedef sljit_u8 test_t;
#define M SLJIT_MOV_U8
#define D 0x42
#endif

typedef long (SLJIT_FUNC *func_t)(test_t *);

static func_t generate_func(void)
{
	void *code;

	struct sljit_compiler *C = sljit_create_compiler(NULL, NULL);

	sljit_emit_enter(C, 0, SLJIT_ARGS1(W, P), 2, 1, 0, 0, 0);
	sljit_emit_atomic_load(C, M, SLJIT_R0, SLJIT_S0);
	sljit_emit_op1(C, M, SLJIT_R1, 0, SLJIT_IMM, D);
	sljit_emit_atomic_store(C, M, SLJIT_R1, SLJIT_S0, SLJIT_R0);

	sljit_emit_return(C, SLJIT_MOV, SLJIT_R0, 0);

	code = sljit_generate_code(C);

	sljit_free_compiler(C);
	return (func_t)code;

}

int main(void)
{
	sljit_uw c = 0x5566aabbdeadbeef;
	func_t func = generate_func();

	for (int i = 0; i < (int)(sizeof(c)/sizeof(test_t)); i++) {
		printf("%d: %lx %hx %lx\n", i, func((test_t *)&c + i), ((test_t *)&c)[i], c);
		c = 0x5566aabbdeadbeef;
	}

	sljit_free_code((void *)func, NULL);
	return 0;
}

and the successful output (build with -DT=16) with the proposed code:

0: 5566aabb 1337 1337aabbdeadbeef
1: 5566aabb 1337 55661337deadbeef
2: deadbeef 1337 5566aabb1337beef
3: deadbeef 1337 5566aabbdead1337

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants