
Platform interoperability #27
Open
TomScheffers opened this issue Sep 1, 2022 · 3 comments

@TomScheffers

Is there a way to reliably check whether a compiled model can run on a given machine?

I run predictions on various platforms. When loading a compiled model, I pick the one that was compiled on the same platform (using PLATFORM = sys.platform + '-' + sysconfig.get_platform().split('-')[-1].lower(), which yields either darwin-arm64 or linux-x86_64). However, models compiled in one linux-x86_64 environment are sometimes not interoperable with other linux-x86_64 machines (I use AWS Fargate, which runs the container on whatever hardware is available). This results in exit code 132 (Illegal Instruction) in the model.predict() loop.
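
Roughly, my selection logic looks something like this (the cache paths are simplified for illustration):

import sys, sysconfig
import lleaves

# e.g. "darwin-arm64" or "linux-x86_64"
PLATFORM = sys.platform + '-' + sysconfig.get_platform().split('-')[-1].lower()

llm = lleaves.Model(model_file="model.txt")
# reuse the binary that was compiled for this platform key
llm.compile(cache=f"cache/model_{PLATFORM}.so")
# preds = llm.predict(X)  # crashes with exit code 132 when the cached binary doesn't match the CPU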

The underlying reason is probably that the machines are not of the same architecture (ARM based?). For example, when I compile a model inside a Docker container (with DOCKER_DEFAULT_PLATFORM=linux/amd64) on my M1 Mac, it registers the platform as linux-x86_64, but the model cannot be used on an AWS Linux machine running Docker.

What would be a solid way to go about solving this? Is there a specific LLVM version I need to look at in order for models to be interoperable?

Thanks a lot.

@siboehm
Owner

siboehm commented Sep 1, 2022

Related to #12. I think the issue you're encountering is that lleaves essentially compiles with march=native, meaning the code is targeted at the host's microarchitecture (Haswell, Skylake, etc.) as well as its ISA extensions (AVX2, AVX-512, SSE4, ...). So there's no guarantee that a cached binary will run on a different CPU unless it's the same model: you compile on one x86_64 machine, and lleaves emits instructions that don't exist on a different or older x86_64 CPU.

In the current lleaves version there's not much you can do except compile on the machine that will run the final binary. The way to fix this is to introduce a new flag in the compile() method, something like native=False, which would make lleaves disable the hyper-specific instruction targeting. The relevant code is here: https://github.com/siboehm/lleaves/blob/master/lleaves/llvm_binding.py#L16
It wouldn't be a big change, but it'll require some testing. I can't really tell you when I'll get around to implementing it; it's on my todo list eventually, but I'll also accept PRs for it :)
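
Roughly, I'd imagine the target-machine setup gaining a branch like this (the native flag is hypothetical, but the llvmlite calls all exist today):

import llvmlite.binding as llvm

llvm.initialize()
llvm.initialize_native_target()
llvm.initialize_native_asmprinter()

def make_target_machine(native=True):
    # Build a TargetMachine for the triple of the current process,
    # e.g. x86_64-unknown-linux-gnu.
    target = llvm.Target.from_triple(llvm.get_process_triple())
    if native:
        # Current behaviour: specialize for the exact host CPU and all of its features.
        cpu = llvm.get_host_cpu_name()
        features = llvm.get_host_cpu_features().flatten()
    else:
        # Portable behaviour: generic CPU for the triple, no optional ISA extensions.
        cpu = ""
        features = ""
    return target.create_target_machine(cpu=cpu, features=features)

The trade-off is that the portable binary can't use the wider vector extensions, so it'll likely be somewhat slower.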

Issues like this are to some degree a consequence of using llvmlite to interface with LLVM, as opposed to writing a proper LLVM compiler. llvmlite is made for JIT compilers like numba, which always assume you'll run the code on the machine it was compiled on.

@TomScheffers
Author

Okay, it totally makes sense now. Thanks again for your quick response.

For me it would still be beneficial to use specific instruction targeting, but then I need to know which compiled version my machine requires. For now I will hash llvm.get_host_cpu_features() to determine interoperability. That should work, right 😄? Something like:

import hashlib, json
import llvmlite.binding as llvm

# Hash the host's CPU feature flags (avx2, sse4.2, ...) into a stable key,
# so a compiled model is only reused on machines with identical features.
h = hashlib.sha256()
h.update(json.dumps(dict(llvm.get_host_cpu_features()), sort_keys=True).encode())
key = h.hexdigest()

@siboehm
Owner

siboehm commented Sep 1, 2022

Without thinking about it for long: I'd probably use get_host_cpu_features, get_host_cpu_name and get_process_triple; together those should fully determine the specific CPU, its features, and the operating system.
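
Something like this (untested) sketch, extending your hashing idea:

import hashlib, json
import llvmlite.binding as llvm

# CPU model + feature flags + process triple (arch/vendor/OS) together pin down
# where a natively compiled binary can safely run.
fingerprint = {
    "cpu": llvm.get_host_cpu_name(),
    "features": dict(llvm.get_host_cpu_features()),
    "triple": llvm.get_process_triple(),
}
key = hashlib.sha256(json.dumps(fingerprint, sort_keys=True).encode()).hexdigest()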
