published maven jars don't support Apple Silicon (ARM) Macs (arm64e) because of missing dynamic library #5035

Open
xelax opened this issue Feb 27, 2022 · 17 comments

Comments

@xelax

xelax commented Feb 27, 2022

Description

Running LightGBM on a Mac with Apple silicon (M1 chip, aarch64 architecture) fails because the native library for that architecture is missing:

22/02/16 10:51:32 ERROR LightGBMRanker: {"uid":"LightGBMRanker_7c727e2d1c9e","className":"class com.microsoft.azure.synapse.ml.lightgbm.LightGBMRanker","method":"train","buildVersion":"0.9.5"}
org.apache.spark.SparkException: Job aborted due to stage failure: Could not recover from a failed barrier ResultStage. Most recent failure reason: Stage failed because barrier task ResultTask(13, 0) finished unsuccessfully.
java.lang.UnsatisfiedLinkError: /private/var/folders/31/56fwtfy17t520t96p0zwjnt40000gq/T/mml-natives618985592717473613/lib_lightgbm.dylib: dlopen(/private/var/folders/31/56fwtfy17t520t96p0zwjnt40000gq/T/mml-natives618985592717473613/lib_lightgbm.dylib, 0x0001): tried: '/private/var/folders/31/56fwtfy17t520t96p0zwjnt40000gq/T/mml-natives618985592717473613/lib_lightgbm.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e')), '/usr/lib/lib_lightgbm.dylib' (no such file)
	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1950)
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1832)
	at java.lang.Runtime.load0(Runtime.java:811)
	at java.lang.System.load(System.java:1088)
	at com.microsoft.azure.synapse.ml.core.env.NativeLoader.loadLibraryByName(NativeLoader.java:66)
	at com.microsoft.azure.synapse.ml.lightgbm.LightGBMUtils$.initializeNativeLibrary(LightGBMUtils.scala:39)
	at com.microsoft.azure.synapse.ml.lightgbm.LightGBMBase.trainLightGBM(LightGBMBase.scala:356)

I think the fix should be as simple as adding the appropriate library inside lightgbmlib-3.2.110.jar.
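
For triage, the architecture pair can be pulled straight out of the dlopen message in the log above. This is a minimal Python sketch, assuming the message keeps the `have '...', need '...'` wording shown in this report; `parse_arch_mismatch` is a hypothetical helper name, not part of LightGBM or SynapseML.

```python
import re

# Matches the architecture hint that dlopen printed in the log above.
_ARCH_HINT = re.compile(r"have '([^']+)', need '([^']+)'")


def parse_arch_mismatch(message):
    """Return (have, need) from a dlopen error message, or None if absent."""
    match = _ARCH_HINT.search(message)
    return (match.group(1), match.group(2)) if match else None


error = ("mach-o file, but is an incompatible architecture "
         "(have 'x86_64', need 'arm64e')")
print(parse_arch_mismatch(error))  # prints ('x86_64', 'arm64e')
```

A result of `None` means the failure is probably not an architecture mismatch and the rest of the message is worth reading first.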

@StrikerRUS
Collaborator

@xelax Hey, sorry about the problem you are having! Next time, please link to the earlier discussion that led to the issue: microsoft/SynapseML#1405

@xelax
Author

xelax commented Feb 28, 2022

OK, sorry about that. I thought it was not essential for the report. I will add a cross-reference in that bug report as well, pointing to this one.

@StrikerRUS
Collaborator

We have no communication channels with the SynapseML maintainers other than public conversations on GitHub, so cross-references are really important for "connecting" us. We do not maintain the com.microsoft.azure.synapse.ml.lightgbm package in this repo, but the SynapseML maintainers have already redirected you here because they think this is an upstream issue on our side.

@mhamilton723
Contributor

Hey @StrikerRUS, I suggested that @xelax raise this issue because it seems like it will require a change to the actual jar that LightGBM publishes. In particular, native libraries for this macOS architecture need to be added on the LightGBM jar creation side. Do you know how he can get started with trying to make that change?

@imatiach-msft
Contributor

@StrikerRUS I think @mhamilton723 is correct: we publish the jar with Linux/macOS/Windows native files built by the LightGBM repository. However, I'm not sure how to resolve this issue. I see a similar issue here:
#4843
Adding a different architecture from the current build step sounds like a lot of work though, I don't even know if it is possible.

@imatiach-msft
Contributor

Adding some screenshots of the generated artifacts we use:
image

This builds the swig dylib file for macos:
image

The artifacts can be found here:
image

I use these artifacts to create the lightgbm jar that is published to maven:
image
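
To see which architectures a jar built from such artifacts actually ships, the Mach-O headers of its `.dylib` entries can be read directly. The sketch below is a rough illustration, assuming only thin 64-bit or universal (fat) Mach-O files are present. The magic numbers and CPU type constants come from Apple's Mach-O headers; the function names are hypothetical.

```python
import struct
import zipfile

# CPU type constants as defined in Apple's <mach/machine.h>.
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}
MH_MAGIC_64 = 0xFEEDFACF  # thin 64-bit Mach-O, little-endian on disk
FAT_MAGIC = 0xCAFEBABE    # universal (fat) binary, big-endian header


def macho_architectures(data):
    """Return the architectures named in a Mach-O header ([] if not Mach-O)."""
    if len(data) < 8:
        return []
    if struct.unpack_from("<I", data, 0)[0] == MH_MAGIC_64:
        cputype = struct.unpack_from("<I", data, 4)[0]
        return [CPU_TYPES.get(cputype, hex(cputype))]
    if struct.unpack_from(">I", data, 0)[0] == FAT_MAGIC:
        count = struct.unpack_from(">I", data, 4)[0]
        if 8 + 20 * count > len(data):
            return []  # truncated or not really a fat header
        archs = []
        for i in range(count):
            # Each fat_arch record is 20 bytes and starts with the cputype.
            cputype = struct.unpack_from(">I", data, 8 + 20 * i)[0]
            archs.append(CPU_TYPES.get(cputype, hex(cputype)))
        return archs
    return []


def dylibs_in_jar(jar):
    """Map each .dylib entry in a jar (path or file object) to its architectures."""
    with zipfile.ZipFile(jar) as zf:
        return {name: macho_architectures(zf.read(name))
                for name in zf.namelist() if name.endswith(".dylib")}
```

Calling `dylibs_in_jar` on a local copy of the published jar should, given the error in this report, list only `x86_64` for `lib_lightgbm.dylib`.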

@imatiach-msft
Contributor

my guess is something about this task/vm image needs to be modified:

vmImage: 'macOS-10.15'

@xelax, is there some kind of Mac image with this arm64e architecture? I'm not sure that is even possible.

@StrikerRUS
Collaborator

Hey guys @mhamilton723, @imatiach-msft 🙋‍♂️

> Do you know how he can get started with trying to make that change?

Unfortunately, not.

> Adding a different architecture from the current build step sounds like a lot of work though, I don't even know if it is possible.

> my guess is something about this task/vm image needs to be modified:

To the best of my knowledge, no free CI service offers ARM64 macOS right now. So we cannot simply add a new CI job.

Probably, cross-compilation is a workaround. But I don't know much about it and unfortunately don't have enough time to get familiar with it right now. Also, I heard cross compilation for ARM64 macOS has its own challenges.

Some potentially useful links if one has interest to start digging:
dmlc/xgboost#7706
dmlc/xgboost#7621
https://stackoverflow.com/questions/64788005/java-jdk-for-the-apple-m1-chip
https://github.com/conda-forge/lightgbm-feedstock/blob/master/.ci_support/osx_64_python3.9.____cpython.yaml
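
As a starting point for anyone digging into cross-compilation, here is a rough, untested sketch of what the configure step might look like. It only assembles the command and does not run it. `CMAKE_OSX_ARCHITECTURES` is a standard CMake variable and `USE_SWIG` is LightGBM's option for the Java wrapper. The directory names and the helper name are hypothetical. Whether the result links correctly is an open question, since JNI headers and libraries for the target architecture would likely also be needed.

```python
import shlex


def cmake_cross_compile_args(source_dir, build_dir, arch="arm64"):
    """Build (but do not run) a CMake configure command for one macOS architecture."""
    return [
        "cmake",
        "-S", source_dir,
        "-B", build_dir,
        f"-DCMAKE_OSX_ARCHITECTURES={arch}",  # standard CMake variable on macOS
        "-DUSE_SWIG=ON",                      # LightGBM option for the SWIG/Java wrapper
    ]


# Hypothetical checkout and build directories, printed for copy and paste.
print(shlex.join(cmake_cross_compile_args("LightGBM", "LightGBM/build")))
```

Keeping the command as a list makes it easy to pass to `subprocess.run` once the approach has been checked on real hardware.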

@StrikerRUS
Collaborator

@imatiach-msft

> Adding some screenshots of the generated artifacts we use:

You can simply download the nightly builds. They have the files you need.
https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html#nightly-builds

@imatiach-msft
Contributor

"To the best of my knowledge, no free CI service offers ARM64 macOS right now. So we cannot simply add a new CI job."
Oh, I see... I guess this is just not possible then, unless a CI is created for ARM64 macOS. I don't think we should try to do something special just for this architecture. Sorry, @xelax. If you are really, really, really motivated, you could create your own LightGBM jar, upload it to Maven, and then create a custom build of mmlspark (which can be done with any PR) that uses that jar. But that is really a lot of work.
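
The repackaging step of that do-it-yourself route could look roughly like the sketch below, which copies an existing jar and adds one extra native library entry. It is only an illustration: the entry path inside the jar is hypothetical, `add_native_library` is not an existing tool, and the code that loads the native library would still have to know to look for the new entry.

```python
import zipfile


def add_native_library(src_jar, dst_jar, entry_name, library_bytes):
    """Copy src_jar to dst_jar and add (or replace) one native library entry."""
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            # Skip an existing entry with the same name so it is not duplicated.
            if item.filename != entry_name:
                dst.writestr(item, src.read(item.filename))
        dst.writestr(entry_name, library_bytes)
```

Both jar arguments can be file paths or file-like objects, so the function can be tried on in-memory copies before touching a real artifact.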

@StrikerRUS
Collaborator

> I don't think we should try to do something special just for this architecture.

The thing is that this architecture is getting more and more popular. New Macs will be released exclusively with Apple M chips.

The main problem is that none of the maintainers here has a new Mac, so we cannot even check basic workarounds (see, for example, #4843).

@xelax
Author

xelax commented Mar 3, 2022

support for M1 macs in github actions might be coming soon: https://github.com/actions/runner/blob/main/docs/start/envosx.md

@StrikerRUS
Collaborator

> support for M1 macs in github actions might be coming soon:

That will be truly amazing!

However, that doc hasn't been updated since 15 Sep 2021 and the PR they are referring to as a blocker was merged 10 Jun 2021.

@StrikerRUS
Collaborator

In addition, to be clear, "support for M1 macs in github actions" in the runner doesn't mean a free GitHub-hosted Actions runner on an M1 Mac.
actions/runner-images#2187

@StrikerRUS
Collaborator

Some thoughts about cross-compilation from a person more experienced than me.

> Probably, cross-compilation is a workaround. But I don't know much about it and unfortunately don't have enough time to get familiar with it right now. Also, I heard cross compilation for ARM64 macOS has its own challenges.

> I think cross-compiling the JNI binding will be tricky. You'll probably need to have an ARM64 JVM as well as an x86 one to ensure you have the right libraries to link against.
dmlc/xgboost#7501 (comment)

@Vonatzki

Any updates on this? My team is interested in testing SynapseML in an EMR Cluster composed of Graviton2 nodes.

@jameslamb
Collaborator

Thanks for using LightGBM. If you're saying you have access to an environment with an arm64e architecture and are willing to work on proposing fixes to allow compiling LightGBM in such environments, we'd welcome the help!

Otherwise, click "subscribe" on this issue to be notified of updates.
