XLA doesn't support Mac ARM #217
Do you have the full Bazel build output? It might be something we can try to troubleshoot, or we can open this as an issue upstream. @josevalim On that note, perhaps we should also add some options for logging build outputs. It would probably make it easier for users to pass these issues up for us to try and debug.
I'm pretty sure that Mac ARM support just doesn't yet exist for XLA but will soon enough. You can see various Jax issues and discussions about the topic (as well as for PyTorch XLA).
Thanks @jeffreyksmithjr! Btw, we will make the project public today, so if you get any e-mail related to your org access, that's the reason. :)
I'm also on an ARM Mac and got a similar error. Inspecting the libexla.so file I get:
and I think the reason is that my Bazel is an x86_64 binary as well:
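A quick way to check this kind of mismatch is to compare the architecture your shell is running as against the architecture a binary was built for. This is only a sketch; `/bin/sh` below is a placeholder, so substitute the path to your actual Bazel binary (e.g. `"$(command -v bazel)"`):

```shell
#!/bin/sh
# Print the architecture the current OS/shell reports.
# On an Apple Silicon Mac this should say arm64; x86_64 here means
# the process is running under Rosetta.
uname -m

# `file` reports the architecture a binary was compiled for.
# Replace /bin/sh with the path to your Bazel binary.
file -L /bin/sh
```

If `uname -m` says `arm64` but `file` reports your Bazel as `x86_64`, the toolchain mismatch described above is the likely culprit.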
Looks like support for ARM Macs landed very recently via
I may have been able to make some progress:
Credit: bazelbuild/bazel#12900 (comment)
results in a different error:
@wojtekmach Can you try
For anybody interested in attempting to resolve this in some way, it seems it might be possible to fix. I don't have a Mac to test on, but my recommendation is:

First, per the issue above, install x86_64 Bazel 3.7.1 through Rosetta.

Next, change

Finally, set
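Sketched as commands, the Rosetta part of that recommendation might look like the following. This is entirely hypothetical and untested here; `softwareupdate` and `arch` exist only on macOS, so the script guards itself and does nothing elsewhere:

```shell
#!/bin/sh
# Hypothetical sketch of the Rosetta route described above.
# The macOS-only commands are guarded so this is a no-op on other systems.
if [ "$(uname)" = "Darwin" ]; then
  # One-time setup: install Rosetta 2 so x86_64 binaries can run on Apple Silicon.
  softwareupdate --install-rosetta --agree-to-license
  # Run the Intel (x86_64) Bazel binary through Rosetta explicitly.
  arch -x86_64 bazel version
else
  echo "not macOS; nothing to do"
fi
```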
@seanmor5 Tried that, but I don't think that version of TensorFlow likes the Makefile patches:
@behe I know what's going on. In #247 we "removed" the NumPy dependency by commenting out some things in the TF build script. They changed the script recently, so we'll need to update what we're doing with a new commit. What you'll want to do is remove
I got the same error as above; interestingly, running the exact same version of TensorFlow from a local checkout allowed me to move forward:

```diff
diff --git a/exla/Makefile b/exla/Makefile
index 0db21c8..2b7864b 100644
--- a/exla/Makefile
+++ b/exla/Makefile
@@ -24,7 +24,8 @@ ERTS_SYM_DIR = $(EXLA_DIR)/erts
 BAZEL_FLAGS = --define "framework_shared_object=false" -c $(EXLA_MODE)
 TENSORFLOW_NS = tf-$(EXLA_TENSORFLOW_GIT_REV)
-TENSORFLOW_DIR = $(EXLA_CACHE)/$(TENSORFLOW_NS)/erts-$(ERTS_VERSION)
+# TENSORFLOW_DIR = $(EXLA_CACHE)/$(TENSORFLOW_NS)/erts-$(ERTS_VERSION)
+TENSORFLOW_DIR = /Users/wojtek/src/tensorflow
 TENSORFLOW_EXLA_NS = tensorflow/compiler/xla/exla
 TENSORFLOW_EXLA_DIR = $(TENSORFLOW_DIR)/$(TENSORFLOW_EXLA_NS)
@@ -48,11 +49,11 @@ PTD:
 $(TENSORFLOW_DIR):
 	mkdir -p $(TENSORFLOW_DIR)
-	cd $(TENSORFLOW_DIR) && \
-		git init && \
-		git remote add origin $(EXLA_TENSORFLOW_GIT_REPO) && \
-		git fetch --depth 1 origin $(EXLA_TENSORFLOW_GIT_REV) && \
-		git checkout FETCH_HEAD
+	# cd $(TENSORFLOW_DIR) && \
+	# 	git init && \
+	# 	git remote add origin $(EXLA_TENSORFLOW_GIT_REPO) && \
+	# 	git fetch --depth 1 origin $(EXLA_TENSORFLOW_GIT_REV) && \
+	# 	git checkout FETCH_HEAD
 	cd $(TENSORFLOW_DIR) && \
 		sed -e '/register_toolchains("@local_config_python\/\/:py_toolchain")/ s/^#*/#/' -i.backup WORKSPACE && \
@@ -63,4 +64,4 @@ $(TENSORFLOW_DIR):
 clean:
 	cd $(TENSORFLOW_DIR) && bazel clean --expunge
 	rm -f $(ERTS_SYM_DIR) $(TENSORFLOW_EXLA_DIR)
-	rm -rf $(EXLA_SO) $(TENSORFLOW_DIR)
+	rm -rf $(EXLA_SO) # $(TENSORFLOW_DIR)
```

After this I got a different error that I'll post soon.
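As an aside, patching the Makefile may not be strictly necessary for the `TENSORFLOW_DIR` part: a Makefile variable assigned with `=` can usually be overridden from the `make` command line without editing the file. A small self-contained demo of that behavior (the variable name matches the one above, but the paths are placeholders):

```shell
#!/bin/sh
# Demo: command-line variable overrides beat `=` assignments inside a Makefile.
tmp="$(mktemp)"
printf 'TENSORFLOW_DIR = /default/cache/path\nshow:\n\t@echo $(TENSORFLOW_DIR)\n' > "$tmp"

# Without an override, the in-file default wins:
make -f "$tmp" show
# With a command-line override, the override wins:
make -f "$tmp" show TENSORFLOW_DIR=/Users/wojtek/src/tensorflow
rm -f "$tmp"
```

Whether this works for the EXLA build depends on how the rest of the Makefile uses the variable, so treat it as an experiment rather than a guaranteed shortcut.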
As I mentioned in the previous post, I was able to make further progress but eventually the compilation crashed with:
Below is the full output:
@seanmor5 @wojtekmach Did you manage to fix the LLVM linking issue? I am having the same problem building XLA for Jax, and it seems like Bazel strips out some of the LLVM dependencies.
I'm stuck at
@wojtekmach Although the above issue looks like a compiler/linker bug, it also looks like you are linking in some TensorFlow core functionality (some TF kernels). Do you need those? For the Python XLA extension used in Jax there is some discussion about that here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/xla/python/BUILD#L539
@jotsif thanks! I tried that and got the following; might be an EXLA issue?
@wojtekmach That is an EXLA issue. If you change the `qr` NIF to the following:

```cpp
ERL_NIF_TERM qr(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[]) {
  if (argc != 3) {
    return exla::nif::error(env, "Bad argument count.");
  }

  xla::XlaOp* operand;
  bool full_matrices;
  int config_int;

  if (!exla::nif::get<xla::XlaOp>(env, argv[0], operand)) {
    return exla::nif::error(env, "Unable to get operand.");
  }
  if (!exla::nif::get(env, argv[1], &full_matrices)) {
    return exla::nif::error(env, "Unable to get full matrices flag.");
  }

  xla::XlaOp q, r;
  xla::QrExplicit(*operand, full_matrices, q, r);

  ERL_NIF_TERM q_term = exla::nif::make<xla::XlaOp>(env, q);
  ERL_NIF_TERM r_term = exla::nif::make<xla::XlaOp>(env, r);

  return exla::nif::ok(env, enif_make_tuple2(env, q_term, r_term));
}
```

it should fix the build. QR will need to be updated in some other places as well, but this is a quick fix for the build on Mac.
Yeah, that allowed me to move forward, but I ended up stuck on the same
I'd be interested to know if anyone has got this to work yet.
Hi @andrewphillipo, I have been tracking some developments in Jax suggesting that this may now be possible on TF master: google/jax#5501 and google/jax#6701. I am willing to put a branch together that upgrades our TF version and adds the flags necessary for building on Mac ARM, but I unfortunately don't have a machine to test on.
@seanmor5 I think we can upgrade the TF version and then let users play with flags. :) Upgrading TF would already be a huge help.
If it helps, I have a Mac mini here you can remote into if you like? Not sure what the best solution is for that, or if you even have time. Also happy to test anything...
@wojtekmach could you have a crack at it one more time?
Does this mean anything to you guys? ;-) https://developer.apple.com/metal/tensorflow-plugin/
For those tracking this issue, #423 uses a new version of TensorFlow which should support Mac ARM. I believe you might need Bazel 4.1. If anybody would like to take a shot at building off of that branch, that would be really appreciated.
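Before trying the branch, it may be worth checking your local Bazel against that 4.1 minimum. A hedged sketch, assuming `bazel` is on your PATH (the version-sorting trick relies on `sort -V`, which is available in GNU coreutils):

```shell
#!/bin/sh
# Hypothetical version gate: warn if the local Bazel is older than 4.1.
required="4.1.0"
installed="$(bazel --version 2>/dev/null | awk '{print $2}')"

if [ -z "$installed" ]; then
  echo "bazel not found on PATH"
elif [ "$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n 1)" = "$required" ]; then
  # required sorts first (or equal), so installed >= required
  echo "bazel $installed is new enough (>= $required)"
else
  echo "bazel $installed is older than $required"
fi
```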
Didn't mean to close.
I tried the following:
Attempting to install EXLA (and thus XLA) on a Mac with an M1 chip (running Big Sur) runs into this issue:
This is really more of an XLA issue than an EXLA issue. Just noting it to capture the negative data around Mac ARM support.