
Errors building tensorflow-text on apple silicon even when using matching versions of tensorflow 2.10 #1077

Open
tsdeng opened this issue Jan 29, 2023 · 54 comments

Comments

tsdeng commented Jan 29, 2023

I'm trying to build tensorflow-text on an M1 Mac because tensorflow-text is not released for Apple silicon.

When compiling with ./oss_scripts/run_build.sh, I see the following errors:

In file included from tensorflow_text/core/kernels/byte_splitter_tflite.cc:18:
external/org_tensorflow/tensorflow/lite/kernels/shim/tflite_op_shim.h:128:35: error: no member named 'OpName' in 'tensorflow::text::ByteSplitterWithOffsetsOp<tflite::shim::Runtime::kTfLite>'
    resolver->AddCustom(ImplType::OpName(), GetTfLiteRegistration());
                        ~~~~~~~~~~^
tensorflow_text/core/kernels/byte_splitter_tflite.cc:28:53: note: in instantiation of member function 'tflite::shim::TfLiteOpKernel<tensorflow::text::ByteSplitterWithOffsetsOp>::Add' requested here
      tensorflow::text::ByteSplitterWithOffsetsOp>::Add(resolver);

I would also love to know if anyone is able to get tensorflow-text 2.10 to work on apple silicon.

Setup and reproduce

Tensorflow is installed via conda

conda install -c apple tensorflow-deps=2.10.0
python -m pip install tensorflow-macos==2.10.0
python -m pip install tensorflow-metal==0.6

Bazel is installed via Homebrew

brew install bazelisk

Tensorflow-text is downloaded from the release page

wget https://github.com/tensorflow/text/archive/refs/tags/v2.10.0.zip
broken (Member) commented Jan 30, 2023

You are not compiling against TF 2.10.0 despite having it installed. That call in tflite_op_shim.h does not exist in versions 2.10 and 2.11, so you must be compiling against nightly.
https://github.com/tensorflow/tensorflow/blob/r2.10/tensorflow/lite/kernels/shim/tflite_op_shim.h#L128

run_build.sh does not run the prepare_tf_dep.sh script when running on Apple silicon. That script pins the TF dependencies to the exact commit of the TF version you are running.

Can you try running ./oss_scripts/prepare_tf_dep.sh manually? I wonder if it works and the reason we don't run it is just that we couldn't confirm it works, or if there is a fundamental problem with how Apple sets up the library that prevents us from grabbing the correct commit in that script. If it fails, you may want to add set -x to the top of the file to see the lines as they run.
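For anyone debugging this: the same tracing can be obtained without editing the script by invoking it through `bash -x` (a minimal sketch, assuming you run it from the repo root; note the script is normally `source`d, so variables it exports won't persist in your shell this way):

```shell
# Run the script with command tracing on; each executed line is echoed to
# stderr, which we capture to a log file for inspection.
bash -x ./oss_scripts/prepare_tf_dep.sh 2> prepare_tf_dep.trace.log
tail prepare_tf_dep.trace.log
```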

tsdeng commented Jan 31, 2023

@broken thanks for the quick reply.
I'm now hitting a checksum issue, which is probably caused by https://github.blog/changelog/2023-01-30-git-archive-checksums-may-change/

I will update once the checksum issue goes away.

tsdeng commented Feb 1, 2023

Now I'm seeing different errors:

bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/framework/full_type.pb.h:17:2: error: This file was generated by an older version of protoc which is
#error This file was generated by an older version of protoc which is
 ^
bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/framework/full_type.pb.h:18:2: error: incompatible with your Protocol Buffer headers. Please
#error incompatible with your Protocol Buffer headers. Please
 ^
bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/framework/full_type.pb.h:19:2: error: regenerate this file with a newer version of protoc.
#error regenerate this file with a newer version of protoc.

tsdeng commented Feb 1, 2023

The protobuf version is 3.19.6, which is pulled in by tensorflow-macos 2.10.0:

(venv) ➜  text git:(4a098cd) ✗ conda list | grep proto
protobuf                  3.19.6                   pypi_0    pypi

I don't have protoc installed.
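For anyone comparing versions here, the protobuf runtime that pip pulled in can be printed directly (a small sketch; it only assumes the standard `pip show` output format):

```shell
# Print just the Version line of the installed protobuf runtime.
python3 -m pip show protobuf | sed -n 's/^Version: //p'
```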

ethiel commented Feb 1, 2023

Same issue here, in my case with tensorflow-macos 2.11.0 and protobuf 3.19.4.

tsdeng commented Feb 1, 2023

@ethiel how did you get a higher version of tensorflow-macos and a lower version of protobuf? Is the protobuf dependency pulled by tensorflow-macos? Or did you install it separately?

ethiel commented Feb 1, 2023

It was pulled from tensorflow. I did not install any library.

ethiel commented Feb 1, 2023

However, after manually executing prepare_tf_dep.sh I'm facing a different issue:
error: no member named 'OpName' in 'tensorflow::text::FastBertNormalizeOp<tflite::shim::Runtime::kTfLite>'

ethiel commented Feb 1, 2023

I'm afraid we will have to wait for Apple to provide the port for tensorflow-text... I guess there is no way to compile from source, as tensorflow-macos was built with different versions and a different architecture.
Anyway, @tsdeng, if you are able to create the wheel, I'll be the first one to say thank you.

broken commented Feb 1, 2023

@ethiel Your error is from mismatched versions. On nightly, we switched the shims from using an OpName member variable to a method. You likely need to check out the TF Text branch of the version you want to build and run prepare_tf_dep, or something similar.

@tsdeng so did prepare_tf_dep.sh run fine? If so, I don't see any reason we shouldn't be running it.

@tsdeng Ugh.. This does look closer to a blocker. I have no clue what protoc tensorflow-macos uses.

If you are tenacious, you can try reaching out to their GitHub account and ask what version they are using, and update our WORKSPACE file with that version. Or more quickly: I know TF Text v2.9 was successfully built (https://github.com/sun1638650145/Libraries-and-Extensions-for-TensorFlow-for-Apple-Silicon/releases), and it looks like that used version 3.9.2 (https://github.com/tensorflow/tensorflow/blob/r2.9/tensorflow/workspace2.bzl#L457). You can try copying that into our WORKSPACE file to see if it works. Bazel should ignore what TF specifies if TF Text defines it first. Make sure you are in a new directory or have done a bazel clean --expunge (iirc) before rebuilding with this change.

http_archive(
    name = "com_google_protobuf",
    patch_file = ["//third_party/protobuf:protobuf.patch"],
    sha256 = "cfcba2df10feec52a84208693937c17a4b5df7775e1635c1e3baffc487b24c9b",
    strip_prefix = "protobuf-3.9.2",
    system_build_file = "//third_party/systemlibs:protobuf.BUILD",
    system_link_files = {
        "//third_party/systemlibs:protobuf.bzl": "protobuf.bzl",
        "//third_party/systemlibs:protobuf_deps.bzl": "protobuf_deps.bzl",
    },
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.9.2.zip"],
)

I'd remove the patch_file, system_build_file, and system_link_files to see if it works. Those may be TF specific and not needed. Otherwise, you may need to copy those files into our own third_party directory from their r2.9 branch.

Hopefully this helps.

tsdeng commented Feb 2, 2023

@broken I did run prepare_tf_dep.sh and it finished successfully.

After adding the http_archive for com_google_protobuf, the error becomes:

ERROR: /private/var/tmp/_bazel_tianshuo/5b5bef7c172bd140468d2c8b06d622ae/external/com_google_protobuf/BUILD:979:21: in blacklisted_protos attribute of proto_lang_toolchain rule @com_google_protobuf//:cc_toolchain: '@com_google_protobuf//:_internal_wkt_protos_genrule' does not have mandatory providers: 'ProtoInfo'. Since this rule was created by the macro 'proto_lang_toolchain', the error might have been caused by the macro implementation
ERROR: /private/var/tmp/_bazel_tianshuo/5b5bef7c172bd140468d2c8b06d622ae/external/com_google_protobuf/BUILD:979:21: Analysis of target '@com_google_protobuf//:cc_toolchain' failed
ERROR: Analysis of target '//oss_scripts/pip_package:build_pip_package' failed; build aborted:
INFO: Elapsed time: 2.021s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (173 packages loaded, 4429 targets configured)

tsdeng commented Feb 2, 2023

I don't think I have a clear understanding of how the protobuf libraries are used and linked in this case.

It seems bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/framework/full_type.pb.h is downloaded as part of an http archive instead of being generated on my local machine by protoc. So full_type.pb.h must have been generated earlier by a protoc of version 3.9.2 and then packaged into the http archive.

After this http archive is downloaded to my machine, it tries to link against some protobuf runtime library, only finds a newer version, and hence the conflict?

What's weird is that I don't have libprotobuf installed on my machine, so I wonder what it is trying to link against when the error happens. @broken, is my understanding correct?

In my conda environment only protobuf is installed, not libprotobuf:

(venv) ➜  text git:(4a098cd) ✗ conda list | grep protobuf
protobuf                  3.19.6                   pypi_0    pypi

@DuongTSon

@tsdeng The protobuf issue can be fixed by using a patch to make it compatible with Bazel 5.3.0.

http_archive(
    name = "com_google_protobuf",
    strip_prefix = "protobuf-3.9.2",
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.9.2.zip"],
    patch_args = ["-p1"],
    patches = ["//third_party/protobuf:protobuf.patch"]
)
  • protobuf.patch file
diff --git a/BUILD b/BUILD
index dbae719ff..18ad37cc9 100644
--- a/BUILD
+++ b/BUILD
@@ -100,6 +100,7 @@ LINK_OPTS = select({
 
 load(
     ":protobuf.bzl",
+    "adapt_proto_library",
     "cc_proto_library",
     "py_proto_library",
     "internal_copied_filegroup",
@@ -255,13 +256,10 @@ filegroup(
     visibility = ["//visibility:public"],
 )
 
-cc_proto_library(
+
+cc_library(
     name = "cc_wkt_protos",
-    srcs = WELL_KNOWN_PROTOS,
-    include = "src",
-    default_runtime = ":protobuf",
-    internal_bootstrap_hack = 1,
-    protoc = ":protoc",
+    deprecation = "Only for backward compatibility. Do not use.",
     visibility = ["//visibility:public"],
 )
 
@@ -491,6 +489,13 @@ cc_proto_library(
     deps = [":cc_wkt_protos"],
 )
 
+adapt_proto_library(
+    name = "cc_wkt_protos_genproto",
+    deps = [proto + "_proto" for proto in WELL_KNOWN_PROTO_MAP.keys()],
+    visibility = ["//visibility:public"],
+)
+
+
 COMMON_TEST_SRCS = [
     # AUTOGEN(common_test_srcs)
     "src/google/protobuf/arena_test_util.cc",
@@ -981,7 +986,7 @@ proto_lang_toolchain(
     command_line = "--cpp_out=$(OUT)",
     runtime = ":protobuf",
     visibility = ["//visibility:public"],
-    blacklisted_protos = [":_internal_wkt_protos_genrule"],
+    blacklisted_protos = [proto + "_proto" for proto in WELL_KNOWN_PROTO_MAP.keys()],
 )
 
 proto_lang_toolchain(
diff --git a/protobuf.bzl b/protobuf.bzl
index e0653321f..579fa8331 100644
--- a/protobuf.bzl
+++ b/protobuf.bzl
@@ -6,6 +6,29 @@ def _GetPath(ctx, path):
     else:
         return path
 
+def _adapt_proto_library_impl(ctx):
+    deps = [dep[ProtoInfo] for dep in ctx.attr.deps]
+
+    srcs = [src for dep in deps for src in dep.direct_sources]
+    return struct(
+        proto = struct(
+            srcs = srcs,
+            import_flags = ["-I{}".format(path) for dep in deps for path in dep.transitive_proto_path.to_list()],
+            deps = srcs,
+        ),
+    )
+
+adapt_proto_library = rule(
+    implementation = _adapt_proto_library_impl,
+    attrs = {
+        "deps": attr.label_list(
+            mandatory = True,
+            providers = [ProtoInfo],
+        ),
+    },
+    doc = "Adapts `proto_library` from `@rules_proto` to be used with `{cc,py}_proto_library` from this file.",
+)
+
 def _IsNewExternal(ctx):
     # Bazel 0.4.4 and older have genfiles paths that look like:
     #   bazel-out/local-fastbuild/genfiles/external/repo/foo
@@ -229,7 +252,6 @@ def cc_proto_library(
         cc_libs = [],
         include = None,
         protoc = "@com_google_protobuf//:protoc",
-        internal_bootstrap_hack = False,
         use_grpc_plugin = False,
         default_runtime = "@com_google_protobuf//:protobuf",
         **kargs):
@@ -247,10 +269,6 @@ def cc_proto_library(
           cc_library.
       include: a string indicating the include path of the .proto files.
       protoc: the label of the protocol compiler to generate the sources.
-      internal_bootstrap_hack: a flag indicate the cc_proto_library is used only
-          for bootstraping. When it is set to True, no files will be generated.
-          The rule will simply be a provider for .proto files, so that other
-          cc_proto_library can depend on it.
       use_grpc_plugin: a flag to indicate whether to call the grpc C++ plugin
           when processing the proto files.
       default_runtime: the implicitly default runtime which will be depended on by
@@ -263,25 +281,6 @@ def cc_proto_library(
     if include != None:
         includes = [include]
 
-    if internal_bootstrap_hack:
-        # For pre-checked-in generated files, we add the internal_bootstrap_hack
-        # which will skip the codegen action.
-        proto_gen(
-            name = name + "_genproto",
-            srcs = srcs,
-            deps = [s + "_genproto" for s in deps],
-            includes = includes,
-            protoc = protoc,
-            visibility = ["//visibility:public"],
-        )
-
-        # An empty cc_library to make rule dependency consistent.
-        native.cc_library(
-            name = name,
-            **kargs
-        )
-        return
-
     grpc_cpp_plugin = None
     if use_grpc_plugin:
         grpc_cpp_plugin = "//external:grpc_cpp_plugin"

However, after fixing the protobuf issue, I hit another error which I cannot resolve yet:

tensorflow_text/core/ops/regex_split_ops.cc:31:10: error: use of undeclared identifier 'OkStatus'; did you mean 'Status'?
  return OkStatus();
         ^
bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/platform/status.h:43:7: note: 'Status' declared here
class Status {

chrisoesterreichprog commented Feb 8, 2023

This tutorial helped me install Tensorflow-Text on an M1 Mac:
https://medium.com/@murphy.crosby/building-tensorflow-and-tensorflow-text-on-a-m1-mac-9b90d55e92df

  1. Download
    Download these three wheels into your project:
    https://drive.google.com/drive/folders/1eHfUjjb5kOaQ-SZom5ldHROyr5Rwa5mh

  2. Install
    brew install python@3.9
    python3.9 -m venv .venv
    source .venv/bin/activate
    pip install tensorflow_io_gcs_filesystem-0.27.0-cp39-cp39-macosx_13_0_arm64.whl
    pip install tensorflow-2.10.1-cp39-cp39-macosx_13_0_arm64.whl
    pip install tensorflow_text-2.10.0-cp39-cp39-macosx_11_0_arm64.whl
    pip install tensorflow-metal==0.6.0

DuongTSon commented Feb 8, 2023

Finally, I have successfully built both tensorflow-text 2.9 and 2.11. The tensorflow-macos version and the tensorflow-text version must match exactly. If you build text v2.11, you need to check out the 2.11 branch and install tensorflow-macos==2.11.

  1. Install bazelisk
brew install bazelisk
  2. Remove the auto-configuration in the oss_scripts/run_build.sh file. Currently, it updates the version to the latest version of Tensorflow, which causes the major incompatibility.
# Remove or disable the lines below
# Set tensorflow version
if [[ $osname != "Darwin" ]] || [[ ! $(sysctl -n machdep.cpu.brand_string) =~ "Apple" ]]; then
  source oss_scripts/prepare_tf_dep.sh
fi
  3. Create a virtual env with the Anaconda ARM version. For example, with env name tf2.11:
conda create --name tf2.11 python=3.10
conda activate tf2.11
  4. Install tensorflow-macos and tensorflow-metal
pip install tensorflow-macos==2.11.0
pip install tensorflow-metal==0.7.1
  5. Run the build
./oss_scripts/run_build.sh

ethiel commented Feb 8, 2023

Thanks for the guide, @DuongTSon. Where did you find tensorflow-deps==2.11.0? I tried, but the most recent version in my channels is 2.9.0.

broken commented Feb 9, 2023

This is great @DuongTSon!

With step 2, are you saying the conditional is not failing, so it's executing prepare_tf_dep.sh improperly? Or that the conditional executes that line when it shouldn't, so it needs to be commented out? I'm trying to figure out what we need to fix so you no longer need to worry about it.

@DuongTSon

@ethiel Sorry, my mistake. Actually, we do not need to install tensorflow-deps to build tensorflow-text on M1 macOS. I've updated my answer!

@DuongTSon

@broken To fix that line: the $osname condition should compare against lowercase "darwin".

# Set tensorflow version
if [[ $osname != "darwin" ]] || [[ ! $(sysctl -n machdep.cpu.brand_string) =~ "Apple" ]]; then
  source oss_scripts/prepare_tf_dep.sh
fi

I have tested this version. It works well on a Mac M1 now.

ethiel commented Feb 10, 2023

@DuongTSon thanks. I didn't install it anyway, because I couldn't find it.
I'm able to build from source, but I can't use it. This is the error when I try to use it in a simple project:

tensorflow.python.framework.errors_impl.NotFoundError: dlopen(/Users/ethiel/miniconda3/envs/python310-tensorflow/lib/python3.10/site-packages/tensorflow_text/python/ops/_regex_split_ops.dylib, 0x0006): malformed trie child, cycle to nodeOffset=0x2
weak-def symbol not found (__ZN10tensorflow11register_op19OpDefBuilderWrapper10SetShapeFnENSt3__18functionIFN3tsl6StatusEPNS_15shape_inference16InferenceContextEEEE)

I guess the issue is Python 3.10.

Python 3.9 with tensorflow-macos 2.9.0 works fine. The issue seems to be the mix of tensorflow-macos 2.11.0 and the Python version.

@mridulrao

Hey! I was able to create a wheel for tensorflow-text==2.10.0 and install tensorflow-macos==2.10.0. But when I try to import them in a Jupyter notebook, it throws this error: dlopen(/Users/kawaii/opt/miniconda3/envs/transformers_2/lib/python3.10/site-packages/tensorflow_text/python/ops/_regex_split_ops.dylib, 0x0006): malformed trie child, cycle to nodeOffset=0x2 weak-def symbol not found (__ZN10tensorflow11register_op19OpDefBuilderWrapper10SetShapeFnENSt3__18functionIFNS_6StatusEPNS_15shape_inference16InferenceContextEEEE)

Any idea how to solve it?

@DuongTSon

@mridulrao it's a tensorflow-metal version issue. You can use tensorflow-metal==0.6.0; I have tested it with tensorflow-text 2.10 and it works well.

ethiel commented Feb 11, 2023

@DuongTSon Did you build 2.10 from source? I can't; the build fails with this error:
Compiling tensorflow_text/core/ops/constrained_sequence_op.cc failed: undeclared inclusion(s) in rule '//tensorflow_text:constrained_sequence_op_cc': this rule is missing dependency declarations for the following files included by 'tensorflow_text/core/ops/constrained_sequence_op.cc': 'external/com_google_absl/absl/status/status.h' 'external/com_google_absl/absl/status/internal/status_internal.h'

@mridulrao

@ethiel I was getting the same issue but was able to solve it by using tensorflow==2.10.0 and bazel==5.1.1

mridulrao commented Feb 12, 2023

@DuongTSon I am still getting the same error after downgrading the metal version. I removed tensorflow-metal from the environment but still got the error. The error comes from the import tensorflow_text line.

DuongTSon commented Feb 12, 2023

@mridulrao
I have built 2.10 from source and tested the Transformer without any issue. I guess it's the Python version; miniconda might lack some functions.

My build environment is like the following

  • Mac Monterey
  • Xcode 13.4.1
  • Clang 13.1.6
  • Anaconda 22.11.1 (M1)
  • tensorflow-macos==2.11.0
  • tensorflow-metal==0.6.0
  • Python 3.10 (anaconda)
  • tensorflow-text source code (checked out at the 2.10 branch)

@mridulrao

@DuongTSon
Okay! I will try your system configuration. Also, if possible, can you help me understand the Transformer architecture and solve a few bugs? I have been following the transformer tutorial and used the MuRIL encoder.

@alanlomeli

Could I use a precompiled wheel for macOS 13, and if so, could someone provide one?
I followed @DuongTSon's instructions, but the build gets stuck :(

@DuongTSon

Could I use a precompiled wheel for macOS 13, and if so, could someone provide one? I followed @DuongTSon's instructions, but the build gets stuck :(

I am not sure whether my builds on macOS Monterey will work on your computer, but you can try. There are three versions of tensorflow-text (2.9, 2.10, 2.11) at the link below.
https://1drv.ms/u/s!AmRjIZct7QZDg9xM-rf_kE_svoEcJA?e=IvpE7R

@DuongTSon

@DuongTSon Okay! I will try your system configuration. Also, if possible, can you help me understand the Transformer architecture and solve a few bugs? I have been following the transformer tutorial and used the MuRIL encoder.

I just tried the Transformer with tensorflow-text from this tutorial: https://www.tensorflow.org/text/tutorials/transformer. I haven't used the MuRIL encoder!

broken added a commit that referenced this issue Feb 28, 2023
…for more details.

PiperOrigin-RevId: 511829171
broken commented Feb 28, 2023

@broken To fix that line: the $osname condition should compare against lowercase "darwin".

# Set tensorflow version
if [[ $osname != "darwin" ]] || [[ ! $(sysctl -n machdep.cpu.brand_string) =~ "Apple" ]]; then
  source oss_scripts/prepare_tf_dep.sh
fi

I have tested this version. It works well on a Mac M1 now.

@DuongTSon I had somebody with an M1 test this, and uname -s returns "Darwin" (with a capital D). By lower-casing it, the prepare_tf_dep.sh script is now being run. However, your original steps said to comment all of it out, which seems contradictory. Do you know if this script should be run or not?
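One way to reconcile the two observations would be to normalize the case before comparing, so the guard behaves the same whether `uname -s` reports "Darwin" or "darwin" (a sketch only, not the fix that actually landed):

```shell
# Lower-case the OS name so "Darwin" and "darwin" both match the guard.
osname="$(uname -s | tr '[:upper:]' '[:lower:]')"
if [[ $osname != "darwin" ]] || [[ ! $(sysctl -n machdep.cpu.brand_string) =~ "Apple" ]]; then
  source oss_scripts/prepare_tf_dep.sh
fi
```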

vedmant commented Mar 1, 2023

@ethiel For me compilation doesn't work with version 2.10.0; I get this error:

++ sed -E -i '""' 's/strip_prefix = "tensorflow-2.+",/strip_prefix = "tensorflow-      <span class="sha-block m-0">commit <span class="sha user-select-contain">53ce211</span></span>",/' WORKSPACE
sed: 1: "s/strip_prefix = "tenso ...": bad flag in substitute command: 's'

The problem seems to start in oss_scripts/prepare_tf_dep.sh:
python3 -c 'print(__import__("tensorflow").__git_version__)' returns unknown, and I think the script is supposed to get a git version.
Then it tries to get commit_sha from GitHub:
commit_sha=$(curl -SsL https://github.com/tensorflow/tensorflow/commit/${short_commit_sha} | grep sha-block | grep commit | sed -e 's/.*\([a-f0-9]\{40\}\).*/\1/') which returns HTML instead: <span class="sha-block m-0">commit <span class="sha user-select-contain">b5c7f1f</span></span>

So the question is: why does python3 -c 'print(__import__("tensorflow").__git_version__)' return unknown?
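For context, the failing step boils down to extracting a short commit hash from `__git_version__` (a sketch; the exact parsing in prepare_tf_dep.sh may differ). tensorflow-macos is not built from a tagged TensorFlow git tree, so it reports `unknown` and there is no hash to extract:

```shell
# __git_version__ normally looks like "v2.10.0-rc3-23-g53ce211"; the suffix
# after "-g" is the short commit hash. With tensorflow-macos it is the
# literal string "unknown", so the extraction yields "unknown" instead.
git_version=$(python3 -c 'print(__import__("tensorflow").__git_version__)')
short_commit_sha=$(echo "$git_version" | sed 's/.*-g//')
echo "$short_commit_sha"
```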

DuongTSon commented Mar 1, 2023

@broken Sorry for the confusion. In the tutorial, I meant that we need to disable the prepare_tf_dep.sh script when compiling on macOS M1, because it updates the tensorflow version in the WORKSPACE file to the latest one, which causes the incompatibility.

About the lowercase darwin: I thought you had also tried to avoid updating the tensorflow version with the $osname condition check, but somehow the condition did not capture the correct osname on macOS Monterey. So I tried to fix that with the lowercase darwin.

In summary, to make the build work on macOS M1, we need to disable prepare_tf_dep.sh one way or another. Either of the approaches I listed above works for me.

vedmant commented Mar 1, 2023

@DuongTSon OK, I was able to build it, but now there is an error when I try to import it:

 from tensorflow_text.core.pybinds import tflite_registrar
ImportError: dlopen(/Users/vedmant/miniconda3/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so, 0x0002): tried: '/Users/vedmant/miniconda3/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))

However, the build output clearly mentions arm64:

copying tensorflow_text/core/pybinds/tflite_registrar.so -> build/lib.macosx-11.1-arm64-cpython-310/tensorflow_text/core/pybinds
copying build/lib.macosx-11.1-arm64-cpython-310/tensorflow_text/core/pybinds/tflite_registrar.so -> build/bdist.macosx-11.1-arm64/wheel/tensorflow_text/core/pybinds

broken commented Mar 1, 2023

@DuongTSon Got it; thanks.

@vedmant The package you are trying to import was built for the wrong architecture. Some ideas:

  • Did you pip install your built package?
  • Does that shared object file have the expected creation date?
  • Are you importing on the same computer you built tf text on?

vedmant commented Mar 1, 2023

@broken Yes to all questions: installed with pip, same computer, same environment, same date. Even if I try to install just tensorflow_text==2.10.0 I get ERROR: Could not find a version that satisfies the requirement tensorflow_text==2.10.0 (from versions: none), so I could not have installed an x86_64 version in any case.

broken commented Mar 1, 2023

That's very strange. If you are positive that it's the lib you built, then you must be building it for the wrong architecture somehow. Your best bet is to Google how to ensure you are building for the right architecture. For example, this page suggests you may have multiple architecture implementations of LLVM installed; check your clang target (clang -v), and possibly set ARCHPREFERENCE (if it is used).

But first, to double-check that it is the build, I would unzip your package (just rename it with a .zip extension and unzip), and then use the file command (file tflite_registrar.so) to verify the libraries were built for the right architecture. Then you will know for certain whether your package was built incorrectly or there was an installation issue.

copybara-service bot pushed a commit that referenced this issue Mar 2, 2023
broken added a commit that referenced this issue Mar 2, 2023
tsdeng commented Mar 3, 2023

For whoever hits this problem: here's the solution for everybody using macOS 13 who wants to build tensorflow-text. This is the kind of thing that should not be this hard.
Thanks to @DuongTSon and https://medium.com/@murphy.crosby/building-tensorflow-and-tensorflow-text-on-a-m1-mac-9b90d55e92df

Please make sure:

  1. Use conda, because you will need tensorflow-metal, tensorflow-deps and tensorflow-macos
  2. Download Xcode 13.1 from here. Even though you won't be able to install Xcode 13.1, you still need its older version of ld to work around the issue mentioned here

Following are the steps.

Create a conda environment:

conda create -p ./venv python=3.10
conda activate ./venv

Install tensorflow macOS dependencies:

conda install -c apple tensorflow-deps=2.10.0
python -m pip install tensorflow-macos==2.10.0
python -m pip install 'tensorflow-metal==0.6.0'

Clone the tensorflow-text repo and check out the 2.10 branch:

git clone https://github.com/tensorflow/text.git
git checkout 2.10

Comment out the following lines in oss_scripts/run_build.sh:

# if [[ $osname != "darwin" ]] || [[ ! $(sysctl -n machdep.cpu.brand_string) =~ "Apple" ]]; then
#  source oss_scripts/prepare_tf_dep.sh
#fi

Back up your ld and replace it with the older version of ld from Xcode 13.1. Here I assume you downloaded Xcode 13.1 to your ~/Downloads folder.

sudo mv /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld ./ld.backup
sudo cp ~/Downloads/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/

Run ld -v to make sure your ld version is ld64-711.

Now run ./oss_scripts/run_build.sh and the wheel will be produced.

I really, really hope the tensorflow team can improve the dev experience on Apple silicon, given the huge number of devs using Macs and that all new Macs are on Apple silicon.

vedmant commented Mar 3, 2023

@tsdeng Thanks, I'll try this. Can you upload the built whl file? Maybe I can just install it.

Actually, I was able to build this way as well, using all the steps in your example. However, I get the same error when I try to run it:

ImportError: dlopen(/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so, 0x0002): tried: '/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))

tsdeng commented Mar 3, 2023

@vedmant I will upload a wheel tonight. Your environment seems to have some issues; in no way should you get x86_64. Are you unintentionally running under Rosetta? Did you install the Apple silicon version of conda?

vedmant commented Mar 3, 2023

@tsdeng Not running Rosetta; I created the environment with conda create -p ./venv python=3.10 using miniconda3. If I run python -c "import platform; print(platform.machine());" it returns arm64 in this environment.
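Since `platform.machine()` reports the architecture of the running process, it can also be worth checking what the interpreter binary itself was compiled for (a small sketch for macOS; `file` will name the slices in a universal binary):

```shell
# What the kernel reports for the current process:
uname -m
# What the python3 binary on PATH was actually compiled for:
file "$(command -v python3)"
```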

tsdeng commented Mar 4, 2023

@vedmant I am using miniforge3, which has packages that better support Apple silicon. I would suggest you give miniforge a try.

Here is the wheel I built.

vedmant commented Mar 6, 2023

@tsdeng Thanks, this worked, but I get a different error now:

Metal device set to: Apple M1 Pro

systemMemory: 16.00 GB
maxCacheSize: 5.33 GB

2023-03-05 20:20:07.106507: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-03-05 20:20:07.106915: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2023-03-05 20:20:20.261844: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
2023-03-05 20:20:22.996269: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
2023-03-05 20:20:23.221834: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
Traceback (most recent call last):
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1613, in _call_impl
    return self._call_with_structured_signature(args, kwargs,
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1691, in _call_with_structured_signature
    self._structured_signature_check_missing_args(args, kwargs)
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1710, in _structured_signature_check_missing_args
    raise TypeError(f"{self._structured_signature_summary()} missing "
TypeError: signature_wrapper(*, text) missing required arguments: text.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/vedmant/Projects/_Project/WF-01-project/project-backend/ML/predicting.py", line 77, in <module>
    output = predicting(title=args.title, model_path=args.model_path, label_dict=args.label_dict)
  File "/Users/vedmant/Projects/_Project/WF-01-project/project-backend/ML/predicting.py", line 61, in predicting
    pred = predicts(str(title), model, decode=dec)
  File "/Users/vedmant/Projects/_Project/WF-01-project/project-backend/ML/util.py", line 61, in predicts
    return decode[np.argmax(infer(tf.constant(x))['classifier'].numpy()[0])]
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1604, in __call__
    return self._call_impl(args, kwargs)
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1617, in _call_impl
    return self._call_with_flat_signature(args, kwargs,
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1671, in _call_with_flat_signature
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/saved_model/load.py", line 138, in _call_flat
    return super(_WrapperFunction, self)._call_flat(args, captured_inputs,
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1862, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 499, in call
    outputs = execute.execute(
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.NotFoundError: Graph execution error:

No registered 'AddN' OpKernel for 'GPU' devices compatible with node {{node StatefulPartitionedCall/model/preprocessing/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/bert_pack_inputs/PartitionedCall/RaggedConcat/ArithmeticOptimizer/AddOpsRewrite_Leaf_0_add_2}}
	 (OpKernel was found, but attributes didn't match) Requested Attributes: N=2, T=DT_INT64, _XlaHasReferenceVars=false, _grappler_ArithmeticOptimizer_AddOpsRewriteStage=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"
	.  Registered:  device='XLA_CPU_JIT'; T in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, 16534343205130372495, DT_COMPLEX128, DT_HALF, DT_UINT32, DT_UINT64, DT_VARIANT]
  device='GPU'; T in [DT_FLOAT]
  device='DEFAULT'; T in [DT_INT32]
  device='CPU'; T in [DT_UINT64]
  device='CPU'; T in [DT_INT64]
  device='CPU'; T in [DT_UINT32]
  device='CPU'; T in [DT_UINT16]
  device='CPU'; T in [DT_INT16]
  device='CPU'; T in [DT_UINT8]
  device='CPU'; T in [DT_INT8]
  device='CPU'; T in [DT_INT32]
  device='CPU'; T in [DT_HALF]
  device='CPU'; T in [DT_BFLOAT16]
  device='CPU'; T in [DT_FLOAT]
  device='CPU'; T in [DT_DOUBLE]
  device='CPU'; T in [DT_COMPLEX64]
  device='CPU'; T in [DT_COMPLEX128]
  device='CPU'; T in [DT_VARIANT]

	 [[StatefulPartitionedCall/model/preprocessing/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/bert_pack_inputs/PartitionedCall/RaggedConcat/ArithmeticOptimizer/AddOpsRewrite_Leaf_0_add_2]] [Op:__inference_signature_wrapper_34568]
2023-03-05 20:20:23.893251: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.

It works, however, if I run on CPU only.

@tsdeng
Author

tsdeng commented Mar 7, 2023

This is a known issue: https://developer.apple.com/forums/thread/711402
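Until that is resolved upstream, one workaround sketch is to hide the Metal GPU from TensorFlow entirely, so ops without a registered GPU kernel (like the int64 AddN above) fall back to CPU kernels. This uses the standard tf.config API rather than anything specific to tensorflow-text, and must run before any op touches the GPU:

```python
import tensorflow as tf

# Hide the Metal PluggableDevice so every op runs on CPU kernels.
# This must execute before the GPU is initialized by any op.
tf.config.set_visible_devices([], "GPU")

print(tf.config.get_visible_devices("GPU"))  # -> []
```

Alternatively, wrapping only the inference call in `with tf.device('/cpu:0'):` keeps the GPU available for the rest of the model.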

@mhaas

mhaas commented Mar 13, 2023

@DuongTSon OK, I was able to build it, but now there is an error when I try to import:

 from tensorflow_text.core.pybinds import tflite_registrar
ImportError: dlopen(/Users/vedmant/miniconda3/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so, 0x0002): tried: '/Users/vedmant/miniconda3/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))

I was seeing the same problem (just with arm64 instead of arm64e) and I was reasonably sure that my environment is set up correctly. I fixed/worked around this by explicitly adding --cpu=darwin_arm64 to my .bazelrc:

# Settings for MacOS on ARM CPUs.
build:macos_arm64 --cpu=darwin_arm64
build:macos_arm64 --macos_minimum_os=11.0

Note that the .bazelrc is regenerated every time you call oss_scripts/run_build.sh. To prevent this, I commented out the call to the configure.sh script like this after running for the first time:


# Run configure.
# source oss_scripts/configure.sh

My Python 3.10 is installed via brew and not via conda. That might make a difference and explain why only some people see this error - and why it eventually worked for @vedmant after switching to a different Python binary.
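If you're unsure which architecture a given Python binary actually is (brew and conda installs can differ, and an x86_64 interpreter under Rosetta will produce x86_64 extensions), a quick stdlib check — nothing here is specific to tensorflow-text:

```python
import platform
import sysconfig

# 'arm64' means a native Apple silicon interpreter; 'x86_64' means the
# interpreter (and anything built against it) runs under Rosetta.
print(platform.machine())

# The platform tag native extensions/wheels get built for,
# e.g. 'macosx-11.0-arm64'.
print(sysconfig.get_platform())
```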

Additionally, to make the resulting wheel work, I had to install the older ld as described above.

With these two tweaks, the guide by @DuongTSon above works fine.

@qitao052

qitao052 commented Apr 5, 2023

I'm hitting a problem when importing tensorflow_text, shown below:

--> from tensorflow_text.core.pybinds import tflite_registrar
ImportError: cannot import name 'tflite_registrar' from 'tensorflow_text.core.pybinds' (unknown location)

Could anyone help me figure out the problem? Thanks a lot.

my environment:
MacOS 13.3
conda environment
Python 3.10.10
tensorflow 2.12.0

Below is how I installed the library:
Step 1:
git clone https://github.com/tensorflow/text.git
cd text
git checkout 2.12

Step 2: install bazel (version 5.3.0) with
./oss_scripts/install_bazel.sh

Step 3: I modified run_build.sh to specify version 5.3.0:

if [ "$installed_bazel_version" != "5.3.0" ]; then

since the auto-generated .bazelversion contains a note and always tells me to install bazel 5.3.0:
5.3.0
NOTE: Update Bazel version in tensorflow/tools/ci_build/release/common.sh.oss

Step 4: generate the wheel; it succeeded with several warnings.
./oss_scripts/run_build.sh

Step 5: pip install tensorflow_text-2.12.0-cp310-cp310-macosx_11_0_arm64.whl
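One thing worth ruling out for the "unknown location" import error (an assumption on my part, not confirmed for this setup): if you launch Python from inside the cloned text/ directory, import tensorflow_text can resolve to the source tree, which contains no built pybinds. You can check where a package actually resolves from with the stdlib — json is used below only so the snippet runs anywhere; substitute tensorflow_text in your environment:

```python
import importlib.util

# Substitute "tensorflow_text" for "json" in your environment; "json" is
# used here only so this runs without tensorflow_text installed.
spec = importlib.util.find_spec("json")
print(spec.origin)  # the file that would be imported; it should point into
                    # site-packages, not into your source checkout
```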


@josharian

For anyone else still hitting this, the fix for me was to edit oss_scripts/run_build.sh and add --config="macos_arm64" to bazel build (i.e. before --enable_runfiles ...).
