
AttributeError when calling model.fit() with AdamW optimizer on Apple Silicon #176

Closed
anton-bogomazov opened this issue Jun 13, 2023 · 12 comments

@anton-bogomazov

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Ventura Version 13.4 (22F66), Apple M1 Pro
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.14.0-dev20230612
  • Python version: 3.11.1
  • Bazel version (if compiling from source): n/a
  • GPU model and memory: n/a
  • Exact command to reproduce: check below

Describe the problem.
Calling model.fit() on a model compiled with the AdamW optimizer raises an AttributeError. On Apple Silicon, Keras tries to fall back to the legacy version of each optimizer, but no legacy version of AdamW exists.
https://github.com/keras-team/keras/blob/5849a0953a644bd6af51b672b32a235510d4f43d/keras/optimizers/__init__.py#LL300C1-L315C59

Same issue description: https://developer.apple.com/forums/thread/731019

Describe the current behavior.
model.fit() fails with AttributeError: 'str' object has no attribute 'minimize' when the model is compiled with the AdamW optimizer.

Describe the expected behavior.
Check whether a legacy version of the optimizer exists; if it does not, skip the fallback, keep the standard version, and print a warning. A sketch of this check is shown below.
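
A minimal sketch of the proposed check (hypothetical: the helper name _maybe_fall_back_to_legacy and the from_config round-trip are assumptions for illustration; the real fallback logic lives in keras/optimizers/__init__.py and differs in detail):

import logging
import platform

from keras.optimizers import legacy as legacy_optimizers

def _maybe_fall_back_to_legacy(optimizer):
    """On Apple Silicon, fall back to a legacy optimizer only if one exists."""
    if not (platform.system() == "Darwin" and platform.processor() == "arm"):
        return optimizer
    # Look up a legacy class with the same name, e.g. Adam -> legacy.Adam.
    name = optimizer.__class__.__name__
    legacy_class = getattr(legacy_optimizers, name, None)
    if legacy_class is None:
        # No legacy counterpart exists (e.g. AdamW): keep the standard
        # optimizer instead of handing model.fit() an unusable fallback.
        logging.warning(
            "No legacy implementation of %s exists; using the v2.11+ "
            "optimizer, which may run slowly on M1/M2 Macs.", name)
        return optimizer
    return legacy_class.from_config(optimizer.get_config())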

Contributing.

  • Do you want to contribute a PR? (yes/no): yes
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution (if contributing): check whether a legacy version exists for the optimizer; if not, skip the fallback, use the standard version, and print a warning (see the sketch under "Describe the expected behavior" above).

Standalone code to reproduce the issue.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from keras.optimizers.experimental import AdamW
print(f'Tensorflow version: {tf.__version__}')

# Create and compile a linear model
model = Sequential()
model.add(Dense(1, input_dim=1, activation='linear'))
model.compile(optimizer=AdamW(learning_rate=0.001, weight_decay=0.001),
              loss='mean_squared_error')
# Generate some dummy data
X_train = tf.random.uniform(shape=(100, 1), minval=-1, maxval=1)
y_train = 2 * X_train + 1
# Fit and predict
model.fit(X_train, y_train, epochs=10, batch_size=4)
X_test = tf.random.uniform(shape=(10, 1), minval=-1, maxval=1)
y_pred = model.predict(X_test)

Source code / logs.

Tensorflow version: 2.14.0-dev20230612
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.AdamW` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.AdamW`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.AdamW`.
Epoch 1/10
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File ~/adamw_test.py:22
     19 y_train = 2 * X_train + 1
     21 # Train the model
---> 22 model.fit(X_train, y_train, epochs=10, batch_size=4)
     24 # Generate predictions
     25 X_test = tf.random.uniform(shape=(10, 1), minval=-1, maxval=1)

File ~/.pyenv/versions/3.11.1/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File /var/folders/98/w0f3wmg54750lvm2swvb7q700000gn/T/__autograph_generated_filenjw6calw.py:15, in outer_factory.<locals>.inner_factory.<locals>.tf__train_function(iterator)
     13 try:
     14     do_return = True
---> 15     retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
     16 except:
     17     do_return = False

AttributeError: in user code:

    File "/Users/user/.pyenv/versions/3.11.1/lib/python3.11/site-packages/keras/src/engine/training.py", line 1338, in train_function  *
        return step_function(self, iterator)
    File "/Users/user/.pyenv/versions/3.11.1/lib/python3.11/site-packages/keras/src/engine/training.py", line 1322, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/Users/user/.pyenv/versions/3.11.1/lib/python3.11/site-packages/keras/src/engine/training.py", line 1303, in run_step  **
        outputs = model.train_step(data)
    File "/Users/user/.pyenv/versions/3.11.1/lib/python3.11/site-packages/keras/src/engine/training.py", line 1084, in train_step
        self.optimizer.minimize(loss, self.trainable_variables, tape=tape)

    AttributeError: 'str' object has no attribute 'minimize'
@tilakrayal
Collaborator

@anton-bogomazov,
I tried to execute the code mentioned above on both tensorflow v2.12 and tf-nightly (2.14.0-dev20230622), and it executed without any issue/error. Also, instead of using from keras.optimizers.experimental import AdamW, please try from tensorflow.keras.optimizers import AdamW. Kindly find the gist of it here. Thank you!

@onuralpszr

@tilakrayal I have also been following this issue; just to be clear, did you run this on Apple Silicon?

@anton-bogomazov
Author

Thank you, @tilakrayal !
This issue is only reproducible on Apple Silicon; the same code works fine on other platforms, as you correctly validated in Colab. I explained the specific reason in the description: on Apple Silicon, Keras tries to fall back to the legacy version of each optimizer, but no legacy version of AdamW exists. There is no fallback to a legacy AdamW in Colab because the fallback is specific to Apple Silicon, so the bug was not reproduced in your notebook.

@anton-bogomazov
Author

Hello, @tilakrayal !
I'm worried that this issue will get lost among the 'support' issues, so could you please take a look at my message above?
Thank you!

@Stephen-Cobalt

Stephen-Cobalt commented Jul 7, 2023

@anton-bogomazov
If you need AdamW, you can achieve effectively the same behavior by passing the weight_decay parameter to Adam, as in the snippet below. I have also been hitting the same issue on Apple Silicon and have found that to be the best temporary workaround.
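
A minimal sketch of that workaround (this assumes the v2.11+ Adam, which accepts a weight_decay argument for decoupled weight decay; the hyperparameter values and the model are just the ones from the repro above):

from tensorflow.keras.optimizers import Adam

# Decoupled weight decay via Adam's weight_decay argument (v2.11+ API),
# approximating AdamW without needing the missing legacy AdamW.
model.compile(optimizer=Adam(learning_rate=0.001, weight_decay=0.001),
              loss='mean_squared_error')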

@tilakrayal
Collaborator

@anton-bogomazov,
I tried to execute the code mentioned above on both tensorflow v2.12 and tf-nightly (2.14.0-dev20230622), and it executed without any issue/error. Kindly find the gist of it here.
As it is failing only on tensorflow-macos, we request that you raise the concern on the Apple developer forum for a quicker resolution. Thank you!

@Netanelshoshan

@anton-bogomazov
I found a workaround to make AdamW work on Apple Silicon with the latest versions of tensorflow and tensorflow-addons: import AdamW from tensorflow_addons.optimizers instead, as in the snippet below, and you should be good.
I'm not using it with tensorflow-metal, though; the performance impact is significant (at least 4x slower).
Hope this helps! 🙂
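
A minimal sketch of that swap (this assumes tensorflow-addons is installed and reuses the model from the repro above; note that tfa's AdamW takes weight_decay as its first argument):

from tensorflow_addons.optimizers import AdamW

# tensorflow-addons ships its own AdamW implementation, which bypasses
# the Keras legacy-optimizer fallback that breaks on Apple Silicon.
optimizer = AdamW(weight_decay=0.001, learning_rate=0.001)
model.compile(optimizer=optimizer, loss='mean_squared_error')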

@github-actions

github-actions bot commented Aug 3, 2023

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.


@anna-hope

For anyone encountering this, this is fixed in tensorflow-macos==2.14

@ethiel

ethiel commented Nov 23, 2023

For anyone encountering this, this is fixed in tensorflow-macos==2.14

I don't think the issue is fixed: I'm on version 2.15 and I'm still unable to train a BERT model using AdamW on my Apple Silicon M1 Max. The log line "WARNING:absl:At this time, the v2.11+ optimizer tf.keras.optimizers.AdamWeightDecay runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at tf.keras.optimizers.legacy.AdamWeightDecay" is also still there.
