# Apple Silicon; M1+ and Pluggable Devices

This addresses the current complications of running accelerated TensorFlow on a MacBook Pro with Apple Silicon.

Apparently a genius named *_Penporn_* forgot that people DO machine learning on MacBook Pro with Apple Silicon.
It all starts with this [PluggableDevice: Device Plugins for TensorFlow](https://blog.tensorflow.org/2021/06/pluggabledevice-device-plugins-for-TensorFlow.html "Apple post to TensorFlow blog") post on June 07, 2021.
This post was about everything and *NOTHING* at the same time -- a feat only a true genius can achieve.
Yet good folks created a bridging library, `tensorflow_macos`, which Apple gladly passed off as its own rather than the TensorFlow community's.
As time passes and important additions slowly creep into `tensorflow`, people increasingly want local GPU acceleration on the Apple Silicon chip.
And then Apple's own development finally took place: Metal! See [Get started with tensorflow-metal](https://developer.apple.com/metal/tensorflow-plugin/), which can be installed through pip.

- [tensorflow-metal 1.2.0](https://pypi.org/project/tensorflow-metal/ "TensorFlow acceleration for Mac GPUs.") -- This is your current decrepit salvation!

## How to Install Properly!

1. First, and most important -- rip out all of your existing TensorFlow installations.
2. Then, run the following command: `pip install tensorflow==2.17 tensorflow-metal`
3. Verify that you see something like this: `Successfully installed libclang-18.1.1 protobuf-4.25.6 tensorflow-2.17.0 tensorflow-metal-1.2.0`
4. Conda should also say something like this:
```shell
conda list | grep tensorflow
tensorflow-base           2.17.0          cpu_py312had574b8_3     conda-forge
tensorflow-metal          1.2.0           pypi_0                  pypi
```
5. Run the next cell, the example provided by Apple on the PyPI page above.
6. If your installation has failed, you will not even see the download message: `Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz`
7. When successful, you should see the following confirmation in the cell log after the download:
```shell
2025-03-07 07:23:49.017465: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1 Max
2025-03-07 07:23:49.017501: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 64.00 GB
2025-03-07 07:23:49.017510: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 10.67 GB
2025-03-07 07:23:49.017523: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2025-03-07 07:23:49.017532: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
```
8. Don't mind the `<undefined>` PCI bus ID and the NUMA message; Apple is happy with those on every machine I've tested this with.
9. On the first epoch you should see the confirmation of the GPU acceleration: `Epoch 1/5. 2025-03-07 07:23:56.845964: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.`
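
Before running the long example, the pip and conda checks in steps 2 to 4 can also be done from Python. This is a minimal sketch using only the standard library's `importlib.metadata`; the `EXPECTED` table and the helper names `installed_versions` and `report` are illustrative, not part of any Apple or TensorFlow tooling. It assumes the packages were installed under the distribution names `tensorflow` and `tensorflow-metal`; a conda-provided build may register under a different name, in which case it will show as not installed here.

```python
from importlib import metadata

# Version prefixes taken from the install steps above; adjust if you pin others.
EXPECTED = {"tensorflow": "2.17", "tensorflow-metal": "1.2"}


def installed_versions(packages):
    """Return {package: version string, or None if no package metadata is found}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(expected=EXPECTED):
    """Print one line per package; return True only if every version prefix matches."""
    versions = installed_versions(expected)
    all_ok = True
    for name, prefix in expected.items():
        version = versions[name]
        matches = version is not None and version.startswith(prefix)
        all_ok = all_ok and matches
        status = "ok" if matches else "check"
        print(f"{name}: {version or 'not installed'} ({status})")
    return all_ok


report()
```

If either line says `check`, go back to step 1 before spending time on the training run.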

NOTE: I am on Python==3.12.8! 3.12.9 will NOT work for you with tensorflow-metal 1.2.0!
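
To see which interpreter a notebook kernel is actually running, a quick check along these lines may help. It is only a sketch: `REPORTED_WORKING` encodes the single version this note reports as working, which is my observation and not a documented compatibility rule, and `matches_reported_version` is a hypothetical helper name.

```python
import sys

# The one interpreter version reported as working in the note above.
REPORTED_WORKING = (3, 12, 8)


def matches_reported_version(version_info=None):
    """True if (major, minor, micro) equals the version reported as working."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:3]) == REPORTED_WORKING


current = ".".join(str(part) for part in sys.version_info[:3])
if matches_reported_version():
    print(f"Python {current}: matches the version reported as working.")
else:
    print(f"Python {current}: differs from 3.12.8; treat this setup as untested.")
```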

The last thing to check is that this long-running model and large data set actually peg your GPU for well over half of the run.
Pop open Activity Monitor and make sure the GPU column is selected to show.
In my case it looks something like this:

<img src="resources/img/metal-gpu-by-python.png" style="width: 1800px; float: left; " alt="Screenshot of the Activity Monitor showing GPU usage while running a model with tensorflow-metal off of the Pip repository example in the cell below." />

NOTE: My runtime on my M1 box with this dataset and model is about 20 minutes, but I saw much better times on M2 and M3.
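
Activity Monitor shows that something is using the GPU. As a complementary check, TensorFlow can be asked directly which GPU devices it registered, using `tf.config.list_physical_devices`. The sketch below wraps that call in a hypothetical helper, `visible_gpus`, which returns `None` instead of failing when TensorFlow is not installed; the printed messages are my own wording, and an empty list is only a hint that the Metal plugin did not load, not a diagnosis.

```python
import importlib.util


def visible_gpus():
    """Return TensorFlow's list of GPU devices, or None if TensorFlow is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf  # imported lazily; the import can take a few seconds

    return tf.config.list_physical_devices("GPU")


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment.")
elif gpus:
    print(f"TensorFlow sees {len(gpus)} GPU device(s): {[d.name for d in gpus]}")
else:
    print("TensorFlow is installed but sees no GPU; the Metal plugin may not have loaded.")
```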





In [None]:
import tensorflow as tf

# This is Apple's unmodified example as of the tensorflow-metal 1.2.0 release.

cifar = tf.keras.datasets.cifar100
(x_train, y_train), (x_test, y_test) = cifar.load_data()
model = tf.keras.applications.ResNet50(
    include_top=True,
    weights=None,
    input_shape=(32, 32, 3),
    classes=100, )

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=64)