# How to run TensorFlow on Apple Mac M-series 🍟

The TensorFlow machine learning framework is supposed to automatically detect and prioritise the use of GPUs over CPUs. <br>
However, when using TensorFlow on an M-series (Apple Silicon) Mac, I have found that it does not automatically detect and use the Apple GPU, which increases training time significantly.

* I have listed the steps below to create an environment that enables TensorFlow to recognise and use Apple's GPUs on M-series chips.
* I have also included an example comparing Apple's GPU and CPU (using an M1 Pro laptop) in a small TensorFlow ML project.

### 📦 Environment requirements 

I used Conda to create a new environment with Python included, then manually installed the pip packages listed below, and finally added the other Conda packages I needed. I experimented with a YAML file to automate these steps, but kept running into package conflicts. The manual method below worked.

**Step-by-step Environment Instructions:**

1. Create a new Conda environment with Python.
2. pip install tensorflow-macos
3. pip install tensorflow-metal
4. conda install your other packages, such as jupyter and pandas.

TensorFlow should now automatically use your Mac M-series GPU if it can locate it.
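Before moving on, it can help to confirm that both pip packages from the steps above actually landed in the active environment. The snippet below is a minimal sketch using only the Python standard library; the helper name `installed_versions` is my own and not part of TensorFlow.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Package names taken from the installation steps above.
status = installed_versions(["tensorflow-macos", "tensorflow-metal"])
for name, ver in status.items():
    print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```

If either line reports NOT INSTALLED, the pip install most likely ran in a different environment from the one the notebook kernel is using.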

### 💿 Setup

In [1]:
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np

In [2]:
# Check for GPUs!
print("Num GPUs", len(tf.config.list_physical_devices('GPU')))

Num GPUs 1
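If the count comes back as 0, listing every device TensorFlow can see is a quick way to narrow things down. This is a rough sketch rather than an official diagnostic: `visible_devices` is a hypothetical helper of mine, and it returns an empty list when TensorFlow cannot be imported at all.

```python
def visible_devices():
    """Return (device_type, name) pairs for every device TensorFlow can see.

    Falls back to an empty list when TensorFlow is not importable.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # With no argument, list_physical_devices returns devices of all types.
    return [(d.device_type, d.name) for d in tf.config.list_physical_devices()]


devices = visible_devices()
for device_type, name in devices:
    print(device_type, name)

if not any(device_type == "GPU" for device_type, _ in devices):
    print("No GPU visible; check that tensorflow-metal is installed.")
```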


### 🧪 Simple example model and data to test

In [3]:
# Create a simple CNN model
model = models.Sequential([
    layers.InputLayer(input_shape=(128, 128, 1)), 
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Generate random input data (for testing purposes)
x_train = np.random.random((10000, 128, 128, 1))  # 10000 images, 128x128 pixels, grayscale
y_train = np.random.randint(10, size=(10000,))   # Random labels for 10 classes

2025-02-07 13:24:06.632305: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1 Pro
2025-02-07 13:24:06.632332: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 16.00 GB
2025-02-07 13:24:06.632337: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 5.33 GB
2025-02-07 13:24:06.632352: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2025-02-07 13:24:06.632363: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)


### 🏃‍♂️‍➡️ Using Mac M-series GPUs

In [4]:
%%time

# Train the model for a few epochs (to test GPU usage)
model.fit(x_train, y_train, epochs=2, batch_size=32)

Epoch 1/2

2025-02-07 13:24:11.074972: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.

313/313 ━━━━━━━━━━━━━━━━━━━━ 16s 30ms/step - accuracy: 0.1019 - loss: 2.3136
Epoch 2/2
313/313 ━━━━━━━━━━━━━━━━━━━━ 10s 31ms/step - accuracy: 0.1031 - loss: 2.3022
CPU times: user 11.7 s, sys: 7.18 s, total: 18.9 s
Wall time: 27.7 s


<keras.src.callbacks.history.History at 0x164f9bd10>
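`%%time` is a Jupyter cell magic, so it is not available in a plain Python script. A rough stand-in, sketched with the standard library only, is shown below. The `timed` helper is hypothetical (not a TensorFlow or Jupyter API), and it records wall-clock time only, not the user/sys CPU times that `%%time` also prints.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock seconds taken by the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


results = {}
with timed("demo", results):
    # Stand-in workload; in a script this would be the model.fit(...) call.
    total = sum(i * i for i in range(100_000))

print(f"demo: {results['demo']:.3f} s")
```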

### 🐌 Using Mac M-series CPU only - for comparison

In [5]:
%%time 

with tf.device('/CPU:0'):
    model.fit(x_train, y_train, epochs=2, batch_size=32)

Epoch 1/2
313/313 ━━━━━━━━━━━━━━━━━━━━ 42s 133ms/step - accuracy: 0.1070 - loss: 2.3023
Epoch 2/2
313/313 ━━━━━━━━━━━━━━━━━━━━ 43s 136ms/step - accuracy: 0.1009 - loss: 2.3023
CPU times: user 7min 1s, sys: 59.4 s, total: 8min 1s
Wall time: 1min 25s
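To put a number on the difference, the arithmetic below uses only the timings printed in the two runs above. Treat it as a rough indication from single runs on one M1 Pro, not a benchmark. The first GPU epoch took 16s against 10s for the second, so the GPU wall time probably includes some one-off start-up cost.

```python
# Wall times reported by %%time in the two runs above, in seconds.
gpu_wall_s = 27.7
cpu_wall_s = 1 * 60 + 25  # 1min 25s

# Per-step times from the second epoch of each run, in milliseconds.
gpu_step_ms = 31
cpu_step_ms = 136

wall_speedup = cpu_wall_s / gpu_wall_s
step_speedup = cpu_step_ms / gpu_step_ms

print(f"Wall-time speedup: {wall_speedup:.2f}x")  # 3.07x
print(f"Per-step speedup:  {step_speedup:.2f}x")  # 4.39x
```

On this small model the GPU was roughly 3x faster end to end, and a little over 4x faster per training step once it had warmed up.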
