# TensorFlow Metal v1.1.0

## TensorFlow v2.15.0

Start from a fresh virtual environment:

```shell
~ % python3 -m venv .venv-metal
~ % source .venv-metal/bin/activate
(.venv-metal) ~ % pip install --upgrade pip
(.venv-metal) ~ % pip list

Package    Version
---------- -------
pip        23.3.2
setuptools 65.5.

(.venv-metal) ~ % pip3 install tensorflow==2.15.0
(.venv-metal) ~ % pip3 install tensorflow-metal
(.venv-metal) ~ % pip3 list | grep tensorflow

tensorflow                   2.15.0
tensorflow-estimator         2.15.0
tensorflow-io-gcs-filesystem 0.34.0
tensorflow-macos             2.15.0
tensorflow-metal             1.1.0

(.venv-metal) ~ % jupyter_notebook.sh
```

In [1]:
!which pip
!echo
!pip list | grep tensorflow

/Users/marksusol/.venv-metal/bin/pip

tensorflow                   2.15.0
tensorflow-estimator         2.15.0
tensorflow-io-gcs-filesystem 0.34.0
tensorflow-macos             2.15.0
tensorflow-metal             1.1.0


In [None]:
import os
# Hide TensorFlow INFO and WARNING logs; must be set before importing tensorflow.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

In [2]:
import tensorflow as tf
tf.config.get_visible_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [7]:
import time
from datetime import timedelta

# Only works before the runtime has initialized its devices.
def disable_device(device_type='GPU'):
    """Hide every physical device of `device_type` from TensorFlow."""
    try:
        tf.config.set_visible_devices([], device_type)
    except (ValueError, RuntimeError):
        # ValueError: argument validation failed; RuntimeError: runtime already initialized.
        print('Invalid device or cannot modify virtual devices once initialized.')
        return
    # Confirm that no device of the hidden type is still visible.
    for visible in tf.config.get_visible_devices():
        assert visible.device_type != device_type

def calculate_time(device_time):
    elapsed = abs(device_time[0] - device_time[1])
    return str(timedelta(seconds=elapsed))

def train_model(device):
    print('Tensorflow: %s'%(device))
    print('-- Start: %s '%(time.time()))
    
    cifar = tf.keras.datasets.cifar100
    (x_train, y_train), (x_test, y_test) = cifar.load_data()
    model = tf.keras.applications.ResNet50(
        include_top=True,
        weights=None,
        input_shape=(32, 32, 3),
        classes=100,)
    
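    # from_logits=True tells the loss that the model outputs raw logits (no softmax).
    # Note: ResNet50 with include_top=True applies softmax by default, so the two
    # settings disagree here; pass classifier_activation=None to ResNet50 (or use
    # from_logits=False) to make them agree. The recorded runs used the code as shown.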
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
    
    with tf.device('/device:%s:0'%(device)):
        model.fit(x_train, y_train, epochs=5, batch_size=64)

    del model
    print('-- End: %s '%(time.time()))

In [5]:
%%time

print('Visible Devices: ', tf.config.get_visible_devices())
train_model('CPU')

Visible Devices:  [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Tensorflow: CPU
-- Start: 1704998364.623704 


2024-01-11 11:39:24.910238: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Pro
2024-01-11 11:39:24.910283: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 18.00 GB
2024-01-11 11:39:24.910292: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 6.00 GB
2024-01-11 11:39:24.910580: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-01-11 11:39:24.910803: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)


Epoch 1/5


  output, from_logits = _get_logits(
2024-01-11 11:39:27.136726: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.


Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
--End: 1705000398.2893922 
CPU times: user 1h 25min 53s, sys: 19min 9s, total: 1h 45min 2s
Wall time: 33min 53s


> **Note:** The NUMA message logged on an Apple silicon computer is informational and can be ignored.
> Apple silicon uses a unified memory architecture (UMA), not NUMA.

In [6]:
%%time

print('Visible Devices: ', tf.config.get_visible_devices())
train_model('GPU')

Visible Devices:  [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Tensorflow: GPU
-- Start: 1705000398.3098922 
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
--End: 1705000619.002486 
CPU times: user 3min 33s, sys: 43.2 s, total: 4min 16s
Wall time: 3min 40s


In [None]:
<!-- Ensure the colab doesn't run past this point. --> 

### CPU Only

Run this test on its own after restarting the runtime; otherwise `disable_device` hits the `cannot modify virtual devices once initialized.` message. Once the GPU device is disabled, we cannot re-enable it.
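One way to get the same "fresh runtime" effect without restarting the notebook is to run the CPU-only check in a separate Python process. The sketch below is not part of the original notebook: `run_isolated` and `CPU_ONLY_SNIPPET` are hypothetical names, and the snippet assumes TensorFlow is installed in the interpreter that runs it.

```python
import subprocess
import sys

def run_isolated(code: str) -> str:
    """Run `code` in a fresh Python interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child process fails
    )
    return result.stdout

# Hypothetical snippet: hide the GPU before anything initializes the runtime,
# then print the devices that are still visible.
CPU_ONLY_SNIPPET = """
import tensorflow as tf
tf.config.set_visible_devices([], 'GPU')
print(tf.config.get_visible_devices())
"""

# Example (not executed here): print(run_isolated(CPU_ONLY_SNIPPET))
```

Because every call starts a new interpreter, the visibility change in the child process does not affect the notebook kernel, so the GPU stays available there.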

In [7]:
%%time

disable_device('GPU')
print('Visible Devices: ', tf.config.get_visible_devices())
train_model('CPU')

Visible Devices:  [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
Tensorflow: CPU
--Start: 1704996048.3641548 
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
--End: 1704997812.719677 
CPU times: user 1h 25min 17s, sys: 12min 50s, total: 1h 38min 7s
Wall time: 29min 24s
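
The three wall times recorded above can be compared directly. The sketch below is not part of the original notebook: it copies the `Wall time` values from the cell outputs and computes how many times faster the GPU run was. The dictionary labels and the `speedup` helper are illustrative names.

```python
from datetime import timedelta

# Wall times copied from the %%time output of the three runs above.
wall_times = {
    "CPU (GPU visible)": timedelta(minutes=33, seconds=53),
    "GPU": timedelta(minutes=3, seconds=40),
    "CPU only (GPU hidden)": timedelta(minutes=29, seconds=24),
}

def speedup(baseline: timedelta, candidate: timedelta) -> float:
    """Return how many times faster `candidate` is than `baseline`."""
    return baseline.total_seconds() / candidate.total_seconds()

gpu = wall_times["GPU"]
for label, elapsed in wall_times.items():
    if label != "GPU":
        print(f"{label}: {elapsed} -> GPU is {speedup(elapsed, gpu):.1f}x faster")
```

On these numbers the GPU run is roughly 9x faster than the CPU run with the GPU visible and roughly 8x faster than the CPU-only run. Each configuration was timed once, so treat the ratios as indicative rather than precise.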
