<a href="https://colab.research.google.com/github/AhmedFarrukh/DeepLearning-EdgeComputing/blob/main/notebooks/CPU_inference.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook, we measure the inference times of quantized models and of their original (unquantized) versions on a CPU.

## **Prepare the Resource**
This notebook will try to reserve a compute_cascadelake_r node on CHI@UC.

### **Check Availability**
Before you begin, you should check the host calendar at https://chi.uc.chameleoncloud.org/project/leases/calendar/host/ to see what node types are available.

### **Chameleon Configuration**
You can change your Chameleon project name (if not using the one that is automatically configured in the JupyterHub environment) and the site on which to reserve resources (depending on availability) in the following cell.

If you need to change the details of the Chameleon server, e.g. to use a different node type (NODE_TYPE) depending on availability, you can do that in the cells that follow.

In [None]:
import chi, os, time
from chi import lease
from chi import server

PROJECT_NAME = os.getenv('OS_PROJECT_NAME') # change this if you need to
chi.use_site("CHI@UC")
chi.set("project_name", PROJECT_NAME)
username = os.getenv('USER') # all exp resources will have this prefix

Now using CHI@UC:
URL: https://chi.uc.chameleoncloud.org
Location: Argonne National Laboratory, Lemont, Illinois, USA
Support contact: help@chameleoncloud.org


In [None]:
chi.set("image", "CC-Ubuntu20.04")
NODE_TYPE = "compute_cascadelake_r"
expname = "cpu-inference"

In [None]:
res = []
lease.add_node_reservation(res, node_type=NODE_TYPE, count=1)
lease.add_fip_reservation(res, count=1)
start_date, end_date = lease.lease_duration(days=0, hours=8)

l = lease.create_lease(f"{username}-{NODE_TYPE}", res, start_date=start_date, end_date=end_date)
l = lease.wait_for_active(l["id"])  # comment out this line if the lease starts in the future

In [None]:
# continue here, whether using a lease created just now or one created earlier
l = lease.get_lease(f"{username}-{NODE_TYPE}")
l['id']

'f950ba18-ba4a-4849-8193-c70e61aa9452'

### **Provisioning Resources**
This cell provisions resources. It will take approximately 15 minutes. You can check on its status in the Chameleon web-based UI: https://chi.uc.chameleoncloud.org/project/instances/, then come back here when it is in the READY state.

In [None]:
reservation_id = lease.get_node_reservation(l["id"])
server.create_server(
    f"{username}-{NODE_TYPE}",
    reservation_id=reservation_id,
    image_name=chi.get("image")
)
server_id = server.get_server_id(f"{username}-{NODE_TYPE}")
server.wait_for_active(server_id)

openstack.compute.v2.server.Server(id=80569ed3-f37c-481d-8838-4fac8389cfce, name=ahmed_farrukh_nyu_edu-compute_cascadelake_r, status=ACTIVE, tenant_id=cb970a4b0f2e42c9b1b3f9015d02f8a5, user_id=042ab1e0e3f7c495647249cc7d377c5e9031a04fccce623cea2f51f120a9bd5a, metadata={}, hostId=c9f7f1389ac96a530d6c179791ae187dfcdcf6acd066d7d067c56aa6, image={'id': '2be02db9-e591-47b4-9dd3-1d23f7a01433', 'links': [{'rel': 'bookmark', 'href': 'https://chi.uc.chameleoncloud.org:8774/images/2be02db9-e591-47b4-9dd3-1d23f7a01433'}]}, flavor={'vcpus': 1, 'ram': 1, 'disk': 20, 'ephemeral': 0, 'swap': 0, 'original_name': 'baremetal', 'extra_specs': {'resources:CUSTOM_BAREMETAL': '1', 'resources:VCPU': '0', 'resources:MEMORY_MB': '0', 'resources:DISK_GB': '0'}}, created=2024-07-18T15:51:17Z, updated=2024-07-18T16:06:24Z, addresses={'sharednet1': [{'version': 4, 'addr': '10.140.83.253', 'OS-EXT-IPS:type': 'fixed', 'OS-EXT-IPS-MAC:mac_addr': 'b8:ce:f6:43:4f:97'}]}, accessIPv4=, accessIPv6=, links=[{'rel': 'self', 


Associate an IP address with this server:

In [None]:
reserved_fip = lease.get_reserved_floating_ips(l["id"])[0]
server.associate_floating_ip(server_id,reserved_fip)

'192.5.87.186'


and wait for it to come up:

In [None]:
server.wait_for_tcp(reserved_fip, port=22)
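
To make the waiting step concrete, the sketch below shows one way to poll a TCP port until it accepts connections. It is a standalone illustration using only the standard library, not python-chi's implementation; the function name `wait_for_port` and its `timeout` and `interval` parameters are hypothetical.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=5.0):
    """Poll host:port until a TCP connection succeeds or the timeout elapses.

    Returns True once a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while the port is unreachable
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # Sleep before retrying, but never past the deadline
            time.sleep(min(interval, remaining))
```

With such a helper, `wait_for_port(reserved_fip, 22)` would return once the SSH port on the new server starts accepting connections.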

### **Install Basic Packages**

In [None]:
from chi import ssh
node = ssh.Remote(reserved_fip)

In [None]:
node.run('sudo apt update')
node.run('sudo apt -y install python3-pip python3-dev')
node.run('sudo pip3 install --upgrade pip')





Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease
Get:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease [128 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-backports InRelease [128 kB]
Get:4 http://security.ubuntu.com/ubuntu focal-security InRelease [128 kB]
Get:5 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [3426 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 c-n-f Metadata [17.7 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [3074 kB]
Get:8 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 c-n-f Metadata [540 B]
Get:9 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1210 kB]
Get:10 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 c-n-f Metadata [27.5 kB]
Get:11 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [27.1 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 c-n-f Metadata [616 B]
Get:13





Reading package lists...
Building dependency tree...
Reading state information...
python3-dev is already the newest version (3.8.2-0ubuntu2).
The following NEW packages will be installed:
  python3-pip python3-wheel
0 upgraded, 2 newly installed, 0 to remove and 77 not upgraded.
Need to get 254 kB of archives.
After this operation, 1154 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 python3-wheel all 0.34.2-1ubuntu0.1 [23.9 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 python3-pip all 20.0.2-5ubuntu1.10 [231 kB]


debconf: unable to initialize frontend: Dialog
debconf: (Dialog frontend will not work on a dumb terminal, an emacs shell buffer, or without a controlling terminal.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
dpkg-preconfigure: unable to re-open stdin: 


Fetched 254 kB in 0s (1432 kB/s)
Selecting previously unselected package python3-wheel.
(Reading database ... 82169 files and directories currently installed.)
Preparing to unpack .../python3-wheel_0.34.2-1ubuntu0.1_all.deb ...
Unpacking python3-wheel (0.34.2-1ubuntu0.1) ...
Selecting previously unselected package python3-pip.
Preparing to unpack .../python3-pip_20.0.2-5ubuntu1.10_all.deb ...
Unpacking python3-pip (20.0.2-5ubuntu1.10) ...
Setting up python3-wheel (0.34.2-1ubuntu0.1) ...
Setting up python3-pip (20.0.2-5ubuntu1.10) ...
Processing triggers for man-db (2.9.1-1) ...
Collecting pip
  Downloading pip-24.1.2-py3-none-any.whl (1.8 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 20.0.2
    Not uninstalling pip at /usr/lib/python3/dist-packages, outside environment /usr
    Can't uninstall 'pip'. No files were found to uninstall.
Successfully installed pip-24.1.2


<Result cmd='sudo pip3 install --upgrade pip' exited=0>

#### **Install Python Packages**

In [None]:
node.run('python3 -m pip install --user tensorflow')
node.run('python3 -m pip install --user matplotlib')
node.run('python3 -m pip install --user pathlib')
node.run('python3 -m pip install --user numpy')

Collecting tensorflow
  Downloading tensorflow-2.13.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting absl-py>=1.0.0 (from tensorflow)
  Downloading absl_py-2.1.0-py3-none-any.whl.metadata (2.3 kB)
Collecting astunparse>=1.6.0 (from tensorflow)
  Downloading astunparse-1.6.3-py2.py3-none-any.whl.metadata (4.4 kB)
Collecting flatbuffers>=23.1.21 (from tensorflow)
  Downloading flatbuffers-24.3.25-py2.py3-none-any.whl.metadata (850 bytes)
Collecting gast<=0.4.0,>=0.2.1 (from tensorflow)
  Downloading gast-0.4.0-py3-none-any.whl.metadata (1.1 kB)
Collecting google-pasta>=0.1.1 (from tensorflow)
  Downloading google_pasta-0.2.0-py3-none-any.whl.metadata (814 bytes)
Collecting grpcio<2.0,>=1.24.3 (from tensorflow)
  Downloading grpcio-1.65.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.3 kB)
Collecting h5py>=2.9.0 (from tensorflow)
  Downloading h5py-3.11.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.5 kB



Successfully installed MarkupSafe-2.1.5 absl-py-2.1.0 astunparse-1.6.3 cachetools-5.4.0 flatbuffers-24.3.25 gast-0.4.0 google-auth-2.32.0 google-auth-oauthlib-1.0.0 google-pasta-0.2.0 grpcio-1.65.1 h5py-3.11.0 importlib-metadata-8.0.0 keras-2.13.1 libclang-18.1.1 markdown-3.6 numpy-1.24.3 opt-einsum-3.3.0 packaging-24.1 protobuf-4.25.3 requests-oauthlib-2.0.0 rsa-4.9 tensorboard-2.13.0 tensorboard-data-server-0.7.2 tensorflow-2.13.1 tensorflow-estimator-2.13.0 tensorflow-io-gcs-filesystem-0.34.0 termcolor-2.4.0 typing-extensions-4.5.0 werkzeug-3.0.3 wrapt-1.16.0
Collecting matplotlib
  Downloading matplotlib-3.7.5-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.metadata (5.7 kB)
Collecting contourpy>=1.0.1 (from matplotlib)
  Downloading contourpy-1.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.9 kB)
Collecting cycler>=0.10 (from matplotlib)
  Downloading cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib)




Successfully installed contourpy-1.1.1 cycler-0.12.1 fonttools-4.53.1 importlib-resources-6.4.0 kiwisolver-1.4.5 matplotlib-3.7.5 pillow-10.4.0 pyparsing-3.1.2 python-dateutil-2.9.0.post0 zipp-3.19.2
Collecting pathlib
  Downloading pathlib-1.0.1-py3-none-any.whl.metadata (5.1 kB)
Downloading pathlib-1.0.1-py3-none-any.whl (14 kB)
Installing collected packages: pathlib
Successfully installed pathlib-1.0.1


<Result cmd='python3 -m pip install --user numpy' exited=0>

### **Retrieve Materials**
Finally, get a copy of the code you will run:

In [None]:
node.run('git clone https://github.com/AhmedFarrukh/experimental.git')

Cloning into 'experimental'...


<Result cmd='git clone https://github.com/AhmedFarrukh/experimental.git' exited=0>

### **Run Experiment**

Verify that the code files have correctly been loaded:

In [None]:
node.run('ls ./experimental')

measuringInferenceTimes.py
quantizingModels.py


<Result cmd='ls ./experimental' exited=0>

Run the following cell to load CNN models and apply Dynamic Range Quantization. Both original and quantized models are saved in the ./tflite_models directory.

In [None]:
node.run('python3 ./experimental/quantizingModels.py')

2024-07-18 16:18:05.795386: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-18 16:18:05.797017: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-07-18 16:18:05.830043: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-07-18 16:18:05.830476: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet/mobilenet_1_0_224_tf.h5


2024-07-18 16:18:14.871447: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2024-07-18 16:18:14.871473: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2024-07-18 16:18:14.872076: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmpeld_ljls
2024-07-18 16:18:14.884092: I tensorflow/cc/saved_model/reader.cc:91] Reading meta graph with tags { serve }
2024-07-18 16:18:14.884110: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/tmpeld_ljls
2024-07-18 16:18:14.913755: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:375] MLIR V1 optimization pass is not enabled
2024-07-18 16:18:14.920899: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2024-07-18 16:18:15.146122: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: /tmp/tmpeld_ljls
2024-07

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/inception_v3/inception_v3_weights_tf_dim_ordering_tf_kernels.h5


2024-07-18 16:18:49.378236: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2024-07-18 16:18:49.378266: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2024-07-18 16:18:49.378479: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmpg7evy3yd
2024-07-18 16:18:49.415427: I tensorflow/cc/saved_model/reader.cc:91] Reading meta graph with tags { serve }
2024-07-18 16:18:49.415452: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/tmpg7evy3yd
2024-07-18 16:18:49.523403: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2024-07-18 16:18:50.130132: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: /tmp/tmpg7evy3yd
2024-07-18 16:18:50.344587: I tensorflow/cc/saved_model/loader.cc:314] SavedModel load for tags { serve }; Status: success: OK. Took 966109 

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels.h5


2024-07-18 16:19:38.051910: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2024-07-18 16:19:38.051939: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2024-07-18 16:19:38.052108: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmpklrvurp8
2024-07-18 16:19:38.075769: I tensorflow/cc/saved_model/reader.cc:91] Reading meta graph with tags { serve }
2024-07-18 16:19:38.075790: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/tmpklrvurp8
2024-07-18 16:19:38.149117: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2024-07-18 16:19:38.644083: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: /tmp/tmpklrvurp8
2024-07-18 16:19:38.803535: I tensorflow/cc/saved_model/loader.cc:314] SavedModel load for tags { serve }; Status: success: OK. Took 751429 

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet101_weights_tf_dim_ordering_tf_kernels.h5


2024-07-18 16:20:33.796796: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2024-07-18 16:20:33.796828: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2024-07-18 16:20:33.797008: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmpiodk16g6
2024-07-18 16:20:33.844591: I tensorflow/cc/saved_model/reader.cc:91] Reading meta graph with tags { serve }
2024-07-18 16:20:33.844620: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/tmpiodk16g6
2024-07-18 16:20:33.996093: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2024-07-18 16:20:35.005411: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: /tmp/tmpiodk16g6
2024-07-18 16:20:35.328113: I tensorflow/cc/saved_model/loader.cc:314] SavedModel load for tags { serve }; Status: success: OK. Took 1531105

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet152_weights_tf_dim_ordering_tf_kernels.h5


2024-07-18 16:22:06.834698: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2024-07-18 16:22:06.834732: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2024-07-18 16:22:06.834935: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmp8cikreot
2024-07-18 16:22:06.904837: I tensorflow/cc/saved_model/reader.cc:91] Reading meta graph with tags { serve }
2024-07-18 16:22:06.904869: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/tmp8cikreot
2024-07-18 16:22:07.134635: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2024-07-18 16:22:08.645192: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: /tmp/tmp8cikreot
2024-07-18 16:22:09.133498: I tensorflow/cc/saved_model/loader.cc:314] SavedModel load for tags { serve }; Status: success: OK. Took 2298562

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels.h5


2024-07-18 16:23:20.517173: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2024-07-18 16:23:20.517201: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2024-07-18 16:23:20.517364: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmpzpek89g3
2024-07-18 16:23:20.522589: I tensorflow/cc/saved_model/reader.cc:91] Reading meta graph with tags { serve }
2024-07-18 16:23:20.522607: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/tmpzpek89g3
2024-07-18 16:23:20.534868: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2024-07-18 16:23:20.871684: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: /tmp/tmpzpek89g3
2024-07-18 16:23:20.901440: I tensorflow/cc/saved_model/loader.cc:314] SavedModel load for tags { serve }; Status: success: OK. Took 384076 

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg19/vgg19_weights_tf_dim_ordering_tf_kernels.h5


2024-07-18 16:27:42.329317: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2024-07-18 16:27:42.329345: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2024-07-18 16:27:42.329509: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmp1er0hrru
2024-07-18 16:27:42.332804: I tensorflow/cc/saved_model/reader.cc:91] Reading meta graph with tags { serve }
2024-07-18 16:27:42.332821: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/tmp1er0hrru
2024-07-18 16:27:42.341380: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2024-07-18 16:27:42.638217: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: /tmp/tmp1er0hrru
2024-07-18 16:27:42.666694: I tensorflow/cc/saved_model/loader.cc:314] SavedModel load for tags { serve }; Status: success: OK. Took 337186 

<Result cmd='python3 ./experimental/quantizingModels.py' exited=0>
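
As a rough illustration of what dynamic range quantization does to a model's weights, the sketch below maps float weights to 8-bit integers with a single scale factor and then recovers approximate floats. It is a simplified, pure-Python stand-in that assumes a symmetric per-tensor scheme; the helper names `quantize_int8` and `dequantize` are hypothetical, and this is neither the code in quantizingModels.py nor TensorFlow Lite's converter.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization of a list of float weights."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; guard against an all-zero tensor
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


# Illustrative values only (not taken from any real model)
weights = [0.5, -1.27, 0.003, 0.9]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is bounded by half the scale
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Storing each weight in 8 bits instead of 32 is what shrinks the saved model file, at the cost of the small rounding error measured above.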

Run the next cell to download the TensorFlow Lite benchmark binary and make it executable.

In [None]:
node.run('mkdir -p ./benchmark')  # -p avoids an error if the directory already exists
node.run('wget https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model -P ./benchmark')
node.run('chmod +x ./benchmark/linux_x86-64_benchmark_model')

--2024-07-18 16:31:58--  https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model
Resolving storage.googleapis.com (storage.googleapis.com)... 142.251.40.155, 142.251.40.187, 142.251.40.219, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.251.40.155|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6237672 (5.9M) [application/octet-stream]
Saving to: ‘./benchmark/linux_x86-64_benchmark_model’

     0K .......... .......... .......... .......... ..........  0% 1.12M 5s
    50K .......... .......... .......... .......... ..........  1% 2.53M 4s
   100K .......... .......... .......... .......... ..........  2% 3.89M 3s
   150K .......... .......... .......... .......... ..........  3% 5.44M 2s
   200K .......... .......... .......... .......... ..........  4% 6.25M 2s
   250K .......... .......... .......... .......... ..........  4% 8.05M 2s
   300K .......... ...

<Result cmd='chmod +x ./benchmark/linux_x86-64_benchmark_model' exited=0>
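
For reference, a single-model invocation of the downloaded binary might be assembled as in the sketch below. It uses the benchmark tool's `--graph` and `--num_threads` flags, which correspond to the `Graph` and `Num threads` values the tool reports in its log. How measuringInferenceTimes.py actually calls the binary is not shown in this notebook, so the helper `build_benchmark_command` and its defaults are assumptions.

```python
import shlex


def build_benchmark_command(
    model_path,
    binary="./benchmark/linux_x86-64_benchmark_model",
    num_threads=1,
):
    """Return the argument list for one benchmark run on a .tflite model."""
    return [
        binary,
        f"--graph={model_path}",
        f"--num_threads={num_threads}",
    ]


cmd = build_benchmark_command("./tflite_models/MobileNet_quant.tflite")
# Join into a single shell-safe string, e.g. for passing to node.run(...)
print(shlex.join(cmd))
```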

Finally, use the benchmark to measure the inference time and memory footprint of each model.

In [None]:
node.run('python3 ./experimental/measuringInferenceTimes.py')

2024-07-18 16:32:17.106038: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-18 16:32:17.107730: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-07-18 16:32:17.141476: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-07-18 16:32:17.141900: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
INFO: Created TensorFlow Lite XNNPACK delegate

MobileNet
InceptionV3
ResNet50
ResNet101
ResNet152
VGG16
VGG19
MobileNet_quant
Error with model:  MobileNet_quant
{'MobileNet_quant': {'Init Time (ms)': 42.392, 'Inference Timings (us)': {'Init': 42392, 'First Inference': 9140, 'Warmup (avg)': 8202.95, 'Inference (avg)': 8184.14}}}
INFO: STARTING!
INFO: Log parameter values verbosely: [0]
INFO: Num threads: [1]
INFO: Graph: [./tflite_models/MobileNet_quant.tflite]
INFO: Signature to run: []
INFO: #threads used for CPU inference: [1]
INFO: Loaded model ./tflite_models/MobileNet_quant.tflite
INFO: The input model file size (MB): 4.42374
INFO: Initialized session in 42.392ms.
INFO: Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
INFO: count=61 first=9140 curr=8263 min=8155 max=9140 avg=8202.95 std=124

INFO: Running benchmark for at least 50 iterations and at least 1 seconds but terminate if exceeding 150 seconds.
INFO: count=122 first=8170 curr=8190 min=8147 max=8280 avg=8184.1

<Result cmd='python3 ./experimental/measuringInferenceTimes.py' exited=0>
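
The summary lines printed by the benchmark (`count=... first=... avg=...`) are easy to turn into numbers. The sketch below parses one such line, copied from the output above, with a regular expression. The function `parse_stats_line` is a hypothetical helper, not the parsing used in measuringInferenceTimes.py. The reported dictionary labels these timings as microseconds.

```python
import re

# Matches key=value pairs where the value is an integer or a decimal number
STATS_RE = re.compile(r"(\w+)=([0-9]+(?:\.[0-9]+)?)")


def parse_stats_line(line):
    """Extract the numeric key=value pairs from a benchmark summary line."""
    return {key: float(value) for key, value in STATS_RE.findall(line)}


line = "INFO: count=61 first=9140 curr=8263 min=8155 max=9140 avg=8202.95 std=124"
stats = parse_stats_line(line)
print(stats["avg"])
```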

Check if the plots for the results have been correctly saved:

In [None]:
node.run('ls ./results')

Avg Inference.png
First Inference.png
Init Inference.png
Init Time.png
Memory Init.png
Memory Overall.png
Warmup Inference.png


<Result cmd='ls ./results' exited=0>

Copy the output of the following cell and paste it into a terminal in the Jupyter environment to transfer the result plots to the Jupyter environment.

In [None]:
print(f'scp -ri ~/.ssh/id_rsa_chameleon cc@{reserved_fip}:/home/cc/results ./work')

scp -ri ~/.ssh/id_rsa_chameleon cc@192.5.87.186:/home/cc/results ./work


The plots resulting from the experiment should now be in the /work/results directory of the Jupyter environment.
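
To confirm the transfer, a short check like the sketch below can list any plot that did not arrive. The file names are the ones shown by `ls ./results` above; the helper `missing_plots` is hypothetical, and the directory you pass should match wherever `scp` placed the files (for example `./work/results`).

```python
from pathlib import Path

# Plot files listed by `ls ./results` on the node
EXPECTED_PLOTS = [
    "Avg Inference.png",
    "First Inference.png",
    "Init Inference.png",
    "Init Time.png",
    "Memory Init.png",
    "Memory Overall.png",
    "Warmup Inference.png",
]


def missing_plots(results_dir):
    """Return the expected plot names that are absent from results_dir."""
    results_dir = Path(results_dir)
    return [name for name in EXPECTED_PLOTS if not (results_dir / name).is_file()]


# Example call: missing_plots("./work/results") returns [] when every plot is present
```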

## **Release Resources**
If you finish with your experimentation before your lease expires, release your resources and tear down your environment by running the following (commented out to prevent accidental deletions).

This section is designed to work as a "standalone" portion: you can come back to this notebook, ignore the top part, and just run this section to delete your resources.

In [None]:

# set up environment - if you made any changes in the top part, make the same changes here
import chi, os
from chi import lease, server

PROJECT_NAME = os.getenv('OS_PROJECT_NAME')
chi.use_site("CHI@UC")
chi.set("project_name", PROJECT_NAME)

# redefine these so that this section can run on its own
username = os.getenv('USER')
NODE_TYPE = "compute_cascadelake_r"

lease = chi.lease.get_lease(f"{username}-{NODE_TYPE}")

Now using CHI@UC:
URL: https://chi.uc.chameleoncloud.org
Location: Argonne National Laboratory, Lemont, Illinois, USA
Support contact: help@chameleoncloud.org


In [None]:
DELETE = False
# DELETE = True  # uncomment this line to actually delete the resources

if DELETE:
    # delete server
    server_id = chi.server.get_server_id(f"{username}-{NODE_TYPE}")
    chi.server.delete_server(server_id)

    # release floating IP
    reserved_fip =  chi.lease.get_reserved_floating_ips(lease["id"])[0]
    ip_info = chi.network.get_floating_ip(reserved_fip)
    chi.neutron().delete_floatingip(ip_info["id"])

    # delete lease
    chi.lease.delete_lease(lease["id"])

Deleted lease with id 419acc92-7ac8-4788-972b-b706a33f0d4c
