# FlashNet

We use Tencent I/O traces that were replayed on our NVMe device. After some preprocessing, this data is ready to be used for training. The raw Tencent trace can be downloaded here: https://www.googleapis.com/drive/v3/files/1Xj6rFBOsY9Wt_XlCiAiZn7LfSCAYy1AB?alt=media&key=AIzaSyBbUdTut1W8uzPO1nzCmcHFIw0KsuO3Dfo

#### Goal
This notebook guides you through preparing a trace profile and using it to train a neural network (NN) model that performs per-I/O admission control.

## Setup Connection

In [1]:
import chi
from chi import lease, server
import os
import keystoneauth1, blazarclient
import uuid

CLOUD_SITE = "CHI@UC"

PROJECT_ID = "CHI-231080"
chi.set("project_name", PROJECT_ID)
chi.use_site(CLOUD_SITE)
uid_this = str(uuid.uuid4())

print(f"UID for this experiment is {uid_this}")

Now using CHI@UC:
URL: https://chi.uc.chameleoncloud.org
Location: Argonne National Laboratory, Lemont, Illinois, USA
Support contact: help@chameleoncloud.org
UID for this experiment is e779530d-310f-4739-87fc-8ecb981c1e6a


In [2]:
reservations = []
try:
    print("Creating lease...")
    lease.add_fip_reservation(reservations, count=1)
    lease.add_node_reservation(reservations, count=1, node_type="compute_skylake")

    start_date, end_date = lease.lease_duration(days=1)

    l = lease.create_lease(f"flashnet-{uid_this}", reservations, start_date=start_date, end_date=end_date)
    cloud_lease_id = l["id"]

    print("Waiting for lease to start ... This can take up to 1 min ...")
    lease.wait_for_active(cloud_lease_id)
    print("Lease is now active!")
    
except keystoneauth1.exceptions.http.Unauthorized as e:
    print("Unauthorized.\nDid you set your project name and run the code in the first cell?")
    
except blazarclient.exception.BlazarClientException as e:
    print("There is an issue making the reservation. Check the calendar to make sure a node is available.")
    print("https://chi.uc.chameleoncloud.org/project/leases/calendar/host/")
    print(e)
    
except Exception as e:
    print("An unexpected error happened.")
    print(e)

Creating lease...
Waiting for lease to start ... This can take up to 1 min ...
Lease is now active!


In [3]:
s = server.create_server(
    f"flashnet-{uid_this}", 
    image_name="CC-Ubuntu20.04",
    reservation_id=lease.get_node_reservation(cloud_lease_id))

print("Waiting for server to start ...")
server.wait_for_active(s.id)
print("Done")

Waiting for server to start ...
Done


In [4]:
ip_addr = lease.get_reserved_floating_ips(cloud_lease_id)[0]
server.associate_floating_ip(s.id, floating_ip_address=ip_addr)

print(f"Waiting for SSH connectivity on {ip_addr} ...")
server.wait_for_tcp(ip_addr, 22)
print("SSH successful")

Waiting for SSH connectivity on 192.5.86.226 ...
SSH successful


## Step-by-Step Guide

### 0. Setup, Install Conda Dependencies and Ipykernel

Replace `your_github_token` in the command below with your GitHub token so that the repository can be cloned.
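Rather than pasting the token into the cell, one option is to read it from an environment variable so it never ends up in a saved notebook. The following is a minimal sketch; the helper `build_clone_url` and the variable name `GITHUB_TOKEN` are assumptions made for illustration and are not part of the FlashNet repository.

```python
import os


def build_clone_url(owner_repo, token_env="GITHUB_TOKEN"):
    """Return an HTTPS clone URL for a GitHub repository.

    If the environment variable named by `token_env` (a hypothetical name)
    is set, the token is embedded in the URL; otherwise a plain URL is returned.
    """
    token = os.environ.get(token_env, "")
    prefix = f"{token}@" if token else ""
    return f"https://{prefix}github.com/{owner_repo}.git"


# Build the URL once and reuse it in the remote `git clone` command.
clone_url = build_clone_url("rannnayy/flashnet-trovi")
```

The resulting `clone_url` could then be interpolated into the `git clone` command with an f-string. Avoid printing it, since it may contain the token.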

In [5]:
from chi import ssh

with ssh.Remote(ip_addr) as conn:
    conn.run('rm -rf flashnet-trovi')
    # Raw string keeps the `\;` terminators of `find -exec` intact; the final `echo` appends FLASHNET to ~/.zshrc
    conn.run(r'git clone https://your_github_token@github.com/rannnayy/flashnet-trovi.git && cd flashnet-trovi && find . -type f -iname "*.sh" -exec chmod +x {} \; && find . -type f -iname "*.py" -exec chmod +x {} \; && ./install_conda_deps_cpu.sh && source ~/.zshrc && conda install -n flashnet-trovi-env ipykernel --update-deps --force-reinstall -y && echo "export FLASHNET=$(pwd)" >> ~/.zshrc')
    conn.run('sudo apt-get install -y tree && tree flashnet-trovi')

Cloning into 'flashnet-trovi'...
./install_conda_deps_cpu.sh: line 3: /home/cc/.zshrc: No such file or directory


Anaconda has not installed, installing one.


--2023-02-16 08:32:04--  https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.130.3, 104.16.131.3, 2606:4700::6810:8303, ...
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.130.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 773428196 (738M) [application/x-sh]
Saving to: ‘Anaconda3-2022.10-Linux-x86_64.sh’


PREFIX=/home/cc/anaconda3



Unpacking payload ...



Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /home/cc/anaconda3

  added / updated specs:
    - _ipyw_jlab_nb_ext_conf==0.1.0=py39h06a4308_1
    - _libgcc_mutex==0.1=main
    - _openmp_mutex==5.1=1_gnu
    - alabaster==0.7.12=pyhd3eb1b0_0
    - anaconda-client==1.11.0=py39h06a4308_0
    - anaconda-navigator==2.3.1=py39h06a4308_0
    - anaconda-project==0.11.1=py39h06a4308_0
    - anaconda==2022.10=py39_0
    - anyio==3.5.0=py39h06a4308_0
    - appdirs==1.4.4=pyhd3eb1b0_0
    - argon2-cffi-bindings==21.2.0=py39h7f8727e_0
    - argon2-cffi==21.3.0=pyhd3eb1b0_0
    - arrow==1.2.2=pyhd3eb1b0_0
    - astroid==2.11.7=py39h06a4308_0
    - astropy==5.1=py39h7deecbd_0
    - atomicwrites==1.4.0=py_0
    - attrs==21.4.0=pyhd3eb1b0_0
    - automat==20.2.0=py_0
    - autopep8==1.6.0=pyhd3eb1b0_1
    - babel==2.9.1=pyhd3eb1b0_0
    - backcall==0.2.0=pyhd3eb1b0_0
    - backports.functools_l

./install_conda_deps_cpu.sh: line 14: /home/cc/.zshrc: No such file or directory


modified      /home/cc/anaconda3/condabin/conda
modified      /home/cc/anaconda3/bin/conda
modified      /home/cc/anaconda3/bin/conda-env
no change     /home/cc/anaconda3/bin/activate
no change     /home/cc/anaconda3/bin/deactivate
no change     /home/cc/anaconda3/etc/profile.d/conda.sh
no change     /home/cc/anaconda3/etc/fish/conf.d/conda.fish
no change     /home/cc/anaconda3/shell/condabin/Conda.psm1
no change     /home/cc/anaconda3/shell/condabin/conda-hook.ps1
no change     /home/cc/anaconda3/lib/python3.9/site-packages/xontrib/conda.xsh
no change     /home/cc/anaconda3/etc/profile.d/conda.csh
modified      /home/cc/.zshrc

==> For changes to take effect, close and re-open your current shell. <==

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done




  current version: 22.9.0
  latest version: 23.1.0

Please update conda by running

    $ conda update -n base -c defaults conda





## Package Plan ##

  environment location: /home/cc/anaconda3/envs/flashnet-trovi-env

  added / updated specs:
    - python=3.8


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2023.01.10 |       h06a4308_0         120 KB
    certifi-2022.12.7          |   py38h06a4308_0         150 KB
    libffi-3.4.2               |       h6a678d5_6         136 KB
    ncurses-6.4                |       h6a678d5_0         914 KB
    openssl-1.1.1t             |       h7f8727e_0         3.7 MB
    pip-22.3.1                 |   py38h06a4308_0         2.7 MB
    python-3.8.16              |       h7a1cb2a_2        23.9 MB
    readline-8.2               |       h5eee18b_0         357 KB
    setuptools-65.6.3          |   py38h06a4308_0         1.1 MB
    sqlite-3.40.1              |       h5082296_0         1.2 MB
    xz-5.2.10                  |       h5eee18b_1         429 KB
    zlib

Cloning into 'mlperf-logging'...


Obtaining file:///tmp/mlperf-logging
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Installing collected packages: mlperf-logging
  Running setup.py develop for mlperf-logging
Successfully installed mlperf-logging-2.1.0
Collecting jupyter
  Downloading jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
Collecting qtconsole
  Downloading qtconsole-5.4.0-py3-none-any.whl (121 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.0/121.0 kB 14.8 MB/s eta 0:00:00
Collecting nbconvert
  Downloading nbconvert-7.2.9-py3-none-any.whl (274 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 274.9/274.9 kB 33.7 MB/s eta 0:00:00
Collecting notebook
  Downloading notebook-6.5.2-py3-none-any.whl (439 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 439.1/439.1 kB 31.3 MB/s eta 0:00:00
Collecting jupyter-console
  Downloading jupyter_console-6.5.1-py3-none-any.whl (23 kB)
Collecting mistune<3,>=2.0.3
  Downloading mistune-2.0.5-py2.py3-none-any.whl (24 kB)
Co








## Package Plan ##

  environment location: /home/cc/anaconda3/envs/flashnet-trovi-env

  added / updated specs:
    - _libgcc_mutex
    - _openmp_mutex
    - asttokens
    - backcall
    - ca-certificates
    - certifi
    - comm
    - debugpy
    - decorator
    - entrypoints
    - executing
    - ipykernel
    - ipython
    - jedi
    - jupyter_client
    - jupyter_core
    - ld_impl_linux-64
    - libffi
    - libgcc-ng
    - libgomp
    - libsodium
    - libstdcxx-ng
    - matplotlib-inline
    - ncurses
    - nest-asyncio
    - openssl
    - packaging
    - parso
    - pexpect
    - pickleshare
    - pip
    - platformdirs
    - prompt-toolkit
    - psutil
    - ptyprocess
    - pure_eval
    - pygments
    - python-dateutil
    - python=3.8
    - pyzmq
    - readline
    - setuptools
    - six
    - sqlite
    - stack_data
    - tk
    - tornado
    - traitlets
    - wcwidth
    - wheel
    - xz
    - zeromq
    - zlib


The following packages will be downloaded:

    package  

debconf: unable to initialize frontend: Dialog
debconf: (Dialog frontend will not work on a dumb terminal, an emacs shell buffer, or without a controlling terminal.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
dpkg-preconfigure: unable to re-open stdin: 


Fetched 43.0 kB in 0s (124 kB/s)
Selecting previously unselected package tree.
(Reading database ... 71127 files and directories currently installed.)
Preparing to unpack .../tree_1.8.0-1_amd64.deb ...
Unpacking tree (1.8.0-1) ...
Setting up tree (1.8.0-1) ...
Processing triggers for man-db (2.9.1-1) ...
flashnet-trovi
├── LICENSE
├── README.md
├── commonutils
│   ├── default_ip_finder.py
│   └── pattern_checker.py
├── data
│   ├── trace_profile
│   │   └── nvme0n1
│   │       ├── tencent.cut.per_50k.most_rand_iops.537.png
│   │       ├── tencent.cut.per_50k.most_rand_iops.537.trace
│   │       ├── tencent.cut.per_50k.most_rand_iops.537.trace.stats
│   │       ├── tencent.cut.per_50k.most_size_thpt.222.png
│   │       ├── tencent.cut.per_50k.most_size_thpt.222.trace
│   │       ├── tencent.cut.per_50k.most_size_thpt.222.trace.stats
│   │       ├── tencent.cut.per_50k.rw_60_40.490.png
│   │       ├── tencent.cut.per_50k.rw_60_40.490.trace
│   │       ├── tencent.cut.per_50k.rw_60_40.490

### 1. Run Tail Analyzer for Labeling

In [6]:
with ssh.Remote(ip_addr) as conn:
    conn.run('source ~/.zshrc && cd flashnet-trovi && export FLASHNET=`pwd` && echo $FLASHNET && cd $FLASHNET/model_collection/1_per_io_admission/tail_analyzer && conda activate "flashnet-trovi-env" && ./tail_v2.py -files $FLASHNET/data/trace_profile/nvme0n1/tencent.cut.per_50k*.trace')

/home/cc/flashnet-trovi
trace_profiles = ['/home/cc/flashnet-trovi/data/trace_profile/nvme0n1/tencent.cut.per_50k.most_rand_iops.537.trace', '/home/cc/flashnet-trovi/data/trace_profile/nvme0n1/tencent.cut.per_50k.most_size_thpt.222.trace', '/home/cc/flashnet-trovi/data/trace_profile/nvme0n1/tencent.cut.per_50k.rw_60_40.490.trace', '/home/cc/flashnet-trovi/data/trace_profile/nvme0n1/tencent.cut.per_50k.rw_65_35.211.trace', '/home/cc/flashnet-trovi/data/trace_profile/nvme0n1/tencent.cut.per_50k.rw_75_25.379.trace']

Processing /home/cc/flashnet-trovi/data/trace_profile/nvme0n1/tencent.cut.per_50k.most_rand_iops.537.trace
#IO labeled = 50000
Fast IO = 36216
Slow IO = 13784
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.labeled
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.lat_cdf.png

Processing /home/cc/flash
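The tail analyzer labels every I/O in a trace as fast or slow (36,216 fast and 13,784 slow in the first profile above). The rule that `tail_v2.py` applies is not shown in this notebook. Purely as an illustration of the idea, the sketch below assumes a simple latency-percentile cutoff; the function name `label_by_latency`, the `slow_fraction` parameter, and the sample latencies are all hypothetical.

```python
def label_by_latency(latencies_us, slow_fraction=0.25):
    """Label each I/O as 1 (slow) or 0 (fast) using a percentile cutoff.

    Assumption: the slowest `slow_fraction` of I/Os are treated as tail I/Os.
    The real tail analyzer may use a different criterion.
    """
    if not latencies_us:
        return [], None
    ordered = sorted(latencies_us)
    # Index of the first latency that falls into the "slow" tail.
    cut_idx = min(int(len(ordered) * (1.0 - slow_fraction)), len(ordered) - 1)
    threshold = ordered[cut_idx]
    labels = [1 if lat >= threshold else 0 for lat in latencies_us]
    return labels, threshold


# Made-up latencies in microseconds; two of them are clear outliers.
latencies = [90, 110, 95, 2400, 105, 100, 3100, 98]
labels, threshold = label_by_latency(latencies, slow_fraction=0.25)
print(labels, threshold)  # [0, 0, 0, 1, 0, 0, 1, 0] 2400
```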

### 2. Run Feature Extractor

In [7]:
# run on multiple profiles
with ssh.Remote(ip_addr) as conn:
    conn.run('source ~/.zshrc && cd flashnet-trovi && export FLASHNET=`pwd` && echo $FLASHNET && cd $FLASHNET/model_collection/1_per_io_admission/feature_extractor/ && conda activate flashnet-trovi-env && ./feat_v2.py -files ../dataset/nvme0n1/tencent.cut.per_50k*/profile_v2.labeled')

/home/cc/flashnet-trovi
trace_profiles = ['../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.labeled', '../dataset/nvme0n1/tencent.cut.per_50k.most_size_thpt.222/profile_v2.labeled', '../dataset/nvme0n1/tencent.cut.per_50k.rw_60_40.490/profile_v2.labeled', '../dataset/nvme0n1/tencent.cut.per_50k.rw_65_35.211/profile_v2.labeled', '../dataset/nvme0n1/tencent.cut.per_50k.rw_75_25.379/profile_v2.labeled']

Processing ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.labeled
Removed 3 first IOs because they don't have enough historical data
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.feat_v2.dataset
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.feat_v2.readonly.dataset

Processing ../dataset/nvme0n1/tencent.cut.per_50k.most_size_thpt.222/profile_v2.labeled
Removed 3 first IOs because they don't have enough historical data
===== output file : ../dataset/nvme0n1/tencen
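The message "Removed 3 first IOs because they don't have enough historical data" suggests that each feature row is built from the I/Os that precede it. The sketch below shows that idea under stated assumptions: a history depth of 3 (inferred from the log line), and hypothetical field names `size`, `is_read`, `latency`, and `label`. It is not the feature set that `feat_v2.py` actually emits.

```python
from collections import deque

HISTORY = 3  # assumed history depth, inferred from the "Removed 3 first IOs" message


def extract_features(ios, history=HISTORY):
    """Build one feature row per I/O from the `history` I/Os that precede it.

    `ios` is a list of dicts with hypothetical keys: size, is_read, latency, label.
    I/Os without a full history window are skipped, so the first `history`
    entries produce no row.
    """
    rows = []
    past = deque(maxlen=history)  # sliding window over the previous I/Os
    for io in ios:
        if len(past) == history:
            rows.append({
                "is_read": io["is_read"],
                "size": io["size"],
                "hist_latency": [p["latency"] for p in past],
                "label": io["label"],
            })
        past.append(io)
    return rows


# Five made-up I/Os yield two feature rows once the first three are dropped.
sample = [
    {"size": 4096 * (i + 1), "is_read": 1, "latency": 100 + i, "label": 0}
    for i in range(5)
]
features = extract_features(sample)
print(len(features))  # 2
```

Treating the latency of earlier I/Os as known at submission time is a simplification; in a live system an earlier I/O may still be in flight.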

### 3. Train the NN model

In [8]:
# train on multiple datasets
with ssh.Remote(ip_addr) as conn:
    conn.run('source ~/.zshrc && cd flashnet-trovi && export FLASHNET=`pwd` && echo $FLASHNET && cd $FLASHNET/model_collection/1_per_io_admission/train/ && conda activate flashnet-trovi-env && ./train_and_eval.py -model model_binary_nn -datasets ../dataset/nvme0n1/tencent*cut*per*/profile*feat*.dataset -train_eval_split 50_50')

/home/cc/flashnet-trovi


2023-02-16 08:35:32.417559: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[8.2122582e-01 9.2154316e+03 1.4032305e+03 1.4001462e+03 1.4026096e+03
  2.8238577e+01 2.8158209e+01 2.7460609e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.feat_v2/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.feat_v2/model_binary_nn/eval.png


2023-02-16 08:36:09.540378: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[1.00000000e+00 8.65139453e+03 1.55035022e+03 1.54121460e+03
  1.53899670e+03 1.40773945e+01 1.56729841e+01 1.61304436e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.feat_v2.readonly/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.feat_v2.readonly/model_binary_nn/eval.png


2023-02-16 08:36:41.505425: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[8.31826389e-01 2.56538047e+04 1.91917346e+03 1.91786145e+03
  1.92158630e+03 1.59858665e+01 1.56592913e+01 1.56299715e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.most_size_thpt.222/profile_v2.feat_v2/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.most_size_thpt.222/profile_v2.feat_v2/model_binary_nn/eval.png


2023-02-16 08:37:18.535107: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[1.0000000e+00 2.9571211e+04 2.1310269e+03 2.1247959e+03 2.1238579e+03
  1.5582695e+01 1.5502524e+01 1.5549119e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.most_size_thpt.222/profile_v2.feat_v2.readonly/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.most_size_thpt.222/profile_v2.feat_v2.readonly/model_binary_nn/eval.png


2023-02-16 08:37:50.346659: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[6.0012865e-01 1.2135454e+04 1.1996090e+03 1.2010928e+03 1.2041250e+03
  4.1297359e+01 4.1690647e+01 4.1668209e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.rw_60_40.490/profile_v2.feat_v2/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.rw_60_40.490/profile_v2.feat_v2/model_binary_nn/eval.png


2023-02-16 08:38:26.921135: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[1.0000000e+00 1.1951129e+04 1.6033530e+03 1.5920044e+03 1.5681174e+03
  1.1676747e+01 1.5243935e+01 1.5015527e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.rw_60_40.490/profile_v2.feat_v2.readonly/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.rw_60_40.490/profile_v2.feat_v2.readonly/model_binary_nn/eval.png


2023-02-16 08:38:51.511356: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[6.5693301e-01 1.1375179e+04 1.2390829e+03 1.2346439e+03 1.2344220e+03
  3.9443455e+01 3.9640114e+01 3.9791630e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.rw_65_35.211/profile_v2.feat_v2/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.rw_65_35.211/profile_v2.feat_v2/model_binary_nn/eval.png


2023-02-16 08:39:28.049836: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[1.0000000e+00 1.0808831e+04 1.5907959e+03 1.5850159e+03 1.5640159e+03
  1.1379814e+01 1.5438192e+01 1.5376787e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.rw_65_35.211/profile_v2.feat_v2.readonly/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.rw_65_35.211/profile_v2.feat_v2.readonly/model_binary_nn/eval.png


2023-02-16 08:39:54.647823: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[7.5134057e-01 1.2798403e+04 1.4172970e+03 1.4208021e+03 1.4154316e+03
  3.3448116e+01 3.3381634e+01 3.2819305e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.rw_75_25.379/profile_v2.feat_v2/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.rw_75_25.379/profile_v2.feat_v2/model_binary_nn/eval.png


2023-02-16 08:40:32.418418: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


[[1.0000000e+00 1.3368988e+04 1.6702443e+03 1.6627412e+03 1.6439240e+03
  1.2688361e+01 1.5417907e+01 1.5368132e+01]]
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
===== output file : ../dataset/nvme0n1/tencent.cut.per_50k.rw_75_25.379/profile_v2.feat_v2.readonly/model_binary_nn/eval.stats
===== output figure : ../dataset/nvme0n1/tencent.cut.per_50k.rw_75_25.379/profile_v2.feat_v2.readonly/model_binary_nn/eval.png
trace_profiles = ['../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.feat_v2.dataset', '../dataset/nvme0n1/tencent.cut.per_50k.most_rand_iops.537/profile_v2.feat_v2.readonly.dataset', '../dataset/nvme0n1/tencent.cut.per_50k.most_size_thpt.222/profile_v2.feat_v2.dataset', '../dataset/nvme0n1/tencent.cut.per_50k.most_size_thpt.222/profile_v2.feat_v2.readonly.dataset', '../dataset/nvme0n1/tencent.cut.per_50k.rw_60_40.490/profile_v2.feat_v2.dataset', '../dataset/nvme0n1/tencent.cut.per_50k.rw_60_4
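The `-train_eval_split 50_50` argument tells the training script what percentage of each dataset to use for training and for evaluation. How `train_and_eval.py` performs the split (shuffled or in trace order) is not shown here. The sketch below is one plausible reading, assuming a chronological split so that evaluation rows always come after training rows; the function `split_train_eval` is hypothetical.

```python
def split_train_eval(rows, split="50_50"):
    """Split `rows` into (train, eval) according to a "TRAIN_EVAL" percentage string.

    Assumption: rows are kept in trace order, so the model is evaluated on
    I/Os that occur after the ones it was trained on.
    """
    train_pct, eval_pct = (int(part) for part in split.split("_"))
    if train_pct + eval_pct != 100:
        raise ValueError(f"split percentages must sum to 100, got {split!r}")
    cut = len(rows) * train_pct // 100
    return rows[:cut], rows[cut:]


train_rows, eval_rows = split_train_eval(list(range(10)), "50_50")
print(train_rows, eval_rows)  # [0, 1, 2, 3, 4] [5, 6, 7, 8, 9]
```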

### 4. Analyze the model performance

In [9]:
# First, we will gather all the stats
with ssh.Remote(ip_addr) as conn:
    conn.run('source ~/.zshrc && cd flashnet-trovi && export FLASHNET=`pwd` && echo $FLASHNET && cd $FLASHNET/model_collection/1_per_io_admission/script/ && conda activate flashnet-trovi-env && ./gather_eval_stats.py -files ../dataset/nvme*/*cut*/profile*/*/eval.stats')

/home/cc/flashnet-trovi
Found 10 stats files
===== output file : ../dataset/models_performance.csv
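The gathering step collects the ten `eval.stats` files into a single CSV. The layout of an `eval.stats` file is not shown in this notebook, so the sketch below assumes a simple `key = value` line format; both that format and the helper names `parse_stats` and `gather_stats` are assumptions for illustration, not the behavior of `gather_eval_stats.py`.

```python
import csv
import glob
import os


def parse_stats(path):
    """Read `key = value` lines (an assumed format) into a dict."""
    stats = {"file": path}
    with open(path) as fh:
        for line in fh:
            if "=" in line:
                key, value = line.split("=", 1)
                stats[key.strip()] = value.strip()
    return stats


def gather_stats(pattern, out_csv):
    """Merge every stats file matching `pattern` into one CSV; return the row count."""
    rows = [parse_stats(p) for p in sorted(glob.glob(pattern))]
    # Union of all keys, in first-seen order, so files with extra fields still fit.
    fields = []
    for row in rows:
        for key in row:
            if key not in fields:
                fields.append(key)
    with open(out_csv, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Under these assumptions, a call such as `gather_stats("../dataset/nvme*/*cut*/profile*/*/eval.stats", "models_performance.csv")` would mirror the glob used in the cell above.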


In [10]:
with ssh.Remote(ip_addr) as conn:
    conn.run('source ~/.zshrc && cd flashnet-trovi && export FLASHNET=`pwd` && echo $FLASHNET && ./plot_performance.py')

/home/cc/flashnet-trovi


In [14]:
with ssh.Remote(ip_addr) as conn:
    conn.get('flashnet-trovi/fnr.png')
    conn.get('flashnet-trovi/fpr.png')
    conn.get('flashnet-trovi/roc_auc.png')

To see the results, open `fnr.png`, `fpr.png`, and `roc_auc.png`, which the cell above downloads from the server to the local working directory.
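The three plots report the false negative rate (FNR), false positive rate (FPR), and ROC-AUC of each model. For reference, the sketch below computes these metrics from scratch for a binary classifier. It assumes that label 1 means a slow I/O and that the model outputs a score in [0, 1]; the sample labels and scores are made up, and the repository's own evaluation code may compute the metrics differently.

```python
def fpr_fnr(y_true, y_pred):
    """Return (false positive rate, false negative rate) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


def roc_auc(y_true, scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1, 0, 1]                   # 1 = slow I/O (assumed convention)
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]      # made-up model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]
fpr, fnr = fpr_fnr(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

In this example one slow I/O out of three is missed (FNR of 1/3), no fast I/O is misclassified (FPR of 0), and 8 of the 9 slow/fast pairs are ranked correctly (ROC-AUC of 8/9).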