<a href="https://colab.research.google.com/github/andreaaraldo/machine-learning-for-networks/blob/master/9x.ml_highspeed_networks/2.Testbed-experimental-emulation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

The best model found so far has 1, 0.98, and 0.99 on the diagonal of the confusion matrix and 0% packet loss with an inter-packet time of 0.0010 seconds.

In [None]:
!pip install tensorflow

In [None]:
from multiprocessing import Queue, Process, Event
# If you run your code on your local machine and it runs macOS,
# replace "multiprocessing" with "multiprocess"

import pandas as pd
import numpy as np
import queue
import time
import logging

# The following is to be able to mount Google Drive
from google.colab import drive

import pickle # To load the model

from tensorflow.keras.models import load_model

We connect Google Drive to load the model we trained in the other notebook.

In [None]:
mount_point = '/content/gdrive' # Always the same, don't change it
drive.mount(mount_point, force_remount=True)
drive_path = mount_point + '/My Drive/' # Always the same, don't change it

# Replace the following folder with some folder inside your google drive
my_path = drive_path + \
  'tsp/teaching/data-science-for-networks/img-from-code/09.highspeed-net/'

Mounted at /content/gdrive


In [None]:
! wget https://raw.githubusercontent.com/andreaaraldo/machine-learning-for-networks/master/9x.ml_highspeed_networks/generator.csv

--2024-06-13 08:46:12--  https://raw.githubusercontent.com/andreaaraldo/machine-learning-for-networks/master/9x.ml_highspeed_networks/generator.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 176598 (172K) [text/plain]
Saving to: ‘generator.csv’


2024-06-13 08:46:12 (7.11 MB/s) - ‘generator.csv’ saved [176598/176598]



After importing the main libraries, we can create a shared queue which simulates the physical link between two machines.


TXGEN ---> [ shared queue ] ----> RX ----> Processing
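
To see how a bounded queue turns congestion into packet loss, here is a minimal sketch. As an assumption made for simplicity, it uses the thread-level `queue.Queue` from the standard library as a stand-in for the `multiprocessing.Queue` used in this notebook (both raise `queue.Full` from `put_nowait` when the queue is full). The helper name `offer_packets` and the tiny `maxsize=4` are hypothetical, chosen only for illustration.

```python
import queue

def offer_packets(link, n_packets):
    """Try to enqueue n_packets without blocking; return (accepted, lost)."""
    accepted, lost = 0, 0
    for i in range(n_packets):
        try:
            link.put_nowait(i)      # never blocks: raises queue.Full instead
            accepted += 1
        except queue.Full:
            lost += 1               # the "link" is saturated: packet dropped
    return accepted, lost

# Nobody reads from the queue here, so only the first 4 packets fit
link = queue.Queue(maxsize=4)
accepted, lost = offer_packets(link, 10)
print(accepted, lost)   # prints: 4 6
```

In the simulation the receiver empties the queue concurrently, so packets are lost only when it reads more slowly than the generator writes.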



In [None]:
shared = Queue(maxsize=1024)   # Max size of the queue
rate = .0010 # inter-packet time (seconds);
duration = 5

# The following line should be the same as the one in the "training and testing"
# notebook
features_to_keep = ['L1-dcache-load-misses', 'L1-dcache-loads', 'L1-dcache-stores',
       'L1-icache-load-misses', 'LLC-load-misses', 'LLC-loads',
       'LLC-store-misses', 'LLC-stores', 'branch-load-misses', 'branch-misses',
       'branches', 'bus-cycles', 'cache-misses', 'cache-references',
       'context-switches', 'cpu-clock', 'cycles', 'dTLB-load-misses',
       'dTLB-store-misses', 'dTLB-stores', 'iTLB-load-misses', 'iTLB-loads',
       'instructions', 'minor-faults', 'node-load-misses', 'node-loads',
       'node-store-misses', 'node-stores', 'page-faults', 'ref-cycles',
       'task-clock']
len(features_to_keep)

31

In [None]:
# TX gen function
def txgen(id, l_queue, stop_event):
    count= 0
    lost = 0

    # Read the tx dataset and transform to numpy
    full_df = pd.read_csv('generator.csv')
    myfeatures = features_to_keep.copy()
    myfeatures.insert(0,'time')
    myfeatures.append('label')
    df = full_df[myfeatures]
    data = df.to_numpy()

    # Number of rows in the dataset, used to cycle over it
    limit = data.shape[0]

    while not stop_event.is_set():

        try:
            # Send only the feature columns: drop the leading 'time' column
            # and the trailing 'label' column
            l_queue.put_nowait(data[count % limit, 1:-1])
            #logging.debug("Packet added to the queue " +str(count))
        except queue.Full:
            #logging.debug("Packet loss!")
            lost += 1


        count += 1
        #logging.debug("Total packet sent " +str(count))
        # Wait one inter-packet time (or return early if the stop event is set)
        stop_event.wait(timeout=rate)

    print("Sent: %d Lost %d" % (count, lost))

In [None]:
# PROCESSING FUNCTION: You don't have to modify this function
# Hint: Import the trained model, perform the classification task, and then return
def processing(element):

    ##################################
    # Your processing goes here
    # after loading a model
    # use it to process the element
    #
    # y_pred = model(element)
    #
    # You should return the value of the classification task y_pred
    #
    # Bonus: You can also compare with the original y from the csv
    ##################################

    sample = element.reshape(-1,1).T
    # predict_fun is to be defined later. Its implementation will depend on the
    # model considered
    y_pred = predict_fun(sample)


    return (y_pred)

In [None]:
# RX Function
def rx(id, l_queue, stop_event):
    #logging.debug("Starting the consumer")
    count = 0

    while (not stop_event.is_set()):

        try:
            #logging.debug("Reading queue")
            pkt = l_queue.get(timeout=1)
            #logging.debug("Retrieved element")

            # Processing starting... First counter
            #t0 = time.perf_counter()
            count += 1

            #################################
            # Processing function. Here you have to put your ML approach
            # pkt is already a numpy array, including all the features but no labels
            # The processing task is to classify the pkt
            processing(pkt)
            #################################

            #logging.debug("Elapsed time: %.6f", time.perf_counter() - t0)
            #logging.debug("Count: %d", count)

        except queue.Empty:
            # No packet arrived within the timeout; check the stop event again
            continue
        except Exception as e:
            print(e)
    print("Received: %d" % count)
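
The receiver keeps up with the generator only if processing one packet takes, on average, less than the inter-packet time. A rough way to check this is sketched below, timing the prediction call with `time.perf_counter`. The names `mean_processing_time` and `dummy_predict` are hypothetical; `dummy_predict` is a plain-Python stand-in for the `predict_fun` defined later in the notebook, and the sample values are made up.

```python
import time

def mean_processing_time(predict, sample, n_calls=200):
    """Average wall-clock time (seconds) of one call to predict(sample)."""
    start = time.perf_counter()
    for _ in range(n_calls):
        predict(sample)
    return (time.perf_counter() - start) / n_calls

def dummy_predict(sample):
    # Hypothetical stand-in for predict_fun: some arithmetic on the sample
    return sum(x * x for x in sample) > 0

rate = 0.0010            # inter-packet time (seconds), as in this notebook
sample = [0.5] * 31      # one packet with 31 made-up feature values
per_packet = mean_processing_time(dummy_predict, sample)
print("Mean processing time: %.6f s" % per_packet)
print("Keeps up with the generator:", per_packet < rate)
```

Replacing `dummy_predict` with the real `predict_fun` (and `sample` with one row of the dataset) would give an estimate for each model before running the full simulation.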


In [None]:
def main():
    # Hint: use a global model loaded from the file where you saved your training model


    # Event variables for the experiment
    producer_stop = Event()
    consumer_stop = Event()

    # Logger: set the logger to level INFO for normal usage, DEBUG for detailed info
    #logging.basicConfig(level=logging.INFO)

    # Here we start the Traffic generator
    t = Process(target=txgen, args=(0, shared, producer_stop))
    t.start()

    # Here we start the receiver
    t2 = Process(target=rx, args=(0, shared, consumer_stop))
    t2.start()

    # Experiment duration
    time.sleep(duration)
    producer_stop.set()
    time.sleep(1)
    consumer_stop.set()

    print("Generator and receiver stopped")

    t.join()
    t2.join()

    print("End of simulation")
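
The start/stop logic of `main` can be tried in isolation. The sketch below follows the same pattern (one `Event` per worker, with the producer stopped first so that the consumer can drain the queue). As an assumption made to keep it light, it uses `threading` instead of `multiprocessing` and sends integers instead of feature vectors. The names `producer`, `consumer`, `link`, and `stats` are hypothetical.

```python
import queue
import threading
import time

link = queue.Queue(maxsize=8)
stats = {"sent": 0, "lost": 0, "received": 0}

def producer(stop_event, interval):
    while not stop_event.is_set():
        try:
            link.put_nowait(stats["sent"])
        except queue.Full:
            stats["lost"] += 1
        stats["sent"] += 1
        stop_event.wait(timeout=interval)   # paced sending, interrupted by set()

def consumer(stop_event):
    while not stop_event.is_set():
        try:
            link.get(timeout=0.05)
            stats["received"] += 1
        except queue.Empty:
            pass                            # nothing to read yet: try again

producer_stop, consumer_stop = threading.Event(), threading.Event()
workers = [threading.Thread(target=producer, args=(producer_stop, 0.001)),
           threading.Thread(target=consumer, args=(consumer_stop,))]
for w in workers:
    w.start()

time.sleep(0.2)          # experiment duration
producer_stop.set()      # stop sending first...
time.sleep(0.1)          # ...give the consumer time to drain the queue...
consumer_stop.set()      # ...then stop receiving
for w in workers:
    w.join()

print(stats)
```

Using `stop_event.wait(timeout=...)` rather than `time.sleep` means a worker reacts to the stop signal immediately instead of finishing its current wait.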

# Before you continue

After any simulation, before you try the next model, you should interrupt the Google Colab environment and restart it. Otherwise, leftover operations from the previous simulation may affect the performance of the next one.
To do so, use the top menu:

```
Runtime > Interrupt execution
```

Then:

* Do ``Runtime > Factory Reset runtime`` and run all the cells up to this one again (``Runtime > Run before``).
* **Jump directly** to the model you want to try and run from there.

# Logistic Regression

Load your previously trained Logistic Regressor.

In [None]:
# Replace with your filename
with open(my_path+"logistic-reg.pkl", "rb") as dump_file:
  model = pickle.load(dump_file)

We need to define the prediction function, which will be used inside the simulation.

In [None]:
def predict_fun(sample):
  y_pred = model.predict(sample)
  return y_pred

Let's run the simulation.

In [None]:
main()

Sent: %d Lost %d  4337 0

Received: %d 4337
Generator and receiver stopped
End of simulation


# Neural Network


Let's now try with our previously trained neural network.
We first load it.

In [None]:
nn_file = my_path + 'nn1.h5'
model = load_model(nn_file)

We redefine the prediction function. Note that a Keras model is invoked differently from scikit-learn models such as LogisticRegression: here we call the model directly on the sample and use `np.argmax` to pick the class with the highest output.

In [None]:
def predict_fun(sample):
  y_pred = np.argmax (model(sample, training=False) )
  return y_pred
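
The two `predict_fun` definitions differ because the two kinds of model expose predictions differently: a scikit-learn classifier's `predict` returns class labels directly, whereas the network, as used here with `np.argmax`, presumably returns one score per class, so the label is the index of the largest score. The sketch below illustrates the difference with two hypothetical stand-in classes (`LabelModel`, `ScoreModel`) written in plain Python. They are not the real scikit-learn or Keras objects, and the scores are made up.

```python
class LabelModel:
    """Stand-in for a scikit-learn style classifier: predict() returns labels."""
    def predict(self, samples):
        return [0 if sum(s) < 1.0 else 1 for s in samples]

class ScoreModel:
    """Stand-in for a Keras style model: calling it returns per-class scores."""
    def __call__(self, samples, training=False):
        return [[0.1, 0.7, 0.2] for _ in samples]

def predict_label_style(model, sample):
    # The model already returns the label of each row
    return model.predict(sample)[0]

def predict_score_style(model, sample):
    scores = model(sample, training=False)[0]
    # argmax: index of the largest score
    return max(range(len(scores)), key=scores.__getitem__)

sample = [[0.2, 0.3]]   # a single row, i.e. one packet
print(predict_label_style(LabelModel(), sample))   # prints: 0
print(predict_score_style(ScoreModel(), sample))   # prints: 1
```

Because `processing` only calls `predict_fun(sample)`, swapping models requires redefining this one function and nothing else in the simulation.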

We now run the `main` function again, this time with the new model.

In [None]:
main()

Sent: %d Lost %d  4187 2132
Received: %d 1268
Generator and receiver stopped


Observe that we lost a much larger fraction of packets: 2132 of the 4187 sent (about 51%), compared with none for the logistic regression.
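
The counters printed by the two runs make the comparison concrete. The short calculation below uses only the numbers shown in the outputs above and compares them with the nominal packet count `duration / rate`. The helper name `loss_fraction` and the `runs` dictionary are hypothetical.

```python
def loss_fraction(sent, lost):
    """Fraction of generated packets that found the queue full."""
    return lost / sent if sent else 0.0

rate, duration = 0.0010, 5
nominal = round(duration / rate)   # packets if sending were perfectly paced

# (sent, lost) as printed by the two simulations
runs = {"logistic regression": (4337, 0), "neural network": (4187, 2132)}
for name, (sent, lost) in runs.items():
    print("%s: sent %d of %d nominal, loss %.1f%%"
          % (name, sent, nominal, 100 * loss_fraction(sent, lost)))
```

Both runs send fewer than the nominal 5000 packets, presumably because each loop iteration costs the inter-packet wait plus the time needed to enqueue a packet.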

# Your own models

Try with other models (other NN architectures, other types of classifiers, etc.).