# Term Project: Checkpoint #1

**Group 8:** Palvi Sabherwal, Hannah Shakouri, Emily Thai

## EXPERIMENT

*Provide a high-level explanation of the experiment(s) that you want to run through netUnicorn/netReplica. What type of data do you need to collect?*

Our project's goal is to predict the download times of video chunks on video streaming platforms, so that playback interruptions caused by fluctuating network performance can be minimized. The model's input will be historical QoS metrics, including throughput, latency, and packet loss. To build the training data, we will use netUnicorn/netReplica to collect time-series QoS measurements from video streaming sessions.
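
To make the intended model input concrete, the sketch below turns a time series of per-chunk QoS samples into fixed-length history windows, each paired with the next chunk's download time. The `QoSSample` fields, the `make_windows` helper, and the sample values are assumptions for illustration only; they are not a format produced by netUnicorn or netReplica.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QoSSample:
    """One per-chunk measurement (field names are hypothetical)."""
    throughput_mbps: float
    latency_ms: float
    packet_loss: float      # fraction in [0, 1]
    download_time_s: float  # time taken to download this chunk


def make_windows(samples: List[QoSSample], history: int) -> List[Tuple[List[float], float]]:
    """Pair the QoS metrics of the previous `history` samples with the next download time."""
    pairs = []
    for i in range(history, len(samples)):
        features: List[float] = []
        for s in samples[i - history:i]:
            features.extend([s.throughput_mbps, s.latency_ms, s.packet_loss])
        pairs.append((features, samples[i].download_time_s))
    return pairs


# Made-up values for a single short session.
session = [
    QoSSample(12.0, 30.0, 0.00, 0.8),
    QoSSample(10.5, 35.0, 0.01, 0.9),
    QoSSample(6.0, 60.0, 0.05, 1.7),
    QoSSample(11.0, 32.0, 0.00, 0.8),
]
windows = make_windows(session, history=2)
```

Each element of `windows` is a `(features, target)` pair, which is the shape most regression models expect for training.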

## DATA COLLECTION PIPELINES
*Provide (pseudo)code for the pipeline(s) that you will run for your data collection.*

### Import Statements
*For each task in your pipeline, provide a reference to the implementation of this task that you will use for your data collection.*

These import statements reference the task implementations used in our pipeline.

In [2]:
import os
import time
import pandas as pd

from netunicorn.client.remote import RemoteClient, RemoteClientException
from netunicorn.base import Experiment, ExperimentStatus, Pipeline
from netunicorn.library.tasks.capture.tcpdump import StartCapture, StopNamedCapture
from netunicorn.library.tasks.upload.fileio import UploadToFileIO
from netunicorn.library.tasks.upload.webdav import UploadToWebDav
from netunicorn.library.tasks.basic import SleepTask
from netunicorn.library.tasks.measurements.ookla_speedtest import SpeedTest
from netunicorn.library.tasks.video_watchers.youtube_watcher import WatchYouTubeVideo

### Set Up netUnicorn
We choose a device for our project and authenticate with our group's netUnicorn API credentials.

In [2]:
NETUNICORN_ENDPOINT = os.environ.get('NETUNICORN_ENDPOINT', 'https://pinot.cs.ucsb.edu/netunicorn')
NETUNICORN_LOGIN = os.environ.get('NETUNICORN_LOGIN', 'cs190n8')       
NETUNICORN_PASSWORD = os.environ.get('NETUNICORN_PASSWORD', 'kfazTdrx')
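
One possible way to keep credentials out of the notebook is to require them from the environment and fail early when one is missing. This is a minimal sketch: the `require_env` helper is hypothetical, and the demo value exists only so the example runs on its own.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demo value so the sketch is self-contained; in practice, export the real value in the shell.
os.environ.setdefault("NETUNICORN_LOGIN", "example-login")
login = require_env("NETUNICORN_LOGIN")
```

The same helper could be applied to `NETUNICORN_ENDPOINT` and `NETUNICORN_PASSWORD` in place of the `os.environ.get` defaults.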

In [5]:
client = RemoteClient(endpoint=NETUNICORN_ENDPOINT, login=NETUNICORN_LOGIN, password=NETUNICORN_PASSWORD)
print("Health Check: {}".format(client.healthcheck()))
nodes = client.get_nodes()
print(nodes)

Health Check: True
[<Uncountable node pool with next node template: [aws-fargate-A-cs190n8-, aws-fargate-B-cs190n8-, aws-fargate-ARM64-cs190n8-]>]


In [3]:
working_node = 'aws-fargate-A-cs190n8-'

### Collecting Network Data for Video Streaming
In this pipeline, we collect packet captures while streaming videos from YouTube.

In [4]:
pipeline = Pipeline()

# Setting early_stopping to False keeps the pipeline running even if a task fails
# pipeline.early_stopping = False

# TODO: add another YouTube video and rerun (possibly with a different duration, around 10 min)

# Generate Data for YouTube
pipeline.then(StartCapture(filepath="/tmp/youtube_capture1.pcap", name="capture1"))
for _ in range(5):
    pipeline.then(WatchYouTubeVideo("https://www.youtube.com/watch?v=-9CLMxYf2MI&ab_channel=GillianBowerSlime", 10))
pipeline.then(StopNamedCapture(start_capture_task_name="capture1"))

pipeline.then(SleepTask(2))

pipeline.then(StartCapture(filepath="/tmp/youtube_capture2.pcap", name="capture2"))
for _ in range(5):
    pipeline.then(WatchYouTubeVideo("https://www.youtube.com/watch?v=5hZ4BPv0AbY", 10))
pipeline.then(StopNamedCapture(start_capture_task_name="capture2"))

pipeline.then(StartCapture(filepath="/tmp/youtube_capture3.pcap", name="capture3"))
for _ in range(5):
    pipeline.then(WatchYouTubeVideo("https://www.youtube.com/watch?v=Yi6nzlnRvTk", 10))
pipeline.then(StopNamedCapture(start_capture_task_name="capture3"))

pipeline.then(UploadToWebDav(filepaths={"/tmp/youtube_capture1.pcap"}, endpoint="http://snl-server-5.cs.ucsb.edu/cs190n/cs190n8/youtube_capture", username="group8", password="group8"))
pipeline.then(UploadToWebDav(filepaths={"/tmp/youtube_capture2.pcap"}, endpoint="http://snl-server-5.cs.ucsb.edu/cs190n/cs190n8/youtube_capture", username="group8", password="group8"))
pipeline.then(UploadToWebDav(filepaths={"/tmp/youtube_capture3.pcap"}, endpoint="http://snl-server-5.cs.ucsb.edu/cs190n/cs190n8/youtube_capture", username="group8", password="group8"))

Pipeline(37bf9903-98e4-430c-99e6-a05bcd568754): {'root': [<netunicorn.library.tasks.capture.tcpdump.StartCapture object at 0x7f73f2907b50>], 1: [<netunicorn.library.tasks.video_watchers.youtube_watcher.WatchYouTubeVideo object at 0x7f73f2905420>], 2: [<netunicorn.library.tasks.video_watchers.youtube_watcher.WatchYouTubeVideo object at 0x7f73f29460b0>], 3: [<netunicorn.library.tasks.video_watchers.youtube_watcher.WatchYouTubeVideo object at 0x7f73f29460e0>], 4: [<netunicorn.library.tasks.video_watchers.youtube_watcher.WatchYouTubeVideo object at 0x7f73f2946140>], 5: [<netunicorn.library.tasks.video_watchers.youtube_watcher.WatchYouTubeVideo object at 0x7f73f2944c10>], 6: [<netunicorn.library.tasks.capture.tcpdump.StopNamedCapture object at 0x7f73f2906e30>], 7: [<netunicorn.library.tasks.basic.SleepTask with name 438b8673-39f1-43ad-ad81-db6d8732241f>], 8: [<netunicorn.library.tasks.capture.tcpdump.StartCapture object at 0x7f73f2944b50>], 9: [<netunicorn.library.tasks.video_watchers.youtu
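
Because every capture must be stopped under the same name it was started with, it can help to derive the name and file path for each video from a single index. The sketch below only builds such a plan as plain dictionaries and does not call netUnicorn; the `capture_plan` helper and its keys are hypothetical, and `duration` is assumed to correspond to the second argument passed to `WatchYouTubeVideo` above.

```python
from typing import Dict, List


def capture_plan(video_urls: List[str], repeats: int, duration: int) -> List[Dict[str, object]]:
    """Derive one capture name and pcap path per video so start/stop names cannot drift apart."""
    plan = []
    for index, url in enumerate(video_urls, start=1):
        name = f"capture{index}"
        plan.append({
            "name": name,  # intended for both StartCapture and StopNamedCapture
            "filepath": f"/tmp/youtube_{name}.pcap",
            "url": url,
            "repeats": repeats,
            "duration": duration,
        })
    return plan


plan = capture_plan(
    [
        "https://www.youtube.com/watch?v=5hZ4BPv0AbY",
        "https://www.youtube.com/watch?v=Yi6nzlnRvTk",
    ],
    repeats=5,
    duration=10,
)
for step in plan:
    print(step["name"], step["filepath"])
```

Each entry could then drive one `StartCapture`, a loop of `WatchYouTubeVideo` tasks, and one `StopNamedCapture`, all sharing `step["name"]`.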

In [5]:
client = RemoteClient(endpoint=NETUNICORN_ENDPOINT, login=NETUNICORN_LOGIN, password=NETUNICORN_PASSWORD)
print("Health Check: {}".format(client.healthcheck()))
nodes = client.get_nodes()
print(nodes)

Health Check: True
[<Uncountable node pool with next node template: [aws-fargate-A-cs190n8-, aws-fargate-B-cs190n8-, aws-fargate-ARM64-cs190n8-]>]


In [6]:
working_nodes = nodes.filter(lambda node: node.name.startswith(working_node)).take(1)

# Creating the experiment
experiment = Experiment().map(pipeline, working_nodes)
print(experiment)

 - Deployment: Node=aws-fargate-A-cs190n8-1, executor_id=, prepared=False, error=None


### Preparing the Experiment
We will use a predefined DockerImage.

In [7]:
from netunicorn.base import DockerImage
for deployment in experiment:
    # The image can be explored on Docker Hub
    deployment.environment_definition = DockerImage(image='speeeday/chromium-speedtest:0.3.1')

In [9]:
experiment_label = "datacoll"

Now we can prepare the experiment, check for any errors and execute.

In [10]:
try:
    client.delete_experiment(experiment_label)
except RemoteClientException:
    pass

client.prepare_experiment(experiment, experiment_label)

while True:
    info = client.get_experiment_status(experiment_label)
    print(info.status)
    if info.status == ExperimentStatus.READY:
        break
    time.sleep(20)

ExperimentStatus.PREPARING
ExperimentStatus.READY
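
The polling loop above runs until the experiment reports `READY`, so it would wait forever if preparation failed. A minimal sketch of the same polling pattern with a timeout is shown below; `wait_until` and `fake_ready` are hypothetical names, and the status check is replaced by a stand-in so the example runs without a netUnicorn deployment.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], interval_s: float, timeout_s: float) -> bool:
    """Poll `condition` every `interval_s` seconds; return False if `timeout_s` elapses first."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in for a status check: reports "ready" on the third poll.
polls = {"count": 0}


def fake_ready() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


became_ready = wait_until(fake_ready, interval_s=0.01, timeout_s=1.0)
```

In the notebook, the condition could be `lambda: client.get_experiment_status(experiment_label).status == ExperimentStatus.READY`, with an interval of 20 seconds to match the loop above.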


In [11]:
for deployment in client.get_experiment_status(experiment_label).experiment:
    print(f"Prepared: {deployment.prepared}, error: {deployment.error}")

Prepared: True, error: None


In [12]:
client.start_execution(experiment_label)

while True:
    info = client.get_experiment_status(experiment_label)
    print(info.status)
    if info.status != ExperimentStatus.RUNNING:
        break
    time.sleep(20)

ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.RUNNING
ExperimentStatus.FINISHED
