vcaml

An end-to-end pipeline for estimating QoE metrics of WebRTC-based video conferencing applications (VCAs) without using application-layer headers.

1. Download Datasets

2. Install Dependencies

  1. If you intend to train and evaluate our models on your own PCAPs, you will need to create CSVs using the script src/util/pcap2csv.py, which requires a working tshark installation (see the sketch after this list).
  2. For dependencies related to data collection, refer to the data collection README.
  3. Inside a Python 3 virtual environment, execute setup.py to install the remaining dependencies.
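
For a quick sense of what the PCAP-to-CSV step produces, the sketch below shows one way to dump per-packet fields with tshark from Python. The field list is an assumption made for illustration; src/util/pcap2csv.py is the canonical converter and its output schema may differ.

import subprocess

def pcap_to_csv(pcap_path, csv_path):
    # Fields chosen for illustration only; pcap2csv.py's actual schema may differ.
    fields = ['frame.time_epoch', 'ip.src', 'ip.dst',
              'udp.srcport', 'udp.dstport', 'frame.len']
    cmd = ['tshark', '-r', pcap_path, '-T', 'fields',
           '-E', 'separator=,', '-E', 'header=y']
    for field in fields:
        cmd += ['-e', field]
    with open(csv_path, 'w') as out:
        subprocess.run(cmd, stdout=out, check=True)

pcap_to_csv('trace.pcap', 'trace.csv')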

3. Collect Additional Data

Refer to In-Lab Data Collection and Real-World Data Collection for more details.

4. Prepare Inference Pipeline

To reproduce the results in our paper, download the datasets and place them under data/.

If you intend to use your own traces, place your files under data/ with the same directory structure as our datasets. Do not forget to modify config.py as per your requirements; a hypothetical illustration follows.
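
The run_model.py excerpt in step 5 references a project_config object that comes from config.py. The sketch below is purely hypothetical: the key names are invented for illustration and will not match the actual contents of config.py.

# Hypothetical sketch only; the actual keys and values in config.py differ.
project_config = {
    'data_directory': 'data/my_traces',        # mirrors our dataset layout
    'vca_names': ['meet', 'teams', 'webex'],   # VCAs present in your traces
}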

5. Train and Test Models

If you want to conduct an independent analysis of our trained models, refer to the links below, which contain our model intermediates. Place them under data/, then run the notebooks directly.

An example directory structure for the in-lab model intermediates is shown below:

vcaml/data/in_lab_data_intermediates
├── cv_splits.pkl  --> A dictionary storing the cross-validation splits
├── framesReceivedPerSecond_ip-udp-heuristic_LSTATS-TSTATS_in_lab_data_cv_1
│   ├── model.pkl  --> A dictionary of per-VCA models for one cross-validation fold
│   ├── predictions_meet.pkl  --> Predictions for a single VCA
│   ├── predictions_teams.pkl
│   └── predictions_webex.pkl
├── framesReceivedPerSecond_ip-udp-heuristic_LSTATS-TSTATS_in_lab_data_cv_2
│   ├── model.pkl
│   ├── predictions_meet.pkl
│   ├── predictions_teams.pkl
│   └── predictions_webex.pkl
├── framesReceivedPerSecond_ip-udp-heuristic_LSTATS-TSTATS_in_lab_data_cv_3
│   ├── model.pkl
│   ├── predictions_meet.pkl
│   ├── predictions_teams.pkl
│   └── predictions_webex.pkl
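
These intermediates can be opened directly with Python's pickle module. The sketch below assumes only what the tree above states (a dictionary of splits, per-VCA models, and per-VCA predictions); the exact structure inside each file may differ.

import pickle

# Cross-validation splits (a dictionary, per the annotation above).
with open('data/in_lab_data_intermediates/cv_splits.pkl', 'rb') as f:
    cv_splits = pickle.load(f)

exp = 'framesReceivedPerSecond_ip-udp-heuristic_LSTATS-TSTATS_in_lab_data_cv_1'

# Per-VCA models for one cross-validation fold.
with open(f'data/in_lab_data_intermediates/{exp}/model.pkl', 'rb') as f:
    models = pickle.load(f)

# Predictions for a single VCA (Google Meet here).
with open(f'data/in_lab_data_intermediates/{exp}/predictions_meet.pkl', 'rb') as f:
    meet_preds = pickle.load(f)

print(type(models), type(meet_preds))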

To train and evaluate models from scratch, refer to run_model.py and modify the part of the code shown below according to your requirements.

# Standard-library and pandas imports used by this excerpt; the project-local
# names (FileProcessor, KfoldCVOverFiles, ModelRunner, project_config) are
# imported at the top of run_model.py itself.
import os
from collections import defaultdict
from itertools import product

import pandas as pd

if __name__ == '__main__':

    metrics = ['framesReceivedPerSecond', 'bitrate',
               'frame_jitter', 'frameHeight']  # what to predict
    estimation_methods = ['ip-udp-ml', 'rtp-ml', 'ip-udp-heuristic', 'rtp-heuristic']  # how to predict
    # groups of features as per `features.feature_extraction.py`
    feature_subsets = [['LSTATS', 'TSTATS']]
    data_dir = ['/home/taveesh/Documents/vcaml/data/in_lab_data']

    bname = os.path.basename(data_dir[0])

    # Get a list of pairs (trace_csv_file, ground_truth)

    fp = FileProcessor(data_directory=data_dir[0])
    file_dict = fp.get_linked_files()

    # Create 5-fold cross validation splits and validate files. Refer `src/util/validator.py` for more details

    kcv = KfoldCVOverFiles(5, file_dict, project_config, bname)
    file_splits = kcv.split()

    vca_preds = defaultdict(list)

    param_list = [metrics, estimation_methods, feature_subsets, data_dir]

    # Run models over 5 cross validations

    for metric, estimation_method, feature_subset, data_dir in product(*param_list):
        if metric == 'frameHeight' and 'heuristic' in estimation_method:
            continue
        models = []
        cv_idx = 1
        for fsp in file_splits:
            model_runner = ModelRunner(
                metric, estimation_method, feature_subset, data_dir, cv_idx)
            vca_model = model_runner.train_model(fsp)
            predictions = model_runner.get_test_set_predictions(fsp, vca_model)
            models.append(vca_model)

            for vca in predictions:
                vca_preds[vca].append(pd.concat(predictions[vca], axis=0))

            cv_idx += 1
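
After the loop finishes, vca_preds holds the concatenated prediction frames per VCA. As a rough sketch of how per-VCA error figures like those in log.txt could be computed, assuming (hypothetically) that each prediction frame has ground-truth and prediction columns named 'gt' and 'pred':

for vca, frames in vca_preds.items():
    all_preds = pd.concat(frames, axis=0)
    abs_err = (all_preds['gt'] - all_preds['pred']).abs()  # column names are assumed
    mae = abs_err.mean()
    # One plausible accuracy definition: share of predictions within
    # +/- 2 units of the ground truth, as a percentage.
    accuracy = 100 * (abs_err <= 2).mean()
    print(f'{vca}: MAE_avg = {mae:.2f} || Accuracy_avg = {accuracy:.2f}')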

While the models run, a log.txt file is created to track progress. An example is shown below:

2023-09-16 14:09:14.841418	VCA: teams || Experiment : framesReceivedPerSecond_ip-udp-ml_LSTATS-TSTATS_in_lab_data_cv_1 || MAE_avg = 1.93 || Accuracy_avg = 77.35
2023-09-16 14:09:14.841507	VCA: meet || Experiment : framesReceivedPerSecond_ip-udp-ml_LSTATS-TSTATS_in_lab_data_cv_1 || MAE_avg = 1.31 || Accuracy_avg = 87.64
2023-09-16 14:09:14.841556	VCA: webex || Experiment : framesReceivedPerSecond_ip-udp-ml_LSTATS-TSTATS_in_lab_data_cv_1 || MAE_avg = 0.85 || Accuracy_avg = 90.9
2023-09-16 14:13:23.324799	VCA: teams || Experiment : framesReceivedPerSecond_ip-udp-ml_LSTATS-TSTATS_in_lab_data_cv_2 || MAE_avg = 1.83 || Accuracy_avg = 82.06
2023-09-16 14:13:23.324886	VCA: meet || Experiment : framesReceivedPerSecond_ip-udp-ml_LSTATS-TSTATS_in_lab_data_cv_2 || MAE_avg = 1.36 || Accuracy_avg = 86.45
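
Each line follows a fixed pattern, so the log can be parsed programmatically. Below is a minimal sketch based only on the format shown above (the whitespace after the timestamp is assumed to be a tab or spaces):

import re

LOG_PATTERN = re.compile(
    r'(?P<timestamp>\S+ \S+)\s+VCA: (?P<vca>\w+) \|\| '
    r'Experiment : (?P<experiment>\S+) \|\| '
    r'MAE_avg = (?P<mae>[\d.]+) \|\| Accuracy_avg = (?P<accuracy>[\d.]+)'
)

with open('log.txt') as f:
    rows = [m.groupdict() for line in f if (m := LOG_PATTERN.match(line))]

print(rows[0])  # e.g. {'timestamp': '2023-09-16 14:09:14.841418', 'vca': 'teams', ...}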

6. Cite Our Work

@inproceedings{10.1145/3618257.3624828,
    author = {Sharma, Taveesh and Mangla, Tarun and Gupta, Arpit and Jiang, Junchen and Feamster, Nick},
    title = {Estimating WebRTC Video QoE Metrics Without Using Application Headers},
    year = {2023},
    isbn = {9798400703829},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3618257.3624828},
    doi = {10.1145/3618257.3624828},
    booktitle = {Proceedings of the 2023 ACM on Internet Measurement Conference},
    pages = {485–500},
    numpages = {16},
    keywords = {machine learning, access networks, quality of experience, video conferencing},
    location = {Montreal QC, Canada},
    series = {IMC '23}
}
