Skip to content

ardroci/Packet-Sampling-for-Online-Classification-of-Encrypted-Internet-Traffic

Repository files navigation

Packet Sampling for Online Classification of Encrypted Internet Traffic

What is it?

The choice of a specific sampling strategy requires a broad and clear understanding of the capabilities and limitations of each technique and the specific behavior of the network on which it will be applied. Therefore, to properly consider a greater number of traffic scenarios and specific needs this tool provides a wider collection of packet sampling techniques, presented in the table.

Selection Scheme Trigger
Systematic Count-based position
Systematic Time-based time
Simple Random position
Additive Random position
Multi-Adaptive time

Verifying Your Download

If you’re running on Linux, you probably already have GPG (GNU Privacy Guard) installed. If you’re on Windows or OS X, you’ll need to install the appropriate version for your platform.

  • If you’re on a PC running Windows, download and install GPG4Win from here.
  • If you’re on a Macintosh running OS X, download and install GPGTools from here.

Since Windows does not have the native ability to calculate SHA1 checksums, you will also need a utility such as Microsoft File Checksum Integrity Verifier or Hashtab to verify your download.

Once you’ve installed GPG, you’ll need to download and import a copy of the my key. Do this with the following command:

$ gpg --keyserver hkp://keys.gnupg.net --recv-key CE994164

Your output should look like this:

gpg: key CE994164: public key "Ricardo Costa Oliveira <rcoliveira314@gmail.com>" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)

Verify that the key is properly installed with the command:

$ gpg --list-keys --with-fingerprint  CE994164

The output will look like this:

pub   4096R/CE994164 2016-01-13 [expires: 2020-01-13]
      Key fingerprint = 10C9 CEE5 72EE 77A2 C950  39AA 4A02 4A0C CE99 4164
uid       [ultimate] Ricardo Costa Oliveira <rcoliveira314@gmail.com>
sub   4096R/C49636FA 2016-01-13 [expires: 2020-01-13]

You’re now set up to validate your download.

This method does not rely on the integrity of the web site you downloaded the file from, only the official development team key that you install independently. To verify your file this way, you will need to download three files:

  • The ZIP file itself (e.g. Packet-Sampling-for-Online-Classification-of-Encrypted-Internet-Traffic-master.zip)
  • The file containing the calculated SHA256 hash for the ZIP, SHA2SUM
  • The signed version of that file, SHA2SUM.asc

Before verifying the checksums of the file, you must ensure that the SHA2SUM file is the one generated by me. That’s why the file is signed by my official key with a detached signature in SHA2SUM.asc.

Once you have downloaded both SHA2SUM and SHA2SUM.asc, you can verify the signature as follows:

$ gpg --verify SHA2SUM.asc SHA2SUM
gpg: Signature made Mon Jun 12 23:30:00 2016 WEST using RSA key ID CE994164
gpg: Good signature from "Ricardo Costa Oliveira <rcoliveira314@gmail.com>" [ultimate]

If you did get the “Good signature” response, you can now be assured that the checksum in the SHA2SUM file was actually provided by the development team. All that remains to be done to complete the verification is to validate that the signature you compute from the file you’ve downloaded matches the one in the SHA2SUM file. You can do that on Linux or OS X with the following command (assuming that the file is named “Amostragem-de-Pacotes-para-Classificacao-Online-de-Trafego-Internet-Cifrado-master.zip” and is in your working directory):

$ grep Packet-Sampling-for-Online-Classification-of-Encrypted-Internet-Traffic-master.zip SHA2SUM | shasum -a 256 -c

If the file is successfully authenticated, the response will look like this:

Packet-Sampling-for-Online-Classification-of-Encrypted-Internet-Traffic-master.zip: OK

Once you’ve downloaded and verified your file, you can proceed.

Usage

As simple as providing the selection scheme. Here goes the help:

$ ./sampling --help

Packet sampling for online classification of encrypted internet traffic.
Ricardo Oliveira '16

Usage: ./sampling [options]

Options:
-h,--help   show this help message and exit

  General options:
    -v, --verbose       Verbose mode.
                        Default: brief
    -f FILE, --file FILE
                        Open a file with previous captured traffic.
    -o FILE, --output FILE
                        Specifies the directory where the captured
                        traffic is saved in a pcap file.
    -p , --sniffer
                        No selection scheme is applied to the captured traffic.
    -s INTERVAL, --systematic_count INTERVAL
                        Sets the interval between samples to INTERVAL.
    -t INTERVAL, --systematic_time INTERVAL
                        Sets the interval between samples to INTERVAL ms.
    -r N, --simple_random N
                        Sets the interval between samples to
                        [0 , 2*sampling rate-2].
    -a N, --random_additive N
                        Specifies the average sampling rate.
                        On average each sampling will occur every N packets.
    -m , --multi_adaptive
                        Default values: min next sample size = 10000 ms
                                        max next samplesize = 500000 ms
                                        min interval between samples =  10000 ms
                                        max interval between samples = 500000 ms
                                        window size = 10

And a real test using multi-adaptive selection scheme

The network traffic used here was captured in a controlled environment and can be downloaded here.

$ ./sampling --file captured-traffic.pcap --output /user/home/Desktop/  --multi_adaptive

                Statistical Parameters

First Packet Fri Aug  5 18:48:03 2011
Last  Packet Fri Aug  5 19:06:19 2011
Elapsed Time 1095.917873 s

Overhead :
  Number of packets captured : 1890723
  Number of packets selected : 422516 (22.34680 %)
  Appropriate sample size :    32971

  Total Data Volume :          1393631968 Bytes
  Sampled Data Volume :        311384916  Bytes
  Number of Samples :          42535

Throughput Estimation:
  Total Throughput :           1271657 bytes/s
  Sampled Throughput :         284131  bytes/s
  Total Peak to average :      1.280
  Sampled Peak to average :    1.285
  Correlation :                0.999
  Relative Error :             0.001
  Mean :                       737
  Standard Deviation :         682.867

About

Amostragem de Pacotes para Classificacao Online de Trafego Internet Cifrado

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages