w4k2/DSE

Deterministic Sampling Ensemble


DSE-diagram1

Deterministic Sampling Ensemble diagram

DSE-diagram1

Deterministic Sampling diagram


Experiment 1 - Evaluating the best sampling method

Experiment files:

Methods:

  • DSE - Deterministic Sampling Ensemble

Base classifiers:

Data streams:

  • Generators:
  • Concept drift:
    • sudden
    • incremental
  • Objects: 15 000
  • Features: 10
  • Imbalance Ratio: 10%
  • Noise: 10%
  • Random samples: 333
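To make the stream parameters above concrete, here is a minimal sketch of how a 10% imbalance ratio and 10% noise could translate into labels for 15 000 objects. It is not the generator used in the experiments. It assumes that "Imbalance Ratio: 10%" is the minority class share, that noise means independent label flips, and that "Random samples: 333" is a random seed. The name `simulate_labels` is hypothetical, and features and concept drift are left out.

```python
import random


def simulate_labels(n_objects=15_000, minority_share=0.10, noise=0.10, seed=333):
    """Draw binary labels, then flip a fraction of them to mimic label noise.

    Assumption: this is an illustration only, not the repository's generator.
    """
    rng = random.Random(seed)
    # Clean labels: 1 marks the minority class.
    clean = [1 if rng.random() < minority_share else 0 for _ in range(n_objects)]
    # Noisy labels: each label is flipped independently with probability `noise`.
    noisy = [1 - y if rng.random() < noise else y for y in clean]
    return clean, noisy


clean, noisy = simulate_labels()
print(sum(clean) / len(clean))  # close to 0.10
print(sum(noisy) / len(noisy))  # close to 0.18 = 0.1 * 0.9 + 0.9 * 0.1
```

Under these assumptions, symmetric flipping raises the observed minority share, because more majority labels than minority labels are available to flip.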

Results:

O

Results of Random Under Sampling combination with oversampling methods. Darker is better, best value is bold and underscored

SVMS

Results of SVMSMOTE combination with undersampling methods. Darker is better, best value is bold and underscored

NCR

Results of NCR combination with oversampling methods. Darker is better, best value is bold and underscored


Experiment 2 - Evaluating the best balance ratio parameter

Files:

Methods:

  • DSE - Deterministic Sampling Ensemble

Base classifiers:

Data streams:

  • Generators:
  • Concept drift:
    • sudden
    • incremental
  • Objects: 15 000
  • Features: 10
  • Imbalance Ratio: 10%
  • Noise: 10%
  • Random samples: 333

Results:

BALANCE

Balance parameter setup experiment. Darker is better, best value is bold and underscored


Experiment 3 - Evaluating the performance on data streams with different noise ratios

Files:

Methods:

Base classifiers:

Data streams:

  • Generator: stream-learn
  • Concept drift: incremental
  • Objects: 10 000
  • Features: 10
  • Imbalance Ratio: 10%
  • Noise: 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%
  • Random samples: 111, 222, 333, 444, 555
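The list above describes a grid of streams: every noise level is combined with every random sample value. A minimal sketch of enumerating that grid follows. The dictionary keys and the function name `noise_experiment_grid` are hypothetical, and treating "Random samples" as random seeds is an assumption.

```python
from itertools import product

# Values taken from the list above, with percentages written as fractions.
NOISE_LEVELS = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40]
RANDOM_SEEDS = [111, 222, 333, 444, 555]


def noise_experiment_grid():
    """Return one configuration dict per (noise level, random seed) pair."""
    return [
        {
            "n_objects": 10_000,
            "n_features": 10,
            "imbalance_ratio": 0.10,
            "drift": "incremental",
            "noise": noise,
            "seed": seed,
        }
        for noise, seed in product(NOISE_LEVELS, RANDOM_SEEDS)
    ]


grid = noise_experiment_grid()
print(len(grid))  # 40 streams: 8 noise levels x 5 seeds
```

Each dictionary could then be handed to whichever stream generator the experiment uses. That mapping is not shown here because the generator's arguments are not given in this README.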

Results:

noise_exp

Selected mean results from noise experiments


Experiment 4 - Evaluating the performance on data streams with different balance ratios

Files:

Base classifiers:

Methods:

Data streams:

  • Generator: stream-learn
  • Concept drift: incremental
  • Objects: 10 000
  • Features: 10
  • Imbalance Ratio: 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%
  • Noise: 10%
  • Random samples: 111, 222, 333, 444, 555

Results:

balance_exp

Selected mean results from noise and balance experiments


Experiment 5 - Main evaluation (synthetic data)

Files:

Base classifiers:

Methods:

Data streams:

  • Generators:
  • Concept drifts:
    • 1 sudden
    • 1 incremental
    • 5 sudden
    • 5 incremental
  • Objects: 100 000
  • Features: 10
  • Imbalance Ratio: 10%, 20%, 30%
  • Noise: 0%, 10%
  • Random samples: 111, 222

Results:

multi_incremental_hbar

Wilcoxon pair rank-sum tests for synthetic data streams with incremental concept drift. Dashed vertical line is a critical value at a significance level of 0.05 (green – win, yellow – tie, red – loss)

multi_sudden_hbar

Wilcoxon pair rank-sum tests for synthetic data streams with sudden concept drift. Dashed vertical line is a critical value at a significance level of 0.05 (green – win, yellow – tie, red – loss)
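As a rough illustration of how a win, tie, or loss can be derived from two sets of scores, the sketch below implements a two-sided Wilcoxon rank-sum test with the normal approximation and no tie correction of the variance. The README does not say how the tests or the plotted critical value were computed, so this is an assumption and not the repository's procedure. The function names and the scores are made up for the example.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_sum_test(a, b):
    """Return (z, two-sided p) for the rank sum of `a` within the pooled sample."""
    n_a, n_b = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n_a])
    mean_w = n_a * (n_a + n_b + 1) / 2
    std_w = (n_a * n_b * (n_a + n_b + 1) / 12) ** 0.5
    z = (w - mean_w) / std_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def outcome(a, b, alpha=0.05):
    """Classify `a` against `b` as 'win', 'tie', or 'loss' at level `alpha`."""
    z, p = rank_sum_test(a, b)
    if p >= alpha:
        return "tie"
    return "win" if z > 0 else "loss"


# Made-up scores for illustration only; these are not results from the repository.
scores_a = [0.81, 0.84, 0.79, 0.86, 0.83, 0.85, 0.82, 0.80]
scores_b = [0.70, 0.72, 0.69, 0.74, 0.71, 0.73, 0.68, 0.75]
print(outcome(scores_a, scores_b))  # win
print(outcome(scores_a, scores_a))  # tie
```

For small samples or many ties, an exact test or a tie-corrected variance would be more appropriate than this approximation.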


Experiment 5 - Main evaluation (real data)

Files:

Base classifiers:

Methods:

Data streams:

Results:

covtype

F-score metric over the data chunks for covtypeNorm-1-2vsAll data stream with SVM base classifier

poker

F-score metric over the data chunks for poker-lsn-1-2vsAll data stream with SVM base classifier
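The plots above track the F-score chunk by chunk. A minimal sketch of that computation follows, assuming class 1 is the positive (minority) class and that the stream is split into fixed-size chunks. The chunk size and the function names are hypothetical, and the labels are made up.

```python
def f_score(y_true, y_pred, positive=1):
    """F1 for the positive class, computed as 2TP / (2TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    # Convention chosen here: return 0.0 when the score is undefined.
    return 2 * tp / denominator if denominator else 0.0


def f_score_per_chunk(y_true, y_pred, chunk_size):
    """Split labels and predictions into consecutive chunks and score each one."""
    return [
        f_score(y_true[i:i + chunk_size], y_pred[i:i + chunk_size])
        for i in range(0, len(y_true), chunk_size)
    ]


# Made-up labels and predictions for two chunks of four objects each.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
print(f_score_per_chunk(y_true, y_pred, chunk_size=4))  # [0.8, 0.0]
```

The second chunk scores 0.0 because its only minority object is missed, which shows how strongly a per-chunk F-score reacts to minority class errors.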
