ab-testing-results-difference

About

This repository contains code for the following task:

Select the best optimization algorithm from {a1, a2, a3, ..., an}, given N sample results for each of them.

For example, suppose there are 3 optimization methods, a1(start_statement), a2(start_statement), and a3(start_statement), and you need to select which one is best for your task. To do this, run each algorithm N times (more runs are better) and build a table with N rows (one per simulation) and 3 columns (one per algorithm):

a1        a2        a3
number11  number12  number13
number21  number22  number23
...       ...       ...
numberN1  numberN2  numberN3

You can use this table as an R data frame or export it to .csv if you want.
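
A minimal sketch of how such a data frame might be built; run_a1, run_a2, run_a3, and start_statement are hypothetical placeholders for your own algorithms, not functions from this repository:

    N <- 50  # number of simulations; more is better

    # run_a1, run_a2, run_a3 stand in for your own optimization runs
    results <- data.frame(
      a1 = replicate(N, run_a1(start_statement)),
      a2 = replicate(N, run_a2(start_statement)),
      a3 = replicate(N, run_a3(start_statement))
    )

    # Optionally export the table to .csv
    write.csv(results, "results.csv", row.names = FALSE)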

The file abtest.R provides the ab_test(dataframe) function, which compares these algorithms using statistical analysis.
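
A usage sketch, assuming abtest.R is in the working directory and the results data frame built above; whether ab_test returns the comparison matrix or only prints it is an assumption here:

    source("abtest.R")

    # Produce the pairwise comparison matrix described in the Output section
    comparison <- ab_test(results)
    print(comparison)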

Idea of the method

The function performs these steps:

  1. Check each column for normality using the Anderson-Darling test.
  2. If all columns are normal, check for significant differences using analysis of variance (ANOVA); otherwise, use the Kruskal-Wallis rank sum test.
  3. If significant differences are found, compare each pair of algorithms; these pairwise comparisons produce the matrix shown in the Output section. A sketch of steps 1 and 2 follows this list.
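
A minimal sketch of the decision logic in steps 1 and 2, not the repository's actual code; it assumes the nortest package for the Anderson-Darling test and the results data frame from above:

    library(nortest)  # for ad.test (Anderson-Darling normality test)

    alpha <- 0.05

    # Step 1: Anderson-Darling normality test on each column
    is_normal <- sapply(results, function(col) ad.test(col)$p.value > alpha)

    # Reshape to long format for the group tests
    long <- data.frame(
      value = unlist(results, use.names = FALSE),
      algo  = factor(rep(names(results), each = nrow(results)))
    )

    # Step 2: ANOVA if every column looks normal, otherwise Kruskal-Wallis
    if (all(is_normal)) {
      p_value <- summary(aov(value ~ algo, data = long))[[1]][["Pr(>F)"]][1]
    } else {
      p_value <- kruskal.test(value ~ algo, data = long)$p.value
    }

    # TRUE if at least one algorithm differs significantly
    p_value < alpha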

Output

At the end, a matrix like this is produced:

     a1   a2   a3
a1    0    1   -1
a2   -1    0   -1
a3    1    1    0

Here [a1, a2] == 1 means that mean(a1) is significantly greater than mean(a2), while [a1, a3] == -1 means that mean(a1) is significantly less than mean(a3); a value of 0 means no significant difference was found.
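
One possible way to read this matrix (a suggestion, not part of the repository): the row sums rank the algorithms by how often their mean is significantly higher. Whether a higher mean is actually better depends on whether your task maximizes or minimizes the objective.

    # Result matrix from the example above
    result <- matrix(c( 0,  1, -1,
                       -1,  0, -1,
                        1,  1,  0),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(c("a1", "a2", "a3"), c("a1", "a2", "a3")))

    # Rank algorithms by row sum: here a3 has the significantly highest mean
    sort(rowSums(result), decreasing = TRUE)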

Visualizations

While running, the function also creates a boxplot for comparing the results. The example plots below were obtained by testing several hyperparameter values of a genetic algorithm implementation for ration search in the Nutrient planner.
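
A minimal sketch of a similar boxplot; the actual plot is produced inside ab_test, and the results data frame from above is assumed:

    # One box per algorithm column in the results data frame
    boxplot(results,
            xlab = "algorithm", ylab = "result",
            main = "Comparison of algorithm results")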
