
Draft: New feature: Paired permutation test #768

Merged
merged 8 commits into rasbt:master from fjhheras:paired_permutation
Mar 3, 2021

Conversation

fjhheras
Contributor

@fjhheras fjhheras commented Jan 19, 2021

Code of Conduct

Description

I would like to add the option to perform permutation tests when the two populations are paired.
http://axon.cs.byu.edu/Dan/478/assignments/permutation_test.php

The current pull request is just a first proposal. If you think this feature is useful enough to be implemented in mlxtend, I can improve the documentation/style and add non-trivial tests (it is not well tested at the moment, so it may still be buggy).

Related issues or pull requests

Fixes #767

Pull Request Checklist

  • Added a note about the modification or contribution to the ./docs/sources/CHANGELOG.md file (if applicable)
  • [PARTIAL] Added appropriate unit test functions in the ./mlxtend/*/tests directories (if applicable)
  • Modify documentation in the corresponding Jupyter Notebook under mlxtend/docs/sources/ (if applicable)
  • Ran `PYTHONPATH='.' pytest ./mlxtend -sv` and made sure that all unit tests pass (for small modifications, it might be sufficient to run only the specific test file, e.g., `PYTHONPATH='.' pytest ./mlxtend/classifier/tests/test_stacking_cv_classifier.py -sv`)
  • Checked for style issues by running `flake8 ./mlxtend`

@pep8speaks

pep8speaks commented Jan 19, 2021

Hello @fjhheras! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-03-03 01:33:55 UTC

@fjhheras fjhheras mentioned this pull request Jan 19, 2021
@coveralls

coveralls commented Jan 19, 2021

Coverage Status

Coverage decreased (-0.04%) to 90.734% when pulling f51e028 on fjhheras:paired_permutation into 894dce8 on rasbt:master.

@rasbt
Owner

rasbt commented Mar 1, 2021

Sorry for getting back to this so late. I had a weirdly configured email client and hadn't been getting GitHub notifications last month.

Anyway, thanks for the PR, I really appreciate it! I think there were some issues with the methodology, though. Coincidentally, I taught a stats course last semester (unfortunately, per the syllabus, I had to use R :P) where I implemented a paired permutation (randomization) test. As far as I know, the standard variant just flips the sign within each pair. Below is my R reference implementation, which matches the textbook results for a small toy dataset:

pairedSamples <- function(diff, nSim=1000000){
  d <- diff
  ld <- length(d)
  obsmeand <- mean(d)
  meandvec <- rep(NA, nSim)
  for (i in 1:nSim){
    # Randomly flip the sign of each within-pair difference
    rsigns <- 2 * rbinom(ld, 1, .5) - 1
    newd <- rsigns * d
    meandvec[i] <- mean(newd)
  }
  pl <- sum(meandvec >= obsmeand)  # upper-tail count
  pg <- sum(meandvec <= obsmeand)  # lower-tail count
  # Two-sided p-value: double the smaller tail
  pval <- ifelse(pl < pg, 2 * pl/nSim, 2 * pg/nSim)
  options(scipen=1)
  if (pval == 0){
    pval <- paste("p <", 1/nSim)
  } else {
    pval <- paste("p =", pval)
  }
  options(scipen=0)

  hhh <- hist(meandvec, plot=FALSE)
  hist(meandvec, xlab="Mean of differences",
       main=paste("Paired Sample Randomization Test,", pval),
       probability=TRUE, xlim=range(c(hhh$breaks, obsmeand)),
       breaks=100, col="tan")
  abline(v=obsmeand, col="red")
  mtext(paste(round(obsmeand, digits=4)), side=1, at=obsmeand)
}

nineteen80 <- c(3.67, 1.72, 3.46, 2.60, 2.03, 2.10, 3.01)
nineteen90 <- c(2.11, 1.79, 2.71, 1.89, 1.69, 1.71, 2.01)
pairedSamples(nineteen80 - nineteen90)

[Image: histogram of the simulated mean differences from pairedSamples(), with the observed mean marked in red]
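For Python readers, here is a rough NumPy port of the R reference above: an illustrative sketch, not the code that was merged into mlxtend, and the function name and defaults are my own. It uses the same sign-flip scheme and the same "double the smaller tail" two-sided p-value.

```python
import numpy as np

def paired_sign_flip_test(x, y, n_sim=100_000, seed=0):
    """Two-sided Monte Carlo paired permutation (sign-flip) test.

    Mirrors the R pairedSamples() reference: randomly flip the sign of
    each within-pair difference and compare the simulated means against
    the observed mean of differences.
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    obs = d.mean()
    # One row of random +/-1 signs per simulation round
    signs = rng.choice([-1.0, 1.0], size=(n_sim, d.size))
    sim_means = (signs * d).mean(axis=1)
    p_upper = np.mean(sim_means >= obs)
    p_lower = np.mean(sim_means <= obs)
    # Two-sided p-value: double the smaller tail, as in the R code
    return min(1.0, 2 * min(p_upper, p_lower))

nineteen80 = [3.67, 1.72, 3.46, 2.60, 2.03, 2.10, 3.01]
nineteen90 = [2.11, 1.79, 2.71, 1.89, 1.69, 1.71, 2.01]
print(paired_sign_flip_test(nineteen80, nineteen90))
```

For this toy dataset the Monte Carlo estimate should land close to the exact value of 4/128 = 0.03125, since only two of the 2^7 sign assignments per tail reach the observed mean of differences.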

For some reason, I couldn't reproduce these results using the PR's code. I think the problem was the arrays of different lengths. I tried to debug it but couldn't figure out how to fix it, so I rewrote it using a more "naive" implementation (probably too many for-loops). However, it seems to work and reproduces the expected results now (I added some unit tests).

Please let me know your feedback!

@fjhheras
Contributor Author

fjhheras commented Mar 1, 2021

Well, that was embarrassing. My local version works fine, but somehow I forgot to add a np.where when putting together the PR. My original implementation in the PR outputs the right values after this change:

indices_1 = np.where(rng.randn(n) > 0)[0]

instead of

indices_1 = rng.randn(n) > 0

In any case, the amended implementation you propose is much easier to follow. I compared the run times, and it is also a bit faster!
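As a hedged illustration of why the missing np.where mattered (a toy example, not the PR's actual code): a boolean array and an integer index array are interchangeable only while they index an array of the same length, which would explain the different-length-array failures noted above.

```python
import numpy as np

rng = np.random.RandomState(42)
n = 5
mask = rng.randn(n) > 0    # boolean array: [True, False, True, True, False]
idx = np.where(mask)[0]    # integer positions of the True entries: [0, 2, 3]

a = np.arange(5)
print(a[mask], a[idx])     # identical results while len(a) == len(mask)

b = np.arange(10)
print(b[idx])              # still fine: positional indexing works for any length
# b[mask] would raise IndexError: a length-5 boolean mask cannot index length 10
```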

@rasbt
Owner

rasbt commented Mar 3, 2021

No worries, and thanks for double-checking! Wow, I am positively surprised that this is actually not that slow, so maybe we should keep the current version then ;). I'll add the docs.

@rasbt rasbt merged commit d87638c into rasbt:master Mar 3, 2021
@Sandy4321

Can you please share a full Python code example showing how to use this new option?
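As an editorial sketch of what such an example could look like: the snippet below hand-rolls the exact variant of the test (enumerating all 2**n sign flips) rather than importing mlxtend, so it is self-contained; per this PR, the library entry point is presumably mlxtend.evaluate.permutation_test(x, y, paired=True), but that call is an assumption and is not exercised here.

```python
from itertools import product
import numpy as np

def exact_paired_permutation_test(x, y):
    """Exact two-sided paired permutation test over all 2**n sign flips."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    obs = abs(d.mean())
    hits, total = 0, 0
    # Enumerate every assignment of +/- signs to the pairwise differences
    for signs in product((-1.0, 1.0), repeat=d.size):
        total += 1
        if abs((np.array(signs) * d).mean()) >= obs:
            hits += 1
    return hits / total

# Toy data from the R example earlier in this thread
before = [3.67, 1.72, 3.46, 2.60, 2.03, 2.10, 3.01]
after = [2.11, 1.79, 2.71, 1.89, 1.69, 1.71, 2.01]
print(exact_paired_permutation_test(before, after))  # 0.03125
```

The exact enumeration is feasible here because n = 7 gives only 128 sign assignments; for larger samples, the Monte Carlo variant discussed above is the practical choice.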

@rasbt
Owner

rasbt commented Sep 3, 2021
