Quirks (or possibly 'bugs') of the Hyppo Friedman Rafsky module
---

In [6]:
import numpy as np
from numpy.random import default_rng
from hyppo.independence import FriedmanRafsky

seed = 402

In [7]:
# Helper Methods

# A helper method to format the input for the Friedman Rafsky test.
def combine_then_label(x1, x2):
    return_X = np.concatenate([x1, x2])

    n1 = x1.shape[0]
    n2 = x2.shape[0]

    return_Y = np.repeat([1, 2], [n1, n2])

    return return_X, return_Y

# Helper method to run Friedman Rafsky and print the following:
# - pvalue
# - corrected test statistic (from the test method)
# - uncorrected test statistic (from the statistic method)
def run_FR_simtest(x,y):
    x,y = combine_then_label(x, y)
    method = FriedmanRafsky()
    test_rslt = method.test(x,y)
    statistic_rslt = method.statistic(x,y) # uncorrected test statistic
    print(f"test method output: \n\tstatistic: {test_rslt[0]}\n\tpvalue: {test_rslt[1]}")
    print(f"statisc method output: {statistic_rslt}")

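From the description of the algorithm in the [API Reference](https://hyppo.neurodata.io/api/generated/hyppo.independence.friedmanrafsky#friedmanrafsky), one would expect the following tests to return an uncorrected statistic of $2$: the two samples are perfectly separated, so a minimum spanning tree needs only one edge between them, which gives two runs. However, the `statistic` method returns $20$, which equals the total number of points ($10 + 10$).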

In [29]:
shape_1 = (10, 2)

print("***Test 1***")
ones = np.ones(shape_1) 
zeros = np.zeros(shape_1)
run_FR_simtest(ones, zeros)

print("\n***Test 2***")
ones = np.ones(shape_1) 
hundreds = np.zeros(shape_1) + 100
run_FR_simtest(ones, hundreds)

***Test 1***
test method output: 
	statistic: 5.591740407994612
	pvalue: 0.000999000999000999
statisc method output: 20

***Test 2***
test method output: 
	statistic: 6.031816868094657
	pvalue: 0.000999000999000999
statisc method output: 20
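
One possible explanation for the value of 20 can be sketched without hyppo. The block below assumes the run count is computed as one plus the number of minimum-spanning-tree edges that join the two samples, and that the tree is built with SciPy's `minimum_spanning_tree` on a dense distance matrix. Both are assumptions about hyppo's internals, not something confirmed here, and `count_runs` is a hypothetical helper written for this illustration. SciPy treats zeros in a dense matrix as missing edges, so duplicate points cannot be linked to each other.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist


def count_runs(x1, x2):
    """Hypothetical helper (not hyppo's code): 1 + number of MST edges joining the two samples."""
    points = np.concatenate([x1, x2]).astype(float)
    labels = np.repeat([1, 2], [x1.shape[0], x2.shape[0]])
    distances = cdist(points, points)
    # SciPy reads zeros in a dense matrix as "no edge", so zero-distance
    # pairs (duplicate points) are unavailable to the spanning tree.
    mst = minimum_spanning_tree(distances).tocoo()
    cross_edges = np.sum(labels[mst.row] != labels[mst.col])
    return int(cross_edges) + 1


shape = (10, 2)
rng = np.random.default_rng(402)

# Identical points within each sample: only cross-sample edges survive.
runs_duplicates = count_runs(np.ones(shape), np.zeros(shape))

# The same two clusters with a little jitter, so no distance is exactly zero.
jitter_1 = 1 + 0.01 * rng.standard_normal(shape)
jitter_2 = 0.01 * rng.standard_normal(shape)
runs_jittered = count_runs(jitter_1, jitter_2)

print(runs_duplicates, runs_jittered)
```

Under these assumptions the duplicated inputs give 19 cross-sample edges plus one, i.e. 20, while the jittered inputs give the expected 2. If that is what happens inside hyppo, the quirk would come from the zero within-sample distances rather than from the run-counting logic itself.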


# Distributions for which FR throws an error

The following pairs of inputs surprisingly raise an error:
1. `integers` and `integers`, e.g.,
```
    rng.integers(200, 300, size=shape_1)
    rng.integers(0, 100, size=shape_1)
```
2. `ones` and `ones`
 

In [None]:
rng = default_rng(seed)  # use the seed defined above so the draws are reproducible

x1 = rng.integers(200, 300, size=shape_1)
x2 = rng.integers(0, 100, size=shape_1)

run_FR_simtest(x1, x2)
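
For the `ones` and `ones` pair, a hyppo-free check shows how degenerate the input is. This is only a sketch: it assumes the test relies on a minimum spanning tree over pairwise Euclidean distances, built with SciPy's `minimum_spanning_tree`. It does not establish the cause of the error, and it says nothing about the `integers` case.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

# Two samples of identical points, as in the `ones` and `ones` pair.
points = np.concatenate([np.ones((10, 2)), np.ones((10, 2))])
distances = cdist(points, points)  # every pairwise distance is zero

# SciPy treats zeros in a dense matrix as missing edges,
# so the resulting "tree" contains no edges at all.
mst = minimum_spanning_tree(distances)
n_edges = mst.nnz

print(distances.max(), n_edges)
```

With no edges to work with, any downstream step that expects a connected tree over all 20 points would have nothing to count.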