## Demo
This notebook demonstrates the `rosenbaum` implementation of Rosenbaum's matching-based test on synthetic data.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import sys

import networkx as nx
import numpy as np
import pandas as pd

# Make the package in the parent directory importable.
sys.path.append("..")
from rosenbaum import rosenbaum

found cupy installation, will try use the GPU to calculate the distance matrix.


In [3]:
help(rosenbaum)

Help on function rosenbaum in module rosenbaum.rosenbaum:

rosenbaum(data, group_by, test_group, reference='rest', metric='mahalanobis', rank=True)
    Perform Rosenbaum's matching-based test for checking the association between two groups 
    using a distance-based matching approach.
    
    Parameters:
    -----------
    data : anndata.AnnData or pd.DataFrame
        The input data containing the samples and their respective features. If the input is an
        `AnnData` object, the samples and their corresponding features should be stored in `data.X` and the
        group labels in `data.obs[group_by]`. If using a `pandas.DataFrame`, the group labels should be in the
        column specified by `group_by`, and the feature matrix should be the remaining columns.
    
    group_by : str
        The column in `data.obs` or `data` (in case of a `pandas.DataFrame`) containing the group labels.
        The values of this column should include the `test_group` and potentially the `refer

In [4]:
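# Three groups of 10 two-dimensional samples each.
# A and B share mean 0 (standard deviations 2 and 1); C is shifted to mean 5.
samples_A = [np.random.normal(0, 2, 2) for _ in range(10)]
samples_B = [np.random.normal(0, 1, 2) for _ in range(10)]
samples_C = [np.random.normal(5, 1, 2) for _ in range(10)]

# Stack the samples into one DataFrame: feature columns X and Y plus a label column.
groups = ["A"] * 10 + ["B"] * 10 + ["C"] * 10
samples = np.array(samples_A + samples_B + samples_C)
data = pd.DataFrame(samples, columns=["X", "Y"])
data["Group"] = groups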
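
The samples above are drawn without a fixed seed, so every run produces different data and different P-values. A minimal sketch of a reproducible variant follows, assuming NumPy's `default_rng` generator is acceptable here. The seed value `0` and the name `seeded_data` are arbitrary choices for illustration, and P-values computed from this data will not match the outputs shown in this notebook.

```python
import numpy as np
import pandas as pd

# Seeded generator so repeated runs draw identical samples (seed 0 is arbitrary).
rng = np.random.default_rng(seed=0)

# Same layout as the cell above: three groups of 10 two-dimensional samples.
samples_A = rng.normal(loc=0, scale=2, size=(10, 2))
samples_B = rng.normal(loc=0, scale=1, size=(10, 2))
samples_C = rng.normal(loc=5, scale=1, size=(10, 2))

seeded_data = pd.DataFrame(
    np.vstack([samples_A, samples_B, samples_C]), columns=["X", "Y"]
)
seeded_data["Group"] = np.repeat(["A", "B", "C"], 10)
```

`seeded_data` can then be passed to `rosenbaum` in place of `data`.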

In [5]:
p_val, a = rosenbaum(data, group_by="Group", test_group="A", reference="B")
print("P-value", p_val)

computing variable-wise ranks.
filtered samples.
using GPU to calculate distance matrix.
using CPU to calculate distance matrix due to chosen metric.
creating distance graph.
matching samples.
P-value 0.06819805581415489
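
The log lines above name the stages of the computation: ranking each variable, building a distance matrix, creating a distance graph, and matching samples. The sketch below illustrates one way those stages could fit together with NumPy, pandas, and NetworkX. It is not the package's code: the helper names `rank_mahalanobis` and `count_cross_matches` are hypothetical, and an even number of samples is assumed so that a perfect matching exists.

```python
import networkx as nx
import numpy as np
import pandas as pd


def rank_mahalanobis(features):
    """Mahalanobis distances between samples after ranking each variable."""
    ranks = pd.DataFrame(features).rank().to_numpy()
    # Inverse covariance of the ranked variables (assumed non-singular).
    inv_cov = np.linalg.inv(np.cov(ranks, rowvar=False))
    diff = ranks[:, None, :] - ranks[None, :, :]
    squared = np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(np.maximum(squared, 0.0))


def count_cross_matches(dist, labels):
    """Pair all samples at minimum total distance; count mixed-label pairs."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    if n % 2:
        raise ValueError("this sketch assumes an even number of samples")
    # Every perfect matching has n / 2 edges, so maximizing (shift - distance)
    # is equivalent to minimizing the total distance.
    shift = dist.max() + 1.0
    graph = nx.Graph()
    for i in range(n):
        for j in range(i + 1, n):
            graph.add_edge(i, j, weight=shift - dist[i, j])
    matching = nx.max_weight_matching(graph, maxcardinality=True, weight="weight")
    return int(sum(labels[i] != labels[j] for i, j in matching))
```

A small number of cross-matches means that samples are mostly paired within their own group, which is evidence that the two groups differ.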


In [6]:
p_val, a = rosenbaum(data, group_by="Group", test_group="A", reference="C")
print("P-value", p_val)

computing variable-wise ranks.
filtered samples.
using GPU to calculate distance matrix.
using CPU to calculate distance matrix due to chosen metric.
creating distance graph.
matching samples.
P-value 0.06819805581415489


In [7]:
p_val, a = rosenbaum(data, group_by="Group", test_group="A", reference="rest")
print("P-value", p_val)

computing variable-wise ranks.
using GPU to calculate distance matrix.
using CPU to calculate distance matrix due to chosen metric.
creating distance graph.
matching samples.
P-value 0.120039980009995
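
For reference, the textbook null distribution of the cross-match count can be written down directly. With group sizes n and m, N = n + m even, and a1 cross-matched pairs, the probability is 2^a1 (N/2)! / [C(N, n) a0! a1! a2!], where a0 = (n - a1)/2 and a2 = (m - a1)/2 count the within-group pairs. The sketch below evaluates that formula. The function names are hypothetical, and the `rosenbaum` package may compute its P-value differently.

```python
from math import comb, factorial


def crossmatch_pmf(a1, n, m):
    """P(A1 = a1) under the null hypothesis, for group sizes n and m."""
    total = n + m
    if total % 2:
        raise ValueError("n + m must be even for a perfect matching")
    # a1 must leave an even number of samples in each group for within-group pairs.
    if a1 < 0 or a1 > min(n, m) or (n - a1) % 2:
        return 0.0
    a0 = (n - a1) // 2  # pairs with both samples from the first group
    a2 = (m - a1) // 2  # pairs with both samples from the second group
    numerator = 2 ** a1 * factorial(total // 2)
    denominator = comb(total, n) * factorial(a0) * factorial(a1) * factorial(a2)
    return numerator / denominator


def crossmatch_p_value(a1, n, m):
    """One-sided P-value P(A1 <= a1): few cross-matches suggest different groups."""
    return sum(crossmatch_pmf(k, n, m) for k in range(a1 + 1))
```

For two groups of 10 samples, only even cross-match counts from 0 to 10 are possible, so the P-value can take just six distinct values.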
