# Cluster

```
Perform the network-based statistic (NBS) for populations X and Y at a t-statistic threshold of thresh.

    Parameters
    ----------
    x : NxMxP np.ndarray, NxP np.ndarray (voxels x P), or 4D Nifti1Image
        matrix representing the first population with P subjects. A masker
        must be provided if the data is NxP.
    y : NxMxQ np.ndarray, NxQ np.ndarray (voxels x Q), or 4D Nifti1Image
        matrix representing the second population with Q subjects. Q need not
        equal P unless paired is set to true.
    thresh : float
        minimum t-value used as threshold
    k : int
        number of permutations used to estimate the empirical null
        distribution
    tail : {'left', 'right', 'both'}
        enables specification of particular alternative hypothesis
        'left' : mean population of X < mean population of Y
        'right' : mean population of Y < mean population of X
        'both' : means are unequal (default)
    paired : bool
        use paired sample t-test instead of population t-test. requires both
        populations to have the same number of subjects (P == Q).
        default value = False
    verbose : bool
        print some extra information each iteration. default value = False
    seed : hashable, optional
        If None (default), use the np.random's global random state to generate random numbers.
        Otherwise, use a new np.random.RandomState instance seeded with the given value.

    Returns
    -------
    pval : Cx1 np.ndarray
        A vector of corrected p-values for each component of the networks
        identified. If at least one p-value is less than alpha, the omnibus
        null hypothesis can be rejected at alpha significance. The null
        hypothesis is that the value of the connectivity from each edge has
        equal mean across the two populations.
    adj : IxIxC np.ndarray
        an adjacency matrix identifying the edges comprising each component.
        edges are assigned indexed values.
    null : Kx1 np.ndarray
        A vector of K sampled from the null distribution of maximal component
        size.
```
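
To make the `paired` and `tail` options concrete, here is a minimal sketch of how they could map onto a t-statistic and a threshold. It is an illustration built on SciPy, not the code inside `cluster`: the helper names `edge_tstats` and `apply_tail` are hypothetical, and the sign convention (t is positive when X > Y) is an assumption based on the docstring above.

```python
import numpy as np
from scipy import stats


def edge_tstats(x, y, paired=False):
    """t-statistic per voxel/edge, with subjects on the last axis (hypothetical helper)."""
    if paired:
        # Paired test: x and y must have the same number of subjects.
        t, _ = stats.ttest_rel(x, y, axis=-1)
    else:
        # Independent two-sample test: subject counts may differ.
        t, _ = stats.ttest_ind(x, y, axis=-1)
    return t


def apply_tail(t, thresh, tail="both"):
    """Boolean suprathreshold map for the chosen alternative hypothesis."""
    if tail == "both":
        return np.abs(t) > thresh
    if tail == "left":   # mean of X < mean of Y, so t is strongly negative
        return t < -thresh
    if tail == "right":  # mean of Y < mean of X, so t is strongly positive
        return t > thresh
    raise ValueError(f"unknown tail: {tail!r}")
```

With `tail='both'` a voxel or edge survives if it differs strongly in either direction, which is why the example further down passes an absolute threshold such as 2.583.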

In short, `cluster` takes data from two conditions, runs t-tests between them, and thresholds the t-values to produce a suprathreshold matrix. It then finds the components (clusters) of connected suprathreshold elements in that matrix, builds a null distribution from the maximum cluster size across k permutations, and compares each observed component against that null distribution to produce p-values.
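
The steps above can be sketched for the 3D case (square connectivity matrices with subjects on the last axis). This is a toy version written with NumPy and SciPy under stated assumptions, not the implementation in `neuro_cluster`: the function names are hypothetical, it only handles the unpaired two-tailed test, cluster size is counted as the number of suprathreshold edges, and it uses `np.random.default_rng` rather than `RandomState`.

```python
import numpy as np
from scipy import stats
from scipy.sparse.csgraph import connected_components


def suprathreshold(x, y, thresh):
    """Two-sample t-test per edge, then keep edges with |t| > thresh."""
    t, _ = stats.ttest_ind(x, y, axis=-1)
    supra = np.abs(t) > thresh
    np.fill_diagonal(supra, False)  # ignore self-connections
    return supra


def component_sizes(supra):
    """Number of suprathreshold edges in each connected component."""
    n_comp, labels = connected_components(supra.astype(float), directed=False)
    sizes = np.zeros(n_comp, dtype=int)
    rows, _ = np.nonzero(np.triu(supra, k=1))  # count each edge once
    np.add.at(sizes, labels[rows], 1)
    return sizes


def toy_cluster_test(x, y, thresh, k, seed=None):
    rng = np.random.default_rng(seed)
    n_x = x.shape[-1]
    pooled = np.concatenate([x, y], axis=-1)

    # Observed components (isolated nodes have size 0 and are dropped).
    sizes = component_sizes(suprathreshold(x, y, thresh))
    observed = sizes[sizes > 0]

    # Null distribution: largest component after shuffling group labels.
    null = np.zeros(k, dtype=int)
    for i in range(k):
        perm = rng.permutation(pooled.shape[-1])
        px, py = pooled[..., perm[:n_x]], pooled[..., perm[n_x:]]
        null[i] = component_sizes(suprathreshold(px, py, thresh)).max()

    # Corrected p-value: share of permutations with a cluster at least as large.
    pvals = np.array([(null >= s).mean() for s in observed])
    return observed, pvals, null
```

Because each permutation contributes only its largest cluster, the resulting p-values are corrected for multiple comparisons across components. The resolution of the p-values is limited by `k`, so a small `k` is only useful for checking that the pipeline runs.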

Conditions can be 2D (voxels x subjects, which requires a masker), 3D (n x m x subjects, usually functional connectivity matrices), or a 4D NIfTI image (x, y, z, subjects).
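
The relationship between these shapes can be shown with plain NumPy arrays. The sketch below uses a boolean mask as a stand-in for a masker, which is an assumption made for illustration. It shows why 2D input needs a masker: once voxels are flattened, the mask is the only record of where each row sits in space.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects = 5

# 4D volume: (x, y, z, subjects), as in a 4D NIfTI image.
vol4d = rng.normal(size=(4, 4, 3, n_subjects))

# Boolean mask over the spatial grid (stand-in for a fitted masker).
mask = np.zeros((4, 4, 3), dtype=bool)
mask[1:3, 1:3, :] = True

# 2D form: (voxels inside the mask, subjects).
data2d = vol4d[mask]
print(data2d.shape)

# Going back to space needs the mask: place one value per voxel into the grid.
voxel_means = data2d.mean(axis=1)
back_in_space = np.zeros(mask.shape)
back_in_space[mask] = voxel_means

# 3D form: (n, m, subjects), e.g. one connectivity matrix per subject.
conn3d = rng.normal(size=(6, 6, n_subjects))
print(conn3d.shape)
```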

Xitong knows this stuff too so you can ask her for help.

```python
# Example

from thalpy import base, masks
import numpy as np
from neuro_cluster import cluster
import nibabel as nib

# Set up the dataset directory tree and collect the subjects
dir_tree = base.DirectoryTree('/data/backed_up/shared/xitchen_WM')
subjects = base.get_subjects(dir_tree.deconvolve_dir, dir_tree)

tasks = ['body-others', 'faces-others', 'places-others', 'tools-others', '2bk-0bk', '0bk', '2bk',
         '2bk_body-0bk_body', '2bk_faces-0bk_faces', '2bk_places-0bk_places', '2bk_tools-0bk_tools']

# Build a binary masker from masks.MOREL_PATH and fit it to a reference image
thal_masker = masks.binary_masker(masks.MOREL_PATH)
thal_masker.fit(nib.load(
    "/data/backed_up/shared/xitchen_WM/3dDeconvolve/sub-100206/100206_FIRmodel_errts_REML+tlrc.nii.gz"))

# Load precomputed betas; the indexing below assumes voxels x tasks x subjects
betas = np.load("thal_betas.npy")
print(betas.shape)
task1 = betas[:, 0, :]  # tasks[0]: 'body-others'
task2 = betas[:, 4, :]  # tasks[4]: '2bk-0bk'

# The inputs are 2D, so a masker is required.
# k=1 runs a single permutation, which only checks that the pipeline works.
comps, pvals, adj, null_arr, sz_comps = cluster.run(task1, task2, masker=thal_masker,
                                                    thresh=2.583, k=1, tail='both', paired=True)

print(comps)
```
