
Harmony v2 #278

Merged
pati-ni merged 54 commits into immunogenomics:harmony2 from pati-ni:harmony-v2
Mar 23, 2026

pati-ni (Collaborator) commented Mar 23, 2026

Harmony 2 code getting ready for prime time

Defined two new typedefs: one for R data structures and one for internal
data structures that use floats.
Performance improvement attempt for huge datasets:
- reduced kmeans iterations to 4
- kmeans++ center seeding now uses efficient weighted sampling

- mt19937 random()

- uniform_real_distribution over (0.01, 0.99): avoids a bias towards high
random numbers, and hence towards the high-acceptance range
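
As a rough illustration of the seeding changes listed above, the following is a minimal sketch of kmeans++ seeding with weighted sampling, assuming a simple cumulative-sum scan. The names `Point`, `sq_dist`, and `kmeanspp_seed` are hypothetical and are not taken from the Harmony sources; only `mt19937` and the `uniform_real_distribution` range of 0.01 to 0.99 come from the description. The actual sampling scheme used in the PR may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical point type; floats mirror the internal float typedef mentioned above.
using Point = std::vector<float>;

// Squared Euclidean distance between two points of equal dimension.
double sq_dist(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = static_cast<double>(a[d]) - static_cast<double>(b[d]);
        s += diff * diff;
    }
    return s;
}

// Returns indices of up to k distinct points to use as initial centers.
// Fewer than k are returned only if all remaining points coincide with a center.
std::vector<std::size_t> kmeanspp_seed(const std::vector<Point>& pts,
                                       std::size_t k, std::mt19937& rng) {
    assert(k >= 1 && k <= pts.size());
    const std::size_t n = pts.size();
    std::uniform_int_distribution<std::size_t> first(0, n - 1);
    std::uniform_real_distribution<double> unif(0.01, 0.99);  // range from the PR text

    std::vector<std::size_t> centers{first(rng)};  // first center: uniform draw
    std::vector<double> d2(n, std::numeric_limits<double>::max());

    while (centers.size() < k) {
        // Only the most recent center can shrink a point's nearest-center distance.
        const Point& last = pts[centers.back()];
        double total = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            d2[i] = std::min(d2[i], sq_dist(pts[i], last));
            total += d2[i];
        }
        if (total <= 0.0) break;  // every point already sits on a center

        // Weighted draw: pick the first index whose cumulative weight reaches the target.
        // Existing centers have weight 0, so they cannot be picked again.
        const double target = unif(rng) * total;
        double acc = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            acc += d2[i];
            if (acc >= target) {
                centers.push_back(i);
                break;
            }
        }
    }
    return centers;
}
```

Calling `kmeanspp_seed(points, k, rng)` with an `std::mt19937` seeded by the caller gives the same centers for the same seed on a given standard library implementation.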

[DEV] kmeans centroids initialization

- Low memory/ batch by cluster operation
- Verbose for logging progress

[FIX] remove existing centroids from kmeans++ init centroids

[BUG] replace arma::fill::randu with std::rand()

RcppArmadillo's rand function does not work with randu

[FIX] add elements to set during centroid initialization

If the element already exists, backtrack and retry
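
The retry-on-duplicate idea can be sketched as follows. This is only an illustration under assumptions: `sample_distinct` is a hypothetical helper, not a function from the PR, and it draws uniformly, whereas the actual initialization may draw with weights.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <unordered_set>
#include <vector>

// Draws k distinct indices from [0, n). A draw that is already in the set
// is discarded and the slot is retried.
std::vector<std::size_t> sample_distinct(std::size_t n, std::size_t k,
                                         std::mt19937& rng) {
    assert(n > 0 && k <= n);
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    std::unordered_set<std::size_t> seen;
    std::vector<std::size_t> out;
    out.reserve(k);
    while (out.size() < k) {
        const std::size_t idx = pick(rng);
        // insert() reports whether idx was new; a repeat leaves out unchanged.
        if (seen.insert(idx).second) {
            out.push_back(idx);
        }
    }
    return out;
}
```

Recording every accepted index in the set is what makes the duplicate check reliable; without the insert, a repeated draw would pass unnoticed.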
Parameterize the batch proportion cutoff
unshuffle and then return
- also print some messages
- When covariate has one trivial level after subsetting it is dropped
altogether
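
A minimal sketch of dropping a covariate that is left with a single level after subsetting, assuming covariates are stored as one integer level code per cell. The `Covariate` struct and `drop_trivial` function are hypothetical names for illustration, not the PR's data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation: one integer level code per cell.
struct Covariate {
    std::string name;
    std::vector<int> levels;
};

// Subsets each covariate to keep_cells and drops any covariate that is
// left with a single level, since it can no longer separate batches.
std::vector<Covariate> drop_trivial(const std::vector<Covariate>& covs,
                                    const std::vector<std::size_t>& keep_cells) {
    std::vector<Covariate> out;
    for (const Covariate& c : covs) {
        Covariate sub{c.name, {}};
        std::set<int> distinct;
        for (std::size_t idx : keep_cells) {
            assert(idx < c.levels.size());
            sub.levels.push_back(c.levels[idx]);
            distinct.insert(c.levels[idx]);
        }
        if (distinct.size() > 1) {
            out.push_back(std::move(sub));
        }
    }
    return out;
}
```

For example, with covariates `batch = {0, 0, 1, 1}` and `donor = {0, 1, 0, 1}`, keeping only cells 0 and 1 leaves `batch` with one level, so only `donor` is returned.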
Flat dense matrix, not the most memory efficient but better performance
This is just for archival purposes
A cell may belong to several different batches when different
covariates exist.

The design assumed that every cell MUST have covariates+1 entries
in Phi.

However, this does not hold when only one of a cell's batches was dropped
while its other covariate still has support.
- Reproducible clusters by the R set.seed for the same embedding set
- null hypothesis gives a ratio close to 1
- Added pseudocount for cases when E is close to 1
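
To illustrate how a pseudocount keeps such a ratio well behaved, here is a small sketch. Everything beyond the symbol E is an assumption: the observed count, the direction of the ratio (observed over expected), the default pseudocount of 1.0, and the name `smoothed_ratio` are all hypothetical and may not match the PR's actual statistic.

```cpp
#include <cassert>

// Ratio of observed to expected counts with a pseudocount added to both.
// Assumed form for illustration; the PR's exact formula is not shown.
// Under the null (observed == expected) the result is exactly 1, and the
// pseudocount keeps the ratio finite when expected is 0.
double smoothed_ratio(double observed, double expected,
                      double pseudocount = 1.0) {
    assert(observed >= 0.0 && expected >= 0.0 && pseudocount > 0.0);
    return (observed + pseudocount) / (expected + pseudocount);
}
```

With this form, small counts no longer produce extreme ratios: for instance, 3 observed against 0 expected gives 4 rather than a division by zero.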
@pati-ni pati-ni merged commit 3617c00 into immunogenomics:harmony2 Mar 23, 2026
1 check was pending