rsomics-ilr

Isometric log-ratio (ILR) transform of a composition table, as a single fast CLI. Equivalent to skbio.stats.composition.ilr with its default basis.

ILR maps a D-part composition out of the simplex into D-1 real coordinates that are isometric (they preserve Aitchison distances) and linearly independent, unlike CLR's D coordinates, which always sum to zero:

ilr(x) = clr(x) · Vᵀ

where V is the (D-1) × D Egozcue/Gram-Schmidt orthonormal basis (skbio's default). Each balance contrasts one part against the geometric mean of the parts before it.
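Written out per coordinate, the product reduces to a scaled log-ratio. The form below assumes row j of V (0-based, with i = j + 1) is the vector described under Origin; the centering term of clr cancels because each such row sums to zero.

```latex
\operatorname{ilr}_j(x)
  = \sqrt{\frac{i}{i+1}}
    \left( \frac{1}{i}\sum_{k=0}^{i-1} \operatorname{clr}_k(x) - \operatorname{clr}_i(x) \right)
  = \sqrt{\frac{i}{i+1}}\,
    \ln \frac{\left( \prod_{k=0}^{i-1} x_k \right)^{1/i}}{x_i},
  \qquad i = j + 1,\quad j = 0, \dots, D-2 .
```

For D = 2 this gives the single coordinate ilr_0(x) = (1/√2) · ln(x_0 / x_1).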

rsomics-ilr table.tsv [--pseudocount 0] [-o ilr.tsv]
  • table.tsv — composition table: header row of feature IDs (corner cell ignored), then one sample_id value... line per sample.
  • --pseudocount — added to every value before the log. The default 0 requires strictly positive data, matching skbio (nozero=True); set a small positive value to admit zeros.
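The pseudocount rule can be sketched as follows. This is an illustration only: `log_with_pseudocount` is a hypothetical helper, not the crate's API, and returning an error for non-positive values is an assumption about how such input might be rejected.

```rust
/// Add `pseudocount` to every value, then take the natural log.
/// Hypothetical helper: errors if any shifted value is not strictly positive.
fn log_with_pseudocount(values: &[f64], pseudocount: f64) -> Result<Vec<f64>, String> {
    values
        .iter()
        .enumerate()
        .map(|(k, &v)| {
            let shifted = v + pseudocount;
            // NaN fails the `> 0.0` comparison, so it is rejected as well.
            if shifted > 0.0 && shifted.is_finite() {
                Ok(shifted.ln())
            } else {
                Err(format!(
                    "value {v} in column {k} is not strictly positive after adding {pseudocount}"
                ))
            }
        })
        .collect()
}
```

With a pseudocount of 0, a row such as `[1.0, 0.0, 3.0]` is rejected; with 0.5 the zero becomes ln(0.5) and the row is accepted.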

Output is the samples × (D-1) coordinate matrix: an ilr0 … ilr{D-2} header, then one sample_id<TAB>value... line per sample. Use --csv for comma-separated I/O.
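A small sketch of the output layout, under stated assumptions: `ilr_column_names` and `format_row` are hypothetical names, and the number formatting (Rust's default `f64` display) is a guess rather than the tool's documented behaviour.

```rust
/// Labels for the D-1 output coordinates: ilr0, ilr1, ..., ilr{D-2}.
fn ilr_column_names(d: usize) -> Vec<String> {
    (0..d.saturating_sub(1)).map(|j| format!("ilr{j}")).collect()
}

/// One output row: the sample ID followed by its coordinates, joined by `sep`
/// ('\t' for TSV, ',' for CSV). Number formatting here is an assumption.
fn format_row(sample_id: &str, coords: &[f64], sep: char) -> String {
    let mut line = String::from(sample_id);
    for c in coords {
        line.push(sep);
        line.push_str(&c.to_string());
    }
    line
}
```

For a 4-part table the labels are `ilr0`, `ilr1`, `ilr2`, and `format_row("s1", &[0.5, -1.25], '\t')` yields `s1<TAB>0.5<TAB>-1.25`.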

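The default basis is lower-triangular in shape: row j of V is nonzero only in its first j+2 entries, and the first j+1 of those share a single value. The dense clr·Vᵀ matmul therefore collapses to one prefix-sum pass per sample, O(D) per sample rather than the O(D²) of the numpy tensordot, while staying value-exact.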
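A minimal sketch of the prefix-sum idea, not the crate's actual source. The names `clr` and `ilr_prefix_sum` are illustrative, the input is assumed strictly positive, and the basis rows are assumed to follow the definition given under Origin.

```rust
/// Centered log-ratio: ln(x_k) minus the mean of all ln(x).
fn clr(x: &[f64]) -> Vec<f64> {
    let logs: Vec<f64> = x.iter().map(|v| v.ln()).collect();
    let mean = logs.iter().sum::<f64>() / logs.len() as f64;
    logs.iter().map(|l| l - mean).collect()
}

/// ILR in a single pass over the CLR values, without materialising V.
/// Coordinate j (with i = j + 1) is sqrt(i/(i+1)) * (mean(c[0..i]) - c[i]).
fn ilr_prefix_sum(x: &[f64]) -> Vec<f64> {
    let c = clr(x);
    let mut out = Vec::with_capacity(c.len().saturating_sub(1));
    let mut prefix = 0.0; // running sum c[0] + ... + c[i-1]
    for i in 1..c.len() {
        prefix += c[i - 1];
        let n = i as f64;
        out.push((n / (n + 1.0)).sqrt() * (prefix / n - c[i]));
    }
    out
}
```

For the composition `[0.1, 0.3, 0.4, 0.2]`, `ilr_prefix_sum` returns approximately `[-0.776836, -0.683398, 0.117048]`. Each coordinate reuses the running sum from the previous one, which is where the O(D) cost comes from.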

Origin

This crate is an independent Rust reimplementation of skbio.stats.composition.ilr based on:

  • Egozcue, J. J., Pawlowsky-Glahn, V., Mateu-Figueras, G., & Barceló-Vidal, C. (2003). Isometric logratio transformations for compositional data analysis. Mathematical Geology, 35(3), 279–300. DOI: 10.1023/A:1023818214614
  • The scikit-bio implementation (Modified BSD License), read and cited: ilr computes clr(mat) then tensordot(clr, basis, axes=([axis],[1])), where the default basis is _gram_schmidt_basis(D), a (D-1) × D matrix whose row j (0-based, with i = j+1) is [1/i … (i times), -1, 0 …] · sqrt(i/(i+1)).
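The basis described above can be built explicitly for checking. This is a sketch under the assumption that each length-D vector is a row of the (D-1) × D matrix; `gram_schmidt_basis` and `dot` are illustrative names and do not refer to the crate's or scikit-bio's code.

```rust
/// Build the (D-1) x D basis row by row.
/// Row j (i = j + 1): i entries of 1/i, then -1, then zeros, all scaled by sqrt(i/(i+1)).
fn gram_schmidt_basis(d: usize) -> Vec<Vec<f64>> {
    (0..d.saturating_sub(1))
        .map(|j| {
            let i = j + 1;
            let n = i as f64;
            let scale = (n / (n + 1.0)).sqrt();
            let mut row = vec![0.0; d];
            for v in row.iter_mut().take(i) {
                *v = scale / n;
            }
            row[i] = -scale;
            row
        })
        .collect()
}

/// Plain dot product of two equal-length slices.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(p, q)| p * q).sum()
}
```

Two properties follow from the formula and are easy to verify with `dot`: every row sums to zero, and the rows are orthonormal (V·Vᵀ is the identity).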

ILR coordinates agree with scikit-bio to within ~1e-9 (tests/compat.rs compares the output against both a committed skbio-captured golden file and a live skbio.stats.composition.ilr run).

License: MIT OR Apache-2.0. Upstream credit: scikit-bio https://scikit-bio.org/ (Modified BSD License).
