v0.1.1
🚀 GENBoostGPU v0.1.1
Genomic Elastic Net Boosting on GPU (GENBoostGPU)
GPU-accelerated elastic net regression with boosting for large-scale epigenomic data analysis.
✨ What’s New in v0.1.1
- Core orchestration functions added
  - `run_windows_with_dask`: batch orchestration of genomic windows across single or multiple GPUs using Dask.
  - `run_single_window`: single-region boosting elastic net with flexible inputs (preloaded arrays or file paths).
- Boosting Elastic Net updates
  - Added Optuna-based hyperparameter tuning for ElasticNet (`alpha`, `l1_ratio`).
  - Added Ridge regression tuning with delayed evaluation.
  - Early stopping based on variance explained stability.
  - Final Ridge refit for improved betas and heritability estimates.
- SNP preprocessing improvements
  - Zero-variance filtering
  - Missing genotype imputation
  - LD clumping (PLINK-like) implemented in CuPy
  - Cis-window filtering with error region support
- Output enhancements
  - Per-window `.parquet` summary tables
  - Saved betas and heritability estimates per phenotype-window pair
  - Support for reproducibility with saved model parameters
- Examples
  - `examples/vmr_test_caudate.py`: run on VMR-defined CpG regions.
  - `examples/simu_test_100n.py`: run on simulated SNP–phenotype datasets.
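
The early-stopping rule listed under the Boosting Elastic Net updates ("variance explained stability") can be pictured with a small sketch. This is an illustration only, not GENBoostGPU's actual criterion: the function names `has_plateaued` and `run_until_stable`, the `window` and `tol` parameters, and the toy trajectory are all hypothetical. The idea shown is to stop boosting once the last few variance-explained values stay within a small tolerance of each other.

```python
def has_plateaued(r2_history, window=5, tol=1e-4):
    """Return True once the last `window` R^2 values span less than `tol`.

    Hypothetical helper: the real package may define stability differently.
    """
    if len(r2_history) < window:
        return False
    recent = r2_history[-window:]
    return max(recent) - min(recent) < tol


def run_until_stable(r2_per_iteration, window=5, tol=1e-4):
    """Consume per-iteration R^2 values and return how many iterations ran."""
    history = []
    for r2 in r2_per_iteration:
        history.append(r2)
        if has_plateaued(history, window, tol):
            break  # variance explained has stopped changing meaningfully
    return len(history)


# Toy trajectory (made up): variance explained rises, then flattens.
trajectory = [0.10, 0.18, 0.23, 0.25, 0.2501, 0.2501, 0.2502, 0.2502, 0.2502, 0.2502]
stopped_at = run_until_stable(trajectory, window=4, tol=1e-3)
print(f"stopped after {stopped_at} of {len(trajectory)} iterations")
```

A range-based check like this is deliberately conservative: a single noisy iteration inside the window delays stopping rather than triggering it.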
📦 Installation
Available on [PyPI](https://pypi.org/project/genboostgpu/):

    pip install genboostgpu

Requires:
- Python ≥ 3.10
- NVIDIA GPU with CUDA 12.x
- RAPIDS cuML / cuDF / CuPy
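
Before installing, it can help to confirm that the environment meets the requirements above. The following is a minimal sketch, not part of GENBoostGPU: `check_environment` and `REQUIRED_MODULES` are hypothetical names, and the check only covers the Python version and whether the `cupy`, `cudf`, and `cuml` modules can be located. It does not verify the CUDA version or that a GPU is present.

```python
import importlib.util
import sys

# Top-level module names for the RAPIDS / CuPy stack listed above.
REQUIRED_MODULES = ("cupy", "cudf", "cuml")


def check_environment(min_python=(3, 10), modules=REQUIRED_MODULES):
    """Report whether the Python version and GPU libraries look usable."""
    report = {"python_ok": sys.version_info >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'missing'}")
```

Using `find_spec` locates each package without importing it, so the check stays fast and does not initialize a CUDA context.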
🔧 Quick Usage

    from genboostgpu.vmr_runner import run_single_window

    result = run_single_window(
        chrom=21,
        start=10_000,
        end=510_000,
        geno_path="data/chr21_subset.bed",
        pheno_path="data/phenotypes.tsv",
        pheno_id="pheno_379",
        outdir="results",
        n_iter=50,
        n_trials=10
    )
    print(result)

Full Changelog: v0.1.0...v0.1.1