currentNe_GPU

Modified GPU-accelerated currentNe(https://github.com/esrud/currentNe) with PED/MAP, and VCF input support, plus complete Ne estimation & confidence intervals. GPU-accelerated fork of currentNe adding PED/MAP and VCF input, and providing end-to-end Nₑ estimation with confidence intervals. The GPU path computes weighted LD (d²) in FP64 using atomicAdd(double*), while Nₑ and CIs follow the original integration and neural-network variance model.

Requires: NVIDIA GPU ≥ Pascal (SM ≥ 6.0), NVIDIA driver + CUDA Toolkit (12+), gcc/g++ & make, and ~1 GB free GPU memory (more for large datasets).

Cooling note: Not recommended to run on passively cooled (fanless) Tesla GPUs without server-grade, front-to-back airflow. The FP64 path saturates the FP units for extended periods, creating stress-test-level thermal load (stress FPU). Inadequate airflow will cause throttling or faults.

Citations: Santiago, E., Caballero, A., Köpke, C., & Novo, I. (2024). Estimation of the contemporary effective population size from SNP data while accounting for mating structure. Molecular Ecology Resources, 24, e13890. https://doi.org/10.1111/1755-0998.13890

Santiago, E., Köpke, C. & Caballero, A. Accounting for population structure and data quality in demographic inference with linkage disequilibrium methods. Nat Commun 16, 6054 (2025). https://doi.org/10.1038/s41467-025-61378-w##

CUDA build (recommended)

unzip currentNe_gpu_full.zip
cd currentNe_gpu_full
make ARCH=sm_89        # choose your GPU's SM arch (sm_70, sm_80, sm_86, sm_89 ...) also should be set `ARCH ?=sm_89` in Makefile accordingly.

This creates ./currentNe_gpu.

CPU fallback

make cpu

This creates ./currentNe_gpu_cpu (OpenMP).

Run

General form:

ulimit -s unlimited    #default Maxloci setting to 20 million, can increase in the cpp file.
./currentNe_gpu <datafile> <num_chromosomes> [options]

<datafile>: one of
- prefix.vcf
- prefix.ped (requires prefix.map in the same folder)
- prefix.tped (with individuals as columns following the first 4 fields)
<num_chromosomes>: required (e.g., 22 for human-like autosomes, or the true count for your organism's autosomes).

Common options:

-s <N> Number of SNPs to use (default: all segregating)
-t <T> CPU threads (for non-GPU parts; default: OpenMP auto)
-o <file> Output filename (default: <prefix>_currentNe_OUTPUT.txt)
-k <int> Important, please see original description in currentNe
-q Quiet: only print Ne (and with -v also 50% & 90% CI)
-v With -q, also print CIs
-p Print full analysis to stdout instead of file

Examples:

# TPED
./currentNe_gpu mydata.tped 19 -t 8

# PED/MAP
./currentNe_gpu mypop.ped 19 -t 8

# VCF
./currentNe_gpu cohort.vcf 19 -t 8
./currentNe_gpu cohort.vcf 19 -t 8 -k 1

-t 8 is enough

Output

Full report file (unless -p): <prefix>_currentNe_OUTPUT.txt
Includes: input stats, d², expected/observed het, Ne point estimate, 50%/90% CI.

Notes

Double atomicAdd requires GPU architecture sm_60+; set ARCH accordingly.
For very large SNP counts, memory = L × N bytes (char). Consider filtering -s or thinning SNPs.

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
lib		lib
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
currentNe.cpp		currentNe.cpp
currentNe_gpu_full.zip		currentNe_gpu_full.zip
currentNe_gpu_snippet.h		currentNe_gpu_snippet.h
gpu_ld.cu		gpu_ld.cu
gpu_ld.cuh		gpu_ld.cuh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

currentNe_GPU

CUDA build (recommended)

CPU fallback

Run

Output

Notes

About

Uh oh!

Releases 1

Packages

Languages

License

hrluo93/currentNe_GPU

Folders and files

Latest commit

History

Repository files navigation

currentNe_GPU

CUDA build (recommended)

CPU fallback

Run

Output

Notes

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages