Developer Notes

Mo Tiwari edited this page Dec 28, 2021 · 17 revisions

Welcome to the BanditPAM wiki!

This is a space for code contributors to keep track of notes and learnings that don't belong in GitHub issues.

Highly Requested Features that Mo won't have time to work on:

  • An R implementation of BanditPAM
  • A MATLAB implementation of BanditPAM

Less Requested Features that Mo won't have the time to work on:

  • An integration with PySpark

Gotchas:

  • setuptools will always, at least partly, use the compiler that Python itself was compiled with. This causes errors when, e.g., trying to install a clang-compiled BanditPAM on a gcc-compiled Python. This CANNOT be fixed by modifying the CC environment variable. See https://github.com/pypa/setuptools/issues/1732

Potential Cache Improvements:

  • Potentially transpose the cache to avoid false sharing
  • Move to a multi-producer, single-consumer queue for the cache so that the cache can be dynamically resized
  • Give each thread a local copy of the cache
  • Helpful resource: Lecture 9 of series in OMP
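
As a rough illustration of the false-sharing and thread-local-copy ideas above, here is a minimal sketch. It is not BanditPAM's cache: `PaddedCounter`, `count_below`, and the 64-byte line size are all hypothetical choices made for this example. Each thread writes only to its own slot, and the slots are padded so that no two of them land on the same (assumed) cache line. Compile with -fopenmp; without it, the pragmas are ignored and the code runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Assumption: 64-byte cache lines (common on x86-64, but not universal).
constexpr std::size_t kCacheLineBytes = 64;

// Hypothetical per-thread slot. The padding spaces consecutive slots one
// full (assumed) cache line apart, so threads never write to the same line.
struct PaddedCounter {
  std::size_t value = 0;
  char pad[kCacheLineBytes - sizeof(std::size_t)];
};
static_assert(sizeof(PaddedCounter) == kCacheLineBytes,
              "each slot should span exactly one assumed cache line");

// Counts the entries of `data` that are below `threshold`.
// Every thread accumulates into its own padded slot; the slots are
// merged serially once the parallel region has finished.
std::size_t count_below(const std::vector<double>& data, double threshold) {
  int num_threads = 1;
#ifdef _OPENMP
  num_threads = omp_get_max_threads();
#endif
  std::vector<PaddedCounter> counters(static_cast<std::size_t>(num_threads));
  long n = static_cast<long>(data.size());

#pragma omp parallel
  {
    int tid = 0;
#ifdef _OPENMP
    tid = omp_get_thread_num();
#endif
    assert(static_cast<std::size_t>(tid) < counters.size());
#pragma omp for
    for (long i = 0; i < n; ++i) {
      if (data[static_cast<std::size_t>(i)] < threshold) {
        counters[static_cast<std::size_t>(tid)].value += 1;
      }
    }
  }

  std::size_t total = 0;
  for (const PaddedCounter& c : counters) {
    total += c.value;
  }
  return total;
}
```

Whether the padding pays off depends on the actual cache line size and on how often threads write, so it is worth benchmarking against an unpadded version before adopting it.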

Potential OpenMP Improvements:

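  • It is good practice to put default(none) on all omp parallel constructs (including combined ones such as parallel for), so that every variable's data-sharing attribute has to be stated explicitly
  • Prevent false sharing among threads for better speedups (this depends on the local cache line size and data type sizes)
  • Consider using OpenMP reduction clauses for loop reductions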
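
The default(none) and reduction suggestions above can be sketched together as follows. This is a minimal example under stated assumptions, not code from BanditPAM: `squared_distance` is a hypothetical helper name. With default(none), the compiler rejects any variable used in the region whose sharing is not declared, and reduction(+ : total) gives each thread a private partial sum that OpenMP combines at the end. Compile with -fopenmp; without it, the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sum of squared differences between two
// equal-length vectors, accumulated with an OpenMP reduction.
double squared_distance(const std::vector<double>& a,
                        const std::vector<double>& b) {
  assert(a.size() == b.size());
  long n = static_cast<long>(a.size());
  double total = 0.0;

  // default(none): every variable used inside must be listed explicitly.
  // reduction(+ : total): each thread sums into a private copy of `total`,
  // and the copies are added together when the loop ends.
#pragma omp parallel for default(none) shared(a, b, n) reduction(+ : total)
  for (long i = 0; i < n; ++i) {
    double diff = a[static_cast<std::size_t>(i)] - b[static_cast<std::size_t>(i)];
    total += diff * diff;
  }
  return total;
}
```

Because a reduction may add the partial sums in a different order on each run, floating-point results can differ in the last bits between runs and from the serial result.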

Potential frameworks to investigate:
