Developer Notes
Mo Tiwari edited this page Jan 18, 2022 · 17 revisions
Welcome to the BanditPAM wiki!
This is a space for code contributors to keep track of notes and learnings that don't belong in GitHub issues.
- An R implementation of BanditPAM
- A MATLAB implementation of BanditPAM
- An integration with PySpark
- `setuptools` will always, at least partly, use the compiler that Python was compiled with. This causes a problem, e.g., when trying to install `clang`-compiled BanditPAM on `gcc`-compiled Python, which was resulting in errors. This CANNOT be fixed by modifying the `CC` environment variable. See https://github.com/pypa/setuptools/issues/1732
- You may occasionally get a bug like `(Producer: 'LLVM13.0.0' Reader: 'LLVM 12.0.0')`; somehow this was the case in `base` after uninstalling and reinstalling some `brew` packages. Weirdly, it was resolved by creating a new Python 3.8 `conda` environment, in which BanditPAM could be installed successfully, and then it was somehow (?!) fixed in `base` as well
- Potentially transpose the cache to avoid false sharing
- Move to multi-producer single-consumer queue for cache so that cache can be dynamically resized
- Give each thread a local copy of cache
- Helpful resource: Lecture 9 of series in OMP
- Good practice to have `default(none)` on all `omp parallel` constructs, so that every variable's data-sharing attribute is stated explicitly
- Prevent false sharing among threads for better speedups (This is dependent on local cache line size and datatype sizes)
- Consider using loop reductions via OpenMP
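The OpenMP notes above can be illustrated with a small sketch. It is not BanditPAM code: the function `sum_abs_diff` and its parameters are hypothetical, chosen only to show `default(none)` together with a loop reduction. With `reduction(+:total)`, each thread accumulates into its own private copy of `total`, and the copies are combined once at the end, so threads never write to a shared accumulator inside the loop (which also avoids false sharing on that variable).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example (not from the BanditPAM codebase):
// sum of |data[i] - ref| over all elements, parallelized with OpenMP.
double sum_abs_diff(const std::vector<double>& data, double ref) {
    // Plain local variables, so each one can be named in a data-sharing clause.
    const double* x = data.data();
    long n = static_cast<long>(data.size());
    double total = 0.0;

    // default(none): the compiler rejects any variable used in the region
    // whose sharing is not listed explicitly.
    // reduction(+:total): every thread gets a private copy of total,
    // and the private copies are added together after the loop.
    #pragma omp parallel for default(none) shared(x, n, ref) reduction(+:total)
    for (long i = 0; i < n; ++i) {
        double d = x[i] - ref;
        total += (d < 0.0 ? -d : d);
    }
    return total;
}
```

Built with an OpenMP flag (e.g. `-fopenmp` for `gcc`), the loop runs in parallel; without it, the pragma is ignored and the function runs serially. Because floating-point addition is not associative, the parallel result can differ from the serial one in the last few bits for general inputs.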
- Right now, we compile with system Python on the macOS GitHub runners. It appears to work, though I'm not sure if the runners are using `gcc` or `clang`, or if it matters, since the `setup.py` should detect it properly.
- pbr
- Cython -- Cython will likely be MUCH faster. It's how the `sklearn` implementation of `KMeans` is written
- Numba
- Eigen (`pybind11` supports it out of the box, and we will likely no longer need `carma` or `armadillo`)
- Boost
- Folly