More BLeSS efficiencies #434

davidcerny · 2024-03-19T18:38:47Z

Some additional improvements to the performance of ln(pdf), again following very helpful suggestions by Claudio Kozický (@qak). The main changes are as follows:

(1) The "alignment" of the average distance matrix and the distance matrix of a proposed tree (i.e., the step of ensuring that we are subtracting distances corresponding to the same pair of taxa) is done using std::sort rather than std::find, which is up to two orders of magnitude faster for 10,000-by-10,000 matrices;

(2) Indexing is done using uint32_t rather than size_t, which speeds up memory access (and consequently, the matrix subtraction step).
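
For readers unfamiliar with the idea, the sort-based alignment in (1) combined with the 32-bit indices in (2) can be sketched roughly as follows. This is a minimal illustration rather than the code in this branch: `sortIndices` and `alignTaxa` are hypothetical names, and the sketch assumes that both taxon lists contain the same set of unique names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: returns the permutation of indices that sorts `labels`
// in ascending order, using 32-bit indices instead of size_t.
std::vector<std::uint32_t> sortIndices(const std::vector<std::string>& labels)
{
    std::vector<std::uint32_t> idx(labels.size());
    std::iota(idx.begin(), idx.end(), 0u);
    std::sort(idx.begin(), idx.end(),
              [&labels](std::uint32_t a, std::uint32_t b) { return labels[a] < labels[b]; });
    return idx;
}

// Hypothetical helper: builds a map such that a[i] == b[map[i]] for every i.
// Assumption: `a` and `b` hold the same set of unique taxon names.
// Two sorts cost O(n log n) in total, whereas calling std::find once per
// taxon costs O(n^2).
std::vector<std::uint32_t> alignTaxa(const std::vector<std::string>& a,
                                     const std::vector<std::string>& b)
{
    const std::vector<std::uint32_t> ia = sortIndices(a);
    const std::vector<std::uint32_t> ib = sortIndices(b);

    std::vector<std::uint32_t> map(a.size());
    for (std::size_t k = 0; k < ia.size(); ++k)
    {
        // The k-th smallest name sits at position ia[k] in `a` and ib[k] in `b`.
        map[ia[k]] = ib[k];
    }
    return map;
}
```

With such a map, entry (i, j) of one matrix corresponds to entry (map[i], map[j]) of the other, so the subtraction step can look up matching pairs directly.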

In practice, the improvement from these changes can be difficult to observe, because the total time spent evaluating the (log) probability density of a proposed tree is now dominated by the time it takes to convert the tree to a distance matrix. However, timing with std::chrono::steady_clock shows that the ln(pdf) calculation implemented in this branch is about 12% faster for 10,000-by-10,000 matrices (~0.55 s, down from ~0.63 s).
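
A measurement of this kind can be taken with a small wrapper like the one below. This is only a sketch of one way to do it: `secondsElapsed` is a hypothetical helper, not part of this branch, and the callable passed to it stands in for the ln(pdf) computation.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `f` once and returns the elapsed time in seconds.
// steady_clock is monotonic, so the result is unaffected by system clock changes.
template <typename F>
double secondsElapsed(F&& f)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Because a single run can be noisy, it is usually worth repeating the call several times and comparing averages or minima between branches.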

hoehna

Looks good. The only suggestion I have is to move the string index sort to StringUtilities so that it can be used more broadly.

hoehna · 2024-03-28T11:57:24Z

src/core/math/distributions/DistributionExponentialError.cpp

@@ -21,6 +23,28 @@

 using namespace RevBayesCore;

+/*!


Wouldn't it make sense to add this function to StringUtilities?

davidcerny · 2024-03-28

Great point – moved it there!

The requested changes were implemented.

davidcerny added 4 commits March 11, 2024 18:24

Trying to use std::sort instead of std::find

8dca8c4

Fixing an undefined reference error

5aec7bc

Fixing boolean mask indexing

6d26379

No nested indices; uint32_t instead of size_t

c04d0aa

davidcerny requested a review from bredelings March 19, 2024 18:38

davidcerny changed the title ~~More bless efficiencies~~ More BLeSS efficiencies Mar 19, 2024

davidcerny requested a review from hoehna March 26, 2024 17:49

hoehna previously requested changes Mar 28, 2024

davidcerny and others added 5 commits March 28, 2024 13:06

Moving stringSortIndices() to StringUtilities

5656c61

Merge branch 'development' into more-bless-efficiencies

af34f0c

Fix the 'use of undeclared identifier' error for std::uint32_t

36b62ca

Including cstdint in the header file too

3a3ef50

Merge branch 'development' into more-bless-efficiencies

c80b4ba

bredelings approved these changes Mar 29, 2024

bredelings merged commit ed422ba into development Mar 29, 2024
20 checks passed

davidcerny deleted the more-bless-efficiencies branch March 29, 2024 18:53

davidcerny mentioned this pull request Apr 18, 2024

Compute average distance matrices more efficiently #454

Merged

davidcerny mentioned this pull request May 12, 2024

Improvements to MatrixBoolean and AverageDistanceMatrix #469

Merged
