I'm analysing a dataset with 168 data blocks, using the greedy algorithm for merging subsets.
When I do this in PF2, the first step of the greedy algorithm computes the BIC score of 14028 subset pairs. This is expected - 168 choose 2 = 14028.
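As a quick check of that count, here is a minimal Python sketch. The helper name `first_step_pairs` is made up for illustration and is not part of PF2 or MF2; it only counts the unordered pairs a full greedy first step would have to score.

```python
from math import comb

def first_step_pairs(n_blocks: int) -> int:
    """Number of unordered subset pairs among n_blocks starting subsets."""
    # Hypothetical helper: a full greedy first step scores every pair once.
    return comb(n_blocks, 2)

n_pairs = first_step_pairs(168)
print(n_pairs)  # 168 * 167 / 2 = 14028
```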
But when I run what I think is the same analysis in MF2, the first step of the algorithm analyses just 2328 subset pairs. At least, this is what's reported in the logged output. I really can't figure out quite what's going on.
It gets stranger when you look at the next step of the algorithm, which should only be analysing subsets that add to the new subset, but in this case my output is as below.
I'm guessing my command line is just wrong, and in particular that --merge greedy is not actually doing what I think it's doing, i.e. implementing the same algorithm as the PF2 greedy algorithm.
First, --merge greedy isn't working to force it to do the greedy algorithm. The fix is to make sure that when the default -rcluster-max is set, it doesn't overwrite the --merge option. The upshot of this issue is that the algorithm here was rcluster, not greedy.
Second, we now understand why the number of schemes wasn't right, even for the rcluster algorithm. We have a cool new feature for the rcluster algorithm which we forgot about! In the original in PartitionFinder, we take e.g. the 10% of closest pairs based on absolute rate difference. The issue with that is that we'll choose lower-rate pairs preferentially, because they have the lower absolute difference. So we've added a feature where we take the union of two sets:
The rcluster-max closest in absolute rate difference
The rcluster-max closest in log(rate) difference
The upshot is that the number of schemes in every step is between rcluster-max and 2*rcluster-max.
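The union of the two sets can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual MF2 code: the function name `candidate_pairs`, the example rates, and treating the limit `k` as a plain count of pairs are all hypothetical.

```python
from itertools import combinations
from math import log

def candidate_pairs(rates, k):
    """Union of the k closest pairs by |r_i - r_j| and by |log r_i - log r_j|.

    Hypothetical sketch: `rates` holds one positive rate per subset and
    `k` plays the role of rcluster-max, assumed here to be a pair count.
    """
    pairs = list(combinations(range(len(rates)), 2))
    # k closest pairs in absolute rate difference (favours slow subsets).
    by_abs = sorted(pairs, key=lambda p: abs(rates[p[0]] - rates[p[1]]))[:k]
    # k closest pairs in log(rate) difference (i.e. closest in rate ratio).
    by_log = sorted(pairs, key=lambda p: abs(log(rates[p[0]]) - log(rates[p[1]])))[:k]
    return set(by_abs) | set(by_log)

# Made-up rates: three slow subsets, two medium, two fast.
rates = [0.01, 0.02, 0.03, 1.0, 1.5, 10.0, 11.0]
k = 3
selected = candidate_pairs(rates, k)
print(sorted(selected))
```

With these made-up rates, the absolute-difference list contains only pairs of slow subsets, while the log-difference list also picks up the fast pair (10.0, 11.0). The size of the union is at least k (when both lists agree) and at most 2k (when they are disjoint), which matches the bounds described above.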
Thanks for reporting the issue. It was due to an upper bound introduced for the number of candidates when computing the best-fit partition scheme.
In version 2.2.2.8 (https://github.com/iqtree/iqtree2/releases/tag/v2.2.2.8), the issue has been fixed and the greedy algorithm for computing the best-fit partitioning scheme (i.e. option: --merge greedy) should work as expected.