
add inference #79

Draft · ejd99 wants to merge 48 commits into dev
Conversation

@ejd99 (Collaborator) commented Feb 24, 2023

No description provided.

@richardreeve self-requested a review February 24, 2023 14:19
@coveralls commented Feb 24, 2023

Pull Request Test Coverage Report for Build 6700074205

  • 2 of 224 (0.89%) changed or added relevant lines in 3 files are covered.
  • 46 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-5.6%) to 54.527%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/Interface.jl | 0 | 1 | 0.0% |
| src/inference.jl | 0 | 221 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/Interface.jl | 1 | 55.56% |
| src/TreeSet.jl | 1 | 42.42% |
| src/trim.jl | 2 | 95.0% |
| src/rcall.jl | 42 | 0.0% |

Totals (Coverage Status):
  • Change from base Build 5476180660: -5.6%
  • Covered Lines: 1361
  • Relevant Lines: 2496

💛 - Coveralls

@mkborregaard (Member)

Nice! What does this PR do?

@richardreeve (Member) commented Mar 1, 2023

Hi @mkborregaard - Emily is my PhD student, and she is writing code to do trait inference and calculate phylogenetic signal on trees. She's initially reproducing some R phylolm code to implement a maximum-likelihood (ML) approach, and then she'll develop a Bayesian algorithm.

@mkborregaard (Member)

Sounds amazing! I guessed as much :-) Would it be best if I interact and comment on the PRs, or should I just keep hands off?

@richardreeve marked this pull request as draft March 1, 2023 14:58
@richardreeve (Member)

It would be great to have your input in a bit, though if you have any obvious thoughts feel free to chip in. It's very much WIP at the moment though (I've formally converted it to a draft), so feedback once it's a bit more advanced would be very helpful.

codecov bot commented Dec 20, 2023

Codecov Report

Attention: Patch coverage is 0% with 239 lines in your changes missing coverage. Please review.

Project coverage is 72.78%. Comparing base (bbf4476) to head (c6f13f7).

❗ Current head c6f13f7 differs from the pull request's most recent head ae5c3f9. Consider uploading reports for the commit ae5c3f9 to get more accurate results.

| Files | Patch % | Lines |
|---|---|---|
| src/inference.jl | 0.00% | 239 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##              dev      #79      +/-   ##
==========================================
- Coverage   80.15%   72.78%   -7.37%     
==========================================
  Files          19       20       +1     
  Lines        2958     3197     +239     
==========================================
- Hits         2371     2327      -44     
- Misses        587      870     +283     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@richardreeve (Member) commented Apr 3, 2024

@ejd99 - I've updated the code to use a union of a Dual and a Float64 by default, but you can select Number (2-3x as slow!) or Float64 (5 times faster, but doesn't work with lambda). The code still scales linearly with the number of tips, and on my M1 Max it looks like it would run a 200,000-tip tree with just β and σ in about 3 hours, and with β, σ and λ in about 20 hours. Obviously, if I can work out how to select the data type better so it's completely type-stable, it'll hopefully speed up further, but this is a good start.

| Model | Tips | Float64 | Union | Number |
|---|---|---|---|---|
| covariance, β and σ | 100 | 13.66s = 0.137s / tip | N/A | N/A |
| covariance, β and σ | 200 | 95.17s = 0.476s / tip | N/A | N/A |
| threepoint, β and σ | 100 | 5.63s = 0.056s / tip | 29.36s | 69.11s |
| threepoint, β and σ | 200 | 11.04s = 0.055s / tip | 66.5s | |
| threepoint, β, σ and λ | 100 | N/A | 25.14s = 0.251s / tip | 78.5s |
| threepoint, β, σ and λ | 200 | N/A | 51.01s = 0.255s / tip | 159.14s |
| threepoint, β, σ and λ | 50k | N/A | 19093s = 0.382s / tip | |
| threepoint, β, σ and λ | 100k | N/A | 35275s = 0.353s / tip | |
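For illustration, here is a minimal sketch of the element-type trade-off described above, assuming the Dual in question is ForwardDiff.Dual (the names `FloatOrDual` and `accumulate_heights` are made up for this example and are not the PR's node types): a small Union still lets Julia union-split and keep arithmetic fast, an abstract Number element type forces dynamic dispatch on every operation, and a plain Float64 container cannot hold the dual numbers needed to differentiate through λ.

```julia
using ForwardDiff

# Hypothetical element type for per-node working data (illustrative only):
# a small Union of the two types that actually occur during fitting.
const FloatOrDual = Union{Float64, ForwardDiff.Dual}

# Sum branch contributions; the loop is fast for Float64 or a small Union,
# but every `+` becomes a dynamic dispatch when the element type is Number.
function accumulate_heights(branch_lengths::AbstractVector)
    total = zero(first(branch_lengths))
    for b in branch_lengths
        total += b
    end
    return total
end

accumulate_heights(rand(100))                                      # Float64-only path
accumulate_heights(FloatOrDual[ForwardDiff.Dual(1.0, 1.0), 2.0])   # mixed Float64/Dual path
```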

@richardreeve (Member) commented Apr 3, 2024

I've identified an error in how you were using lambda in logpdf(::MyDist3, ...), which I think caused the issue where lambda wasn't converging correctly, and I've improved the speed of the function (I've updated the table above). As a result, somewhat surprisingly, the lambda version runs slightly faster than the one without it when both use the same node data type. That might of course suggest I've introduced an error, so you should check! However, the simpler model is ultimately much faster because it can use just Float64s.
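For reference, this is the conventional way λ enters a phylogenetic Gaussian likelihood, written as a minimal sketch that assumes the λ here is Pagel's λ and uses a naive covariance-matrix formulation rather than the PR's threepoint algorithm (`lambda_logpdf` and its arguments are hypothetical, not the MyDist3 code): λ rescales only the shared, off-diagonal covariance between tips, leaving each tip's own variance unchanged.

```julia
using Distributions, LinearAlgebra

# Hypothetical reference (not the PR's logpdf): V_λ = λV + (1 - λ)·diag(V),
# so λ = 1 recovers pure Brownian motion and λ = 0 removes all phylogenetic signal.
function lambda_logpdf(y::AbstractVector{<:Real}, μ::AbstractVector{<:Real},
                       σ²::Real, V::AbstractMatrix{<:Real}, λ::Real)
    Vλ = λ * V + (1 - λ) * Diagonal(diag(V))
    return logpdf(MvNormal(μ, σ² * Symmetric(Matrix(Vλ))), y)
end
```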

Edit: Looking at the speeds, I suspect that it's scaling as $tip^1$ for the threepoint model and $tip^2$ for the covariance matrix model, but we see an additional slowdown in both cases because, as the number of tips increases, the data can no longer be stored in a fast cache and has to be stored in main memory. Not super important, but in case you were wondering what was happening.
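A quick sanity check of that reading against the table above (my arithmetic, using only the per-tip figures already reported there): if total time scales as $tip^k$, time per tip should scale as $tip^{k-1}$. Doubling tips from 100 to 200 leaves the threepoint-with-λ per-tip cost essentially flat ($0.251 \to 0.255$ s/tip, consistent with $k \approx 1$), whereas the covariance model's per-tip cost grows about $3.5\times$ ($0.137 \to 0.476$ s/tip), more than the $2\times$ expected from $k = 2$ alone, which is where the cache-to-main-memory effect would show up.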
