@mhauru
Member

@mhauru mhauru commented Oct 30, 2025

More aggressively tighten element types after each setindex_internal!!. This fixes performance for some models, most notably Loop univariate 10k, which still showed a substantial disadvantage compared to Metadata in #1098.

Also, add tests for tighten_types!! and loosen_types!!. These become more important since we are now calling those two functions all the time, and they must be compile-time no-ops for this not to cause a large overhead. The tests check this as best they can (I've checked more thoroughly by hand with code_typed).

Also fix a type instability in loosen_types!! caught by the new tests.
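
For readers unfamiliar with the idea, here is a minimal sketch of tightening and loosening element types on a plain `Vector`. The helpers `tighten` and `loosen` are hypothetical illustrations written against Base Julia only; they are not DynamicPPL's `tighten_types!!` / `loosen_types!!`, and they only assume that the real functions follow a similar "return the input unchanged when nothing needs doing" pattern.

```julia
# Hypothetical helpers for illustration; not DynamicPPL's implementation.

# Narrow the element type to the tightest one that fits the stored values.
# If the element type is already concrete, return the input itself.
tighten(v::AbstractVector) = isconcretetype(eltype(v)) ? v : map(identity, v)

# Widen the element type so that values of type T can be stored.
# The check `T <: E` depends only on types, so the compiler can resolve it
# statically and the "already fits" case returns the input untouched.
function loosen(v::AbstractVector{E}, ::Type{T}) where {E,T}
    T <: E && return v
    return convert(Vector{typejoin(E, T)}, v)
end

loose = Real[1.0, 2.0]          # abstract element type
tight = tighten(loose)          # Vector{Float64}
wide = loosen(tight, Int)       # Vector{Real}, can now hold an Int
```

The fast paths return the original object, which is what makes calling such functions after every update affordable.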

Benchmarking now; depending on the results I will either mark this ready or add further performance improvements if needed.

@github-actions
Contributor

github-actions bot commented Oct 30, 2025

Benchmark Report for Commit 305b730

Computer Information

Julia Version 1.11.7
Commit f2b3dbda30a (2025-09-08 12:10 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, icelake-server)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Benchmark Results

┌───────────────────────┬───────┬─────────────┬───────────────────┬────────┬────────────────┬─────────────────┐
│                 Model │   Dim │  AD Backend │           VarInfo │ Linked │ t(eval)/t(ref) │ t(grad)/t(eval) │
├───────────────────────┼───────┼─────────────┼───────────────────┼────────┼────────────────┼─────────────────┤
│ Simple assume observe │     1 │ forwarddiff │             typed │  false │            8.0 │             1.5 │
│           Smorgasbord │   201 │ forwarddiff │             typed │  false │          837.0 │            46.9 │
│           Smorgasbord │   201 │ forwarddiff │ simple_namedtuple │   true │          447.0 │            66.3 │
│           Smorgasbord │   201 │ forwarddiff │           untyped │   true │          856.6 │            40.4 │
│           Smorgasbord │   201 │ forwarddiff │       simple_dict │   true │         7011.2 │            28.7 │
│           Smorgasbord │   201 │ forwarddiff │      typed_vector │   true │          876.3 │            42.0 │
│           Smorgasbord │   201 │ forwarddiff │    untyped_vector │   true │          826.3 │            40.0 │
│           Smorgasbord │   201 │ reversediff │             typed │   true │          949.3 │            47.9 │
│           Smorgasbord │   201 │    mooncake │             typed │   true │          794.4 │             5.4 │
│           Smorgasbord │   201 │      enzyme │             typed │   true │          932.1 │             3.9 │
│    Loop univariate 1k │  1000 │    mooncake │             typed │   true │         3873.9 │             6.5 │
│       Multivariate 1k │  1000 │    mooncake │             typed │   true │         1301.8 │             7.2 │
│   Loop univariate 10k │ 10000 │    mooncake │             typed │   true │        41073.1 │             6.2 │
│      Multivariate 10k │ 10000 │    mooncake │             typed │   true │        11934.5 │             7.9 │
│               Dynamic │    10 │    mooncake │             typed │   true │          142.0 │             7.9 │
│              Submodel │     1 │    mooncake │             typed │   true │           10.3 │             5.5 │
│                   LDA │    12 │ reversediff │             typed │   true │          941.6 │             2.1 │
└───────────────────────┴───────┴─────────────┴───────────────────┴────────┴────────────────┴─────────────────┘

@codecov

codecov bot commented Oct 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.23%. Comparing base (80cf12d) to head (305b730).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1102      +/-   ##
==========================================
+ Coverage   81.17%   81.23%   +0.05%     
==========================================
  Files          40       40              
  Lines        3793     3805      +12     
==========================================
+ Hits         3079     3091      +12     
  Misses        714      714              

@github-actions
Contributor

DynamicPPL.jl documentation for PR #1102 is available at:
https://TuringLang.github.io/DynamicPPL.jl/previews/PR1102/

@mhauru
Member Author

mhauru commented Oct 30, 2025

Current benchmarks:

┌───────────────────────┬───────┬─────────────┬────────────────┬────────┬────────────────┬─────────────────┐
│                 Model │   Dim │  AD Backend │        VarInfo │ Linked │ t(eval)/t(ref) │ t(grad)/t(eval) │
├───────────────────────┼───────┼─────────────┼────────────────┼────────┼────────────────┼─────────────────┤
│ Simple assume observe │     1 │    mooncake │          typed │  false │            4.5 │             5.8 │
│ Simple assume observe │     1 │    mooncake │   typed_vector │  false │            8.1 │             6.6 │
│ Simple assume observe │     1 │    mooncake │        untyped │  false │           36.0 │             1.6 │
│ Simple assume observe │     1 │    mooncake │ untyped_vector │  false │            6.3 │             7.3 │
│           Smorgasbord │   201 │ reversediff │          typed │  false │          370.5 │            50.8 │
│           Smorgasbord │   201 │ reversediff │   typed_vector │  false │          407.4 │            46.5 │
│           Smorgasbord │   201 │ reversediff │        untyped │  false │         4072.0 │             4.6 │
│           Smorgasbord │   201 │ reversediff │ untyped_vector │  false │          336.3 │            56.4 │
│    Loop univariate 1k │  1000 │    mooncake │          typed │   true │         1742.8 │             4.7 │
│    Loop univariate 1k │  1000 │    mooncake │   typed_vector │   true │         1822.0 │             4.7 │
│    Loop univariate 1k │  1000 │    mooncake │        untyped │   true │         1779.7 │            18.0 │
│    Loop univariate 1k │  1000 │    mooncake │ untyped_vector │   true │         1818.4 │             4.6 │
│       Multivariate 1k │  1000 │    mooncake │          typed │   true │          425.4 │             8.4 │
│       Multivariate 1k │  1000 │    mooncake │   typed_vector │   true │          437.9 │             8.6 │
│       Multivariate 1k │  1000 │    mooncake │        untyped │   true │         1625.9 │             2.3 │
│       Multivariate 1k │  1000 │    mooncake │ untyped_vector │   true │          419.1 │             8.2 │
│   Loop univariate 10k │ 10000 │    mooncake │          typed │   true │        17784.3 │             4.9 │
│   Loop univariate 10k │ 10000 │    mooncake │   typed_vector │   true │        19867.9 │             4.7 │
│   Loop univariate 10k │ 10000 │    mooncake │        untyped │   true │        19939.9 │            18.1 │
│   Loop univariate 10k │ 10000 │    mooncake │ untyped_vector │   true │        18634.1 │             4.7 │
│      Multivariate 10k │ 10000 │    mooncake │          typed │   true │         3693.4 │             9.4 │
│      Multivariate 10k │ 10000 │    mooncake │   typed_vector │   true │         3759.9 │             9.2 │
│      Multivariate 10k │ 10000 │    mooncake │        untyped │   true │        14833.3 │             2.4 │
│      Multivariate 10k │ 10000 │    mooncake │ untyped_vector │   true │         3545.0 │             9.5 │
│               Dynamic │    10 │    mooncake │          typed │   true │           71.1 │             6.2 │
│               Dynamic │    10 │    mooncake │   typed_vector │   true │           97.1 │             5.7 │
│               Dynamic │    10 │    mooncake │ untyped_vector │   true │           78.2 │             6.8 │
│              Submodel │     1 │    mooncake │          typed │   true │            5.4 │             5.2 │
│              Submodel │     1 │    mooncake │   typed_vector │   true │            9.9 │             5.7 │
│              Submodel │     1 │    mooncake │        untyped │   true │            4.5 │            10.8 │
│              Submodel │     1 │    mooncake │ untyped_vector │   true │            8.1 │             5.7 │
│                   LDA │    12 │ reversediff │          typed │   true │          468.5 │             2.0 │
│                   LDA │    12 │ reversediff │   typed_vector │   true │          497.3 │             1.9 │
└───────────────────────┴───────┴─────────────┴────────────────┴────────┴────────────────┴─────────────────┘

Still not happy with that overhead, will try to understand it better and fix.

@yebai
Member

yebai commented Oct 31, 2025

Thanks, Markus. To clarify, is the below accurate?

  • typed = typed varinfo with namedtuple of metadata
  • untyped = varinfo with Metadata
  • typed vector = varinfo with namedtuple of VNV (ie, varnamedvector)
  • untyped vector = varinfo with VNV

@mhauru
Member Author

mhauru commented Oct 31, 2025

Yes, that's correct.
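
As a rough analogy for the typed/untyped distinction confirmed above, the sketch below contrasts one shared container with a NamedTuple of per-variable containers. It uses plain Base Julia vectors as stand-ins and makes no claim about how Metadata or VarNamedVector are actually laid out; the variable names `x` and `n` are made up.

```julia
# Illustrative stand-ins only; these are not DynamicPPL types.

# "Untyped" flavour: one container holds every variable, so its element
# type has to be wide enough for all of them.
single = Real[0.5, 1, 2]            # say x::Float64 and n::Vector{Int}

# "Typed" flavour: a NamedTuple keyed by variable symbol, where each
# entry keeps its own concrete element type.
grouped = (x = [0.5], n = [1, 2])

isconcretetype(eltype(single))      # false: the element type is abstract
isconcretetype(eltype(grouped.x))   # true: Float64
isconcretetype(eltype(grouped.n))   # true: Int
```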

@mhauru
Member Author

mhauru commented Oct 31, 2025

Some of the overheads turned out to be in unflatten, where some unnecessary recontiguification was being done. With that fixed:

┌───────────────────────┬───────┬─────────────┬────────────────┬────────┬────────────────┬─────────────────┐
│                 Model │   Dim │  AD Backend │        VarInfo │ Linked │ t(eval)/t(ref) │ t(grad)/t(eval) │
├───────────────────────┼───────┼─────────────┼────────────────┼────────┼────────────────┼─────────────────┤
│ Simple assume observe │     1 │    mooncake │          typed │  false │           11.2 │             5.6 │
│ Simple assume observe │     1 │    mooncake │   typed_vector │  false │           11.2 │             7.8 │
│ Simple assume observe │     1 │    mooncake │        untyped │  false │           87.7 │             1.7 │
│ Simple assume observe │     1 │    mooncake │ untyped_vector │  false │            6.7 │            11.3 │
│           Smorgasbord │   201 │ reversediff │          typed │  false │          921.6 │            51.3 │
│           Smorgasbord │   201 │ reversediff │   typed_vector │  false │          944.1 │            50.3 │
│           Smorgasbord │   201 │ reversediff │        untyped │  false │         9901.3 │             4.8 │
│           Smorgasbord │   201 │ reversediff │ untyped_vector │  false │          802.4 │            59.9 │
│    Loop univariate 1k │  1000 │    mooncake │          typed │   true │         4385.3 │             4.8 │
│    Loop univariate 1k │  1000 │    mooncake │   typed_vector │   true │         4524.7 │             4.9 │
│    Loop univariate 1k │  1000 │    mooncake │        untyped │   true │         4464.0 │            17.5 │
│    Loop univariate 1k │  1000 │    mooncake │ untyped_vector │   true │         4377.5 │             4.7 │
│       Multivariate 1k │  1000 │    mooncake │          typed │   true │         1049.7 │             8.4 │
│       Multivariate 1k │  1000 │    mooncake │   typed_vector │   true │         1061.0 │             8.4 │
│       Multivariate 1k │  1000 │    mooncake │        untyped │   true │         4075.2 │             2.3 │
│       Multivariate 1k │  1000 │    mooncake │ untyped_vector │   true │         1000.3 │             8.5 │
│   Loop univariate 10k │ 10000 │    mooncake │          typed │   true │        44469.4 │             5.0 │
│   Loop univariate 10k │ 10000 │    mooncake │   typed_vector │   true │        45690.0 │             5.1 │
│   Loop univariate 10k │ 10000 │    mooncake │        untyped │   true │        49740.4 │            16.6 │
│   Loop univariate 10k │ 10000 │    mooncake │ untyped_vector │   true │        44590.8 │             5.0 │
│      Multivariate 10k │ 10000 │    mooncake │          typed │   true │         9332.6 │             9.3 │
│      Multivariate 10k │ 10000 │    mooncake │   typed_vector │   true │         9323.7 │             9.3 │
│      Multivariate 10k │ 10000 │    mooncake │        untyped │   true │        37636.3 │             2.5 │
│      Multivariate 10k │ 10000 │    mooncake │ untyped_vector │   true │         9418.0 │             9.1 │
│               Dynamic │    10 │    mooncake │          typed │   true │          179.9 │             6.2 │
│               Dynamic │    10 │    mooncake │   typed_vector │   true │          193.3 │             6.4 │
│               Dynamic │    10 │    mooncake │ untyped_vector │   true │          175.3 │             7.4 │
│              Submodel │     1 │    mooncake │          typed │   true │           13.5 │             5.0 │
│              Submodel │     1 │    mooncake │   typed_vector │   true │           13.5 │             6.8 │
│              Submodel │     1 │    mooncake │        untyped │   true │           11.2 │            11.6 │
│              Submodel │     1 │    mooncake │ untyped_vector │   true │           11.2 │             7.2 │
│                   LDA │    12 │ reversediff │          typed │   true │         1191.3 │             2.0 │
│                   LDA │    12 │ reversediff │   typed_vector │   true │         1166.6 │             2.1 │
└───────────────────────┴───────┴─────────────┴────────────────┴────────┴────────────────┴─────────────────┘

Loop univariate has gotten a lot better, from a 12% slowdown to 3%. I might still look into e.g. Dynamic to see if I can squeeze that down, but if there's nothing obvious to be found I would call this good enough and start a PR replacing Metadata with VNV.
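
The quoted percentages appear to correspond to the Loop univariate 10k rows, comparing typed_vector (VNV) against typed (Metadata) in the two tables. That row choice is an assumption, as the comment does not say which rows were used. A quick check of the arithmetic under that assumption:

```julia
# Relative slowdown of VNV versus Metadata, in percent.
slowdown(vnv, metadata) = 100 * (vnv / metadata - 1)

# t(eval)/t(ref) for Loop univariate 10k, typed_vector vs typed.
before = slowdown(19867.9, 17784.3)   # earlier table
after = slowdown(45690.0, 44469.4)    # table after the unflatten fix

println(round(before; digits=1), "% -> ", round(after; digits=1), "%")
```

This gives roughly 11.7% before and 2.7% after, consistent with the rounded 12% and 3%.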

@mhauru mhauru marked this pull request as ready for review October 31, 2025 16:32
@mhauru mhauru requested a review from penelopeysm October 31, 2025 16:32
Comment on lines +1300 to +1303
# Linking can often change the sizes of variables, causing inactive elements. We don't
# want to keep them around, since typically linking is done once and then the VarInfo
# is evaluated multiple times. Hence we contiguify here.
metadata = contiguify!(metadata)
Member

If I'm reading the code correctly, I think contiguify! should return nothing and this should not assign to metadata.

Member Author

Why should contiguify! return nothing? It doesn't have to return anything since it does its work in-place, but I think it can, see #653 (comment).

Member

Meh. For sanity's sake I'd prefer single-bang to return nothing, but ok.
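
For context on the `!` versus `!!` discussion, a small sketch: many mutating functions in Base return the mutated argument, so `x = f!(x)` is harmless. A `!!`-style function may or may not mutate, so its return value must always be used. `set_first!!` below is a hypothetical example written for this illustration, not a DynamicPPL or BangBang.jl function.

```julia
# Base convention: sort! mutates in place and returns the same array.
v = [3, 1, 2]
w = sort!(v)
w === v                      # true: same object

# Hypothetical `!!`-style function: mutate when the new value fits the
# element type, otherwise return a widened copy.
function set_first!!(xs::Vector{T}, x) where {T}
    if x isa T
        xs[1] = x
        return xs
    end
    ys = Vector{typejoin(T, typeof(x))}(xs)
    ys[1] = x
    return ys
end

a = [1, 2]
b = set_first!!(a, 5)        # fits: mutates `a` and returns it
c = set_first!!(a, 1.5)      # does not fit: returns a new Vector{Real}
```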

# Linking can often change the sizes of variables, causing inactive elements. We don't
# want to keep them around, since typically linking is done once and then the VarInfo
# is evaluated multiple times. Hence we contiguify here.
metadata = contiguify!(metadata)
Member

Likewise.
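
To illustrate what the code comment means by inactive elements, here is a toy sketch. `ToyStore` and `toy_contiguify!` are hypothetical and assume a layout where all values live in one flat vector and each variable owns a range of it; the actual VarNamedVector internals may differ.

```julia
# Hypothetical toy storage, for illustration only.
mutable struct ToyStore
    vals::Vector{Float64}           # flat storage, may contain unused slots
    ranges::Vector{UnitRange{Int}}  # which slots each variable occupies
end

# Pack the active values together and drop unused slots. Works in place
# and returns the store as well, so both call styles are valid.
function toy_contiguify!(s::ToyStore)
    new_vals = Float64[]
    new_ranges = UnitRange{Int}[]
    for r in s.ranges
        start = length(new_vals) + 1
        append!(new_vals, s.vals[r])
        push!(new_ranges, start:length(new_vals))
    end
    s.vals = new_vals
    s.ranges = new_ranges
    return s
end

# The second variable shrank from two slots to one, leaving slot 3 inactive.
store = ToyStore([1.0, 2.0, 0.0, 3.0], [1:2, 4:4])
toy_contiguify!(store)
```

After the call the store holds three values with ranges `1:2` and `3:3`.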

Comment on lines +590 to +596
@test vnv == DynamicPPL.tighten_types!!(deepcopy(vnv))
# TODO(mhauru) We would like to check something more stringent here, namely that
# the operation is compiled to a direct no-op, with no instructions at all. I
# don't know how to do that though, so for now we just check that it doesn't
# allocate.
@allocations(DynamicPPL.tighten_types!!(vnv)) == 0
return nothing
Member

Could you do something with this?

julia> using DynamicPPL; vnv = DynamicPPL.VarNamedVector();

julia> ct = code_typed(DynamicPPL.tighten_types!!, (typeof(vnv),))
1-element Vector{Any}:
 CodeInfo(
1 ─     nothing::Nothing
└──     return vnv
) => DynamicPPL.VarNamedVector{Union{}, Union{}, Union{}, Vector{Union{}}, Vector{Union{}}, Vector{Union{}}}

julia> only(ct).first.code
2-element Vector{Any}:
 nothing
 :(return _2)

Member Author

I wondered, but this seems brittle to me. Why is there a nothing in that Vector? Will there always be nothing, or will that change with updates to Julia?

Member Author

loosen_types!! does this:

julia> ct = code_typed(DynamicPPL.loosen_types!!, (typeof(vnv), Type{Union{}}, Type{Union{}}, Type{Union{}}))
1-element Vector{Any}:
 CodeInfo(
1 ─     nothing::Nothing
│       nothing::Nothing
│       nothing::Nothing
└──     return vnv
) => DynamicPPL.VarNamedVector{Union{}, Union{}, Union{}, Vector{Union{}}, Vector{Union{}}, Vector{Union{}}}

Member

Yeah, that's fair. On 1.10 there's no nothing. 🤷‍♀️

julia> only(ct).first.code
1-element Vector{Any}:
 :(return _2)

I guess it's like pick your poison, either you aren't really testing what you want to test, or you have to use some language internals. Happy with either choice (let's hope @allocations doesn't get broken again).

Member

Oh, interesting about loosen_types. So maybe to be general, we could test that the last element is a return and everything else is a nothing. But that makes me displeased too.
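
A sketch of what such a check might look like, for reference. It leans on compiler internals (`code_typed` output and `Core.ReturnNode`), so it carries exactly the brittleness discussed above. `looks_like_noop`, `passthrough` and `add_one` are hypothetical names introduced here.

```julia
using InteractiveUtils  # makes code_typed available outside the REPL

# Hypothetical helper: true if the optimised code for f(argtypes...) is
# only `nothing` placeholders followed by one return statement.
function looks_like_noop(f, argtypes::Tuple)
    results = code_typed(f, argtypes)
    length(results) == 1 || return false
    code = only(results).first.code
    return code[end] isa Core.ReturnNode && all(isnothing, code[1:(end - 1)])
end

passthrough(x) = x      # stand-in for a function expected to be a no-op
add_one(x) = x .+ 1     # does real work

noop_result = looks_like_noop(passthrough, (Vector{Float64},))
work_result = looks_like_noop(add_one, (Vector{Float64},))
```

This only checks the shape of the statement list, not that the returned value is the input argument, and it could break with changes to Julia's IR.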

vnv = DynamicPPL.VarNamedVector()
vnv = setindex!!(vnv, 1.0, vn)
vnv = setindex!!(vnv, 2, @varname(b))
@test ~DynamicPPL.is_tightly_typed(vnv)
Member

Any reason to prefer ~ over !?

Member Author

Nope, don't know why I did that. Changed to ! (also in other files, for consistency).
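
For reference, a short sketch of why `!` is the more natural choice here: on `Bool` values the two agree, but `~` is bitwise NOT and also accepts integers, whereas `!` is logical negation and has no method for them.

```julia
# On Bool, bitwise NOT and logical negation give the same result.
bitwise_on_bool = ~true      # false
logical_on_bool = !true      # false

# On integers they differ: ~ flips bits, ! is not defined.
bitwise_on_int = ~1          # -2

# Helper written for this example: does `!x` throw a MethodError?
function negation_is_undefined(x)
    try
        !x
        return false
    catch err
        return err isa MethodError
    end
end

negation_is_undefined(1)     # `!1` has no method
```

Using `!` in tests therefore fails loudly if the predicate ever returns a non-Bool.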

Member

@penelopeysm left a comment

Happy with the rest, albeit mildly disappointed that my idea of a clear distinction between ! and !! was a delusion.
