
Reducing memory footprint #121

Closed · GeorgeR227 opened this issue Mar 5, 2024 · 18 comments · Fixed by #133

@GeorgeR227 (Contributor)

It seems like some basic ACSet functionality, like index setting, has a much larger than expected memory footprint.

I've been writing various performance-oriented functions for CombinatorialSpaces.jl, and a main way I've gotten around this is that instead of using the ACSet indexing, I've just been grabbing entire views of ACSet columns at a time and indexing those views. Below is a small example demonstrating the difference:

using CombinatorialSpaces
using GeometryBasics: Point3
using Catlab: add_parts!, parts

# Set vertex_center through the ACSet indexing interface.
function add_vertex_center_ACS!(s)
    for v in parts(s, :V)
        s[v, :vertex_center] = v
    end
end

# Workaround: grab a view of the whole column once, then index the view.
function add_vertex_center_view!(s)
    vc_view = @view s[:vertex_center]
    for v in parts(s, :V)
        vc_view[v] = v
    end
end

sd = EmbeddedDeltaDualComplex2D{Bool,Float64,Point3{Float64}}()
add_parts!(sd, :V, 100000)
add_parts!(sd, :DualV, 100000)
@time add_vertex_center_view!(sd)
@time add_vertex_center_ACS!(sd)

Running this code, we can see that the view uses substantially less memory and thus also speeds up the code by a lot.

While I would like to use the nice ACSet interface, performance-wise I keep having to avoid it. While all of my examples here relate to the CombinatorialSpaces package, I think the issue itself stems from the underlying ACSet functionality, since the meshes used there are built on it.

@jpfairbanks (Member)

Is this related to what we removed for type stability of the code that uses ACSets with attribute variables? @kris-brown, @mehalter, do you remember which issue/PR that is associated with? I couldn't find it with the issue search.

@jpfairbanks (Member)

The PR that added variables to ACSets is here: AlgebraicJulia/Catlab.jl#740

@GeorgeR227 (Contributor, Author)

It's possible type stability may have a role in this:
(screenshot: Julia reporting the inferred type as Any)

Here Julia says the type is Any but in all cases this value is an Int64. I don't even think this type is settable by the user.
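
A minimal sketch of how to reproduce this kind of check, assuming the sd complex from the first snippet (setindex! is what the indexed assignment lowers to):

using InteractiveUtils  # for @code_warntype

# sd[1, :vertex_center] = 1 lowers to this setindex! call; the question is
# whether the value comes back inferred as Int64 or as Any.
@code_warntype setindex!(sd, 1, 1, :vertex_center)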

@slwu89 (Member) commented Mar 5, 2024

I've also noticed strange cases where the type should have been inferred but I got Any. Will add them as an MWE if I come across them again.

@epatters (Member) commented Mar 5, 2024

It would be good to track this down. From earlier discussions with Sean, it sounds like we had a performance regression on basic acset operations at some point.

@jpfairbanks (Member)

I thought we replaced views with copies at one point to handle the case of variable attributes. I remember that we decided to live with a performance problem because the usability cost of having to convert everything you accessed was too high.

@epatters (Member) commented Mar 6, 2024

We did replace views with copies for vectorized access to subparts. But the elementwise assignments in the OP should be fast and they shouldn't allocate since all the needed info is available at compile time (being hardcoded symbols). I think it's a regression that they are doing all that allocation, though I don't know when the regression happened.
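
A minimal sketch of one way to measure just the element-wise write (set_one! is a throwaway helper name; sd is the complex from the first snippet), wrapping the call in a function so the measurement isn't confounded by top-level dispatch:

# Wrap a single element-wise write behind a function barrier, then measure it.
set_one!(s) = (s[1, :vertex_center] = 1)
set_one!(sd)             # run once to compile
@allocated set_one!(sd)  # bytes allocated by the write itself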

@jpfairbanks (Member)

@GeorgeR227 is going to run a bisect to find the root cause, and then we can see what to do and fix it.

@epatters (Member)

Great!

@GeorgeR227 (Contributor, Author)

From what I can see, it seems like this issue has always been present in ACSets.jl. I've gone back to 0915821, when views seem to have first been supported for ACSets, and I see the same allocation problem there. Here's the code I used to test:

using ACSets

# Minimal schema: a single morphism v0 : E -> V, with an index on v0.
SchTest = BasicSchema([:V,:E], [(:v0,:E,:V)])

@acset_type Test(SchTest, index=[:v0])

s = Test()

add_part!(s, :V)
add_part!(s, :E)

# Regular ACSet index setting.
s[1, :v0] = 1
@time s[1, :v0] = 1

# View-based workaround.
v0 = @view s[:v0]
v0[1] = 1
@time v0[1] = 1

and here are the results on that commit:

(screenshot of the @time output)

The first is the regular ACSet setting while the second is the view workaround.

What's interesting is that this allocation is actually 64 bytes on the most recent commit; I've tracked it down to fb4e036 as the commit where the size of the allocation increased from 48 to 64 bytes.

@epatters (Member) commented May 16, 2024

Thanks for looking into this. It's possible that the regression happened before the acset code was migrated out of Catlab and into its own package.

Actually, investigating further, I believe the problem is upstream. Run this code:

using ACSets

# Symbol-taking wrapper: should constant-fold the name down to the Val method.
@inline my_set_subpart!(acs::SimpleACSet, part, name::Symbol, subpart) =
  _set_subpart!(acs, part, Val{name}, subpart)

# Val-specialized method: here the column name is a compile-time constant.
@inline _set_subpart!(acs::SimpleACSet, part, ::Type{Val{name}}, subpart) where name =
  acs.subparts[name][part] = subpart

SchTest = BasicSchema([:V,:E], [(:v0,:E,:V)])

@acset_type Test(SchTest, index=[:v0])

s = Test()
add_part!(s, :V)
add_part!(s, :E)

_set_subpart!(s, 1, Val{:v0}, 1)
@time _set_subpart!(s, 1, Val{:v0}, 1)

my_set_subpart!(s, 1, :v0, 1)
@time my_set_subpart!(s, 1, :v0, 1)

I get the following result:

julia> _set_subpart!(s, 1, Val{:v0}, 1)
1

julia> @time _set_subpart!(s, 1, Val{:v0}, 1)
  0.000007 seconds
1

julia> my_set_subpart!(s, 1, :v0, 1)
1

julia> @time my_set_subpart!(s, 1, :v0, 1)
  0.000011 seconds (2 allocations: 96 bytes)

It appears that the compiler is not propagating the constant symbol :v0 despite everything being @inline-ed. I feel certain that this used to work, because it was the basis for the original acsets design and was verified in early benchmarks associated with our paper. Perhaps we should ask for advice from the Julia compiler folks.

@GeorgeR227
Copy link
Contributor Author

I can confirm the issue is a type propagation issue. I've modified the code slightly to enforce the type we expect, and now I see no allocations.

julia> @inline my_set_subpart!(acs::SimpleACSet, part, name::Symbol, subpart) =
         _set_subpart!(acs, part, Val{name}::Type{Val{:v0}}, subpart)
my_set_subpart! (generic function with 1 method)

julia> my_set_subpart!(s, 1, :v0, 1)
1

julia> @time my_set_subpart!(s, 1, :v0, 1)
  0.000014 seconds
1

Here's the @code_warntype output as well. The line at %2 has bad typing, since the type of _A is not known at that point, but afterwards the typeassert pins down the type and everything is good.

julia> @code_warntype my_set_subpart!(s, 1, :v0, 1)
MethodInstance for my_set_subpart!(::Test, ::Int64, ::Symbol, ::Int64)
  from my_set_subpart!(acs::SimpleACSet, part, name::Symbol, subpart) @ Main REPL[95]:1
Arguments
  #self#::Core.Const(my_set_subpart!)
  acs::Test
  part::Int64
  name::Symbol
  subpart::Int64
Body::Int64
1 ─      nothing
│   %2 = Core.apply_type(Main.Val, name)::Type{Val{_A}} where _A
│   %3 = Main.Type::Core.Const(Type)
│   %4 = Core.apply_type(Main.Val, :v0)::Core.Const(Val{:v0})
│   %5 = Core.apply_type(%3, %4)::Core.Const(Type{Val{:v0}})
│   %6 = Core.typeassert(%2, %5)::Core.Const(Val{:v0})
│   %7 = Main._set_subpart!(acs, part, %6, subpart)::Int64
└──      return %7

@aviatesk (Contributor)

I investigated this issue a bit, and it seems that constant propagation for my_set_subpart! is working correctly:

julia> call_my_set_subpart!(acs::SimpleACSet, part, subpart) = 
           my_set_subpart!(acs, part, :v0, subpart);

julia> call_my_set_subpart!(s, 1, 1);

julia> @time call_my_set_subpart!(s, 1, 1);
  0.000004 seconds

The real problem appears to be that constant propagation for :v0 is not happening at the call site given to the @time macro, which indicates an issue with constant propagation for top-level chunks. As a temporary workaround, you can embed the necessary constant information within the entry-point function, as in call_my_set_subpart! above.

@jpfairbanks (Member)

So would making the caller use a value type like Val{:v0}, and telling them to define const v0 = Val{:v0} to save typing, be a viable workaround?
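
For concreteness, a sketch of what that might look like at a call site (hypothetical usage, reusing the _set_subpart! defined above):

# Pass the value type directly, with a module-level constant to save typing.
const v0 = Val{:v0}
_set_subpart!(s, 1, v0, 1)   # same as _set_subpart!(s, 1, Val{:v0}, 1)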

@jpfairbanks (Member)

I guess we could also generate singleton types when we make the ACSet instance and then have the user import those as module-level constants. What is the difference between Val{s}, where s is a Symbol, and a singleton type called s?
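
For comparison, both options boil down to a zero-size singleton that dispatch can resolve statically; Val{s} is just the generic, pre-built version of a hand-written singleton. A sketch (names hypothetical):

struct V0 end              # a dedicated singleton type named after the column
f(::Type{Val{:v0}}) = 1    # dispatch on the Val-wrapped Symbol
g(::V0) = 1                # dispatch on the hand-written singleton
f(Val{:v0}); g(V0())       # both resolve at compile time when the argument is a constant

The practical difference is mainly that Val{name} can be built from any Symbol on the fly, while generated singletons would have to be defined and exported up front.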

@aviatesk (Contributor)

I investigated the issue a bit further and it seems possible to solve the problem without those workarounds.

Firstly, the allocations occurring in the reduced example (@time my_set_subpart!(...)) are unlikely to happen in a real application. This is because those allocations occur due to calling @time with my_set_subpart! as the entry point, and it's unlikely that functions like my_set_subpart! will be the entry point in a real application (though I might be wrong since I'm not very familiar with ACSets.jl).
(P.S. Of course, having such allocations occur when passing a function like my_set_subpart!—which can be optimized by constant information available in its arguments—to @time complicates performance analysis and should be improved. My earlier comments about constant propagation in top-level chunks were addressing this issue.)

Going back to the original issue, the initial problem reported here seems to be due to the lack of constant propagation for getindex(::ACSet, ...) and setindex!(::ACSet, ...). This can be resolved by adding the @inline annotation to these methods.
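
Roughly, the shape of the fix being described (a sketch, not the exact diff; it assumes the Symbol-taking getindex/setindex! methods forward to subpart/set_subpart! as in ACSets.jl):

# Marking the Symbol-taking accessors @inline lets the compiler propagate the
# constant column name into the Val-specialized methods behind them.
@inline Base.getindex(acs::SimpleACSet, part, name::Symbol) = subpart(acs, part, name)
@inline Base.setindex!(acs::SimpleACSet, value, part, name::Symbol) = set_subpart!(acs, part, name, value)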

aviatesk added a commit to aviatesk/ACSets.jl that referenced this issue May 19, 2024
These functions can be significantly optimized with constant information
available within their arguments, so it would be beneficial to enforce
constant propagation for them.
This can be achieved using `Base.@constprop :aggressive` or `@inline`,
and since these functions are likely to be used frequently, I think
promoting inline expansion with `@inline` is a good approach,
so I chose to use `@inline`.

- fixes AlgebraicJulia#121
@aviatesk (Contributor)

xref #133

@epatters (Member) commented May 20, 2024

@aviatesk, thanks so much for helping us diagnose and fix this issue! I got thrown off track by the interaction between the top-level call and the @time macro.
