Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Julia v1.8 -> v1.9 memory allocations change from 0KB to 2.7KB #50073

Closed
ojwoodford opened this issue Jun 5, 2023 · 9 comments
Closed

Julia v1.8 -> v1.9 memory allocations change from 0KB to 2.7KB #50073

ojwoodford opened this issue Jun 5, 2023 · 9 comments
Labels
kind:regression Regression in behavior compared to a previous version performance Must go faster regression 1.9 Regression in the 1.9 release

Comments

@ojwoodford
Copy link

ojwoodford commented Jun 5, 2023

On upgrading to Julia from 1.8.5. to 1.9.0 I noticed code in one of my packages slow down significantly. (I've already highlighted a type inference issue that I found. This is an entirely separate issue.) After upgrading, the following code suddenly causes a lot of memory allocations. This is the simplest code which replicates the problem, but most minor changes will suddenly cause all the allocations to disappear:

using StaticArrays, ForwardDiff, BenchmarkTools, JET

function rodrigues(x::T, y::T, z::T) where T
if x == 0 && y == 0 && z == 0
    # Short cut for derivatives at identity
    return SMatrix{3, 3, T, 9}(T(1), z, -y, -z, T(1), x, y, -x, T(1))
end
theta2 = x * x + y * y + z * z
cosf = T(0.5)
sinc = T(1)
if theta2 > T(2.23e-16)
    theta = sqrt(theta2)
    sinc, cosf = sincos(theta)
    cosf -= 1
    sinc /= theta
    cosf /= -theta2
end
a = x * y * cosf
b = sinc * z
c = x * z * cosf
d = sinc * y
e = y * z * cosf
f = sinc * x
return SMatrix{3, 3, T, 9}((x * x - theta2) * cosf + 1, a + b, c - d,
                            a - b, (y * y - theta2) * cosf + 1, e + f,
                            c + d, e - f, (z * z - theta2) * cosf + 1)
end

struct BarrelDistortion{T}
    k1::T
    k2::T
end
update(var::BarrelDistortion, updatevec, start=1) = BarrelDistortion(var.k1 + updatevec[start], var.k2 + updatevec[start+1])

function ideal2distorted(lens::BarrelDistortion, x)
    z = x' * x
    z = z * (lens.k1 + z * lens.k2) + 1
    return x * z
end

struct SimpleCamera{T}
    f::T
end
SimpleCamera(v::T) where T = SimpleCamera{T}(v::T)
update(var::SimpleCamera, updatevec, start=1) = SimpleCamera(max(var.f, floatmin(var.f)) * exp(updatevec[start]))

struct Point3D{T}
    v::SVector{3, T}
end
Point3D(x, y, z) = Point3D(SVector{3}(x, y, z))
Point3D() = Point3D(SVector{3}(0., 0., 0.))
update(var::Point3D, updatevec, start=1) = Point3D(var.v + updatevec[StaticArrays.SUnitRange(0, 2) .+ start])
project(x::Point3D{T}) where T = SVector{2, T}(x.v[1], x.v[2]) ./ x.v[3]

struct Rotation{T}
    m::SMatrix{3, 3, T, 9}
end
Rotation(x, y, z) = Rotation(rodrigues(x, y, z))
update(var::Rotation, updatevec, start=1) = transform(Rotation(updatevec[start], updatevec[start+1], updatevec[start+2]), var)
transform(rota::Rotation, rotb::Rotation) = Rotation(rota.m * rotb.m)

struct EffPose3D{T}
    rot::Rotation{T}
    camcenter::Point3D{T}
end
EffPose3D(rx, ry, rz, cx, cy, cz) = EffPose3D(Rotation(rx, ry, rz), Point3D(cx, cy, cz))
update(var::EffPose3D, updatevec, start=1) = EffPose3D(update(var.rot, updatevec, start), update(var.camcenter, updatevec, start+3))
transform(pose::EffPose3D, point::Point3D) = Point3D(pose.rot.m * (point.v - pose.camcenter.v))

# Description of BAL image, and function to transform a landmark from world coordinates to pixel coordinates
struct Image{T}
    pose::EffPose3D{T}
    sensor::SimpleCamera{T}
    lens::BarrelDistortion{T}
end
update(var::Image, updatevec, start=1) = Image(update(var.pose, updatevec, start), update(var.sensor, updatevec, start+6), update(var.lens, updatevec, start+7))
function Image(rx::T, ry::T, rz::T, tx::T, ty::T, tz::T, f::T, k1::T, k2::T) where T
    R = Rotation(rx, ry, rz)
    return Image{T}(EffPose3D(R, Point3D(R.m' * -SVector(tx, ty, tz))), SimpleCamera(f), BarrelDistortion(k1, k2))
end
transform(im::Image, X::Point3D) = ideal2distorted(im.lens, -project(Point3D(im.pose.rot.m * (X.v - im.pose.camcenter.v))))

computeresjac(vars...) = ForwardDiff.jacobian(z -> transform(update(vars[1], z, 1), update(vars[2], z, 10)), zeros(SVector{12, Float64}))

function mytest()
    im = Image(1., 1., 1., 1., 1., 1., 1., 1., 1.)
    lm = Point3D(0., 0., 0.)
    @btime computeresjac($im, $lm)
    show(JET.@report_opt computeresjac(im, lm))
end

mytest()
@Seelengrab
Copy link
Contributor

There are type instabilities:

julia> JET.@report_opt computeresjac(im, lm)
═════ 5 possible errors found ═════
┌ @ REPL[29]:1 ForwardDiff.jacobian(#3, zeros(SVector{12, Float64}))
│┌ @ /home/sukera/.julia/packages/ForwardDiff/vXysl/ext/ForwardDiffStaticArraysExt.jl:68 ForwardDiffStaticArraysExt.vector_mode_jacobian(f, x)
││┌ @ /home/sukera/.julia/packages/ForwardDiff/vXysl/ext/ForwardDiffStaticArraysExt.jl:93 ForwardDiffStaticArraysExt.static_dual_eval(T, f, x)
│││┌ @ /home/sukera/.julia/packages/ForwardDiff/vXysl/ext/ForwardDiffStaticArraysExt.jl:24 f(ForwardDiffStaticArraysExt.dualize(T, x))
││││┌ @ REPL[29]:1 update(getfield(#self#, :vars)[1], z, 1)
│││││┌ @ REPL[26]:1 update(var.pose, updatevec, start)
││││││┌ @ REPL[23]:1 update(var.rot, updatevec, start)
│││││││┌ @ REPL[19]:1 Rotation(updatevec[start], updatevec[start + 1], updatevec[start + 2])
││││││││┌ @ REPL[18]:1 rodrigues(x, y, z)
│││││││││┌ @ REPL[5]:22 SMatrix{3, 3, ForwardDiff.Dual{ForwardDiff.Tag{var"#3#4"{Tuple{Image{Float64}, Point3D{Float64}}}, Float64}, Float64, 12}}(x * x - theta2 * cosf + 1, a + b, c - d, a - b, y * y - theta2 * cosf + 1, e + f, c + d, e - f, z * z - theta2 * cosf + 1)
││││││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/convert.jl:160 StaticArrays.construct_type(SA, StaticArrays.Args(x))
│││││││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/convert.jl:87 StaticArrays.adapt_size(SA, x)
││││││││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/convert.jl:115 StaticArrays.tuple_length(SZ)
│││││││││││││┌ @ /home/sukera/.julia/packages/StaticArraysCore/U2Z1K/src/StaticArraysCore.jl:47 StaticArraysCore.length(%1)
││││││││││││││ runtime dispatch detected: StaticArraysCore.length(%1::Any)::Any
│││││││││││││└──────────────────────────────────────────────────────────────────────────────────
││││││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/convert.jl:160 StaticArrays.construct_type(SA, StaticArrays.Args(x))(x)
│││││││││││┌ @ /home/sukera/.julia/packages/StaticArraysCore/U2Z1K/src/StaticArraysCore.jl:117 StaticArraysCore.tuple_prod(S)
││││││││││││┌ @ /home/sukera/.julia/packages/StaticArraysCore/U2Z1K/src/StaticArraysCore.jl:49 StaticArraysCore.length(%1)
│││││││││││││ runtime dispatch detected: StaticArraysCore.length(%1::Any)::Any
││││││││││││└──────────────────────────────────────────────────────────────────────────────────
││││││││││││┌ @ /home/sukera/.julia/packages/StaticArraysCore/U2Z1K/src/StaticArraysCore.jl:49 %2 StaticArraysCore.:(==) 0
│││││││││││││ runtime dispatch detected: (%2::Any StaticArraysCore.:(==) 0)::Any
││││││││││││└──────────────────────────────────────────────────────────────────────────────────
││││││┌ @ REPL[23]:1 update(var.camcenter, updatevec, start + 3)
│││││││┌ @ REPL[15]:1 StaticArrays.SUnitRange(0, 2)
││││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/SUnitRange.jl:19 %5()
│││││││││ runtime dispatch detected: %5::Type{StaticArrays.SUnitRange{_A, _B}} where {_A, _B}()::StaticArrays.SUnitRange
││││││││└────────────────────────────────────────────────────────────────────────
││││┌ @ REPL[29]:1 transform(update(getfield(#self#, :vars)[1], z, 1), update(getfield(#self#, :vars)[2], z, 10))
│││││┌ @ REPL[28]:1 project(Point3D(im.pose.rot.m * X.v - im.pose.camcenter.v))
││││││┌ @ REPL[16]:1 SVector(x.v[1], x.v[2])
│││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/convert.jl:160 StaticArrays.construct_type(SA, StaticArrays.Args(x))
││││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/convert.jl:87 StaticArrays.adapt_size(SA, x)
│││││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/convert.jl:97 Length(x)
││││││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/convert.jl:9 Length(StaticArrays.length(x.args))
│││││││││││┌ @ /home/sukera/.julia/packages/StaticArrays/J9itA/src/traits.jl:39 %1()
││││││││││││ runtime dispatch detected: %1::Type{Length{_A}} where _A()::Length
│││││││││││└────────────────────────────────────────────────────────────────────

@ojwoodford ojwoodford changed the title Julia v1.8 -> v1.9 memory allocations change from 0KB to 2.7KB (with no type instability) Julia v1.8 -> v1.9 memory allocations change from 0KB to 2.7KB Jun 6, 2023
@ojwoodford
Copy link
Author

Thanks, @Seelengrab . For me, JET.@report_opt reports "No errors detected". I'm using JET v0.8.0. Are you using a newer version?

@ojwoodford
Copy link
Author

@Seelengrab Note that I tried to address some issues highlighted by your output in my MWE in the first comment. However, I saw JET.@report_opt return no errors both before and after making these changes.

@Seelengrab
Copy link
Contributor

Ah, indeed - I used 0.7.14 earlier, I don't get the errors reported with 0.8.

Unfortunately, that means it's likely a bit harder to track down that allocation - I'd suggest the Profiling stdlib, to figure out where the allocations are coming from. It's quite difficult trying to track this down otherwise.

@ojwoodford
Copy link
Author

Allocation profiling results:
Screenshot 2023-06-06 at 13 38 10
It isn't very helpful.

My concern is that these allocations are new in Julia v1.9; they don't occur in v1.8. It seems a change to the compiler has led to a performance regression. In my case it slows down a particular chunk of code by about 5x.

@Seelengrab
Copy link
Contributor

Hm, that's unfortunate :/ Barring a more minimal reproducer, I'm afraid only a bisect of the julia versions between 1.8 and 1.9 can help here; though maybe a look a vector_mode_jacobian by the devs of ForwardDiff.jl can help too.

@vchuravy vchuravy added performance Must go faster kind:regression Regression in behavior compared to a previous version regression 1.9 Regression in the 1.9 release labels Jun 6, 2023
@vchuravy
Copy link
Sponsor Member

vchuravy commented Jun 6, 2023

You might want to set the sampling rate to 1 for the allocation profiler.

@ojwoodford
Copy link
Author

You might want to set the sampling rate to 1 for the allocation profiler.

👍 @vchuravy I had it set 1.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Feb 6, 2024

Cleaning up old issues without an MWE

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:regression Regression in behavior compared to a previous version performance Must go faster regression 1.9 Regression in the 1.9 release
Projects
None yet
Development

No branches or pull requests

4 participants