Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some precompiles #201

Merged
merged 1 commit into from
Jan 4, 2021
Merged

Add some precompiles #201

merged 1 commit into from
Jan 4, 2021

Conversation

timholy
Copy link
Member

@timholy timholy commented Dec 23, 2020

On the current release, the first call looks like this:

julia> @time imfilter(img, KernelFactors.gaussian((3,3)));
  2.135609 seconds (3.46 M allocations: 211.109 MiB, 12.01% gc time, 94.98% compilation time)

julia> @time mapwindow(extrema, img, (3,3))
  0.756083 seconds (1.17 M allocations: 68.258 MiB, 6.57% gc time, 99.89% compilation time)

On this branch, it looks like this:

julia> @time imfilter(img, KernelFactors.gaussian((3,3)));
  0.620004 seconds (362.78 k allocations: 40.455 MiB, 1.47% gc time, 96.09% compilation time)

julia> @time mapwindow(extrema, img, (3,3))
  0.385470 seconds (334.52 k allocations: 19.585 MiB, 9.43% gc time, 99.82% compilation time)

Quite a nice reduction in latency.

I should note this requires JuliaMath/AbstractFFTs.jl#48 and also some changes to FFTW.jl that I haven't yet submitted since they rely on the changes in AbstractFFTs.

On the current release, the first call looks like this:
```julia
julia> @time imfilter(img, KernelFactors.gaussian((3,3)));
  2.135609 seconds (3.46 M allocations: 211.109 MiB, 12.01% gc time, 94.98% compilation time)

julia> @time mapwindow(extrema, img, (3,3))
  0.756083 seconds (1.17 M allocations: 68.258 MiB, 6.57% gc time, 99.89% compilation time)
```

On this branch, it looks like this:
```julia
julia> @time imfilter(img, KernelFactors.gaussian((3,3)));
  0.620004 seconds (362.78 k allocations: 40.455 MiB, 1.47% gc time, 96.09% compilation time)

julia> @time mapwindow(extrema, img, (3,3))
  0.385470 seconds (334.52 k allocations: 19.585 MiB, 9.43% gc time, 99.82% compilation time)
```

Quite a nice reduction in latency.
@codecov
Copy link

codecov bot commented Dec 23, 2020

Codecov Report

Merging #201 (4ca2542) into master (83ac3ed) will decrease coverage by 0.58%.
The diff coverage is 95.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #201      +/-   ##
==========================================
- Coverage   91.59%   91.00%   -0.59%     
==========================================
  Files           9       10       +1     
  Lines        1380     1412      +32     
==========================================
+ Hits         1264     1285      +21     
- Misses        116      127      +11     
Impacted Files Coverage Δ
src/ImageFiltering.jl 77.27% <ø> (ø)
src/kernelfactors.jl 91.15% <90.00%> (-2.42%) ⬇️
src/precompile.jl 94.11% <94.11%> (ø)
src/border.jl 93.02% <100.00%> (-1.13%) ⬇️
src/kernel.jl 98.40% <100.00%> (-1.60%) ⬇️
src/mapwindow.jl 86.22% <100.00%> (-0.45%) ⬇️
src/utils.jl 87.30% <0.00%> (-1.59%) ⬇️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 83ac3ed...4ca2542. Read the comment docs.

Comment on lines +64 to +65
function mapwindow(f::F, img, window; border="replicate",
indices=default_imginds(img, window, border), callmode=:copy!) where F
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So f::F where F is more friendly to precompilation than f::Any?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you may know, it forces specialization on f; while Julia typically specializes for each argument type, potentially unlimited "abstract" types like Function, Type, and Tuple have heuristics that cause Julia to sometimes choose to compile a generic version rather than a specific one. Historically the main reason to force specialization was better runtime performance, at the cost of increased latency (because you have to do more compilation for each specialization). However, loss of specialization means loss of backedges and thus precompilability, so its a bit of a tradeoff.

It might be worth experimenting here a bit more before merging this, just to see if we can get the same benefit by other means. I didn't poke at this point extensively, I just noticed an inference failure due to lack of specialization. But the main reason I changed it is that mapwindow is pretty performance-sensitive, and it makes sense that at least mapwindow_kernel should be specialized on f. This PR doesn't technically guarantee that, though it seems the current heuristics must be making that happen anyway given the changes here. Like I said, probably worth poking a bit more before we merge this. I've got a few other priorities first (I did this as part of testing out major changes to SnoopCompile, and I want to get that done before coming back to this).

@timholy
Copy link
Member Author

timholy commented Jan 4, 2021

I think we can merge this as-is, and continue in separate efforts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants