Insane precompile times in Metalhead models #1160

Closed
theabhirath opened this issue Feb 10, 2022 · 2 comments

Comments

@theabhirath
Member

theabhirath commented Feb 10, 2022

Xref FluxML/Metalhead.jl#105 (comment). When running gradient tests on some Metalhead models, the times taken for the first gradient are off the charts. Specifically, the function gradtest was run on the models in Metalhead#master:

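using Flux, Zygote

function gradtest(model, input)
  # forward pass plus a pullback closure over the model's implicit parameters
  y, pb = Zygote.pullback(() -> model(input), Flux.params(model))
  # seed the backward pass with ones shaped like the output
  gs = pb(ones(Float32, size(y)))

  # if we make it to here with no error, success!
  return true
end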

and the times taken for the tests were:
[image: test timings per model]

The ViT model can be ignored because gradtest wasn't run on it, but precompilation seems to take a very long time for the other models...
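To separate compilation latency from actual run time, one option is to time the same gradtest call twice in a fresh session. The snippet below is only a rough sketch: the small Dense chain and the input size are hypothetical stand-ins (the tests above used Metalhead models), and it assumes the implicit-parameter API (Flux.params) used in gradtest.

```julia
using Flux, Zygote

# Hypothetical stand-in model and input; swap in a Metalhead model to reproduce the issue.
model = Chain(Dense(4, 8, relu), Dense(8, 2))
input = rand(Float32, 4, 3)

function gradtest(model, input)
  y, pb = Zygote.pullback(() -> model(input), Flux.params(model))
  gs = pb(ones(Float32, size(y)))
  return true
end

# The first call includes compiling the forward and backward passes.
t_first = @elapsed gradtest(model, input)
# The second call reuses the compiled code, so it mostly reflects run time.
t_second = @elapsed gradtest(model, input)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

If the first number dwarfs the second, the time is going into compilation rather than into the gradient computation itself.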

cc @CarloLucibello

@DhairyaLGandhi
Member

DhairyaLGandhi commented Feb 10, 2022

Definitely a regression from v0.4 and v0.5.

Honestly surprised no one has complained about it yet. I've been running into this often too. Thanks for the summary!

+1 for enabling the gradtests, we really need those in our testing suite.

@CarloLucibello
Member

Let's close this since it is a duplicate of #1126. Sorry I didn't notice earlier.
