Fix reinterpret performance #28707

Merged
merged 1 commit into from
Aug 17, 2018

Commits on Aug 17, 2018

  1. Fix reinterpret performance

    This fixes #25014 by making it more obvious to LLVM what is going on.
    Instead of a hand-written copy loop, we use a ccall to :memcpy, which is
    turned into llvm.memcpy at the IR level. That is enough for LLVM to fold
    everything away. In the benchmark from #25014, we still see some
    regressions relative to 0.6, but that is because the benchmark has to
    dereference through the pointers in the reinterpret and reshape wrappers.
    In any real code, that dereferencing should be loop-invariant and hoisted
    out of the inner loop.
    Keno committed Aug 17, 2018
    93164b7
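
For readers unfamiliar with the idiom, here is a minimal sketch of the two copy strategies the commit message contrasts. It is not the code from the commit: `load_bytewise` and `load_memcpy` are hypothetical names, and the sketch assumes `T` is a bits type and that `offset` is a zero-based byte offset into the source vector.

```julia
# Illustrative sketch only. `load_bytewise` and `load_memcpy` are hypothetical
# helpers, not functions from Base and not the code changed by this commit.

function check_args(::Type{T}, a::Vector{UInt8}, offset::Int) where {T}
    isbitstype(T) || throw(ArgumentError("T must be a bits type"))
    (0 <= offset && offset + sizeof(T) <= length(a)) || throw(BoundsError(a, offset + sizeof(T)))
    return nothing
end

# Strategy 1: copy the bytes one at a time in an explicit loop.
function load_bytewise(::Type{T}, a::Vector{UInt8}, offset::Int) where {T}
    check_args(T, a, offset)
    r = Ref{T}()
    GC.@preserve a r begin
        src = pointer(a) + offset
        dst = Ptr{UInt8}(Base.unsafe_convert(Ptr{T}, r))
        for i in 1:sizeof(T)
            unsafe_store!(dst, unsafe_load(src, i), i)
        end
    end
    return r[]
end

# Strategy 2: a single ccall to :memcpy with a size known from the type.
function load_memcpy(::Type{T}, a::Vector{UInt8}, offset::Int) where {T}
    check_args(T, a, offset)
    r = Ref{T}()
    GC.@preserve a r begin
        src = pointer(a) + offset
        dst = Base.unsafe_convert(Ptr{T}, r)
        ccall(:memcpy, Ptr{Cvoid}, (Ptr{Cvoid}, Ptr{Cvoid}, Csize_t),
              dst, src, sizeof(T))
    end
    return r[]
end

# Both strategies read the same value back out of the byte buffer.
bytes = collect(reinterpret(UInt8, Float32[1.5, -2.0]))
x = load_bytewise(Float32, bytes, 4)
y = load_memcpy(Float32, bytes, 4)
```

Both functions return the same value; the difference the commit message points at is in how easily the optimizer can see through the copy. Comparing the output of `@code_llvm` for the two functions on a given Julia version is one way to check what LLVM actually does with each.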