unrolled transform! #2
How do the
These are compile-time ifs. No runtime branches are left.
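To illustrate what "compile-time ifs" means here, a minimal sketch, assuming the transform is built as an expression inside a generated function. The names `round_expr` and `mix` and the add/xor rounds are hypothetical and not taken from this PR; the point is only that the `if` runs while the code is being generated, so the final function body has no branch.

```julia
# Hypothetical illustration, not the PR's code.

# Builds the expression for round `i`. This `if` is evaluated during code
# generation, so only the selected branch is spliced into the function body.
function round_expr(i)
    if i <= 2
        return :(acc = acc + x[$i])   # rounds 1-2: add
    else
        return :(acc = acc ⊻ x[$i])   # rounds 3-4: xor
    end
end

@generated function mix(x::NTuple{4,UInt32})
    body = Expr(:block, :(acc = zero(UInt32)))
    for i in 1:4                      # this loop runs at generation time (unrolling)
        push!(body.args, round_expr(i))
    end
    push!(body.args, :(return acc))
    return body
end

mix((UInt32(1), UInt32(2), UInt32(4), UInt32(8)))  # ((1 + 2) ⊻ 4) ⊻ 8
```

The generated body is a straight-line sequence of four assignments, which is the same effect as unrolling the loop by hand.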
src/core.jl
Outdated
pbuf = buffer_pointer(context)
end
ex = quote
A = context.state[1]
Add @inbounds to all of these?
The whole body is @inbounds.
Or does it propagate correctly from the last line? I'm never sure about @inbounds propagation.
It just does not propagate into function calls. But I think you are still right: it is better style to be explicit, in case something is refactored into a function.
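A small sketch of the two styles being discussed, with made-up function names (`sum4_block`, `sum4_explicit`, `sum4_helper`, `load`) that do not appear in the PR. As far as I understand the Julia manual, a surrounding `@inbounds` covers indexing written directly in the block, and reaches into a callee only when that callee is inlined; `Base.@propagate_inbounds` is the documented way to mark a helper so the caller's `@inbounds` carries through it.

```julia
# Hypothetical illustration, not the PR's code.

# Block style: one @inbounds around the whole body.
function sum4_block(v::Vector{UInt32})
    s = zero(UInt32)
    @inbounds begin
        s += v[1]
        s += v[2]
        s += v[3]
        s += v[4]
    end
    return s
end

# Explicit style: each access is annotated, so the annotation travels with the
# line if it is later moved into another function.
function sum4_explicit(v::Vector{UInt32})
    s = zero(UInt32)
    s += @inbounds v[1]
    s += @inbounds v[2]
    s += @inbounds v[3]
    s += @inbounds v[4]
    return s
end

# If an access is refactored into a helper, mark the helper so that the
# caller's @inbounds context is passed on to the indexing inside it.
Base.@propagate_inbounds load(v, i) = v[i]

function sum4_helper(v::Vector{UInt32})
    s = zero(UInt32)
    @inbounds for i in 1:4
        s += load(v, i)
    end
    return s
end
```

All three return the same value for a vector with at least four elements; they differ only in where bounds checks may be elided.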
Ah yes, I just realized that.
You should add yourself to the LICENSE.md.
nettle is hand-optimized assembly.
From what I read, their baseline is similar to ours, and the significant optimizations were unrolling and telling C to put variables into registers. (They do use some assembly, but it does not give much performance.) Not sure if there is a way to control register usage from Julia.
Thanks, this is great. MD5 checking is never a massive bottleneck anyway.
Yeah, performance is not critical here. My motivation is just curiosity.
Towards better performance. Roughly 3-4 times faster than baseline, still 2-3 times slower than nettle.