unrolled transform! #2

Merged: 2 commits, Jan 11, 2018

jw3126
Collaborator

@jw3126 jw3126 commented Jan 11, 2018

Towards better performance. Roughly 3-4 times faster than the baseline, but still 2-3 times slower than nettle.

@jw3126 jw3126 mentioned this pull request Jan 11, 2018
@oxinabox
Member

How do the if statements compare to using a series of loops?
I think a series of for loops should be faster.

@jw3126
Collaborator Author

jw3126 commented Jan 11, 2018

These are compile time ifs. No runtime branches left.
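For readers who have not seen the pattern, here is a minimal sketch of what "compile time ifs" can look like in a `@generated` function. It is not the code from this PR: `mix_loop`, `mix_unrolled`, the eight rounds and the two mixing expressions are all hypothetical, chosen only to keep the example small.

```julia
# Hypothetical illustration, not the transform! from this PR.

# Plain loop: as written, the `if` is evaluated on every iteration at run time
# (unless the compiler happens to unroll the loop and fold it away).
function mix_loop(a::UInt32, b::UInt32, c::UInt32)
    for i in 0:7
        if i < 4
            f = (a & b) | (~a & c)
        else
            f = a ⊻ b ⊻ c
        end
        a, b, c = c, f + UInt32(i), b
    end
    return a
end

# Unrolled version: the loop and the `if` run while the method body is being
# generated, so only the selected expression for each round is spliced into
# the code that actually gets compiled.
@generated function mix_unrolled(a::UInt32, b::UInt32, c::UInt32)
    body = Expr(:block)
    for i in 0:7
        f = if i < 4
            :((a & b) | (~a & c))
        else
            :(a ⊻ b ⊻ c)
        end
        # The round constant is interpolated as a literal UInt32.
        push!(body.args, :((a, b, c) = (c, $f + $(UInt32(i)), b)))
    end
    push!(body.args, :(return a))
    return body
end
```

Both functions should return the same value for the same inputs; the generated one simply contains eight straight-line rounds with no branch between them.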

src/core.jl Outdated
pbuf = buffer_pointer(context)
end
ex = quote
A = context.state[1]
Member

add @inbounds to all of these?

Collaborator Author

The whole body is @inbounds.

Member

@oxinabox oxinabox Jan 11, 2018

Or does it propagate correctly from the last line?
I'm never sure about @inbounds propagation.

Collaborator Author

It just does not propagate into function calls. But I think you are still right: it is better style to be explicit, in case something is later refactored into a function.
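A small sketch of the refactoring concern, assuming the documented rule that `@inbounds` only reaches one layer of inlining unless the callee opts in with `Base.@propagate_inbounds`. The helpers `word` and `sum_state` are hypothetical and do not appear in src/core.jl.

```julia
# Hypothetical helpers, for illustration only.

# Indexing moved into its own function. Marking it with
# Base.@propagate_inbounds lets a caller's @inbounds reach the `state[i]`
# inside; without the annotation the bounds check here may be kept.
Base.@propagate_inbounds word(state::Vector{UInt32}, i::Int) = state[i]

function sum_state(state::Vector{UInt32})
    # Check once up front so that skipping the per-access checks is safe.
    length(state) >= 4 || throw(ArgumentError("need at least 4 words"))
    s = zero(UInt32)
    @inbounds for i in 1:4
        s += word(state, i)
    end
    return s
end
```

For example, `sum_state(UInt32[1, 2, 3, 4])` adds the four words with wrapping `UInt32` arithmetic. Whether the checks are actually removed also depends on inlining and on the `--check-bounds` setting.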

@oxinabox
Member

> These are compile time ifs. No runtime branches left.

Ah yes, I just realized that.

@oxinabox
Member

You should add yourself to LICENSE.md.

@oxinabox
Member

oxinabox commented Jan 11, 2018

nettle is hand-optimized assembly.
I'm not sure how bad one can feel about losing to it.
I'm pretty sure it actually manages not to use RAM for anything but the message, keeping A, B, C, D, F all in CPU registers.

@jw3126
Collaborator Author

jw3126 commented Jan 11, 2018

I read this: their baseline is similar to ours, and the significant optimizations were unrolling and telling C to put variables into registers. (They do use some assembly, but it does not give much performance.) I'm not sure if there is a way to control register usage from Julia.

@oxinabox
Member

Thanks, this is great.
There might be a way to do register forcing via Core.Intrinsics.llvmcall, but I'm not sure.
It can come in another PR if it matters.

MD5 checking is never a massive bottleneck anyway.
It's basically only for legacy file checksums, right?

@oxinabox oxinabox merged commit 61514df into JuliaCrypto:master Jan 11, 2018
@jw3126
Collaborator Author

jw3126 commented Jan 11, 2018

Yeah, performance is not critical here. My motivation is just curiosity.
