Add sha/asm/keccak1600-c64x.pl #3708

Closed
dot-asm wants to merge 2 commits

dot-asm (Contributor) commented Jun 17, 2017

[skip ci]

if ($rot&1) {
$code.=<<___;
$p ROTL B$src,$rot/2+1,A$dst
|| ROTL A$src,$rot/2, B$dst
Member

Nice idea :-)
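
For readers new to the trick in the hunk above, here is a minimal Python sketch of it (not part of the patch). It assumes each 64-bit lane is stored bit-interleaved, with the even-numbered bits in the A register and the odd-numbered bits in the B register; the hunk itself does not state which file holds which half. All helper names are hypothetical, and the even-amount branch is an assumption about the part of the code not shown here. Under these assumptions a 64-bit rotation by an odd amount turns into two 32-bit rotations whose results swap register files.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def rotl32(x, n):
    """32-bit rotate left; n is reduced modulo 32."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def rotl64(x, n):
    """Reference 64-bit rotate left."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def split(x):
    """Split a 64-bit value into (even bits, odd bits), 32 bits each."""
    even = odd = 0
    for i in range(32):
        even |= ((x >> (2 * i)) & 1) << i
        odd |= ((x >> (2 * i + 1)) & 1) << i
    return even, odd


def join(even, odd):
    """Inverse of split()."""
    x = 0
    for i in range(32):
        x |= ((even >> i) & 1) << (2 * i)
        x |= ((odd >> i) & 1) << (2 * i + 1)
    return x


def rotl64_interleaved(a, b, rot):
    """Rotate an interleaved lane left by rot (a = even bits, b = odd bits)."""
    if rot & 1:
        # Odd amount: the halves trade places, as in the paired ROTLs above.
        return rotl32(b, rot // 2 + 1), rotl32(a, rot // 2)
    # Even amount (assumed): each half rotates in place by rot/2.
    return rotl32(a, rot // 2), rotl32(b, rot // 2)
```

Checking `join(*rotl64_interleaved(*split(x), rot)) == rotl64(x, rot)` for every `rot` from 0 to 63 is a quick way to convince yourself that the pairing is right.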

CMPLTU LEN,BSZ,A0 ; len < bsz?
|| SHRU BSZ,3,BSZ
[A0] BNOP ret?
||[A0] ZERO BSZ
Member

So || indicates parallel execution, and [A0] means "execute if A0 != 0", right?

Contributor Author

Spot on. Just in case, it's worth remembering that you can't pair instructions whichever way you like; there are limitations of various kinds. One should be obvious: only one pair of rotations per execute packet, one on the A-file and one on the B-file, possibly cross-wise... Another, counter-intuitive, thing is that the branch is actually taken five cycles later. I mean instructions past this branch up to and including NOP 4 still execute...

Contributor Author

instructions past this branch up to and including NOP 4 still execute...

Well, the non-NOPs are predicated with [BSZ], which is zero if the branch is taken. So the corresponding instructions are not executed in the sense that they don't affect processor state. But they are executed in the sense that the processor does decode them, goes through all the intricate steps, and then just does nothing, as prescribed by the current value of the predicate register.
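
To make the distinction concrete, here is a small Python model (an illustration only, not how the hardware or the patch works internally). It assumes the CMPLTU/BNOP hunk and the [BSZ]-predicated packet are adjacent in the file, that instructions within one execute packet read register values from before that packet, and that LDNDW *INP++ advances INP by 8 bytes. The function name and the INP starting value are hypothetical.

```python
def run_packets(length, block_size):
    """Model the three execute packets up to the first branch delay slot."""
    regs = {"LEN": length, "BSZ": block_size, "A0": 0, "INP": 0}
    decoded = 0  # instructions that went through the pipeline

    # Packet 1: CMPLTU LEN,BSZ,A0 || SHRU BSZ,3,BSZ
    old = dict(regs)  # snapshot: parallel instructions see pre-packet values
    regs["A0"] = int(old["LEN"] < old["BSZ"])
    regs["BSZ"] = old["BSZ"] >> 3
    decoded += 2

    # Packet 2: [A0] BNOP ... || [A0] ZERO BSZ
    old = dict(regs)
    branch_pending = bool(old["A0"])  # takes effect only after its delay slots
    if old["A0"]:
        regs["BSZ"] = 0
    decoded += 2

    # Packet 3: [BSZ] LDNDW *INP++ || [BSZ] SUB LEN,8,LEN || [BSZ] SUB BSZ,1,BSZ
    # It sits in a branch delay slot, so it is issued whether or not we branch.
    old = dict(regs)
    decoded += 3
    if old["BSZ"]:  # predicate true: results are written back
        regs["INP"] = old["INP"] + 8  # assumed doubleword post-increment
        regs["LEN"] = old["LEN"] - 8
        regs["BSZ"] = old["BSZ"] - 1
    # Predicate false: same pipeline work, no change to processor state.

    return regs, branch_pending, decoded
```

With an example block size of 136 bytes, calling `run_packets(100, 136)` and `run_packets(200, 136)` shows the same number of decoded instructions in both cases, while only the second call changes LEN, INP and BSZ in the third packet.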

[BSZ] LDNDW *INP++,A1:A0
||[BSZ] SUB LEN,8,LEN
||[BSZ] SUB BSZ,1,BSZ
NOP 4
Member

Ah, interesting. Initially I thought the NOP 4 here was to allow LDNDW to land in A1:A0, but can the BNOP above then abort the NOP one cycle earlier?

Contributor Author

Yes, NOP 4 is there to allow the data to show up in A1:A0. It just so happens that this coincides with the cycle in which the branch is actually taken. Load latency is 4 cycles and the load is issued one cycle after BNOP, so they are kind of "aligned".

Contributor Author

they are kind of "aligned"

"They" are the moments at which the branch is taken and the data becomes available in the registers.

Member

Ah, I see.

Contributor Author

I misused the term "latency" here. One should rather think in terms of "delay slots". A branch has 5 delay slots, a load has 4, while an addition, for example, has 0. Latency is normally a non-zero value, so it is rather the number of delay slots plus 1.
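
The arithmetic behind the "alignment" can be sketched in a few lines of Python. The delay-slot counts are the ones quoted in this thread; the cycle numbering (BNOP issued in cycle 0, the load packet in cycle 1) and the function names are assumptions made for illustration.

```python
# Delay slots as quoted in the discussion above.
DELAY_SLOTS = {"branch": 5, "load": 4, "add": 0}


def latency(kind):
    """Latency in cycles, taken as delay slots plus one."""
    return DELAY_SLOTS[kind] + 1


def first_effect_cycle(kind, issue_cycle):
    """First cycle in which the instruction's effect is visible."""
    return issue_cycle + latency(kind)


def cycle_after_nop(issue_cycle, nop_count):
    """Cycle of the next packet after a multi-cycle NOP issued at issue_cycle."""
    return issue_cycle + nop_count


branch_lands = first_effect_cycle("branch", 0)  # BNOP issued in cycle 0
data_ready = first_effect_cycle("load", 1)      # LDNDW issued in cycle 1
after_nop = cycle_after_nop(2, 4)               # NOP 4 covers cycles 2..5
```

All three values come out to the same cycle, which is why a single NOP 4 serves both as load padding and as the remaining branch delay slots.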

||[A0] LDW *SP[1],A2 ; pull A[][]
[BSZ] LDNDW *INP++,A1:A0
||[BSZ] SUB LEN,8,LEN
||[BSZ] SUB BSZ,1,BSZ
Member

And what's the reason for the large difference in indentation here?

Contributor Author

This is a typo: space vs. tab. Note that if you look at it as a file rather than as a diff, there won't be any irregularities. I've spotted one more such typo. The fix is pushed.

Contributor Author

I suppose this is a kind of reference to the jagged alignment in #3705. The reply there was "if you see it, look for something special." But that doesn't mean that all special things are marked with jagged alignment.

Member

OK, thanks!

Member

Yes, I'm learning...

richsalz added the "branch: master" and "approval: done" labels Jun 20, 2017
levitte pushed a commit that referenced this pull request Jun 21, 2017
[skip ci]

Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from #3708)

dot-asm commented Jun 21, 2017

Merged. Thanks.

dot-asm closed this Jun 21, 2017