-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[gen]DC CVAU and IC IVAU relaxations #707
Conversation
a23d589
to
5905deb
Compare
de16555
to
ea2295c
Compare
This PR has also been tested using the following configuration file. This generates 500 tests for me and scanning through the tests I haven't been able to see anything clearly wrong. Note that this conf file uses
|
Hi @hugookeeffe |
Hi @murzinv
I completely agree, I have another PR open at the moment, #703, where I have made the change for this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Specifically, I looked through the code, the examples and some of the generated tests, which all seemed good.
gen/AArch64Compile_gen.ml
Outdated
| D.CTRL -> | ||
let c,st = emit_ctrl_gen csel st vdep r1 in | ||
let init,cs,st = emit_fence st p init n f in | ||
None,init,insert_isb false c cs,st | ||
| D.CTRLISYNC -> | ||
let c,st = emit_ctrl_gen csel st vdep r1 in | ||
let init,cs,st = emit_fence st p init n f in | ||
None,init,insert_isb true c cs,st |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This two cases can be merged into one like so:
| D.CTRL -> | |
let c,st = emit_ctrl_gen csel st vdep r1 in | |
let init,cs,st = emit_fence st p init n f in | |
None,init,insert_isb false c cs,st | |
| D.CTRLISYNC -> | |
let c,st = emit_ctrl_gen csel st vdep r1 in | |
let init,cs,st = emit_fence st p init n f in | |
None,init,insert_isb true c cs,st | |
| D.CTRL | D.CTRLISYNC -> | |
let c,st = emit_ctrl_gen csel st vdep r1 in | |
let init,cs,st = emit_fence st p init n f in | |
None,init,insert_isb (is_ctrlisync dp) c cs,st |
What do you think?
Although, I think CTRLISYNC is redundant (one can insert an ISB pseudo relation as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the suggestion! I have implemented the change
let vdep = node2vdep n1 in | ||
match dp with | ||
| D.ADDR | ||
| D.DATA -> let init,cs, st = emit_fence st p init n f in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried generating some examples with DpAddr and DpCtrl but I didn't get what I was expecting. For example:
$> diyone7 -moreedges true -arch AArch64 -variant self "DMB.ISHdWW Rfe [DpAddrdR ISB DC.CVAUp DSB.ISH IC.IVAUp DSB.ISH ISB Ifre]"
Throws an error that there can't be an address dependency to code. Which makes sense. But then I think I don't know how to use DpAddr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that is strange, what behaviour would you expect here? Ctrl does seem to work, but not Addr. should we expect both to work or neither?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what are the meaningful tests we can produce with DpAddr*R then. For example, if we ignore the address dependency to code location:
$> diyone7 -moreedges true -arch AArch64 -variant self "DMB.ISHdWW Rfe [DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB] Fre"
AArch64 A
"DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Fre"
Generator=diyone7 (version 7.56+03)
Com=Rf Fr
Orig=DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Fre
{
0:X1=x; 0:X3=y;
1:X0=y; 1:X2=x;
}
P0 | P1 ;
MOV W0,#1 | LDR W1,[X0] ;
STR W0,[X1] | DC CVAU,X2 ;
DMB ISH | DSB ISH ;
MOV W2,#1 | IC IVAU,X2 ;
STR W2,[X3] | DSB ISH ;
| ISB ;
| EOR W3,W1,W1 ;
| LDR W4,[X2,W3,SXTW] ;
exists (1:X1=1 /\ 1:X4=0)
The DC doesn't have an address dependency from the 1st LDR in P1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignoring ifetch for a minute:
$> diyone7 -moreedges true -arch AArch64 -variant self "DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Fre"
AArch64 A
"DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Fre"
Generator=diyone7 (version 7.56+03)
Com=Rf Fr
Orig=DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Fre
{
0:X1=x; 0:X3=y;
1:X0=y; 1:X2=x;
}
P0 | P1 ;
MOV W0,#1 | LDR W1,[X0] ;
STR W0,[X1] | DC CVAU,X2 ;
DMB ISH | DSB ISH ;
MOV W2,#1 | IC IVAU,X2 ;
STR W2,[X3] | DSB ISH ;
| ISB ;
| EOR W3,W1,W1 ;
| LDR W4,[X2,W3,SXTW] ;
exists (1:X1=1 /\ 1:X4=0)
But I would expect:
AArch64 A
"DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Fre"
Generator=diyone7 (version 7.56+03)
Com=Rf Fr
Orig=DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Fre
{
0:X1=x; 0:X3=y;
1:X0=y; 1:X2=x;
}
P0 | P1 ;
MOV W0,#1 | LDR W1,[X0] ;
| EOR W3,W1,W1 ;
| ADD X5,X2,W3,SXTW ;
STR W0,[X1] | DC CVAU,X5 ;
DMB ISH | DSB ISH ;
MOV W2,#1 | IC IVAU,X5 ;
STR W2,[X3] | DSB ISH ;
| ISB ;
| LDR W4,[X5] ;
exists (1:X1=1 /\ 1:X4=0)
And if it won't be too hard, I think for ifetch we should get this test:
diyone7 -moreedges true -arch AArch64 -variant self "DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Ifre"
AArch64 A
"DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Ifre"
Generator=diyone7 (version 7.56+03)
Com=Rf Ifr
Orig=DMB.ISHdWW Rfe DpAddrdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB Ifre
{
0:X0=NOP; 0:X1=P1:Lself00; 0:X3=x;
1:X0=x; 1:X2=P1:Lself00;
}
P0 | P1 ;
STR W0,[X1] | LDR W1,[X0] ;
| EOR W4,W1,W1 ;
| ADD X3,X2,W4,SXTW ;
DMB ISH | DC CVAU,X3 ;
MOV W2,#1 | DSB ISH ;
STR W2,[X3] | IC IVAU,X3 ;
| DSB ISH ;
| ISB ;
| Lself00: ;
| B L00 ;
| MOV W3,#2 ;
| B L01 ;
| L00: ;
| MOV W3,#1 ;
| L01: ;
exists (1:X1=1 /\ 1:X3=1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Nikos, I have implemented this change now for the normal case without ifetch, however the case with ifetch is a littl more complicated so for now I am throwing an error. Let me know if it is now doing what you expect it to.
I will now also rebase the patch on top of master
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR has now been rebased. let me know if there is anything else you would like me to do for this
4f1afc2
to
c647ff8
Compare
This change implements the ability to generate DC CVAU and IC IVAU instructions as a fence instruction. This allows the user to implement these instructions with more fine-grain control into litmus tests. The relaxations for DC CVAU and IC IVAU are: DC.CVAU<n/p> IC.IVAU<n/p> where n and p allow the user to choose whether the DC/IC instruction will use the previous location or the next location. Examples that couldn't be generated previously: $> diyone7 -moreedges true -arch AArch64 -variant self "PodWW DC.CVAUp DMB.ISH Rfe PodRR DSB.ISH IC.IVAUn DSB.ISH ISB FrePI" AArch64 A "PodWWIP DC.CVAUp DMB.ISH Rfe PodRR DSB.ISH IC.IVAUn DSB.ISH ISB FrePI" Generator=diyone7 (version 7.56+03) Com=Rf Fr Orig=PodWWIP DC.CVAUp DMB.ISH Rfe PodRR DSB.ISH IC.IVAUn DSB.ISH ISB FrePI { 0:X0=instr:"NOP"; 0:X1=P1:Lself00; 0:X3=x; 1:X0=x; 1:X2=P1:Lself00; } P0 | P1 ; STR W0,[X1] | LDR W1,[X0] ; DC CVAU,X1 | DSB ISH ; DMB ISH | IC IVAU,X2 ; MOV W2,herd#1 | DSB ISH ; STR W2,[X3] | ISB ; | Lself00: ; | B .+12 ; | MOV W3,herd#2 ; | B .+8 ; | MOV W3,herd#1 ; exists (1:X1=1 /\ 1:X3=1) $> diyone7 -moreedges true -arch AArch64 -variant self "DMB.ISHdWW Rfe DpCtrlIsbdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB FrePI" AArch64 A "DMB.ISHdWWIP Rfe DpCtrlIsbdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB FrePI" Generator=diyone7 (version 7.56+03) Com=Rf Fr Orig=DMB.ISHdWWIP Rfe DpCtrlIsbdR DC.CVAUn DSB.ISH IC.IVAUn DSB.ISH ISB FrePI { 0:X0=instr:"NOP"; 0:X1=P1:Lself00; 0:X3=x; 1:X0=x; 1:X2=P1:Lself00; } P0 | P1 ; STR W0,[X1] | LDR W1,[X0] ; DMB ISH | CBNZ W1,LC00 ; MOV W2,herd#1 | LC00: ; STR W2,[X3] | ISB ; | DC CVAU,X2 ; | DSB ISH ; | IC IVAU,X2 ; | DSB ISH ; | ISB ; | Lself00: ; | B .+12 ; | MOV W3,herd#2 ; | B .+8 ; | MOV W3,herd#1 ; exists (1:X1=1 /\ 1:X3=1)
Thanks @hugookeeffe. This PR LGTM and I will merge tomorrow at the end of the day unless anyone objects or needs more time to review this. |
Merged into master, thanks @hugookeeffe! |
This pull request implements the ability to generate DC CVAU and IC IVAU instructions as a fence instruction. This allows the user to implement these instructions with more fine-grain control into litmus tests.
This PR also implements the ability for dependencies to be generated before Insert relaxations.
The relaxations for DC CVAU and IC IVAU are:
where
n
andp
allow the user to choose whether DC/IC act on thep
revious location or then
ext locationexamples that couldn't be generated previously: