Refactor/improve LLVM intrinsic handling #451

shaobo-he · 2019-05-28T00:48:12Z

keram88 · 2019-05-28T02:36:16Z

I'm not sure what the intended scope of this is. I would like to move the Rust functions out into something like isSpecialFunction and processSpecialFunction as well.

shaobo-he · 2019-05-28T03:36:32Z

I think they are out of the scope of this PR.

lib/smack/SmackInstGenerator.cpp

zvonimir · 2019-05-28T05:40:31Z

@keram88 : Please create a separate pull request for that. I guess the function should be called processSpecialRustCall since currently they are all Rust-specific.

lib/smack/SmackInstGenerator.cpp

zvonimir

Looks good to me.

zvonimir · 2019-05-28T20:36:59Z

Please squash and then I'll merge it in.

shaobo-he · 2019-05-28T22:26:36Z

Shouldn't we also add the handling of intrinsics like llvm.fabs and llvm.ctlz?

zvonimir · 2019-05-28T22:43:04Z

I thought you wanted to do that as a separate pull request, which maybe would not be a bad idea.

shaobo-he · 2019-05-29T00:30:33Z

I think we should implement them in this PR so that we have a better comprehensive picture about what's a good way to implement them.

keram88 · 2019-05-29T05:07:22Z

I thought it was separate too.
Anyway, this is my (three-quarters baked?) Prelude.cpp implementation for bswap, so we can see what that approach looks like.
6ae5c93

keram88

This looks fine if we're just moving intrinsics out.

zvonimir · 2019-05-29T06:29:37Z

When you started working on the intrinsics, I said we should try to handle them uniformly, meaning either put all of them into intrinsics.c or put all of them into Prelude.cpp. Now it seems that some will go into intrinsics.c and some into Prelude.cpp. Why is that so? Why could they not all be in the same place?
Wouldn't you agree that modeling all intrinsics in the same place would make sense?

Looking at the bswap implementation, it seems to me that it would be hard to implement nicely in intrinsics.c without introducing some crazy macros or something like that.

Hence, my proposal for refactoring is as follows:

1. Move what is currently in intrinsics.c into Prelude.cpp. I would guess that should be pretty easy.
2. Delete intrinsics.c.
3. Implement a function called generateIntrinsics or something like that in Prelude.cpp. All models for intrinsics should be generated from that function. Be careful about the float and bit-precise flags.

shaobo-he · 2019-05-30T05:30:08Z

By handling LLVM intrinsics uniformly, I thought we meant choosing between two design choices:

1. Replace the call to an LLVM intrinsic with a call to an intrinsic-handling function, which is what we implemented.
2. Replace the call to an LLVM intrinsic with Boogie commands directly. For example, a call to llvm.expect is translated to an assignment.

I voted for the first option because I thought models of LLVM intrinsics could be expressed much more easily in C. For example, there are a lot of resources online about how to count leading zeros in loop-free C code. But Mark figured out ways to implement these models using Boogie functions, which is more efficient than C code. So the second option is more promising because we would not need intrinsics.c, although it's unclear whether we can do this for all the intrinsics.

The bottom line is that there is nothing in intrinsics.c that could be moved to Prelude.cpp, because the former only contains functions that use __SMACK_code. It exists only because we adopted option 1.
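To make the "loop-free C code" remark concrete, here is a minimal sketch of a count-leading-zeros model written without loops. The name clz32 is hypothetical and this is not the code in intrinsics.c; it only illustrates the style of model that the C-function approach would use.

```cpp
#include <cstdint>

// Hypothetical helper, not SMACK's actual model: counts the leading zero
// bits of a 32-bit value without loops, by testing progressively smaller
// upper portions of the word and shifting the value left when they are empty.
unsigned clz32(std::uint32_t x) {
  if (x == 0) return 32;  // all 32 bits are zero
  unsigned n = 0;
  if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
  if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8; }
  if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4; }
  if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2; }
  if ((x & 0x80000000u) == 0) { n += 1; }
  return n;
}
```

Each branch halves the remaining search range, so the function body stays fixed-size for a given bit width.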

shaobo-he · 2019-05-30T05:43:56Z

In spite of our ongoing discussion, I squashed the commits so it's ready to be merged.

zvonimir · 2019-05-30T15:06:19Z

Well, note that your option (1) can be implemented in the following two ways:

- Implement models for intrinsics as C functions in intrinsics.c, which is what we did.
- Implement models for intrinsics as Boogie functions in Prelude.cpp, which is what Mark did.

So I think our options for implementing models for intrinsics are as follows:

1. Generate Boogie code that models an intrinsic directly and inline it. In this case there would be no functions or function calls being generated.
2. Implement models for intrinsics as C functions in intrinsics.c, and then invoke the appropriate function every time an intrinsic is encountered.
3. Implement models for intrinsics as Boogie functions/procedures in Prelude.cpp, and then invoke the appropriate function every time an intrinsic is encountered.
4. Some combination of the above, depending on the intrinsic.

My main goal has been uniformity, meaning to avoid (4).

I actually don't have a strong preference among options (1)-(3). I like (2), which is what @shaobo-he implemented. But then it seems that bswap would be hard to implement cleanly using that approach, which is why @keram88 went with (3). Is that right, @keram88?

I have my doubts about (1). On one hand, it generates Boogie code directly and already inlined, and so there are no extraneous procedure invocations and definitions. That might be good for performance. On the other hand, we lose procedure boundaries that might be leveraged by Corral to abstract away intrinsics that are not needed to prove a property, which might be bad for performance.

Ultimately, I would choose between (1)-(3) based on which approach we think will lead to uniform and clean implementation of models of intrinsics.

shaobo-he · 2019-05-30T21:10:40Z

Let's use a concrete example to illustrate the differences between the four approaches. Say we want to handle llvm.bswap.i16. My understanding of the options is as follows:

| Option | Translation |
| --- | --- |
| 1 | res := i[8:0]++i[16:8] |
| 2 | call res := __SMACK_bswap_i16(i) |
| 3 | res := $bswap.bv16(i) |
| 4 | Not sure |

Am I right here? If so, aren't option 1 and option 3 essentially the same, since $bswap.bv16 is a Boogie function with the inline attribute? Option 2 could also use this Boogie function via __SMACK_code, which means both intrinsics.c and Prelude.cpp would be modified.

Another example is llvm.expect.i32. Does option 3 imply that we add a Boogie function $expect.i32 like this: function {:inline} $expect.i32(i1: i32) returns (i32) { i1 }?

Now I'm inclined toward option 3 because it lets us avoid C functions, which I think can reduce performance significantly. Moreover, we would only write C++ code.

zvonimir · 2019-05-30T21:16:47Z

You are comparing them based on their Boogie syntax. I am more interested in comparing them based on their SMACK implementations. That is, I suggest we pick whichever solution is the easiest and cleanest to implement in SMACK, rather than picking based on what the generated code looks like at the Boogie level. I hope this makes sense.

shaobo-he · 2019-05-30T21:17:57Z

I see. Let me push a version that I think is ideal.

shaobo-he · 2019-05-30T23:01:38Z

I pushed a new version. We will follow option 3.

CMakeLists.txt

lib/smack/SmackInstGenerator.cpp

zvonimir · 2019-05-31T07:35:26Z

@keram88 and @shaobo-he : could we do bswap using this pattern as well? This intrinsic is much more complicated, and so I am curious what its implementation will look like.

shaobo-he · 2019-05-31T18:32:40Z

Yes, we can. The models for bswap are defined inductively, like this:

function {:inline} $bswap.bv16(i1: bv16) returns (bv16) { i1[8:0]++i1[16:8] }
function {:inline} $bswap.bv32(i1: bv32) returns (bv32) { i1[8:0]++$bswap.bv16(i1[24:8])++i1[32:24] }
function {:inline} $bswap.bv48(i1: bv48) returns (bv48) { i1[8:0]++$bswap.bv32(i1[40:8])++i1[48:40] }
function {:inline} $bswap.bv64(i1: bv64) returns (bv64) { i1[8:0]++$bswap.bv48(i1[56:8])++i1[64:56] }
function {:inline} $bswap.bv80(i1: bv80) returns (bv80) { i1[8:0]++$bswap.bv64(i1[72:8])++i1[80:72] }
function {:inline} $bswap.bv96(i1: bv96) returns (bv96) { i1[8:0]++$bswap.bv80(i1[88:8])++i1[96:88] }
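Since each width is defined in terms of the width 16 bits narrower, the declarations above follow a regular pattern that could be generated in a loop. The following is a minimal sketch of such a generator; bswapDecls is a hypothetical name and this is not the code in Prelude.cpp, only an illustration of the inductive structure.

```cpp
#include <string>
#include <vector>

// Hypothetical generator (not SMACK's Prelude.cpp): emits the text of the
// inductively defined Boogie $bswap functions for widths 16, 32, ...,
// up to maxWidth bits.
std::vector<std::string> bswapDecls(unsigned maxWidth) {
  std::vector<std::string> decls;
  for (unsigned w = 16; w <= maxWidth; w += 16) {
    const std::string bv = "bv" + std::to_string(w);
    std::string body;
    if (w == 16) {
      // Base case: swap the two bytes.
      body = "i1[8:0]++i1[16:8]";
    } else {
      // Inductive case: the lowest byte moves to the top, the highest byte
      // moves to the bottom, and the middle is swapped by the narrower model.
      body = "i1[8:0]++$bswap.bv" + std::to_string(w - 16) + "(i1[" +
             std::to_string(w - 8) + ":8])++i1[" + std::to_string(w) + ":" +
             std::to_string(w - 8) + "]";
    }
    decls.push_back("function {:inline} $bswap." + bv + "(i1: " + bv +
                    ") returns (" + bv + ") { " + body + " }");
  }
  return decls;
}
```

Calling bswapDecls(96) yields six strings, one per width from bv16 to bv96.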

lib/smack/Prelude.cpp

lib/smack/SmackInstGenerator.cpp

shaobo-he · 2019-05-31T23:56:55Z

@keram88 It seems LLVM even has fixed point arithmetic intrinsics: https://llvm.org/docs/LangRef.html#id1924. You can put your fixed point encoding here.

keram88 · 2019-06-01T01:55:34Z

Just saw this:

If src == 0 then the result is the size in bits of the type of src 
if is_zero_undef == 0 and undef otherwise.

in:
https://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic
How would we want to handle this?

shaobo-he · 2019-06-01T04:29:27Z

Does it mean we can also let the result be the size in bits even if is_zero_undef == 1?

keram88 · 2019-06-01T04:40:06Z

Pretty much. Apparently some architectures return garbage when the input is zero. Modern x86 and ARM evidently work as expected. Maybe we should just leave it alone since I can only imagine it appearing with higher optimization levels.

zvonimir · 2019-06-01T07:49:59Z

Why do they have both fptrunc instruction and trunc intrinsic?
Their semantics seems identical to me.

zvonimir · 2019-06-01T07:51:10Z

> > > Just saw this:
> > > If src == 0 then the result is the size in bits of the type of src
> > > if is_zero_undef == 0 and undef otherwise.
> > > in:
> > > https://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic
> > > How would we want to handle this?
> >
> > Does it mean we can also let the result be the size of bits even if is_zero_undef==1?
>
> Pretty much. Apparently some architectures return garbage when the input is zero. Modern x86 and ARM evidently work as expected. Maybe we should just leave it alone since I can only imagine it appearing with higher optimization levels.

We should implement the semantics as faithfully as we can, based on the LLVM documentation. So I think that in these situations this function should return an unconstrained value.
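The semantics discussed here can be sketched as a small reference function. This is only an illustration under stated assumptions: ctlz32 is a hypothetical name, and std::nullopt stands in for the "unconstrained value" that a verifier model would leave unspecified; it is not how SMACK encodes it.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical reference model of llvm.ctlz on 32-bit values.
// std::nullopt represents an unconstrained (undef) result.
std::optional<unsigned> ctlz32(std::uint32_t src, bool isZeroUndef) {
  if (src == 0) {
    if (isZeroUndef) return std::nullopt;  // result is undef per the LangRef
    return 32u;                            // size in bits of the type of src
  }
  unsigned n = 0;
  // Walk down from the most significant bit until a set bit is found.
  for (std::uint32_t mask = 0x80000000u; (src & mask) == 0; mask >>= 1) ++n;
  return n;
}
```

Returning the bit width in both zero cases would also be a sound implementation of undef, but it would hide behaviors that the LLVM documentation permits.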

lib/smack/SmackInstGenerator.cpp

zvonimir · 2019-06-01T08:08:40Z

I left my review. Just as a reminder, our goal is to write code that is easy to understand and follow by others. That often does not equate to having the smallest implementation or having the smallest number of branches or using fancy C++ features or anything like that. In fact, it is often quite the opposite. So please think about this when implementing features in SMACK.

lib/smack/SmackInstGenerator.cpp

shaobo-he · 2019-06-01T18:20:33Z

> Why do they have both fptrunc instruction and trunc intrinsic?
> Their semantics seems identical to me.

More precisely, are we talking about these two?
https://llvm.org/docs/LangRef.html#fptrunc-to-instruction and https://llvm.org/docs/LangRef.html#llvm-trunc-intrinsic.

The former truncates a higher-precision floating-point type to a lower-precision one, such as double -> float. The latter rounds a floating-point value to the nearest integer value that is not larger in magnitude, i.e., it rounds toward zero, and keeps the same floating-point type. So they are different.
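The difference can be seen with two C++ operations that behave analogously. This is a rough sketch with hypothetical function names: a double-to-float conversion corresponds roughly to fptrunc, and std::trunc corresponds roughly to the llvm.trunc intrinsic.

```cpp
#include <cmath>

// Analogous to fptrunc: narrows the precision, the value stays (nearly) the same.
float narrowToFloat(double x) { return static_cast<float>(x); }

// Analogous to llvm.trunc: same type, fractional part dropped (rounds toward zero).
double roundTowardZero(double x) { return std::trunc(x); }
```

For example, roundTowardZero(-2.7) gives -2.0, whereas narrowToFloat(-2.7) gives the float closest to -2.7.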

lib/smack/Prelude.cpp

lib/smack/SmackInstGenerator.cpp

lib/smack/Prelude.cpp

lib/smack/SmackInstGenerator.cpp

zvonimir

Looks good to me.

zvonimir · 2019-06-05T07:08:46Z

Please squash this and I'll merge it once @michael-emmi approves it.

This commit contains the following changes:

1. Legacy LLVM intrinsic handling such as those for llvm.expect and llvm.dbg is moved to the designated function.
2. Added handling of LLVM float intrinsics.
3. Added handling of LLVM byte swap intrinsics which leads to adding two Boogie AST classes, BvExtract and BvConcat.
4. Regressions are added to test handling of the aforementioned LLVM intrinsics.

Co-authored-by: Mark Stanislaw Baranowski <mark.s.baranowski@gmail.com>
Co-authored-by: Shaobo He <polarishehn@gmail.com>

michael-emmi

This PR looks good overall, but could use a few simplifications in the stmtMap construction.

michael-emmi · 2019-06-05T19:23:17Z

lib/smack/SmackInstGenerator.cpp

+    {llvm::Intrinsic::bswap, bswap},
+    {llvm::Intrinsic::expect, identity},
+    {llvm::Intrinsic::fabs,
+      [] (CallInst* ci){ assignUnFPFuncApp(ci,"$abs"); }},


Seems like you could simplify the following cases quite a bit by currying. For instance, rather than defining the assignUnFPFuncApp closure above, define the following function:

std::function<void(CallInst*)> SmackInstGenerator::assignUnFPFuncApp(std::string fnBase) {
  return [this, fnBase] (CallInst* ci) {
    // translation: $res := $<func>.bv*($arg1);
    if (SmackOptions::FloatEnabled)
      emit(Stmt::assign(
        rep->expr(ci),
        Expr::fn(indexedName(fnBase, {rep->type(ci->getArgOperand(0)->getType())}),
                 rep->expr(ci->getArgOperand(0)))));
    else
      generateUnModeledCall(ci);
  };
}

Then you could replace the current line with simply:

assignUnFPFuncApp("$abs")},

Good suggestion. Is there a particular reason that this function is made as a member method instead of a local variable?

That’s probably just stylistic preferences; either way resolves the issue.
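The currying idea in this review thread can be illustrated in isolation from SMACK. The sketch below is self-contained and uses stand-ins throughout: FakeCall replaces llvm::CallInst, the handlers return a string instead of emitting Boogie statements, and the names assignUnaryFn, makeTable and the type name "bvfloat" are all hypothetical.

```cpp
#include <functional>
#include <map>
#include <string>

// Stand-in for llvm::CallInst: just the argument's type name and expression.
struct FakeCall {
  std::string argType;
  std::string arg;
};

using Handler = std::function<std::string(const FakeCall&)>;

// Curried helper: fixes the Boogie function base name now and returns a
// handler that is applied to a call later.
Handler assignUnaryFn(std::string fnBase) {
  return [fnBase](const FakeCall& call) {
    return "$res := " + fnBase + "." + call.argType + "(" + call.arg + ");";
  };
}

// Dispatch table from intrinsic name to handler; no per-entry lambda needed.
std::map<std::string, Handler> makeTable() {
  std::map<std::string, Handler> table;
  table["llvm.fabs"] = assignUnaryFn("$abs");
  return table;
}
```

With this shape, adding another unary intrinsic is a one-line table entry, which is the simplification the review asks for.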

shaobo-he requested review from keram88 and zvonimir May 28, 2019 00:48

zvonimir requested changes May 28, 2019
lib/smack/SmackInstGenerator.cpp

zvonimir requested changes May 28, 2019
lib/smack/SmackInstGenerator.cpp

zvonimir approved these changes May 28, 2019

keram88 approved these changes May 29, 2019

shaobo-he force-pushed the refactor-intrinsic-handling branch from ea85127 to 7ac96f0 May 30, 2019 05:40

zvonimir reviewed May 31, 2019
CMakeLists.txt

zvonimir requested changes May 31, 2019
lib/smack/SmackInstGenerator.cpp

shaobo-he force-pushed the refactor-intrinsic-handling branch from 7b6694b to 0d49a1d May 31, 2019 18:28

keram88 force-pushed the refactor-intrinsic-handling branch from 0d49a1d to 0350892 May 31, 2019 20:15

zvonimir requested a review from michael-emmi May 31, 2019 20:31

zvonimir requested changes May 31, 2019
lib/smack/Prelude.cpp
lib/smack/Prelude.cpp

michael-emmi reviewed May 31, 2019
lib/smack/SmackInstGenerator.cpp

keram88 self-requested a review June 1, 2019 05:12

zvonimir requested changes Jun 1, 2019
lib/smack/SmackInstGenerator.cpp
lib/smack/SmackInstGenerator.cpp
lib/smack/SmackInstGenerator.cpp
lib/smack/SmackInstGenerator.cpp

zvonimir reviewed Jun 1, 2019
lib/smack/SmackInstGenerator.cpp

zvonimir reviewed Jun 1, 2019
lib/smack/SmackInstGenerator.cpp

zvonimir requested changes Jun 1, 2019
lib/smack/Prelude.cpp
lib/smack/SmackInstGenerator.cpp
lib/smack/Prelude.cpp

zvonimir requested changes Jun 4, 2019
lib/smack/SmackInstGenerator.cpp
lib/smack/SmackInstGenerator.cpp

keram88 approved these changes Jun 4, 2019

zvonimir approved these changes Jun 5, 2019

shaobo-he force-pushed the refactor-intrinsic-handling branch from c82c45e to 156c2a1 June 5, 2019 17:40

michael-emmi reviewed Jun 5, 2019

currying
ef25f25

michael-emmi approved these changes Jun 5, 2019

zvonimir merged commit 33e468b into develop Jun 5, 2019

zvonimir deleted the refactor-intrinsic-handling branch June 5, 2019 23:35

This was referenced Jun 29, 2019

LLVM Intrinsics handling #424

Closed

Handling different translation of __finite with new LLVM version #449

Closed

Add a more consistent and sophisticated warning system #466

Merged
