Skip to content

Conversation

shaobo-he
Copy link
Contributor

@shaobo-he shaobo-he commented May 28, 2019

  • Refactor existing intrinsic handling
  • Add FP intrinsic handling
  • Add BV intrinsic handling
  • Add Fixed Point intrinsic handling
  • Add regressions

@shaobo-he shaobo-he requested review from keram88 and zvonimir May 28, 2019 00:48
@keram88
Copy link
Contributor

keram88 commented May 28, 2019

I'm not sure what the intended scope of this is. I would like to move the Rust functions out into something like isSpecialFunction and processSpecialFunction as well.

@shaobo-he
Copy link
Contributor Author

I'm not sure what the intended scope of this is. I would like to move the Rust functions out into something like isSpecialFunction and processSpecialFunction as well.

I think they are out of the scope of this PR.

@zvonimir
Copy link
Member

@keram88 : Please create a separate pull request for that. I guess the function should be called processSpecialRustCall since currently they are all Rust-specific.

Copy link
Member

@zvonimir zvonimir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@zvonimir
Copy link
Member

Please squash and then I'll merge it in.

@shaobo-he
Copy link
Contributor Author

Please squash and then I'll merge it in.

Shouldn't we also add the handling of intrinsics like llvm.fabs and llvm.ctlz?

@zvonimir
Copy link
Member

I thought you wanted to do that as a separate pull request, which maybe would not be a bad idea.

@shaobo-he
Copy link
Contributor Author

shaobo-he commented May 29, 2019

I thought you wanted to do that as a separate pull request, which maybe would not be a bad idea.

I think we should implement them in this PR so that we have a better comprehensive picture about what's a good way to implement them.

@keram88
Copy link
Contributor

keram88 commented May 29, 2019

I thought it was separate too.
Anyway, this is my (three-quarters baked?) Prelude.cpp implementation for bswap, so we can see what that approach looks like.
6ae5c93

Copy link
Contributor

@keram88 keram88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine if we're just moving intrinsics out.

@zvonimir
Copy link
Member

zvonimir commented May 29, 2019

When you started working on the intrinsics, I said we should try to handle them uniformly, meaning either put all of them into intrinsics.c or put all of them into Prelude.cpp. Now it seems that some will go into intrinsics.c and some into Prelude.cpp. Why is that so? Why could they not all be at the same place?
Wouldn't you agree that having all intrinsics be modeled at the same place would make sense?

Looking at the bvswap implementation, to me it seems it would be hard to implement it nicely in intrinsics.c without introducing some crazy macros or something like that.

Hence, my proposal for refactoring is as follows:

  1. Move what is currently in intrinsics.c into Prelude.cpp. I would guess that should be pretty easy.
  2. Delete intrinsics.c.
  3. Implement a function called generateIntrinsics or something like that in Prelude.cpp. All models for intrinsics should be generated from that functions. Be careful about float and bit-precise flags.

@shaobo-he
Copy link
Contributor Author

shaobo-he commented May 30, 2019

When you started working on the intrinsics, I said we should try to handle them uniformly, meaning either put all of them into intrinsics.c or put all of them into Prelude.cpp. Now it seems that some will go into intrinsics.c and some into Prelude.cpp. Why is that so? Why could they not all be at the same place?
Wouldn't you agree that having all intrinsics be modeled at the same place would make sense?

Looking at the bvswap implementation, to me it seems it would be hard to implement it nicely in intrinsics.c without introducing some crazy macros or something like that.

Hence, my proposal for refactoring is as follows:

  1. Move what is currently in intrinsics.c into Prelude.cpp. I would guess that should be pretty easy.
  2. Delete intrinsics.c.
  3. Implement a function called generateIntrinsics or something like that in Prelude.cpp. All models for intrinsics should be generated from that functions. Be careful about float and bit-precise flags.

By talking about handling LLVM intrinsics uniformly, I thought we meant to choose one between the two design choices,

  1. Replace the function call to an LLVM intrinsic with a call to the intrinsic handling function which is what we implemented.
  2. Replace the function call to an LLVM intrinsic with Boogie commands directly. For example, a call to llvm.expect is translated to an assignment.

I voted for the first option because I thought models of LLVM intriniscs can be much more easily expressed using C. For example, there are a lot of resources online about how to implement counting the leading zeros in loop-free C code. But Mark figures out ways to implement these models using Boogie functions, which is more efficient than C code. So the second option is more promising because we do not need intrinsics.c, although it's unclear that we can do this to all the intriniscs.

The bottom line is that there is nothing in intrinsics.c that could be moved to Prelude.cpp because the former only contains functions using __SMACK_code. Its existence is due to our adoption of option 1.

@shaobo-he shaobo-he force-pushed the refactor-intrinsic-handling branch from ea85127 to 7ac96f0 Compare May 30, 2019 05:40
@shaobo-he
Copy link
Contributor Author

In spite of our ongoing discussion, I squashed the commits so it's ready to be merged.

@zvonimir
Copy link
Member

When you started working on the intrinsics, I said we should try to handle them uniformly, meaning either put all of them into intrinsics.c or put all of them into Prelude.cpp. Now it seems that some will go into intrinsics.c and some into Prelude.cpp. Why is that so? Why could they not all be at the same place?
Wouldn't you agree that having all intrinsics be modeled at the same place would make sense?
Looking at the bvswap implementation, to me it seems it would be hard to implement it nicely in intrinsics.c without introducing some crazy macros or something like that.
Hence, my proposal for refactoring is as follows:

  1. Move what is currently in intrinsics.c into Prelude.cpp. I would guess that should be pretty easy.
  2. Delete intrinsics.c.
  3. Implement a function called generateIntrinsics or something like that in Prelude.cpp. All models for intrinsics should be generated from that functions. Be careful about float and bit-precise flags.

By talking about handling LLVM intrinsics uniformly, I thought we meant to choose one between the two design choices,

  1. Replace the function call to an LLVM intrinsic with a call to the intrinsic handling function which is what we implemented.
  2. Replace the function call to an LLVM intrinsic with Boogie commands directly. For example, a call to llvm.expect is translated to an assignment.

I voted for the first option because I thought models of LLVM intriniscs can be much more easily expressed using C. For example, there are a lot of resources online about how to implement counting the leading zeros in loop-free C code. But Mark figures out ways to implement these models using Boogie functions, which is more efficient than C code. So the second option is more promising because we do not need intrinsics.c, although it's unclear that we can do this to all the intriniscs.

The bottom line is that there is nothing in intrinsics.c that could be moved to Prelude.cpp because the former only contains functions using __SMACK_code. Its existence is due to our adoption of option 1.

Well, note that your option (1) can be implemented in the following two ways:

  1. Implement models for intrinsics as C functions in intrinsics.c, which is what we did.
  2. Implement models for intrinsics as Boogie functions in Prelude.cpp, which is what Mark did.

So I think our options for implementing models for intrinsics are as follows:

  1. Generate Boogie code that models an intrinsic directly and inline it. In this case there would be no functions or function calls being generated.
  2. Implement models for intrinsics as C functions in intrinsics.c, and then invoke the appropriate function every time an intrinsic is encountered.
  3. Implement models for intrinsics as Boogie functions/procedures in Prelude.cpp, and then invoke the appropriate function every time an intrinsic is encountered.
  4. Some combination of the above depending on an intrinsic.

My main goal has been uniformity, meaning to avoid (4).

I actually don't have a strong preference among options (1)-(3). I like (2), which is what @shaobo-he implemented. But then it seems that bwswap would be hard to implement cleanly using that approach, which is why @keram88 went with (3). Is that right @keram88?

I have my doubts about (1). On one hand, it generates Boogie code directly and already inlined, and so there are no extraneous procedure invocations and definitions. That might be good for performance. On the other hand, we lose procedure boundaries that might be leveraged by Corral to abstract away intrinsics that are not needed to prove a property, which might be bad for performance.

Ultimately, I would choose between (1)-(3) based on which approach we think will lead to uniform and clean implementation of models of intrinsics.

@shaobo-he
Copy link
Contributor Author

shaobo-he commented May 30, 2019

So I think our options for implementing models for intrinsics are as follows:

1. Generate Boogie code that models an intrinsic directly and inline it. In this case there would be no functions or function calls being generated.

2. Implement models for intrinsics as C functions in `intrinsics.c`, and then invoke the appropriate function every time an intrinsic is encountered.

3. Implement models for intrinsics as Boogie functions/procedures in `Prelude.cpp`, and then invoke the appropriate function every time an intrinsic is encountered.

4. Some combination of the above depending on an intrinsic.

Let's use a concrete example to illustrate the differences between the four approaches. For example, we want to handle lvm.bswap.i16, my understanding of 1 is listed as follows,

Option Translation
1 res := i[8:0]++i[16:8]
2 call res := __SMACK_bswap_i16(i)
3 res := $bswap.bv16(i)
4 Not sure

Am I right here? It so, isn't option 1 and option 3 essentially the same since $bswap.bv16 is Boogie function with inline attribute. Option 2 could also use this Boogie function via __SMACK_code, which means both intrinsic.c and Prelude.cpp are both modified.

Another example is llvm.expect.i32. Does option 3 imply that we add a Boogie function $expect.i32 like this function {:inline} $expect.i32(i1:i32) returns (i32) {i}.

Now I'm inclined to 3 because we are able to avoid C functions which, to me can reduce performance significantly. Moreover, we only write C++ code.

@zvonimir
Copy link
Member

You are comparing them based on their Boogie syntax. I am more interested in comparing them based on their SMACK implementations. Meaning, I suggest we pick whichever solution is the easiest and cleanest to implement in SMACK, as opposed to picking them based on how the generated code looks like at the Boogie level. I hope this makes sense.

@shaobo-he
Copy link
Contributor Author

You are comparing them based on their Boogie syntax. I am more interested in comparing them based on their SMACK implementations. Meaning, I suggest we pick whichever solution is the easiest and cleanest to implement in SMACK, as opposed to picking them based on how the generated code looks like at the Boogie level. I hope this makes sense.

I see. Let me push a version that I thought is ideal.

@shaobo-he
Copy link
Contributor Author

I pushed a new version. We will follow the option 3.

@zvonimir
Copy link
Member

@keram88 and @shaobo-he : could we do bvswap using this pattern as well? This intrinsic is much more complicated, and so I am curious how its implementation will look like.

@shaobo-he shaobo-he force-pushed the refactor-intrinsic-handling branch from 7b6694b to 0d49a1d Compare May 31, 2019 18:28
@shaobo-he
Copy link
Contributor Author

@keram88 and @shaobo-he : could we do bvswap using this pattern as well? This intrinsic is much more complicated, and so I am curious how its implementation will look like.

Yes, we can. The models for bvswap is inductively defined like this,

function {:inline} $bswap.bv16(i1: bv16) returns (bv16) { i1[8:0]++i1[16:8] }
function {:inline} $bswap.bv32(i1: bv32) returns (bv32) { i1[8:0]++$bswap.bv16(i1[24:8])++i1[32:24] }
function {:inline} $bswap.bv48(i1: bv48) returns (bv48) { i1[8:0]++$bswap.bv32(i1[40:8])++i1[48:40] }
function {:inline} $bswap.bv64(i1: bv64) returns (bv64) { i1[8:0]++$bswap.bv48(i1[56:8])++i1[64:56] }
function {:inline} $bswap.bv80(i1: bv80) returns (bv80) { i1[8:0]++$bswap.bv64(i1[72:8])++i1[80:72] }
function {:inline} $bswap.bv96(i1: bv96) returns (bv96) { i1[8:0]++$bswap.bv80(i1[88:8])++i1[96:88] }

@keram88 keram88 force-pushed the refactor-intrinsic-handling branch from 0d49a1d to 0350892 Compare May 31, 2019 20:15
@zvonimir zvonimir requested a review from michael-emmi May 31, 2019 20:31
@shaobo-he
Copy link
Contributor Author

@keram88 It seems LLVM even has fixed point arithmetic intrinsics: https://llvm.org/docs/LangRef.html#id1924. You can put your fixed point encoding here.

@keram88
Copy link
Contributor

keram88 commented Jun 1, 2019

Just saw this:

If src == 0 then the result is the size in bits of the type of src 
if is_zero_undef == 0 and undef otherwise.

in:
https://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic
How would we want to handle this?

@shaobo-he
Copy link
Contributor Author

Just saw this:

If src == 0 then the result is the size in bits of the type of src 
if is_zero_undef == 0 and undef otherwise.

in:
https://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic
How would we want to handle this?

Does it mean we can also let the result be the size of bits even if is_zero_undef==1?

@keram88
Copy link
Contributor

keram88 commented Jun 1, 2019

Just saw this:

If src == 0 then the result is the size in bits of the type of src 
if is_zero_undef == 0 and undef otherwise.

in:
https://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic
How would we want to handle this?

Does it mean we can also let the result be the size of bits even if is_zero_undef==1?

Pretty much. Apparently some architectures return garbage when the input is zero. Modern x86 and ARM evidently work as expected. Maybe we should just leave it alone since I can only imagine it appearing with higher optimization levels.

@keram88 keram88 self-requested a review June 1, 2019 05:12
@zvonimir
Copy link
Member

zvonimir commented Jun 1, 2019

Why do they have both fptrunc instruction and trunc intrinsic?
Their semantics seems identical to me.

@zvonimir
Copy link
Member

zvonimir commented Jun 1, 2019

Just saw this:

If src == 0 then the result is the size in bits of the type of src 
if is_zero_undef == 0 and undef otherwise.

in:
https://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic
How would we want to handle this?

Does it mean we can also let the result be the size of bits even if is_zero_undef==1?

Pretty much. Apparently some architectures return garbage when the input is zero. Modern x86 and ARM evidently work as expected. Maybe we should just leave it alone since I can only imagine it appearing with higher optimization levels.

We should implement the semantics as fateful as we can, based on the LLVM documentation. So I think that in these situations this function should return an unconstrained value.

@zvonimir
Copy link
Member

zvonimir commented Jun 1, 2019

I left my review. Just as a reminder, our goal is to write code that is easy to understand and follow by others. That often does not equate to having the smallest implementation or having the smallest number of branches or using fancy C++ features or anything like that. In fact, it is often quite the opposite. So please think about this when implementing features in SMACK.

@shaobo-he
Copy link
Contributor Author

Why do they have both fptrunc instruction and trunc intrinsic?
Their semantics seems identical to me.

More precisely are we talking about these two?
https://llvm.org/docs/LangRef.html#fptrunc-to-instruction and https://llvm.org/docs/LangRef.html#llvm-trunc-intrinsic.

The former truncates a higher precision floating-point type to a lower precision one such as double -> float. The latter rounds a floating-point value to an integer value that is less than or equal to the floating-point value. So they are different.

Copy link
Member

@zvonimir zvonimir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@zvonimir
Copy link
Member

zvonimir commented Jun 5, 2019

Please squash this and I'll merge it once @michael-emmi approves it.

This commit contains the following changes,
1. Legacy LLVM intrinsic handling such as those for llvm.expect and
llvm.dbg is moved to the designated function.
2. Added handling of LLVM float intrinsics.
3. Added handling of LLVM byte swap intrinsics which leads to adding two Boogie
AST classes, BvExtract and BvConcat.
4. Regressions are added to test handling of the aforementioned LLVM intrinsics.

Co-authored-by: Mark Stanislaw Baranowski <mark.s.baranowski@gmail.com>
Co-authored-by: Shaobo He <polarishehn@gmail.com>
@shaobo-he shaobo-he force-pushed the refactor-intrinsic-handling branch from c82c45e to 156c2a1 Compare June 5, 2019 17:40
Copy link
Contributor

@michael-emmi michael-emmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good overall, but could use a few simplifications in the stmtMap construction.

{llvm::Intrinsic::bswap, bswap},
{llvm::Intrinsic::expect, identity},
{llvm::Intrinsic::fabs,
[] (CallInst* ci){ assignUnFPFuncApp(ci,"$abs"); }},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you could simplify the following cases quite a bit by currying. For instance, rather than defining the assignUnFPFuncApp closure above, define the following function:

std::function<void(CallInst*)> SmackInstGenerator::assignUnFPFuncApp(std::string fnBase) {
  return [this, fnBase] (CallInst* ci) {
    // translation: $res := $<func>.bv*($arg1);
    if (SmackOptions::FloatEnabled)
      emit(Stmt::assign(
            rep->expr(ci),
            Expr::fn(indexedName(fnBase,
                {rep->type(ci->getArgOperand(0)->getType())}),
              rep->expr(ci->getArgOperand(0)))));
    else
      generateUnModeledCall(ci);
  };
}

Then you could replace the current line with simply:

assignUnFPFuncApp("$abs")},

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. Is there a particular reason that this function is made as a member method instead of a local variable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s probably just stylistic preferences; either way resolves the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants