New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add memcpy to alive2 #88
Conversation
llvm_util/utils.cpp
Outdated
|
||
RetTy visitMemCpyInst(llvm::MemCpyInst &i) { | ||
auto fn = i.getCalledFunction(); | ||
if (!fn) // TODO: check real memcpy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this error checking isn't needed. The intrinsic should be valid
ir/instr.h
Outdated
unsigned align_dst, align_src; | ||
public: | ||
Memcpy(Value &dst, Value &src, Value &bytes, unsigned align_dst, unsigned align_src) | ||
: Instr(Type::voidTy, "memcpy"), dst(dst), src(src), bytes(bytes), align_dst(align_dst), align_src(align_src) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these lines are too long. Please wrap at 80 columns
ir/memory.cpp
Outdated
@@ -304,10 +312,27 @@ void Memory::memset(const expr &p, const StateValue &val, const expr &bytes, | |||
|
|||
void Memory::memcpy(const expr &d, const expr &s, const expr &bytes, | |||
unsigned align_dst, unsigned align_src) { | |||
// TODO: Add ub condition - bytes == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this requites some API changes. Either address it in this PR or in a subsequent patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that when 'bytes' is zero and 'src (or tgt) pointer' is poison, the code has ub or not.
I am going to address similar cases in a subsequent patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. From my understanding, LLVM folks didn't want to be too evil here, so it should be UB only if bytes > 0
ir/memory.cpp
Outdated
@@ -149,6 +149,14 @@ void Pointer::is_dereferenceable(unsigned bytes, unsigned align) { | |||
is_dereferenceable(expr::mkUInt(bytes, m.bits_for_offset), align); | |||
} | |||
|
|||
expr disjoint(expr offset1, const expr &len1, expr offset2, const expr &len2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be static
ir/memory.cpp
Outdated
@@ -149,6 +149,14 @@ void Pointer::is_dereferenceable(unsigned bytes, unsigned align) { | |||
is_dereferenceable(expr::mkUInt(bytes, m.bits_for_offset), align); | |||
} | |||
|
|||
expr disjoint(expr offset1, const expr &len1, expr offset2, const expr &len2) { | |||
return ((offset1+len1).ule(offset2) || (offset2+len2).ule(offset1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has too many parenthesis
ir/memory.cpp
Outdated
@@ -149,6 +149,14 @@ void Pointer::is_dereferenceable(unsigned bytes, unsigned align) { | |||
is_dereferenceable(expr::mkUInt(bytes, m.bits_for_offset), align); | |||
} | |||
|
|||
expr disjoint(expr offset1, const expr &len1, expr offset2, const expr &len2) { | |||
return ((offset1+len1).ule(offset2) || (offset2+len2).ule(offset1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this condition correct if offset + len overflows? Or can't that happen (because of the half of the address space limitation)? If that's the case, we need a comment there explaining the assumption/reasoning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I use is_disjoint in memcpy, overflow checking is run in is_dereferenceable.
If we need this condition at other code, overflow condition is considered.
I add a comment in the code also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments.
Also, there are still many lines over 80 characters. The build is broken; please have a look.
llvm_util/utils.cpp
Outdated
} | ||
|
||
RetTy visitMemCpyInst(llvm::MemCpyInst &i) { | ||
auto ty = llvm_type2alive(i.getType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't needed. same for memset
ir/instr.cpp
Outdated
} | ||
|
||
StateValue Memset::toSMT(State &s) const { | ||
// TODO: check following lines are correct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given the discussion below, it seems that we should have someting like s.addUB(vbytes.ugt(0).implies(np_ptr))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified memset, memcpy and is_dereferenceable using your conditions.
is_disjoint doesn't need modifying because when bytes is 0, is_disjoint is already no ub.
ir/instr.cpp
Outdated
} | ||
|
||
|
||
void Memcpy::print(std::ostream &os) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no std:: here
ir/memory.cpp
Outdated
@@ -304,10 +312,27 @@ void Memory::memset(const expr &p, const StateValue &val, const expr &bytes, | |||
|
|||
void Memory::memcpy(const expr &d, const expr &s, const expr &bytes, | |||
unsigned align_dst, unsigned align_src) { | |||
// TODO: Add ub condition - bytes == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. From my understanding, LLVM folks didn't want to be too evil here, so it should be UB only if bytes > 0
e93bb6c
to
7652232
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good. almost there :)
ir/instr.cpp
Outdated
auto &[vdst, np] = s[*dst]; | ||
auto &[vsrc, np2] = s[*src]; | ||
auto &[vbytes, np3] = s[*bytes]; | ||
s.addUB(vbytes.ugt(0).implies(np)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please combine the two checks into one: ".implies(np && np2)"
ir/instr.cpp
Outdated
auto &[vptr, np] = s[*ptr]; | ||
auto &[vbytes, np2] = s[*bytes]; | ||
s.addUB(vbytes.ugt(0).implies(np)); | ||
s.addUB(vbytes.ugt(0).implies(np2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this second line should be simply "s.addUB(np2)". If vbytes is poison, the copy is UB since the value to copy is unbounded.
ir/instr.cpp
Outdated
auto &[vbytes, np3] = s[*bytes]; | ||
s.addUB(vbytes.ugt(0).implies(np)); | ||
s.addUB(vbytes.ugt(0).implies(np2)); | ||
s.addUB(vbytes.ugt(0).implies(np3)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as for memset: should be addUB(np3) only.
Also, please rename vars to np_dst, np_src, or something like that. np2/np3 are error prone :)
ir/instr.cpp
Outdated
} | ||
|
||
expr Memcpy::getTypeConstraints(const Function &f) const { | ||
return dst->getType().enforcePtrType() && dst->getType().enforcePtrType() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please do this like:
return enforcefoo() &&
enforceBar &&
etc;
7652232
to
c6bad7c
Compare
I reflected your comments and pushed it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, thanks! I left a few more comments, including a question about correctness of the encoding.
ir/memory.h
Outdated
@@ -68,6 +68,7 @@ class Pointer { | |||
smt::expr is_aligned(unsigned align) const; | |||
void is_dereferenceable(unsigned bytes, unsigned align); | |||
void is_dereferenceable(const smt::expr &bytes, unsigned align); | |||
void is_disjoint(const smt::expr &len1, const Pointer &ptr2, const smt::expr &len2) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is over 80 characters
ir/memory.cpp
Outdated
@@ -149,6 +151,20 @@ void Pointer::is_dereferenceable(unsigned bytes, unsigned align) { | |||
is_dereferenceable(expr::mkUInt(bytes, m.bits_for_offset), align); | |||
} | |||
|
|||
// general disjoint check for unsigned integer | |||
static expr disjoint(expr begin1, const expr &len1, expr begin2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: begin1 & begin2 should be const references as well.
ir/memory.cpp
Outdated
Pointer idx(*this, expr::mkVar(name.c_str(), dst.bits())); | ||
|
||
expr cond = idx.uge(dst).both() && idx.ult(dst + bytes).both(); | ||
expr val = expr::mkIf(cond, blocks_val.load((src + idx.get_offset()).release()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this lines seems over 80 characters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure the condition for the src load is correct. It assumes that offset of src = 0.
Remember that memcpy both src and dst may start in the middle of an object.
So we need src + "delta between start of copy and idx". Which may not be idx.get_offset()
if the memcpy wasn't done the start of the object.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your comment. I fixed this bug.
c6bad7c
to
05c0d1d
Compare
Sorry for late response. I reflect your comments. |
Add memcpy's SMT encoding(in memory.cpp).
Make Alive2 interpret memset and memcpy correctly(in instr.cpp, instr.h, utils.cpp).
Previous Alive2 doesn't interpret memset and memcpy, because there is no visit function for them.
So basic tests in 'test/memcpy/' made 'LLVM error: Unsupported instruction'.
I fixed it, as a result basic tests which I made are successfully verified by Alive2.