Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add memcpy to alive2 #88

Merged
merged 5 commits into from Sep 14, 2019
Merged

Conversation

Dongjoo-Kim
Copy link
Contributor

  1. Add memcpy's SMT encoding(in memory.cpp).

  2. Make Alive2 interpret memset and memcpy correctly(in instr.cpp, instr.h, utils.cpp).

    Previous Alive2 doesn't interpret memset and memcpy, because there is no visit function for them.
    So basic tests in 'test/memcpy/' made 'LLVM error: Unsupported instruction'.
    I fixed it, as a result basic tests which I made are successfully verified by Alive2.


RetTy visitMemCpyInst(llvm::MemCpyInst &i) {
auto fn = i.getCalledFunction();
if (!fn) // TODO: check real memcpy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this error checking isn't needed. The intrinsic should be valid

tests/memcpy/1-src.ll Show resolved Hide resolved
ir/instr.h Outdated
unsigned align_dst, align_src;
public:
Memcpy(Value &dst, Value &src, Value &bytes, unsigned align_dst, unsigned align_src)
: Instr(Type::voidTy, "memcpy"), dst(dst), src(src), bytes(bytes), align_dst(align_dst), align_src(align_src) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these lines are too long. Please wrap at 80 columns

ir/memory.cpp Outdated
@@ -304,10 +312,27 @@ void Memory::memset(const expr &p, const StateValue &val, const expr &bytes,

void Memory::memcpy(const expr &d, const expr &s, const expr &bytes,
unsigned align_dst, unsigned align_src) {
// TODO: Add ub condition - bytes == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this requites some API changes. Either address it in this PR or in a subsequent patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that when 'bytes' is zero and 'src (or tgt) pointer' is poison, the code has ub or not.

I am going to address similar cases in a subsequent patch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. From my understanding, LLVM folks didn't want to be too evil here, so it should be UB only if bytes > 0

ir/memory.cpp Outdated
@@ -149,6 +149,14 @@ void Pointer::is_dereferenceable(unsigned bytes, unsigned align) {
is_dereferenceable(expr::mkUInt(bytes, m.bits_for_offset), align);
}

expr disjoint(expr offset1, const expr &len1, expr offset2, const expr &len2) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be static

ir/memory.cpp Outdated
@@ -149,6 +149,14 @@ void Pointer::is_dereferenceable(unsigned bytes, unsigned align) {
is_dereferenceable(expr::mkUInt(bytes, m.bits_for_offset), align);
}

expr disjoint(expr offset1, const expr &len1, expr offset2, const expr &len2) {
return ((offset1+len1).ule(offset2) || (offset2+len2).ule(offset1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has too many parenthesis

ir/memory.cpp Outdated
@@ -149,6 +149,14 @@ void Pointer::is_dereferenceable(unsigned bytes, unsigned align) {
is_dereferenceable(expr::mkUInt(bytes, m.bits_for_offset), align);
}

expr disjoint(expr offset1, const expr &len1, expr offset2, const expr &len2) {
return ((offset1+len1).ule(offset2) || (offset2+len2).ule(offset1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this condition correct if offset + len overflows? Or can't that happen (because of the half of the address space limitation)? If that's the case, we need a comment there explaining the assumption/reasoning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I use is_disjoint in memcpy, overflow checking is run in is_dereferenceable.

If we need this condition at other code, overflow condition is considered.

I add a comment in the code also.

Copy link
Member

@nunoplopes nunoplopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments.
Also, there are still many lines over 80 characters. The build is broken; please have a look.

}

RetTy visitMemCpyInst(llvm::MemCpyInst &i) {
auto ty = llvm_type2alive(i.getType());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't needed. same for memset

ir/instr.cpp Outdated
}

StateValue Memset::toSMT(State &s) const {
// TODO: check following lines are correct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the discussion below, it seems that we should have someting like s.addUB(vbytes.ugt(0).implies(np_ptr))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified memset, memcpy and is_dereferenceable using your conditions.
is_disjoint doesn't need modifying because when bytes is 0, is_disjoint is already no ub.

ir/instr.cpp Outdated
}


void Memcpy::print(std::ostream &os) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no std:: here

ir/memory.cpp Outdated
@@ -304,10 +312,27 @@ void Memory::memset(const expr &p, const StateValue &val, const expr &bytes,

void Memory::memcpy(const expr &d, const expr &s, const expr &bytes,
unsigned align_dst, unsigned align_src) {
// TODO: Add ub condition - bytes == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. From my understanding, LLVM folks didn't want to be too evil here, so it should be UB only if bytes > 0

tests/memcpy/1-src.ll Show resolved Hide resolved
Copy link
Member

@nunoplopes nunoplopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good. almost there :)

ir/instr.cpp Outdated
auto &[vdst, np] = s[*dst];
auto &[vsrc, np2] = s[*src];
auto &[vbytes, np3] = s[*bytes];
s.addUB(vbytes.ugt(0).implies(np));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please combine the two checks into one: ".implies(np && np2)"

ir/instr.cpp Outdated
auto &[vptr, np] = s[*ptr];
auto &[vbytes, np2] = s[*bytes];
s.addUB(vbytes.ugt(0).implies(np));
s.addUB(vbytes.ugt(0).implies(np2));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this second line should be simply "s.addUB(np2)". If vbytes is poison, the copy is UB since the value to copy is unbounded.

ir/instr.cpp Outdated
auto &[vbytes, np3] = s[*bytes];
s.addUB(vbytes.ugt(0).implies(np));
s.addUB(vbytes.ugt(0).implies(np2));
s.addUB(vbytes.ugt(0).implies(np3));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as for memset: should be addUB(np3) only.
Also, please rename vars to np_dst, np_src, or something like that. np2/np3 are error prone :)

ir/instr.cpp Outdated
}

expr Memcpy::getTypeConstraints(const Function &f) const {
return dst->getType().enforcePtrType() && dst->getType().enforcePtrType()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please do this like:

return enforcefoo() &&
       enforceBar &&
       etc;

@Dongjoo-Kim
Copy link
Contributor Author

I reflected your comments and pushed it.

Copy link
Member

@nunoplopes nunoplopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, thanks! I left a few more comments, including a question about correctness of the encoding.

ir/memory.h Outdated
@@ -68,6 +68,7 @@ class Pointer {
smt::expr is_aligned(unsigned align) const;
void is_dereferenceable(unsigned bytes, unsigned align);
void is_dereferenceable(const smt::expr &bytes, unsigned align);
void is_disjoint(const smt::expr &len1, const Pointer &ptr2, const smt::expr &len2) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is over 80 characters

ir/memory.cpp Outdated
@@ -149,6 +151,20 @@ void Pointer::is_dereferenceable(unsigned bytes, unsigned align) {
is_dereferenceable(expr::mkUInt(bytes, m.bits_for_offset), align);
}

// general disjoint check for unsigned integer
static expr disjoint(expr begin1, const expr &len1, expr begin2,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: begin1 & begin2 should be const references as well.

ir/memory.cpp Show resolved Hide resolved
ir/memory.cpp Outdated
Pointer idx(*this, expr::mkVar(name.c_str(), dst.bits()));

expr cond = idx.uge(dst).both() && idx.ult(dst + bytes).both();
expr val = expr::mkIf(cond, blocks_val.load((src + idx.get_offset()).release()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lines seems over 80 characters

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure the condition for the src load is correct. It assumes that offset of src = 0.
Remember that memcpy both src and dst may start in the middle of an object.
So we need src + "delta between start of copy and idx". Which may not be idx.get_offset() if the memcpy wasn't done the start of the object.
What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your comment. I fixed this bug.

@Dongjoo-Kim
Copy link
Contributor Author

Sorry for late response. I reflect your comments.

@nunoplopes nunoplopes merged commit 5356f55 into AliveToolkit:master Sep 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants