Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better support for fold #1823

Closed
weaversa opened this issue Nov 1, 2022 · 10 comments · Fixed by #1941
Closed

Better support for fold #1823

weaversa opened this issue Nov 1, 2022 · 10 comments · Fixed by #1941
Assignees
Labels
[C] Bug This is a bug. Something isn't working. [E] Performance Track performance improvement (Time / Memory / CPU) T-CBMC Issue related to an existing CBMC issue T-User Tag user issues / requests

Comments

@weaversa
Copy link

weaversa commented Nov 1, 2022

I tried this code:

pub fn array_sum_fold(x : [usize;100]) -> usize {
    x.iter().fold(0, |accumulator, current| accumulator + current)
}

pub fn array_sum_for(x : [usize;100]) -> usize {
    let mut accumulator : usize = 0;
    for i in 0..100 {
      accumulator = x[i] + accumulator
    }
    accumulator
}

#[cfg(kani)]
mod proofs {
    use super::*;

    #[kani::proof]
    fn array_sum_fold_proof() {
        let x : [usize;100] = kani::any();
        array_sum_fold(x);
    }

    #[kani::proof]
    fn array_sum_for_proof() {
        let x : [usize;100] = kani::any();
        array_sum_for(x);
    }
}

using the following command line invocation:

$ cargo kani

with Kani version: 0.13.0

I expected to see this happen: Proof runtimes for the fold and for versions of array_sum complete in roughly the same time.

Instead, this happened: kani completed the for version quite quickly and then runs until eventually crashing (out of memory?) on the fold version.

@weaversa weaversa added the [C] Bug This is a bug. Something isn't working. label Nov 1, 2022
@tedinski tedinski added the [E] Performance Track performance improvement (Time / Memory / CPU) label Nov 1, 2022
@tedinski
Copy link
Contributor

tedinski commented Nov 1, 2022

Thank you for the test case. This is a great example of a performance problem that shouldn't exist, and we'll have to investigate.

(There are several things in how we translate Rust code down to CBMC's IR that CBMC hasn't yet been "tuned" for. Vectors are a current focus area, but this seems like another one that likely has a very addressable cause...)

(BTW: Since you use += the + accumulator is not needed. I don't think this will change anything about performance here but these two are not currently equivalent code.)

@weaversa
Copy link
Author

weaversa commented Nov 1, 2022

Thank you for the test case. This is a great example of a performance problem that shouldn't exist, and we'll have to investigate.

(There are several things in how we translate Rust code down to CBMC's IR that CBMC hasn't yet been "tuned" for. Vectors are a current focus area, but this seems like another one that likely has a very addressable cause...)

Thank you. I've also noticed issues with vectors. Is it worthwhile to submit an issue there as well?

(BTW: Since you use += the + accumulator is not needed. I don't think this will change anything about performance here but these two are not currently equivalent code.)

Quite right! I made the correction.

@tedinski
Copy link
Contributor

tedinski commented Nov 1, 2022

We are tracking some Vec issues here: #1676

Another thought I want to write down before I forget: I wonder if for i in x would display different performance from the other two implementations...

@zhassan-aws
Copy link
Contributor

I added a for-iter version:

pub fn array_sum_fold(x : [usize;100]) -> usize {
    x.iter().fold(0, |accumulator, current| accumulator + current)
}

pub fn array_sum_for(x : [usize;100]) -> usize {
    let mut accumulator : usize = 0;
    for i in 0..100 {
      accumulator += x[i];
    }
    accumulator
}

pub fn array_sum_for_iter(x : [usize;100]) -> usize {
    let mut accumulator : usize = 0;
    for i in x {
      accumulator += i;
    }
    accumulator
}

#[cfg(kani)]
mod proofs {
    use super::*;

    #[kani::proof]
    #[kani::unwind(101)]
    fn array_sum_fold_proof() {
        let x : [usize;100] = kani::any();
        array_sum_fold(x);
    }

    #[kani::proof]
    #[kani::unwind(101)]
    fn array_sum_for_proof() {
        let x : [usize;100] = kani::any();
        array_sum_for(x);
    }

    #[kani::proof]
    #[kani::unwind(101)]
    fn array_sum_for_iter_proof() {
        let x : [usize;100] = kani::any();
        array_sum_for_iter(x);
    }
}

and I got the following numbers with 33e226b:

fold.log:Verification Time: 35.88157s
fold.log:       Maximum resident set size (kbytes): 3459548
for.log:Verification Time: 11.837213s
for.log:        Maximum resident set size (kbytes): 532692
for_iter.log:Verification Time: 11.500173s
for_iter.log:   Maximum resident set size (kbytes): 592848

So the fold version seems to be at least 3X slower, and consumes ~6X more memory.

@zhassan-aws
Copy link
Contributor

It seems that the difference in performance is caused by iterators. Comparing these two versions, which are identical expect for the .iter() used in the second, Kani performs much worse on the second:

Without .iter():

pub fn array_sum_for_iter1(x : [usize;100]) -> usize {
    let mut accumulator: usize = 0;
    for i in x {
      accumulator += i;
    }
    accumulator
}
Verification Time: 11.509705s
Maximum resident set size (kbytes): 592648

With .iter():

pub fn array_sum_for_iter2(x : [usize;100]) -> usize {
    let mut accumulator: usize = 0;
    for i in x.iter() {
      accumulator += i;
    }
    accumulator
}
Verification Time: 38.08861s
Maximum resident set size (kbytes): 3495444

@tedinski
Copy link
Contributor

tedinski commented Nov 1, 2022

I confess I thought those two would be the same. What does for desugar to then?

Oh wait, is it doing into_iter()?

@zhassan-aws
Copy link
Contributor

Yes, for by default uses into_iter():

https://doc.rust-lang.org/std/iter/index.html#for-loops-and-intoiterator

@tedinski
Copy link
Contributor

tedinski commented Nov 1, 2022

Well that brings to mind another option: x.into_iter().fold(0, |accumulator, current| accumulator + current)

@zhassan-aws
Copy link
Contributor

Indeed, this version (slightly modified from yours due to a warning) performs the same as the for i in x version:

   IntoIterator::into_iter(x).fold(0, |accumulator, current| accumulator + current)
Verification Time: 11.920093s
Maximum resident set size (kbytes): 640040

@zhassan-aws zhassan-aws self-assigned this Nov 4, 2022
@adpaco-aws adpaco-aws added the T-User Tag user issues / requests label Nov 9, 2022
@tautschnig
Copy link
Member

The changes hinted at in #1226 (comment) also bring down time for array_sum_fold_proof to the same as the other two cases (i.e., approximately 11 seconds).

@tedinski tedinski added the T-CBMC Issue related to an existing CBMC issue label Dec 6, 2022
zhassan-aws added a commit to zhassan-aws/kani that referenced this issue Dec 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[C] Bug This is a bug. Something isn't working. [E] Performance Track performance improvement (Time / Memory / CPU) T-CBMC Issue related to an existing CBMC issue T-User Tag user issues / requests
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants