
Memory allocation failed error when running 'cargo test' #343

Closed
joshwilding4444 opened this issue Aug 31, 2020 · 14 comments
@joshwilding4444

joshwilding4444 commented Aug 31, 2020

On the current master branch as of Aug 31, 2020, when I run the tests using:
$ cargo test

I receive the following error:

... other successful tests ...
test stats::pairhmm::homopolypairhmm::tests::test_interleave_gaps_y ... ok
test stats::pairhmm::homopolypairhmm::tests::test_gap_x ... ok
memory allocation of 2147483640 bytes failederror: test failed, to rerun pass '--lib'

Caused by:
  process didn't exit successfully: /path/to/rust/bio/rust-bio/target/debug/deps/bio-cf8b33efbb5080cc (signal: 6, SIGABRT: process abort signal)

The failure to allocate memory happens at either test_interleave_gaps_x or test_interleave_gaps_y within stats::pairhmm::homopolypairhmm::tests. The same error occurs whether I run all tests or just the tests in stats.

Has anyone else experienced this problem when trying to run tests for the latest build?

@vsoch

vsoch commented Aug 31, 2020

I made it much further, but the run failed at a different point (Ubuntu 16.04):

...
test stats::pairhmm::homopolypairhmm::tests::test_hompolymer_run_in_x ... ok
test stats::pairhmm::homopolypairhmm::tests::test_hompolymer_run_in_y ... ok
test stats::pairhmm::homopolypairhmm::tests::impossible_global_alignment ... ok
test stats::pairhmm::homopolypairhmm::tests::test_gap_x_2 ... ok
test stats::pairhmm::homopolypairhmm::tests::test_interleave_gaps_x ... ok
test stats::pairhmm::homopolypairhmm::tests::test_interleave_gaps_y ... ok
test stats::pairhmm::homopolypairhmm::tests::test_phmm_vs_phhmm ... ok
test stats::pairhmm::homopolypairhmm::tests::test_gap_y ... ok
test stats::pairhmm::pairhmm::tests::impossible_global_alignment ... ok
test stats::pairhmm::pairhmm::tests::test_gap_x ... ok
test stats::pairhmm::homopolypairhmm::tests::test_gap_x ... ok
test stats::pairhmm::pairhmm::tests::test_gap_y ... ok
test stats::pairhmm::pairhmm::tests::test_interleave_gaps_y ... ok
test stats::pairhmm::homopolypairhmm::tests::test_mismatch ... ok
test stats::probs::cdf::test::test_cdf ... ok
test stats::pairhmm::pairhmm::tests::test_interleave_gaps_x ... ok
test stats::pairhmm::pairhmm::tests::test_same ... ok
test stats::pairhmm::pairhmm::tests::test_mismatch ... ok
test stats::probs::tests::test_cap_numerical_overshoot ... ok
test stats::probs::tests::test_cumsum ... ok
test stats::probs::tests::test_cap_numerical_overshoot_panic ... ok
test stats::probs::tests::test_empty_sum ... ok
test stats::probs::tests::test_simpsons_integrate ... ok
test stats::probs::tests::test_sub ... ok
test stats::probs::tests::test_sum_one_zero ... ok
test stats::probs::tests::test_trapezoidal_integrate ... ok
test stats::probs::tests::test_zero ... ok
test stats::probs::tests::test_sum ... ok
test utils::interval::tests::negative_width_range ... ok
test utils::interval::tests::range_interval_conversions ... ok
test utils::tests::test_prescan ... ok
test utils::tests::test_scan ... ok
test utils::text::tests::test_print_sequence ... ok
test utils::fastexp::tests::test_fastexp ... ok
test stats::probs::tests::test_one_minus ... ok
test stats::pairhmm::homopolypairhmm::tests::test_same ... ok
test stats::pairhmm::pairhmm::tests::test_banded ... ok
test stats::pairhmm::homopolypairhmm::tests::test_banded ... ok
error: test failed, to rerun pass '--lib'

Caused by:
  process didn't exit successfully: `/home/vanessa/Desktop/Code/rust-bio/target/debug/deps/bio-364865763329b4a8` (signal: 9, SIGKILL: kill)

@joshwilding4444
Author

joshwilding4444 commented Aug 31, 2020

Interesting, @vsoch . My system is running Linux Mint 20 and has 8GB RAM. Running top shows that there's some memory available. What are the specifications for your machine?

@vsoch

vsoch commented Aug 31, 2020

[screenshot of system specs]

But I have a gazillion things running and open, so probably it isn't all available!

@joshwilding4444
Author

joshwilding4444 commented Sep 1, 2020

Ok, both of our machines should have enough memory to run a few tests. I think that there is an issue with memory allocation somewhere, but I'm not sure exactly where. If you have time, @vsoch you could try running the tests without other major programs running, just to see if that makes a difference. I think that there will still be an allocation issue somewhere along the line.

I understand that memory management in Rust primarily depends on the current scope, so I don't know exactly why the tests are trying to allocate so much memory at once. I will take a look at the test suite later today to see what I can find. Does anyone else know why these tests could be running into these memory issues?

@joshwilding4444
Author

When rerunning the tests, I find that whenever they fail, the failed allocation is always the same size, 2147483640 bytes, even though the test output stops at different points. For me, the failure always comes right after either test_interleave_gaps_x or test_interleave_gaps_y. Watching top in another window shows the process climbing to about 4g of memory, then stopping.

@Daniel-Liu-c0deb0t
Contributor

Is it possible to narrow down the location of the error and check whether it still happens when test_interleave_gaps_x/y is run by itself with cargo test test_interleave_gaps_x?

@vsoch

vsoch commented Sep 3, 2020

Running by themselves:

$ cargo test test_interleave_gaps_x
    Finished test [unoptimized + debuginfo] target(s) in 0.20s
     Running target/debug/deps/bio-364865763329b4a8

running 2 tests
test stats::pairhmm::pairhmm::tests::test_interleave_gaps_x ... ok
test stats::pairhmm::homopolypairhmm::tests::test_interleave_gaps_x ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 343 filtered out

     Running target/debug/deps/mod-2eda39cc1fbd9701

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out

and

$ cargo test test_interleave_gaps_y
    Finished test [unoptimized + debuginfo] target(s) in 0.10s
     Running target/debug/deps/bio-364865763329b4a8

running 2 tests
test stats::pairhmm::pairhmm::tests::test_interleave_gaps_y ... ok
test stats::pairhmm::homopolypairhmm::tests::test_interleave_gaps_y ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 343 filtered out

     Running target/debug/deps/mod-2eda39cc1fbd9701

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out

Ok - I just tested running the tests with Chrome open (failed at the same point) and closed (all finished successfully), but that doesn't tell us much new. :)

@vsoch

vsoch commented Sep 3, 2020

Hmm, when I set the number of jobs (CPUs) to use, it started working for me, and now I can't get it to fail again. This worked for me (success with Chrome open):

$ cargo test --jobs 8

and my machine has

$ nproc
8

I tested values from 2 to 8, and now the regular test command is no longer failing either!

@joshwilding4444
Author

@vsoch I am now experiencing something similar. When I run the tests using:
$ cargo test --jobs 8
while I have other programs open, the tests fail in the same way but at a slightly later point. When I run the tests without other programs open, they pass just fine, and now when running:
$ cargo test
or
$ cargo test --jobs 1
the tests all pass. Maybe there is a problem with how the tests are optimized or compiled?

@joshwilding4444
Author

When running the tests again with several other programs open, the tests will fail again as before. This is true even when running:
$ cargo test --jobs 8

@vsoch did you happen to try testing with a bunch of other programs open? @Daniel-Liu-c0deb0t what does running the tests look like on your machine?

@vsoch

vsoch commented Sep 3, 2020

My initial failure had two Firefox windows (with many tabs open) plus Audacity; later I just had the browsers. Now I just opened Audacity again and it's still working! lol. The bug that got away...

@Daniel-Liu-c0deb0t
Contributor

Daniel-Liu-c0deb0t commented Sep 3, 2020

For me, all tests pass with just cargo test. I do have more than enough memory (16GB) on this computer, though.

I don't think compilation or optimization is the issue here. I guess if you want to check, you can use cargo clean && cargo test to see if rebuilding everything changes the results? Or maybe you could test cargo test --release, to get it to use a higher optimization level? For me, all tests pass even when I use those commands or use multiple jobs.

I think the real problem here is why the test is trying to allocate ~2GB of memory, as @joshwilding4444 mentioned. I see that the memory usage spikes up to ~5GB during the tests. I went through a few possibilities based on when the memory usage spiked, and believe the stats::pairhmm::homopolypairhmm tests are the culprit here. I ran a few of the tests individually, and they all allocate around 2GB memory each. Additionally, those tests take noticeably longer to run. Interestingly, this does not happen with the stats::pairhmm::pairhmm tests. IIRC from the code (I'm not very familiar with it), the tests for those should be very similar, so I believe there is a bug somewhere in the algorithm causing the large memory allocations. These tests most likely should not need 2GB of memory.

@tedil
Member

tedil commented Sep 4, 2020

> I think the real problem here is why the test is trying to allocate ~2GB of memory, as @joshwilding4444 mentioned. I see that the memory usage spikes up to ~5GB during the tests. I went through a few possibilities based on when the memory usage spiked, and believe the stats::pairhmm::homopolypairhmm tests are the culprit here. I ran a few of the tests individually, and they all allocate around 2GB memory each. Additionally, those tests take noticeably longer to run. Interestingly, this does not happen with the stats::pairhmm::pairhmm tests. IIRC from the code (I'm not very familiar with it), the tests for those should be very similar, so I believe there is a bug somewhere in the algorithm causing the large memory allocations. These tests most likely should not need 2GB of memory.

It's neither the tests' fault nor a bug in the algorithm. The problem is that a transition table is pre-allocated (and pre-computed) with 268435455 entries of 8 bytes each, i.e. 268435455 * 8 = 2147483640 bytes ≈ 2 GiB. So it really does need the 2 GiB of memory. (The HomopolyPairHMM has far more states and transitions than the traditional PairHMM, which is why there's such a difference in memory allocation between the two.)
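
For reference, the arithmetic behind that failed allocation (assuming, as a sketch, that the 8-byte entries are f64 probabilities):

```rust
fn main() {
    // 268435455 table entries of 8 bytes each (e.g. f64 probabilities)
    let entries: usize = 268_435_455;
    let entry_size = std::mem::size_of::<f64>(); // 8 bytes
    let bytes = entries * entry_size;
    // This matches the size of the failed allocation reported above.
    assert_eq!(bytes, 2_147_483_640);
    println!("transition table: {:.2} GiB", bytes as f64 / (1u64 << 30) as f64);
}
```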

An easy "fix": since most transitions are 0 anyway, use a sparse data structure (which supports indexing) instead. Or enumerate only the transitions that can actually occur and keep a small dense table. I guess there are a lot of options for tackling this problem; I'd welcome any suggestions ;)
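
A minimal sketch of the sparse-table idea (the state indices and type names here are hypothetical, not the actual HomopolyPairHMM API):

```rust
use std::collections::HashMap;

/// Hypothetical sparse transition table: only the transitions that actually
/// occur are stored; everything else is implicitly probability zero.
struct SparseTransitions {
    probs: HashMap<(u32, u32), f64>,
}

impl SparseTransitions {
    fn new() -> Self {
        SparseTransitions { probs: HashMap::new() }
    }

    fn set(&mut self, from: u32, to: u32, p: f64) {
        self.probs.insert((from, to), p);
    }

    /// Missing entries are zero-probability transitions.
    fn get(&self, from: u32, to: u32) -> f64 {
        self.probs.get(&(from, to)).copied().unwrap_or(0.0)
    }
}

fn main() {
    let mut table = SparseTransitions::new();
    // With only ~80 real transitions, this stores ~80 entries instead of
    // the ~268 million a dense table over all index pairs would need.
    table.set(0, 1, 0.9);
    table.set(1, 1, 0.1);
    assert_eq!(table.get(0, 1), 0.9);
    assert_eq!(table.get(5, 7), 0.0); // unset transition is zero
}
```

The trade-off is exactly the memory/CPU balance mentioned above: a HashMap lookup is slower than a direct array index, but the table shrinks from gigabytes to kilobytes.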

Ultimately, it's a problem of striking a balance between memory consumption and CPU time, I guess.

Edit: Only ~80 or so transitions are actually used, so, whoops, 268435455 entries is a bit overkill 😆

tedil added a commit that referenced this issue Sep 4, 2020
@Daniel-Liu-c0deb0t
Contributor

Ah, I see. Glad to see this fixed.
