Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-11896: [Rust] Disable Debug symbols on CI test builds #9653

Closed
wants to merge 2 commits into from

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Mar 7, 2021

The theory is that the inclusion of debug symbols is increasing the memory requirements of compiling the test binaries which is causing the tests to hit the CI builder's memory limits (and being OOM killed).

Changes:
Since the debug symbols aren't used for tests, run the CI tests without them to save memory

In theory this might also make the builds faster as well

@nevi-me and I theorize (without proof) that some new release of a dependency pushed the memory usage over the github builder limit which is why we started seeing this all of a sudden on master without any obvious corresponding code change

@github-actions
Copy link

github-actions bot commented Mar 7, 2021

@alamb alamb changed the title ARROW-11896: [Rust] Disable Debug symbols on CI test builds ARROW-11896: [Rust] Disable Debug symbols on CI test builds - WORK IN PROGRESS Mar 7, 2021
@codecov-io
Copy link

codecov-io commented Mar 7, 2021

Codecov Report

Merging #9653 (7bd587e) into master (dec5ab9) will decrease coverage by 0.01%.
The diff coverage is 56.25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9653      +/-   ##
==========================================
- Coverage   82.51%   82.50%   -0.02%     
==========================================
  Files         245      245              
  Lines       57329    57347      +18     
==========================================
+ Hits        47306    47312       +6     
- Misses      10023    10035      +12     
Impacted Files Coverage Δ
rust/arrow/src/array/array_list.rs 92.88% <ø> (-0.07%) ⬇️
rust/arrow/src/datatypes/native.rs 76.59% <22.22%> (-12.88%) ⬇️
rust/arrow/src/array/array_binary.rs 91.82% <100.00%> (ø)
rust/arrow/src/array/array_primitive.rs 95.00% <100.00%> (+0.04%) ⬆️
rust/arrow/src/array/array_string.rs 96.06% <100.00%> (ø)
rust/datafusion/src/physical_plan/merge.rs 76.56% <100.00%> (ø)
rust/parquet/src/encodings/encoding.rs 95.05% <0.00%> (+0.19%) ⬆️
rust/arrow/src/array/equal/utils.rs 76.47% <0.00%> (+0.98%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2585f7c...7bd587e. Read the comment docs.

@nevi-me
Copy link
Contributor

nevi-me commented Mar 7, 2021

@alamb this seems to have worked. I'm triggering the jobs to run again, if it passes again; it's a 👍 from me.

We've already observed from our local repro attempts that there wasn't any specific PR that caused the issue. It could be that some dependency updates suddenly caused us to hit the memory limits on CI, as you're exploring.

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thinking and cool idea. 🚢

@jorgecarleitao
Copy link
Member

Curious, when the CI fails, we will no longer have the trace, right? Isn't that a concern whenever we want to share logs e.g. on PRs, mailing lists, etc?

@nevi-me
Copy link
Contributor

nevi-me commented Mar 7, 2021

Curious, when the CI fails, we will no longer have the trace, right? Isn't that a concern whenever we want to share logs e.g. on PRs, mailing lists, etc?

A compromise could be to disable debug symbols only on the Debian Rust stable runs, as the other platforms seem fine.

@alamb
Copy link
Contributor Author

alamb commented Mar 7, 2021

We've already observed from our local repro attempts that there wasn't any specific PR that caused the issue. It could be that some dependency updates suddenly caused us to hit the memory limits on CI, as you're exploring.

I think this is a likely theory but I don't have any specific data to prove one way or the other

Curious, when the CI fails, we will no longer have the trace, right? Isn't that a concern whenever we want to share logs e.g. on PRs, mailing lists, etc?

I think we will still get backtraces even without "debug" symbols (the debug symbols are used by gdb / lldb to be able to interpret core dumps / map local variables to memory locations, etc).

To confirm this theory I ran the following (on my mac, will try on linux shortly) and the trace is still present

(arrow_dev) alamb@MacBook-Pro:~/Software/arrow2/rust$ RUST_BACKTRACE=1 RUSTFLAGS="-C debuginfo=0" cargo test -p arrow -- test_null_list_primitive
    Finished test [unoptimized + debuginfo] target(s) in 0.10s
     Running target/debug/deps/arrow-99caf14857691f1e

running 1 test
test array::array::tests::test_null_list_primitive ... FAILED

failures:

---- array::array::tests::test_null_list_primitive stdout ----
thread 'array::array::tests::test_null_list_primitive' panicked at 'test error', arrow/src/array/array.rs:621:9
stack backtrace:
   0: std::panicking::begin_panic
   1: arrow::array::array::tests::test_null_list_primitive
   2: arrow::array::array::tests::test_null_list_primitive::{{closure}}
   3: core::ops::function::FnOnce::call_once
   4: core::ops::function::FnOnce::call_once
             at /rustc/cb75ad5db02783e8b0222fee363c5f63f7e2cf5b/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


failures:
    array::array::tests::test_null_list_primitive

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 667 filtered out; finished in 0.14s

error: test failed, to rerun pass '-p arrow --lib'

A compromise could be to disable debug symbols only on the Debian Rust stable runs, as the other platforms seem fine.

Yeah -- in fact this PR doesn't set the debug flags for Windows or Mac (I believe the Mac builders are larger -- 15GB of memory on github, and windows debugging symbols work significantly different)

@alamb
Copy link
Contributor Author

alamb commented Mar 7, 2021

I plan to run this on a linux machine (to confirm the backtrace behavior and then merge it in)

@alamb
Copy link
Contributor Author

alamb commented Mar 7, 2021

I still get backtraces on my linux machine (Ubuntu 20.04). :shipit: !

@alamb alamb changed the title ARROW-11896: [Rust] Disable Debug symbols on CI test builds - WORK IN PROGRESS ARROW-11896: [Rust] Disable Debug symbols on CI test builds Mar 7, 2021
@alamb alamb closed this in 976ddbf Mar 7, 2021
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
The theory is that the inclusion of debug symbols is increasing the memory requirements of compiling the test binaries which is causing the tests to hit the CI builder's memory limits (and being OOM killed).

Changes:
Since the debug symbols aren't used for tests, run the CI tests without them to save memory

In theory this might also make the builds faster as well

@nevi-me  and I theorize (without proof) that some new release of a dependency pushed the memory usage over the github builder limit which is why we started seeing this all of a sudden on master without any obvious corresponding code change

Closes apache#9653 from alamb/alamb/test_no_debug_symbols

Authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
michalursa pushed a commit to michalursa/arrow that referenced this pull request Jun 13, 2021
The theory is that the inclusion of debug symbols is increasing the memory requirements of compiling the test binaries which is causing the tests to hit the CI builder's memory limits (and being OOM killed).

Changes:
Since the debug symbols aren't used for tests, run the CI tests without them to save memory

In theory this might also make the builds faster as well

@nevi-me  and I theorize (without proof) that some new release of a dependency pushed the memory usage over the github builder limit which is why we started seeing this all of a sudden on master without any obvious corresponding code change

Closes apache#9653 from alamb/alamb/test_no_debug_symbols

Authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants