Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Logging] Reducing stack usage by moving dmlc::LogMessageFatal to the heap #613

Merged
merged 1 commit into from
May 20, 2020

Conversation

junrushao
Copy link
Member

@junrushao junrushao commented May 20, 2020

When working on tvm::runtime::Array, we found that CHECK causes significant regression in stack utilization. This is not acceptable especially in compilers written in recursive visitor pattern, which crashes on comparably deep neural networks.

As diving deeper into this case, we figured out that dmlc::LogMessageFatal is the cause - it is super sophisticated and throws inside the destructor.

So we would love to patch dmlc-core, move dmlc::LogMessageFatal from stack to heap. A simple trick is to wrap it with std::unique_ptr - it is doable because dmlc-core has been updated to C++ 11.

Besides patching dmlc-core, we can change the implementation of tvm::runtime::Array as well as follows:
0) no change (using CHECK)

  1. throw dmlc::Error instead of using CHECK
  2. throw std::runtime_error instead of using CHECK
  3. remove and disable the CHECK

Below are the benchmarks we did. We use g++ -fstack-usage to dump the stack usage information for the target function.

Settings:

  • Compiler: g++ 5.4
  • OS: Ubuntu 18.04
  • Stack size: ulimit -s 8192
  • Target function: virtual tvm::relay::Expr tvm::relay::CommonSubexprEliminator::VisitExpr_(const tvm::relay::CallNode*), which originally crashed this PR on a very deep network

Results:

Patch dmlc-core No patch dmlc-core
CHECK 560 4656
throw dmlc::Error 736 1264
throw std::runtime_error 480 1008
No checks (no-op) 480 1008

Side notes: It is unclear right now why throwing dmlc::Error takes more stack space than std::runtime_error, although they intent to be identical.

See also: apache/tvm#5585 (comment)

@junrushao
Copy link
Member Author

CC: @hcho3 @leezu @tqchen if you are interested :-)

@tqchen tqchen merged commit f6bd689 into dmlc:master May 20, 2020
@hcho3
Copy link
Contributor

hcho3 commented May 20, 2020

@junrushao1994 Thanks for the ping. It's going to be useful for reducing overhead in error checking. cc @trivialfis

@junrushao
Copy link
Member Author

Just to follow up, now we are able to get rid of std::unique_ptr (it is problematic in this case because it assumes no-except destructor), and in the mean time, we further reduced the stack usage from 560 to 544. I will send a PR shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants