Coredump when using latest nightly rustc #37105
sfackler added I-crash regression-from-stable-to-nightly labels Oct 12, 2016
Thanks for the report @BusyJay! Is there a set of steps to reproduce the crash you're seeing locally? Also, is this on Linux?
Yes, it's on Linux. It takes a few steps to reproduce the crash; we will provide them via a demo project later.
siddontang commented Oct 14, 2016
The steps to reproduce the crash may not be easy:
Download official binaries
Clone TiKV and build
Run PD
Run TiKV
Run TiDB
Run Test: unzip the bank test bank.zip and run it
Then wait a long time... If you have any problems, please tell me.
brson added the T-compiler label Oct 20, 2016
(Shot in the dark) Does this reproduce on
@siddontang Is there any way this crash can be reduced to a smaller test case? It's going to be quite tricky to track down otherwise. Do you have instructions for reproducing the whole thing from source, without a binary download? If parts of it are not open source we may be able to arrange to debug it privately, but in any case a smaller test case would help immensely.
@eddyb: Doesn't that nightly already default to orbit? As far as I can tell the switch happened between
@TimNN Ah you are indeed correct. Well, that's one down, 99 potential causes to go.
Everyone involved in this thread: if you haven't already, check out http://rr-project.org/.
siddontang commented Oct 22, 2016
Hi @brson, we tried to reproduce this coredump with a simple test, but failed. PD and TiDB are both written in Go, and are both open source under Apache-2, so you can use the
If you want to build them yourself, you must install Go 1.6+ first (https://golang.org/doc/install), then:
git clone https://github.com/pingcap/tidb.git $GOPATH/src/github.com/pingcap/tidb
cd $GOPATH/src/github.com/pingcap/tidb
make
# the tidb-server is installed in $GOPATH/src/github.com/pingcap/tidb/bin directory
git clone https://github.com/pingcap/pd.git $GOPATH/src/github.com/pingcap/pd
cd $GOPATH/src/github.com/pingcap/pd
make
# the pd-server is installed in $GOPATH/src/github.com/pingcap/pd/bin directory
@siddontang Where is the source for the bank program? The zip file only has an executable.
@siddontang I cannot reproduce the scenario; when I try to run the https://gist.github.com/pnkfelix/4c87f20badee2c5110c23005984830cd and the
(The other two services keep running...) I second @eddyb's suggestion of trying to use
siddontang commented Oct 29, 2016
Hi @pnkfelix. In the bank case, we will create at least Of course, you can use another concurrency in bank like
Thanks for the new info @siddontang. @pnkfelix maybe you can try again?
pnkfelix self-assigned this Nov 3, 2016
siddontang commented Nov 11, 2016
Hi @pnkfelix, can you reproduce it?
brson added the I-nominated label Nov 17, 2016
@rust-lang/compiler this needs a P- tag.
brson added regression-from-stable-to-beta and removed regression-from-stable-to-nightly labels Nov 17, 2016
triage: P-high
In particular, we should figure out if we can reproduce this or not!
rust-highfive added P-high and removed I-nominated labels Nov 17, 2016
Just tested with rr, but got an unexpected error. I will retry it later once it's resolved.
The rr issue is resolved.
Yes, and I have been testing it for more than 12 hours, and it still has not crashed. I guess rr emulating a single-core machine just makes it very slow.
It's looking like we're unlikely to solve this before release.
Discussed in the @rust-lang/compiler meeting. We're basically still having trouble reproducing the bug (ideally in rr). No real status update.
@pnkfelix maybe we can reduce this to P-medium until it's clear there's a bug to tackle on the rust side here.
triage: P-medium
Seeing as how we have not been able to reproduce, and we've basically stalled out, we're going to downgrade this in priority. @BusyJay, please let us know current status (is this still reproducing for you outside rr?) and if there is anything we can do to help track it down.
rust-highfive added P-medium and removed P-high labels Dec 22, 2016
Is this now stable-to-stable?
siddontang commented Dec 23, 2016
Sorry that we haven't had enough time to reproduce it. We tested it three weeks ago and this issue still existed. We can test it again after you release the newest nightly version :-)
I'm sorry that we can't reproduce it either. =( Please do give it another try so at least we know if it is still a problem!
Have we at least narrowed this down to a specific nightly where the problem seems to occur? It seems like it is not due to the switch to MIR, right?
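Narrowing a regression down to a specific nightly is usually done by bisecting between a known-good and a known-bad toolchain. Below is a minimal sketch of that search, not something anyone in this thread ran. The function `first_bad`, the `is_bad` check, and the dates in `main` are all hypothetical; in practice `is_bad` would stand for "build with this nightly and run the stress test", and because this crash is intermittent, a run that does not crash is only weak evidence that a nightly is good.

```rust
/// Returns the index of the first "bad" nightly in `nightlies`.
///
/// Assumptions (all hypothetical for this sketch): the slice is sorted oldest
/// to newest, the first entry is known good, the last entry is known bad, and
/// once the problem appears it stays in every later nightly.
fn first_bad<F>(nightlies: &[&str], mut is_bad: F) -> Option<usize>
where
    F: FnMut(&str) -> bool,
{
    if nightlies.len() < 2 {
        return None; // need at least one good and one bad endpoint
    }
    let mut good = 0; // index of a known-good nightly
    let mut bad = nightlies.len() - 1; // index of a known-bad nightly
    while bad - good > 1 {
        let mid = good + (bad - good) / 2;
        if is_bad(nightlies[mid]) {
            bad = mid; // the problem already exists at `mid`
        } else {
            good = mid; // the problem appears after `mid`
        }
    }
    Some(bad)
}

fn main() {
    // Illustrative dates only; these are not test results from this issue.
    let nightlies = ["2016-08-05", "2016-08-20", "2016-09-05", "2016-09-20", "2016-10-12"];
    // Stand-in for "build with this nightly and run the stress test".
    let culprit = "2016-09-05";
    let idx = first_bad(&nightlies, |date| date >= culprit).unwrap();
    println!("first bad nightly: {}", nightlies[idx]);
}
```

Each probe halves the remaining range, so a two-month window of nightlies needs only about six builds, although each one may need a long stress run before it can be called good.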
brson added regression-from-stable-to-stable and removed regression-from-stable-to-beta labels Dec 29, 2016
Could this be some sort of data race? Being unable to reduce it under rr sounds like a race condition. Can you try ThreadSanitizer?
siddontang commented Jan 2, 2017
We ran some tests and can't reproduce it with the newest Rust + newest TiKV, but to our surprise, we can reproduce it with the newest Rust + old TiKV (2016-08-21 version). Our Rust version is:
We don't know why yet. Maybe the changes in TiKV avoid the trigger condition for the core dump, or the problem still exists but we just haven't hit it. We have decided to use the newest Rust for TiKV, and if we hit this core dump again later, we will update the issue.
@siddontang ok, well, I'm glad you're not hitting the issue anymore, but I wish we had a better handle on what the problem is exactly. Of course, it is also possible that the bug is in fact in TiKV (or some other package featuring unsafe code), so it's quite likely that the problem is indeed fixed by a newer version.
I'm going to downgrade to P-low until we have more data.
triage: P-low
rust-highfive added P-low and removed P-medium labels Jan 2, 2017
siddontang commented Jan 8, 2017
Sadly, we hit the coredump again with the newest rustc + newest TiKV. We will try to reproduce it again.
@siddontang argh, sorry to hear that. :(
siddontang commented Feb 8, 2017
A strange update: after we merged tikv/tikv#1512, we found that using the newest Rust is OK. We ran many tests for a long time and the core dump didn't happen, so we guess this PR fixes the problem, but we don't know why. Could you help us find the reason? We used nightly-2016-08-06 before, so I think the bug was introduced after this version.
Thanks for the continued investigation and update, @siddontang.
Mark-Simulacrum added the C-bug label Jul 26, 2017
Unassigning self. I'm not sure we can reasonably expect to determine the underlying problem that has either been fixed or masked, since as far as I can tell, no one working on the rustc compiler has locally reproduced the problem.
pnkfelix removed their assignment Aug 31, 2017
overvenus referenced this issue Dec 21, 2017
Closed: Can't run tests or build on latest nightly #2603
It’s been over a year since any update, and almost two years since this issue was originally reported. I’m going to go ahead and close this. @BusyJay, if you have a way to reproduce and still care about this, please let me know!
BusyJay commented Oct 12, 2016
Hi, we recently upgraded our rustc compiler to the latest nightly version, but the compiled binary core dumps quickly under stress tests. A few stacks can be found in tikv/tikv#1144. But when we downgrade rustc to rustc 1.12.0-nightly (b30eff7ba 2016-08-05), the binary works just fine.
The stacks look weird, because the segmentation fault happens in liballoc, but we don't manage memory ourselves. We are guessing that there might be some problems in the versions later than rustc 1.12.0-nightly (b30eff7ba 2016-08-05). Could you please help us check it out? Thanks!