DPDK mode bugs cassandra-stress #327
Comments
Hi Vlad. You asked me to verify if this issue still happens with seastar-dev/issue-187-v2. It does. |
On Sep 9, 2015 9:20 PM, "Glauber Costa" notifications@github.com wrote:
Good. Thanks.
|
Check if this is 40G specific. If it is, invest up to 0.5 day to find the issue in the code, and validate that it is only for 40G. If it's only for 40G and the fix does not take 2 hours, defer it; we will move it to the next release. |
On Sun, Sep 13, 2015 at 9:58 AM, slivne notifications@github.com wrote:
IIRC @vladzcloudius said it's a generic issue which may not even be dpdk
|
Any idea what could the issue be? I just hit this with POSIX. It is the first time I ever hit this with POSIX, and I was very surprised to see this happening. In the client, I can see the same message:
In the server, I see an assertion being hit explicitly. Here is the backtrace:
Note the call in the 24th frame of the backtrace above. I don't know what this is, since I am running --network-stack posix. Technically, it may not even be the same bug: it may be that every time we forcibly close the connection in the server we will see that in the client. And we may do that for different reasons. But here it is... |
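To illustrate the point about forced closes: the following is a toy sketch in plain Python sockets, not Scylla or Seastar code. The function names and the `reason` labels are hypothetical. It only shows that when a server drops a connection without replying, the client sees the same symptom whatever the server's reason was, so the client-side message alone may not identify the bug.

```python
import socket
import threading


def run_server(listener, reason):
    """Accept one connection, read the request, then close without replying.

    `reason` is a hypothetical label for why the server gives up; it is
    never sent to the client.
    """
    conn, _ = listener.accept()
    with conn:
        conn.recv(1024)  # read (part of) the request, then drop the connection


def client_observation(reason):
    """Return what a client sees when the server drops the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(target=run_server, args=(listener, reason))
    server.start()

    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(b"request")
        try:
            data = client.recv(1024)
            # An empty read means the peer closed the connection.
            observed = "closed" if data == b"" else "data"
        except ConnectionError:
            # A reset is also just "the server went away" from here.
            observed = "closed"

    server.join()
    listener.close()
    return observed


if __name__ == "__main__":
    for reason in ("assertion failure", "protocol error"):
        print(reason, "->", client_observation(reason))
```

In both cases the client only learns that the connection went away, which is consistent with the suspicion that the client error and the server assertion could be separate problems.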
On Sep 13, 2015 9:58 AM, "slivne" notifications@github.com wrote:
That's confusing. Could u clarify, pls?
|
On Sep 13, 2015 10:04 AM, "Dor Laor" notifications@github.com wrote:
I only said it's absolutely unclear what the issue is. So, it may still be a 40G-only issue.
|
On Sep 17, 2015 11:29 PM, "Glauber Costa" notifications@github.com wrote:
As Nadav has mentioned - it's by design and it's harmless. Gleb has added
|
On Sep 17, 2015 11:29 PM, "Glauber Costa" notifications@github.com wrote:
Isn't do_flush() a function recently patched by Avi in the "batching"
|
@avikivity can comment, but while do_flush was introduced by him, the real problem here is that do_flush finds _ready_to_respond to be empty. His patch does not seem to touch assignments to _ready_to_respond, so I would say he is innocent on this, though his patch may have made it more likely on POSIX. |
I have reverted Avi's patches, and while the assertion stopped happening, the client side errors lingered. I don't think we want to revert @avikivity's patches for the release. So we should really find the issue here and fix it. |
On Sep 18, 2015 5:43 PM, "Glauber Costa" notifications@github.com wrote:
One piece of "good" news here is that it's apparently not related to the native stack. But cutting the long story short, I agree with Glauber - it seems like a
|
I am hitting the do_flush issue on AWS, using AMI ami-1fa3d37a with head commit 5f32c00
and the three patches I sent on seastar-dev:shlomi/fix_dist_v3.
The machine on AWS is 54.173.222.60. @avikivity, this machine has debuginfo installed as well, so maybe this will simplify finding the issue. I am leaving this machine up for now. |
@gleb-cloudius, @pdziepak - I think the patch in commit a15c062
fixes this issue. Can I close this? |
@gleb-cloudius, @pdziepak ping |
On Wed, Oct 07, 2015 at 06:41:20AM -0700, slivne wrote:
|
When trying to run a high load, with 16 columns on it, cassandra-stress will bail on us, with the following message:
Despite what the client says, this does not happen with our posix mode, nor with Origin.
Avi suspects we may be closing the connection too soon.
Test setup is intel1 / intel2
DML file is:
HEAD is 8405aa1
seastar branch has seastar-dev/issue-187-v1-work applied.
cassandra stress is
where:
LDIR is whatever your heart desires
scylla cmd line is