[Merged by Bors] - Drop peers when outbound message queue is full #2450
Prevents blocking further up the gossip stack

Is this queue used not only for broadcast but also for direct requests (msg server and fetcher)? If so, we can't simply drop messages; we must disconnect the peer and return an error.

This is correct. I updated this to drop peers if the outbound queue is full. Could you please take a look and see if this is what you had in mind?

bors try

bors merge

## Motivation
Closely related to #2435
Partially addresses #1868, #2386 (and maybe #2405)

## Changes
When the outbound message queue for a given connection is full, rather than blocking on `Send`, begin dropping messages and printing errors (rather than allowing the blocking to propagate up the gossip stack).

It would be good to go one step further and drop peers that are hanging, but this is a quick, simple fix for the acute issue we're seeing on the testnets. See #2385.

## Test Plan
Includes a new unit test for both `conn` and `msgconn` types

## TODO
- [x] Explain motivation or link existing issue(s)
- [x] Test changes and document test plan
- [ ] Update documentation as needed

## DevOps Notes
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)

Build failed (retrying...):

Build failed:

bors merge

Build failed:

bors merge

Build failed (retrying...):

Build failed:

This is done and ready to be merged, but it is stuck on flaky tests. Assigning to @antonlerner to make sure it gets merged. Let me know if I've missed anything or if there are any questions. Thanks!

bors try

try: Build failed:

Add coinbase addr back to StartSmeshing

bors merge

Co-authored-by: Dmitry Shulyak <yashulyak@gmail.com>

Pull request successfully merged into develop. Build succeeded: