Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maintain block order after peer selection or exhaustion during peer streaming #258

Merged
merged 2 commits into from
Feb 13, 2017

Conversation

xichen2020
Copy link

cc @robskillington @prateek @ben-lerner

This PR fixes an issue when removing blocks from peers after we select a block from multiple peer candidate or the peer list has been exhausted. In particular when a block is removed, we used to swap it with the last block and shrink the block list, thereby destroying invariant that the blocks are sorted by their start times. This in turn causes issues because we use the block start times to determine which peers can provide the block and relies on the block order to align the peers. When the block ordering is disrupted, it leads to the error case where a block is only selected from one peer, even though multiple peers are eligible to provide this block.

Also added two test cases to catch the error cases. Both cases are failing against HEAD and passing against this PR.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.07%) to 78.371% when pulling 38e0cbf on xichen-maintain-blocks-order into 5535e5f on dev.

opts := newSessionTestAdminOptions().
SetFetchSeriesBlocksMaxBlockRetries(2)
s, err := newSession(opts)
assert.NoError(t, err)
Copy link

@ben-lerner ben-lerner Feb 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably want require.NoError

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me update this file and these changes to reflect this.

Copy link
Collaborator

@robskillington robskillington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) to 78.312% when pulling 3c765c2 on xichen-maintain-blocks-order into 5535e5f on dev.

@xichen2020 xichen2020 merged commit b0572cb into dev Feb 13, 2017
@xichen2020 xichen2020 deleted the xichen-maintain-blocks-order branch February 13, 2017 18:51
xichen2020 added a commit that referenced this pull request Feb 13, 2017
…treaming (#258)

* Maintain block order after peer selection or exhaustion during peer streaming

* Use require.NoError for segments of tests that must succeed to proceed
xichen2020 added a commit that referenced this pull request Feb 13, 2017
* Format session errors with m3x/errors/.Errors (#257)

* Maintain block order after peer selection or exhaustion during peer streaming (#258)

* Maintain block order after peer selection or exhaustion during peer streaming

* Use require.NoError for segments of tests that must succeed to proceed

* Peer streaming only fails when unable to read data from any peer during merging reads (#259)

* Peer streaming only fails when unable to read data from any peer during merging reads

* Update comments

* Fix tests

* Disable integration tests for native pool testing

* Remove blocks from peers' block lists if they are not eligible

* Better comments

* Add metrics to report final errors during peer streaming (#260)

* Peer streaming only fails when unable to read data from any peer during merging reads

* Update comments

* Fix tests

* Disable integration tests for native pool testing

* Remove blocks from peers' block lists if they are not eligible

* Better comments

* Add metrics to report final errors during peer streaming

* Update m3x dependency (#261)
xichen2020 added a commit that referenced this pull request Feb 14, 2017
* Format session errors with m3x/errors/.Errors (#257)

* Maintain block order after peer selection or exhaustion during peer streaming (#258)

* Maintain block order after peer selection or exhaustion during peer streaming

* Use require.NoError for segments of tests that must succeed to proceed

* Peer streaming only fails when unable to read data from any peer during merging reads (#259)

* Peer streaming only fails when unable to read data from any peer during merging reads

* Update comments

* Fix tests

* Disable integration tests for native pool testing

* Remove blocks from peers' block lists if they are not eligible

* Better comments

* Add metrics to report final errors during peer streaming (#260)

* Peer streaming only fails when unable to read data from any peer during merging reads

* Update comments

* Fix tests

* Disable integration tests for native pool testing

* Remove blocks from peers' block lists if they are not eligible

* Better comments

* Add metrics to report final errors during peer streaming

* Update m3x dependency (#261)

* Flush last writes on host queue close

* Close any times

* Longer timeout
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants