Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - continue processing a layer even when only one hash has blocks #2464

Closed
wants to merge 3 commits into from

Conversation

countvonzero
Copy link
Contributor

Motivation

the code, as is, doesn't work when we receive layer blocks in the following order
t1: for layer hash1, we receive valid block data (err == nil)
t2: for layer hash2, we receive error
sync.getLayerFromNeighbors() will determine there is error for this layer and stop processing.

Changes

  • make sync process a layer as long as there is a layer hash that returns non-empty blocks
  • when there is an empty layer hash, and all other layer hashes return errors, determine this layer as empty

Test Plan

added several unit tests. in particular, Test_notifyLayerPromiseResult_OneHasBlockData() fails before this change.

DevOps Notes

  • This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
  • This PR does not affect public APIs
  • This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
  • This PR does not make changes to log messages (which monitoring infrastructure may rely on)

@countvonzero countvonzero changed the title mark layer non-empty even when only one hash has blocks continue processing a layer even when only one hash has blocks Jun 18, 2021
@lrettig
Copy link
Member

lrettig commented Jun 21, 2021

Copy link
Member

@lrettig lrettig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you help me understand why a single zero layer result trumps even many other non-zero layer results? I'm not totally convinced this is the correct logic

layerfetcher/layers.go Show resolved Hide resolved
layerfetcher/layers_test.go Outdated Show resolved Hide resolved
layerfetcher/layers_test.go Outdated Show resolved Hide resolved
@countvonzero
Copy link
Contributor Author

countvonzero commented Jun 21, 2021

Could you help me understand why a single zero layer result trumps even many other non-zero layer results? I'm not totally convinced this is the correct logic

@lrettig that's not the logic. the logic is, if there is any valid non-zero layer results, it returns immediately. see line 355.
[line 355]. after you exit the loop, you are sure all that's left are errors. and if there is any ErrZeroLayer, it trumps all other types of errors.

@countvonzero
Copy link
Contributor Author

Could you help me understand why a single zero layer result trumps even many other non-zero layer results? I'm not totally convinced this is the correct logic

@lrettig this is exactly i am fixing. the test i added Test_notifyLayerPromiseResult_OneHasBlockData(). this test fails before this change and passes afterwards.

Copy link
Member

@lrettig lrettig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explanation!

@@ -392,7 +395,7 @@ func (l *Logic) notifyLayerPromiseResult(id types.LayerID, expectedResults int,
func (l *Logic) receiveBlockHashes(ctx context.Context, layer types.LayerID, data []byte, expectedResults int, extErr error) {
//if we failed getting layer data - notify
if extErr != nil {
l.log.Error("received error for layer id %v", extErr)
l.log.Error("received error %v for layer id %v", extErr, layer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
l.log.Error("received error %v for layer id %v", extErr, layer)
l.log.With().Error("received error", log.Err(extErr), layer)

@countvonzero
Copy link
Contributor Author

bors merge

bors bot pushed a commit that referenced this pull request Jun 25, 2021
## Motivation
the code, as is, doesn't work when we receive layer blocks in the following order
t1: for layer hash1, we receive valid block data (err == nil)
t2: for layer hash2, we receive error 
sync.getLayerFromNeighbors() will determine there is error for this layer and stop processing.

## Changes
- make sync process a layer as long as there is a layer hash that returns non-empty blocks
- when there is an empty layer hash, and all other layer hashes return errors, determine this layer as empty

## Test Plan
added several unit tests. in particular, `Test_notifyLayerPromiseResult_OneHasBlockData()` fails before this change.

## DevOps Notes
<!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases -->
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)
@bors
Copy link

bors bot commented Jun 25, 2021

Build failed:

@countvonzero
Copy link
Contributor Author

bors merge

bors bot pushed a commit that referenced this pull request Jun 25, 2021
## Motivation
the code, as is, doesn't work when we receive layer blocks in the following order
t1: for layer hash1, we receive valid block data (err == nil)
t2: for layer hash2, we receive error 
sync.getLayerFromNeighbors() will determine there is error for this layer and stop processing.

## Changes
- make sync process a layer as long as there is a layer hash that returns non-empty blocks
- when there is an empty layer hash, and all other layer hashes return errors, determine this layer as empty

## Test Plan
added several unit tests. in particular, `Test_notifyLayerPromiseResult_OneHasBlockData()` fails before this change.

## DevOps Notes
<!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases -->
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)
@bors
Copy link

bors bot commented Jun 25, 2021

Build failed:

@countvonzero
Copy link
Contributor Author

bors merge

bors bot pushed a commit that referenced this pull request Jun 25, 2021
## Motivation
the code, as is, doesn't work when we receive layer blocks in the following order
t1: for layer hash1, we receive valid block data (err == nil)
t2: for layer hash2, we receive error 
sync.getLayerFromNeighbors() will determine there is error for this layer and stop processing.

## Changes
- make sync process a layer as long as there is a layer hash that returns non-empty blocks
- when there is an empty layer hash, and all other layer hashes return errors, determine this layer as empty

## Test Plan
added several unit tests. in particular, `Test_notifyLayerPromiseResult_OneHasBlockData()` fails before this change.

## DevOps Notes
<!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases -->
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)
@bors
Copy link

bors bot commented Jun 25, 2021

Pull request successfully merged into develop.

Build succeeded:

@bors bors bot changed the title continue processing a layer even when only one hash has blocks [Merged by Bors] - continue processing a layer even when only one hash has blocks Jun 25, 2021
@bors bors bot closed this Jun 25, 2021
@bors bors bot deleted the sync-zero-block-layer branch June 25, 2021 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants