Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve nops broadcasting. #150

Merged
merged 19 commits into from
Aug 7, 2019
Merged

Improve nops broadcasting. #150

merged 19 commits into from
Aug 7, 2019

Conversation

hasyimibhar
Copy link
Contributor

@hasyimibhar hasyimibhar commented Aug 6, 2019

This PR modifies the behavior of nop broadcasting such that the ledger will continue broadcasting nops until all of the sender's transactions has been applied (i.e. graph root depth >= highest transaction depth added).

It seems like nop broadcasting isn't stopping at the right time
If I make 100 smart contract transactions in 1 second, nops should keep broadcasting until those 100 transactions are accepted, or rejected
However, nops broadcast up until the current consensus round finalizes, and there may still be 50 or some number of smart contract transactions we have made that are not yet accepted/rejected

TODO:

  • Add failing test to demonstrate the issue. The test should never pass (prior to the fix), as there should be some tx not applied due to nops not being broadcasted.
  • Improve nops broadcasting
  • Remove test code (e.g. I commented some code to assist the test)
  • Disable GC when testing (sometimes the test panics because of GC, dunno why, so I feel it's better to just disable it)
  • Fix excessive nop transaction, resulting in multiple rounds of consensus even though only 1 transaction is added

@hasyimibhar hasyimibhar self-assigned this Aug 6, 2019
@hasyimibhar
Copy link
Contributor Author

hasyimibhar commented Aug 6, 2019

Not sure why it sometimes fail. After sprinkling some printf all over the code, I found out when the test fails, it's because of this: https://github.com/perlin-network/wavelet/blob/master/ledger.go#L292-L297

Somehow, sometimes ReadAccountBalance always returns 0, false, causing BroadcastNop to get stuck in a loop, effectively "stopping" the broadcast, eventhough broadcastNop is true.

EDIT: this happened because the test does not wait for the faucet tx to be applied. This commit fixed it: 38e2e08

ledger.go Outdated
return broadcastNops
}

// WaitForConsensus blocks until the ledger reaches consensus.
Copy link
Contributor Author

@hasyimibhar hasyimibhar Aug 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iwasaki-kenta Need your thought on this. WaitForConsensus is basically added purely for testing purposes. What it does is it waits for a round to be finalized, or timeout if it takes too long. I'm not sure if it will impact the ledger in production, as it involves using a mutex. And it doesn't feel right to me at the moment.

Is there a better way to do this? Basically I just need a way to listen for a finalized round when testing. Otherwise either I'll just use time.Sleep (which makes the test run longer), or somehow poll the HTTP API to see if the round has increased.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nvm, I just polled ledger.Round().Latest().Index to see if the round increased, and it works quite well.

@hasyimibhar hasyimibhar changed the title [WIP] Improve nops broadcasting Improve nops broadcasting Aug 6, 2019
ledger.go Outdated Show resolved Hide resolved
ledger.go Outdated Show resolved Hide resolved
@hasyimibhar
Copy link
Contributor Author

Here's a result of running TestLedger_BroadcastNop a lot of times on my local machine:

$ for i in `seq 1 100`; do go test -count=1 . -run TestLedger_BroadcastNop; done
ok      github.com/perlin-network/wavelet       10.352s
ok      github.com/perlin-network/wavelet       14.797s
ok      github.com/perlin-network/wavelet       23.724s
ok      github.com/perlin-network/wavelet       12.258s
ok      github.com/perlin-network/wavelet       23.016s
ok      github.com/perlin-network/wavelet       19.127s
ok      github.com/perlin-network/wavelet       18.382s
ok      github.com/perlin-network/wavelet       23.336s
ok      github.com/perlin-network/wavelet       22.875s
ok      github.com/perlin-network/wavelet       16.629s
ok      github.com/perlin-network/wavelet       12.374s
ok      github.com/perlin-network/wavelet       11.273s
ok      github.com/perlin-network/wavelet       14.515s
ok      github.com/perlin-network/wavelet       12.304s
ok      github.com/perlin-network/wavelet       13.471s
ok      github.com/perlin-network/wavelet       9.887s
ok      github.com/perlin-network/wavelet       18.063s
ok      github.com/perlin-network/wavelet       17.351s
ok      github.com/perlin-network/wavelet       23.170s
ok      github.com/perlin-network/wavelet       13.903s
ok      github.com/perlin-network/wavelet       11.106s
ok      github.com/perlin-network/wavelet       16.657s
ok      github.com/perlin-network/wavelet       10.209s
ok      github.com/perlin-network/wavelet       13.523s
ok      github.com/perlin-network/wavelet       14.397s
ok      github.com/perlin-network/wavelet       16.439s
ok      github.com/perlin-network/wavelet       12.575s
ok      github.com/perlin-network/wavelet       13.298s
ok      github.com/perlin-network/wavelet       11.379s
ok      github.com/perlin-network/wavelet       15.091s
ok      github.com/perlin-network/wavelet       11.677s
ok      github.com/perlin-network/wavelet       13.190s
ok      github.com/perlin-network/wavelet       18.089s
ok      github.com/perlin-network/wavelet       9.469s
ok      github.com/perlin-network/wavelet       21.140s
ok      github.com/perlin-network/wavelet       16.229s
ok      github.com/perlin-network/wavelet       11.810s
ok      github.com/perlin-network/wavelet       13.395s
ok      github.com/perlin-network/wavelet       12.391s
ok      github.com/perlin-network/wavelet       7.623s
--- FAIL: TestLedger_BroadcastNop (0.47s)
    testutil.go:113: failed to ping peer: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed
FAIL
FAIL    github.com/perlin-network/wavelet       0.479s
ok      github.com/perlin-network/wavelet       21.029s
ok      github.com/perlin-network/wavelet       13.913s
ok      github.com/perlin-network/wavelet       23.232s
ok      github.com/perlin-network/wavelet       12.341s
ok      github.com/perlin-network/wavelet       24.581s
ok      github.com/perlin-network/wavelet       16.044s
ok      github.com/perlin-network/wavelet       13.836s
ok      github.com/perlin-network/wavelet       14.442s
ok      github.com/perlin-network/wavelet       26.663s
ok      github.com/perlin-network/wavelet       27.613s
ok      github.com/perlin-network/wavelet       15.243s
ok      github.com/perlin-network/wavelet       29.367s
ok      github.com/perlin-network/wavelet       23.324s
ok      github.com/perlin-network/wavelet       12.154s
ok      github.com/perlin-network/wavelet       14.566s
ok      github.com/perlin-network/wavelet       29.775s
ok      github.com/perlin-network/wavelet       11.507s
ok      github.com/perlin-network/wavelet       18.384s
ok      github.com/perlin-network/wavelet       14.826s
ok      github.com/perlin-network/wavelet       10.777s
ok      github.com/perlin-network/wavelet       12.885s
ok      github.com/perlin-network/wavelet       13.480s
ok      github.com/perlin-network/wavelet       16.274s
ok      github.com/perlin-network/wavelet       22.283s
ok      github.com/perlin-network/wavelet       24.099s
ok      github.com/perlin-network/wavelet       16.860s
ok      github.com/perlin-network/wavelet       12.907s
ok      github.com/perlin-network/wavelet       15.806s
ok      github.com/perlin-network/wavelet       13.523s
ok      github.com/perlin-network/wavelet       12.431s
ok      github.com/perlin-network/wavelet       14.450s
ok      github.com/perlin-network/wavelet       27.459s
ok      github.com/perlin-network/wavelet       8.779s
ok      github.com/perlin-network/wavelet       21.389s
ok      github.com/perlin-network/wavelet       13.330s
ok      github.com/perlin-network/wavelet       12.663s
ok      github.com/perlin-network/wavelet       14.804s
ok      github.com/perlin-network/wavelet       16.405s
ok      github.com/perlin-network/wavelet       18.123s
ok      github.com/perlin-network/wavelet       12.831s
ok      github.com/perlin-network/wavelet       13.674s
ok      github.com/perlin-network/wavelet       12.029s
ok      github.com/perlin-network/wavelet       10.349s
ok      github.com/perlin-network/wavelet       17.242s
ok      github.com/perlin-network/wavelet       16.378s
ok      github.com/perlin-network/wavelet       11.362s
ok      github.com/perlin-network/wavelet       12.857s
adding transactions...
0/1000 tx applied
307/1000 tx applied
389/1000 tx applied
Snowball (syncer) liveness fault: Last ID is 207798a3e6bb9e43629bf1eb1f0e25b3fe91e4b508052491d289b00c7b231dfd with count 8, and new ID is 58fc64d8804fdbe85f3f22a106b95a9ab2b1edc5c7dfb7536d1cb88112f7bdaa.
576/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
710/1000 tx applied
--- FAIL: TestLedger_BroadcastNop (305.06s)
    ledger_test.go:55: timed out before all transactions are applied
FAIL
FAIL    github.com/perlin-network/wavelet       305.082s
ok      github.com/perlin-network/wavelet       11.289s
ok      github.com/perlin-network/wavelet       16.805s
ok      github.com/perlin-network/wavelet       12.292s
ok      github.com/perlin-network/wavelet       11.408s
ok      github.com/perlin-network/wavelet       26.922s
--- FAIL: TestLedger_BroadcastNop (0.63s)
    testutil.go:113: failed to ping peer: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection closed
FAIL
FAIL    github.com/perlin-network/wavelet       0.641s

Out of 96 runs, the test failed 3 times. 2 times it failed due to too many open files error (I ran this on a cheap laptop, so it probably ran out of resources while running the test). But there's one time it failed because not all transactions were applied (the test runs for up to 5 minutes) despite nops broadcasting being enabled (the test asserts that it is enabled). I have not figured out why it failed for this specific run.

@hasyimibhar
Copy link
Contributor Author

After latest adjustments to TestLedger_BroadcastNop (reducing number of nodes to just 5, and adding sleep), looks like it will always pass:

$ for i in `seq 1 100`; do go test -count=1 . -run TestLedger_BroadcastNop; done
ok      github.com/perlin-network/wavelet       9.295s
ok      github.com/perlin-network/wavelet       6.365s
ok      github.com/perlin-network/wavelet       8.200s
ok      github.com/perlin-network/wavelet       7.104s
ok      github.com/perlin-network/wavelet       8.687s
ok      github.com/perlin-network/wavelet       6.573s
ok      github.com/perlin-network/wavelet       7.810s
ok      github.com/perlin-network/wavelet       8.028s
ok      github.com/perlin-network/wavelet       6.970s
ok      github.com/perlin-network/wavelet       8.329s
ok      github.com/perlin-network/wavelet       8.372s
ok      github.com/perlin-network/wavelet       7.502s
ok      github.com/perlin-network/wavelet       9.276s
ok      github.com/perlin-network/wavelet       6.988s
ok      github.com/perlin-network/wavelet       6.830s
ok      github.com/perlin-network/wavelet       7.751s
ok      github.com/perlin-network/wavelet       7.316s
ok      github.com/perlin-network/wavelet       8.705s
ok      github.com/perlin-network/wavelet       8.311s
ok      github.com/perlin-network/wavelet       5.863s
ok      github.com/perlin-network/wavelet       8.558s
ok      github.com/perlin-network/wavelet       8.168s
ok      github.com/perlin-network/wavelet       7.110s
ok      github.com/perlin-network/wavelet       6.527s
ok      github.com/perlin-network/wavelet       7.664s
ok      github.com/perlin-network/wavelet       8.657s
ok      github.com/perlin-network/wavelet       8.044s
ok      github.com/perlin-network/wavelet       6.500s
ok      github.com/perlin-network/wavelet       8.734s
ok      github.com/perlin-network/wavelet       6.856s
ok      github.com/perlin-network/wavelet       9.644s
ok      github.com/perlin-network/wavelet       7.572s
ok      github.com/perlin-network/wavelet       6.552s
ok      github.com/perlin-network/wavelet       7.047s
ok      github.com/perlin-network/wavelet       10.459s
ok      github.com/perlin-network/wavelet       7.776s
ok      github.com/perlin-network/wavelet       8.010s
ok      github.com/perlin-network/wavelet       6.892s
ok      github.com/perlin-network/wavelet       7.026s
ok      github.com/perlin-network/wavelet       7.914s
ok      github.com/perlin-network/wavelet       8.387s
ok      github.com/perlin-network/wavelet       7.651s
ok      github.com/perlin-network/wavelet       7.257s
ok      github.com/perlin-network/wavelet       8.231s
ok      github.com/perlin-network/wavelet       8.240s
ok      github.com/perlin-network/wavelet       6.879s
ok      github.com/perlin-network/wavelet       9.379s
ok      github.com/perlin-network/wavelet       8.287s
ok      github.com/perlin-network/wavelet       7.933s
ok      github.com/perlin-network/wavelet       7.285s
ok      github.com/perlin-network/wavelet       7.372s
ok      github.com/perlin-network/wavelet       8.545s
ok      github.com/perlin-network/wavelet       6.807s
ok      github.com/perlin-network/wavelet       9.561s
ok      github.com/perlin-network/wavelet       7.525s
ok      github.com/perlin-network/wavelet       7.978s
ok      github.com/perlin-network/wavelet       6.850s
ok      github.com/perlin-network/wavelet       7.565s
ok      github.com/perlin-network/wavelet       8.025s
ok      github.com/perlin-network/wavelet       7.806s
ok      github.com/perlin-network/wavelet       7.863s
ok      github.com/perlin-network/wavelet       7.685s
ok      github.com/perlin-network/wavelet       9.371s
ok      github.com/perlin-network/wavelet       7.887s
ok      github.com/perlin-network/wavelet       9.298s
ok      github.com/perlin-network/wavelet       6.964s
ok      github.com/perlin-network/wavelet       9.243s
ok      github.com/perlin-network/wavelet       8.368s
ok      github.com/perlin-network/wavelet       7.826s
ok      github.com/perlin-network/wavelet       8.096s
ok      github.com/perlin-network/wavelet       9.054s
ok      github.com/perlin-network/wavelet       8.197s
ok      github.com/perlin-network/wavelet       9.793s
ok      github.com/perlin-network/wavelet       7.799s
ok      github.com/perlin-network/wavelet       8.837s
ok      github.com/perlin-network/wavelet       7.296s
ok      github.com/perlin-network/wavelet       9.001s
ok      github.com/perlin-network/wavelet       8.486s
ok      github.com/perlin-network/wavelet       9.587s
ok      github.com/perlin-network/wavelet       11.230s
ok      github.com/perlin-network/wavelet       10.024s
ok      github.com/perlin-network/wavelet       12.131s
ok      github.com/perlin-network/wavelet       10.496s
ok      github.com/perlin-network/wavelet       10.139s
ok      github.com/perlin-network/wavelet       11.248s
ok      github.com/perlin-network/wavelet       9.056s
ok      github.com/perlin-network/wavelet       6.797s
ok      github.com/perlin-network/wavelet       8.013s
ok      github.com/perlin-network/wavelet       6.891s
ok      github.com/perlin-network/wavelet       9.747s
ok      github.com/perlin-network/wavelet       8.123s
ok      github.com/perlin-network/wavelet       7.272s
ok      github.com/perlin-network/wavelet       7.694s
ok      github.com/perlin-network/wavelet       8.810s
ok      github.com/perlin-network/wavelet       6.597s
ok      github.com/perlin-network/wavelet       6.808s
ok      github.com/perlin-network/wavelet       7.398s
ok      github.com/perlin-network/wavelet       8.533s
ok      github.com/perlin-network/wavelet       7.264s
ok      github.com/perlin-network/wavelet       7.486s

@hasyimibhar hasyimibhar mentioned this pull request Aug 6, 2019
10 tasks
@iwasaki-kenta iwasaki-kenta changed the title Improve nops broadcasting Improve nops broadcasting. Aug 7, 2019
@iwasaki-kenta iwasaki-kenta added bug Something isn't working enhancement New feature or request labels Aug 7, 2019
@iwasaki-kenta
Copy link
Contributor

Tested it thoroughly; transactions are being finalized much more nicely now. LGTM.

@iwasaki-kenta iwasaki-kenta merged commit 69442c3 into master Aug 7, 2019
@iwasaki-kenta iwasaki-kenta deleted the broadcast-nops-better branch August 7, 2019 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants