Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement catch-up phase #80

Merged
merged 5 commits into from Jul 15, 2019
Merged

Conversation

AndrejMitrovic
Copy link
Contributor

@AndrejMitrovic AndrejMitrovic commented Jul 9, 2019

It's not ready for merging yet, but it can be looked at.

This still needs a bit of refactoring, separating atomic changes into their own commits.

@AndrejMitrovic AndrejMitrovic added the type-feature label Jul 9, 2019
@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 9, 2019

Also this only implements catch-up during booting up. The other definitions of done in the Catchup Phase Issue need to be solved.

import std.algorithm;
import std.conv;
import std.format;
import std.digest.sha;
Copy link
Contributor

@Geod24 Geod24 Jul 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be removed

Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic Jul 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

damn where did that sneak in from lol

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 9, 2019

I've atomic'ed (atomized?) the commits, it should look much nicer now.

I probably need to think of edge cases to add to the unittests.

source/agora/node/RemoteNode.d Outdated Show resolved Hide resolved
///
public this (Config config)
{
this.mem_metadata = new MemMetadata;
this.ledger = new Ledger;
Copy link
Contributor

@Geod24 Geod24 Jul 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That code doesn't make sense to me. You instantiate a new Ledger just to override it with itself in the parent constructor. You also shadow the parent ledger variable ?

Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic Jul 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shadowing was an oversight. I'll simplify this a bit tomorrow.

@AndrejMitrovic AndrejMitrovic force-pushed the catchup-phase branch 2 times, most recently from 3cba60a to ce985fb Compare Jul 10, 2019
@AndrejMitrovic AndrejMitrovic changed the title [wip] Implement catch-up phase Implement catch-up phase Jul 10, 2019
@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 10, 2019

Update: Added periodic catch-up phase, and added a network test for it. This completes all of the tasks and should cover everything in the definition of done.

Copy link
Contributor

@Geod24 Geod24 left a comment

I would merge the last two commits. Also, do you want the ability to arbitrarily delay a node ? Because I can give you that power easily in LocalRest.

@@ -175,6 +177,9 @@ public interface TestAPI : API

///
public abstract void metaAddPeer(string peer);

///
public abstract void addBlockToLedger (Block block);
Copy link
Contributor

@Geod24 Geod24 Jul 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this ? I don't see an override anywhere ?

Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic Jul 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the question is how do you get access to the Ledger from the unittest. As it's required right now. edit: reworded

Copy link
Contributor

@Geod24 Geod24 Jul 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You never access the internals of a node at the moment.

Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic Jul 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the first test is:

Instantiate the nodes. Don't start networking yet. One node (node A) needs to have 100 blocks, the 3 are empty.

Then, make them connect to each other. The 3 other nodes will retrieve blocks from the 1st node on boot.

But to make this work, I need to add those blocks to node A before it connects to the 3 other nodes.

Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic Jul 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more specific here. I can't just use the sendTransaction API here because this will trigger the gossip protocol to send the transaction over the whole network, and then each node will create their own block. So by the time the node tries to retrieve the blocks from the other nodes, it will already have all the blocks.

Delaying the node might not help. I would need some way to temporarily disable the gossip protocol, but then this becomes very messy to have all these special cases for unittesting.

The more features the node has, the more complicated it is to test a specific feature in the node without triggering an avalanche of other features.

Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic Jul 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will be able to replace this code once LocalRest has the ability to filter requests. Then I can do something like:

  • Use the sendTransaction API to one node
  • Make the other nodes ignore transaction gossiping
  • The other nodes will then catch-up using the periodic getBlocksFrom call.

This is way nicer than messing with Node's internals.

{
import core.thread;
import std.conv;
import vibe.core.log;
Copy link
Contributor

@Geod24 Geod24 Jul 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two leftover imports

source/agora/node/RemoteNode.d Outdated Show resolved Hide resolved
@AndrejMitrovic AndrejMitrovic force-pushed the catchup-phase branch 2 times, most recently from f2ec64b to 2636b48 Compare Jul 10, 2019
@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 10, 2019

I've squashed the two commits.

I'm thinking about how to get rid of those methods in the TestAPI now.

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 10, 2019

Also, do you want the ability to arbitrarily delay a node ? Because I can give you that power easily in LocalRest.

Hmm.. that might be very useful.

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 10, 2019

All the Mac builds fail. Hmm..

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 10, 2019

It's a timing issue.

@AndrejMitrovic AndrejMitrovic force-pushed the catchup-phase branch 2 times, most recently from dc1e472 to 0cd569a Compare Jul 10, 2019
@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 10, 2019

Ok I made a more stable version of the tests that shouldn't suffer too much from timing issues. I added polling and a timeout of 4 seconds.

@AndrejMitrovic AndrejMitrovic force-pushed the catchup-phase branch 2 times, most recently from c29d528 to 3ea3762 Compare Jul 10, 2019
@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 11, 2019

Very peculiar that macOS dmd 2.085.1 seems to consistently be failing now..

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 11, 2019

~@Geod24 I'm having consistent time-outs in the MacOS build: https://travis-ci.com/bpfkorea/agora/jobs/214998228

It seems it gets stuck trying to install DMD & Dub.~

And it's green.

@Geod24
Copy link
Contributor

@Geod24 Geod24 commented Jul 11, 2019

I guess you want to rebase ?

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 11, 2019

I guess you want to rebase ?

Ah I thought this one was rebased.

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 11, 2019

I mean merged.

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 11, 2019

Rebased.

Copy link
Contributor

@Geod24 Geod24 left a comment

As discussed IRL: Bad rebase, some code is missing

// periodic task
while (1)
{
logInfo("Retrieving latest blocks..");
Copy link
Contributor

@Geod24 Geod24 Jul 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logging should only happen if we are actually catching up


***************************************************************************/

public ulong getBlockHeight ( )
Copy link
Contributor

@Geod24 Geod24 Jul 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public ulong getBlockHeight ( )
public ulong getBlockHeight ()

@@ -22,13 +23,24 @@ import agora.common.Hash;
import agora.common.Transaction;
import agora.test.Base;

/// Returns: true if all node's block heights are equal to the provided height
private bool allHeightsEqual ( TestAPI[] nodes, size_t height )
Copy link
Contributor

@Geod24 Geod24 Jul 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private bool allHeightsEqual ( TestAPI[] nodes, size_t height )
private bool allHeightsEqual (TestAPI[] nodes, size_t height)

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 13, 2019

Fixed the suggested changes.

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 15, 2019

Rebased to use new LocalRest sleep() feature.

However it's a bit ugly because now I have to store RemoteAPI!TestAPI in TestNetworkManager, not just TestAPI. And then there's the issue of RemoteAPI!TestAPI[] not implicitly converting to TestAPI[].

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 15, 2019

***************************************************************************/

private void getBlocksFrom (ulong block_height,
void delegate(Block[]) @safe onReceivedBlocks)
Copy link
Contributor

@Geod24 Geod24 Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void delegate(Block[]) @safe onReceivedBlocks)
scope void delegate(Block[]) @safe onReceivedBlocks)


protected Ledger getLedger ()
{
return new Ledger();
Copy link
Contributor

@Geod24 Geod24 Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need this anymore ?

{
import core.thread;
import std.algorithm;
import std.format;
Copy link
Contributor

@Geod24 Geod24 Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those 2 imports are not necessary anymore

Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need each still.

@@ -22,13 +23,24 @@ import agora.common.Hash;
import agora.common.Transaction;
import agora.test.Base;

/// Returns: true if all node's block heights are equal to the provided height
private bool allHeightsEqual (API)(API[] nodes, size_t height)
Copy link
Contributor

@Geod24 Geod24 Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: isConsistent and check the block headers for equality ?

Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 15, 2019

Updated with suggested changes.

import core.thread;
import std.algorithm;
import std.conv;
import std.format;
Copy link
Contributor

@Geod24 Geod24 Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conv and format superfluous

Geod24
Geod24 approved these changes Jul 15, 2019
Copy link
Contributor

@Geod24 Geod24 left a comment

LGTM

After a node connects to all its peers,
it will retrieve the latest blocks and
apply them to its ledger.

The node will keep periodically trying
to retrieve the latest blocks from its
connected peers.
@AndrejMitrovic
Copy link
Contributor Author

@AndrejMitrovic AndrejMitrovic commented Jul 15, 2019

Pushed a small update.

Geod24
Geod24 approved these changes Jul 15, 2019
@Geod24 Geod24 merged commit 450fbba into bosagora:v0.x.x Jul 15, 2019
1 check passed
@AndrejMitrovic AndrejMitrovic deleted the catchup-phase branch Jul 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants