Implement Sync API and improve syncing #539

Merged
merged 19 commits into from
Jul 14, 2020

austinabell
Contributor

@austinabell austinabell commented Jul 10, 2020

Summary of changes
Changes introduced in this pull request:

  • Implements the sync API (Chain sync RPC methods, #535)
    • The API needs additional context, so I included only what was needed to be least burdensome; it can be refactored later
  • Implements the block gossipsub type GossipBlock so blocks can be serialized/deserialized (equivalent of BlockMsg in Lotus)
  • Moves the bad block cache logic into its own type so it can be made thread-safe
  • Creates a type for sync state and updates the checkpoints for each stage (previously assumed a switch to graphsync)
  • Fixes a bug in the zero case of the AMT (the node bitfield was serialized as [0] instead of [])
  • Updates and fixes the blocksync conversion into FullTipset/Tipset
    • Previously there was duplicated logic and a bug in the conversion to Tipset

Also did a few other small things, so feel free to ask about the motivation or explanation for anything.

Edit: oops forgot to add example commands to test manually:

curl -X POST --data '{ "jsonrpc": "2.0", "method": "Filecoin.SyncCheckBad", "params": [{"/":"bafy2bzacea3wsdh6y3a36tb3skempjoxqpuyompjbmfeyf34fi3uy6uue42v4"}], "id": 1 }' -H "Content-Type: application/json" http://localhost:1234/rpc/v0
curl -X POST --data '{ "jsonrpc": "2.0", "method": "Filecoin.SyncMarkBad", "params": [{"/":"bafy2bzacea3wsdh6y3a36tb3skempjoxqpuyompjbmfeyf34fi3uy6uue42v4"}], "id": 1 }' -H "Content-Type: application/json" http://localhost:1234/rpc/v0
curl -X POST --data '{ "jsonrpc": "2.0", "method": "Filecoin.SyncState", "id": 3 }' -H "Content-Type: application/json" http://localhost:1234/rpc/v0
curl -X POST --data '{ "jsonrpc": "2.0", "method": "Filecoin.SyncSubmitBlock", "params": [{"Header":{"Miner":"t01234","Ticket":{"VRFProof":"Ynl0ZSBhcnJheQ=="},"ElectionProof":{"VRFProof":"Ynl0ZSBhcnJheQ=="},"BeaconEntries":null,"WinPoStProof":null,"Parents":null,"ParentWeight":"0","Height":10101,"ParentStateRoot":{"/":"bafy2bzacea3wsdh6y3a36tb3skempjoxqpuyompjbmfeyf34fi3uy6uue42v4"},"ParentMessageReceipts":{"/":"bafy2bzacea3wsdh6y3a36tb3skempjoxqpuyompjbmfeyf34fi3uy6uue42v4"},"Messages":{"/":"bafy2bzacea3wsdh6y3a36tb3skempjoxqpuyompjbmfeyf34fi3uy6uue42v4"},"BLSAggregate":{"Type":2,"Data":"Ynl0ZSBhcnJheQ=="},"Timestamp":42,"BlockSig":{"Type":2,"Data":"Ynl0ZSBhcnJheQ=="},"ForkSignaling":42},"BlsMessages":null,"SecpkMessages":null}], "id": 3 }' -H "Content-Type: application/json" http://localhost:1234/rpc/v0

Reference issue to close (if applicable)

Closes #535

Other information and links

}

/// Puts a bad block Cid in the cache with a given reason.
pub async fn put(&self, c: Cid, reason: String) -> Option<String> {
Contributor

Is there a reason why you wanted to make the reason a String? I would've used an enum, similar to the error types we have for the actors.

Contributor Author

There just isn't a reason to distinguish specific cases yet, and the flexibility is nice. It's a good idea though, and I may look into it before this PR comes in.

Contributor Author

I made the change, but decided to force-push it back, because using an enum for this takes 56 bytes rather than the 24 needed for a String. There is no functional gain from the enum, so the extra bytes per entry in such a large cache are counterproductive.
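
To make the size argument above concrete, here is a minimal sketch, not the code from this PR. It assumes a plain `HashMap` behind a `std::sync::RwLock` (the real `put` is `async`, so the project presumably uses an async lock), a `String` stand-in for `Cid`, and a hypothetical `Reason` enum that exists only to compare sizes. The byte counts printed depend on the target and on the enum's variants, so none are claimed here.

```rust
use std::collections::HashMap;
use std::mem::size_of;
use std::sync::RwLock;

/// Hypothetical stand-in for a Cid; the real type comes from the project's cid crate.
type CidKey = String;

/// Illustrative thread-safe bad block cache: Cid -> free-form reason.
#[derive(Default)]
pub struct BadBlockCache {
    cache: RwLock<HashMap<CidKey, String>>,
}

impl BadBlockCache {
    /// Stores a reason for the Cid and returns the previous reason, if any.
    pub fn put(&self, c: CidKey, reason: String) -> Option<String> {
        self.cache.write().unwrap().insert(c, reason)
    }

    /// Returns a copy of the stored reason if the Cid is marked bad.
    pub fn get(&self, c: &str) -> Option<String> {
        self.cache.read().unwrap().get(c).cloned()
    }
}

/// Hypothetical reason enum, used only for the size comparison below.
#[allow(dead_code)]
enum Reason {
    InvalidSignature,
    Other(String),
    Linked { parent: String, detail: String },
}

fn main() {
    let cache = BadBlockCache::default();
    cache.put("cid-a".to_string(), "invalid ticket".to_string());
    println!("cid-a: {:?}", cache.get("cid-a"));
    // An enum is at least as large as its largest variant, so any variant
    // carrying more than one String makes every entry bigger than a String.
    println!(
        "String: {} bytes, Reason: {} bytes",
        size_of::<String>(),
        size_of::<Reason>()
    );
}
```

Because `put` takes `&self`, the cache can be shared behind an `Arc` without wrapping the whole type in another lock.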

Contributor

@StaticallyTypedAnxiety StaticallyTypedAnxiety left a comment

Just some suggestions

}

Ok(FullTipset::new(blocks).map_err(|e| e.to_string())?)
Contributor

Optional: another approach would be to move this into a try_fold, but the downside is that you would be allocating.

Contributor Author

for what benefit?

Contributor

You would potentially be getting rid of a mutable block vec and handing control to the try_fold, but the scope is fairly small, so I'm not too worried about that.

Contributor Author

Why would you use try_fold instead of just doing an iter().map()? I'm not really seeing any optimization here, but I can commit this?

Edit: I committed it, and I get that it's more idiomatic, but the downside of doing things like this is that it's less readable to people who don't know Rust, which seems like a con with no benefit.

Contributor

That's fair, I can see that.

Will retract my suggestion
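
For readers following the thread, the two shapes being compared look roughly like this. It is a sketch under assumptions: `parse_block` is a hypothetical fallible step standing in for building one block, and `u64` stands in for the block type; neither comes from the PR.

```rust
/// Hypothetical fallible conversion; stands in for building one block.
fn parse_block(raw: &str) -> Result<u64, String> {
    raw.parse::<u64>().map_err(|e| e.to_string())
}

/// Loop form: a mutable Vec filled in a `for` loop, returning early on error.
fn blocks_loop(raw: &[&str]) -> Result<Vec<u64>, String> {
    let mut blocks = Vec::with_capacity(raw.len());
    for r in raw {
        blocks.push(parse_block(r)?);
    }
    Ok(blocks)
}

/// `try_fold` form: the Vec is threaded through the closure as the accumulator,
/// and the fold stops at the first `Err`.
fn blocks_try_fold(raw: &[&str]) -> Result<Vec<u64>, String> {
    raw.iter()
        .try_fold(Vec::with_capacity(raw.len()), |mut acc, r| {
            acc.push(parse_block(r)?);
            Ok::<Vec<u64>, String>(acc)
        })
}

fn main() {
    let input = ["1", "2", "3"];
    println!("{:?}", blocks_loop(&input));
    println!("{:?}", blocks_try_fold(&input));
}
```

Both versions allocate one Vec and short-circuit on the first error, which matches the point made above that the difference is one of style rather than optimization.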

}
Ok(msgs)
Contributor

Couldn't we just collect indexes.iter().map() into a Result<Vec<_>, _> here without allocating msgs?

Contributor Author

It gets allocated either way and is functionally the same, but I switched it
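
The suggested form can be sketched as follows. `messages_for` and its arguments are hypothetical names for illustration; the only part taken from the thread is the idea of mapping over `indexes` and collecting into a `Result<Vec<_>, _>`.

```rust
/// Hypothetical lookup: resolves each index against a slice of messages.
/// Collecting an iterator of `Result`s into `Result<Vec<_>, _>` stops at the
/// first `Err` and returns it; otherwise it returns all the `Ok` values.
fn messages_for(indexes: &[usize], all: &[String]) -> Result<Vec<String>, String> {
    indexes
        .iter()
        .map(|&i| {
            all.get(i)
                .cloned()
                .ok_or_else(|| format!("index {} out of bounds", i))
        })
        .collect()
}

fn main() {
    let all = vec!["a".to_string(), "b".to_string()];
    println!("{:?}", messages_for(&[1, 0], &all));
    println!("{:?}", messages_for(&[2], &all));
}
```

As noted in the reply above, `collect` still allocates the resulting Vec, so this is functionally equivalent to the loop with an explicit `msgs` vector.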

Member

@ec2 ec2 left a comment

LGTM. Main comment is w.r.t. readability. You use stage, state, and status interchangeably for the SyncState all over. I'd fix that for consistent terminology.

pub struct ChainSyncer<DB, TBeacon> {
/// Syncing state of chain sync
state: SyncState,
// TODO should be a vector once syncing done async and ideally not wrap each state in mutex.
state: Arc<RwLock<SyncState>>,
Member

I'd rename this to stage since the getter/setter are (get/set)_stage. Or change the method name

Contributor Author

SyncStage is a different type with a different meaning, and I can't find the methods you are referring to; can you point to what you mean? The set_stage function sets the stage inside the sync state. That may be confusing, so feel free to suggest different words for the state and the stage.

Member

Oh, you're right. I misread; the stage is inside the state. M'bad, carry on :)
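
The relationship cleared up in this thread (a stage stored inside the state, with the state shared behind `Arc<RwLock<...>>`) can be sketched as below. This is an illustration under assumptions: the `SyncStage` variants, the `stage` getter, `state_handle`, and `sync` are hypothetical, and `std::sync::RwLock` is used so the example runs without the project's async runtime. Only the names `SyncState`, `SyncStage`, `set_stage`, and the field type come from the diff and comments above.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Stage of a sync run; the variant names are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SyncStage {
    Headers,
    Messages,
    Complete,
}

/// State of one sync run; the stage is one field inside it.
#[derive(Debug)]
pub struct SyncState {
    stage: SyncStage,
}

impl SyncState {
    pub fn new() -> Self {
        SyncState { stage: SyncStage::Headers }
    }

    pub fn stage(&self) -> SyncStage {
        self.stage
    }

    /// Sets the stage inside the state.
    pub fn set_stage(&mut self, stage: SyncStage) {
        self.stage = stage;
    }
}

pub struct ChainSyncer {
    /// Shared so an RPC handler can read the state while syncing writes it.
    state: Arc<RwLock<SyncState>>,
}

impl ChainSyncer {
    pub fn new() -> Self {
        ChainSyncer { state: Arc::new(RwLock::new(SyncState::new())) }
    }

    /// Hands out a second handle to the same state (e.g. for an RPC handler).
    pub fn state_handle(&self) -> Arc<RwLock<SyncState>> {
        Arc::clone(&self.state)
    }

    /// Walks through the remaining stages, updating the shared state each time.
    pub fn sync(&self) {
        for stage in &[SyncStage::Messages, SyncStage::Complete] {
            self.state.write().unwrap().set_stage(*stage);
        }
    }
}

fn main() {
    let syncer = ChainSyncer::new();
    let handle = syncer.state_handle();
    let worker = thread::spawn(move || syncer.sync());
    worker.join().unwrap();
    println!("final stage: {:?}", handle.read().unwrap().stage());
}
```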

@austinabell austinabell merged commit 1b04f1c into main Jul 14, 2020
@austinabell austinabell deleted the austin/rpc/sync branch July 14, 2020 20:50
Development

Successfully merging this pull request may close these issues.

Chain sync RPC methods
5 participants