This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Introduce stateful restorer in stmgr. #2091

Merged
srkukarni merged 2 commits into apache:master from sanjeevk/ext1_stmgrrestorer on Jul 24, 2017

Conversation

srkukarni
Contributor

This allows stmgr to handle all checkpoint save/restore as well as exactly-once messages.

  CHECK(stateful_restorer_);

  // Start the restore process
  stateful_restorer_->StartRestore(_checkpoint_id, _restore_txid, pplan_);
}

// Called by TmasterClient when it receives directive from tmaster
// to restore the topology to _checkpoint_id checkpoint
Member

This comment duplicates the one on the method above. Could you update it to describe what this method actually does?

Contributor Author

Fixed the comment

@@ -587,6 +610,11 @@ const proto::system::PhysicalPlan* StMgr::GetPhysicalPlan() const { return pplan

void StMgr::HandleStreamManagerData(const sp_string&,
                                    proto::stmgr::TupleStreamMessage2* _message) {
  if (stateful_restorer_ && stateful_restorer_->InProgress()) {
Member

Should we add metrics for the amount of data being dropped, for both stmgr data and instance data, so we can track and investigate it later?

Contributor Author

Added
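
For readers following the thread, here is a minimal sketch of what counting dropped data during a restore could look like. It is not the code from this PR: `FakeRestorer`, `DropCounters`, and `AdmitOrCount` are hypothetical names invented for illustration, and only the `InProgress()` check mirrors the diff context above. The real change presumably reports through stmgr's own metrics classes, which are not shown in this conversation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the stateful restorer. Only InProgress()
// corresponds to a call that appears in the diff under review.
class FakeRestorer {
 public:
  explicit FakeRestorer(bool in_progress) : in_progress_(in_progress) {}
  bool InProgress() const { return in_progress_; }
  void SetInProgress(bool value) { in_progress_ = value; }

 private:
  bool in_progress_;
};

// Hypothetical counters for data discarded while a restore is running.
struct DropCounters {
  std::uint64_t dropped_bytes = 0;
  std::uint64_t dropped_tuples = 0;
};

// Returns true if the caller should go on to process the message.
// Returns false if the message is dropped because a restore is in
// progress; in that case the counters are updated first.
bool AdmitOrCount(const FakeRestorer* restorer, std::size_t bytes,
                  std::size_t tuples, DropCounters* counters) {
  if (restorer != nullptr && restorer->InProgress()) {
    counters->dropped_bytes += bytes;
    counters->dropped_tuples += tuples;
    return false;
  }
  return true;
}
```

A handler such as `HandleStreamManagerData` could call a helper like this at the top and return early when it yields false, so the same counters cover both stmgr data and instance data.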

// in case we are not in 2pc
if (stateful_restorer_) {
  if (!stateful_restorer_->InProgress() && tmaster_client_) {
    LOG(INFO) << "We lost connection withi stmgr " << _stmgr_id
Member

s/withi/with

Contributor Author

Fixed

if (stateful_restorer_->InProgress()) {
  // We are in the middle of a restore
  stateful_restorer_->HandleAllInstancesConnected();
} else if (tmaster_client_ && tmaster_client_->IsConnected()) {
Member

Could you elaborate on why the ResetTopologyState message needs to be sent in this case?

Contributor Author

Changed the description
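
The branching discussed in this thread can be sketched as a small pure function. This is an illustration only: the `Action` enum and `DecideOnAllInstancesConnected` are hypothetical names, the two booleans stand in for `stateful_restorer_->InProgress()` and `tmaster_client_ && tmaster_client_->IsConnected()`, and the sketch deliberately does not state why the reset is needed, since the updated description is not part of this conversation.

```cpp
// Hypothetical outcomes for the "all instances connected" event.
enum class Action {
  kNotifyRestorer,           // a restore is running; let the restorer continue
  kSendResetTopologyState,   // no restore running and tmaster is reachable
  kNone                      // nothing to do right now
};

// Assumed inputs:
//   restore_in_progress mirrors stateful_restorer_->InProgress()
//   tmaster_connected   mirrors tmaster_client_ && tmaster_client_->IsConnected()
Action DecideOnAllInstancesConnected(bool restore_in_progress,
                                     bool tmaster_connected) {
  if (restore_in_progress) {
    // Matches the first branch in the diff: HandleAllInstancesConnected().
    return Action::kNotifyRestorer;
  }
  if (tmaster_connected) {
    // Matches the else-if branch the reviewer asked about.
    return Action::kSendResetTopologyState;
  }
  return Action::kNone;
}
```

Separating the decision from the side effects like this makes the three cases easy to unit test without a running tmaster.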

@srkukarni srkukarni merged commit f9bddff into apache:master Jul 24, 2017
@srkukarni srkukarni deleted the sanjeevk/ext1_stmgrrestorer branch July 24, 2017 20:43
nicknezis pushed a commit that referenced this pull request Sep 14, 2020
* Introduce stateful restorer in stmgr. This allows stmgr to handle
all checkpoint save/restore as well as exactly once messages

* Added metrics to keep track of bytes/tuples discarded during restore