Skip to content

Latest commit

 

History

History
68 lines (43 loc) · 6.33 KB

2pc.md

File metadata and controls

68 lines (43 loc) · 6.33 KB

Two-phase commit protocol

reference wiki 中文wiki

In transaction processing, databases, and computer networking, the two-phase commit protocol (2PC) is a type of atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort (roll back) the transaction (it is a specialized type of consensus protocol). The protocol achieves its goal even in many cases of temporary system failure (involving either process, network node, communication, etc. failures), and is thus widely used. However, it is not resilient to all possible failure configurations, and in rare cases, user (e.g., a system's administrator) intervention is needed to remedy an outcome. To accommodate recovery from failure (automatic in most cases) the protocol's participants use logging of the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's recovery procedures. Many protocol variants exist that primarily differ in logging strategies and recovery mechanisms. Though usually intended to be used infrequently, recovery procedures compose a substantial portion of the protocol, due to many possible failure scenarios to be considered and supported by the protocol.

    Coordinator                                         Cohort
                          QUERY TO COMMIT
            -------------------------------->
                          VOTE YES/NO           prepare*/abort*
            <-------------------------------
    commit*/abort*                COMMIT/ROLLBACK
            -------------------------------->
                          ACKNOWLEDGMENT        commit*/abort*
            <--------------------------------  
    end
  1. The commit-request phase (or voting phase), in which a coordinator process attempts to prepare all the transaction's participating processes (named participants, cohorts, or workers) to take the necessary steps for either committing or aborting the transaction and to vote, either "Yes": commit (if the transaction participant's local portion execution has ended properly), or "No": abort (if a problem has been detected with the local portion), and
  2. The commit phase, in which, based on voting of the cohorts, the coordinator decides whether to commit (only if all have voted "Yes") or abort the transaction (otherwise), and notifies the result to all the cohorts. The cohorts then follow with the needed actions (commit or abort) with their local transactional resources (also called recoverable resources; e.g., database data) and their respective portions in the transaction's other output (if applicable).

Assumption

The protocol works in the following manner: one node is designated the coordinator, which is the master site, and the rest of the nodes in the network are designated the cohorts. The protocol assumes that there is stable storage at each node with a write-ahead log, that no node crashes forever, that the data in the write-ahead log is never lost or corrupted in a crash, and that any two nodes can communicate with each other. The last assumption is not too restrictive, as network communication can typically be rerouted. The first two assumptions are much stronger; if a node is totally destroyed then data can be lost.

The protocol is initiated by the coordinator after the last step of the transaction has been reached. The cohorts then respond with an agreement message or an abort message depending on whether the transaction has been processed successfully at the cohort.


Basic algorithm

Commit request phase

or voting phase

  1. The coordinator sends a query to commit message to all cohorts and waits until it has received a reply from all cohorts.
  2. The cohorts execute the transaction up to the point where they will be asked to commit. They each write an entry to their undo log and an entry to their redo log.
  3. Each cohort replies with an agreement message (cohort votes Yes to commit), if the cohort's actions succeeded, or an abort message (cohort votes No, not to commit), if the cohort experiences a failure that will make it impossible to commit.

Commit phase

or Completion phase on Success

If the coordinator received an agreement message from all cohorts during the commit-request phase:

  1. The coordinator sends a commit message to all the cohorts.
  2. Each cohort completes the operation, and releases all the locks and resources held during the transaction.
  3. Each cohort sends an acknowledgment to the coordinator.
  4. The coordinator completes the transaction when all acknowledgments have been received.

Failure

If any cohort votes No during the commit-request phase (or the coordinator's timeout expires):

  1. The coordinator sends a rollback message to all the cohorts.
  2. Each cohort undoes the transaction using the undo log, and releases the resources and locks held during the transaction.
  3. Each cohort sends an acknowledgement to the coordinator.
  4. The coordinator undoes the transaction when all acknowledgements have been received.

Disadvantage

The greatest disadvantage of the two-phase commit protocol is that it is a blocking protocol. If the coordinator fails permanently, some cohorts will never resolve their transactions: After a cohort has sent an agreement message to the coordinator, it will block until a commit or rollback is received.

二阶段提交算法的最大缺点就在于 它的执行过程中间,节点都处于阻塞状态。即节点之间在等待对方的相应消息时,它将什么也做不了。特别是,当一个节点在已经占有了某项资源的情况下,为了等待其他节点的响应消息而陷入阻塞状态时,当第三个节点尝试访问该节点占有的资源时,这个节点也将连带陷入阻塞状态。

另外,协调者节点指示参与者节点进行提交等操作时,如有参与者节点出现了崩溃等情况而导致协调者始终无法获取所有参与者的响应信息,这是协调者将只能依赖协调者自身的超时机制来生效。但往往超时机制生效时,协调者都会指示参与者进行回滚操作。这样的策略显得比较保守。