Break the precondition of `handle_committed` #50

yuezato · 2018-12-17T05:34:53Z

Description

We can break the following precondition of the handle_committed method:

frugalos/frugalos_mds/src/node/node.rs

Line 478 in 4b9ca1e

track_assert_eq!(self.next_commit, commit, ErrorKind::InvalidInput);

Reproduce

Use these files: https://gist.github.com/yuezato/9c0af68320935b342d0b152811f58cfc

Why is the precondition broken

In this while-loop:
https://github.com/frugalos/frugalos/blob/master/frugalos_mds/src/node/node.rs#L729-L738
suppose the two Raft events [ Event::SnapshotLoaded, Event::Committed ] arrive in this order.

First, we handle Event::SnapshotLoaded:

frugalos/frugalos_mds/src/node/node.rs

Lines 449 to 464 in 4b9ca1e

    E::SnapshotLoaded { new_head, snapshot } => {
        info!(
            self.logger,
            "New snapshot is loaded: new_head={:?}, bytes={}",
            new_head,
            snapshot.len()
        );
        let logger = self.logger.clone();
        let future = fibers_tasque::DefaultCpuTaskQueue.async_call(move || {
            let machine = track!(codec::decode_machine(&snapshot))?;
            let versions = machine.to_versions();
            info!(logger, "Snapshot decoded: {} bytes", snapshot.len());
            Ok((new_head, machine, versions))
        });
        self.decoding_snapshot = Some(future);
    }

without updating self.next_commit.

Immediately afterwards, on receiving Event::Committed, we reach this line:

frugalos/frugalos_mds/src/node/node.rs

Line 478 in 4b9ca1e

track_assert_eq!(self.next_commit, commit, ErrorKind::InvalidInput);

At this point self.next_commit still holds its pre-snapshot value, so the assertion fails and the precondition is broken.
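The sequence above can be illustrated with a toy model. This is only a sketch: `ToyNode`, its fields, and the `u64` indices are hypothetical stand-ins and not the real frugalos types, and the assumption that the expected index advances by one after each commit is mine.

```rust
// Toy model of the event ordering problem. None of these types are the
// real frugalos ones; they only mimic the shape of the logic.
#[derive(Debug)]
enum Event {
    SnapshotLoaded { new_head: u64 },
    Committed { commit: u64 },
}

#[derive(Debug)]
struct ToyNode {
    next_commit: u64,
    // Stands in for `decoding_snapshot`: the head of a snapshot whose
    // decoding has been scheduled but has not finished yet.
    decoding_snapshot: Option<u64>,
}

impl ToyNode {
    fn handle_event(&mut self, event: Event) -> Result<(), String> {
        match event {
            Event::SnapshotLoaded { new_head } => {
                // Decoding is only scheduled here; `next_commit` is untouched.
                self.decoding_snapshot = Some(new_head);
                Ok(())
            }
            Event::Committed { commit } => {
                // Mirrors the `track_assert_eq!` precondition.
                if self.next_commit != commit {
                    return Err(format!(
                        "expected commit {}, got {} (snapshot pending: {})",
                        self.next_commit,
                        commit,
                        self.decoding_snapshot.is_some()
                    ));
                }
                // Assumption: the next expected index advances by one.
                self.next_commit += 1;
                Ok(())
            }
        }
    }
}

fn main() {
    let mut node = ToyNode { next_commit: 3, decoding_snapshot: None };
    // Both events are handled back to back, before decoding completes.
    let first = node.handle_event(Event::SnapshotLoaded { new_head: 100 });
    let second = node.handle_event(Event::Committed { commit: 100 });
    println!("{:?} / {:?}", first, second);
}
```

In this model the second call returns an error because `next_commit` is still 3 when the commit for index 100 arrives.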

How to Solve This

Once we encounter a SnapshotLoaded event,
we should defer handling the Committed events that follow it until the snapshot has been decoded.

Indeed, in this part (especially line 704, where self.next_commit is assigned)

frugalos/frugalos_mds/src/node/node.rs

Lines 688 to 709 in 4b9ca1e

    match track!(self.decoding_snapshot.poll().map_err(Error::from))? {
        Async::NotReady => return Ok(Async::NotReady),
        Async::Ready(None) => {}
        Async::Ready(Some(result)) => {
            let (new_head, machine, versions) = track!(result)?;
            info!(self.logger, "Snapshot decoded: new_head={:?}", new_head);
            let delay = env::var("FRUGALOS_SNAPSHOT_REPAIR_DELAY")
                .ok()
                .and_then(|v| v.parse().ok())
                .unwrap_or(10);
            self.events.reserve_exact(machine.len());
            self.events
                .extend(versions.into_iter().map(|version| Event::Putted {
                    version,
                    put_content_timeout: Seconds(delay),
                }));
            self.next_commit = new_head.index;
            self.machine = machine;
            self.metrics.objects.set(self.machine.len() as f64);
            self.decoding_snapshot = None;
        }
    }

self.next_commit is updated correctly, so deferring the Committed events until this point may resolve the present issue.
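One possible shape of this idea, again as a toy sketch rather than a patch: `ToyNode`, the `deferred` queue, and `finish_decoding` are hypothetical names that do not exist in frugalos, and the real code would have to fit this into its existing `poll` loop.

```rust
use std::collections::VecDeque;

// Toy model of the proposed fix: Committed events that arrive while a
// snapshot is being decoded are queued instead of being handled at once.
#[derive(Debug)]
enum Event {
    SnapshotLoaded { new_head: u64 },
    Committed { commit: u64 },
}

#[derive(Debug, Default)]
struct ToyNode {
    next_commit: u64,
    decoding_snapshot: Option<u64>,
    deferred: VecDeque<u64>, // hypothetical queue of postponed commits
}

impl ToyNode {
    fn handle_event(&mut self, event: Event) -> Result<(), String> {
        match event {
            Event::SnapshotLoaded { new_head } => {
                self.decoding_snapshot = Some(new_head);
                Ok(())
            }
            Event::Committed { commit } => {
                if self.decoding_snapshot.is_some() {
                    // Decoding is still in flight: postpone this commit.
                    self.deferred.push_back(commit);
                    return Ok(());
                }
                self.handle_committed(commit)
            }
        }
    }

    fn handle_committed(&mut self, commit: u64) -> Result<(), String> {
        // Same precondition as before.
        if self.next_commit != commit {
            return Err(format!("expected commit {}, got {}", self.next_commit, commit));
        }
        // Assumption: the next expected index advances by one.
        self.next_commit += 1;
        Ok(())
    }

    // Stands in for the branch that runs once decoding has completed.
    fn finish_decoding(&mut self) -> Result<(), String> {
        if let Some(new_head) = self.decoding_snapshot.take() {
            self.next_commit = new_head;
        }
        // Replay the postponed commits in arrival order.
        while let Some(commit) = self.deferred.pop_front() {
            self.handle_committed(commit)?;
        }
        Ok(())
    }
}

fn main() {
    let mut node = ToyNode { next_commit: 3, ..Default::default() };
    node.handle_event(Event::SnapshotLoaded { new_head: 100 }).unwrap();
    node.handle_event(Event::Committed { commit: 100 }).unwrap();
    node.handle_event(Event::Committed { commit: 101 }).unwrap();
    node.finish_decoding().unwrap();
    println!("next_commit = {}", node.next_commit);
}
```

Queuing keeps the precondition check itself unchanged; only the moment at which it runs moves to after `next_commit` has been updated from the snapshot head.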


yuezato · 2018-12-28T13:14:50Z

yuezato commented Jan 8, 2019

Break the precondition of handle_committed #50

Break the precondition of handle_committed #50

Comments

yuezato commented Dec 17, 2018

Description

Reproduce

Why is the precondition broken

How Solve This

yuezato commented Dec 28, 2018

Related Problem

yuezato commented Jan 8, 2019

Break the precondition of `handle_committed` #50

Break the precondition of `handle_committed` #50