
Feat/backups upgrade path -- WIP #220

Merged
merged 1 commit into master from feat/backups-upgrade-path on Nov 29, 2017

Conversation

ZenGround0
Collaborator

@ZenGround0 ZenGround0 commented Nov 1, 2017

The basic approach is to read the JSON backup and replace all raft state with a single snapshot of the new version, while preserving the necessary raft config metadata. I added migration as an ipfs-cluster-service command rather than an ipfs-cluster-ctl command because it reinforces to the user that the ipfs-cluster-service daemon should be stopped, it reuses some of the config functions already built into that tool, and it cannot talk to an ipfs-cluster-service daemon endpoint, so including it in ipfs-cluster-ctl would break the common pattern.

  • Still testing this out
  • Needs sharness, possibly other tests
  • Still thinking about upgrading general migration framework

--Update--
@hsanjuan The core functionality is here and tests out manually. When you have a chance, your feedback would be awesome. This is still WIP as I want to add a few things (sharness tests, a cleaner upgrade path for development, and better UX when trying to load bad raft state after an update without a migration). Thanks!

@ghost ghost assigned ZenGround0 Nov 1, 2017
@ghost ghost added the status/in-progress In progress label Nov 1, 2017
@ZenGround0 ZenGround0 force-pushed the feat/backups-upgrade-path branch 2 times, most recently from 3c09093 to eed5e3c on November 7, 2017 17:07
return bhost, nil
}

func encodeState(state mapstate.MapState) ([]byte, error) {
Collaborator Author

This is just copy pasted from libp2p-raft/codec. I'd like your opinion on whether this should be exported from the module or kept inline here.

@@ -52,6 +52,10 @@ func copyEmptyStructToIfaces(in []struct{}) []interface{} {
return ifaces
}

func MultiaddrSplit(addr ma.Multiaddr) (peer.ID, ma.Multiaddr, error) {
Collaborator Author

I shadowed multiaddrSplit with an exported version but probably shouldn't keep it this way. I wanted to check if you think it is OK to export.

Collaborator

@hsanjuan hsanjuan Nov 9, 2017

I don't think it needs to be exported?

Edit: ah, maybe we do

raftDataPath = c.Args().Get(1)
}

//Migrate backup to new state
Collaborator Author

If you think there is a better file (maybe I should make one?) to execute these things I am open to suggestions.

Collaborator

Perhaps let's put most of this action in a migrate() function. migrate.go file won't hurt either.

@@ -182,6 +182,10 @@ func makeDataFolder(baseDir, folder string) (string, error) {
return folder, nil
}

func MakeServerConf(peers []peer.ID) hraft.Configuration {
Collaborator Author

I shadowed makeServerConf with an exported version but probably shouldn't keep it this way. I wanted to check if you think it is OK to export.

Collaborator

I don't think we need to export this?

@coveralls
Coverage Status

Changes Unknown when pulling 28d5b79 on feat/backups-upgrade-path into master.

@coveralls
Coverage Status

Changes Unknown when pulling eb5af2f on feat/backups-upgrade-path into master.

Collaborator

@hsanjuan hsanjuan left a comment

Hey,

I think the way of doing things is probably correct, but I'd move functionality to the related component. There are two main things here:

Checking that the state version is up to date is a State component operation: You can add a function that gives you the current state version (from a file or reader) or similar (just like Restore does):

    var vonly struct{ Version int }
    err = json.Unmarshal(snap, &vonly)

Instead of checking whether the state needs a migration within the consensus component, you could do it in main.go, before you load anything. Otherwise you create too strong a dependency between raft <-> mapstate, which we can live without.

Performing a migration: All the parts about trashing and writing a new snapshot given a new state could be in consensus/raft, as they are very specific to raft. It sounds like we'd need a consensus/raft.Restore(st State) or Reset(State) where, given a cluster state object (regardless of the implementation below), the consensus internal state is replaced by it (the difference with Rollback would be that this is an offline operation). This would be called from main.go with the migrated State.

Let me know if you have questions :)

checkErr("creating snapshot store", err)
dummyHost, err := makeDummyHost(clusterCfg)
checkErr("creating dummy host for snapshot store", err)
dummyTransport, err := libp2praft.NewLibp2pTransport(dummyHost, consensusCfg.NetworkTimeout)
Collaborator

I think you can use hraft.InmemTransport rather than going through the trouble of creating a libp2p one?

@coveralls
Coverage Status

Changes Unknown when pulling 12a184e on feat/backups-upgrade-path into master.

@ZenGround0
Collaborator Author

ZenGround0 commented Nov 8, 2017

@hsanjuan I am in agreement with you for the most part. I've moved the version checking to the state module and moved the raft specific parts of migration to the consensus/raft module. One place where I could use a little more advice is where you say "Instead of checking if the state needs a migration within the consensus component, you could do it in main.go, before you load anything."

I've been planning this out, and if I try to validate the version of the state of existing raft snapshots before loading the consensus in main.go, then I think I would end up needing to do a bunch of messy stuff like creating dummy raft components to call HasExistingState and traversing the raft data folder in search of snapshot files to load manually.

Would it be acceptable to check state validity after loading things, i.e. after calling NewCluster from main.go? This could still be done from within main.go so that the only dependence introduced between consensus/raft and state/mapstate is in raft.Reset(mapstate) and would simplify the check.

I thought more about this: trying to check the initial state by reading from the state machine after raft is initialized seems to be asking for race conditions and confusion, and is therefore worse than manually sifting through raft snapshot directories, which actually turned out cleaner than expected.

Collaborator

@hsanjuan hsanjuan left a comment

Hey!

Ok I think you are pretty much on track. My big comment above is the main issue: you took too manual an approach when you can possibly rely on functionality from libp2p-raft and hraft. It is especially important to avoid doing serialization/deserialization ourselves.

Keep it up! 👍

@@ -34,23 +34,23 @@ func NewMapState() *MapState {
}

// Add adds a Pin to the internal map.
func (st *MapState) Add(c api.Pin) error {
func (st MapState) Add(c api.Pin) error {
Collaborator

Leave a pointer receiver? It implements the interface just fine


//ExistingStateReader does a best effort search for existing raft state files
//and returns the bytes of the latest raft snapshot
func ExistingStateReader(cfg *Config) (io.Reader, error){
Collaborator

Hmm this is overly manual with too many assumptions on how raft puts snapshots on disk.

I think you aim to:

So this could be renamed to func LastSnapshot(raftDir string) (io.ReadCloser, error).

Now once you have that, in main.go, before you launch cluster:

  • Create a dummy mapState for version checking
  • Obtain the readCloser of Raft's last snapshot using function above
  • Use libp2p-raft to read it:
    snapR := raft.LastSnapshot(....)
    defer snapR.Close()
    tempConsensus := libp2praft.NewConsensus(dummyMapstate)
    err := tempConsensus.FSM().Restore(snapR) // if this fails, we definitely need to tell user to attempt a migration
    ...
    if !dummyMapState.VersionOk() { // I think this might work if dummyMapState is a pointer, otherwise, need to `GetState()` first.
        .... tell user to migrate

This is more or less what you did, but it is not right to do the decoding and re-marshaling here, since we cannot make assumptions about how libp2p-raft has encoded the snapshot.
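
For illustration, a rough Go sketch of that pre-launch check in main.go, assuming the proposed raft.LastSnapshot helper and a version getter on the restored mapstate (both names come from this review's proposals, not existing APIs at this point):

    // Sketch only: raft.LastSnapshot and the version getter follow the
    // proposals above; error handling is simplified.
    func stateNeedsMigration(consensusCfg *raft.Config) (bool, error) {
        snapR, err := raft.LastSnapshot(consensusCfg.DataFolder)
        if err != nil {
            // no readable snapshot: nothing to check
            return false, nil
        }
        defer snapR.Close()

        dummyMapState := mapstate.NewMapState()
        tempConsensus := libp2praft.NewConsensus(dummyMapState)
        if err := tempConsensus.FSM().Restore(snapR); err != nil {
            // could not deserialize the snapshot: suggest a migration
            return true, err
        }
        // works because dummyMapState is a pointer; otherwise GetState() first
        return dummyMapState.GetVersion() < mapstate.Version, nil
    }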


@@ -50,10 +63,13 @@ type raftWrapper struct {
logStore hraft.LogStore
stableStore hraft.StableStore
boltdb *raftboltdb.BoltStore

HadState bool
Collaborator

I think this is unused now

@@ -27,4 +27,6 @@ type State interface {
Snapshot(w io.Writer) error
// Restore restores a snapshot from a reader
Restore(r io.Reader) error
// Check whether version of state in reader has correct version
VersionOk(r io.Reader) bool
Collaborator

Probably it will be more versatile to have a Version(r io.Reader) int that tells us the version, and we can do comparison with mapstate.Version outside.
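
A minimal sketch of what such a method could look like for the JSON-encoded mapstate, assuming the serialized state is plain JSON with a top-level Version field (as in the snippet quoted earlier in this review); it only needs encoding/json, io and io/ioutil:

    // Sketch only: returns the version encoded in a serialized state,
    // or -1 if it cannot be read.
    func (st *MapState) Version(r io.Reader) int {
        snap, err := ioutil.ReadAll(r)
        if err != nil {
            return -1
        }
        var vonly struct{ Version int }
        if err := json.Unmarshal(snap, &vonly); err != nil {
            return -1
        }
        return vonly.Version
    }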

util.go Outdated
@@ -153,3 +161,32 @@ func containsPeer(list []peer.ID, peer peer.ID) bool {
}
return false
}

func BackupState(baseDir string, state state.State) error {
Collaborator

I think we can pass *Config here. This function can probably be improved with a "backup retain number" which will be another Config key and who knows what else. So more versatile to just receive the Config.
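
For example, a sketch of what a Config-driven version might look like; cfg.BaseDir and cfg.BackupsRetain are hypothetical field names used only to illustrate the "backup retain number" idea, and the .json filename is likewise an assumption:

    // Sketch only: back up the state under <BaseDir>/backups and keep the
    // newest BackupsRetain files. Field names are illustrative.
    func BackupState(cfg *Config, st state.State) error {
        folder := filepath.Join(cfg.BaseDir, "backups")
        if err := os.MkdirAll(folder, 0700); err != nil {
            return err
        }
        name := fmt.Sprintf("backup_%s.json", time.Now().UTC().Format("20060102_150405"))
        f, err := os.Create(filepath.Join(folder, name))
        if err != nil {
            return err
        }
        defer f.Close()
        if err := st.Snapshot(f); err != nil { // state.State.Snapshot writes to an io.Writer
            return err
        }
        return pruneBackups(folder, cfg.BackupsRetain)
    }

    // pruneBackups removes all but the newest `retain` files in folder.
    func pruneBackups(folder string, retain int) error {
        entries, err := ioutil.ReadDir(folder)
        if err != nil {
            return err
        }
        sort.Slice(entries, func(i, j int) bool {
            return entries[i].ModTime().After(entries[j].ModTime())
        })
        for i := retain; i < len(entries); i++ {
            if err := os.Remove(filepath.Join(folder, entries[i].Name())); err != nil {
                return err
            }
        }
        return nil
    }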

@@ -0,0 +1,33 @@
#! /bin/sh
Collaborator

Sharness! 👍

}
if err == nil {
logger.Error("An updated backup of this state has been saved")
logger.Error("to .ipfs-cluster/backups. To setup state for use")
Collaborator

actually, saved to baseDir/backups...

@@ -231,6 +233,57 @@ configuration.
Usage: "run the IPFS Cluster peer (default)",
Action: daemon,
},
{
Name: "migration",
Collaborator

migrate? upgrade? or state upgrade? Perhaps we should leave room for other state commands (state inspect, state clean, state downgrade, state export)...

Collaborator Author

I like it!
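
For reference, the kind of nesting being suggested here, sketched against the existing urfave/cli command list shown in the diff above (usage strings and the upgrade handler wiring are illustrative):

    {
        Name:  "state",
        Usage: "Manage ipfs-cluster-service state (upgrade, inspect, clean, export, ...)",
        Subcommands: []cli.Command{
            {
                Name:   "upgrade",
                Usage:  "upgrade the persistent state to the current format version",
                Action: upgrade, // func upgrade(c *cli.Context) error
            },
        },
    },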

@coveralls
Coverage Status

Changes Unknown when pulling e0b7f63 on feat/backups-upgrade-path into master.

@hsanjuan
Collaborator

Heads up, I have published go-libp2p-raft 1.1.1 QmSrgyB4gDbaQnTCSaE9taRxhJkCUc9p5pYFqtXGyTaTbR. It has the exported functions. You should be able to gx update go-libp2p-raft github.com/libp2p/go-libp2p-raft to use it.

@ZenGround0
Collaborator Author

Thanks @hsanjuan, I'll ping you when I'm ready for another round of review.

@coveralls
Coverage Status

Changes Unknown when pulling f9f8eb6 on feat/backups-upgrade-path into master.

@ZenGround0
Collaborator Author

@hsanjuan sharness is killing me today but apart from the failure I like where the code is at. Could you give it another pass?

If anything happens to jump out at you that I am doing wrong with Sharness, particularly in the second test, I would be very grateful for any help. These tests pass on my machine, albeit with more delay than other tests require and with sharness.sh reporting something about hangup signals, but they consistently fail on Travis. I've run the steps manually on a minikube cluster and migration always goes smoothly, so I suspect the issue is related to cluster setup/teardown more than the state update.

@coveralls
Coverage Status

Changes Unknown when pulling 1fdbaea on feat/backups-upgrade-path into master.

@coveralls
Coverage Status

Changes Unknown when pulling 1b9fd6a on feat/backups-upgrade-path into master.

@ZenGround0 ZenGround0 closed this Nov 11, 2017
@ZenGround0 ZenGround0 reopened this Nov 11, 2017
@ghost ghost removed the status/in-progress In progress label Nov 11, 2017

Collaborator

@hsanjuan hsanjuan left a comment

Ok, a more thorough review now.

util.go Outdated
@@ -153,3 +161,32 @@ func containsPeer(list []peer.ID, peer peer.ID) bool {
}
return false
}

func BackupState(cfg *Config, state state.State) error {
Collaborator

missing godoc

Collaborator

hmm should this maybe belong to the Config object, as in func (cfg *Config) BackupState(state state.State) error

Desc would be something like: Backs up a state according to this configuration's options.

@@ -427,9 +434,88 @@ func (rw *raftWrapper) Peers() ([]string, error) {
return ids, nil
}

// LastSnapshot does a best effort search for existing raft state files
// and returns a reader to the bytes of the latest raft snapshot
func ExistingStateReader(cfg *Config) (io.Reader, error){
Collaborator

Comment has new name but function has old name

Collaborator

I think we can name this func LastState(cfg *Config, *state.State) (bool, error) (see below why).

if len(snapMetas) == 0 {
return nil, errors.New("No snapshot files found")
}
_, recentSnapReader, err := store.Open(snapMetas[0].ID)
Collaborator

I'd like a test for this function, make sure it picks up snapshots in the right order and let us know if that ever changes.

if err != nil {
return nil, err
}
stateBytes, err := json.Marshal(state)
Collaborator

ah, you are using insider information about how a state backup is encoded. This component has no idea about that. Also this works almost by chance because p2pconsensus.State is an empty interface so DecodeState is going to make a lot of assumptions about types. Since we use standard types you can feed the random object to json.Marshal and it looks fine for mapstate, but this probably has many pitfalls if state types were a bit more complex.

Anyways, this forces us to rethink this function a bit. What we essentially want is:

"Please raft module, figure out what raft's last snapshot was, and decode it into a ipfscluster/State-type object which is what you were storing in the first place".

In order to do that, we can provide *state.State and directly plug it to p2praft.DecodeState so that we de-serialize it directly onto the right thing. In practice, you will be calling this function with a MapState instance, which is what you expect to find in the snapshot.

If for some reason you cannot deserialize, a migration is needed. The caller will need to know if an error happened because there were no snapshots at all (or could not be read), or because DecodeState broke.

All in all, I suggest calling this function func LastState(cfg *Config, *state.State) (validSnapshot bool, err error)

Collaborator Author

Thank you @hsanjuan for explaining. To check my understanding, when you say this code is "using insider information", are you referring to this code importing the consensus.State type, or to my implicit assumption that a consensus.State could be json-marshalled into bytes that themselves will work out to be json-unmarshalled into a valid mapState? I understand why the second thing is a problem; is the first thing (importing the consensus module) also a problem?

Collaborator

my implicit assumption that a consensus.State could be json-marshalled into bytes that themselves will work out to be json-unmarshalled into a valid mapState

That's the problem. Importing the consensus module is not

if err != nil {
return err
}
snapshotStore, err := hraft.NewFileSnapshotStoreWithLogger(raftDataPath, RaftMaxSnapshots, nil)
Collaborator

Don't use WithLogger if you're not going to pass in a logger?

Collaborator

You have raftStdLogger there available to use though



func upgrade(c *cli.Context) error {
if c.NArg() < 1 || c.NArg() > 2 {
Collaborator

do the argument checking on the Action() and then call upgrade() with the right values

return err
}
//Record peers of cluster
var peers []peer.ID
Collaborator

if SnapshotReset does as explained above, it should be happy just with current peer ID. Also this is going to break with my PR since we are going to support multiple addresses per peer ID.

return err
}
backupFilePath := c.Args().First()
var raftDataPath string
Collaborator

I'm not super sold on letting the user provide a custom raftDataPath, with all the logic it adds and the need to carry it down three levels of functions. Perhaps just set consensusCfg.DataFolder to the value of raftDataPath?

Note that consensusCfg.DataFolder is normally empty and that in that case, baseDir+defaultDataSubfolder should be used. That's mostly for the raft component's functions.

Collaborator Author

Thanks, given the above info I'm happy using the default or config value.


func needsUpdate(cfg *ipfscluster.Config, cCfg *raft.Config) bool {
state := mapstate.NewMapState()
r, err := raft.ExistingStateReader(cCfg) // Note direct dependence on raft here
Collaborator

following from the comments above, this would be like `validSnap, err := raft.LastState(ccfg, &state)`

func needsUpdate(cfg *ipfscluster.Config, cCfg *raft.Config) bool {
state := mapstate.NewMapState()
r, err := raft.ExistingStateReader(cCfg) // Note direct dependence on raft here
if err == nil { //err != nil no snapshots so skip check
Collaborator

if validSnap == false && err == nil -> no snapshots / nothing to do
if validSnap == true && err == nil -> do GetVersion check and maybe tell user migration is needed
if validSnap == true && err != nil -> could not deserialize state from snapshot, migration very probably needed or snapshot corrupted
if validSnap == false && err != nil -> some error before trying to read snapshots

@hsanjuan
Collaborator

@ZenGround0 we have one problem here though, we don't snapshot anymore on raft.Shutdown.

This had problems when a node had been removed from the cluster (snapshotting would hang). In that situation we cannot check for peers, we cannot snapshot, we can only shut down. Since we cannot request the peerset, it is hard to figure out that we have been removed (if we wanted to skip snapshotting in that case).

This means that:

  • there might be a snapshot, but it does not have the latest state because there are some newer entries in the log
  • there might not be a snapshot at all, only entries in the log

Either we find a way to raft.Snapshot() on every shutdown or the only other approach I see is to get inspiration from https://github.com/hashicorp/raft/blob/master/api.go#L307 and:

  • Try to load last snapshot
  • Load the logstore
  • Restore snapshot with libp2p-raft.FSM (this does de-serialization)
  • Run fsm.Apply for all missing log entries between the snapshotIndex and the log last index
  • return (the state should be the last state then)
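
A rough sketch of those steps against the hashicorp/raft interfaces (FSM, SnapshotStore, LogStore), glossing over edge cases such as log compaction; this is illustrative only, not the PR's code:

    // Sketch only: rebuild the latest state offline by restoring the last
    // snapshot into the FSM and applying any newer committed log entries.
    func rebuildState(fsm hraft.FSM, snaps hraft.SnapshotStore, logs hraft.LogStore) error {
        var snapIndex uint64
        metas, err := snaps.List()
        if err != nil {
            return err
        }
        if len(metas) > 0 { // assumes newest snapshot first, as FileSnapshotStore returns
            _, rc, err := snaps.Open(metas[0].ID)
            if err != nil {
                return err
            }
            defer rc.Close()
            if err := fsm.Restore(rc); err != nil { // libp2p-raft de-serializes here
                return err
            }
            snapIndex = metas[0].Index
        }
        first, err := logs.FirstIndex()
        if err != nil {
            return err
        }
        last, err := logs.LastIndex()
        if err != nil {
            return err
        }
        start := snapIndex + 1
        if first > start {
            start = first
        }
        for i := start; i <= last; i++ { // replay entries newer than the snapshot
            var entry hraft.Log
            if err := logs.GetLog(i, &entry); err != nil {
                return err
            }
            if entry.Type == hraft.LogCommand {
                fsm.Apply(&entry)
            }
        }
        return nil
    }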

Maybe we are focusing this wrongly. We save a backup on every shutdown. We could directly load the last backup against the current mapstate and check the version. If the last backup does not have the right version, then we suggest a migration from it. Then we don't have to load snapshots at all and we just have to make the migrate part (close to what it is now).

On the other hand, our backups are inefficient and the snapshot store does pretty much exactly the same thing. Maybe it would also help if the backup format for mapstate were the same as the libp2p-raft snapshot format, so it's easier to convert a snapshot to a backup and back, but that also ties mapstate and raft together.

I'm going to keep giving some thought to this. My feeling is that having a "soft" RecoverCluster which takes the last snapshot or makes a new one, takes the log, commits everything on it and saves it as a newer snapshot, is a useful thing to have. But let's test if we can snapshot right before raft.Shutdown() again.

@ZenGround0
Collaborator Author

ZenGround0 commented Nov 22, 2017

Currently Travis always fails on (only) sharness test t0051, while this test consistently passes on my machine. It has been tough to see exactly what is going on because output is not carrying over on restart for some reason. One possibility that comes to mind is that t0051 accesses a new data folder through relative paths; maybe Travis doesn't allow this?

@coveralls

Coverage Status

Coverage decreased (-0.4%) to 73.676% when pulling 20e43ab on feat/backups-upgrade-path into 2bc7aec on master.

Collaborator

@hsanjuan hsanjuan left a comment

Yeah this looks good to go! We just need to fix the problem test and do some manual testing...

// latestSnapshot looks for the most recent raft snapshot stored at the
// provided basedir. It returns a boolean indicating if any snapshot is
// readable, the snapshot's metadata, and a reader to the snapshot's bytes
func latestSnapshot(raftDataFolder string) (bool, *hraft.SnapshotMeta, io.ReadCloser, error) {
Collaborator

do we need the bool return value? or can we figure out what we need from the combination of reader+meta+err?

just asking, as we added it when we did not return so much info from this function, it's fine if it stays.

Collaborator Author

Good catch, looks like we can use meta!=nil as the boolean flag.
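
For illustration, a compact variant along those lines, where a nil *SnapshotMeta (with a nil error) signals that no snapshot was found; it relies on FileSnapshotStore.List returning snapshots newest first:

    // Sketch only: open the most recent raft snapshot under raftDataFolder.
    func latestSnapshot(raftDataFolder string) (*hraft.SnapshotMeta, io.ReadCloser, error) {
        store, err := hraft.NewFileSnapshotStore(raftDataFolder, RaftMaxSnapshots, nil)
        if err != nil {
            return nil, nil, err
        }
        snapMetas, err := store.List()
        if err != nil {
            return nil, nil, err
        }
        if len(snapMetas) == 0 {
            return nil, nil, nil // no snapshot exists
        }
        meta, r, err := store.Open(snapMetas[0].ID)
        if err != nil {
            return nil, nil, err
        }
        return meta, r, nil
    }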

if err != nil {
return err
}
return nil
Collaborator

yeah this workflow looks very nice and straightforward now :)

@hsanjuan
Collaborator

@ZenGround0 this option very possibly needs to be true to avoid snapshotting hanging when a peer is removed:

https://github.com/ipfs/ipfs-cluster/blob/master/consensus/raft/config.go#L233

@ZenGround0
Collaborator Author

@hsanjuan, got it. Should I work on changing the default in this PR?

@coveralls

Coverage Status

Coverage decreased (-0.5%) to 73.577% when pulling cebc4bc on feat/backups-upgrade-path into 2bc7aec on master.

@coveralls

Coverage Status

Coverage decreased (-0.5%) to 73.517% when pulling 4a7c609 on feat/backups-upgrade-path into 2bc7aec on master.

@ghost ghost assigned hsanjuan Nov 28, 2017
@hsanjuan hsanjuan force-pushed the feat/backups-upgrade-path branch 2 times, most recently from 627556a to 74fa816 on November 28, 2017 15:05
@coveralls

Coverage Status

Coverage decreased (-0.4%) to 73.616% when pulling 74fa816 on feat/backups-upgrade-path into 2bc7aec on master.

@coveralls

Coverage Status

Coverage decreased (-0.5%) to 73.517% when pulling ff3c69f on feat/backups-upgrade-path into 2bc7aec on master.

@coveralls

Coverage Status

Coverage decreased (-0.4%) to 73.656% when pulling cb5fdf3 on feat/backups-upgrade-path into 2bc7aec on master.

@coveralls

Coverage Status

Coverage decreased (-0.5%) to 73.577% when pulling 8ccc166 on feat/backups-upgrade-path into 2bc7aec on master.

@coveralls

Coverage Status

Coverage decreased (-0.4%) to 73.597% when pulling 46dc11f on feat/backups-upgrade-path into 2bc7aec on master.

@coveralls

Coverage Status

Coverage decreased (-0.4%) to 73.656% when pulling 385d11e on feat/backups-upgrade-path into 2bc7aec on master.

@coveralls

Coverage Status

Coverage decreased (-0.5%) to 73.577% when pulling cfffc50 on feat/backups-upgrade-path into 2bc7aec on master.

    ipfs-cluster-service now has a migration subcommand that upgrades
    persistent state snapshots with an out-of-date format version to the
    newest version of raft state. If all cluster members shut down with
    consistent state, upgrade ipfs-cluster, and run the state upgrade command,
    the new version of cluster will be compatible with persistent storage.
    ipfs-cluster now validates its persistent state upon loading it and exits
    with a clear error when the state format version is not up to date.

    Raft snapshotting is enforced on all shutdowns and the json backup is no
    longer run.  This commit makes use of recent changes to libp2p-raft
    allowing raft states to implement their own marshaling strategies. Now
    mapstate handles the logic for its (de)serialization.  In the interest of
    supporting various potential upgrade formats the state serialization
    begins with a varint (right now one byte) describing the version.

    Some go tests are modified and a go test is added to cover new ipfs-cluster
    raft snapshot reading functions.  Sharness tests are added to cover the
    state upgrade command.
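
As a small illustration of the varint-prefixed serialization described in the commit message above (the helper names and the split between prefix and body are assumptions, not the exact mapstate code; uses bufio, encoding/binary and io):

    // Sketch only: prefix the serialized state with a uvarint format version
    // so the version can be read without decoding the whole body.
    func writeVersionedState(w io.Writer, version uint64, body []byte) error {
        vbuf := make([]byte, binary.MaxVarintLen64)
        n := binary.PutUvarint(vbuf, version)
        if _, err := w.Write(vbuf[:n]); err != nil {
            return err
        }
        _, err := w.Write(body)
        return err
    }

    // readStateVersion reads back just the version prefix.
    func readStateVersion(r io.Reader) (uint64, error) {
        return binary.ReadUvarint(bufio.NewReader(r))
    }
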
@coveralls

Coverage Status

Coverage decreased (-0.4%) to 73.636% when pulling 47b744f on feat/backups-upgrade-path into 2bc7aec on master.

@ZenGround0
Collaborator Author

@hsanjuan I think this is finally ready. Let me know if there is more to do here before I can merge.

@hsanjuan hsanjuan merged commit cb5012c into master Nov 29, 2017
@ghost ghost removed the status/in-progress In progress label Nov 29, 2017
@hsanjuan hsanjuan deleted the feat/backups-upgrade-path branch November 29, 2017 08:55
@hsanjuan
Collaborator

Thanks @ZenGround0 ! Great job here, this was a very complex change and I'm very glad we found a nice approach to it.
