Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State: Use go-datastore to implement the state interface #655

Merged
merged 3 commits into from Feb 20, 2019

Conversation

@hsanjuan
Copy link
Collaborator

commented Jan 25, 2019

Since the beginning, we have used a Go map to store the shared state (pinset)
in memory. The mapstate knew how to serialize itself so that libp2p-raft would
know how to write to disk when it:

  • Saved snapshots of the state on shutdown
  • Sent the state to a newcomer peer

hashicorp.Raft assumes an in-memory state which is snapshotted from time to
time and read from disk on boot.

This commit adds a dsstate implementation of the state interface using
go-datastore. This allows to effortlessly switch to a disk-backed state in
the future (as we will need), and also have at our disposal the different
implementations and utilities of Datastore for fine-tuning (caching, batching
etc.).

mapstate has been reworked to use dsstate. Ideally, we would not even need
mapstate, as it would suffice to initialize dsstate with a
MapDatastore. BUT, we still need it separate to be able to auto-migrate to
the new format.

This will be the last migration with the current system. Once this has been
released and users have been able to upgrade we will just remove mapstate as
it is now.

License: MIT
Signed-off-by: Hector Sanjuan code@hector.link

Depends on libp2p/go-libp2p-raft#50

@hsanjuan hsanjuan self-assigned this Jan 25, 2019
@hsanjuan hsanjuan requested a review from lanzafame Jan 25, 2019
@ghost ghost added the in progress label Jan 25, 2019
@hsanjuan

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 25, 2019

Also, kind of depends on discussion #654 . If we slightly changed how things are serialized soon, I'd rather not have to migrate right after this.

@lanzafame

This comment has been minimized.

Copy link
Collaborator

commented Jan 31, 2019

LGTM. Only thing would be a better explanation of what the migration process is and how the migration process will be different in the future, i.e. post v6.

@lanzafame

This comment has been minimized.

Copy link
Collaborator

commented Jan 31, 2019

@hsanjuan It seems that in some cases, this PR has slowed down cluster as there have now been 2 test timeout failures. I am happy to bump the timeout up again and then chip away at getting it down again.

@coveralls

This comment has been minimized.

Copy link

commented Jan 31, 2019

Coverage Status

Coverage decreased (-0.3%) to 64.888% when pulling 46801aa on feat/datastore into 3059ab3 on master.

@hsanjuan

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 31, 2019

Also, kind of depends on discussion #654 . If we slightly changed how things are serialized soon, I'd rather not have to migrate right after this.

I don't fully know. But since checking state version will mean opening the datastore and reading one key, we know that will be different. The migration will be looping the keys and overwriting them with their updated form.

The key idea is that Marshal/Unmarshal will either not be used, or can happen normally regardless of the state version (unlike now).

@hsanjuan

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 31, 2019

@hsanjuan It seems that in some cases, this PR has slowed down cluster as there have now been 2 test timeout failures. I am happy to bump the timeout up again and then chip away at getting it down again.

I only see jenkins, which I retriggered now. It make take a while before I merge this anyway.

Does this approval include changes in go-libp2p-raft ?

@lanzafame

This comment has been minimized.

Copy link
Collaborator

commented Feb 1, 2019

@hsanjuan I just added some comments on the go-libp2p-raft PR.

@hsanjuan

This comment has been minimized.

Copy link
Collaborator Author

commented Feb 14, 2019

@lanzafame do me a favour and have a look to the last commits (added tracing) and lmk if we can leave this ready for merge, as I want it to go before #670 .

api/pb/types.pb.go Show resolved Hide resolved
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package

type Pin_PinType int32

This comment has been minimized.

Copy link
@lanzafame

lanzafame Feb 18, 2019

Collaborator

Was the go package flag used in type.pb so that it generates 'idiomatic' type names?

cluster.go Outdated Show resolved Hide resolved

message Pin {
bytes Cid = 1;
enum PinType {

This comment has been minimized.

Copy link
@lanzafame

lanzafame Feb 18, 2019

Collaborator

s/PinType/Type

This comment has been minimized.

Copy link
@hsanjuan

hsanjuan Feb 19, 2019

Author Collaborator

I can't have an enum called Type and a field called Type

This comment has been minimized.

Copy link
@hsanjuan

hsanjuan Feb 19, 2019

Author Collaborator

let's leave this as is, I don't mind so much and we can change it later.

ReplicationFactorMax int `json:"replication_factor_max"`
Name string `json:"name"`
ShardSize uint64 `json:"shard_size"`
ReplicationFactorMin int `json:"replication_factor_min" codec:"rn,omitempty"`

This comment has been minimized.

Copy link
@lanzafame

lanzafame Feb 18, 2019

Collaborator

@hsanjuan Is there a compile-time or test-time check that we can add so that we don't conflict with the codec variable?

This comment has been minimized.

Copy link
@hsanjuan

hsanjuan Feb 19, 2019

Author Collaborator

As discussed, tests break in most cases when screwing up the codec field names.

This comment has been minimized.

Copy link
@hsanjuan

hsanjuan Feb 19, 2019

Author Collaborator

Additionally I have added a test that verifies that there are no dup fields.

@ipfs ipfs deleted a comment from GitCop Feb 19, 2019
@ipfs ipfs deleted a comment from GitCop Feb 19, 2019
@hsanjuan hsanjuan force-pushed the feat/datastore branch from 81c220e to f78ca81 Feb 19, 2019
Since the beginning, we have used a Go map to store the shared state (pinset)
in memory. The mapstate knew how to serialize itself so that libp2p-raft would
know how to write to disk when it:

* Saved snapshots of the state on shutdown
* Sent the state to a newcomer peer

hashicorp.Raft assumes an in-memory state which is snapshotted from time to
time and read from disk on boot.

This commit adds a `dsstate` implementation of the state interface using
`go-datastore`. This allows to effortlessly switch to a disk-backed state in
the future (as we will need), and also have at our disposal the different
implementations and utilities of Datastore for fine-tuning (caching, batching
etc.).

`mapstate` has been reworked to use dsstate. Ideally, we would not even need
`mapstate`, as it would suffice to initialize `dsstate` with a
`MapDatastore`. BUT, we still need it separate to be able to auto-migrate to
the new format.

This will be the last migration with the current system. Once this has been
released and users have been able to upgrade we will just remove `mapstate` as
it is now.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
@hsanjuan hsanjuan force-pushed the feat/datastore branch from f78ca81 to d57b814 Feb 19, 2019
Additionally, remove persisting the state version to the go-datastore. In the
future versions of the state, there is not a global format anymore (with a
global version). Instead, every pin element can potentially be stored in a
different version.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
@hsanjuan

This comment has been minimized.

Copy link
Collaborator Author

commented Feb 19, 2019

@lanzafame I squashed, rebased and cleaned up. Then I tested with larger states and had to fix some issues (latest commit). Can you check again? I think we can merge then.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
@hsanjuan hsanjuan referenced this pull request Feb 19, 2019
0 of 2 tasks complete
@lanzafame

This comment has been minimized.

Copy link
Collaborator

commented Feb 20, 2019

@hsanjuan LGTM

@hsanjuan hsanjuan merged commit f57c5e4 into master Feb 20, 2019
5 of 6 checks passed
5 of 6 checks passed
continuous-integration/jenkins/pr-merge This commit cannot be built
Details
Travis CI - Branch Build Passed
Details
Travis CI - Pull Request Build Passed
Details
codeclimate 2 fixed issues
Details
commit-message-check/gitcop All commit messages are valid
Details
coverage/coveralls Coverage increased (+0.003%) to 64.888%
Details
@ghost ghost removed the in progress label Feb 20, 2019
@hsanjuan hsanjuan deleted the feat/datastore branch Feb 20, 2019
@hsanjuan hsanjuan referenced this pull request Feb 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.