Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State: Use go-datastore to implement the state interface #655

Merged
merged 3 commits into from Feb 20, 2019

Conversation

hsanjuan
Copy link
Collaborator

Since the beginning, we have used a Go map to store the shared state (pinset)
in memory. The mapstate knew how to serialize itself so that libp2p-raft would
know how to write to disk when it:

  • Saved snapshots of the state on shutdown
  • Sent the state to a newcomer peer

hashicorp.Raft assumes an in-memory state which is snapshotted from time to
time and read from disk on boot.

This commit adds a dsstate implementation of the state interface using
go-datastore. This allows to effortlessly switch to a disk-backed state in
the future (as we will need), and also have at our disposal the different
implementations and utilities of Datastore for fine-tuning (caching, batching
etc.).

mapstate has been reworked to use dsstate. Ideally, we would not even need
mapstate, as it would suffice to initialize dsstate with a
MapDatastore. BUT, we still need it separate to be able to auto-migrate to
the new format.

This will be the last migration with the current system. Once this has been
released and users have been able to upgrade we will just remove mapstate as
it is now.

License: MIT
Signed-off-by: Hector Sanjuan code@hector.link

Depends on libp2p/go-libp2p-raft#50

@hsanjuan hsanjuan self-assigned this Jan 25, 2019
@ghost ghost added the status/in-progress In progress label Jan 25, 2019
@hsanjuan
Copy link
Collaborator Author

Also, kind of depends on discussion #654 . If we slightly changed how things are serialized soon, I'd rather not have to migrate right after this.

@lanzafame
Copy link
Contributor

LGTM. Only thing would be a better explanation of what the migration process is and how the migration process will be different in the future, i.e. post v6.

lanzafame
lanzafame previously approved these changes Jan 31, 2019
@lanzafame
Copy link
Contributor

@hsanjuan It seems that in some cases, this PR has slowed down cluster as there have now been 2 test timeout failures. I am happy to bump the timeout up again and then chip away at getting it down again.

@coveralls
Copy link

coveralls commented Jan 31, 2019

Coverage Status

Coverage decreased (-0.3%) to 64.888% when pulling 46801aa on feat/datastore into 3059ab3 on master.

@hsanjuan
Copy link
Collaborator Author

Also, kind of depends on discussion #654 . If we slightly changed how things are serialized soon, I'd rather not have to migrate right after this.

I don't fully know. But since checking state version will mean opening the datastore and reading one key, we know that will be different. The migration will be looping the keys and overwriting them with their updated form.

The key idea is that Marshal/Unmarshal will either not be used, or can happen normally regardless of the state version (unlike now).

@hsanjuan
Copy link
Collaborator Author

@hsanjuan It seems that in some cases, this PR has slowed down cluster as there have now been 2 test timeout failures. I am happy to bump the timeout up again and then chip away at getting it down again.

I only see jenkins, which I retriggered now. It make take a while before I merge this anyway.

Does this approval include changes in go-libp2p-raft ?

@lanzafame
Copy link
Contributor

@hsanjuan I just added some comments on the go-libp2p-raft PR.

@hsanjuan
Copy link
Collaborator Author

@lanzafame do me a favour and have a look to the last commits (added tracing) and lmk if we can leave this ready for merge, as I want it to go before #670 .

api/pb/types.pb.go Show resolved Hide resolved
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package

type Pin_PinType int32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the go package flag used in type.pb so that it generates 'idiomatic' type names?

cluster.go Outdated Show resolved Hide resolved

message Pin {
bytes Cid = 1;
enum PinType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/PinType/Type

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't have an enum called Type and a field called Type

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's leave this as is, I don't mind so much and we can change it later.

ReplicationFactorMax int `json:"replication_factor_max"`
Name string `json:"name"`
ShardSize uint64 `json:"shard_size"`
ReplicationFactorMin int `json:"replication_factor_min" codec:"rn,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hsanjuan Is there a compile-time or test-time check that we can add so that we don't conflict with the codec variable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, tests break in most cases when screwing up the codec field names.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally I have added a test that verifies that there are no dup fields.

Since the beginning, we have used a Go map to store the shared state (pinset)
in memory. The mapstate knew how to serialize itself so that libp2p-raft would
know how to write to disk when it:

* Saved snapshots of the state on shutdown
* Sent the state to a newcomer peer

hashicorp.Raft assumes an in-memory state which is snapshotted from time to
time and read from disk on boot.

This commit adds a `dsstate` implementation of the state interface using
`go-datastore`. This allows to effortlessly switch to a disk-backed state in
the future (as we will need), and also have at our disposal the different
implementations and utilities of Datastore for fine-tuning (caching, batching
etc.).

`mapstate` has been reworked to use dsstate. Ideally, we would not even need
`mapstate`, as it would suffice to initialize `dsstate` with a
`MapDatastore`. BUT, we still need it separate to be able to auto-migrate to
the new format.

This will be the last migration with the current system. Once this has been
released and users have been able to upgrade we will just remove `mapstate` as
it is now.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
Additionally, remove persisting the state version to the go-datastore. In the
future versions of the state, there is not a global format anymore (with a
global version). Instead, every pin element can potentially be stored in a
different version.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
@hsanjuan
Copy link
Collaborator Author

@lanzafame I squashed, rebased and cleaned up. Then I tested with larger states and had to fix some issues (latest commit). Can you check again? I think we can merge then.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
@lanzafame
Copy link
Contributor

@hsanjuan LGTM

@hsanjuan hsanjuan merged commit f57c5e4 into master Feb 20, 2019
@ghost ghost removed the status/in-progress In progress label Feb 20, 2019
@hsanjuan hsanjuan deleted the feat/datastore branch February 20, 2019 11:20
@hsanjuan hsanjuan mentioned this pull request Feb 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants