-
Notifications
You must be signed in to change notification settings - Fork 112
pushsync, localstore: decouple push/pull indexes for tag increment #1915
Conversation
a1c61b7
to
45c72e1
Compare
there is a data race in a stream test which is showing up here. I will fix this but I think it is orthogonal to this PR and that this can still be reviewed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a few minor comments. I am not approving since it is still marked as in progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is addressing 1) persistance, and 2) changing indexes.
as for 1) i believe we may need more careful checkpoints as per roundtable discussion, no?
as for 2) synced state should not increment for pull sync only sent state.
TL;DR: our shutdown sequence does not promise that all goroutines related to push and pullsync are shut down before we persist tags. this has to be mended before we implement a dirty flag Yes but as per my discussion with @janos it is not possible to implement the dirty flag right now, since we have no guarantees that once the dirty flag is unset (on persist), that no other goroutine is going to do a tag increment. example - on pushsync shutdown right now there is no guarantee that additional goroutines will not do any other
ok |
The code cannot be compiled:
|
storage/localstore/migration.go
Outdated
for i := 0; i < len(migrations)-1; i++ { | ||
err := migrations[i].migrationFunc(db) | ||
if err != nil { | ||
return err | ||
} | ||
if i != len(migrations)-1 { | ||
err = db.schemaName.Put(migrations[i+1].name) // put the name of the next schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we never running migrations for the last element? But also setting the schema of the last migration.
When any migration is done, its schema should be stored in schemaName field, so that if any intermediate migration fails, the migration can be restarted after the last successful one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we never running migrations for the last element? But also setting the schema of the last migration.
if you would look at the definition of the migration
struct:
type migration struct {
name string //name of the schema
migrationFunc func(db *DB) error // the migration function that needs to be performed in order to get to the NEXT schema name
}
maybe this is not intuitive enough. i can change it that the migrationFunc will lead to the current schema name, not the next.
When any migration is done, its schema should be stored in schemaName field, so that if any intermediate migration fails, the migration can be restarted after the last successful one.
that is actually what happens. but yeah i guess this code is a bit overly convoluted. i will refactor this
storage/localstore/schema.go
Outdated
|
||
// allDbSchemaMigrations contains an ordered list of the database schemes, that is | ||
// in order to run data migrations in the correct sequence | ||
var allDbSchemaMigrations = []migration{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is all
prefix needed? There is only one migrations slice. Can it be called just migrations
?
storage/localstore/migration_test.go
Outdated
return nil | ||
}}, | ||
{name: DbSchemaDiwali, migrationFunc: func(db *DB) error { | ||
shouldNotRun = true // this should not be executed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in relation with my comment on migrate
function. I do not know why last migration should not be executed.
if err != nil { | ||
t.Fatal(err) | ||
} | ||
tag.Inc(chunk.StateStored) // so we don't get an error on tag.Status later on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is needed in the test, should it be the responsibility of db.Put to increment StateStored if it finds the tag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, StateStored is incremented in hasherstore
..... remove this line and you will understand the error that is invoked. this must be called in the test because of the tag.Status
logic. please review that function to understand why this is called
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments. LGTM.
@@ -8,7 +24,7 @@ import ( | |||
|
|||
// The DB schema we want to use. The actual/current DB schema might differ | |||
// until migrations are run. | |||
const CurrentDbSchema = DbSchemaSanctuary | |||
var DbSchemaCurrent = DbSchemaDiwali |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be wrong, but I do not think that I added schema name variables. I used DB in shed.
The convention that is usually noted is this one https://github.com/golang/go/wiki/CodeReviewComments#initialisms on initialisms. I also think that Uid
should be UID
and so on, but I missed to review these pull requests.
The uppercaps argument does not stand, as it is about "same" caps, DB
can be db
at the start of unexported value or on its own.
In any case, we have so much inconsistencies in the code that this does not matter at all.
if err != nil { | ||
t.Fatal(err) | ||
} | ||
tag.Inc(chunk.StateStored) // so we don't get an error on tag.Status later on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation.
storage/localstore/migration_test.go
Outdated
} | ||
defer os.RemoveAll(tmpdir) | ||
|
||
cdir, err := os.Getwd() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed. Tests are run always in current directory. Relative path for dir
is fine.
storage/localstore/migration_test.go
Outdated
return err | ||
} | ||
defer func() { | ||
err = out.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error will shadow any returned one. It is possible that Copy or Sync return not nil error and that the err is set to nil, by the Close in this defer.
storage/localstore/migration_test.go
Outdated
if _, err = io.Copy(out, in); err != nil { | ||
return err | ||
} | ||
err = out.Sync() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error is never returned.
This PR aims to decrease the complexity that was added in #1828 by adding a
Tag
field to thelocalstore
pullIndex
. This allows decoupling of the logic of tag increments from thepushIndex
and in addition, allows us to have the freedom of choosing whether to insert intopushIndex
at all (this should not be done on all cases, for example when an anonymous upload is made).As part of the changes needed, a backwards compatible
Tag
field has been appended to thepullIndex
value definition. This field, due to the nature of the new tag increment logic is backwards compatible, thus there is not need for a database migration apart from just renaming thepullIndex
using the newshed
functionality ofRenameIndex
.