pushsync, localstore: decouple push/pull indexes for tag increment #1915

acud · 2019-10-29T11:20:17Z

This PR aims to reduce the complexity added in #1828 by adding a Tag field to the localstore pullIndex. This decouples the tag increment logic from the pushIndex and also leaves us free to choose whether to insert into the pushIndex at all (this should not be done in all cases, for example when an anonymous upload is made).

As part of the changes needed, a Tag field has been appended to the pullIndex value definition. Due to the nature of the new tag increment logic, this field is backwards compatible, so there is no need for a database migration apart from renaming the pullIndex using the new shed functionality, RenameIndex.
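
To illustrate why an appended field can stay backwards compatible, here is a minimal sketch of a value codec that accepts both layouts. It is not the localstore code: the `pullItem` type, the 32-byte address length, the 4-byte big-endian tag and the function names are all assumptions made for the example. The only idea taken from the description above is that an old value without the trailing field still decodes, with a zero tag meaning "no tag".

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// addressLen is an assumed fixed length of the chunk address stored in the value.
const addressLen = 32

// pullItem is a hypothetical stand-in for a decoded pullIndex value.
type pullItem struct {
	Address []byte
	Tag     uint32 // zero is treated as "no tag"
}

// encodeValue writes the address followed by the tag as 4 big-endian bytes.
func encodeValue(it pullItem) ([]byte, error) {
	if len(it.Address) != addressLen {
		return nil, errors.New("unexpected address length")
	}
	value := make([]byte, addressLen+4)
	copy(value, it.Address)
	binary.BigEndian.PutUint32(value[addressLen:], it.Tag)
	return value, nil
}

// decodeValue accepts the old layout (address only) and the new layout
// (address followed by a tag), so existing entries need no rewrite.
func decodeValue(value []byte) (pullItem, error) {
	switch len(value) {
	case addressLen:
		return pullItem{Address: value[:addressLen]}, nil
	case addressLen + 4:
		tag := binary.BigEndian.Uint32(value[addressLen:])
		return pullItem{Address: value[:addressLen], Tag: tag}, nil
	default:
		return pullItem{}, errors.New("unexpected value length")
	}
}

func main() {
	old := make([]byte, addressLen) // a value written before the Tag field existed
	it, err := decodeValue(old)
	fmt.Println(it.Tag, err)
}
```

With a codec of this shape, only the index name has to change on upgrade; the stored values themselves can be left as they are.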

storage/localstore/mode_set_test.go

acud · 2019-11-04T08:11:36Z

There is a data race in a stream test which is showing up here. I will fix it, but I think it is orthogonal to this PR and that this can still be reviewed.

janos

Only a few minor comments. I am not approving since it is still marked as in progress.

storage/localstore/migration_test.go

storage/localstore/migration.go

zelig

This PR is addressing 1) persistence, and 2) changing indexes.
As for 1), I believe we may need more careful checkpoints as per the roundtable discussion, no?
As for 2), the synced state should not increment for pull sync, only the sent state.

chunk/tags.go

storage/localstore/mode_set.go

swarm.go

storage/localstore/mode_set.go

acud · 2019-11-05T07:51:00Z

> This PR is addressing 1) persistence, and 2) changing indexes.
> as for 1) i believe we may need more careful checkpoints as per roundtable discussion, no?

TL;DR: our shutdown sequence does not guarantee that all goroutines related to push and pull sync are shut down before we persist tags. This has to be mended before we implement a dirty flag.

Yes, but as per my discussion with @janos it is not possible to implement the dirty flag right now, since we have no guarantee that, once the dirty flag is unset (on persist), no other goroutine is going to do a tag increment. For example, on pushsync shutdown there is currently no guarantee that additional goroutines will not do further Set operations or tag increments. This will, in turn, very probably create a situation where the tags are constantly dirty on shutdown, so on startup they will always be ignored and a new tag object created instead.

> as for 2) synced state should not increment for pull sync only sent state.

ok
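
The shutdown-ordering concern in this comment can be sketched as follows. This is an illustration only, not the Swarm shutdown code: the `tags` type, its `dirty` flag and the `shutdown` and `run` helpers are hypothetical. The point it shows is that a dirty flag only stays clear if every goroutine that can increment a tag has returned before the counters are persisted.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// tags is a hypothetical stand-in for the in-memory tag counters.
type tags struct {
	sent  int64
	dirty int32 // 1 if a counter changed since the last persist
}

// inc bumps the counter and marks the tags dirty.
func (t *tags) inc() {
	atomic.AddInt64(&t.sent, 1)
	atomic.StoreInt32(&t.dirty, 1)
}

// persist pretends to write the counters to disk and clears the dirty flag.
func (t *tags) persist() int64 {
	atomic.StoreInt32(&t.dirty, 0)
	return atomic.LoadInt64(&t.sent)
}

// shutdown signals the workers, waits for all of them to return and only
// then persists, so no increment can land after the flag is cleared.
func shutdown(quit chan struct{}, wg *sync.WaitGroup, t *tags) int64 {
	close(quit)
	wg.Wait()
	return t.persist()
}

// run starts workers that increment until told to quit, then shuts down.
// It returns the persisted count, the final count and the dirty state.
func run(workers int) (persisted, final int64, dirty bool) {
	t := &tags{}
	quit := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-quit:
					return
				default:
					t.inc()
				}
			}
		}()
	}
	persisted = shutdown(quit, &wg, t)
	return persisted, atomic.LoadInt64(&t.sent), atomic.LoadInt32(&t.dirty) == 1
}

func main() {
	persisted, final, dirty := run(4)
	fmt.Println(persisted == final, dirty)
}
```

If `persist` were called before `wg.Wait()`, a late `inc` could set the flag again after it was cleared, which is the "constantly dirty on shutdown" situation described above.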

janos · 2019-11-11T09:04:28Z

The code cannot be compiled:

storage/localstore/migration.go:117:23: not enough arguments in call to db.pushIndex.Iterate
	have (func(shed.Item) (bool, error))
	want (shed.IndexIterFunc, *shed.IterateOptions)
storage/localstore/migration.go:120:5: continue is not in a loop
storage/localstore/migration.go:128:3: missing return at end of function
storage/localstore/migration.go:134:4: too many arguments to return
	have (nil, error)
	want (error)

storage/localstore/localstore.go

storage/localstore/migration.go

janos · 2019-11-11T11:59:43Z

storage/localstore/migration.go

+		for i := 0; i < len(migrations)-1; i++ {
+			err := migrations[i].migrationFunc(db)
+			if err != nil {
+				return err
+			}
+			if i != len(migrations)-1 {
+				err = db.schemaName.Put(migrations[i+1].name) // put the name of the next schema


Why are we never running migrations for the last element? But also setting the schema of the last migration.

When any migration is done, its schema should be stored in schemaName field, so that if any intermediate migration fails, the migration can be restarted after the last successful one.

> Why are we never running migrations for the last element? But also setting the schema of the last migration.

If you look at the definition of the migration struct:

type migration struct {
	name          string             // name of the schema
	migrationFunc func(db *DB) error // the migration function that needs to be performed in order to get to the NEXT schema name
}

Maybe this is not intuitive enough. I can change it so that migrationFunc leads to the current schema name, not the next.

> When any migration is done, its schema should be stored in schemaName field, so that if any intermediate migration fails, the migration can be restarted after the last successful one.

That is actually what happens, but yeah, I guess this code is a bit overly convoluted. I will refactor this.
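
The alternative discussed in this thread, where each migration function leads to its own schema name and progress is recorded after every successful step, could look roughly like the sketch below. It is not the code that was merged: the `db` type with an in-memory `schemaName`, the `fn` field name and the lowercase schema strings are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// db is a hypothetical stand-in for the store; schemaName mimics the
// persisted schema name field.
type db struct {
	schemaName string
}

// migration describes how to reach the schema called name from the
// schema that precedes it in the list.
type migration struct {
	name string
	fn   func(*db) error
}

var errUnknownSchema = errors.New("unknown schema name")

// migrate runs every migration listed after the stored schema name and
// records each name as soon as its function succeeds, so a failed run
// can be restarted after the last successful step.
func migrate(d *db, migrations []migration) error {
	start := -1
	for i, m := range migrations {
		if m.name == d.schemaName {
			start = i
			break
		}
	}
	if start < 0 {
		return errUnknownSchema
	}
	for _, m := range migrations[start+1:] {
		if err := m.fn(d); err != nil {
			return err
		}
		d.schemaName = m.name // record progress before moving on
	}
	return nil
}

func main() {
	d := &db{schemaName: "sanctuary"}
	migrations := []migration{
		{name: "sanctuary"}, // base schema, nothing to run
		{name: "diwali", fn: func(*db) error { return nil }},
	}
	err := migrate(d, migrations)
	fmt.Println(d.schemaName, err)
}
```

With this shape the last element is executed like any other, and the first element acts only as the starting point, which avoids the off-by-one reading that prompted the question.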

storage/localstore/migration.go

janos · 2019-11-11T12:37:00Z

storage/localstore/schema.go

+
+// allDbSchemaMigrations contains an ordered list of the database schemes, that is
+// in order to run data migrations in the correct sequence
+var allDbSchemaMigrations = []migration{


Is the "all" prefix needed? There is only one migrations slice. Can it just be called migrations?

janos · 2019-11-11T12:39:21Z

storage/localstore/migration_test.go

+			return nil
+		}},
+		{name: DbSchemaDiwali, migrationFunc: func(db *DB) error {
+			shouldNotRun = true // this should not be executed


This relates to my comment on the migrate function. I do not see why the last migration should not be executed.

janos · 2019-11-11T12:43:38Z

storage/localstore/mode_set_test.go

-					if err != nil {
-						t.Fatal(err)
-					}
+	tag.Inc(chunk.StateStored) // so we don't get an error on tag.Status later on


If this is needed in the test, should it be the responsibility of db.Put to increment StateStored if it finds the tag?

No, StateStored is incremented in hasherstore. Remove this line and you will see the error that is raised. This must be called in the test because of the tag.Status logic. Please review that function to understand why this is called.

Thanks for the explanation.

storage/localstore/mode_set_test.go

storage/localstore/mode_set.go

janos

Thanks for addressing my comments. LGTM.

janos · 2019-11-12T09:48:31Z

storage/localstore/schema.go

@@ -8,7 +24,7 @@ import (

 // The DB schema we want to use. The actual/current DB schema might differ
 // until migrations are run.
-const CurrentDbSchema = DbSchemaSanctuary
+var DbSchemaCurrent = DbSchemaDiwali


I may be wrong, but I do not think that I added schema name variables. I used DB in shed.

The convention usually cited is the one on initialisms: https://github.com/golang/go/wiki/CodeReviewComments#initialisms. I also think that Uid should be UID and so on, but I missed reviewing those pull requests.

The uppercase argument does not hold, as the convention is about consistent case: DB can be db at the start of an unexported identifier or on its own.

In any case, we have so many inconsistencies in the code that this does not matter at all.

janos · 2019-11-12T09:49:43Z

storage/localstore/mode_set_test.go

-					if err != nil {
-						t.Fatal(err)
-					}
+	tag.Inc(chunk.StateStored) // so we don't get an error on tag.Status later on


Thanks for the explanation.

janos · 2019-11-13T11:24:49Z

storage/localstore/migration_test.go

+	}
+	defer os.RemoveAll(tmpdir)
+
+	cdir, err := os.Getwd()


This is not needed. Tests always run with the package directory as the working directory, so a relative path for dir is fine.

janos · 2019-11-14T13:15:51Z

storage/localstore/migration_test.go

+		return err
+	}
+	defer func() {
+		err = out.Close()


This error will overwrite any returned one. It is possible that Copy or Sync returns a non-nil error and that err is then set to nil by the Close in this defer.

janos · 2019-11-14T13:17:50Z

storage/localstore/migration_test.go

+	if _, err = io.Copy(out, in); err != nil {
+		return err
+	}
+	err = out.Sync()


This error is never returned.
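
The two review comments above, about the deferred Close overwriting an earlier error and the Sync error being dropped, can both be addressed with a named return value. The following is a sketch under assumptions, not the helper from the PR: `copyFile` and `roundTrip` are names chosen for the example, and the real test helper may differ.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyFile copies src to dst and returns the first error from Copy, Sync
// or Close. The named result lets the deferred Close report its error
// without hiding an earlier one.
func copyFile(src, dst string) (err error) {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close() // read-only handle, a Close error is not actionable here

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer func() {
		// Keep an earlier error; surface the Close error only if
		// nothing has failed before.
		if cerr := out.Close(); err == nil {
			err = cerr
		}
	}()

	if _, err = io.Copy(out, in); err != nil {
		return err
	}
	return out.Sync() // the Sync error is returned, not discarded
}

// roundTrip writes content to a temporary file, copies it with copyFile
// and reads the copy back.
func roundTrip(content string) (string, error) {
	dir, err := os.MkdirTemp("", "copy-example")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "src")
	dst := filepath.Join(dir, "dst")
	if err := os.WriteFile(src, []byte(content), 0o600); err != nil {
		return "", err
	}
	if err := copyFile(src, dst); err != nil {
		return "", err
	}
	data, err := os.ReadFile(dst)
	return string(data), err
}

func main() {
	got, err := roundTrip("fixture")
	fmt.Println(got, err)
}
```

The deferred function only assigns to `err` when it is still nil, so an error from Copy or Sync always takes precedence over one from Close.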

acud added the cleanup code completion, add comments and more label Oct 29, 2019

acud added this to the 0.5.3 milestone Oct 29, 2019

acud self-assigned this Oct 29, 2019

acud added this to Backlog in Swarm Core - Sprint planning via automation Oct 29, 2019

acud requested review from janos and zelig October 29, 2019 11:20

acud added the in progress label Oct 29, 2019

acud moved this from Backlog to In progress in Swarm Core - Sprint planning Oct 29, 2019

acud force-pushed the pushpullidx branch 3 times, most recently from a1c61b7 to 45c72e1 Compare November 1, 2019 08:46

acud commented Nov 1, 2019

acud force-pushed the pushpullidx branch from 45c72e1 to 1adeaa2 Compare November 1, 2019 09:45

acud changed the base branch from master to compare-pushpullpr November 1, 2019 11:09

acud changed the base branch from compare-pushpullpr to master November 1, 2019 11:46

acud changed the title ~~localstore: add tags to pullsync index and mend tag increment logic~~ chunk, localstore: decouple push and pull indexes, add tag checkpoint persistence Nov 1, 2019

acud changed the title ~~chunk, localstore: decouple push and pull indexes, add tag checkpoint persistence~~ chunk, localstore: decouple push/pull indexes, add tag checkpoint persistence Nov 1, 2019

acud force-pushed the pushpullidx branch from 8cd90fd to 57adeaf Compare November 1, 2019 17:17

acud added ready for review and removed in progress labels Nov 2, 2019

acud force-pushed the pushpullidx branch from 57adeaf to 30de985 Compare November 2, 2019 05:48

acud force-pushed the pushpullidx branch from 30de985 to c8cf2bc Compare November 4, 2019 08:52

acud moved this from In progress to In review (includes Documentation) in Swarm Core - Sprint planning Nov 4, 2019

acud added in progress and removed ready for review labels Nov 4, 2019

janos reviewed Nov 4, 2019

zelig suggested changes Nov 4, 2019

acud force-pushed the pushpullidx branch from 468c7f2 to f37504e Compare November 7, 2019 09:53

localstore: address PR comment

4ac0379

acud requested a review from zelig November 8, 2019 16:27

localstore: add edge case of diwali migration

8a56fba

acud force-pushed the pushpullidx branch from 0d17178 to 8a56fba Compare November 11, 2019 10:19

janos suggested changes Nov 11, 2019

localstore: address PR comments

dc13463

acud force-pushed the pushpullidx branch from 269f839 to b1fb2a7 Compare November 12, 2019 08:35

localstore: simplify tests

7165035

acud force-pushed the pushpullidx branch from b1fb2a7 to 7165035 Compare November 12, 2019 08:47

janos approved these changes Nov 12, 2019

zelig approved these changes Nov 13, 2019

localstore: fix tag lookup to search for only tags which are > 0

6cb7413

acud added do-not-merge and removed ready for another review labels Nov 13, 2019

acud added 3 commits November 13, 2019 15:55

localstore: fix migrations

9c65400

localstore: add test fixture

5ee686c

localstore: add e2e migration test with fixture for sanctuary

2f3cf27

acud force-pushed the pushpullidx branch from c77627b to 2f3cf27 Compare November 13, 2019 11:12

janos reviewed Nov 13, 2019

acud added 3 commits November 14, 2019 16:53

localstore: address PR comment

c35ba19

Merge branch 'master' into pushpullidx

8531dfb

localstore: fix return values

c36e5d7

janos reviewed Nov 14, 2019

localstore: fix error handling in copy fn

33bcc27

acud force-pushed the pushpullidx branch from 44c3831 to 33bcc27 Compare November 14, 2019 13:42

acud removed the do-not-merge label Nov 14, 2019

acud merged commit 3ce77cb into master Nov 14, 2019

Swarm Core - Sprint planning automation moved this from In review (includes Documentation) to Done Nov 14, 2019

acud deleted the pushpullidx branch November 14, 2019 16:14
