Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ern: Add DDEX ERN Indexer #17

Merged
merged 9 commits into from
Sep 23, 2017
Merged

ern: Add DDEX ERN Indexer #17

merged 9 commits into from
Sep 23, 2017

Conversation

lmars
Copy link
Member

@lmars lmars commented Sep 20, 2017

Summary:

Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
@@ -253,14 +260,21 @@ func (m *Main) RunMusicBrainzConvert(args Args) error {
<-ch
log.Info("received signal, exiting...")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this signal trap actually do? I seems to perform no action. Should there be a rollback?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It traps the signal and then cancels the context which can then be handled gracefully by the converter.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the defer of course. I missed that.

cmd/meta/main.go Outdated

// run the converter in a goroutine
stream := make(chan *cid.Cid)
errC := make(chan error, 1)
go func() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this in a go-routine? Doesn't the ConvertArtists just run until completion and return the rows?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConvertArtists sends CIDs to a channel as it converts them, when it returns there is no guarantee that the goroutine reading those CIDs has finished, so we run the conversion in a goroutine and only exit the program when the stream has been fully read below.

I've clarified in c8e62cc.

cmd/meta/main.go Outdated

// run the converter in a goroutine
stream := make(chan *cid.Cid)
errC := make(chan error, 1)
go func() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question as for musicbrainz above.

cid, err := cid.Parse(s.Text())
if err != nil {
log.Error("error parsing cid", "value", s.Text(), "err", err)
return
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why return and not continue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a corrupt entry in the stream then we stop processing because it is an exceptional case (there should be no corrupt entries in a stream).

defer cancel()
ch := make(chan os.Signal, 1)
signal.Notify(ch, syscall.SIGINT, syscall.SIGTERM)
<-ch
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question as for musicbrainz above

// Indexer is a META indexer which indexes a stream of META objects
// representing DDEX ERNs into a SQLite3 database, getting the
// associated META objects from a META store.
type Indexer struct {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected to find an interface for Indexers somewhere, but can't. Are there Indexers totally disassociated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than being META indexers, they are all slightly different at the moment, I don't think there is need for an interface.

Copy link

@nolash nolash Sep 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are all slightly different at the moment

My intuition tells me that sounds quite like an interface. We can wait and see, though, and see with time what the generics will look like.

}
if err := i.indexProperty(ern.Cid(), id, indexFn); err != nil {
return err
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if some of the indexing functions already have been run when this fails and returns? Should the indexer entries in the database roll back?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan is that either indexing succeeds and the whole change to the DB is uploaded to Swarm, or indexing fails and nothing happens (so this whole process will eventually be wrapped in a "transaction" of sorts).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the indexing (sqlite) db is kind of a private cache within the indexer node itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so I'm currently imagining doing:

$ swarm get index.db
$ meta ern index index.db
$ swarm put index.db
$ geth --exec '<update-ens-record>'

With error checking in-between.

ern/index.go Outdated
for _, field := range []string{"ISRC", "CatalogNumber", "ProprietaryId"} {
if v, err := graph.Get("SoundRecordingId", field, "@value"); err == nil {
ids = append(ids, v.(string))
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors are now being checked, see 25020a5.

ern/index.go Outdated
if v, err := graph.Get("ReferenceTitle", "TitleText", "@value"); err == nil {
title = v.(string)
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the error

}

// check SoundRecording objects were indexed
for isrc, title := range map[string]string{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Judging from the code, an id for a SoundRecording can be ISRC,Catalognumber and ProprietaryId. Maybe test for all?

It also seems to only need either "SoundRecordingId" or ReferenceTitle. Maybe the test should also account for that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot, so I'm actually keen to run this on real ERNs before writing too much code around this (I actually don't think CatalogNumber and ProprietaryId are used so we may end up dropping them), so I'm going to leave the tests like this.

Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
@lmars lmars merged commit bb227a5 into master Sep 23, 2017
@lmars lmars deleted the ddex-ern-indexer branch September 23, 2017 21:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants