Duplicate relations after importing aliases #81

Rizziepit · 2014-11-04T09:32:56Z

No description provided.

pudo · 2014-11-20T10:30:36Z

This is -- to some extent -- the code that causes it:

https://github.com/granoproject/grano/blob/master/grano/logic/entities.py#L136

The question is how that code should decide when to delete duplicate links, because it may need to consider more than just source and target. The only fully logical solution I can see is to load all entities first, then de-dupe them, and only then load relations. But that would be a major refactor.
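To make the "entities first, then de-dupe, then relations" ordering concrete, here is a minimal sketch. It is not grano's code: the row dictionaries, the `aliases` key, and the `load_deduplicated` and `relation_key` names are all hypothetical and only illustrate the idea. The `relation_key` callback is where "more than just source and target" would be decided.

```python
def load_deduplicated(entity_rows, relation_rows, relation_key):
    """Resolve aliases before relations are loaded (hypothetical data shapes).

    entity_rows:   dicts with a "name" and an optional "aliases" list
    relation_rows: dicts with at least "source" and "target" names
    relation_key:  function returning the tuple that makes a relation unique
    """
    # Phase 1: map every name and alias to one canonical entity name.
    canonical = {}
    for row in entity_rows:
        for name in [row["name"], *row.get("aliases", [])]:
            canonical.setdefault(name, row["name"])

    # Phase 2: rewrite relation endpoints to canonical names, then keep
    # only the first relation seen for each uniqueness key.
    unique = {}
    for row in relation_rows:
        resolved = {
            **row,
            "source": canonical.get(row["source"], row["source"]),
            "target": canonical.get(row["target"], row["target"]),
        }
        unique.setdefault(relation_key(resolved), resolved)

    return sorted(set(canonical.values())), list(unique.values())


entities, relations = load_deduplicated(
    [{"name": "ACME Ltd", "aliases": ["ACME"]}, {"name": "Jane Doe"}],
    [
        {"source": "Jane Doe", "target": "ACME", "type": "director"},
        {"source": "Jane Doe", "target": "ACME Ltd", "type": "director"},
        {"source": "Jane Doe", "target": "ACME Ltd", "type": "shareholder"},
    ],
    relation_key=lambda r: (r["source"], r["target"], r["type"]),
)
```

Because the two "director" rows only differ by alias, they collapse into one once the endpoints are resolved, while the "shareholder" row survives because the key includes `type`.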

Rizziepit · 2014-11-20T10:42:19Z

Would it not be possible to merge relations based on the uniqueness constraints in the schemata?

pudo · 2014-11-20T10:46:16Z

Hm, but the uniqueness constraints aren't actually in the schema; they're in the loaders. Which may be a problem anyway: if the schema knew about de-dupe, we could just POST whole objects without checking for them first, which would halve the number of HTTP requests we need to do to load a dataset.

Rizziepit · 2014-11-20T10:50:49Z

I was thinking of something along the lines of a grano command that takes the schema file as an argument and de-dupes the relations.

What are good reasons for keeping grano ignorant of uniqueness constraints? Simpler code?

pudo · 2014-11-20T10:54:05Z

Well, there could be different uniqueness constraints for different data sources, but that actually seems more like a bug now that I think of it.

Rizziepit · 2014-11-20T11:03:38Z

Perhaps for now I can add a relation de-duping command to granoloader. It should be able to merge relations efficiently enough by paging through relations ordered by their unique fields, so that duplicates always appear next to each other.
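A rough sketch of that merge step, assuming the relations arrive already sorted by the uniqueness key (for example, page by page from the API). The dictionary layout (`id`, `schema`, `source`, `target`, `properties`) and the function names are assumptions for illustration, not granoloader's actual interface.

```python
from itertools import groupby


def unique_key(relation, unique_fields):
    """Tuple that identifies duplicates: endpoints, schema, unique properties."""
    props = relation["properties"]
    return (
        relation["schema"],
        relation["source"],
        relation["target"],
    ) + tuple(props.get(field) for field in unique_fields)


def merge_sorted_relations(relations, unique_fields):
    """Yield (merged_survivor, duplicates) per run of equal keys.

    Only adjacent relations are compared, so the input must already be
    ordered by the same key; memory use stays bounded by the run length.
    """
    for _, run in groupby(relations, key=lambda r: unique_key(r, unique_fields)):
        survivor, *duplicates = run
        merged = dict(survivor["properties"])
        for dup in duplicates:
            # Keep the survivor's values; only fill in properties it lacks.
            for name, value in dup["properties"].items():
                merged.setdefault(name, value)
        yield {**survivor, "properties": merged}, duplicates


sample = [
    {"id": 1, "schema": "membership", "source": "a", "target": "b",
     "properties": {"role": "chair"}},
    {"id": 2, "schema": "membership", "source": "a", "target": "b",
     "properties": {"role": "chair", "since": "2010"}},
    {"id": 3, "schema": "membership", "source": "a", "target": "c",
     "properties": {"role": "member"}},
]
merged_runs = list(merge_sorted_relations(sample, ["role"]))
```

The caller would then update each survivor and delete the relations listed as duplicates. Which property value wins on conflict is a policy choice; this sketch simply keeps the first one seen.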

Rizziepit changed the title ~~Creation of duplicate relations in loader~~ Duplicate relations after importing aliases Nov 10, 2014
