Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated append method to keep an "_end" hint. Purpose is to address … #612

Closed
wants to merge 1 commit into from
Closed

Conversation

hsolbrig
Copy link
Contributor

…performance problems for large lists

@coveralls
Copy link

Coverage Status

Coverage increased (+0.005%) to 63.991% when pulling 4e41256 on hsolbrig:master into d7013fb on RDFLib:master.

container = rest
graph.add((container, RDF.first, item))
graph.add((container, RDF.rest, RDF.nil))
if graph.value(self._end, RDF.rest) == RDF.nil:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer (self._end, RDF.rest, RDF.nil) in graph, which may also be slightly faster?

@gromgull
Copy link
Member

My gut feeling tells me there are some "cache invalidation" issues here, but since the code actually checks if (_end, rest, NIL) is in the graph the only thing I can think of is:

  • Make a list, the Collection's 'end' points to the end.
  • Make another list, move (outside the collection api) the tail of the first list to the end of this.
  • Now the first Collection has end pointing to the end of the wrong list.

BUT - anyone who does this probably deserves everything coming to them :)

Still, it would be a nightmarish bug to debug.

@gromgull
Copy link
Member

Actually, thinking about it - another, easier way to mess up is to create two objects for managing the same collection.

This raises the question - why don't we have a utility function on the graph objects for creating collections? Like we do for Resource? Then that could have had an internal registry of the collection objects and returned the same if requested again.

@joernhees joernhees added this to the rdflib 4.2.2 milestone Mar 20, 2016
@joernhees
Copy link
Member

PR for #609

@joernhees
Copy link
Member

yupp, i see the same issues... #609 was about the constructor, while this PR in general tries to improve subsequent append calls by giving Collection a state, which might cause problems later on.

What about giving append an optional arg _end? It would be a private arg to indicate that its use is potentially dangerous if you don't know what you're doing. If set it, append would trust it to be the end-pointer and we could use it to provide an O(n) extend method, which we could also use in the ctor?

@gromgull
Copy link
Member

As detailed in the ticket, I fixed this with a safer addition of an __iadd__ method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants