Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #837 #1068

Closed
wants to merge 14 commits into from
Closed

Fixes #837 #1068

wants to merge 14 commits into from

Conversation

nilay77
Copy link

@nilay77 nilay77 commented May 25, 2020

  • Fixed Graph.subjects, Graph.predicates, and Graph.objects to return only unique values.
  • Added test case for the above 3 methods

@coveralls
Copy link

coveralls commented May 25, 2020

Coverage Status

Coverage decreased (-0.01%) to 75.752% when pulling 8cbc7ff on mehak16163:Issue837 into 037ea51 on RDFLib:master.

@nilay77 nilay77 changed the title Fixes #837 Fixes RDFLib#837 May 26, 2020
@nilay77 nilay77 changed the title Fixes RDFLib#837 Fixes #837 May 26, 2020
rdflib/graph.py Outdated
@@ -617,18 +617,27 @@ def set(self, triple):

def subjects(self, predicate=None, object=None):
"""A generator of subjects with the given predicate and object"""
subs = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use sets for keeping track of seen (lists have O(n) lookup time).

Suggested change
subs = []
subs = set()

(And subsequently use subs.add(s)).

Copy link
Author

@nilay77 nilay77 May 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@niklasl Oh yes, this makes sense. I'll update this. Are any other changes required?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@niklasl I have pushed the suggested changes.

@tgbugs
Copy link
Contributor

tgbugs commented May 27, 2020

This cannot go into core as is because the memory usage will explode for large graphs. As mentioned in the discussion of #837 this is something that probably has to be implemented on the stores or under a different name, or with a different set of arguments e.g. def subjects(self, predicate=None, object=None, unique=False) or def subjects_unique(self, predicate=None, object=None).

@nilay77
Copy link
Author

nilay77 commented May 27, 2020

@tgbugs I have added a parameter named unique to the current methods as suggested by you. I have also added exception handling code which suggests setting the parameter unique to False in case of a MemoryError. If you could give me an idea about how to produce a memory error I can even add a test case.

@nilay77 nilay77 mentioned this pull request May 27, 2020
@nilay77 nilay77 requested a review from niklasl May 29, 2020 05:54
def subjects(self, predicate=None, object=None, unique=False):
"""A generator of unique subjects with the given predicate and object"""

if(unique==False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to if not unique:

def predicates(self, subject=None, object=None, unique=False):
"""A generator of unique predicates with the given subject and object"""

if(unique==False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to if not unique:

def objects(self, subject=None, predicate=None, unique=False):
"""A generator of unique objects with the given subject and predicate"""

if(unique==False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to if not unique:

Copy link
Member

@nicholascar nicholascar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Please change the 3 x if statements in the code to use Python style, as indicated by comments

  2. You've not provided any tests for different outcomes based on whether or not unique is set. You need a test for non-unique triples (unique not set) v. unique triples (unique set).

(you may have fun implementing a test to show non-unique values since the default store for Graph() auto-deduplicates triples anyway!)

@nicholascar
Copy link
Member

Closing this PR as the proponent hasn't addressed comments in a couple of months. @nilay77 please reopen if you want to progress this, but you will need to address comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants