-
Notifications
You must be signed in to change notification settings - Fork 542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #837 #1068
Fixes #837 #1068
Conversation
nilay77
commented
May 25, 2020
- Fixed Graph.subjects, Graph.predicates, and Graph.objects to return only unique values.
- Added test case for the above 3 methods
rdflib/graph.py
Outdated
@@ -617,18 +617,27 @@ def set(self, triple): | |||
|
|||
def subjects(self, predicate=None, object=None): | |||
"""A generator of subjects with the given predicate and object""" | |||
subs = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use sets for keeping track of seen (lists have O(n) lookup time).
subs = [] | |
subs = set() |
(And subsequently use subs.add(s)
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@niklasl Oh yes, this makes sense. I'll update this. Are any other changes required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@niklasl I have pushed the suggested changes.
This cannot go into core as is because the memory usage will explode for large graphs. As mentioned in the discussion of #837 this is something that probably has to be implemented on the stores or under a different name, or with a different set of arguments e.g. |
@tgbugs I have added a parameter named unique to the current methods as suggested by you. I have also added exception handling code which suggests setting the parameter unique to False in case of a MemoryError. If you could give me an idea about how to produce a memory error I can even add a test case. |
def subjects(self, predicate=None, object=None, unique=False): | ||
"""A generator of unique subjects with the given predicate and object""" | ||
|
||
if(unique==False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to if not unique:
def predicates(self, subject=None, object=None, unique=False): | ||
"""A generator of unique predicates with the given subject and object""" | ||
|
||
if(unique==False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to if not unique:
def objects(self, subject=None, predicate=None, unique=False): | ||
"""A generator of unique objects with the given subject and predicate""" | ||
|
||
if(unique==False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to if not unique:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Please change the 3 x if statements in the code to use Python style, as indicated by comments
-
You've not provided any tests for different outcomes based on whether or not
unique
is set. You need a test for non-unique triples (unique
not set) v. unique triples (unique
set).
(you may have fun implementing a test to show non-unique values since the default store for Graph()
auto-deduplicates triples anyway!)
Closing this PR as the proponent hasn't addressed comments in a couple of months. @nilay77 please reopen if you want to progress this, but you will need to address comments. |