New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-14012: Add helper container class supporting topologically sorted iteration over elements #29
Conversation
python/lsst/daf/butler/core/utils.py
Outdated
@@ -203,3 +203,87 @@ def __call__(cls): # noqa N805 | |||
if cls not in cls._instances: | |||
cls._instances[cls] = super(Singleton, cls).__call__() | |||
return cls._instances[cls] | |||
|
|||
|
|||
class ConnectedSet: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need a __len__
to make it properly look like a collections.Set
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably. I'll add it. And perhaps some of the other methods too. But I'm not sure supporting anything other than iteration, length and perhaps union
is worth the effort at the moment.
Oh, and what do we think about naming?
|
3bc5d6d
to
a5f96f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some things to consider, and maybe try the algorithm we talked about in person
python/lsst/daf/butler/core/utils.py
Outdated
""" | ||
for element in (sourceElement, targetElement): | ||
if element not in self._elements: | ||
raise KeyError('{} not in set'.format(element)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe make this:
missing = {sourceElement, TargetElement} - self._elements if missing: raise KeyError('element(s) {x} not in set'.format(x=missing))
python/lsst/daf/butler/core/utils.py
Outdated
""" | ||
def __init__(self, elements): | ||
self._elements = set(elements) | ||
self._connections = {e: set() for e in elements} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably should iterate over self._elements, as it is not guaranteed all the entries in the elements iterable will be unique. While this behaviorally wont make any difference, it will save you time on duplicate iteration and construction time on sets that will just be discarded the next time a duplicate e is encountered.
python/lsst/daf/butler/core/utils.py
Outdated
if element not in self._elements: | ||
raise KeyError('{} not in set'.format(element)) | ||
if element in self._connections[sourceElement]: | ||
raise ValueError('Dupplicate connection: ({}, {})'.format(sourceElement, targetElement)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really care that there is a duplicate connection, vs just making it a no-op? if the user just says I want these connected, why should they need a try except block to catch something when the result of the operation they are asking for is done. It seems similar to a caching function.
python/lsst/daf/butler/core/utils.py
Outdated
_nConnections = self._nConnections | ||
# Set of all nodes with no incoming connections | ||
startPoints = set(e for (e, n) in _nIncommingConnections.items() if n == 0) | ||
while len(startPoints) > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just have this be:
while startPoints:
an empty set is false-y
python/lsst/daf/butler/core/utils.py
Outdated
When either element is not already in the set. | ||
ValueError | ||
When a dupplicate connection is inserted. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should add a check to make sure that sourceElement != targetElement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I guess you would pick that up in your cycle check, but it is probably better to stop the user shooting themselves in the foot by accident
4a66ae3
to
78e5dc3
Compare
Took your first suggestion (to use DFS instead to have less jumping all over the place) and your |
""" | ||
if sourceElement == targetElement: | ||
raise ValueError('Cannot connect {} to itself'.format(sourceElement)) | ||
for element in (sourceElement, targetElement): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think:
missing = {sourceElement, TargetElement} - self._elements if missing: raise KeyError('element(s) {x} not in set'.format(x=missing))
Would be better than a for loop. It would also have the benefit of telling the user if point points are invalid rather than bailing at the first
78e5dc3
to
5ec27b8
Compare
5ec27b8
to
9ee798a
Compare
Undoubtedly not optimally efficient, and perhaps wrongly named, but at least useful.