Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-14012: Add helper container class supporting topologically sorted iteration over elements #29

Merged
merged 1 commit into from Apr 19, 2018

Conversation

pschella
Copy link
Collaborator

@pschella pschella commented Apr 5, 2018

Undoubtedly not optimally efficient, and perhaps wrongly named, but at least useful.

@@ -203,3 +203,87 @@ def __call__(cls): # noqa N805
if cls not in cls._instances:
cls._instances[cls] = super(Singleton, cls).__call__()
return cls._instances[cls]


class ConnectedSet:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need a __len__ to make it properly look like a collections.Set?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. I'll add it. And perhaps some of the other methods too. But I'm not sure supporting anything other than iteration, length and perhaps union is worth the effort at the moment.

@pschella
Copy link
Collaborator Author

pschella commented Apr 6, 2018

Oh, and what do we think about naming?

  • TopologicalSet?
  • ConnectedSet?
  • ...

Copy link
Contributor

@natelust natelust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some things to consider, and maybe try the algorithm we talked about in person

"""
for element in (sourceElement, targetElement):
if element not in self._elements:
raise KeyError('{} not in set'.format(element))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe make this:
missing = {sourceElement, TargetElement} - self._elements if missing: raise KeyError('element(s) {x} not in set'.format(x=missing))

"""
def __init__(self, elements):
self._elements = set(elements)
self._connections = {e: set() for e in elements}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably should iterate over self._elements, as it is not guaranteed all the entries in the elements iterable will be unique. While this behaviorally wont make any difference, it will save you time on duplicate iteration and construction time on sets that will just be discarded the next time a duplicate e is encountered.

if element not in self._elements:
raise KeyError('{} not in set'.format(element))
if element in self._connections[sourceElement]:
raise ValueError('Dupplicate connection: ({}, {})'.format(sourceElement, targetElement))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really care that there is a duplicate connection, vs just making it a no-op? if the user just says I want these connected, why should they need a try except block to catch something when the result of the operation they are asking for is done. It seems similar to a caching function.

_nConnections = self._nConnections
# Set of all nodes with no incoming connections
startPoints = set(e for (e, n) in _nIncommingConnections.items() if n == 0)
while len(startPoints) > 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just have this be:
while startPoints:
an empty set is false-y

When either element is not already in the set.
ValueError
When a dupplicate connection is inserted.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add a check to make sure that sourceElement != targetElement

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though I guess you would pick that up in your cycle check, but it is probably better to stop the user shooting themselves in the foot by accident

@pschella pschella force-pushed the tickets/DM-14012 branch 3 times, most recently from 4a66ae3 to 78e5dc3 Compare April 18, 2018 20:34
@pschella
Copy link
Collaborator Author

Took your first suggestion (to use DFS instead to have less jumping all over the place) and your namedtuple suggestion (which I don't really like normally, but works well here).
If you have some suggestion to get rid of the need for both seen and finished that would be great, but I don't want to spend too much time on it now.

"""
if sourceElement == targetElement:
raise ValueError('Cannot connect {} to itself'.format(sourceElement))
for element in (sourceElement, targetElement):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think:
missing = {sourceElement, TargetElement} - self._elements if missing: raise KeyError('element(s) {x} not in set'.format(x=missing))
Would be better than a for loop. It would also have the benefit of telling the user if point points are invalid rather than bailing at the first

@pschella pschella merged commit 52a3350 into master Apr 19, 2018
@ktlim ktlim deleted the tickets/DM-14012 branch August 25, 2018 06:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants