Add consecutive_groups() #169

Merged
merged 1 commit into from
Nov 24, 2017

@bbayles bbayles commented Nov 22, 2017

This PR adds consecutive_groups(), a tool for finding consecutive blocks of items in an iterable. This is an extension of the example in the old itertools docs.


I've always admired the trick offered in the example, which is excerpted here:

>>> data = [ 1,  4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
>>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
...     print map(operator.itemgetter(1), g)
... 
[1]
[4, 5, 6]
[10]
[15, 16, 17, 18]
[22]
[25, 26, 27, 28]

This works because enumerate counts up by 1 for each item in the iterable (0, 1, 2, 3...), so if we have a run of numbers (20, 21, 22, 23), we'll wind up with a constant difference between each item in the run and its index (20 - 0 = 20, 21 - 1 = 20, 22 - 2 = 20, ...). The excerpt computes the difference the other way around (i - x), which is just as constant within a run. That constant difference is what we use as the grouping key.
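The excerpt above is Python 2 code (a tuple-unpacking lambda and the print statement), so it won't run as-is on Python 3. Here is a minimal Python 3 sketch of the same trick. The helper name `runs` is made up for illustration and is not part of this PR.

```python
from itertools import groupby


def runs(data):
    """Yield lists of consecutive integers found in data (hypothetical helper)."""
    # Within a run of consecutive values, index - value stays constant,
    # so groupby() puts the whole run into one group.
    for _, group in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1]):
        yield [value for _, value in group]


data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]
result = list(runs(data))
for run in result:
    print(run)
```

This prints the same six groups as the original example.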


The example shows how to work with iterables of numbers, but the same method can be applied to any sequence with a clear ordering. I've used this for finding consecutive IP addresses, for example.

>>> from socket import inet_aton
>>> from struct import unpack
>>> from more_itertools import consecutive_groups
>>> iterable = ['192.0.2.0', '192.0.2.2', '192.0.2.3', '192.0.2.5']
>>> ordering = lambda x: unpack(b'!I', inet_aton(x))[0]
>>> for group in consecutive_groups(iterable, ordering):
...     print(list(group))
['192.0.2.0']
['192.0.2.2', '192.0.2.3']
['192.0.2.5']

The tests show another example where there's not an obvious numerical representation, but all we need is a ranking method.
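To illustrate the "all we need is a ranking method" point, here is a rough sketch that groups letters by their position in the alphabet. It is not the code merged in this PR and not the example from the tests: `consecutive_runs` is a stand-in written for this comment, and the letter data is invented.

```python
from itertools import groupby
from string import ascii_lowercase


def consecutive_runs(iterable, ordering=lambda x: x):
    """Sketch: group items whose rank under `ordering` increases by exactly 1."""
    # Same index-minus-rank trick as before; `ordering` maps each item
    # to an integer rank, so the items themselves need not be numbers.
    for _, group in groupby(
        enumerate(iterable), key=lambda pair: pair[0] - ordering(pair[1])
    ):
        yield [item for _, item in group]


letters = ['a', 'b', 'c', 'f', 'g', 'x']
rank = ascii_lowercase.index  # position in the alphabet serves as the rank
letter_runs = list(consecutive_runs(letters, rank))
print(letter_runs)
```

Any callable that returns an integer rank would work in place of `ascii_lowercase.index`, for example the `.index` method of a list that spells out the desired order.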

pylang commented Nov 22, 2017

Absolutely. This unsung recipe is so clever. I'm glad you found a general name for it.

For reference, here are other use cases in the wild:

@bbayles bbayles merged commit e08c285 into master Nov 24, 2017
@bbayles bbayles deleted the consecutive-groups-rebased branch November 24, 2017 00:55