These are a couple of functions I keep handy because I end up using them more often than I expect. Both operate on an iterable, so I'll define one here to use as an example:

In [1]:
my_iter = list(range(20))

# Batch Processing

This function yields small subsets of an iterable so you can work with a smaller part at a time. I've used it when I had too much data to process all at once and needed to process it in chunks.

In [2]:
def batch(iterable, n: int = 1):
    """
    Return a dataset in batches (no overlap)
    :param iterable: the sequence to be returned in segments (must support len() and slicing)
    :param n: length of the segments
    :return: generator of portions of the original data
    """
    for ndx in range(0, len(iterable), n):
        # Slicing past the end is safe, so the last batch may be shorter than n
        yield iterable[ndx:ndx + n]

In [3]:
for this_batch in batch(my_iter, 3):
    print(this_batch)

[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9, 10, 11]
[12, 13, 14]
[15, 16, 17]
[18, 19]


You can see that it split my iterable into smaller parts. It returned every element exactly once, with no repeats, and the final batch is shorter because 20 is not evenly divisible by 3.
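One limitation of `batch` is that it calls `len()` and slices, so it only works on sequences such as lists, not on generators or file objects. Here is a minimal sketch of a variant that accepts any iterable by pulling items with `islice`. The name `batch_any` and the `ValueError` check are my own additions for illustration, not part of the function above.

```python
from itertools import islice


def batch_any(iterable, n: int = 1):
    """Yield lists of up to n items from any iterable (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(iterable)
    while True:
        # Pull the next n items; an empty list means the input is exhausted
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield chunk


# Works on a generator, which has no len() and cannot be sliced
print(list(batch_any((x * x for x in range(7)), 3)))
# [[0, 1, 4], [9, 16, 25], [36]]
```

If you are on Python 3.12 or newer, `itertools.batched` does much the same thing and yields tuples instead of lists.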

# Sliding Window

Unlike batching, this gives me overlapping sections of the iterable. You define a window size, and it yields each window of that size in order, advancing one element at a time.

In [4]:
from itertools import islice


def window(sequence, n: int = 5):
    """
    Returns a sliding window of width n over the iterable sequence
    :param sequence: iterable to yield segments from
    :param n: number of items in the window
    :return: generator of windows
    """
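    _it = iter(sequence)
    # Fill the first window; it is only yielded if the input has at least n items
    result = tuple(islice(_it, n))
    if len(result) == n:
        yield result
    # Slide by one: drop the oldest item and append the next one
    for element in _it:
        result = result[1:] + (element,)
        yield result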

In [5]:
for this_window in window(my_iter, 4):
    print(this_window)

(0, 1, 2, 3)
(1, 2, 3, 4)
(2, 3, 4, 5)
(3, 4, 5, 6)
(4, 5, 6, 7)
(5, 6, 7, 8)
(6, 7, 8, 9)
(7, 8, 9, 10)
(8, 9, 10, 11)
(9, 10, 11, 12)
(10, 11, 12, 13)
(11, 12, 13, 14)
(12, 13, 14, 15)
(13, 14, 15, 16)
(14, 15, 16, 17)
(15, 16, 17, 18)
(16, 17, 18, 19)
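The version above builds each new window by slicing and concatenating tuples. As an alternative, here is a sketch that keeps the current window in a `collections.deque` with `maxlen=n`, so appending a new element drops the oldest one automatically. The name `window_deque` and the `n < 1` check are my own assumptions for this example. Each window is still copied into a tuple, so I would not assume it is faster without measuring.

```python
from collections import deque
from itertools import islice


def window_deque(sequence, n: int = 5):
    """Sliding window of width n backed by a bounded deque (hypothetical variant)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(sequence)
    buf = deque(islice(it, n), maxlen=n)
    if len(buf) == n:
        yield tuple(buf)
    for element in it:
        buf.append(element)  # maxlen discards the oldest item automatically
        yield tuple(buf)


print(list(window_deque(range(5), 3)))
# [(0, 1, 2), (1, 2, 3), (2, 3, 4)]

# An input shorter than the window yields nothing at all
print(list(window_deque([1, 2], 3)))
# []
```

The second call shows an edge case worth knowing: like the original `window`, this yields no windows when the input has fewer than `n` items.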


It's as easy as that!