## Remove duplicate entries from a sequence while keeping order

In [1]:
a = [
    {'x': 2, 'y': 3},
    {'x': 1, 'y': 4},
    {'x': 2, 'y': 3},
    {'x': 2, 'y': 3},
    {'x': 10, 'y': 15},
]

In [2]:
def dedupe(items, key=None):
    seen = set()  # values (or keys) of the items already yielded
    for item in items:
        # Compare items directly, or by key(item) when a key function is given
        val = item if key is None else key(item)
        if val not in seen:
            yield item  # first occurrence: emit it
            seen.add(val)

In [3]:
dedupe(a, key=lambda d: (d['x'], d['y']))

<generator object dedupe at 0x107406d40>

In [4]:
list(dedupe(a, key=lambda d: (d['x'], d['y'])))

[{'x': 2, 'y': 3}, {'x': 1, 'y': 4}, {'x': 10, 'y': 15}]

The `dedupe` function removes duplicates from an iterable while preserving the order in which the unique items first appear. It takes two arguments:

1. `items`: the iterable (e.g., a list) from which to remove duplicates.

2. `key` (optional): a function that computes a comparison key for each item. When it is given, two items count as duplicates if their keys are equal; when it is omitted (the default is `None`), the items themselves are compared. Because the values already seen are stored in a set, whatever is compared must be hashable, which is why the dictionaries in `a` need a `key` that converts each one to a tuple.
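To make the role of `key` concrete, here is a small sketch that calls `dedupe` on dictionaries with and without a key function. The `points` list and the `raised` flag are illustrative names that do not appear in the notebook above; `dedupe` is repeated so the cell runs on its own.

```python
def dedupe(items, key=None):
    seen = set()
    for item in items:
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)

# Hypothetical sample data: two equal dictionaries.
points = [{'x': 2, 'y': 3}, {'x': 2, 'y': 3}]

# Without a key, dedupe tries to look the dict itself up in a set,
# which fails because dictionaries are not hashable.
raised = False
try:
    list(dedupe(points))
except TypeError:
    raised = True

# With a key that maps each dict to a hashable tuple, the call succeeds.
unique_points = list(dedupe(points, key=lambda d: (d['x'], d['y'])))
print(raised, unique_points)
```

Note that the `TypeError` only appears once the generator is consumed (here by `list()`), not when `dedupe(points)` is first called.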

Here's how the code works:

1. It initializes an empty set called `seen` to record the values already encountered.

2. It iterates over each item in the `items` iterable.

3. Inside the loop, it computes `val`. If `key` is `None`, `val` is the item itself; otherwise `val` is the result of calling `key` on the current item.

4. It checks whether `val` is already in `seen`. If it is not, this is the first occurrence of the item (judged by its key or by the item itself), so the function yields the item and then adds `val` to `seen`. If it is, the item is a duplicate and is skipped.

5. The loop continues until the input iterable is exhausted. Because `dedupe` is a generator, it produces the unique items lazily, one at a time, in their original order; this is why calling it without `list()` shows only a generator object.
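The lazy behaviour described in the steps above can be observed directly. The sketch below assumes a helper generator, `noisy_source`, and a `consumed` list; both are invented here purely to record how much of the input has been read, and `dedupe` is repeated so the cell is self-contained.

```python
def dedupe(items, key=None):
    seen = set()
    for item in items:
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)

consumed = []  # records every value pulled from the source

def noisy_source():
    # Hypothetical input stream that logs each value as it is read.
    for value in [1, 1, 2, 3, 2]:
        consumed.append(value)
        yield value

gen = dedupe(noisy_source())
before = list(consumed)      # nothing has been read yet

first = next(gen)            # reads a single value and yields it
after_first = list(consumed)

second = next(gen)           # skips the duplicate 1, then yields 2
after_second = list(consumed)

rest = list(gen)             # drains the remaining unique values
print(before, after_first, after_second, rest)
```

Each call to `next()` reads only as much input as is needed to find the next unseen value, which makes the function usable on large or streamed inputs.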

In the example below, `dedupe` removes duplicates from the `data` list. No `key` is needed because integers are hashable, and wrapping the call in `list()` collects the unique values in their original order.

In [5]:
data = [1, 2, 2, 3, 4, 4, 5, 3, 1, 7]
list(dedupe(data))

[1, 2, 3, 4, 5, 7]
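The `key` argument can also define a looser notion of "duplicate" than plain equality. As a sketch, the example below deduplicates strings case-insensitively and dictionaries by a single field; the `names` and `records` lists are made-up sample data, and `dedupe` is repeated so the cell runs on its own.

```python
def dedupe(items, key=None):
    seen = set()
    for item in items:
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)

# Hypothetical sample data.
names = ['Apple', 'apple', 'Banana', 'APPLE', 'banana']
records = [{'x': 2, 'y': 3}, {'x': 2, 'y': 5}, {'x': 1, 'y': 4}]

# Strings that differ only in case share the same key.
unique_names = list(dedupe(names, key=str.lower))

# Only the 'x' field decides whether two records are duplicates.
unique_by_x = list(dedupe(records, key=lambda d: d['x']))

print(unique_names)
print(unique_by_x)
```

In both cases the first item seen for a given key is the one that is kept, so `'Apple'` survives rather than `'apple'`, and `{'x': 2, 'y': 3}` rather than `{'x': 2, 'y': 5}`.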