# 13.9 Takeaway - Partitioning and sorting an array with repeated entries

## Problem Explanation

Reorder the array so that equal elements sit next to each other. The order of the groups themselves does not matter.

Input: [b, a, c, b, d, a, b, d]

Output: [a, a, b, b, b, c, d, d]

Output [d, d, c, a, a, b, b, b] is also acceptable

The above example is simplified. In reality, the input is a list of `Person` namedtuples with the fields `('age', 'name')`, and we have to group the list by the age of each person.
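
Since more than one output is acceptable, it helps to have a way to check a result without comparing against one fixed answer. Below is a minimal sketch of such a check; `is_grouped` is a hypothetical helper name, not part of the problem or its tester.

```python
import collections

# Same record type the notebook uses further down.
Person = collections.namedtuple('Person', ('age', 'name'))

def is_grouped(items, key=lambda x: x):
    """Return True if every distinct key forms exactly one contiguous run."""
    seen = set()
    previous = object()  # sentinel that compares unequal to any real key
    for item in items:
        k = key(item)
        if k != previous:
            if k in seen:  # this key already had a run that ended earlier
                return False
            seen.add(k)
            previous = k
    return True

print(is_grouped(list('aabbbcdd')))  # True
print(is_grouped(list('ddcaabbb')))  # True
print(is_grouped(list('bacbdabd')))  # False
```

For the real input, the same check would be called as `is_grouped(people, key=lambda p: p.age)`.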


## My First Naive approach

I first thought it would be easy to just build a hashmap manually, grouping the people by age. Each key would be an age, and the value would be the list of person objects with that age.

Then I would spit them out into a new array (which costs O(n) extra space), one key of the hashmap at a time. The problem asks us to edit the people list itself, so I copy the result back into it one entry at a time.

## Naive implementation

In [None]:
import collections

Person = collections.namedtuple('Person', ('age', 'name'))

def group_by_age(people):
    people_by_age = {}
    result = []

    for person in people:
        if person.age not in people_by_age:
            people_by_age[person.age] = [person]
        else:
            people_by_age[person.age].append(person)

    # This result list costs O(n) extra space, which is why this isn't the ideal solution. And because... it's too easy. EPI doesn't do easy! :)
    for age in people_by_age.keys():
        for person in people_by_age[age]:
            result.append(person)

    # Need to change each entry index by index, since the tester looks at the people variable to test
    for i in range(len(people)):
        people[i] = result[i]
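
For comparison, here is a more compact sketch of the same naive idea. It assumes `collections.defaultdict` for the buckets and slice assignment for the in-place write-back; `group_by_age_naive` is a hypothetical name used only to keep it apart from the function above.

```python
import collections

Person = collections.namedtuple('Person', ('age', 'name'))

def group_by_age_naive(people):
    # defaultdict creates the empty bucket the first time an age is seen.
    people_by_age = collections.defaultdict(list)
    for person in people:
        people_by_age[person.age].append(person)
    # Slice assignment overwrites the caller's list instead of rebinding the name.
    # The temporary list on the right is still O(n) extra space.
    people[:] = [p for group in people_by_age.values() for p in group]

people = [Person(30, 'Bo'), Person(20, 'Al'), Person(30, 'Cy'), Person(20, 'Di')]
group_by_age_naive(people)
print([p.age for p in people])  # [30, 30, 20, 20]
```

The groups come out in first-seen order because dicts keep insertion order on Python 3.7+.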

## The correct approach

The better solution avoids the O(n) result array. To do this, we edit the people array directly, so the extra space depends only on the hashmaps we build, which is O(m), where m is the number of distinct ages.

Overall it's a lot trickier, since there are 2 hashmaps to build:
- A mapping of each age to its frequency
- A mapping of each age to its offset (the next unfilled index of that age's group) in the people array
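
To make the two maps concrete, here is a small sketch that builds them for the letter example from the problem statement. The variable names `counts` and `offsets` are illustrative only.

```python
import collections

letters = ['b', 'a', 'c', 'b', 'd', 'a', 'b', 'd']

# Map 1: how often each key occurs.
counts = collections.Counter(letters)

# Map 2: where each key's group starts, as a running sum of the counts.
offsets, offset = {}, 0
for letter, count in counts.items():
    offsets[letter] = offset
    offset += count

print(dict(counts))  # {'b': 3, 'a': 2, 'c': 1, 'd': 2}
print(offsets)       # {'b': 0, 'a': 3, 'c': 5, 'd': 6}
```

Note that `Counter` iterates in first-seen order (Python 3.7+), so the groups are laid out as b, a, c, d rather than alphabetically. That is fine, because any group order is acceptable.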

We use these 2 maps to pick a `from_idx` and a `to_idx` to swap. `from_idx` is the next unfilled slot of some age that isn't finished yet. Whoever is sitting there belongs in their own age group, so `to_idx` is the next unfilled slot of that person's age, which is tracked by our 2nd mapping. After the swap, that person is in their final position.

We just keep going until we delete all the keys in the 2nd hashmap. Each pass puts exactly one person in their final position, so the loop runs n times in total.

In [1]:
import collections

def group_by_age(people):
    age_map = collections.Counter(person.age for person in people)
    
    person_indices, offset = {}, 0

    # hashmap.items() is a handy way to get the key and the value together
    # Creates offsets for which index each group should start
    # Counter keeps first-seen order (Python 3.7+), so for the example above: [b = 0, a = 3, c = 5, d = 6]
    for age, count in age_map.items():
        person_indices[age] = offset
        offset += count

    # We are going to keep going until we delete all our map keys in person_indices
    # We need to find where the age starts from, and where it needs to swap to
    # This is hella confusing, so let's explain step by step (numbers are from the 2nd pass over the example above)
    while person_indices:
        # first key still left in person_indices (insertion order, not random), here 'b'
        from_age = next(iter(person_indices))
        # from_idx = 1, the next unfilled slot of the 'b' group
        from_idx = person_indices[from_age]

        # people[1] is 'a', so to_age = a
        to_age = people[from_idx].age
        # person_indices[a] = 3
        to_idx = person_indices[to_age]

        # Altogether, we took the first remaining age 'b' from person_indices and looked up its next unfilled index (1)
        # Whatever sits at index 1 (a) gets moved to the next unfilled index of its own age group (3), which we track with person_indices

        # Do the swap
        people[from_idx], people[to_idx] = people[to_idx], people[from_idx]

        # use age_map to track when a key has filled up, and to delete a key from person_indices
        age_map[to_age] -= 1
        
        # If an age has 0 left to go, delete it from person_indices, else bump its offset to the next unfilled slot
        if age_map[to_age]:
            person_indices[to_age] += 1
        else:
            del person_indices[to_age]
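
To see the swapping run end to end on the letter example, here is a self-contained sketch of the same technique. It is generalized with a `key` function so it works on plain letters as well as `Person` objects; `group_in_place` is a hypothetical name, not the tester's entry point.

```python
import collections

def group_in_place(items, key=lambda x: x):
    # Map 1: frequency of each key.
    counts = collections.Counter(key(item) for item in items)

    # Map 2: next unfilled index of each key's group.
    offsets, offset = {}, 0
    for k, count in counts.items():
        offsets[k] = offset
        offset += count

    while offsets:
        # Next unfilled slot of the first unfinished group.
        from_idx = offsets[next(iter(offsets))]
        # The item sitting there belongs in its own group's next unfilled slot.
        to_key = key(items[from_idx])
        to_idx = offsets[to_key]
        items[from_idx], items[to_idx] = items[to_idx], items[from_idx]

        # One more item of to_key is now in its final position.
        counts[to_key] -= 1
        if counts[to_key]:
            offsets[to_key] += 1
        else:
            del offsets[to_key]

letters = ['b', 'a', 'c', 'b', 'd', 'a', 'b', 'd']
group_in_place(letters)
print(letters)  # ['b', 'b', 'b', 'a', 'a', 'c', 'd', 'd']
```

For the real input, the call would be `group_in_place(people, key=lambda p: p.age)`.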