## https://leetcode.com/explore/interview/card/facebook/5/array-and-strings/3014

In [1]:
"""
Given an array of strings, group anagrams together.

Example:

Input: ["eat", "tea", "tan", "ate", "nat", "bat"],
Output:
[
  ["ate","eat","tea"],
  ["nat","tan"],
  ["bat"]
]

Note:

All inputs will be in lowercase.
The order of your output does not matter.

"""

In [2]:
"""
Approach 1: Categorize by Sorted String

Intuition

Two strings are anagrams if and only if their sorted strings are equal.

Algorithm

Maintain a map ans : {String -> List} where each key K is a sorted string, and each value is the list of strings from 
the initial input that when sorted, are equal to K.

In Python, we will store the key as a hashable tuple, e.g. ('c', 'd', 'e', 'o') for "code".

strs = ["are", "bat", "ear", "code", "tab", "era"] 

ans = {('a', 'e', 'r'): ["are", "ear", "era"],
       ('a', 'b', 't'): ["bat", "tab"],
       ('c', 'd', 'e', 'o'): ["code"]}
       

Complexity Analysis:

Time Complexity: O(NK log K), where N is the length of strs, and K is the maximum length of a string in strs. The outer 
loop runs N times, once per string, and sorting each string takes O(K log K) time.

Space Complexity: O(NK), the total information content stored in ans.

"""

from collections import defaultdict

def groupAnagrams(strs):
    ans = defaultdict(list)
    for s in strs:
        ans[tuple(sorted(s))].append(s)
    return ans.values()

In [3]:
groupAnagrams(["eat", "tea", "tan", "ate", "nat", "bat"])

dict_values([['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']])
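
Since the order of the output does not matter, comparing a result to the expected answer directly is brittle. The following is a minimal sketch of an order-insensitive check. `group_anagrams_sorted` restates Approach 1 but returns a plain list, and `normalize` is a hypothetical helper introduced here for illustration only.

```python
from collections import defaultdict

def group_anagrams_sorted(strs):
    """Approach 1 restated: key each string by its sorted characters."""
    groups = defaultdict(list)
    for s in strs:
        groups[tuple(sorted(s))].append(s)
    # Return a list rather than a dict_values view so it can be indexed and compared.
    return list(groups.values())

def normalize(groups):
    """Hypothetical helper: sort within each group, then sort the groups themselves."""
    return sorted(sorted(g) for g in groups)

result = group_anagrams_sorted(["eat", "tea", "tan", "ate", "nat", "bat"])
expected = [["ate", "eat", "tea"], ["nat", "tan"], ["bat"]]
print(normalize(result) == normalize(expected))  # True
```

Sorting both sides before comparing means the check passes no matter which order the groups or their members come back in.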

In [18]:
"""
Approach 2: Categorize by Count

Intuition

Two strings are anagrams if and only if their character counts (respective number of occurrences of each character) are 
the same.

Algorithm

We can transform each string s into a character count, "count", consisting of 26 non-negative integers representing the
number of a's, b's, c's, etc. We use these counts as the basis for our hash map.
 
In Python, the representation will be a tuple of the counts. For example, "abbccc" will be (1, 2, 3, 0, 0, ..., 0), where
again there are 26 entries total.


strs = ["aab", "aba", "baa", "abbccc"]

ans = {(2, 1, 0, 0, ..., 0): ["aab", "aba", "baa"],
       (1, 2, 3, 0, ..., 0): ["abbccc"]}
       |                  |
       --26 total entries--


Complexity Analysis

Time Complexity: O(NK), where N is the length of strs, and K is the maximum length of a string in strs. Counting each 
string is linear in the size of the string, and we count every string. Building each 26-entry key adds a constant cost
per string, since the alphabet size is fixed.

Space Complexity: O(NK), the total information content stored in ans.

"""

def groupAnagrams1(strs):
    ans = defaultdict(list)
    for s in strs:
        print('String is', s)
        count = [0] * 26
        for c in s:
            print(c)
            count[ord(c) - ord('a')] += 1
            print(count)
        ans[tuple(count)].append(s)
        print(ans)
        print('--------------------------------------------------------------------------------------------------------------')
    return ans.values()

In [19]:
groupAnagrams1(["eat", "tea", "tan", "ate", "nat", "bat"])

String is eat
e
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
a
[1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
t
[1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
defaultdict(<class 'list'>, {(1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0): ['eat']})
--------------------------------------------------------------------------------------------------------------
String is tea
t
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
e
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
a
[1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
defaultdict(<class 'list'>, {(1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0): ['eat', 'tea']})
--------------------------------------------------------------------------------------------------------------
String is tan
t
[0,

dict_values([['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']])
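
The trace above comes from the debug prints inside groupAnagrams1. For reference, here is a minimal sketch of the same count-based grouping without the tracing. The name `group_anagrams_by_count` is chosen here for illustration, and the code assumes lowercase a-z input, as the problem note states.

```python
from collections import defaultdict

def group_anagrams_by_count(strs):
    """Approach 2 without debug output: key each string by its 26 letter counts."""
    groups = defaultdict(list)
    for s in strs:
        count = [0] * 26
        for c in s:
            # Assumes c is a lowercase ASCII letter, so the index is in 0..25.
            count[ord(c) - ord('a')] += 1
        # Lists are unhashable, so the counts are frozen into a tuple for the key.
        groups[tuple(count)].append(s)
    return list(groups.values())

print(group_anagrams_by_count(["aab", "aba", "baa", "abbccc"]))
# [['aab', 'aba', 'baa'], ['abbccc']]
```

Groups appear in the order their first member is seen, because Python dictionaries preserve insertion order (Python 3.7+).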