# Alphabetic Anagrams

Original: [Alphabetic Anagrams](https://www.codewars.com/kata/53e57dada0cb0400ba000688/train/python)

> Consider a "word" as any sequence of capital letters A-Z (not limited to just "dictionary words"). For any word with at least two different letters, there are other words composed of the same letters but in a different order (for instance, STATIONARILY/ANTIROYALIST, which happen to both be dictionary words; for our purposes "AAIILNORSTTY" is also a "word" composed of the same letters as these two).
> 
> We can then assign a number to every word, based on where it falls in an alphabetically sorted list of all words made up of the same group of letters. One way to do this would be to generate the entire list of words and find the desired one, but this would be slow if the word is long.
> 
> Given a word, return its number. Your function should be able to accept any word 25 letters or less in length (possibly with some letters repeated), and take no more than 500 milliseconds to run. To compare, when the solution code runs the 27 test cases in JS, it takes 101ms.
> 
> Sample words, with their rank:
>
> * ABAB = 2
> * AAAB = 1
> * BAAA = 4
> * QUESTION = 24572
> * BOOKKEEPER = 10743

## Generate list of all anagrams

Let's start with the brute-force approach, mostly as a way to verify later results.

We generate all permutations of the word's letters. `itertools.permutations` emits results in the order of its input, so we sort the word's letters before passing them in.

The problem is repeated letters: one **A** is indistinguishable from another **A**, so duplicates need to be filtered out (without changing the order).

In [1]:
import itertools

def generate_all_anagrams(word: str):
    duplicates = set()
    for permutation in itertools.permutations(sorted(word), len(word)):
        if permutation in duplicates:
            continue

        duplicates.add(permutation)
        yield ''.join(permutation)

Let's test it:

In [2]:
word = 'AAAB'
for index, anagram in enumerate(generate_all_anagrams(word)):
    print(index + 1, anagram)

1 AAAB
2 AABA
3 ABAA
4 BAAA
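Since the brute-force list is meant for verification, a small helper that turns it into a rank may be handy. The sketch below is an assumption, not part of the original notebook: the name `rank_by_brute_force` is hypothetical, and it is only practical for short words because it materialises every permutation.

```python
import itertools

def rank_by_brute_force(word: str) -> int:
    """Return the 1-based rank of `word` among its distinct anagrams."""
    # The set drops duplicate permutations; sorting restores alphabetical order.
    anagrams = sorted(set(itertools.permutations(word)))
    return anagrams.index(tuple(word)) + 1

print(rank_by_brute_force('ABAB'))  # 2
print(rank_by_brute_force('BAAA'))  # 4
```

For a word of length n this walks through all n! permutations, which is exactly the cost the kata asks us to avoid.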


## Number of anagrams

Bonus: the number of all anagrams can be calculated with the formula:

$$ \frac{n!}{\displaystyle\prod_{i=1}^{k}{c_i!}} $$

where $ n $ is the total number of characters, $ k $ is the number of distinct letters, and $ c_i $ is the number of times the $ i $-th distinct letter occurs.

Explanation: we take the number of permutations of all characters ($ n! $) and divide it by the number of permutations of each group of identical letters (also a factorial), since rearranging identical letters among themselves does not produce a new word.

Examples:

* for string *ABC*, the number of all anagrams will be $ \frac{3!}{1! 1! 1!} $.
* for string *AAABBC*, the number of all anagrams will be $ \frac{6!}{3! 2! 1!} $ (because there are six characters in total, A repeats three times, and B twice).
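Evaluating the two examples, as a quick arithmetic check:

$$ \frac{3!}{1!\,1!\,1!} = \frac{6}{1} = 6 \qquad\qquad \frac{6!}{3!\,2!\,1!} = \frac{720}{12} = 60 $$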

Let's implement it:

In [3]:
from functools import reduce
from math import factorial
from operator import mul

def calculate_number_of_anagrams(word: str) -> int:
    if len(word) == 1:
        return 1

    all_possible_numbers = factorial(len(word))
    all_repeating = reduce(mul, (factorial(word.count(letter)) for letter in set(word)))

    return all_possible_numbers // all_repeating

In [4]:
%time

assert calculate_number_of_anagrams('AAAB') == 4
assert calculate_number_of_anagrams('ABCD') == 24

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 6.2 µs
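The formula can also be cross-checked against plain enumeration for a few short words. This is a minimal, self-contained sketch: the helper names `count_by_formula` and `count_by_enumeration` are hypothetical, and `math.prod` assumes Python 3.8 or newer.

```python
import itertools
from collections import Counter
from math import factorial, prod

def count_by_formula(word: str) -> int:
    # n! divided by the product of count! over every distinct letter
    return factorial(len(word)) // prod(factorial(c) for c in Counter(word).values())

def count_by_enumeration(word: str) -> int:
    # Brute force: build every permutation and keep only the distinct ones
    return len(set(itertools.permutations(word)))

for word in ('A', 'AAAB', 'ABCD', 'AAABBC'):
    assert count_by_formula(word) == count_by_enumeration(word)
```

Because `prod` of an empty sequence is 1, this variant needs no special case for very short input.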


## Solution

### Counting replacements by position

My first idea was to count how many letters need to be replaced to get back to the *base* string. I quickly understood that not every replacement carries the same weight: replacing a single letter at the end of the string does not change the rank by the same amount as the same replacement at the beginning. I tried to work out a potential multiplier for each position.

### Recursive approach

Assume that we can calculate how many anagrams begin with a specified letter. Every anagram that starts with a letter sorting before the word's first letter comes earlier in the list, so we add those counts up. For the first letter itself, we simply shorten the word by that letter and repeat the process on the remainder, adding the results up to get the final position.

The function needed for this is already known: it is `calculate_number_of_anagrams`.
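To make the idea concrete, here is a sketch that prints what each position contributes to the rank. It is written as a loop rather than a recursion, and the names `count_anagrams` and `trace_rank` are hypothetical; treat it as an illustration of the reasoning, not as the final solution.

```python
from collections import Counter
from math import factorial, prod

def count_anagrams(letters: Counter) -> int:
    # n! / (c1! * c2! * ...); a zero count contributes 0! == 1
    return factorial(sum(letters.values())) // prod(factorial(c) for c in letters.values())

def trace_rank(word: str) -> int:
    rank = 1  # the word itself occupies one position
    for position in range(len(word)):
        remaining = Counter(word[position:])
        for letter in sorted(remaining):
            if letter >= word[position]:
                break
            # Fix a smaller letter at this position and count what can follow it
            remaining[letter] -= 1
            skipped = count_anagrams(remaining)
            remaining[letter] += 1
            print(f'{word[:position]}{letter}...: {skipped} words come earlier')
            rank += skipped
    return rank

print(trace_rank('BAAA'))
# A...: 3 words come earlier
# 4
```

For *BAAA* the only smaller first letter is A, which leaves the letters A, A, B and therefore 3 words ahead of it, giving rank 4.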

#### Removing a single letter is difficult

There is only one problem with that solution. It is actually quite complicated to remove a single repeating letter from the word, and to do it once for each distinct letter without repetition, at least when working on the words themselves. But we don't need the word (as a string), we just need the letter counts. Hence the use of the [Counter class](https://docs.python.org/3/library/collections.html#collections.Counter). That requires a small adjustment of `calculate_number_of_anagrams`.
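A short sketch of what "removing a letter" looks like with a `Counter`: the count is decremented and later restored, and no string is ever rebuilt. Note that a key whose count drops to 0 stays in the counter, which is harmless for the formula because 0! is 1.

```python
from collections import Counter

letters = Counter('AAAB')

letters['A'] -= 1                 # "remove" one A
print(dict(letters))              # {'A': 2, 'B': 1}
print(sum(letters.values()))      # 3 letters left

letters['A'] += 1                 # put it back
letters['B'] -= 1                 # "remove" the only B
print(dict(letters))              # {'A': 3, 'B': 0}

letters['B'] += 1                 # restore the original counts
print(letters == Counter('AAAB')) # True
```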

In [5]:
from collections import Counter
from functools import reduce
from math import factorial
from operator import mul

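def get_number_of_anagrams(letters: Counter) -> int:
    # Only one distinct letter: there is exactly one arrangement.
    if len(letters) == 1:
        return 1

    # n! for all letters, divided by count! for every distinct letter.
    # Keys whose count dropped to 0 are harmless because 0! == 1.
    all_possible_numbers = factorial(sum(letters.values()))
    all_repeating = reduce(mul, (factorial(count) for count in letters.values()))

    return all_possible_numbers // all_repeating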

def solution(word: str) -> int:
    if len(word) == 1:
        return 1

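    letters = Counter(word)
    # Number of anagrams that start with a letter sorting before word[0].
    position_of_anagram_the_same_initial = 0
    for letter in sorted(letters):
        if letter == word[0]:
            # Same first letter: the rest of the rank comes from the shortened word.
            return position_of_anagram_the_same_initial + solution(word[1:])

        # Temporarily take one `letter` out and count the anagrams of what is left.
        letters[letter] -= 1
        position_of_anagram_the_same_initial += get_number_of_anagrams(letters)
        letters[letter] += 1

    raise Exception('first letter not found in word')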

As we can see, the solution just updates the counter values, which greatly simplifies the difficult part of rearranging the letters.

Let's test it on the given data:

In [6]:
%time

assert solution('ABAB') == 2
assert solution('AAAB') == 1
assert solution('BAAA') == 4
assert solution('QUESTION') == 24572
assert solution('BOOKKEEPER') == 10743

CPU times: user 3 µs, sys: 1 µs, total: 4 µs
Wall time: 4.77 µs
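The kata allows words of up to 25 letters, so it may be worth checking that upper bound as well. The sketch below is an alternative, not the recursive solution above: it walks the word from right to left and updates the anagram count of the suffix incrementally instead of recomputing factorials. The name `rank_incremental` and the 25-letter test word are assumptions made for illustration.

```python
import time
from collections import Counter
from math import factorial

def rank_incremental(word: str) -> int:
    counts = Counter()
    perms = 1  # number of distinct anagrams of the suffix seen so far
    rank = 1
    for n, letter in enumerate(reversed(word), start=1):
        counts[letter] += 1
        # Growing the suffix to n letters: new count = old count * n / count of this letter
        perms = perms * n // counts[letter]
        # Suffix anagrams starting with a smaller letter x number perms * counts[x] / n
        smaller = sum(c for other, c in counts.items() if other < letter)
        rank += perms * smaller // n
    return rank

# 25 distinct letters in reverse order: the last of 25! anagrams
longest = 'YXWVUTSRQPONMLKJIHGFEDCBA'

start = time.perf_counter()
result = rank_incremental(longest)
elapsed = time.perf_counter() - start

assert result == factorial(25)
print(f'{elapsed * 1000:.3f} ms')
```

Both divisions are exact, so integer division loses nothing; the measured time depends on the machine and is therefore not shown here.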
