# Statistics of Permutation Cycles

We've shown that trends map, 1-1, to permutation cycles for sequences of length $N$.

Because the statistics of permutation cycles are well studied, and can be easier to reason about than the statistics of trends, let's spend a few moments looking at permutations.

We'll begin by creating a couple of utility functions for printing text.
Here's how to print text in bold and in normal weight.

In [None]:
print('\x1b[1m' + "hello" + '\x1b[0m' + "world")

Here is a routine that encapsulates that.

In [None]:
def c_print(s, reverse = False):
    ansi = "\x1b[1m" if reverse else "\x1b[0m"
    print(f"{ansi}{s}", end = "")
c_print("hello to ")
c_print("all ", reverse=True)
c_print("the world")

Next, let's print every *other* cycle in a list of cycles in bold.

In [None]:
def print_cycles(cycles):
    for n, cycle in enumerate(cycles):
        c_print(cycle, reverse = n%2)
    c_print("\n")  # reset to normal

print_cycles([[1, 2], [3,4], [5, 6], [7, 8]])
print("hello")

This will prove useful shortly, but first let's go back to permutations.

## Generating All Permutations

First, let's generate all permutations of a set with the standard library module *itertools*.

In [None]:
from itertools import permutations
for permutation in permutations({1, 2, 3}):  # all permutations of {1, 2, 3}
    print(permutation)

Next, we'll use *sympy.combinatorics* to decompose a permutation into cycles.

In [None]:
from sympy.combinatorics import Permutation as perm # break a single permutation into cycles
perm((0, 1, 3, 2)).full_cyclic_form

This pair lets us decompose all permutations of a set into cycles.

In [None]:
for permutation in permutations(range(3)):  # put those two together
    print(perm(permutation).full_cyclic_form)

If we collect all those cycles into one giant list, we can ask questions about every cycle in all permutations.

In [None]:
# next, collect all cycles, over all permutations

all_cycles = []
for permutation in permutations(range(3)):
    all_cycles.extend(perm(permutation).full_cyclic_form)
all_cycles

That's a bit hard to read, so let's group them by cycle length.

In [None]:
cycles_by_length = {}  # next, group those by length: all length-1 cycles, all length-2, etc.
for length in (range(1, 4)):
    cycles_by_length[length] = sorted([cycle for cycle in all_cycles if len(cycle) == length])
cycles_by_length

Finally, let's squeeze out all the air in each line, removing the commas, spaces, and square and curly brackets. We can separate adjacent cycles by printing them in alternating weights with the *print_cycles()* routine we defined earlier.

In [None]:
# now pretty-print them by length, with all the meta-characters squeezed out
# switch boldness to highlight each new cycle
for length, cycles in cycles_by_length.items():  # avoid shadowing the built-in `list`
    print_cycles(["".join(map(str, cycle)) for cycle in cycles])

That was easy enough, so let's collect the steps above into functions
and see whether they tell us anything.

In [None]:
# putting it all together
def all_cycles(n):
    """Return every cycle of every permutation of range(n), as one flat list."""
    cycles = []
    for permutation in permutations(range(n)):
        cycles.extend(perm(permutation).full_cyclic_form)
    return cycles

def cycles_by_length(n):
    """Group the cycles from all_cycles(n) by length, sorted within each group."""
    cycles = all_cycles(n)
    by_length = {}
    for length in range(1, n + 1):
        by_length[length] = sorted(cycle for cycle in cycles if len(cycle) == length)
    return by_length

def cycle_block(n):
    """Return one row per cycle length, with each cycle squeezed into a string."""
    block = []
    for length, cycles in cycles_by_length(n).items():
        block.append(["".join(map(str, cycle)) for cycle in cycles])
    return block

for cycle_length in cycle_block(4):
    print_cycles(cycle_length)

Oh! For both $n = 3$ and $n = 4$, the rows are all the same length.
Is that true for smaller $n$?

In [None]:
for cycle_length in cycle_block(2):
    print_cycles(cycle_length)

(These look similar, but the first line is two cycles, each length one, which fix the two elements,
while the second is a single cycle of length two, which exchanges the two elements.)

Larger $n$?  The lines are so long they won't fit cleanly onto all screens, but you can still see the rows line up: they're the same length.

In [None]:
for cycle_length in cycle_block(5):
    print_cycles(cycle_length)

Here's a conjecture:

In a cycle block made from all permutations of a sequence of length $N$:

1. There are $N$ lines.
1. There are $N \cdot N!$ elements in all the cycles.
1. Each line is the same length: $N!$.
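
Before trying to prove anything, the three claims can be checked by brute force for small $N$. The sketch below is one way to do that. It counts elements rather than printed characters, and it uses a small pure-Python helper, `cyclic_form()`, as a stand-in for sympy's `full_cyclic_form`; both `cyclic_form()` and `line_lengths()` are names made up for this illustration.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def cyclic_form(p):
    """Decompose an array-form permutation into cycles, fixed points included.
    (Hypothetical helper, standing in for sympy's full_cyclic_form.)"""
    seen, cycles = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:      # follow start -> p[start] -> ... until we loop back
            seen.add(i)
            cycle.append(i)
            i = p[i]
        cycles.append(cycle)
    return cycles

def line_lengths(n):
    """Map cycle length k -> number of elements on line k of the cycle block."""
    elements = Counter()
    for p in permutations(range(n)):
        for cycle in cyclic_form(p):
            elements[len(cycle)] += len(cycle)
    return dict(elements)

for n in range(1, 7):
    lengths = line_lengths(n)
    assert len(lengths) == n                                  # 1. N lines
    assert sum(lengths.values()) == n * factorial(n)          # 2. N * N! elements
    assert all(v == factorial(n) for v in lengths.values())   # 3. every line holds N! elements
    print(n, lengths)
```

Passing for $N = 1, \ldots, 6$ is evidence, not a proof, but it does rule out the pattern being an accident of $N = 3$ and $N = 4$.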

The first isn't surprising. For example, for sequences of length $8$, the permutation $(0, 1, 2, 3, 4, 5, 6, 7)$ has cycles $[[0],[1],[2],...[7]]$ -- all length $1$ -- while $(1, 2, 3, 4, 5, 6, 7, 0)$ has the single, $8$-long cycle $[[0, 1, 2, 3, 4, 5, 6, 7]]$.

It's trivial to construct permutations that have any cycle length in between. Try it yourself.
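
If you'd like a hint, here is one possible construction, offered as a sketch rather than the only way to do it: rotate the first $k$ elements and leave the rest fixed. The function names `with_cycle_of_length()` and `orbit_length()` are invented for this example.

```python
def with_cycle_of_length(n, k):
    """Array-form permutation of range(n) with 0 -> 1 -> ... -> k-1 -> 0, the rest fixed."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    p = list(range(n))
    for i in range(k):
        p[i] = (i + 1) % k
    return tuple(p)

def orbit_length(p, start=0):
    """Length of the cycle of p that contains `start`."""
    length, i = 1, p[start]
    while i != start:
        length += 1
        i = p[i]
    return length

for k in range(1, 9):
    p = with_cycle_of_length(8, k)
    print(k, p, orbit_length(p))
```

With $k = 1$ this gives the identity, and with $k = 8$ it gives $(1, 2, 3, 4, 5, 6, 7, 0)$, the two extremes mentioned above.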

The second also makes sense. The total number of elements in all the cycles
is the total number of elements in all permutations.
For sequences of length $8$, there are $8!$ permutations, and each permutation has $8$ elements:
$(0, 1, ..., 6, 7), (0, 1, ..., 7, 6), ...$
In general, a sequence of length $N$ has $N!$ permutations, each one with $N$ elements, for $N \cdot N!$ elements in total.

The most remarkable is the third. Let's defer proving that for a moment
and look at some consequences.

1. There are $n!/k$ cycles of length $k$.
2. On average, there are $1/k$ cycles of length $k$ in each permutation.

For example, on average, every permutation has one cycle of length 1.

3. The total number of cycles in all permutations
is $\sum_{k=1}^{n} n!/k = n!\sum_{k=1}^{n} 1/k = n!H_n$,
where $H_n$ is the $n$-th harmonic number: $1 + 1/2 + 1/3 + ... + 1/n$.
4. The average number of cycles in the $n!$ permutations is $n!H_n/n! = H_n$.
5. The average number of cycles is about the log of the sequence length, because $\lim_{n \to \infty}{(H_n - \ln(n))} = \gamma$,
where $\gamma = 0.57721...$ is the Euler-Mascheroni constant.

This isn't shocking. You probably remember that $\int_1^n{(1/x)}dx = \ln(n)$.

6. The average length of a cycle is $n/H_n \approx n/\ln(n)$.
7. The expected number of cycles longer than $k$ is
$\sum_{j=k+1}^{n} 1/j = H_n - H_k \approx \ln(n) - \ln(k) = \ln(n/k)$.
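
The two logarithmic approximations in the list above are easy to sanity-check numerically. This is a minimal sketch using only the standard library; `harmonic()` and the constant name `EULER_GAMMA` are defined here for illustration, and the value of $\gamma$ is typed in by hand, truncated to double precision.

```python
from math import log

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, truncated (entered by hand)

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, computed in floating point."""
    return sum(1 / k for k in range(1, n + 1))

# H_n - ln(n) should drift down toward gamma as n grows
for n in (10, 100, 1000, 10000):
    print(n, harmonic(n) - log(n))

# expected number of cycles longer than k, next to the ln(n/k) approximation
n, k = 1000, 100
print(harmonic(n) - harmonic(k), log(n / k))
```

The gap between $H_n - \ln(n)$ and $\gamma$ shrinks roughly like $1/(2n)$, so the approximation is already good for sequences of modest length.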


Picturing the easy-to-visualize cycle block in your head leads to each of these points right away.

For 1-6 above, if you substitute *trend* for *cycle*, the statements should still hold.