# Trends in Non-Random Sequences

You've explored trends in sequences of random floats. 
What if the sequences of floats aren't random?

"Like what?" you ask.

## Trends in a Sorted Sequence

A monotonically increasing sequence is a single trend.
In a monotonically decreasing sequence, each element is its own trend.

You can write code to check that.

Because we'll be looking at long sequences, we'll use the `trendlist` package,
which supplies the abstractions `Trend` for trends, `TrendList` for lists of trends,
and methods to manipulate them with efficient algorithms.

In [None]:
from random import random
from trendlist import Trend, TrendList, rands

You can decompose random sequences quickly and easily,

In [None]:
n = 1000
# a random sequence
seq = [random() for _ in range(n)]
trends = TrendList(seq)
len(trends)

but you can also decompose, for example, a sorted sequence,
which should be a single trend.

In [None]:
# the same random numbers after sorting
sorted_seq = sorted(seq)
trends = TrendList(sorted_seq)
len(trends)

In like fashion, in a monotonically decreasing sequence -- a sequence sorted in reverse order -- each element should be its own trend.

In [None]:
# the same numbers after sorting into descending order
reverse_sorted_seq = sorted(seq, reverse=True)
trends = TrendList(reverse_sorted_seq)
len(trends)

## Trends in a Trend

What other kind of non-random sequence might we try to decompose? How about a trend?

These should be fairly easy to come by. A sequence of length $n$ has $n!$ permutations; 
of those, $(n-1)!$ are single trends, which matches the $1/n$ chance that a random sequence is a single trend.

We can build intuition about that by going back to working with sequences of powers of two:

In [None]:
from trendlist.simple import is_trend, pows
import itertools

def single_trends(s):
    # yield each permutation of the elements of s that is a single trend
    # s can be any iterable
    for perm in itertools.permutations(s):
        if is_trend(perm):
            yield list(perm)

n = 5
for trend in single_trends(pows(n)):
    print(trend)
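
The count of single trends can also be checked by brute force with only the standard library. The sketch below assumes the definition of a rising trend used in this series: every proper prefix has a smaller mean than the rest of the sequence, which is the same as having a smaller mean than the whole sequence. The helpers `is_rising_trend` and `count_single_trends` are hypothetical names written for this illustration, not part of `trendlist`.

```python
from itertools import permutations
from math import factorial

def is_rising_trend(seq):
    """True if every proper prefix has a smaller mean than the whole sequence."""
    total, n = sum(seq), len(seq)
    prefix = 0
    for k in range(1, n):
        prefix += seq[k - 1]
        # prefix/k < total/n, cross-multiplied so integer inputs stay exact
        if prefix * n >= total * k:
            return False
    return True

def count_single_trends(values):
    """Count the permutations of values that form one rising trend."""
    return sum(is_rising_trend(p) for p in permutations(values))

# compare the brute-force count with (n-1)! for small sets of powers of two
for n in range(1, 7):
    powers = [2 ** i for i in range(n)]
    print(n, count_single_trends(powers), factorial(n - 1))
```

Each circular rotation class of $n$ distinct arrangements contributes exactly one single trend, which is where the factor of $1/n$ comes from.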

Of course, decomposing any of these into trends isn't even interesting, right? They're already trends.

In [None]:
n = 8
more_than_one_trend = 0
for trend in single_trends(pows(n)):
    if len(TrendList(trend)) != 1: # if it's not a single trend
        more_than_one_trend += 1
        print(f"{trend} should only have one trend, but has more!")
if not more_than_one_trend:
    print("All single trends!")

## Falling Trends

What, however, if you flip the trends around?

Everything that we've decided about trends so far must have a mirror-image. If you replace "things are always getting better, on average," with, "things are always getting worse, on average," all the logic still follows. Everything you've figured out has a dual:

* Every sequence can be uniquely decomposed into *falling* trends.
* The means of the trends *increase* monotonically.
* $P($ *random sequence of length n is a falling trend* $) == 1/n$ for trendy sets. 
* The average number of falling trends for sequences of length $n$ is $H_n$, the $n$th harmonic number.
* ... and so on.
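
For small sets, the claim about the average number of falling trends can be checked exhaustively. This is a minimal sketch using only the standard library; it assumes that merging adjacent runs whenever the earlier run has the larger mean yields the decomposition into falling trends, and the helper names are hypothetical, not `trendlist` functions.

```python
from fractions import Fraction
from itertools import permutations

def count_falling_trends(seq):
    """Number of falling trends in seq, found by merging adjacent runs."""
    runs = []  # each run is [sum, length]
    for x in seq:
        runs.append([x, 1])
        # merge while the earlier run has the larger mean (cross-multiplied)
        while len(runs) > 1 and runs[-2][0] * runs[-1][1] > runs[-1][0] * runs[-2][1]:
            s, c = runs.pop()
            runs[-1][0] += s
            runs[-1][1] += c
    return len(runs)

def average_falling_trends(values):
    """Exact average number of falling trends over all permutations of values."""
    counts = [count_falling_trends(p) for p in permutations(values)]
    return Fraction(sum(counts), len(counts))

def harmonic(n):
    """Exact nth harmonic number."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# compare the exhaustive average with H_n for small sets of powers of two
for n in range(1, 7):
    powers = [2 ** i for i in range(n)]
    print(n, average_falling_trends(powers), harmonic(n))
```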

How can you make a falling trend? Just flip around a rising trend.

In [None]:
n = 8
more_than_one_trend = 0
for trend in single_trends(pows(n)):
    trend.reverse() # make it a falling trend
    if len(TrendList(trend, reverse=True)) != 1: # if it's not a single *falling* trend
        more_than_one_trend += 1
        print(f"{trend} should only have one trend, but has more!")
if not more_than_one_trend:
    print("All single falling trends!")

How cool is that?

## Working with Longer, Random Trends

This opens a question: What do you expect when you decompose a long sequence that's a trend in one direction into trends in the other? We could do it either way -- decompose a single, long rising trend into falling trends, or a single long falling trend into rising trends.

Let's write some code.

(Because of the way we've written trendlist, it's easy to do the first, but the second would require writing more code. At some point, someone should generalize the package a little more so they're equally easy.)

For starters, let's generate random, long trends,
and turn to `trendlist.rands()` for this task, 
because it generates a long sequence quickly.

In [None]:
n=1_000_000
%timeit rands(n)

`trendlist.TrendList()` promptly decomposes such sequences into rising trends.

In [None]:
def rising_trends(n):
    return TrendList(rands(n))

%timeit len(rising_trends(n))

And `TrendList().single()` circularly permutes these into single trends.

In [None]:
%timeit rising_trends(n).single()

Though these have little execution-time cost, they come at a different price: all information about the original sequence is lost. Having created a single trend, there's no way to go back and decompose it in the opposite direction.

The very simplifications that let us decompose really long sequences efficiently, then rotate them to single trends, seem to prevent us from doing much more with the falling trends than get their means. 

But wait! The `rands()` function takes two more arguments, `seed` and `start`, which let us get what we want.  Here's the plan:

* Generate a random seed for the random number generator, `random.random`, and save it away.
* Generate a random sequence using that seed.
* Decompose the sequence into a rising TrendList -- a list of rising Trends.
* Rotate that TrendList into a single, rising trend. Keep track of how much rotation that required.
* Re-generate the original sequence with the same seed, and rotate it enough to make a single trend.

Does this work? My grandfather used to say, "Hide and watch."

First, we'll make a function like `rands()` that generates a sequence of random floats, but only sequences that are rising trends.

In [None]:
def rands_rising(n):
    seed = random() # pick a random seed
    r = rands(n, seed=seed) # generate a random sequence using that seed
    rot = TrendList(r).single() # decompose it, then rotate the trends into a single rising trend
    return rands(n, seed=seed, start=rot.start) # return a generator that generates it

Let's kick the tires a little.

In [None]:
n = 1_000_000
print(f"the function has type {type(rands_rising(n))},")
print(f"given the argument {n:_} it generates {len(list(rands_rising(n))):_} elements,")
just_one_trend = len(TrendList(rands_rising(n))) == 1
print(f"and those elements {'do' if just_one_trend else 'do not'} form a single trend.")
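
The two-pass idea behind the plan can be sketched with the standard library alone. This is not how `trendlist` does it: here `seeded_floats` is a hypothetical stand-in for `rands(n, seed=seed)`, and the rotation point is found from running sums of deviations from the mean rather than from a `TrendList`. Starting just after the peak of those running sums leaves every proper prefix below the overall mean, which is the rising-trend condition assumed in this series.

```python
import random
from itertools import accumulate

def seeded_floats(n, seed):
    """Hypothetical stand-in for rands(n, seed=seed): n floats from a seeded generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

def start_of_single_trend(seq):
    """Index to rotate to so that the sequence becomes one rising trend."""
    mean = sum(seq) / len(seq)
    # running sums of deviations from the mean; the rotation starts just after their peak
    running = list(accumulate(x - mean for x in seq))
    peak = max(range(len(seq)), key=running.__getitem__)
    return (peak + 1) % len(seq)

def rising_trend_floats(n, seed):
    """Generate twice from the same seed: once to find the rotation, once to apply it."""
    start = start_of_single_trend(seeded_floats(n, seed))  # first pass
    seq = seeded_floats(n, seed)                           # second pass, same numbers
    return seq[start:] + seq[:start]

def is_rising_trend(seq):
    """True if every proper prefix has a smaller mean than the whole sequence."""
    total, n = sum(seq), len(seq)
    prefixes = accumulate(seq[:-1])
    return all(p / k < total / n for k, p in enumerate(prefixes, start=1))

seed = random.random()  # save the seed so the sequence can be regenerated
print(is_rising_trend(rising_trend_floats(1000, seed)))
```

The only state carried between the two passes is the seed and the start index, which is why nothing about the original sequence has to be stored.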

Can a falling trend ever be a rising trend? No way.

If you stick your finger anywhere in the middle of a falling trend, the mean of the numbers to the left is greater than the mean of those to the right.
In a rising trend, it's the reverse. A sequence can't be both, so no.

How many rising trends will there be?

You know the answer for one case: a monotonic sequence. If you decompose a monotonically decreasing sequence, which is a single falling trend, into rising trends, every element is a separate trend. Is that true for *all* falling trends?

Also no, but to see that you need to be able to decompose a sequence into trends. You can cannibalize the code to do this from another worksheet.
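
A small counterexample makes the point. The function below is a minimal stand-in for that decomposition code, not the worksheet's or `trendlist`'s implementation: it assumes that merging adjacent runs whenever the earlier run has the smaller mean produces the rising trends, and `rising_trends` is a hypothetical name.

```python
def rising_trends(seq):
    """Split seq into rising trends by merging adjacent runs."""
    runs = []
    for x in seq:
        runs.append([x])
        # merge while the earlier run has the smaller mean (cross-multiplied)
        while len(runs) > 1 and sum(runs[-2]) * len(runs[-1]) < sum(runs[-1]) * len(runs[-2]):
            last = runs.pop()
            runs[-1].extend(last)
    return runs

print(rising_trends([4, 2, 1]))  # monotonically decreasing → [[4], [2], [1]]
print(rising_trends([4, 1, 2]))  # a falling trend that is not monotonic → [[4], [1, 2]]
```

The sequence `[4, 1, 2]` is a falling trend, since 4 exceeds the mean of `[1, 2]` and the mean of `[4, 1]` exceeds 2, yet it splits into two rising trends rather than three.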

And now we can ask how many falling trends are in a long rising trend. Let's do a few for practice:

In [None]:
n = 100_000
for i in range(10):
    print(f"{i}: {len(TrendList(rands_rising(n), reverse=True))}")

This is, as you'd expect, more than the number of trends in a completely random sequence.

To be more precise, here's the average number of falling trends in a random sequence.

In [None]:
import math
import statistics

n = 100_000
trends = []
for trial in range(100):
    trends.append(len(TrendList(rands(n), reverse=True)))
print(f"average number of falling trends in a random sequence: {statistics.mean(trends)}")

def H(n):
    '''Euler approximation to the nth harmonic number.'''
    gamma = 0.5772 # The Euler-Mascheroni constant
    return math.log(n) + gamma

print(f"We expect {H(n)=}")
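
To see how good the logarithmic approximation is, it can be compared against the harmonic number summed directly. This is a small standard-library sketch; the function names are chosen for this illustration only.

```python
import math

def harmonic_exact(n):
    """Sum 1 + 1/2 + ... + 1/n directly."""
    return math.fsum(1 / k for k in range(1, n + 1))

def harmonic_approx(n):
    """ln(n) plus the Euler-Mascheroni constant."""
    gamma = 0.5772156649  # Euler-Mascheroni constant, to 10 places
    return math.log(n) + gamma

for n in (10, 1_000, 100_000):
    exact, approx = harmonic_exact(n), harmonic_approx(n)
    print(f"{n:>7_}: exact={exact:.6f} approx={approx:.6f} diff={exact - approx:.2e}")
```

The difference shrinks roughly like $1/(2n)$, so for sequences of length 100,000 the approximation is far more precise than the spread of the simulated averages.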

And here's the average number of falling trends in a random, rising trend: 

In [None]:
import statistics
n = 100_000
trends = []
for trial in range(100):
    trends.append(len(TrendList(rands_rising(n), reverse=True)))
print(f"average number of falling trends in a rising trend: {statistics.mean(trends)}")

What is this number? Where's it from?

You'd expect it to be bigger than the number in a random sequence,
but how big?

## How Mean Number of Reverse Trends Rises with Sequence Length

You expect the number of reverse trends to grow with sequence length,
so perhaps it would be useful to see how it grows.
When a statistical relationship is simple, it offers predictive value: "Ah, this falling trend has 1,000 reals, so if I decompose it into rising trends, I expect to see ...."

Moreover, if it's particularly simple, it may offer a clue to the origin of the relationship.

You'll need a way to graph the results.

In [None]:
'''I hope there is a simpler way to do this
but for now, this works.
'''
from matplotlib import pyplot as plt
import numpy as np
import math

def slope_and_intercept(x, y):
    # fit a least-squares line and return its slope and intercept
    m, b = np.polyfit(x, y, deg=1)
    return (m, b)

def legend(m, b, r):
    equation = f"y = {m:2.2f}*x + {b:2.2f}"
    rho_sq = "\u03c1**2"    # Unicode character for rho is U+03C1. Sadly, the font lacks superscripts.
    goodness_of_fit = f"{rho_sq} = {r**2:2.3f}"
    legend = f"{equation}\n{goodness_of_fit}"
    return legend

def annotate_graph(x, y):
    # add in a best-fit line and a legend
    m, b = slope_and_intercept(x, y) 
    r = np.corrcoef(x, y)[0,1]

    # create nparray that spans the x-space
    xseq = np.linspace(0, math.ceil(x[-1]), num=100)

    # create a legend
    graph_legend = legend(m, b, r)

    # add best-fit line and legend to the plot
    plt.plot(xseq, m * xseq + b, color="red", lw=1.5) # best-fit line
    plt.text(0, y[-2], graph_legend, color="red")           # legend at the left edge, near the top
    

def graph(x, y, x_label, y_label, title):    
    # Create a scatterplot with a title and axis labels
    plt.figure(figsize=(10, 10))
    plt.scatter(x, y, s=10, alpha=0.7)
    plt.title(title)
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    annotate_graph(x, y) # add the legend and the best-fit line
    

In [None]:
graph([1, 2, 3], [2, 4, 6], "X", "Y", "My title")
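
The numbers that end up in the legend can be checked without plotting. The sketch below computes the same three quantities (slope, intercept, and correlation coefficient) from the textbook least-squares formulas in plain Python; `fit_line` is a hypothetical helper, not part of the plotting code above.

```python
import math

def fit_line(x, y):
    """Least-squares slope, intercept, and correlation coefficient for paired data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx            # slope
    b = mean_y - m * mean_x  # intercept
    r = sxy / math.sqrt(sxx * syy)
    return m, b, r

m, b, r = fit_line([1, 2, 3], [2, 4, 6])
print(f"y = {m:.2f}*x + {b:.2f}, r**2 = {r**2:.3f}")  # → y = 2.00*x + 0.00, r**2 = 1.000
```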

In [None]:
import statistics

def average_number_of_trends(n, trials):
    '''Average number of trends in sequence of length n'''
    trends = []
    for trial in range(trials):
        trends.append(len(TrendList(rands(n), reverse=True))) # reverse actually doesn't matter
    return statistics.mean(trends)

def trend_count(max_len, trials=10, points=7):
    trend_count = {}
    interval = (max_len - 1)//points
    for n in range(1, max_len, interval):
        trend_count[n] = average_number_of_trends(n, trials)
    return trend_count
    
max_len = 1000
counts = trend_count(max_len)  # maps sequence length -> average number of trends
graph(list(counts), list(counts.values()), "length", "average trend count", "Average Number of Trends")