In [5]:
import cProfile
import random
import time

# Stock Exchange Problem

Consider the following problem. We have a non-empty array representing the evolution of the price of a particular stock over time.
```python
A = [20, 3, 19, 1, 15, 6]
```
Given this information, we want to find the optimal profit we can make using a single buy and a single sell operation. Here we have perfect knowledge of the prices: you can think of this quantity as something that quantitative traders would like to know, so that they can compare their decisions to the best possible decision given perfect knowledge of the future.

More formally, we want to find two indices $b$, $s$ such that $$0 \leq b \leq s < |A|$$ and $$A_s - A_b$$ is as large as possible. The constraint $b \leq s$ says that we cannot sell before buying.

For example for the array given above, the biggest profit we can make is $16$ (make sure you can see that). Below we present three different solutions to this problem.

In [1]:
# Here we seed the random number generator so that we generate the same
# random instance every time we pass the same seed value. This way the
# speed comparison is fair.
def make_prices(n, seed):
    """ Return array of n random prices. """
    random.seed(seed)
    return [ random.random() for _ in range(n) ]
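
As a quick illustration of why the seed matters, the sketch below repeats `make_prices` (so that it is self-contained) and checks that the same seed reproduces the same prices, while a different seed gives a different instance. The variable names `first`, `second` and `other` are ours, used only for this illustration.

```python
import random

def make_prices(n, seed):
    """Return a list of n random prices, reproducible for a given seed."""
    random.seed(seed)
    return [random.random() for _ in range(n)]

first = make_prices(5, seed=1)
second = make_prices(5, seed=1)
other = make_prices(5, seed=2)

print(first == second)  # same seed, same instance
print(first == other)   # different seed, (almost surely) a different instance
```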

### Naive solution
This solution is a direct search over all values of $b$ and $s$. The complexity is $O(n^2)$: intuitively, we have two nested for loops, each of which does $O(n)$ iterations, where $n$ = len(A).
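
To make the $O(n^2)$ claim concrete, the following sketch counts how many $(b, s)$ pairs a double loop of this shape visits. The helper name `count_pairs` is ours, introduced for illustration only. The count is $n(n+1)/2$, which for $n = 10000$ equals 50005000.

```python
def count_pairs(n):
    """Count the (i0, j0) pairs with 0 <= i0 <= j0 < n visited by a double loop."""
    count = 0
    for i0 in range(n):
        for j0 in range(i0, n):
            count += 1
    return count

for n in (1, 6, 100):
    # closed form: n + (n-1) + ... + 1 = n(n+1)/2
    print(n, count_pairs(n), n * (n + 1) // 2)
```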

In [2]:
def naive(A):
    """ return best gain on A, using naive method 
        running time, due to doubly-nested loop, is O(n^2)
    """
    n = len(A)
    ans = 0
    for i0 in range(n):
        for j0 in range(i0,n):
            ans = max(ans, A[j0]-A[i0])
    return ans

In [3]:
naive([20, 3, 19, 1, 15, 6])

16

In [7]:
# slowness alert!
cProfile.run("naive(make_prices(10000, 1))")

         50025009 function calls in 14.463 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.004    0.004    0.005    0.005 <ipython-input-1-a3d706331740>:1(make_prices)
        1    9.050    9.050   14.458   14.458 <ipython-input-2-1b173f4b093c>:1(naive)
        1    0.000    0.000   14.463   14.463 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 random.py:100(seed)
        1    0.000    0.000    0.000    0.000 {function seed at 0x7f90fbfeb578}
        1    0.000    0.000    0.000    0.000 {len}
 50005000    5.213    0.000    5.213    0.000 {max}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    10000    0.001    0.000    0.001    0.000 {method 'random' of '_random.Random' objects}
    10002    0.195    0.000    0.195    0.000 {range}




### Divide and conquer approach
Here we split our array into two simpler problems corresponding to the left half of the array, $L$, and the right half, $R$.
For example, if
```python
A = [20, 3, 19, 1, 15, 6]
```
we can imagine that
```python
L = [20, 3, 19]
R = [1,  15, 6]
```
In order to reduce our problem to those simpler problems we need to consider three cases:
1. $b, s \in L$ - we can solve it by solving the original problem for $L$
2. $b, s \in R$ - we can solve it by solving the original problem for $R$
3. $b \in L$ and $s \in R$ - we can solve it by finding the minimum in $L$ and the maximum in $R$ and returning the difference

We need not consider the case $s \in L$ and $b \in R$ (why?).
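
For the example split above, the three cases can be evaluated directly. This is only a sketch: the helper `brute_gain` is a hypothetical name of ours that solves the two half-size problems by brute force, whereas the actual method solves them recursively.

```python
A = [20, 3, 19, 1, 15, 6]
mid = len(A) // 2
L, R = A[:mid], A[mid:]

def brute_gain(prices):
    """Best gain within one list by trying every buy/sell pair (illustration only)."""
    return max(prices[s] - prices[b]
               for b in range(len(prices))
               for s in range(b, len(prices)))

gain_left = brute_gain(L)      # case 1: buy and sell in L
gain_right = brute_gain(R)     # case 2: buy and sell in R
gain_cross = max(R) - min(L)   # case 3: buy in L, sell in R

print(gain_left, gain_right, gain_cross)        # 16 14 12
print(max(gain_left, gain_right, gain_cross))   # 16
```

Here the best trade happens to lie entirely inside $L$ (buy at 3, sell at 19), so case 1 wins.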

This way we have reduced our problem to two smaller problems. This is good: we keep turning bigger problems into smaller ones until we reach a problem so small that it is trivial to solve. In this case, if our array has size 1, the maximum profit we can make is $0$.

The complexity of this solution can be calculated by solving the following recurrence:

$$
T(n) = T(n / 2) + T(n / 2) + O(n)
$$
The three summands in the equation above come from cases 1, 2 and 3 listed above. In particular, notice that case 3 requires a single pass through the data and therefore has complexity $O(n)$. The solution to this recurrence is $T(n) = O(n \log n)$.
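
A small numerical sketch can make the growth of this recurrence tangible. It rests on simplifying assumptions of ours: the $O(n)$ term is taken to be exactly $n$, $T(1) = 1$, and $n$ is a power of two. Under those assumptions the recurrence evaluates to $n \log_2 n + n$, that is, growth proportional to $n \log n$.

```python
def T(n):
    """Evaluate T(n) = 2*T(n/2) + n with T(1) = 1, for n a power of two."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

for k in range(0, 11, 2):
    n = 2 ** k
    # closed form for n = 2**k: n*log2(n) + n = n*k + n
    print(n, T(n), n * k + n)
```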

In [9]:
def dc(A, lo=None, hi=None):
    """ return best gain on A[lo:hi], using divide & conquer 
        running time is solution to T(n) = 2*T(n/2) + Theta(n) = Theta(n log n)
    """
    if lo is None:
        lo = 0
    if hi is None:
        hi = len(A)
    n = hi-lo
    # base case
    if n == 1:
        return 0
    # divide and conquer
    # divide into lo:mid and mid:hi
    mid = (lo+hi)//2            
    # recurse on left half
    gain_low = dc(A, lo, mid)
    # recurse on right half
    gain_high = dc(A, mid, hi)
    # figure out best gain for buying in left half, selling in right half
    buy_price = min([ A[i] for i in range(lo, mid) ])
    sell_price = max([ A[i] for i in range(mid, hi)])
    gain_cross = sell_price - buy_price
    # optimum is max of three cases just solved
    return max(gain_low, gain_high, gain_cross)

In [10]:
dc([20, 3, 19, 1, 15, 6])

16

In [11]:
cProfile.run("dc(make_prices(10000, 1))")

         80001 function calls (60003 primitive calls) in 0.073 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.004    0.004    0.005    0.005 <ipython-input-1-a3d706331740>:1(make_prices)
  19999/1    0.047    0.000    0.068    0.068 <ipython-input-9-e8b912c3a064>:1(dc)
        1    0.000    0.000    0.073    0.073 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 random.py:100(seed)
        1    0.000    0.000    0.000    0.000 {function seed at 0x7f90fbfeb578}
        1    0.000    0.000    0.000    0.000 {len}
    19998    0.008    0.000    0.008    0.000 {max}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    10000    0.001    0.000    0.001    0.000 {method 'random' of '_random.Random' objects}
     9999    0.004    0.000    0.004    0.000 {min}
    19999    0.008    0.000    0.008    0.000 {range}




### Solution by algorithmic thinking
Notice that if we know a $k$ such that $b \leq k \leq s$, then we can find $b$ and $s$. Indeed, $b$ is the position of the minimum at or to the left of $k$, and $s$ is the position of the maximum at or to the right of $k$. Since we don't know which $k$ is correct, we need to try all values. Implementing that solution naively leads to $O(n^2)$ complexity, since each of the $n$ values of $k$ requires an $O(n)$ scan. Not happy.
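
The naive version of this idea might look like the sketch below, where `try_all_k` is a hypothetical name of ours. For every $k$ it rescans the array on both sides, which is where the $O(n^2)$ cost comes from.

```python
def try_all_k(A):
    """For every k, rescan A for the min at or before k and the max at or after k."""
    best = 0
    for k in range(len(A)):
        buy = min(A[:k + 1])   # O(n) scan of the left part
        sell = max(A[k:])      # O(n) scan of the right part
        best = max(best, sell - buy)
    return best

print(try_all_k([20, 3, 19, 1, 15, 6]))  # 16
```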

To improve on it, notice that we can precompute the answers to all the questions of the form *minimum to the left of $k$* (and store them in the array $B$) and *maximum to the right of $k$* (and store them in the array $S$) in $O(n)$ time. Once they are precomputed, we can look each one up in $O(1)$ time, which we will do $n$ times, once for each value of $k$. The total complexity is equal to:

$$
\text{work to precompute B} + \text{work to precompute S} + \text{work to evaluate all values of k}
$$

Notice that all three of those have complexity $O(n)$, so the total complexity is $O(n)$.
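
To see what the precomputed arrays look like on the running example, here is a compact sketch. It assumes Python 3, where `itertools.accumulate` accepts a binary function, and it is an alternative way of writing the idea rather than the implementation used in this notebook. The array `G` holds the candidate gain for each $k$.

```python
from itertools import accumulate

A = [20, 3, 19, 1, 15, 6]

# B[k] = cheapest price at or before position k (running minimum)
B = list(accumulate(A, min))
# S[k] = highest price at or after position k (running maximum from the right)
S = list(accumulate(reversed(A), max))[::-1]
# G[k] = best gain when buying no later than k and selling no earlier than k
G = [s - b for s, b in zip(S, B)]

print(B)       # [20, 3, 3, 1, 1, 1]
print(S)       # [20, 19, 19, 15, 15, 6]
print(G)       # [0, 16, 16, 14, 14, 5]
print(max(G))  # 16
```

Each of the three arrays is built in a single pass, which matches the three $O(n)$ terms in the sum above.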

In [12]:
def lin(A):
    """ return best gain, computed by linear-time alg 
        running time is Theta(n)
    """
    n = len(A)
    # B[k] = min{ A[i0]: i0 <= k }   for k = 0, 1, ..., n-1
    #      = price to buy at if you have to buy no later than k (and sell no earlier than k)
    B = [A[0]] * n
    for k in range(1, n):
        B[k] = min(B[k-1],A[k])
    # S[k] = max{ A[j0]: j0 >= k }   for k = 0, 1, ..., n-1
    #      = price to sell at if you have to sell no earlier than k (but bought no later than k)
    S = [A[n-1]] * n
    for k in range(n-2, -1, -1):
        S[k] = max(S[k+1], A[k])
    # G[k] = S[k] - B[k] for k = 0, 1, ..., n-1
    #      = best gain from buying no later than k, then selling no earlier than k
    G = [ S[k]-B[k] for k in range(n) ]
    # opt = max { G[k]: 0 <= k < n }
    #     = best possible gain for given input A
    opt = max(G)
    return opt


In [13]:
lin([20, 3, 19, 1, 15, 6])

16

In [14]:
cProfile.run("lin(make_prices(10000, 1))")

         30010 function calls in 0.026 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.004    0.004    0.006    0.006 <ipython-input-1-a3d706331740>:1(make_prices)
        1    0.015    0.015    0.020    0.020 <ipython-input-12-21a4740e4169>:1(lin)
        1    0.000    0.000    0.026    0.026 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 random.py:100(seed)
        1    0.000    0.000    0.000    0.000 {function seed at 0x7f90fbfeb578}
        1    0.000    0.000    0.000    0.000 {len}
    10000    0.003    0.000    0.003    0.000 {max}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    10000    0.001    0.000    0.001    0.000 {method 'random' of '_random.Random' objects}
     9999    0.003    0.000    0.003    0.000 {min}
        4    0.001    0.000    0.001    0.000 {range}




# Problems to think about (non-examinable, non-compulsory, strictly for fun...)
1. **Maximum sum subsequence problem** - given an array A find a contiguous subsequence of maximum sum. For example for
```python
A = [10, -2, 10, 5, -4, 14]
```
the answer is 33 (the whole array, since every negative entry is outweighed by its neighbours).

In [None]:
# Hint to problem 1
cyph = lambda x: chr((ord(x) + 64) % 128)
''.join(map(cyph, '\x08).4z`2%$5#%`4()3`02/",%-`4/`4(%`34/#+`%8#(!.\'%`02/",%-'))