# <span style='color:red'>Project 1</span>

#### In this project we use dynamic programming to create a trading schedule that maximizes the total number of shares traded, under a model of liquidity impact with memory.

In [3]:
import os
import pickle
import numpy as np
import pandas as pd
from util import run_sim

#### Suppose we have a total of N shares that we would like to trade over T time periods.  To do so, we produce a schedule
$$ (n_0, n_1, \ldots, n_{T-1}) \quad \text{where each} \quad n_i \ge 0$$
#### Each $n_i$ represents the quantity that we will attempt to trade at time $i = 0, 1, \ldots, T-1$.  In reality the market will only allow us to trade a smaller quantity at each time period.  We impose the conditions:
$$ \sum_{i=0}^{T-2} n_i \ \le N \quad \text{and} \quad n_{T-1} = N - \text{quantity traded so far}$$
#### This plays out as follows.  Assume that $\alpha > 0$ (and very small) and $0 < \pi < 1$ are given parameters.  Then we run the following process:
#### 1. Initialize $M = 0$.  Then for $i = 0, 1, \ldots, T-1$ we do the following:
#### 2. Compute $M \leftarrow \lceil 0.1*M + 0.9*n_i\rceil$.
#### 3. At time $i \le T-1$ we trade $S_i \ = \ \lceil(1 - \alpha M^\pi)n_i \rceil$ shares.  
#### 4. Note that $n_{T-1} = N \, - \, \sum_{i = 0}^{T-2} n_i$. 

#### <span style='color:red'>Example:</span>  N = 10000, T = 4,   $\alpha = 0.001$,   $\pi = 0.5$
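
The process in steps 1 to 3 can be replayed for these example parameters with a few lines of Python. This is only a minimal sketch: the helper name `simulate` and the evenly split schedule are assumptions made for illustration, and the schedule is not claimed to be optimal.

```python
import math

def simulate(schedule, alpha, pi):
    """Run steps 1-3 for a given schedule; return the shares traded per period."""
    M = 0  # step 1: memory starts at zero
    traded = []
    for n in schedule:
        M = math.ceil(0.1 * M + 0.9 * n)                     # step 2: update memory
        traded.append(math.ceil((1 - alpha * M ** pi) * n))  # step 3: shares traded
    return traded

# Hypothetical schedule (an assumption): N = 10000 split evenly over T = 4 periods.
schedule = [2500, 2500, 2500, 2500]
S = simulate(schedule, alpha=0.001, pi=0.5)
print(S, sum(S))
```

Because the memory $M$ grows while we keep trading, the later periods in this sketch trade slightly fewer shares than the first one, even though every $n_i$ is the same.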

### <span style='color:red'>Task 1: </span>code a dynamic programming algorithm that computes an optimal schedule of trades $(n_0, n_1, \ldots, n_{T-1})$ with the goal of maximizing the total number of traded shares
#### Make sure that your code runs well for a range of values of $\alpha$ and $\pi$
#### Compute the optimal schedule when $\alpha = 0.001$, $\pi = 0.5$, $N = 100000$ and $T = 10$.   Denote this schedule by $(S_0, S_1, \ldots, S_9)$.
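
One possible way to set up such a dynamic program is sketched below. It is not the `run_sim` routine imported from `util`, whose internals are not shown here. The sketch assumes that quantities are restricted to multiples of a hypothetical grid size `step`, and it memoizes on the state (period, shares still unscheduled, memory $M$). The names `traded`, `next_memory` and `best_schedule` are illustrative.

```python
import math
from functools import lru_cache

def traded(n, M, alpha, pi):
    """Shares actually traded when n are attempted with memory M (step 3)."""
    return math.ceil((1 - alpha * M ** pi) * n)

def next_memory(M, n):
    """Step 2 in integer arithmetic: ceil(0.1*M + 0.9*n) = ceil((M + 9n) / 10)."""
    return -(-(M + 9 * n) // 10)

def best_schedule(N, T, alpha, pi, step):
    """Return (total traded, schedule) over schedules on a grid of `step` shares."""
    @lru_cache(maxsize=None)
    def solve(t, remaining, M):
        if t == T - 1:
            # last period: attempt everything that has not been scheduled yet
            M_new = next_memory(M, remaining)
            return traded(remaining, M_new, alpha, pi), (remaining,)
        best = (-1, ())
        for n in range(0, remaining + 1, step):
            M_new = next_memory(M, n)
            tail_total, tail = solve(t + 1, remaining - n, M_new)
            total = traded(n, M_new, alpha, pi) + tail_total
            if total > best[0]:
                best = (total, (n,) + tail)
        return best
    return solve(0, N, 0)

# Small instance so that the sketch runs quickly; the grid size is an assumption.
total, schedule = best_schedule(N=10_000, T=4, alpha=0.001, pi=0.5, step=500)
print(schedule, total)
```

The number of reachable $(\text{remaining}, M)$ states grows with $T$ and with a finer grid, so for $N = 100000$ and $T = 10$ a coarse `step` (or some rounding of $M$) would likely be needed to keep the computation manageable.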

In [4]:
already_ran = False
z, alpha, target_pi = 0.1, 1e-3, 0.5

data_dir = './data'
schedules_path = os.path.join(data_dir, 'schedules.pkl')

if not already_ran:
    # run_sim returns approximate schedules, keyed by pi
    schedules = run_sim(z=z, alpha=alpha, T=10, N=1e5, several_pi=[0.3, 0.4, 0.5, 0.6, 0.7], notebook=True, F=3)

    # cache the schedules so that later runs can skip the simulation
    os.makedirs(data_dir, exist_ok=True)
    with open(schedules_path, 'wb') as f:
        pickle.dump(schedules, f)

else:
    with open(schedules_path, 'rb') as f:
        schedules = pickle.load(f)


In [5]:
schedule = schedules[target_pi]
print(f'SCHEDULE FOR PI: { target_pi } => { str(schedule) } => SUMS TO { np.sum(schedule) }')

# replay the liquidity-impact process to count the shares actually traded
S = np.zeros(len(schedule), dtype='i')
M = total = 0
for t, nt in enumerate(schedule):
    M = np.ceil(z*M + (1-z)*nt)                  # memory update (step 2)
    S[t] = np.ceil((1 - alpha*M**target_pi)*nt)  # shares traded at time t (step 3)
    total += S[t]

print(f'ABLE TO SELL { total } WITH OUR SCHEDULE')

SCHEDULE FOR PI: 0.5 => [6000, 16000, 9000, 9000, 9000, 9000, 9000, 9000, 9000, 15000] => SUMS TO 100000
ABLE TO SELL 89802 WITH OUR SCHEDULE


### <span style='color:red'>Task 2. Test the effectiveness of this computed schedule using the first 2 hours of each day in the TSLA data </span>
To do so, we divide the first 2 hours of each day into 12 separate intervals of ten minutes each.
Each interval is evaluated as follows.  Suppose that the traded volume in that interval is given by the numbers $(V_0, V_1, \ldots, V_9)$. 
Then the interval score we assign to our schedule is given by
$$ \sum_{i = 0}^9 \min\{ S_i, V_i/100 \}.$$
Effectively, this scheme allows us to trade up to a volume of 1% of what the market actually traded.
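
As a small illustration of the interval score, the sketch below evaluates the formula for one interval. The schedule values are the ones printed above for $\pi = 0.5$; the volumes are made-up numbers chosen only to show the capping, and `interval_score` is an illustrative name.

```python
def interval_score(S, V, scale=100):
    """Sum over the interval of min(S_i, V_i / scale)."""
    return sum(min(s, v / scale) for s, v in zip(S, V))

S = [6000, 16000, 9000, 9000, 9000, 9000, 9000, 9000, 9000, 15000]

# Hypothetical one-interval volumes (an assumption, not TSLA data).
V = [500_000, 2_000_000, 800_000, 1_000_000, 900_000,
     700_000, 1_200_000, 600_000, 950_000, 1_600_000]

print(interval_score(S, V))  # 93000.0
```

Whenever $V_i/100$ falls below $S_i$, the market volume is the binding constraint for that term; otherwise the schedule is.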

#### The TOTAL SCORE we assign to our schedule is the average of all the interval scores, taken over the first 12 intervals of every day in the first half of our data
#### In other words, if we have 300 days of data, we take the first 150, and we get in total $12 \times 150 = 1800$ intervals
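
The averaging described above could be written with pandas roughly as follows. This is a sketch under assumptions: it expects a `Volume` column on a `DatetimeIndex` with exactly 120 rows per day, it follows the $V_i/100$ formula from the text, and the function name `total_score` and the synthetic data are invented for illustration.

```python
import numpy as np
import pandas as pd

def total_score(df, S, scale=100):
    """Average interval score over the first half of the complete days in df."""
    S = np.asarray(S, dtype=float)
    rows_per_day = 12 * len(S)
    # one volume array per calendar day, in chronological order
    days = [g['Volume'].to_numpy() for _, g in df.groupby(df.index.date)]
    days = [v for v in days if len(v) == rows_per_day]  # keep complete days only
    days = days[: len(days) // 2]                        # first half of the data
    # each day becomes a (12, len(S)) block; each row is one interval
    scores = [np.minimum(S, v.reshape(12, -1) / scale).sum(axis=1) for v in days]
    return float(np.mean(scores))

# Synthetic data (an assumption): 4 days of 120 one-minute rows, constant volume per day.
frames = []
for k, date in enumerate(['2020-01-06', '2020-01-07', '2020-01-08', '2020-01-09']):
    idx = pd.date_range(f'{date} 09:30', periods=120, freq='min')
    frames.append(pd.DataFrame({'Volume': (k + 1) * 100}, index=idx))
toy_df = pd.concat(frames)

print(total_score(toy_df, S=[2] * 10))  # 15.0
```

Only the first two synthetic days are used: their intervals score 10 and 20 respectively, so the average over the 24 intervals is 15.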

In [6]:
df = pd.read_csv('./data/TSLA_first_two_hours_HW1.csv')

df['Dates'] = pd.to_datetime(df['Dates'])

df = df.set_index('Dates').sort_index()

In [7]:
x = df.index.min()
ub = df.index.max()

day = pd.DateOffset(days=1)

res = list()  # one score per 10-row interval
while x < ub:
    # rows in the 24-hour window starting at x, with the first row dropped
    mask = (df.index >= x) & (df.index < x + day)
    day_df = df[mask].iloc[1:]
    
    # only score windows with exactly 120 rows (12 intervals of 10 rows)
    if len(day_df) == 120:
        count = score = 0
        for i in range(len(day_df)):
            # compares against the raw volume: no divide by 100
            score += min(schedules[target_pi][count], day_df.iloc[i]['Volume'])

            count += 1
            if count == 10:
                res.append(score)
                score = count = 0

    x += day

print(f'SCORE FOR PI: { target_pi } => { round(np.mean(res), 2) }')

SCORE FOR PI: 0.5 => 7806.08


### <span style='color:red'>Task 3:</span>  code an algorithm that (approximately) does the following:
#### 1. It approximately enumerates all possible values for $\pi$ between $0.3$ and $0.7$
#### 2. It approximately computes the value of $\pi$ that maximizes the TOTAL SCORE, when $N = 100000$, $T = 10$ and $\alpha = 0.001$.
#### 3. This means that we run the DP algorithm (under the chosen value of $\pi$) and then evaluate as above to compute the TOTAL SCORE.
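
The search itself can be sketched as a loop over a grid of candidate values. In the sketch below, `best_pi` is an illustrative name, and the two callables passed to it are toy stand-ins (assumptions) for the DP solver and the TOTAL SCORE evaluation, included only so that the example runs on its own.

```python
def best_pi(pi_grid, make_schedule, score):
    """Score the schedule produced for each pi and return (best pi, all scores)."""
    results = {pi: score(make_schedule(pi)) for pi in pi_grid}
    return max(results, key=results.get), results

# Evenly spaced candidates between 0.3 and 0.7; a finer grid gives a closer approximation.
pi_grid = [round(0.3 + 0.1 * k, 1) for k in range(5)]

# Hypothetical stand-ins for the DP solver and the TOTAL SCORE (not the real ones).
toy_schedule = lambda pi: [round(1000 * pi)] * 10
toy_score = lambda schedule: -abs(sum(schedule) - 5000)

best, results = best_pi(pi_grid, toy_schedule, toy_score)
print(best, results)
```

In practice `make_schedule` would run the DP for the given $\pi$ and `score` would compute the TOTAL SCORE on the data, which is the expensive part of each iteration.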

In [8]:
ub = df.index.max()

day = pd.DateOffset(days=1)

all_res = dict()  # pi -> average interval score
for pi, schedule in schedules.items():
    x = df.index.min()

    res = list()
    while x < ub:
        # rows in the 24-hour window starting at x, with the first row dropped
        mask = (df.index >= x) & (df.index < x + day)
        day_df = df[mask].iloc[1:]
        
        # only score windows with exactly 120 rows (12 intervals of 10 rows)
        if len(day_df) == 120:
            count = score = 0
            for i in range(len(day_df)):
                # compares against the raw volume: no divide by 100
                score += min(schedule[count], day_df.iloc[i]['Volume'])

                count += 1
                if count == 10:
                    res.append(score)
                    score = count = 0

        x += day

    all_res[pi] = np.mean(res)

In [9]:
print('ALL PIs SORTED BY DESCENDING SCORE:')
for pi, score in sorted(all_res.items(), key=lambda kv: kv[1], reverse=True):
    print(f'SCORE FOR PI: { pi } => { round(score, 2) }')

ALL PIs SORTED BY DESCENDING SCORE:
SCORE FOR PI: 0.5 => 7806.08
SCORE FOR PI: 0.7 => 7805.99
SCORE FOR PI: 0.6 => 7803.61
SCORE FOR PI: 0.4 => 4581.68
SCORE FOR PI: 0.3 => 2244.18
