# High Frequency Trading Algorithm

You have been tasked by the investment firm Renaissance High Frequency Trading (RHFT) to develop an automated trading strategy utilizing a combination of machine learning algorithms and high frequency algorithms. RHFT wants this new algorithm to be based on stock market data of the 30 stocks in the Dow Jones at the minute level and to conduct buys and sells every minute based on 1 min, 5 min, and 10 min Momentum. The CIO asked you to choose the Machine Learning Algorithm best suited for this task and wants you to execute the trades via Alpaca's API.

## Part 1: Prepare the data for training and testing

### Initial Set-Up

In [1]:
import os
from pathlib import Path
import alpaca_trade_api as tradeapi
import pandas as pd
import numpy as np
import datetime
import time
from dotenv import load_dotenv

In [2]:
!pip install python-dotenv



In [2]:
# Load .env enviroment variables
# YOUR CODE HERE
load_dotenv('key.env')

True

In [3]:
# Set Alpaca API key and secret
# YOUR CODE HERE
alpaca_api_key = os.getenv("ALPACA_API_KEY")
alpaca_secret_key = os.getenv("ALPACA_SECRET_KEY")
alpaca_api_base_url = "https://paper-api.alpaca.markets"
#PKPOTQ9DJW12ODHW8FMR

In [4]:
# Create the Alpaca API object, specifying use of the paper trading account:
# YOUR CODE HERE
api = tradeapi.REST(
    alpaca_api_key,
    alpaca_secret_key,
    alpaca_api_base_url,
    api_version="v2")

### Data Generation



#### 1. Create a ticker list, beginning and end dates, and timeframe interval.


In [5]:
# Define a list of tickers
    # YOUR CODE HERE
ticker_list = ['FB','AMZN','AAPL','NFLX', 'GOOGL', 'MSFT', 'TSLA']
# declare begin and end date strings
beg_date = '2021-01-05'
end_date = '2021-01-05'
# we convert begin and end date to formats that the ALPACA API requires
start =  pd.Timestamp(f'{beg_date} 09:30:00-0400', tz='America/New_York').replace(hour=9, minute=30, second=0).astimezone('GMT').isoformat()[:-6]+'Z'
end   =  pd.Timestamp(f'{end_date} 16:00:00-0400', tz='America/New_York').replace(hour=16, minute=0, second=0).astimezone('GMT').isoformat()[:-6]+'Z'
# We set the time frequency at which we want to pull prices
timeframe='1Min'

#### 2. Ping the Alpaca API for the data and store it in a DataFrame called `prices` by using the `get_barset` function combined with the `df` method from the Alpaca Trade SDK.

In [6]:
# Pull prices from the ALPACA API
prices = api.get_barset(ticker_list, timeframe,limit=1000, start=start, end=end).df
prices.head()

Unnamed: 0_level_0,AAPL,AAPL,AAPL,AAPL,AAPL,AMZN,AMZN,AMZN,AMZN,AMZN,...,NFLX,NFLX,NFLX,NFLX,NFLX,TSLA,TSLA,TSLA,TSLA,TSLA
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume,...,open,high,low,close,volume,open,high,low,close,volume
time,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2
2021-01-05 09:30:00-05:00,128.96,129.485,128.45,129.485,51887,3166.8,3173.53,3166.8,3172.98,1963.0,...,521.98,521.98,520.77,521.03,1355.0,723.66,726.28,721.35,725.23,18284.0
2021-01-05 09:31:00-05:00,129.48,130.17,129.3,130.06,44188,3173.59,3182.67,3173.58,3177.81,1266.0,...,520.92,521.755,520.92,521.365,1112.0,726.6,726.999,722.42,723.0,7760.0
2021-01-05 09:32:00-05:00,130.17,130.32,129.93,130.02,12852,3175.0,3175.47,3174.91,3175.47,778.0,...,522.355,522.355,520.77,520.77,1347.0,723.1,723.1,719.78,720.57,9902.0
2021-01-05 09:33:00-05:00,130.09,130.14,129.78,130.12,14192,3181.52,3181.52,3177.87,3179.36,660.0,...,520.84,520.84,520.0,520.0,1582.0,720.53,722.71,719.22,719.71,7086.0
2021-01-05 09:34:00-05:00,130.15,130.58,130.15,130.51,12002,3183.66,3189.98,3183.66,3184.015,731.0,...,521.44,522.26,521.37,522.24,1039.0,719.97,724.22,719.97,724.22,8581.0


#### 3. Store only the close prices from the `prices` DataFrame in a new DataFrame called `df_closing_prices`, then view the head and tail to confirm the following:
* First price for each stock on the open at 9:30 Eastern Time.
* Last price for the day on the close at 3:59 pm Eastern Time.

In [7]:
# Create an empty DataFrame for closing prices
# YOUR CODE HERE
df_closing_prices = pd.DataFrame()

for i in ticker_list:
    df_closing_prices[i] = prices[i]['close']

# Fetch the closing prices for each one of the tickers and store in a column in df_closing_prices amed after that ticker
# YOUR CODE HERE
df_closing_prices

Unnamed: 0_level_0,FB,AMZN,AAPL,NFLX,GOOGL,MSFT,TSLA
time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
2021-01-05 09:30:00-05:00,269.00,3172.980,129.485,521.030,1724.17,217.650,725.23
2021-01-05 09:31:00-05:00,269.17,3177.810,130.060,521.365,1724.05,217.630,723.00
2021-01-05 09:32:00-05:00,269.72,3175.470,130.020,520.770,1721.61,217.770,720.57
2021-01-05 09:33:00-05:00,268.80,3179.360,130.120,520.000,,217.720,719.71
2021-01-05 09:34:00-05:00,269.58,3184.015,130.510,522.240,1720.30,217.310,724.22
...,...,...,...,...,...,...,...
2021-01-05 15:56:00-05:00,270.65,3219.840,130.850,519.570,1738.15,217.970,733.00
2021-01-05 15:57:00-05:00,270.91,3222.700,131.010,520.460,1738.99,218.175,734.49
2021-01-05 15:58:00-05:00,270.88,3221.180,130.990,520.300,1738.84,218.150,734.83
2021-01-05 15:59:00-05:00,270.86,3219.670,130.965,520.760,1740.57,218.000,735.33


In [8]:
# Preview first five rows
# YOUR CODE HERE
df_closing_prices.head(5)

Unnamed: 0_level_0,FB,AMZN,AAPL,NFLX,GOOGL,MSFT,TSLA
time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
2021-01-05 09:30:00-05:00,269.0,3172.98,129.485,521.03,1724.17,217.65,725.23
2021-01-05 09:31:00-05:00,269.17,3177.81,130.06,521.365,1724.05,217.63,723.0
2021-01-05 09:32:00-05:00,269.72,3175.47,130.02,520.77,1721.61,217.77,720.57
2021-01-05 09:33:00-05:00,268.8,3179.36,130.12,520.0,,217.72,719.71
2021-01-05 09:34:00-05:00,269.58,3184.015,130.51,522.24,1720.3,217.31,724.22


In [9]:
# Preview last five rows
# YOUR CODE HERE
df_closing_prices.tail(5) 

Unnamed: 0_level_0,FB,AMZN,AAPL,NFLX,GOOGL,MSFT,TSLA
time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
2021-01-05 15:56:00-05:00,270.65,3219.84,130.85,519.57,1738.15,217.97,733.0
2021-01-05 15:57:00-05:00,270.91,3222.7,131.01,520.46,1738.99,218.175,734.49
2021-01-05 15:58:00-05:00,270.88,3221.18,130.99,520.3,1738.84,218.15,734.83
2021-01-05 15:59:00-05:00,270.86,3219.67,130.965,520.76,1740.57,218.0,735.33
2021-01-05 16:00:00-05:00,,,131.14,,,,


#### 4. When viewing the head and tail, you'll notice several `NaN` values.
* Alpaca reports `NaN` for minutes without any trades occuring as missing.
* These values must be removed, we use Panda's `ffill()` function to "forward fill", or replace, those prices with the previous values (since the price has not changed).


In [10]:
# Use Pandas' forward fill function to fill missing values (be sure to set inplace=True)
# YOUR CODE HERE
df_closing_prices.ffill(inplace=True)

### Computing Returns

In [11]:
#### 1. Compute the percentage change values for 1 minute as follows:
* Create a variable called `forecast` to hold the forecast, in this case `1` for 1 minute.
* Use the `pct_change` function, passing in the `forecast`, on the `df_closing_prices` DataFrame, storeing the newly generated DataFrame in a variable called `returns`.
* Convert the `returns` DataFrame to show forward returns by passing `-(forecast)` into the `shift function.`

SyntaxError: invalid syntax (<ipython-input-11-c7f61f8492a1>, line 2)

In [12]:
# Define a variable to set prediction period
# YOUR CODE HERE
forecast = 1
# Compute the pct_change for 1 min 
# YOUR CODE HERE
forecast_df = df_closing_prices.pct_change(freq= '1min')
# Shift the returns to convert them to forward returns
# YOUR CODE HERE
forecast_df = forecast_df.shift(-(forecast))
# Preview the DataFrame
# YOUR CODE HERE
forecast_df.head()

Unnamed: 0_level_0,FB,AMZN,AAPL,NFLX,GOOGL,MSFT,TSLA
time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
2021-01-05 09:30:00-05:00,0.000632,0.001522,0.004441,0.000643,-7e-05,-9.2e-05,-0.003075
2021-01-05 09:31:00-05:00,0.002043,-0.000736,-0.000308,-0.001141,-0.001415,0.000643,-0.003361
2021-01-05 09:32:00-05:00,-0.003411,0.001225,0.000769,-0.001479,0.0,-0.00023,-0.001193
2021-01-05 09:33:00-05:00,0.002902,0.001464,0.002997,0.004308,-0.000761,-0.001883,0.006266
2021-01-05 09:34:00-05:00,0.001335,0.002074,0.000651,-0.001704,0.003061,-0.001703,0.004667


##### Note: 
> You can verify these returns are computed correctly by analyzing the first observation for Facebook:
> * 9:30 am for 0.000632.
 
> How is that number computed? 
 
> * The price of Facebook at 9:30 is 269.00
> * The price of Facebook at 9:31 is 269.17

> Which gives you:

> * (269.17 - 	269.00)/ 269.90 = 0.000632
 

#### 2. Convert the DataFrame into long form for merging later using `unstack` and `reset_index`.

In [13]:
# Use unstack() to bring the data in long format and save the output as as dataframe
unstack = pd.DataFrame(forecast_df.unstack())
# YOUR CODE HERE
# Rename the column to make it easer to identify it:
name = f'F_{forecast}_m_returns'
unstack.rename(columns={0: name}, inplace = True)
# Reset the index of the dataframe for merging later (be sure to set inplace=True)
# YOUR CODE HERE
unstack.reset_index(inplace=True,col_fill='FB')


In [14]:
unstacks = unstack.rename(columns={'time':'level_1'})

In [15]:
# Preview the first five rows
# YOUR CODE HERE
unstack.head()

Unnamed: 0,level_0,time,F_1_m_returns
0,FB,2021-01-05 09:30:00-05:00,0.000632
1,FB,2021-01-05 09:31:00-05:00,0.002043
2,FB,2021-01-05 09:32:00-05:00,-0.003411
3,FB,2021-01-05 09:33:00-05:00,0.002902
4,FB,2021-01-05 09:34:00-05:00,0.001335


In [16]:
# Preview the last five rows
# YOUR CODE HERE
unstack.tail()

Unnamed: 0,level_0,time,F_1_m_returns
2732,TSLA,2021-01-05 15:56:00-05:00,0.002033
2733,TSLA,2021-01-05 15:57:00-05:00,0.000463
2734,TSLA,2021-01-05 15:58:00-05:00,0.00068
2735,TSLA,2021-01-05 15:59:00-05:00,0.0
2736,TSLA,2021-01-05 16:00:00-05:00,


In [17]:
#### 3. Compute the 1, 5, 10 minute momentums that will be used to predict the forward returns, then merge them with the forward returns as follows:
* Create the list of moments: `list_of_momentums = [1,5,10]`.
* Write a for-loop to loop through the `list_of_momentums`, applying them to `pct_change` with the `df_closing_price` with each iteration.
* With each loop, the data temporary DataFrame, `returns_temp` will need to be prepped with `unstack` and `reset_index`, then added as a new column to the original `returns` DataFrame from the prior step.
* Complete this step by dropping the null values from `returns` and creating a multi-index based on date and ticker.

SyntaxError: invalid syntax (<ipython-input-17-02cc5d1193e5>, line 2)

In [18]:
# Create list of momentums that we want to predict
#list_of_momentums = [1,5,10]

#for i in list_of_momentums:   
    # Compute percentage change for each one of the momentums in the momentum list
    # YOUR CODE HERE
    #returns_temp = df_closing_prices.pct_change(i)
    # Unstack the returns 
    #returns_temp = pd.DataFrame(returns_temp.unstack())
    #name = f'F_{i}_m_returns'
    #returns_temp.rename(columns={'time':'level_1'}, inplace = True)
    #returns_temp.rename(columns={'name':'level_1'}, inplace = True)

    #Reset the index so we can merge based on index
    #returns.reset_index(inplace=True)
    # Create list of momentums
list_of_momentums = [1,5,10]

for i in list_of_momentums:  
    # Compute percentage change for each one of the momentums in the momentum list
    returns_temp = df_closing_prices.pct_change(i)
    # Unstack the returns 
    returns_temp = pd.DataFrame(returns_temp.unstack())
    name = f'{i}_m_returns'
    returns_temp.rename(columns={0: name}, inplace = True)
    # Reset the index so we can merge based on index
    returns_temp.reset_index(inplace = True)
    # Merge newly computed returns with previously created returns
    if i ==1:
        returns = returns_temp
    else:
        returns = pd.merge(returns,returns_temp,left_on=['level_0', 'time'],right_on=['level_0', 'time'], how='left', suffixes=('_original', 'right'))

In [19]:
returns.rename(columns = {'time':'level_1'})

Unnamed: 0,level_0,level_1,1_m_returns,5_m_returns,10_m_returns
0,FB,2021-01-05 09:30:00-05:00,,,
1,FB,2021-01-05 09:31:00-05:00,0.000632,,
2,FB,2021-01-05 09:32:00-05:00,0.002043,,
3,FB,2021-01-05 09:33:00-05:00,-0.003411,,
4,FB,2021-01-05 09:34:00-05:00,0.002902,,
...,...,...,...,...,...
2732,TSLA,2021-01-05 15:56:00-05:00,-0.000259,0.000382,0.000806
2733,TSLA,2021-01-05 15:57:00-05:00,0.002033,0.003463,0.003868
2734,TSLA,2021-01-05 15:58:00-05:00,0.000463,0.004223,0.004140
2735,TSLA,2021-01-05 15:59:00-05:00,0.000680,0.003357,0.004934


In [20]:
new_df = pd.merge(unstack,returns)
new_dfs= new_df.rename(columns={'time': 'level_1'})

In [21]:
new_dfs.head(11)

Unnamed: 0,level_0,level_1,F_1_m_returns,1_m_returns,5_m_returns,10_m_returns
0,FB,2021-01-05 09:30:00-05:00,0.000632,,,
1,FB,2021-01-05 09:31:00-05:00,0.002043,0.000632,,
2,FB,2021-01-05 09:32:00-05:00,-0.003411,0.002043,,
3,FB,2021-01-05 09:33:00-05:00,0.002902,-0.003411,,
4,FB,2021-01-05 09:34:00-05:00,0.001335,0.002902,,
5,FB,2021-01-05 09:35:00-05:00,0.000185,0.001335,0.003494,
6,FB,2021-01-05 09:36:00-05:00,0.000778,0.000185,0.003046,
7,FB,2021-01-05 09:37:00-05:00,-0.000777,0.000778,0.00178,
8,FB,2021-01-05 09:38:00-05:00,0.001,-0.000777,0.004427,
9,FB,2021-01-05 09:39:00-05:00,7.4e-05,0.001,0.002522,


In [22]:
returns.tail(11)

Unnamed: 0,level_0,time,1_m_returns,5_m_returns,10_m_returns
0,FB,2021-01-05 09:30:00-05:00,,,
1,FB,2021-01-05 09:31:00-05:00,0.000632,,
2,FB,2021-01-05 09:32:00-05:00,0.002043,,
3,FB,2021-01-05 09:33:00-05:00,-0.003411,,
4,FB,2021-01-05 09:34:00-05:00,0.002902,,
5,FB,2021-01-05 09:35:00-05:00,0.001335,0.003494,
6,FB,2021-01-05 09:36:00-05:00,0.000185,0.003046,
7,FB,2021-01-05 09:37:00-05:00,0.000778,0.00178,
8,FB,2021-01-05 09:38:00-05:00,-0.000777,0.004427,
9,FB,2021-01-05 09:39:00-05:00,0.001,0.002522,


In [23]:
# Use dropna() to get rid of those missing observations.
# YOUR CODE HERE
new = new_dfs.dropna()
# Create a multi index based on level_0 and time
# YOUR CODE HERE
newr = new.set_index(['level_0', 'level_1'])
newr

Unnamed: 0_level_0,Unnamed: 1_level_0,F_1_m_returns,1_m_returns,5_m_returns,10_m_returns
level_0,level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
FB,2021-01-05 09:40:00-05:00,0.000814,0.000074,0.001260,0.004758
FB,2021-01-05 09:41:00-05:00,0.000887,0.000814,0.001889,0.004941
FB,2021-01-05 09:42:00-05:00,0.000628,0.000887,0.001999,0.003782
FB,2021-01-05 09:43:00-05:00,0.000480,0.000628,0.003408,0.007850
FB,2021-01-05 09:44:00-05:00,-0.001291,0.000480,0.002886,0.005416
...,...,...,...,...,...
TSLA,2021-01-05 15:55:00-05:00,-0.000259,0.000437,0.001140,0.000164
TSLA,2021-01-05 15:56:00-05:00,0.002033,-0.000259,0.000382,0.000806
TSLA,2021-01-05 15:57:00-05:00,0.000463,0.002033,0.003463,0.003868
TSLA,2021-01-05 15:58:00-05:00,0.000680,0.000463,0.004223,0.004140


## Part 2: Train and Compare Multiple Machine Learning Algorithms

 In this section, you'll train each of the requested algorithms and compare performance. Be sure to use the same parameters and training steps for each model. This is necessary to compare each model accurately.

### Preprocessing Data

#### 1. Generate your feature data (`X`) and target data (`y`):
* Create a dataframe `X` that contains all the columns from the returns dataframe that will be used to predict `F_1_m_returns`.
* Create a variable, called `y`, that is equal 1 if `F_1_m_returns` is larger than 0. This will be our target variable.

In [24]:
# Load the dataset returns.csv and set the index to level_0 and time
# YOUR CODE HERE
newr.head()

Unnamed: 0_level_0,Unnamed: 1_level_0,F_1_m_returns,1_m_returns,5_m_returns,10_m_returns
level_0,level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
FB,2021-01-05 09:40:00-05:00,0.000814,7.4e-05,0.00126,0.004758
FB,2021-01-05 09:41:00-05:00,0.000887,0.000814,0.001889,0.004941
FB,2021-01-05 09:42:00-05:00,0.000628,0.000887,0.001999,0.003782
FB,2021-01-05 09:43:00-05:00,0.00048,0.000628,0.003408,0.00785
FB,2021-01-05 09:44:00-05:00,-0.001291,0.00048,0.002886,0.005416


In [25]:
X = newr.drop(columns=['F_1_m_returns'])
X

Unnamed: 0_level_0,Unnamed: 1_level_0,1_m_returns,5_m_returns,10_m_returns
level_0,level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
FB,2021-01-05 09:40:00-05:00,0.000074,0.001260,0.004758
FB,2021-01-05 09:41:00-05:00,0.000814,0.001889,0.004941
FB,2021-01-05 09:42:00-05:00,0.000887,0.001999,0.003782
FB,2021-01-05 09:43:00-05:00,0.000628,0.003408,0.007850
FB,2021-01-05 09:44:00-05:00,0.000480,0.002886,0.005416
...,...,...,...,...
TSLA,2021-01-05 15:55:00-05:00,0.000437,0.001140,0.000164
TSLA,2021-01-05 15:56:00-05:00,-0.000259,0.000382,0.000806
TSLA,2021-01-05 15:57:00-05:00,0.002033,0.003463,0.003868
TSLA,2021-01-05 15:58:00-05:00,0.000463,0.004223,0.004140


In [26]:
y = np.where(newr['F_1_m_returns']>0,1,0)

In [27]:
# Create a separate dataframe for features and define the target variable as a binary target
# YOUR CODE HERE

# Create the target variable
# YOUR CODE HERE

##### Note:
> Notice that we don't use shuffle when splitting the dataset into a training and testing dataset. 

> We want to keep the original ordering of the data, so we don't end up using observations in the future to predict past observations,

> This is a critical mistake known as look ahead bias.

#### 2. Use the train_test_split library to split the dataset into a training and testing dataset, with 70% used for testing
* Set the shuffle parameter to False, so that you use the first 70% for training to prvent look ahead bias.
* Make sure you have these 4 variables: `X_train`, `X_test`, `y_train`, `y_test`. 

In [28]:
# Import train_test_split 
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
#from sklearn.model_selection import Time_Series_Split

# Split the dataset without shuffling
# YOUR CODE HERE
X_train, X_test, y_train, y_test = train_test_split(X, y,train_size=0.7)
#X_train, X_test, y_train, y_test = Time_Series_Split(X, y,n_splits = 5)

In [29]:
# Create the StandardScaler instance
scaler = StandardScaler()

In [30]:
# Fit the Standard Scaler with the training data
X_scaler = scaler.fit(X_train)

In [31]:
# Scale the training data
X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)

In [33]:
#!pip install imblearn

#### 3. Use the `Counter` function to test the distribution of the data. 
* The result of `Counter({1: 668, 0: 1194})` reveals the data is indeed unbalanced.

In [1]:
# Import the Counter function from the collections library
from collections import Counter
# Use Counter to count the number 1s and 0 in y_train
# YOUR CODE HERE
Counter(y_train)

NameError: name 'y_train' is not defined

#### 4. Balance the dataset with the Oversampler libary, setting `random state= 1`.

In [33]:
# Import RandomOverSampler from the imblearn library
from imblearn.over_sampling import RandomOverSampler

# Use RandomOverSampler to resample the datase using random_state=1
ros = RandomOverSampler(random_state=1)
X_resampled, y_resampled = ros.fit_resample(X_train, y_train)

#### 5. Test the distribution once again with `Counter`. The new result of `Counter({1: 1194, 0: 1194})` shows the data is now balanced.

In [34]:
# Use Counter again to verify imbalance removed
# YOUR CODE HERE
Counter(y_resampled)

Counter({0: 1112, 1: 1112})

# Machine Learning

#### 1. The first cells in this section provide an example of how to fit and train your model using the `LogisticRegression` model from sklearn:
* Import select model.
* Instantiate model object.
* Fit the model to the resampled data - `X_resampled` and `y_resampled`.
* Predict the model using `X_test`.
* Print the classification report.

In [35]:
# Import classification_report from sklearn
from sklearn.metrics import classification_report


from sklearn import tree
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.ensemble import GradientBoostingClassifier

# Needed for decision tree visualization
import pydotplus
from IPython.display import Image

In [39]:
!pip install pydotplus



In [36]:
# Import LogisticRegression from sklearn
from sklearn.linear_model import LogisticRegression

# Create a LogisticRegression model and train it on the X_resampled data we created before
log_model = LogisticRegression()
log_model.fit(X_resampled, y_resampled)  

# Use the model you trained to predict using X_test
y_pred = log_model.predict(X_test)   

# Print out a classification report toevaluate performance
print(classification_report(y_test, y_pred, digits=4))

              precision    recall  f1-score   support

           0     0.5888    0.5175    0.5508       487
           1     0.3666    0.4359    0.3982       312

    accuracy                         0.4856       799
   macro avg     0.4777    0.4767    0.4745       799
weighted avg     0.5020    0.4856    0.4912       799



#### 2. Use the same approach as above to train and test the following ML Algorithms:
* [RandomForestClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)
* [GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html)
* [AdaBoostClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html)
* [XGBClassifier](https://xgboost.readthedocs.io/en/latest/python/python_api.html)

#### RandomForestClassifier

In [37]:
# Import RandomForestClassifier from sklearn
from sklearn.ensemble import RandomForestClassifier

# Create a RandomForestClassifier model and train it on the X_resampled data we created before
# YOUR CODE HERE
rf_model = RandomForestClassifier(n_estimators=500, random_state=78)
# Use the model you trained to predict using X_test
# YOUR CODE HERE  
rf_model = rf_model.fit(X_train_scaled, y_train)
# Print out a classification report to evaluate performance
# YOUR CODE HERE
predictions = rf_model.predict(X_test_scaled)


#### GradientBoostingClassifier

In [38]:
# Import RandomForestClassifier from sklearn
from sklearn.ensemble import GradientBoostingClassifier

# Create a GradientBoostingClassifier model and train it on the X_resampled data we created before
# YOUR CODE HERE
# Choose learning rate
learning_rates = [0.05, 0.1, 0.25, 0.5, 0.75, 1]
for learning_rate in learning_rates:
    model = GradientBoostingClassifier(
        n_estimators=100,
        learning_rate=learning_rate,
        max_features=2,
        max_depth=3,
        random_state=0)
    model.fit(X_train_scaled,y_train.ravel())
    print("Learning rate: ", learning_rate)

    # Score the model
    print("Accuracy score (training): {0:.3f}".format(
        model.score(
            X_train_scaled,
            y_train.ravel())))
    print("Accuracy score (validation): {0:.3f}".format(
        model.score(
            X_test_scaled,
            y_test.ravel())))
    print()
# Use the model you trained to predict using X_test
# YOUR CODE HERE     

# Print out a classification report to evaluate performance
# YOUR CODE HERE

Learning rate:  0.05
Accuracy score (training): 0.673
Accuracy score (validation): 0.593

Learning rate:  0.1
Accuracy score (training): 0.721
Accuracy score (validation): 0.601

Learning rate:  0.25
Accuracy score (training): 0.820
Accuracy score (validation): 0.566

Learning rate:  0.5
Accuracy score (training): 0.887
Accuracy score (validation): 0.533

Learning rate:  0.75
Accuracy score (training): 0.912
Accuracy score (validation): 0.562

Learning rate:  1
Accuracy score (training): 0.933
Accuracy score (validation): 0.566



In [39]:
# Choose a learning rate and create classifier
classifier = GradientBoostingClassifier(n_estimators=20,
                                        learning_rate=0.75,
                                        max_features=2,
                                        max_depth=3,
                                        random_state=0)

# Fit the model
classifier.fit(X_train_scaled, y_train.ravel())

# Make Prediction
predictions = classifier.predict(X_test_scaled)
pd.DataFrame({"Prediction": predictions, "Actual": y_test.ravel()}).head(20)

Unnamed: 0,Prediction,Actual
0,1,0
1,1,1
2,0,0
3,1,0
4,0,0
5,1,1
6,0,1
7,0,0
8,0,0
9,0,0


In [40]:
# Make predictions
predictions = model.predict(X_test_scaled)

# Generate accuracy score for predictions using y_test
accuracy_score(y_test, predictions)

0.5657071339173968

In [41]:
# Generatring the confusion matrix
cm = confusion_matrix(y_test, predictions)
cm_df = pd.DataFrame(
    cm, index=["Actual 0", "Actual 1"], columns=[
        "Predicted 0",
        "Predicted 1"
    ]
)

display(cm_df)

Unnamed: 0,Predicted 0,Predicted 1
Actual 0,335,152
Actual 1,195,117


In [42]:
# Generate classification report
print(classification_report(y_test, predictions))

              precision    recall  f1-score   support

           0       0.63      0.69      0.66       487
           1       0.43      0.38      0.40       312

    accuracy                           0.57       799
   macro avg       0.53      0.53      0.53       799
weighted avg       0.56      0.57      0.56       799



#### AdaBoostClassifier

In [43]:
# Import RandomForestClassifier from sklearn
from sklearn.ensemble import AdaBoostClassifier

# Create a AdaBoostClassifier model and train it on the X_resampled data we created before
# YOUR CODE HERE
abc = AdaBoostClassifier(n_estimators=50,
                         learning_rate=1)
# Use the model you trained to predict using X_test
# YOUR CODE HERE
model = abc.fit(X_train, y_train)
# Print out a classification report to evaluate performance
# YOUR CODE HERE
y_pred = model.predict(X_test)

In [44]:
from sklearn import metrics
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

Accuracy: 0.5944931163954944


#### XGBClassifier

In [46]:
# Import RandomForestClassifier from sklearn
from xgboost import XGBClassifier

# Create a XGBClassifier model and train it on the X_resampled data we created before
# YOUR CODE HERE
# load data

# split data into X and y

# split data into train and test sets
seed = 7
test_size = 0.33
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=seed)
# fit model no training data
model = XGBClassifier()
model.fit(X_train, y_train)
# make predictions for test data
y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred]
# evaluate predictions
accuracy = accuracy_score(y_test, predictions)
print("Accuracy: %.2f%%" % (accuracy * 100.0))
# Use the model you trained to predict using X_test
# YOUR CODE HERE

# Print out a classification report to evaluate performance
# YOUR CODE HERE



Accuracy: 56.72%


In [78]:
!pip install xgboost

Collecting xgboost
  Downloading xgboost-1.3.3-py3-none-win_amd64.whl (95.2 MB)
Installing collected packages: xgboost
Successfully installed xgboost-1.3.3


### Evaluate the performance of each model


#### 1. Using the classification report for each model, choose the model with the highest precision for use in your algo-trading program.
#### 2. Save the selected model with the `joblib` libary to avoid retraining every time you wish to use it.

In [51]:
# Import the joblib library 
import joblib

# Use the library to save the model that you want to use for trading
joblib.dump(abc, 'log_model.pkl')

['log_model.pkl']

## Part 3: Implement the strongest model using Apaca API

### Develop the Algorithm


#### 1. Use the provided code to ping the Alpaca API and create the DataFrame needed to feed data into the model.
   * This code will also store the correct feature data in `X` for later use.

In [47]:
# Create the list of tickers

ticker_list = ['FB','AMZN','AAPL','NFLX', 'GOOGL', 'MSFT', 'TSLA']
# Define Dates

beg_date = '2021-01-06'
end_date = '2021-01-06'

# Convert the date in a format the Alpaca API reqires
start =  pd.Timestamp(f'{beg_date} 09:30:00-0400', tz='America/New_York').replace(hour=9, minute=30, second=0).astimezone('GMT').isoformat()[:-6]+'Z'
end   =  pd.Timestamp(f'{end_date} 16:00:00-0400', tz='America/New_York').replace(hour=15, minute=0, second=0).astimezone('GMT').isoformat()[:-6]+'Z'
timeframe='1Min'

# Use iloc to get the last 10 mins every time we pull new data
prices = api.get_barset(ticker_list, "minute", start=start, end=end).df.iloc[-11:]
prices.ffill(inplace=True)   

# Create an empty DataFrame for closing prices
df_closing_prices = pd.DataFrame()

# Fetch the closing prices of our tickers
df_closing_prices["FB"] = prices["FB"]["close"]
df_closing_prices["AMZN"] = prices["AMZN"]["close"]
df_closing_prices["AAPL"] = prices["AAPL"]["close"]
df_closing_prices["NFLX"] = prices["NFLX"]["close"]
df_closing_prices["GOOGL"] = prices["GOOGL"]["close"]
df_closing_prices['MSFT'] = prices['MSFT']["close"]
df_closing_prices['TSLA'] = prices['TSLA']["close"]

print(df_closing_prices.head(20))

                                FB      AMZN     AAPL    NFLX    GOOGL  \
time                                                                     
2021-01-06 14:50:00-05:00  264.610  3146.960  127.110  506.54  1721.82   
2021-01-06 14:51:00-05:00  264.630  3146.910  127.430  506.54  1721.82   
2021-01-06 14:52:00-05:00  264.830  3147.980  127.720  506.69  1723.67   
2021-01-06 14:53:00-05:00  264.525  3148.570  127.510  506.01  1723.67   
2021-01-06 14:54:00-05:00  264.560  3147.840  127.645  506.01  1720.84   
2021-01-06 14:55:00-05:00  264.880  3150.330  127.920  506.30  1720.60   
2021-01-06 14:56:00-05:00  264.965  3150.610  128.150  506.72  1721.10   
2021-01-06 14:57:00-05:00  264.980  3151.745  127.980  507.07  1720.07   
2021-01-06 14:58:00-05:00  265.000  3149.280  127.850  506.33  1720.07   
2021-01-06 14:59:00-05:00  265.360  3150.840  127.930  506.13  1720.48   
2021-01-06 15:00:00-05:00  264.840  3148.580  127.630  506.43  1720.48   

                              MSFT   

In [52]:
# Create list of momentums
list_of_momentums = [1,5,10]

for i in list_of_momentums:  
    # Compute percentage change for each one of the momentums in the momentum list
    returns_temp = df_closing_prices.pct_change(i)
    # Unstack the returns 
    returns_temp = pd.DataFrame(returns_temp.unstack())
    name = f'{i}_m_returns'
    returns_temp.rename(columns={0: name}, inplace = True)
    # Reset the index so we can merge based on index
    returns_temp.reset_index(inplace = True)
    # Merge newly computed returns with previously created returns
    if i ==1:
        returns = returns_temp
    else:
        returns = pd.merge(returns,returns_temp,left_on=['level_0', 'time'],right_on=['level_0', 'time'], how='left', suffixes=('_original', 'right'))

# Drop nulls and set index
returns.dropna(axis=0, how='any', inplace=True)
returns.set_index(['level_0', 'time'], inplace=True)

# Generate feature data and preview first 10 rows.
X = returns
X


Unnamed: 0_level_0,Unnamed: 1_level_0,1_m_returns,5_m_returns,10_m_returns
level_0,time,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
FB,2021-01-06 15:00:00-05:00,-0.00196,-0.000151,0.000869
AMZN,2021-01-06 15:00:00-05:00,-0.000717,-0.000555,0.000515
AAPL,2021-01-06 15:00:00-05:00,-0.002345,-0.002267,0.004091
NFLX,2021-01-06 15:00:00-05:00,0.000593,0.000257,-0.000217
GOOGL,2021-01-06 15:00:00-05:00,0.0,-7e-05,-0.000778
MSFT,2021-01-06 15:00:00-05:00,-0.00077,-0.001213,0.000654
TSLA,2021-01-06 15:00:00-05:00,-0.000735,-0.001587,0.0105


#### 2. Using `joblib`, load the chosen model.

In [53]:
# Load the previously trained and saved model using joblib
# YOUR CODE HERE
joblib_file = "log_model.pkl"
joblib.dump(model, joblib_file)

# Load from file
joblib_model = joblib.load(joblib_file)

# Calculate the accuracy and predictions
score = joblib_model.score(X_test, y_test)
print("Test score: {0:.2f} %".format(100 * score))
#Ypredict = pickle_model.predict(X_test)

Test score: 56.72 %


#### 3. Use the model file to make predicttions:
* Use `predict` on `X` and save this as `y_pred`.
* Convert `y_pred` to a DataFrame, setting the index to the index of `X`.
* Rename the column 0 to 'buy', be sure to set `inplace =True`.

In [54]:
# Use the model file to predict on X
# YOUR CODE HERE
y_preds = joblib_model.predict(X)
# Convert y_pred to a dataframe, set the index to the index of X
# YOUR CODE HERE
y_predes = pd.DataFrame(y_preds,X.index)
# Rename the column 0 to 'buy', be sure to set inplace =True
# YOUR CODE HERE
y_predes.rename(columns={0: 'buy'}, inplace = True)

In [55]:
y_predes

Unnamed: 0_level_0,Unnamed: 1_level_0,buy
level_0,time,Unnamed: 2_level_1
FB,2021-01-06 15:00:00-05:00,0
AMZN,2021-01-06 15:00:00-05:00,0
AAPL,2021-01-06 15:00:00-05:00,0
NFLX,2021-01-06 15:00:00-05:00,0
GOOGL,2021-01-06 15:00:00-05:00,0
MSFT,2021-01-06 15:00:00-05:00,0
TSLA,2021-01-06 15:00:00-05:00,0


Unnamed: 0_level_0,Unnamed: 1_level_0,buy
level_0,time,Unnamed: 2_level_1
FB,2021-01-06 15:00:00-05:00,1
AMZN,2021-01-06 15:00:00-05:00,1
AAPL,2021-01-06 15:00:00-05:00,1
NFLX,2021-01-06 15:00:00-05:00,0
GOOGL,2021-01-06 15:00:00-05:00,1
MSFT,2021-01-06 15:00:00-05:00,1
TSLA,2021-01-06 15:00:00-05:00,0


#### 4. Filter the stocks where 'buy' is equal to 1, saving the filter as `y_pred`.

In [56]:
# Filter the stocks where 'buy' is equal to 1
# YOUR CODE HERE
y_predes.filter(items=['buy'=='1'])

level_0,time
FB,2021-01-06 15:00:00-05:00
AMZN,2021-01-06 15:00:00-05:00
AAPL,2021-01-06 15:00:00-05:00
NFLX,2021-01-06 15:00:00-05:00
GOOGL,2021-01-06 15:00:00-05:00
MSFT,2021-01-06 15:00:00-05:00
TSLA,2021-01-06 15:00:00-05:00


#### 5. Using the `y_pred` filter, create a dictionary called `buy_dict` and assign 'n' to each Ticker (key value) as a placeholder.

In [57]:
# Create dictionary from y_pred and assign a 'n' to each of them for now as a placeholder.
buy_dict = dict.fromkeys(y_predes.index.get_level_values(0), 'n')
buy_dict

{'FB': 'n',
 'AMZN': 'n',
 'AAPL': 'n',
 'NFLX': 'n',
 'GOOGL': 'n',
 'MSFT': 'n',
 'TSLA': 'n'}

{'FB': 'n', 'AMZN': 'n', 'AAPL': 'n', 'GOOGL': 'n', 'MSFT': 'n'}

#### 6. Obtain the total available equity in your account from the Alpaca API and store in a variable called `total_capital`. You will split the capital equally between all selected stocks per the CIO's request.

In [58]:
# Pull the total available equity in our account from the  Alpaca API
# YOUR CODE HERE
account = api.get_account()
total_capital = account.equity
total_capital

'100000'

In [59]:
# Compute capital per stock, divide equity in account by number of stocks
# Use Alpaca API to pull the equity in the account
if len(buy_dict) > 0:
    capital_per_stock = float(total_capital)/ len(buy_dict)
else:
    capital_per_stock = 0
print(f'Capital per stock: {capital_per_stock}')

Capital per stock: 14285.714285714286


Capital per stock: 19886.79


#### 7. Use a for-loop to iterate through `buy_dict` to determine the number stocks you need to buy for each ticker.

In [60]:
# Use for loop to iterate through dictionary of buys 
# Determine the number stocks we need to buy for each ticker
for ticker in buy_dict:
    try:
        buy_dict[ticker] = int(capital_per_stock /int(prices[ticker].iloc[-1]['close']))
    except:
        pass

print(buy_dict)

{'FB': 54, 'AMZN': 4, 'AAPL': 112, 'NFLX': 28, 'GOOGL': 8, 'MSFT': 66, 'TSLA': 18}


#### 8. Cancel all previous orders in the Alpaca API (so you don't buy more than intended) and sell all currently held stocks to close all positions.

In [61]:
# Cancel all previous orders in the Alpaca API
# YOUR CODE HERE
api.cancel_all_orders()
# Sell all currently held stocks to close all positions
# YOUR CODE HERE
api.close_all_positions()

[]

#### 9. Iterate through `buy_dict` and send a buy order for each ticker with their corresponding number of shares.

In [2]:
# Iterate through the longlist object and send a buy order for each ticker with a corresponding number of shares:
# YOUR CODE HERE

for ticker ,shares in buy_dict.items():
    print("buying " + ticker + "shares " + str(shares))
    if shares > 0:
        api.submit_order(ticker, side = 'buy', qty=shares, type = 'market', time_in_force='gtc')

NameError: name 'buy_dict' is not defined

buying FB numShare 75
buying AMZN numShare 6
buying AAPL numShare 156
buying GOOGL numShare 11
buying MSFT numShare 92


### Automate the algorithm

#### 1. Make a function called `trade()` that incorporates all of the steps above.

In [90]:
# Add all of the steps conducted above into the function trade
def trade():

    ticker_list = ['FB','AMZN','AAPL','NFLX', 'GOOGL', 'MSFT', 'TSLA']
    # Notice that we remove the start and end variables since we want the latest prices.
    timeframe='1Min'
    # Use iloc to get the last 10 mins every time we pull new data
    prices = api.get_barset(ticker_list, "minute").df.iloc[-11:]
    prices.ffill(inplace=True)   

    # Create and empty DataFrame for closing prices
    df_closing_prices = pd.DataFrame()

    # Fetch the closing prices of our tickers
    df_closing_prices["FB"] = prices["FB"]["close"]
    df_closing_prices["AMZN"] = prices["AMZN"]["close"]
    df_closing_prices["AAPL"] = prices["AAPL"]["close"]
    df_closing_prices["NFLX"] = prices["NFLX"]["close"]
    df_closing_prices["GOOGL"] = prices["GOOGL"]["close"]
    df_closing_prices['MSFT'] = prices['MSFT']["close"]
    df_closing_prices['TSLA'] = prices['TSLA']["close"]
    print(df_closing_prices.head())
    
    # Loop through momentums to build new DataFrame
    list_of_momentums = [1,5,10]
    for i in list_of_momentums:   
        returns_temp = df_closing_prices.pct_change(i)
        returns_temp = pd.DataFrame(returns_temp.unstack())
        name = f'{i}_m_returns'
        returns_temp.rename(columns={0: name}, inplace = True)
        returns_temp.reset_index(inplace = True)
        if i ==1:
            returns = returns_temp
        else:
            returns = pd.merge(returns,returns_temp,left_on=['level_0', 'time'],right_on=['level_0', 'time'], how='left', suffixes=('_original', 'right'))

    # Drop nulls and set index            
    returns.dropna(axis=0, how='any', inplace=True)
    returns.set_index(['level_0', 'time'], inplace=True)

    # Preprocess data for model
    # YOUR CODE HERE
    X = returns
    y_pred = joblib_model.predict(X)
    y_pred = pd.DataFrame(y_pred, index=X.index)
    y_pred.rename(columns={0: 'buy'}, inplace = True)
    
    y_pred = y_pred[y_pred['buy']==1]

    # Create the `buy_dict` object
    # YOUR CODE HERE

    for ticker in buy_dict:
        try:
            buy_dict[ticker] = int(capital_per_stock /int(prices[ticker].iloc[-1]['close']))
        except:
            pass

    print(buy_dict)
    # Split capital between stocks and determine buy or sell
    # YOUR CODE HERE
    if len(buy_dict) > 0:
        capital_per_stock = float(total_capital)/ len(buy_dict)
    else:
        capital_per_stock = 0
    print(f'Capital per stock: {capital_per_stock}')
    
    # Cancel pending orders and close positions
    # YOUR CODE HERE
    
    api.cancel_all_orders()

    # YOUR CODE HERE
    api.close_all_positions()
    
    print(buy_dict)
    # Submit orders
    # YOUR CODE HERE
    for ticker, shares in buy_dict.items():
        print("buying " + ticker + "shares " + str(shares))
        if shares > 0:
            api.submit_order(ticker, side = 'buy', qty=shares, type = 'market', time_in_force='gtc')

#### 2. Import Python's schedule module.

In [75]:
# Import Python's schedule module 
# YOUR CODE HERE
import schedule 

In [74]:
!pip install schedule

In [91]:
trade()

                               FB     AMZN     AAPL     NFLX     GOOGL  \
time                                                                     
2021-04-12 15:49:00-04:00  311.42      NaN  131.330  552.905  2241.210   
2021-04-12 15:50:00-04:00  311.62  3379.02  131.455  553.470  2241.210   
2021-04-12 15:51:00-04:00  311.72  3379.02  131.470  553.300  2243.705   
2021-04-12 15:52:00-04:00  311.71  3379.02  131.450  553.530  2245.200   
2021-04-12 15:53:00-04:00  311.69  3379.02  131.440  553.060  2244.600   

                              MSFT    TSLA  
time                                        
2021-04-12 15:49:00-04:00  256.540  701.91  
2021-04-12 15:50:00-04:00  256.755  702.83  
2021-04-12 15:51:00-04:00  256.640  703.30  
2021-04-12 15:52:00-04:00  256.670  703.71  
2021-04-12 15:53:00-04:00  256.580  702.59  
{'FB': 54, 'AMZN': 4, 'AAPL': 112, 'NFLX': 28, 'GOOGL': 8, 'MSFT': 66, 'TSLA': 18}
Capital per stock: 14285.714285714286
{'FB': 54, 'AMZN': 4, 'AAPL': 112, 'NFLX': 28

#### 3. Use the "schedule" module to automate the algorithm:
* Clear the schedule with `.clear()`.
* Define a schedule to run the trade function every minute at 5 seconds past the minute mark (e.g. `10:31:05`).
* Use the Alpaca API to check whether the market is open.
* Use run_pending() function inside schedule to execute the schedule you defined while the market is open

In [93]:
# Clear the schedule
# YOUR CODE HERE
schedule.clear()
# Define a schedule to run the trade function every minute at 5 seconds past the minute mark (e.g. 10:31:05)
# YOUR CODE HERE
schedule.every().minute.at(":05").do(trade)
# Use the Alpaca API to check whether the market is open
# YOUR CODE HERE
clock = api.get_clock().is_open

# Use run_pending() function inside schedule to execute the schedule you defined as long as the market is open
# YOUR CODE HERE
while clock==True:
    schedule.run_pending()

                               FB     AMZN     AAPL     NFLX    GOOGL  \
time                                                                    
2021-02-02 15:13:00-05:00  267.24  3401.17  134.705  552.420  1924.72   
2021-02-02 15:14:00-05:00  267.42  3397.59  134.665  552.530  1924.84   
2021-02-02 15:15:00-05:00  267.30  3395.85  134.700  552.690  1924.96   
2021-02-02 15:16:00-05:00  267.11  3393.24  134.730  552.320  1925.13   
2021-02-02 15:17:00-05:00  267.19  3397.00  134.820  552.685  1926.48   

                              MSFT    TSLA  
time                                        
2021-02-02 15:13:00-05:00  240.705  878.60  
2021-02-02 15:14:00-05:00  240.590  878.05  
2021-02-02 15:15:00-05:00  240.640  878.22  
2021-02-02 15:16:00-05:00  240.460  878.75  
2021-02-02 15:17:00-05:00  240.670  878.75  
buying AAPL numShare 371
buying TSLA numShare 56
                               FB      AMZN     AAPL     NFLX    GOOGL  \
time                                              