# This is a research project on using twitter data for executing trades on the Tesla stock

We start by importing the libraries that are going to be used first.

In [17]:
import pandas as pd
import numpy as np
import csv

from backtesting import Backtest, Strategy

from sklearn.preprocessing import MinMaxScaler
from sklearn.ensemble import RandomForestClassifier

from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet as wn
from nltk.corpus import sentiwordnet as swn
from nltk import sent_tokenize, word_tokenize, pos_tag

from textblob import TextBlob
from afinn import Afinn

from numpy import array
from keras.models import Sequential
from keras.layers import Dense 
from keras.layers import LSTM 
from keras.layers import Dropout
from keras.layers import Bidirectional

import pandas_ta as ta

## Data loading and processing

Load the data

In [6]:
tweet_location="D://Aprawa/tesla_data/tsla_tweets.csv"
price_location="D://Aprawa/tesla_data/tsla_prices.csv"

In [8]:
tweets=pd.read_csv(tweet_location)
price=pd.read_csv(price_location)

Clean up price table

In [11]:
print(price.tail()) #check out price data at a glance
print(price.head())

          Date   Adj Close       Close        High         Low        Open  \
677  9/10/2020  371.339996  371.339996  398.989990  360.559998  386.209992   
678  9/11/2020  372.720001  372.720001  382.500000  360.500000  381.940002   
679  9/14/2020  419.619995  419.619995  420.000000  373.299988  380.950012   
680  9/15/2020  449.760010  449.760010  461.940002  430.700012  436.559998   
681  9/16/2020  441.760010  441.760010  457.790008  435.309998  439.869995   

       Volume  
677  84930600  
678  60717500  
679  83020600  
680  97298200  
681  72279300  
       Date  Adj Close      Close       High        Low       Open    Volume
0  1/2/2018  64.106003  64.106003  64.421997  62.200001  62.400002  21761000
1  1/3/2018  63.450001  63.450001  65.050003  63.110001  64.199997  22607500
2  1/4/2018  62.924000  62.924000  63.709999  61.136002  62.574001  49731500
3  1/5/2018  63.316002  63.316002  63.448002  62.400002  63.324001  22956000
4  1/8/2018  67.281998  67.281998  67.403999  63.0

In [12]:
price.dropna(inplace=True) #remove missing values if any
price['Date']=pd.to_datetime(price['Date']) #set Date as index

Next clean up the tweets

In [9]:
tweets.columns #first look at what columns we got

Index(['tweet_id', 'conversation_id', 'date', 'timezone', 'tweet',
       'user_id_str', 'username', 'nlikes', 'nreplies', 'nretweets',
       'quote_url', 'reply_to', 'date_utc', 'id', 'link'],
      dtype='object')

In [10]:
tweets.head() #and what they contain

Unnamed: 0,tweet_id,conversation_id,date,timezone,tweet,user_id_str,username,nlikes,nreplies,nretweets,quote_url,reply_to,date_utc,id,link
0,1306191234671816706,1306191234671816706,2020-09-16 11:20:32 UTC,0,🚀THANK YOU for 1 Mio. views on YouTube coverin...,100253491,_mm85,42,11,1,,"[{'user_id': '100253491', 'username': '_mm85'}]",2020-09-16 11:20:32 UTC,19196006,https://twitter.com/_mm85/status/1306191234671...
1,1306314507606724616,1306314507606724616,2020-09-16 19:30:23 UTC,0,Invest in future focused growth stocks that wi...,61654707,jvhak,14,4,5,,"[{'user_id': '61654707', 'username': 'jvhak'}]",2020-09-16 19:30:23 UTC,19194013,https://twitter.com/jvhak/status/1306314507606...
2,1306194332932804608,1306194332932804608,2020-09-16 11:32:51 UTC,0,Tesla Norway started registering cars in their...,1011871178665979904,Mtass7,27,4,4,,"[{'user_id': '1011871178665979904', 'username'...",2020-09-16 11:32:51 UTC,19195972,https://twitter.com/Mtass7/status/130619433293...
3,1306234691155103744,1306234691155103744,2020-09-16 14:13:13 UTC,0,Looks like weak hands started chasing $SPAQ. \...,1281254559755747328,Besni121,20,5,6,,"[{'user_id': '1281254559755747328', 'username'...",2020-09-16 14:13:13 UTC,19195258,https://twitter.com/Besni121/status/1306234691...
4,1306231274399358978,1306231274399358978,2020-09-16 13:59:39 UTC,0,$TSLA \n\nabout to turn green \n\nHope U didn...,19052568,MadMraket,44,5,1,,"[{'user_id': '19052568', 'username': 'MadMrake...",2020-09-16 13:59:39 UTC,19195366,https://twitter.com/MadMraket/status/130623127...


In [14]:
#Remove columns that are unlikely to be of help or that are redundant
clean_tweets=tweets.drop(['tweet_id','user_id_str','username','date','conversation_id','timezone','link','id','reply_to','quote_url'],axis=1)

## First iteration, preparing a sentiment analysis model using VADER

Vader is a library for sentiment analysis that comes with nltk and it's optimized to deal with messy social media data like tweets. It gives a positive score, a negative score and a compound score. The cell below calculates VADER sentiment score for each tweet, but it can take a long time, so feel free to load the scores already calculated.

In [47]:
#analyzer = SentimentIntensityAnalyzer()
#cs = []
#for row in range(len(clean_tweets)):
#    cs.append(analyzer.polarity_scores(clean_tweets['tweet'].iloc[row])['compound'])


vader_location="D://Aprawa/Report/vader_scores.csv"
with open(vader_location, newline='') as f:
    vader_scores = list(csv.reader(f))[0] #this reads the content as strings
    vader_scores = list(map(float, vader_scores)) #so we must convert them to numbers

We now prepare our sentiment table in order to get the maximum, minimum and sum per date.

In [48]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1) #remove text data as it's useless now
sentiment_tweets['Sentiment']=vader_scores #add the new sentiment values
sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values]) #Get the date only up to the days and remove the hours

We now group by our sentiment getting the maximum sentiment, the minimum and the overall sum of the sentiment per date.

In [49]:
max_sentiment=sentiment_tweets.groupby('date_utc')['Sentiment'].max()
min_sentiment=sentiment_tweets.groupby('date_utc')['Sentiment'].min()
sum_sentiment=sentiment_tweets.groupby('date_utc')['Sentiment'].agg('sum')
d = {'Max': max_sentiment.values, 'Min': min_sentiment.values,
     'Sum': sum_sentiment.values, 'Date': max_sentiment.index}
grouped_sentiment=pd.DataFrame(d).set_index('Date')

In [51]:
grouped_sentiment.head() #sanity check to see everything works correctly

Unnamed: 0_level_0,Max,Min,Sum
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2018-01-01,0.9571,-0.7783,39.0194
2018-01-02,0.9371,-0.8934,62.8729
2018-01-03,0.9417,-0.9389,57.8225
2018-01-04,0.9558,-0.915,82.6285
2018-01-05,0.8885,-0.9022,55.0367


We now combine our candle data with our sentiment on each day in order to make a trade on the next candle open after all sentiment has been analyzed for the day

In [52]:
combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

In [53]:
class TwitterSumStrategy(Strategy):
    def init(self):        
        None
    def next(self):
        #If our overall sentiment is positive and we're not in a long position or the opposite
        #We do a long or short operation worth 20% of our equity
        if self.data.Sum[-1] > 0 and not self.position.is_long:
            self.buy(size=.2)
        elif self.data.Sum[-1] < 0 and not self.position.is_short:
            self.sell(size=.2)

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   99.706745
Equity Final [$]                 21657.379927
Equity Peak [$]                  23469.330305
Return [%]                         116.573799
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   33.047549
Volatility (Ann.) [%]               31.311767
Sharpe Ratio                         1.055435
Sortino Ratio                         2.22184
Calmar Ratio                         1.313819
Max. Drawdown [%]                  -25.153808
Avg. Drawdown [%]                   -4.062076
Max. Drawdown Duration      496 days 00:00:00
Avg. Drawdown Duration       37 days 00:00:00
# Trades                                    1
Win Rate [%]                            100.0
Best Trade [%]                      589.17615
Worst Trade [%]                     589.17615
Avg. Trade [%]                    

In [54]:
bt.plot()

The next cell uses the maximum and minimum sentiment to generate trades, it has a positive win rate, but the commission costs would make it unprofitable.

In [55]:
class TwitterMaxOverMinStrategy(Strategy):
    def init(self):        
        None
    def next(self):
        #If our maximum positive sentiment is higher than our maximum negative sentiment
        #and we're not in a long position or the opposite
        #We do a long or short operation worth 20% of our equity
        if abs(self.data.Max[-1]) > abs(self.data.Min[-1]) and not self.position.is_long:
            self.buy(size=.2)
        elif abs(self.data.Max[-1]) < abs(self.data.Min[-1]) and not self.position.is_short:
            self.sell(size=.2)

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TwitterMaxOverMinStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=True, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   99.706745
Equity Final [$]                  4605.039673
Equity Peak [$]                       10000.0
Return [%]                         -53.949603
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                  -24.913089
Volatility (Ann.) [%]               10.133178
Sharpe Ratio                              0.0
Sortino Ratio                             0.0
Calmar Ratio                              0.0
Max. Drawdown [%]                  -59.550563
Avg. Drawdown [%]                  -59.550563
Max. Drawdown Duration      987 days 00:00:00
Avg. Drawdown Duration      987 days 00:00:00
# Trades                                  303
Win Rate [%]                        32.673267
Best Trade [%]                      36.451226
Worst Trade [%]                      -31.4695
Avg. Trade [%]                    

In [56]:
bt.plot()

## Second iteration, assigning weights to the sentiments

We now consider what would happen if we assign weights to the sentiment of the tweets based on the number of likes, retweets and replies. Just like before, prepare our sentiment dataframe.

In [156]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1)
sentiment_tweets['Sentiment']=vader_scores
sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values])

We assign weights to the number of likes, retweets and replies.
Based on the small amount of data we observe that a reply is more valuable than a retweet, and that a retweet is more valuable than a like (due to the amount observed in each). Based on this assumption 10 and 5 are good overall results from a first analysis, and we can observe if it influences the overall trading strategy results.

In [175]:
weight_reply=10 #A reply is at least as valuable as 10 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))


scaled_tweets['Sentiment']*=scaled_tweets['Importance'] #finally, we multiply our sentiment by the importance

Similarly as before, we prepare our combined dataset and test it

In [176]:
max_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].max()
min_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].min()
sum_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].sum()

d = {'Max': max_sentiment.values, 'Min': min_sentiment.values,
     'Sum': sum_sentiment.values, 'Date': max_sentiment.index}
grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

In [177]:
#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   99.706745
Equity Final [$]                 10604.122861
Equity Peak [$]                  11071.722958
Return [%]                           6.041229
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                    2.191073
Volatility (Ann.) [%]               16.863866
Sharpe Ratio                         0.129927
Sortino Ratio                        0.195526
Calmar Ratio                         0.053587
Max. Drawdown [%]                  -40.888305
Avg. Drawdown [%]                   -6.177566
Max. Drawdown Duration      932 days 00:00:00
Avg. Drawdown Duration      109 days 00:00:00
# Trades                                   97
Win Rate [%]                        30.927835
Best Trade [%]                     116.754009
Worst Trade [%]                    -21.063827
Avg. Trade [%]                    

In [64]:
bt.plot()

And once again we test the maximum over minimum strategy with the new weights.

In [65]:
#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TwitterMaxOverMinStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   99.706745
Equity Final [$]                  3779.152844
Equity Peak [$]                  10243.618178
Return [%]                         -62.208472
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                  -30.201413
Volatility (Ann.) [%]                9.317278
Sharpe Ratio                              0.0
Sortino Ratio                             0.0
Calmar Ratio                              0.0
Max. Drawdown [%]                  -64.982192
Avg. Drawdown [%]                  -11.323209
Max. Drawdown Duration      951 days 00:00:00
Avg. Drawdown Duration      164 days 00:00:00
# Trades                                  293
Win Rate [%]                        31.399317
Best Trade [%]                      56.680972
Worst Trade [%]                    -44.728614
Avg. Trade [%]                    

In [66]:
bt.plot()

The next cell does a small grid search to determine optimal weights for likes, replies and retweets, it can take a considerable time to run, change the type to code and run it if you're interested, or change the values to expand the search. The optimal values found were 19 and 4, which are probably overfit, so I'd recommend using 20 and 5.

Test using the pseudo-optimal weights 20 and 5

In [178]:
weight_reply=20 #A reply is at least as valuable as 20 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))

scaled_tweets['Sentiment']*=scaled_tweets['Importance']#finally, we multiply our sentiment by the importance


max_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].max()
min_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].min()
sum_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].sum()

d = {'Max': max_sentiment.values, 'Min': min_sentiment.values,
     'Sum': sum_sentiment.values, 'Date': max_sentiment.index}
grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   99.706745
Equity Final [$]                 11006.323243
Equity Peak [$]                  11473.923341
Return [%]                          10.063232
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                     3.60647
Volatility (Ann.) [%]               17.099231
Sharpe Ratio                         0.210914
Sortino Ratio                        0.323635
Calmar Ratio                         0.092948
Max. Drawdown [%]                  -38.800866
Avg. Drawdown [%]                   -5.597626
Max. Drawdown Duration      741 days 00:00:00
Avg. Drawdown Duration       89 days 00:00:00
# Trades                                  101
Win Rate [%]                        32.673267
Best Trade [%]                     116.754009
Worst Trade [%]                    -21.063827
Avg. Trade [%]                    

In [69]:
bt.plot()

## Third iteration: Expanding the VADER Lexicon

Time to extend the VADER lexicon with financial and trading buzzwords and see how it affects sentiment and our trading.
First we locate the words in all our tweets.

In [None]:
words = dict()
for row in range(len(clean_tweets)):
    raw_sentences = sent_tokenize(clean_tweets['tweet'].iloc[row])
    for raw_sentence in raw_sentences:
        tokens = word_tokenize(raw_sentence)
        for token in tokens:
            words.setdefault(token,0)
            words[token]+=1

Afterwards we separate the common words that aren't in our lexicon

In [None]:
words_not_in_lexicon =dict()
for w in sorted(words, key=words.get, reverse=True):
    if words[w]>5000 and w not in analyzer.lexicon:
           words_not_in_lexicon.setdefault(w,words[w])

In [None]:
print(words_not_in_lexicon) #We manually inspect those words and choose the ones that we want to add to the lexicon.

At a first glance the following words could be of use in the lexicon

In [73]:
analyzer = SentimentIntensityAnalyzer()
analyzer.lexicon.update({'up': 0.6,
 'short': -1.2,
 'down': -0.6,
 'buy': 0.2,
 'long': 1.2,
 'investing': 1.6,
 'shorts': -1.3,
 'sell': -0.2,
 'buying': 0.2,
 'higher': 0.7,
 'sold': -0.8,
 'selling': -0.2,
 'needs': -0.4,
 'investment': 0.7,
 'tech': 0.2,
 'bulls': 0.3,
 'bullish': 0.3,
 'Short': -0.2,
 'deal': 0.5,
 'Long': 1.2,
 'pump': 1.7,
 'bull': 0.3,
 'shorting': -0.2,})

Similarly as above, we calculate our VADER scores with the commented text, but I'd suggest to use the precalculated .csv

In [74]:
#cs = []
#for row in range(len(clean_tweets)):
#    cs.append(analyzer.polarity_scores(clean_tweets['tweet'].iloc[row])['compound'])
    
changed_vader_location="D://Aprawa/Report/changed_vader_scores.csv"
with open(changed_vader_location, newline='') as f:
    changed_vader_scores = list(csv.reader(f))[0] #this reads the content as strings
    changed_vader_scores = list(map(float, changed_vader_scores)) #so we must convert them to numbers

In [179]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1) #remove text data as it's useless now
sentiment_tweets['Sentiment']=changed_vader_scores #add the new sentiment values
sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values]) #Get the date only up to the days and remove the hours

weight_reply=20 #A reply is at least as valuable as 20 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))


scaled_tweets['Sentiment']*=scaled_tweets['Importance'] #finally, we multiply our sentiment by the importance


max_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].max()
min_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].min()
sum_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].sum()

d = {'Max': max_sentiment.values, 'Min': min_sentiment.values,
     'Sum': sum_sentiment.values, 'Date': max_sentiment.index}
grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   99.706745
Equity Final [$]                 10609.422416
Equity Peak [$]                  11077.022514
Return [%]                           6.094224
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                    2.209941
Volatility (Ann.) [%]               15.684044
Sharpe Ratio                         0.140904
Sortino Ratio                        0.211009
Calmar Ratio                         0.064056
Max. Drawdown [%]                  -34.499927
Avg. Drawdown [%]                   -5.759655
Max. Drawdown Duration      745 days 00:00:00
Avg. Drawdown Duration       99 days 00:00:00
# Trades                                  103
Win Rate [%]                        32.038835
Best Trade [%]                     116.754009
Worst Trade [%]                    -21.063827
Avg. Trade [%]                    

In [94]:
bt.plot()

Modifying VADER lexicon was dissapointing, but there are other sentiment analyzers that can be explored

## Fourth Iteration: Other Sentiment analyzers

### SentiwordNet

SentiwordNet is very used in many scenarios and unlike VADER it's not specialized in social media data, but it can do a better representation of sentiment in many other cases, so it's a good choice to start with.

In [85]:
lemmatizer = WordNetLemmatizer()
  
def penn_to_wn(tag):
    """
    Convert between the PennTreebank tags to simple Wordnet tags
    """
    if tag.startswith('J'):
        return wn.ADJ
    elif tag.startswith('N'):
        return wn.NOUN
    elif tag.startswith('R'):
        return wn.ADV
    elif tag.startswith('V'):
        return wn.VERB
    return None
 

def clean_text(text):
    #Any specific cleanup could be done here, but due to time constraints and the huge variety of content in the text
    #it's best to leave as is, at least for the first run
    return text
 
def swn_polarity(text):
    """
    Return a sentiment polarity: 0 = negative, 1 = positive
    """
 
    sentiment = 0.0
    tokens_count = 0
 
    text = clean_text(text)
 
 
    raw_sentences = sent_tokenize(text)
    for raw_sentence in raw_sentences:
        tagged_sentence = pos_tag(word_tokenize(raw_sentence))
 
        for word, tag in tagged_sentence:
            wn_tag = penn_to_wn(tag)
            if wn_tag not in (wn.NOUN, wn.ADJ, wn.ADV):
                continue
 
            lemma = lemmatizer.lemmatize(word, pos=wn_tag)
            if not lemma:
                continue
 
            synsets = wn.synsets(lemma, pos=wn_tag)
            if not synsets:
                continue
 
            # Take the first sense, the most common
            synset = synsets[0]
            swn_synset = swn.senti_synset(synset.name())
 
            sentiment += swn_synset.pos_score() - swn_synset.neg_score()
            tokens_count += 1
 
    # judgment call ? Default to positive or negative
    if not tokens_count:
        return 0
 
    # sum greater than 0 => positive sentiment
    if sentiment >= 0:
        return 1
 
    # negative sentiment
    return -1

Sentiwordnet isn't exactly fast, calculating the polarity score can be extremely slow so I really suggest to use the precalculated ones.

In [86]:
#cs = []
#for row in range(len(clean_tweets)):
#    cs.append(swn_polarity(clean_tweets['tweet'].iloc[row]))

senti_location="D://Aprawa/Report/sentiwordnet_scores.csv"
with open(senti_location, newline='') as f:
    senti_scores = list(csv.reader(f))[0] #this reads the content as strings
    senti_scores = list(map(float, senti_scores)) #so we must convert them to numbers

In [180]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1) #remove text data as it's useless now
sentiment_tweets['Sentiment']=senti_scores #add the new sentiment values
sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values]) #Get the date only up to the days and remove the hours

weight_reply=20 #A reply is at least as valuable as 20 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))

scaled_tweets['Sentiment']*=scaled_tweets['Importance'] #finally, we multiply our sentiment by the importance

max_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].max()
min_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].min()
sum_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].sum()

d = {'Max': max_sentiment.values, 'Min': min_sentiment.values,
     'Sum': sum_sentiment.values, 'Date': max_sentiment.index}
grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   99.706745
Equity Final [$]                 14047.955049
Equity Peak [$]                  14632.455171
Return [%]                           40.47955
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   13.381777
Volatility (Ann.) [%]               18.385775
Sharpe Ratio                         0.727833
Sortino Ratio                        1.244631
Calmar Ratio                         0.648233
Max. Drawdown [%]                  -20.643482
Avg. Drawdown [%]                   -3.687203
Max. Drawdown Duration      729 days 00:00:00
Avg. Drawdown Duration       51 days 00:00:00
# Trades                                   47
Win Rate [%]                        34.042553
Best Trade [%]                      99.097452
Worst Trade [%]                    -18.181549
Avg. Trade [%]                    

In [100]:
bt.plot()

There was a good difference in P/L with Sentiword net, but it's just extremely bullish as well

### Textblob

Textblob is another sentiment analysis that we can use off the shelf and it's very good, just very slow as well.

In [95]:
#cs = []
#for row in range(len(clean_tweets)):
#    cs.append(TextBlob(clean_tweets['tweet'].iloc[row]).sentiment.polarity)


textblob_location="D://Aprawa/Report/textblob_scores.csv"
with open(textblob_location, newline='') as f:
    textblob_scores = list(csv.reader(f))[0] #this reads the content as strings
    textblob_scores = list(map(float, textblob_scores)) #so we must convert them to numbers

In [187]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1) #remove text data as it's useless now
sentiment_tweets['Sentiment']=textblob_scores #add the new sentiment values
sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values]) #Get the date only up to the days and remove the hours

weight_reply=20 #A reply is at least as valuable as 20 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))

scaled_tweets['Sentiment']*=scaled_tweets['Importance'] #finally, we multiply our sentiment by the importance


max_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].max()
min_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].min()
sum_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].sum()

d = {'Max': max_sentiment.values, 'Min': min_sentiment.values,
     'Sum': sum_sentiment.values, 'Date': max_sentiment.index}
grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   99.706745
Equity Final [$]                 25883.372174
Equity Peak [$]                  28338.272686
Return [%]                         158.833722
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   42.105803
Volatility (Ann.) [%]               39.873465
Sharpe Ratio                         1.055986
Sortino Ratio                        2.405411
Calmar Ratio                         1.362211
Max. Drawdown [%]                  -30.909891
Avg. Drawdown [%]                   -4.390209
Max. Drawdown Duration      469 days 00:00:00
Avg. Drawdown Duration       34 days 00:00:00
# Trades                                    3
Win Rate [%]                        33.333333
Best Trade [%]                     900.661515
Worst Trade [%]                    -33.803868
Avg. Trade [%]                    

In [103]:
bt.plot()

Very few trades, we can vary our entry conditions a little bit

In [197]:
class TextBlobModifiedTwitterSumStrategy(Strategy):
    def init(self):        
        None
    def next(self):
        #If our overall sentiment is mildly positive and we're not in a long position or the opposite
        #We do a long or short operation worth 20% of our equity
        if self.data.Sum[-1] > 1000 and not self.position.is_long:
            self.buy(size=.2)
        elif self.data.Sum[-1] < -1000 and not self.position.is_short:
            self.sell(size=.2)
        #And if our sentiment is more or less neutral again we close the trades
        elif self.data.Sum[-1] < 500 and self.data.Sum[-1] >-500:
            for trade in self.trades:
                trade.close()

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TextBlobModifiedTwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   94.428152
Equity Final [$]                 19473.116182
Equity Peak [$]                  21051.266512
Return [%]                          94.731162
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   27.922467
Volatility (Ann.) [%]               29.241514
Sharpe Ratio                         0.954891
Sortino Ratio                        1.911608
Calmar Ratio                         1.166801
Max. Drawdown [%]                  -23.930786
Avg. Drawdown [%]                   -5.554232
Max. Drawdown Duration      519 days 00:00:00
Avg. Drawdown Duration       54 days 00:00:00
# Trades                                   17
Win Rate [%]                        41.176471
Best Trade [%]                     547.845869
Worst Trade [%]                    -18.998345
Avg. Trade [%]                    

In [198]:
bt.plot()

TextBlob was AMAZING in the analysis, the time spent was time well spent and it gave good results, and filtering out our entry conditions to only take trades on more significant emotion generated more trades and less drawdown, although less P/L thanks to the commissions.

### Afinn

Afinn is another widely used library, it's a general use library with no strong points but no weak points either. It's considerably faster than textblob and sentiwordnet but the data size is big enough to take a few hours.

In [107]:
#afinn = Afinn(language="en", emoticons=True)
#cs = []
#for row in range(len(clean_tweets)):
#    cs.append(afinn.score(clean_tweets['tweet'].iloc[row]))

afinn_location="D://Aprawa/Report/afinn_scores.csv"
with open(afinn_location, newline='') as f:
    afinn_scores = list(csv.reader(f))[0] #this reads the content as strings
    afinn_scores = list(map(float, afinn_scores)) #so we must convert them to numbers

In [199]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1) #remove text data as it's useless now
sentiment_tweets['Sentiment']=afinn_scores #add the new sentiment values
sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values]) #Get the date only up to the days and remove the hours

weight_reply=20 #A reply is at least as valuable as 20 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))

scaled_tweets['Sentiment']*=scaled_tweets['Importance'] #finally, we multiply our sentiment by the importance


max_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].max()
min_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].min()
sum_sentiment=scaled_tweets.groupby('date_utc')['Sentiment'].sum()

d = {'Max': max_sentiment.values, 'Min': min_sentiment.values,
     'Sum': sum_sentiment.values, 'Date': max_sentiment.index}
grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, TwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   99.706745
Equity Final [$]                  5908.635356
Equity Peak [$]                  10208.708037
Return [%]                         -40.913646
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   -17.66885
Volatility (Ann.) [%]               12.208519
Sharpe Ratio                              0.0
Sortino Ratio                             0.0
Calmar Ratio                              0.0
Max. Drawdown [%]                  -57.703079
Avg. Drawdown [%]                  -11.682649
Max. Drawdown Duration      967 days 00:00:00
Avg. Drawdown Duration      197 days 00:00:00
# Trades                                  201
Win Rate [%]                        31.343284
Best Trade [%]                     116.754009
Worst Trade [%]                    -46.760442
Avg. Trade [%]                    

In [200]:
bt.plot()

Afinn admittedly didn't do too well, although it kinda had the general idea of positiveness it took too many trades. Just like with textblob we can consider specifying a specific intensity for our trades.

In [210]:
class AfinnModifiedTwitterSumStrategy(Strategy):
    def init(self):        
        None
    def next(self):
        #If our overall sentiment is mildly positive and we're not in a long position or the opposite
        #We do a long or short operation worth 20% of our equity
        if self.data.Sum[-1] > 40000 and not self.position.is_long:
            self.buy(size=.2)
        elif self.data.Sum[-1] < -40000 and not self.position.is_short:
            self.sell(size=.2)
        #And if our sentiment is more or less neutral again we close the trades
        elif self.data.Sum[-1] < 200 and self.data.Sum[-1] >-200:
            for trade in self.trades:
                trade.close()

bt = Backtest(combined_data, AfinnModifiedTwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   83.577713
Equity Final [$]                 13878.637894
Equity Peak [$]                  14580.038041
Return [%]                          38.786379
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   12.874897
Volatility (Ann.) [%]               20.261262
Sharpe Ratio                         0.635444
Sortino Ratio                        1.048793
Calmar Ratio                         0.501128
Max. Drawdown [%]                  -25.691832
Avg. Drawdown [%]                   -6.074792
Max. Drawdown Duration      583 days 00:00:00
Avg. Drawdown Duration       70 days 00:00:00
# Trades                                   26
Win Rate [%]                             50.0
Best Trade [%]                     160.482914
Worst Trade [%]                    -35.580331
Avg. Trade [%]                    

In [211]:
bt.plot()

Much better, although nowhere near Textblob.

### Ensemble system, combining analyzers

A great way to diminish the negative effects and mistakes of one classifier is to use another. In this case we can combine and scale all four classifiers to generate a single sentiment signal.

In [303]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1) #remove text data as it's useless now

 #add the new sentiment values for each analyzer
sentiment_tweets['Sentiment_vader']=vader_scores
sentiment_tweets['Sentiment_afinn']=afinn_scores
sentiment_tweets['Sentiment_sentiwordnet']=senti_scores
sentiment_tweets['Sentiment_textblob']=textblob_scores

sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values]) #Get the date only up to the days and remove the hours


weight_reply=20 #A reply is at least as valuable as 20 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))

#We now multiply each of our sentiment as predicted by our analyzers by the importance
scaled_tweets['Sentiment_vader']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_afinn']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_sentiwordnet']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_textblob']*=scaled_tweets['Importance']

At this point the ideal thing to do is to scale and standarize our series so we can add them, but we **CANNOT** use the data we use for standarization as it would leak data into our test. As such we must use a initial part of the data for the scaling and the rest we would only transform it and use it normally. Notice the **difference** as it's crucial in getting correct results. There isn't much danger as we are scaling our sentiment and not based on the price but it's wrong anyways and shouldn't be done in the first place.

In [389]:
ordered_tweets=scaled_tweets.set_index('date_utc').sort_index(ascending=True)

In [392]:
train=ordered_tweets.copy().iloc[:100000,:] 
test=ordered_tweets.copy().iloc[100000:,:]

In [394]:
min_max_scaler = MinMaxScaler()
train['Sentiment_vader']=min_max_scaler.fit(train[['Sentiment_vader']]) #Fit the train
test['Sentiment_vader']=min_max_scaler.transform(test[['Sentiment_vader']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_afinn']=min_max_scaler.fit(train[['Sentiment_afinn']]) #Fit the train
test['Sentiment_afinn']=min_max_scaler.transform(test[['Sentiment_afinn']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_sentiwordnet']=min_max_scaler.fit(train[['Sentiment_sentiwordnet']]) #Fit the train
test['Sentiment_sentiwordnet']=min_max_scaler.transform(test[['Sentiment_sentiwordnet']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_textblob']=min_max_scaler.fit(train[['Sentiment_textblob']]) #Fit the train
test['Sentiment_textblob']=min_max_scaler.transform(test[['Sentiment_textblob']]) #Transform the test

In [395]:
#We are using our test subset now
sum_sentiment=test.groupby('date_utc')[['Sentiment_vader','Sentiment_afinn','Sentiment_sentiwordnet','Sentiment_textblob']].sum()

d = {'Vader': sum_sentiment['Sentiment_vader'].values, 
      'Afinn': sum_sentiment['Sentiment_afinn'].values,
      'Senti': sum_sentiment['Sentiment_sentiwordnet'].values,
      'Textblob': sum_sentiment['Sentiment_textblob'].values,
      'Date': sum_sentiment.index}


grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

In [396]:
class EnsembleTwitterSumStrategy(Strategy):
    price_delta = .02 #20% for stop loss
    def init(self):        
        None
    def next(self):
        
        sentiment = self.data.Vader[-1]+self.data.Afinn[-1]+self.data.Senti[-1]+self.data.Textblob[-1] 

        high, low, close = self.data.High, self.data.Low, self.data.Close
        current_time = self.data.index[-1]
        upper, lower = close[-1] * (1 + np.r_[1, -1]*self.price_delta)
        
        #If our sentiment is slightly positive or slightly negative
        #and we're not in a long position or the opposite
        #We do a long or short operation worth 20% of our equity        
        if sentiment>3000 and not self.position.is_long:
            self.buy(size=.2, sl=lower)
        elif sentiment <-3000 and not self.position.is_short:
            self.sell(size=.2, sl=upper)
        #We also close trades that are more or less neutral in sentiment
        elif sentiment < 100 and sentiment>-100:
            for trade in self.trades:
                trade.close()    

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, EnsembleTwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                   50.879765
Equity Final [$]                 16819.802202
Equity Peak [$]                  18164.152483
Return [%]                          68.198022
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   21.182846
Volatility (Ann.) [%]               25.895057
Sharpe Ratio                         0.818027
Sortino Ratio                        1.556488
Calmar Ratio                         0.904346
Max. Drawdown [%]                  -23.423396
Avg. Drawdown [%]                   -6.016573
Max. Drawdown Duration      545 days 00:00:00
Avg. Drawdown Duration       53 days 00:00:00
# Trades                                   20
Win Rate [%]                              5.0
Best Trade [%]                     526.227206
Worst Trade [%]                    -12.927665
Avg. Trade [%]                    

In [397]:
bt.plot()

Much better PL and trades, but still could be much better.

## Fifth Iteration: Adding negative bias to tweets

In [398]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1) #remove text data as it's useless now

 #add the new sentiment values for each analyzer
sentiment_tweets['Sentiment_vader']=vader_scores
sentiment_tweets['Sentiment_afinn']=afinn_scores
sentiment_tweets['Sentiment_sentiwordnet']=senti_scores
sentiment_tweets['Sentiment_textblob']=textblob_scores

sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values]) #Get the date only up to the days and remove the hours


weight_reply=20 #A reply is at least as valuable as 20 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))

#We now multiply each of our sentiment as predicted by our analyzers by the importance
scaled_tweets['Sentiment_vader']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_afinn']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_sentiwordnet']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_textblob']*=scaled_tweets['Importance']


ordered_tweets=scaled_tweets.set_index('date_utc').sort_index(ascending=True)
train=ordered_tweets.copy().iloc[:100000,:] 
test=ordered_tweets.copy().iloc[100000:,:]

min_max_scaler = MinMaxScaler()
train['Sentiment_vader']=min_max_scaler.fit(train[['Sentiment_vader']]) #Fit the train
test['Sentiment_vader']=min_max_scaler.transform(test[['Sentiment_vader']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_afinn']=min_max_scaler.fit(train[['Sentiment_afinn']]) #Fit the train
test['Sentiment_afinn']=min_max_scaler.transform(test[['Sentiment_afinn']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_sentiwordnet']=min_max_scaler.fit(train[['Sentiment_sentiwordnet']]) #Fit the train
test['Sentiment_sentiwordnet']=min_max_scaler.transform(test[['Sentiment_sentiwordnet']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_textblob']=min_max_scaler.fit(train[['Sentiment_textblob']]) #Fit the train
test['Sentiment_textblob']=min_max_scaler.transform(test[['Sentiment_textblob']]) #Transform the test

#Amplifying the negative sentiment
test['Sentiment_vader']=test['Sentiment_vader'].apply(lambda x: x*10000 if x<0 else x)
test['Sentiment_afinn']=test['Sentiment_afinn'].apply(lambda x: x*10000 if x<0 else x)
test['Sentiment_sentiwordnet']=test['Sentiment_sentiwordnet'].apply(lambda x: x*10000 if x<0 else x)
test['Sentiment_textblob']=test['Sentiment_textblob'].apply(lambda x: x*10000 if x<0 else x)

#We are using our test subset now
sum_sentiment=test.groupby('date_utc')[['Sentiment_vader','Sentiment_afinn','Sentiment_sentiwordnet','Sentiment_textblob']].sum()

d = {'Vader': sum_sentiment['Sentiment_vader'].values, 
      'Afinn': sum_sentiment['Sentiment_afinn'].values,
      'Senti': sum_sentiment['Sentiment_sentiwordnet'].values,
      'Textblob': sum_sentiment['Sentiment_textblob'].values,
      'Date': sum_sentiment.index}


grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

In [399]:
class NegEnsembleTwitterSumStrategy(Strategy):
    price_delta = .02 #20% for stop loss
    def init(self):        
        None
    def next(self):
        
        sentiment = self.data.Vader[-1]+self.data.Afinn[-1]+self.data.Senti[-1]+self.data.Textblob[-1] 

        high, low, close = self.data.High, self.data.Low, self.data.Close
        current_time = self.data.index[-1]
        upper, lower = close[-1] * (1 + np.r_[1, -1]*self.price_delta)
        
        #If our sentiment is slightly positive or slightly negative
        #and we're not in a long position or the opposite
        #We do a long or short operation worth 20% of our equity        
        if sentiment>5 and not self.position.is_long:
            self.buy(size=.2, sl=lower)
        elif sentiment <-5 and not self.position.is_short:
            self.sell(size=.2, sl=upper)
        #We also close trades that are more or less neutral in sentiment
        elif sentiment < 0.1 and (vader+afinn+senti+textblob)>-0.1:
            for trade in self.trades:
                trade.close()    

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, NegEnsembleTwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                    84.31085
Equity Final [$]                  15282.31549
Equity Peak [$]                   15669.45163
Return [%]                          52.823155
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   16.965603
Volatility (Ann.) [%]               20.046031
Sharpe Ratio                         0.846332
Sortino Ratio                        1.530636
Calmar Ratio                         0.683344
Max. Drawdown [%]                   -24.82733
Avg. Drawdown [%]                   -5.018593
Max. Drawdown Duration      532 days 00:00:00
Avg. Drawdown Duration       58 days 00:00:00
# Trades                                   46
Win Rate [%]                        19.565217
Best Trade [%]                     166.334674
Worst Trade [%]                    -10.463481
Avg. Trade [%]                    

In [400]:
bt.plot()

We need to amplify 10000 times the negative tweets to even alter our algorithm slightly, this could be showing how robust ensemble systems are or how useless it is to amplify negativity. It didn't give increased P/L either.

## Sixth Iteration: Amplifying sentiment based on amount of tweets in a candle

In [401]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1) #remove text data as it's useless now

 #add the new sentiment values for each analyzer
sentiment_tweets['Sentiment_vader']=vader_scores
sentiment_tweets['Sentiment_afinn']=afinn_scores
sentiment_tweets['Sentiment_sentiwordnet']=senti_scores
sentiment_tweets['Sentiment_textblob']=textblob_scores

sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values]) #Get the date only up to the days and remove the hours


weight_reply=20 #A reply is at least as valuable as 20 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))

#We now multiply each of our sentiment as predicted by our analyzers by the importance
scaled_tweets['Sentiment_vader']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_afinn']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_sentiwordnet']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_textblob']*=scaled_tweets['Importance']


ordered_tweets=scaled_tweets.set_index('date_utc').sort_index(ascending=True)
train=ordered_tweets.copy().iloc[:100000,:] 
test=ordered_tweets.copy().iloc[100000:,:]

min_max_scaler = MinMaxScaler()
train['Sentiment_vader']=min_max_scaler.fit(train[['Sentiment_vader']]) #Fit the train
test['Sentiment_vader']=min_max_scaler.transform(test[['Sentiment_vader']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_afinn']=min_max_scaler.fit(train[['Sentiment_afinn']]) #Fit the train
test['Sentiment_afinn']=min_max_scaler.transform(test[['Sentiment_afinn']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_sentiwordnet']=min_max_scaler.fit(train[['Sentiment_sentiwordnet']]) #Fit the train
test['Sentiment_sentiwordnet']=min_max_scaler.transform(test[['Sentiment_sentiwordnet']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_textblob']=min_max_scaler.fit(train[['Sentiment_textblob']]) #Fit the train
test['Sentiment_textblob']=min_max_scaler.transform(test[['Sentiment_textblob']]) #Transform the test

#We are using our test subset now
sum_sentiment=test.groupby('date_utc')[['Sentiment_vader','Sentiment_afinn','Sentiment_sentiwordnet','Sentiment_textblob']].sum()

sum_sentiment['count']=test.groupby('date_utc')[['Sentiment_vader']].count()

sum_sentiment['Sentiment_vader']*=sum_sentiment['count']
sum_sentiment['Sentiment_afinn']*=sum_sentiment['count']
sum_sentiment['Sentiment_sentiwordnet']*=sum_sentiment['count']
sum_sentiment['Sentiment_textblob']*=sum_sentiment['count']

d = {'Vader': sum_sentiment['Sentiment_vader'].values, 
      'Afinn': sum_sentiment['Sentiment_afinn'].values,
      'Senti': sum_sentiment['Sentiment_sentiwordnet'].values,
      'Textblob': sum_sentiment['Sentiment_textblob'].values,
      'Date': sum_sentiment.index}

grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

In [403]:
#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, EnsembleTwitterSumStrategy,cash=10000, commission=0.02, margin=1.0, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                    84.31085
Equity Final [$]                 27665.204253
Equity Peak [$]                  30412.354827
Return [%]                         176.652043
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   45.644878
Volatility (Ann.) [%]               43.869699
Sharpe Ratio                         1.040465
Sortino Ratio                        2.442757
Calmar Ratio                         1.363255
Max. Drawdown [%]                  -33.482284
Avg. Drawdown [%]                   -6.053607
Max. Drawdown Duration      496 days 00:00:00
Avg. Drawdown Duration       40 days 00:00:00
# Trades                                   15
Win Rate [%]                         6.666667
Best Trade [%]                    1090.626943
Worst Trade [%]                    -10.463481
Avg. Trade [%]                    

In [404]:
bt.plot()

Slightly better results, but not really noticeable, much less trades and higher drawdown

## Seventh iteration: Establishing more complex money management techniques

In [405]:
sentiment_tweets=clean_tweets.drop(['tweet'],axis=1) #remove text data as it's useless now

 #add the new sentiment values for each analyzer
sentiment_tweets['Sentiment_vader']=vader_scores
sentiment_tweets['Sentiment_afinn']=afinn_scores
sentiment_tweets['Sentiment_sentiwordnet']=senti_scores
sentiment_tweets['Sentiment_textblob']=textblob_scores

sentiment_tweets['date_utc']=pd.to_datetime([x[:10] for x in sentiment_tweets['date_utc'].values]) #Get the date only up to the days and remove the hours


weight_reply=20 #A reply is at least as valuable as 20 likes
weight_retweet=5 #A retweet is at least as valuable as 5 likes

#Prepare a new dataset without the redundant columns
scaled_tweets = sentiment_tweets.drop(['nlikes','nreplies','nretweets'],axis=1) 

#Generate our importance multiplier based on the number of likes, replies and retweets.
scaled_tweets['Importance']=(sentiment_tweets['nlikes']
                            +(sentiment_tweets['nreplies']*weight_reply)
                            +(sentiment_tweets['nretweets']*weight_retweet))

#We now multiply each of our sentiment as predicted by our analyzers by the importance
scaled_tweets['Sentiment_vader']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_afinn']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_sentiwordnet']*=scaled_tweets['Importance']
scaled_tweets['Sentiment_textblob']*=scaled_tweets['Importance']


ordered_tweets=scaled_tweets.set_index('date_utc').sort_index(ascending=True)
train=ordered_tweets.copy().iloc[:100000,:] 
test=ordered_tweets.copy().iloc[100000:,:]

min_max_scaler = MinMaxScaler()
train['Sentiment_vader']=min_max_scaler.fit(train[['Sentiment_vader']]) #Fit the train
test['Sentiment_vader']=min_max_scaler.transform(test[['Sentiment_vader']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_afinn']=min_max_scaler.fit(train[['Sentiment_afinn']]) #Fit the train
test['Sentiment_afinn']=min_max_scaler.transform(test[['Sentiment_afinn']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_sentiwordnet']=min_max_scaler.fit(train[['Sentiment_sentiwordnet']]) #Fit the train
test['Sentiment_sentiwordnet']=min_max_scaler.transform(test[['Sentiment_sentiwordnet']]) #Transform the test

min_max_scaler = MinMaxScaler()
train['Sentiment_textblob']=min_max_scaler.fit(train[['Sentiment_textblob']]) #Fit the train
test['Sentiment_textblob']=min_max_scaler.transform(test[['Sentiment_textblob']]) #Transform the test

#We are using our test subset now
sum_sentiment=test.groupby('date_utc')[['Sentiment_vader','Sentiment_afinn','Sentiment_sentiwordnet','Sentiment_textblob']].sum()

sum_sentiment['count']=test.groupby('date_utc')[['Sentiment_vader']].count()

sum_sentiment['Sentiment_vader']*=sum_sentiment['count']
sum_sentiment['Sentiment_afinn']*=sum_sentiment['count']
sum_sentiment['Sentiment_sentiwordnet']*=sum_sentiment['count']
sum_sentiment['Sentiment_textblob']*=sum_sentiment['count']

d = {'Vader': sum_sentiment['Sentiment_vader'].values, 
      'Afinn': sum_sentiment['Sentiment_afinn'].values,
      'Senti': sum_sentiment['Sentiment_sentiwordnet'].values,
      'Textblob': sum_sentiment['Sentiment_textblob'].values,
      'Date': sum_sentiment.index}

grouped_sentiment=pd.DataFrame(d).set_index('Date')

combined_data=price.join(grouped_sentiment,on='Date')
combined_data.set_index('Date',inplace=True)

In [406]:
#Research on several of the numbers used were done with grid search and manual testing, the overall results 
#are in the document file, this just represents a different and interesting approach to the previous alternatives
class EntriesEnsembleTwitterSumStrategy(Strategy):
    def init(self):        
        None
    def next(self):
        sentiment = self.data.Vader[-1]+self.data.Afinn[-1]+self.data.Senti[-1]+self.data.Textblob[-1] 

        high, low, close = self.data.High, self.data.Low, self.data.Close
        current_time = self.data.index[-1]
        sl_long = close[-1] * (1 - 0.2)
        tl_long = close[-1] * (1 - 0.1)

        sl_short = close[-1] * (1 + 0.2)
        tl_short = close[-1] * (1 + 0.1)

        #If our maximum positive sentiment is higher than our maximum negative sentiment
        #and we're not in a long position or the opposite
        #We do a long or short operation worth 20% of our equity
        if sentiment > 300 and len(self.trades) < 10:
            self.buy(size=0.05, sl=sl_long)
        elif sentiment < (300*-1) and len(self.trades) < 10:
            self.sell(size=0.05, sl=sl_short)
        elif sentiment < 10 and sentiment > (10 * -1):
            for trade in self.trades:
                trade.close()
        for trade in self.trades: # trailing stop
            if trade.is_long:
                trade.sl = tl_long
            else:
                trade.sl = tl_short  

#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_data, EntriesEnsembleTwitterSumStrategy,cash=10000, commission=0.02, margin=1, trade_on_close=False, hedging=False, exclusive_orders=False)
bt.run()                                    

Start                     2018-01-02 00:00:00
End                       2020-09-16 00:00:00
Duration                    988 days 00:00:00
Exposure Time [%]                    84.31085
Equity Final [$]                 25076.193446
Equity Peak [$]                  27385.974736
Return [%]                         150.761934
Buy & Hold Return [%]              589.108649
Return (Ann.) [%]                   40.451943
Volatility (Ann.) [%]               35.865653
Sharpe Ratio                         1.127874
Sortino Ratio                        2.826281
Calmar Ratio                         1.309504
Max. Drawdown [%]                  -30.891041
Avg. Drawdown [%]                   -5.312626
Max. Drawdown Duration      517 days 00:00:00
Avg. Drawdown Duration       40 days 00:00:00
# Trades                                  122
Win Rate [%]                        33.606557
Best Trade [%]                     245.562925
Worst Trade [%]                    -32.112756
Avg. Trade [%]                    

In [407]:
bt.plot()

## Eight Iteration: Sentiment + Technical indicator trading system

Precalculate several TA indicators for posterior analysis

In [502]:
close=price['Close']
high=price['High']
low=price['Low']

wma_10 = ta.wma(close, length=10)
wma_20 = ta.wma(close, length=20)
wma_50 = ta.wma(close, length=50)
tema_10 = ta.tema(close, length=10)
tema_20 = ta.tema(close, length=20)
tema_50 = ta.tema(close, length=50)
sma_10 = ta.sma(close, length=10)
sma_20 = ta.sma(close, length=20)
sma_50 = ta.sma(close, length=50)

atr = ta.atr(high, low, close, length=15)
std = ta.stdev(close, timeperiod=15, nbdev=1)
linreg = ta.linreg(close, timeperiod=15)
mom = ta.mom(close, timeperiod=10)
rsi = ta.rsi(close, timeperiod=15)
cci = ta.cci(high, low, close, timeperiod=15)

bbands = ta.bbands(close, length=25, std=2).iloc[:,:3]
bblower=np.reshape(bbands.iloc[:,0:1].values,len(bbands))
bbmiddle=np.reshape(bbands.iloc[:,1:2].values,len(bbands))
bbupper=np.reshape(bbands.iloc[:,2:3].values,len(bbands))

aroon = ta.aroon(high, low, length=15).iloc[:,:2]
aroondown=np.reshape(aroon.iloc[:,0:1].values,len(aroon))
aroonup=np.reshape(aroon.iloc[:,1:2].values,len(aroon))


macd  = ta.macd(close, fast=15, slow=30, signal=10)
macd_base = np.reshape(macd.iloc[:,0:1].values,len(macd))
macd_hist = np.reshape(macd.iloc[:,1:2].values,len(macd))
macd_signal = np.reshape(macd.iloc[:,2:3].values,len(macd))

ta_data = {'WMA_10': wma_10.values, 'WMA_20': wma_20.values, 'WMA_50': wma_50.values,
           'TEMA_10': tema_10.values, 'TEMA_20': tema_20.values, 'TEMA_50': tema_50.values,
           'SMA_10': sma_10.values, 'SMA_20': sma_20.values, 'SMA_50': sma_50.values,
           'ATR': atr.values, 'STD': std.values, 'LINREG': linreg.values,
           'MOM': mom.values, 'RSI': rsi.values, 'CCI': cci.values,
           'BBUPPER': bbupper, 'BBMIDDLE': bbmiddle, 'BBLOWER': bblower, 
           'AROONDOWN': aroondown, 'AROONUP': aroonup,
           'MACD': macd_base, 'MACDHIST': macd_hist, 'MACDSIGNAL': macd_signal,
          'Date': price['Date']}

In [503]:
ta_df=pd.DataFrame(ta_data).set_index('Date')
ta_df.dropna(axis=0,inplace=True) #We clear the initial price data that had no values
combined_ta_sent_data=ta_df.join(combined_data,on='Date') #We now combine it with our sentiment data

In [504]:
 # We could do a inner join to keep both tables but I want to explicitly drop NA values from the combined table
# so it's more clear that the tables have different lengths and thus contain NAN values after joining.
combined_ta_sent_data.dropna(axis=0,inplace=True)
print(combined_ta_sent_data.head()) #Sanity check
print(combined_ta_sent_data.tail())

               WMA_10     WMA_20     WMA_50    TEMA_10    TEMA_20    TEMA_50  \
Date                                                                           
2018-06-05  57.675818  57.319533  57.906935  58.581011  57.714941  56.520293   
2018-06-06  58.927891  57.906143  58.146317  61.180612  59.356322  57.268083   
2018-06-07  59.894182  58.411105  58.356551  62.564753  60.518232  57.884385   
2018-06-08  60.782946  58.937171  58.573308  63.557516  61.538705  58.491201   
2018-06-11  62.052036  59.726247  58.893928  65.440806  63.105753  59.379695   

             SMA_10   SMA_20    SMA_50       ATR  ...  Adj Close      Close  \
Date                                              ...                         
2018-06-05  57.0136  57.7406  57.79576  2.047639  ...  58.226002  58.226002   
2018-06-06  57.9034  57.9159  57.85704  2.325157  ...  63.900002  63.900002   
2018-06-07  58.6438  58.0083  58.00468  2.389114  ...  63.217999  63.217999   
2018-06-08  59.4400  58.1347  58.24420  2.34

Looks good, we got 33 columns containing TA indicators + sentiment analysis values + price data

In [530]:
class TAEnsembleTwitterSumStrategy(Strategy):
    def init(self):        
        None
    def next(self):
        sentiment = self.data.Vader[-1]+self.data.Afinn[-1]+self.data.Senti[-1]+self.data.Textblob[-1] 

        high, low, close = self.data.High, self.data.Low, self.data.Close
        current_time = self.data.index[-1]
        #If our maximum positive sentiment is higher than our maximum negative sentiment
        #and we're not in a long position or the opposite
        #and we're not extremely overbought or extremely oversold
        #and our fast weighted moving average is above the medium weighted moving average
        
        if sentiment > 500 and not self.position.is_long and self.data.RSI<90:
            if self.data.WMA_10> self.data.WMA_20:
                self.buy(size=0.2, sl=self.data.BBLOWER) # trailing stop using bollinger bands
        elif sentiment < -500 and not self.position.is_short and self.data.RSI>10:
            if self.data.WMA_10< self.data.WMA_20:
                self.sell(size=0.2, sl=self.data.BBUPPER) # trailing stop using bollinger bands
        
        #If our moving average falls below we just close the trade and wait
        elif (self.data.WMA_10< self.data.WMA_20 and self.position.is_long) or (self.data.WMA_10> self.data.WMA_20 and self.position.is_short):
            for trade in self.trades:
                trade.close()
        for trade in self.trades: # trailing stop using bollinger bands
            if trade.is_long:
                trade.sl = self.data.BBLOWER
            else:
                trade.sl = self.data.BBUPPER 
                                    
        #elif (vader+afinn+senti+textblob) < 100 and (vader+afinn+senti+textblob)>-100:


#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(combined_ta_sent_data, TAEnsembleTwitterSumStrategy,cash=10000, commission=0.02, margin=1, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2018-06-05 00:00:00
End                       2020-09-16 00:00:00
Duration                    834 days 00:00:00
Exposure Time [%]                   59.548611
Equity Final [$]                 16499.588291
Equity Peak [$]                  17508.578433
Return [%]                          64.995883
Buy & Hold Return [%]              658.698857
Return (Ann.) [%]                   24.492872
Volatility (Ann.) [%]               19.214852
Sharpe Ratio                         1.274684
Sortino Ratio                        2.570255
Calmar Ratio                         1.525209
Max. Drawdown [%]                    -16.0587
Avg. Drawdown [%]                   -3.177038
Max. Drawdown Duration      524 days 00:00:00
Avg. Drawdown Duration       37 days 00:00:00
# Trades                                   15
Win Rate [%]                        53.333333
Best Trade [%]                     208.773694
Worst Trade [%]                     -16.17273
Avg. Trade [%]                    

In [531]:
bt.plot()

The overall P/L didn't change much but our drawdown went down significantly and we can appreciate much better trades and win rate in the chart.

## Ninth Iteration: Using a classifier for automatic TA analysis

Now it's time to try creating two classifiers for the TA that predicts the next candle and the next weekly candle.

In [536]:
classifier_df_data=ta_df.copy()
classifier_df_data=classifier_df_data.join(price.set_index('Date'),on='Date')
classifier_df_data['next_close']= classifier_df_data['Close'].shift(1)
classifier_df_data['next_week_close']= classifier_df_data['Close'].shift(7)
classifier_df_data.dropna(inplace=True)

In [542]:
len(classifier_df_data)

626

626 samples of data is VERY small for a classifier, but we'll see what happens

In [543]:
train=classifier_df_data.iloc[:200,:] # Preparing 200 samples for training
test=classifier_df_data.iloc[200:,:].drop(['next_close','next_week_close'],axis=1) #And the rest for test

In [544]:
train_x=train.drop(['next_close','next_week_close'],axis=1) #We now drop the next close and next week close data from our X
day_train_y=train['next_close']>train['Close'] #And we make a simple comparison of future price with current price for Y
week_train_y=train['next_week_close']>train['Close']

In [None]:
#We train both RandomForests using a maximum of 15 features and maximizing information gain
day_clas = RandomForestClassifier(n_estimators=10000, criterion='entropy', max_features=15, max_depth=100, random_state=0)
day_clas.fit(train_x, day_train_y)
week_clas = RandomForestClassifier(n_estimators=10000, criterion='entropy', max_features=15, max_depth=100, random_state=0)
week_clas.fit(train_x, week_train_y)
day_pred=day_clas.predict(test)
week_pred=week_clas.predict(test)

In [552]:
class_data=combined_data.iloc[-426:,:].copy() #We get the last 426 rows of the combined sentiment data
#And here we add our predictions to the data
class_data['higher_day_candle_prediction']=day_pred
class_data['higher_week_candle_prediction']=week_pred

In [559]:
class TreeSentimentTwitterSumStrategy(Strategy):
    def init(self):        
        None
    def next(self):
        sentiment = self.data.Vader[-1]+self.data.Afinn[-1]+self.data.Senti[-1]+self.data.Textblob[-1] 

        high, low, close = self.data.High, self.data.Low, self.data.Close
        current_time = self.data.index[-1]
        
        sl_long = close[-1] * (1 - 0.2)
        tl_long = close[-1] * (1 - 0.1)

        sl_short = close[-1] * (1 + 0.2)
        tl_short = close[-1] * (1 + 0.1)
        
        #If our maximum positive sentiment is higher than our maximum negative sentiment
        #and we're not in a long position or the opposite
        #and we predict a increase in price in both our daily and weekly candle or viceversa
        if sentiment > 0 and not self.position.is_long and self.data.higher_day_candle_prediction==True and self.data.higher_week_candle_prediction==True:
            self.buy(size=0.2, sl=sl_long)
        elif sentiment < 0 and not self.position.is_short and self.data.higher_day_candle_prediction==False and self.data.higher_week_candle_prediction==False:
            self.sell(size=0.2, sl=sl_short)
        for trade in self.trades: # trailing stop
            if trade.is_long:
                trade.sl = tl_long
            else:
                trade.sl = tl_short              
                
#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(class_data, TreeSentimentTwitterSumStrategy,cash=10000, commission=0.02, margin=1, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2019-01-09 00:00:00
End                       2020-09-16 00:00:00
Duration                    616 days 00:00:00
Exposure Time [%]                    93.42723
Equity Final [$]                 19822.682383
Equity Peak [$]                  20899.823619
Return [%]                          98.226824
Buy & Hold Return [%]              552.468026
Return (Ann.) [%]                   49.894681
Volatility (Ann.) [%]               28.601416
Sharpe Ratio                         1.744483
Sortino Ratio                        4.815017
Calmar Ratio                         2.932726
Max. Drawdown [%]                  -17.013071
Avg. Drawdown [%]                   -2.510104
Max. Drawdown Duration      262 days 00:00:00
Avg. Drawdown Duration       22 days 00:00:00
# Trades                                   12
Win Rate [%]                        33.333333
Best Trade [%]                     244.923335
Worst Trade [%]                    -22.639643
Avg. Trade [%]                    

In [557]:
bt.plot()

A Sharpe ratio of 1.7 and good P/L with low drawdown despite the very small amount of data available for training. This is a very good result

## RNN: Using Recurrent Neural Networks for Training

Recurrent neural networks are a very modern and very useful form of neural network that feeds data back into the network. The LSTM model in Keras allows the network to simulate a **L**ong **S**hort **T**erm **M**emory and it's often used in time series prediction.

These kind of neural networks require the data to be feed in a special way, using a sequence window size (or step size). For example, the series 1,2,3,4,5 could be feed with a window of 3 like this: "1,2,3" -> "4" and "2,3,4" -> "5"

The following code generates the sequences and creates a model that learns over time and predicts the values as they come, we then take that information and save it for prediction later.

In [None]:
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
            # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

def UniLSTM(data, train_size, win_siz, epochs, units, model_type='Univariate'):    
    #We are going to scale the data, but we can't include the test in the scaling
    #so we must separate the training and test, then scale them then recombine them
    separatedData=data.iloc[:,2:3].values
    
    training_data=separatedData[:train_size]
    test_data=separatedData[train_size:]
    
    sc = MinMaxScaler() # por defecto es 0,1 
    training_set = sc.fit_transform(training_data)
    test_set = sc.transform(test_data)
    
    total_set=np.append(training_set,test_set)
  
    #At this moment we split the entire set into sequences for LSTM
    X, y = split_sequence(total_set, win_siz)
    n_features = 1
    X = X.reshape((X.shape[0], X.shape[1], n_features))
    
    #We separate the train and test data in an ordered way
    X_train=X[0:train_size]
    X_test=X[train_size:]
    y_train=y[0:train_size]
    y_test=y[train_size:]
    
    if model_type =='Stacked':
        #Stacked LSTM
        model = Sequential()
        model.add(LSTM(units, activation='relu', return_sequences=True, input_shape=(win_siz, n_features)))
        model.add(LSTM(units, activation='relu', return_sequences=False, input_shape=(win_siz, n_features)))
        model.add(Dense(1))
        model.compile(optimizer='adam', loss='mse')
    elif model_type =='Bidirectional':
        #Bidirectional LSTM
        model = Sequential()
        model.add(Bidirectional(LSTM(units, activation='relu'), input_shape=(win_siz, n_features)))
        model.add(Dense(1))
        model.compile(optimizer='adam', loss='mse')
    else:
        #Univariate LSTM
        model = Sequential()
        model.add(LSTM(units, activation='relu', return_sequences=False, input_shape=(win_siz, n_features)))
        model.add(Dense(1))
        model.compile(optimizer='adam', loss='mse')
    
    #Fit model on training data
    model.fit(X_train, y_train, epochs=epochs, verbose=1)
    
    #Make predictions and do autoregressive learning on the new data
    predictions=[]
    for input_price, real_price in zip(X_test,y_test):
        y_input = real_price.reshape(1,1)
        x_input = input_price.reshape((1, win_siz, n_features))
        model.save_weights("model.h5")
        predicted_price=model.predict(x_input)
        predictions.append(sc.inverse_transform(predicted_price))
        model.load_weights("model.h5")
        model.fit(x_input,y_input, verbose=0)
    return predictions

N_TRAIN=200
WIN_SIZE=50

UniLSTM_pred=UniLSTM(price, N_TRAIN, WIN_SIZE, 20, 30, model_type='Univariate')
StackedLSTM_pred=UniLSTM(price, N_TRAIN, WIN_SIZE, 20, 15, model_type='Stacked')
BiLSTM_pred=UniLSTM(price, N_TRAIN, WIN_SIZE, 20, 20, model_type='Bidirectional')

Prepare the price dataset with the predictions. First we must reshape our numpy array obtained from the predictions
and then we need to convert it to an array for passing it more easily into the pandas dataframe.

In [626]:
UniLSTM_pred=np.reshape(UniLSTM_pred[:-1],(len(UniLSTM_pred)))
StackedLSTM_pred=np.reshape(StackedLSTM_pred[:-1],(len(UniLSTM_pred)))
BiLSTM_pred=np.reshape(BiLSTM_pred[:-1],(len(UniLSTM_pred)))
UniLSTM_pred=list(UniLSTM_pred)
StackedLSTM_pred=list(StackedLSTM_pred)
BiLSTM_pred=list(BiLSTM_pred)

real_prices=price.iloc[-(len(UniLSTM_pred)):,:].copy()
real_prices['UniLSTM_Prediction']=UniLSTM_pred[:-1]
real_prices['UniLSTM_Buy']=real_prices['UniLSTM_Prediction']>real_prices['Close']
real_prices['StackedLSTM_Prediction']=StackedLSTM_pred[:-1]
real_prices['StackedLSTM_Buy']=real_prices['StackedLSTM_Prediction']>real_prices['Close']
real_prices['BiLSTM_Prediction']=BiLSTM_pred[:-1]
real_prices['BiLSTM_Buy']=real_prices['BiLSTM_Prediction']>real_prices['Close']
real_prices.set_index('Date',inplace=True)

In [639]:
class_data.index.name = 'Date'
real_prices.drop(['Adj Close', 'Close', 'High', 'Low', 'Open', 'Volume'],axis=1,inplace=True)
final_merged_data=class_data.join(real_prices, on='Date')

In [646]:
class RNNTreeSentimentTwitterSumStrategy(Strategy):
    def init(self):        
        None
    def next(self):
        sentiment = self.data.Vader[-1]+self.data.Afinn[-1]+self.data.Senti[-1]+self.data.Textblob[-1] 

        high, low, close = self.data.High, self.data.Low, self.data.Close
        current_time = self.data.index[-1]
        
        sl_long = close[-1] * (1 - 0.2)
        tl_long = close[-1] * (1 - 0.1)

        sl_short = close[-1] * (1 + 0.2)
        tl_short = close[-1] * (1 + 0.1)
        
        #If our maximum positive sentiment is higher than our maximum negative sentiment
        #and we're not in a long position or the opposite
        #and we predict a increase in price in both our daily and weekly candle or viceversa
        if sentiment > 0 and not self.position.is_long and self.data.higher_day_candle_prediction==True and self.data.higher_week_candle_prediction==True:
            if self.data.UniLSTM_Buy and self.data.StackedLSTM_Buy and self.data.BiLSTM_Buy:
                self.buy(size=0.2, sl=sl_long)
        elif sentiment < 0 and not self.position.is_short and self.data.higher_day_candle_prediction==False and self.data.higher_week_candle_prediction==False:
            if self.data.UniLSTM_Buy and not self.data.StackedLSTM_Buy and not self.data.BiLSTM_Buy:
                self.sell(size=0.2, sl=sl_short)
        for trade in self.trades: # trailing stop
            if trade.is_long:
                trade.sl = tl_long
            else:
                trade.sl = tl_short              
                
#2% commission, no margin, no hedging and trade on next open.
bt = Backtest(final_merged_data, RNNTreeSentimentTwitterSumStrategy,cash=10000, commission=0.02, margin=1, trade_on_close=False, hedging=False, exclusive_orders=True)
bt.run()

Start                     2019-01-09 00:00:00
End                       2020-09-16 00:00:00
Duration                    616 days 00:00:00
Exposure Time [%]                   93.192488
Equity Final [$]                  19807.10531
Equity Peak [$]                  20884.246546
Return [%]                          98.071053
Buy & Hold Return [%]              552.468026
Return (Ann.) [%]                   49.824991
Volatility (Ann.) [%]               28.473162
Sharpe Ratio                         1.749893
Sortino Ratio                        4.851727
Calmar Ratio                         2.910437
Max. Drawdown [%]                  -17.119417
Avg. Drawdown [%]                   -2.515122
Max. Drawdown Duration      262 days 00:00:00
Avg. Drawdown Duration       22 days 00:00:00
# Trades                                   11
Win Rate [%]                        36.363636
Best Trade [%]                     244.923335
Worst Trade [%]                    -22.639643
Avg. Trade [%]                    

In [644]:
bt.plot()

There was almost no variation at all in comparison with the past strategy. The algorithm still makes the mistake of buying during the two big falls.

# Conclusions

This project illustrates many of the techniques that can be used in Quant research and trading algorithms development but it's not fully exhaustive: research takes time and money and it's certainly not easy. We can explore many further options and alternatives like:

### Using more data

This project was done on nearly 2 years of data and slightly above 1 million tweets, that's not much data, sadly. Bigger data sets could make the machine learning models significantly better.

### Exploring other NLP alternatives

Sentiment analysis is a field of study with many proposals, using many techniques. In general many of the proposed techniques use either supervised learning or a lexicon approach. A supervised learning approach for tweets is unlikely to be effective because classifying manually the tweets and their impact on the price is prone to error.

On the other side, lexicon approaches are a good fit and this project covered 4 of them, but there other alternatives out there using Deep Learning, cloud providers, and special lexicon generator algorithms. We can also explore on preprocessing the text first before doing the analysis and many other techniques.

### Modifying our Ensemble systems

We combined all our NLP models with a similar weight and without giving preference to any of them, but our tests revealed that Afinn performed far worse than Textblob. We could add weights of importance to each of our sentiment analyzers or discard some of the less good performing ones.

### Portfolio management, diversification and money management techniques

This project barely glanced on the idea of stop loss, trailing stop loss and adaptative/variable stop loss and demonstrated that in general a take profit isn't very good for long term trading.

However, that is just the tip of the iceberg, the real important piece of the puzzle is in maximing the returns of the portfolio with the less risk possible. This entails picking from thousands of securities the best performing ones (or the worst if we can short). This is a non-trivial optimization problem that is at the core of quant research.

While diversification is a good starting point for reducing risk in your portfolio, more advanced techniques like hedging or carry trades are also a possibility. Portfolio management is a very tricky subject even in the field of automated algorithms, with reinforcement learning being one of the most suggested ways to tackle it.

However in order to estimate the risk/return ratio of each security so our algorithm can take the proper measures we also need to determine the proper maximum risk/return we can obtain, and there are many techniques that weren't covered in this project. One of the most common ones being to move your stop loss to your initial price for break even, and then let the trade go on it's way. 

More complex techniques implicate taking profit at certain thresholds but without closing the trade. Easing into a position is also a very popular strategy when planning for long term.

### Fully exploiting the TA indicators

This project only used a few of the technical analysis indicators in a quick test to showcase how it could positively influence our trading decisions when combined with our sentiment analysis. While it's possible to chart the TA indicators and do studies on conditions and backtest on them, we should always remember to calculate them properly to avoid data leaks and to fit them on a separated section of training, then testing it on a validation set and only after we've found a good combination of TA indicators; test them on a test section.

Disclaimer: Pure TA analysis takes a considerable amount of time and it's subject to human bias. If tests are going to be done on specific TA techniques it's suggested to use optimizations algorithms to determine optimal values instead of using grid search techniques as the search space is exponential.

Just like in the project, our TA indicators can serve as input data for a ML algorithm. In this case we saw the use of Random Forests but there exists many other systems for regression and classification.

In general these systems do not have memory of previous values in the time series, but that's where the TA indicators come in and give them that "memory" by providing the values for moving average, volatility, deviation, etc...

However, to make the most of these algorithms we need to scale all our indicators into a range measured in deviations or it won't scale into the future.

For example: Say we train our algorithm on price data between 10 and 40. If price data goes into 300 or 400 it's very possible our algorithm losses a lot of value because the indicators themselves go out of the training range. While some algorithms are more resistant to this, if we can measure the separation from the SMA to the close as a consistent value that scales on time like: 10% above the close or 40% above close instead of a fixed value our algorithms can learn **much** better.

### Properly training the recurrent neural network

RNN with LSTM is very good on time series, specially if we do autoregressive analysis. The main problem lies in defining the structure of the network and training it. This is a non-trivial process which can take a considerable amount of time to achieve good results. 

Asides from the three shallow networks used in the project it's also possible to add modification layers like dropout to avoid overfitting or more dense layers to avoid memory overfitting. Combining the predictions from networks with bigger memory and networks with shorter memory and a long etc...

It's also possible (although non-trivial) to combine the structuring of the neural network with optimization algorithms to determine local maxima configurations on our train data, which we could then extend to our validation data.

There is also a special type of RNN configuration we can use where we can feed multiple sequences of input to obtain a single output, this could be used with TA indicators as well just like we did with the random forest, it just needs to be properly converted.

### Add information with more lag or less lag to the classifiers and in our studies.

While some tests were done with predicting the next weekly candle using our current candle and some indicators, those were minor tests. Using aggregated information such as weekly SMA, weekly ATR, etc... and combining that information with our current daily, or hourly data is a very powerful technique that can help eliminate many bad trades by helping our algorithms keep on the overall right direction when making trades instead of taking small bad trades based on a small momentum on the lower timeframes.

This kind of information lag can be coded and fed into our classification algorithms and networks (or used manually) and it's often very powerful in markets with plenty of noise.

### Finally, combining the multiple classifiers into a single classifier ensemble system

This project used a random forest to combine the TA data, but just like we combined the TA data, we could combine **all** our data for our classifier, be it the sentiment ensemble data, our RNN predictions, our TA... this could also be used by an agent in a reinforcement learning system to learn how to use that data for proper trading.