<div style="float:right; width:100px; text-align: center; margin: 10px;">
<img src="https://crypto-lake.com/assets/img/lake.png" alt="Lake"/>
</div>

# Twitter influencer analysis

Backtest of a simple Twitter influencer counter-trade strategy, inspired by the well-known Inverse-Cramer strategy.

More information is available in the original Twitter thread: https://twitter.com/crypto_lake_com/status/1704111464573706242

We use the free sample market data from [crypto-lake.com](https://crypto-lake.com/#data) for the BTC-USDT market on Binance. This notebook also requires your Twitter credentials; see the cells below.

Quick links:
- [edit this notebook online](https://mybinder.org/v2/gh/crypto-lake/analysis-sharing/main?filepath=twitter_analysis.ipynb) using Binder
- [follow our activity on Twitter](https://twitter.com/intent/user?screen_name=crypto_lake_com)

In [14]:
!pip install -q tweety-ns python-dotenv lakeapi cufflinks

In [2]:
import datetime
import os

import pandas as pd
import cufflinks as cf

import tweety
import tweety.types
import lakeapi
import dotenv

dotenv.load_dotenv()
cf.go_offline()

# Use the free Lake sample data containing BTC minute candles between 2023-08-20 and 2023-09-20
# For a longer time frame, subscribe to Crypto Lake to get an access key, or use your own data
lakeapi.use_sample_data(anonymous_access=True)

In [3]:
pd.set_option('display.width', 1000)
pd.set_option('display.max_columns', 30)
pd.set_option('display.max_colwidth', 500)
pd.set_option('display.max_rows', 30)
pd.set_option('display.precision', 8)

## Load & prepare data

In [4]:
# Put your Twitter credentials in a .env file or insert them here
app = tweety.Twitter("session")
app.sign_in(os.environ['TWITTER_USERNAME'], os.environ['TWITTER_PASSWORD'])
target_username = "Vertox_DF"

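As a minimal sketch, the credential lookup can be wrapped in a small helper that fails with a readable message when a variable is missing. The helper name `require_env` is hypothetical; only the `TWITTER_USERNAME` and `TWITTER_PASSWORD` variable names come from the cell above.

```python
import os

def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        # Hypothetical error message; adjust to your setup
        raise RuntimeError(f"Missing environment variable {name}; set it in .env or in your shell")
    return value
```

With this helper, the sign-in call from the cell above could read `app.sign_in(require_env('TWITTER_USERNAME'), require_env('TWITTER_PASSWORD'))`.
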
In [5]:
# Download tweets from the user
user = app.get_user_info(target_username)
all_tweets = app.get_tweets(user, replies = True, pages = 10)

len(all_tweets)

155

In [6]:
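# Unpack threads: keep standalone tweets and the target user's own tweets from self-threads
tweets = []
for item in all_tweets:
	if type(item) == tweety.types.Tweet:
		tweets.append(item)
	elif type(item) == tweety.types.SelfThread:
		# Use a separate loop variable so the outer item is not shadowed
		for tweet in item.tweets:
			if tweet.author.username == target_username:
				tweets.append(tweet)
	else:
		raise ValueError(item)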

len(tweets)

179

In [7]:
# Convert tweets to a dataframe
def to_pandas(tweet):
	return dict(tweet) | {'author': tweet.author.username}

df = pd.DataFrame(map(to_pandas, tweets))
df['date'] = pd.to_datetime(df['date']).dt.tz_localize(None)
df = df[['date', 'author', 'text', 'is_retweet', 'is_reply', 'url']]
df = df.dropna().sort_values('date')
df.sample(3)

| | date | author | text | is_retweet | is_reply | url |
| --- | --- | --- | --- | --- | --- | --- |
| 122 | 2023-08-28 15:01:34 | Vertox_DF | @Message_RS I started off with cross exchange feather arbs. Would buy feather packs from the fisherman, extract the feathers and sell them on the GE. | False | True | https://twitter.com/Vertox_DF/status/1696176199326695930 |
| 151 | 2023-08-26 07:28:01 | Vertox_DF | @spinGreekGod No clue what that is | False | True | https://twitter.com/Vertox_DF/status/1695337285321986060 |
| 49 | 2023-09-08 19:06:41 | Vertox_DF | There are of course scenarios where if you hit capacity you can't add more capital but if you haven't you should add more.\n\n9/n | False | True | https://twitter.com/Vertox_DF/status/1700224153901490400 |


In [8]:
# Optional: cache the data. This is useful for reuse, so that Elon won't block you :)
# df.to_csv(f'{target_username}_tweets.csv')

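A cached file written this way can be read back instead of downloading the tweets again. The following is a minimal sketch, assuming the CSV was produced by the commented-out `df.to_csv(...)` call above; the helper name `load_cached_tweets` is hypothetical.

```python
import pandas as pd

def load_cached_tweets(path: str) -> pd.DataFrame:
    """Read tweets saved with df.to_csv(path), restoring the index and the datetime column."""
    # index_col=0 restores the original row index; parse_dates restores the 'date' dtype
    return pd.read_csv(path, index_col=0, parse_dates=['date'])
```

For example, `df = load_cached_tweets(f'{target_username}_tweets.csv')` would replace the download and conversion cells on later runs.
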
In [9]:
# Here we hardcode some common phrases, but you could use something more complex, such as sentiment analysis
# Vertox often tweets just "we're so back" or "it's so over", which makes our life easier
influencer_up = df.text.str.contains('so back') | df.text.str.contains('pump [^a]') | df.text.str.match('.*(is|looks|will be) good.*', case = False) | df.text.str.contains('keeps going')
influencer_down = df.text.str.contains('so over') | df.text.str.contains('[^d] dump') | df.text.str.match('.*(aren|isn|wasn).{,20}good.*', case = False) | df.text.str.match('.*what.{,20}is going on.*', case = False) | df.text.str.contains('liquidated|liquidations')

# Trade against the sentiment
df['prediction'] = 0
df.loc[influencer_up & ~influencer_down, 'prediction'] = -1
df.loc[influencer_down & ~influencer_up, 'prediction'] = +1
# df[df.prediction != 0]

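To see how the phrase rules and the counter-trade sign interact, here is a small self-contained sketch that applies the same patterns to a few made-up texts. The sample texts are hypothetical and only illustrate the behavior; note that a text matching both an "up" and a "down" rule is left at 0.

```python
import pandas as pd

# Hypothetical sample texts, used only to illustrate how the rules behave
sample = pd.DataFrame({'text': [
    "We so back",
    "It's so over",
    "The perfect dump",
    "This looks good",
    "so back and so over at once",
    "gm",
]})

# Same patterns as in the cell above
up = (
    sample.text.str.contains('so back')
    | sample.text.str.contains('pump [^a]')
    | sample.text.str.match('.*(is|looks|will be) good.*', case=False)
    | sample.text.str.contains('keeps going')
)
down = (
    sample.text.str.contains('so over')
    | sample.text.str.contains('[^d] dump')
    | sample.text.str.match('.*(aren|isn|wasn).{,20}good.*', case=False)
    | sample.text.str.match('.*what.{,20}is going on.*', case=False)
    | sample.text.str.contains('liquidated|liquidations')
)

# Trade against the sentiment: bullish tweet -> short (-1), bearish tweet -> long (+1)
sample['prediction'] = 0
sample.loc[up & ~down, 'prediction'] = -1
sample.loc[down & ~up, 'prediction'] = +1
print(sample)
```
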
In [10]:
# We use the free Lake sample data containing BTC minute candles between 2023-08-20 and 2023-09-20
# For a longer time frame, subscribe to Crypto Lake to get an access key, or use your own data
candles = lakeapi.load_data(table = 'candles', start = df.iloc[0].date, symbols = ['BTC-USDT'], exchanges = ['BINANCE'])
# Rolling 24h volume: sum over the last 24*60 one-minute candles, requiring at least 12 hours of data
candles['volume_24h'] = candles['volume'].rolling(24*60, min_periods = 12*60).sum()
candles.tail(3)

| | origin_time | open | high | low | close | volume | trades | received_time | start | stop | exchange | symbol | volume_24h |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 38877 | 2023-09-19 23:57:00 | 27212.35 | 27212.36 | 27204.88 | 27204.89 | 5.14826 | 288 | 2023-09-19 23:57:00 | 1695167820.0 | 1695167880.0 | BINANCE | BTC-USDT | 36218.15856 |
| 38878 | 2023-09-19 23:58:00 | 27204.88 | 27205.54 | 27202.47 | 27205.54 | 7.29524 | 232 | 2023-09-19 23:58:00 | 1695167880.0 | 1695167940.0 | BINANCE | BTC-USDT | 36209.41612 |
| 38879 | 2023-09-19 23:59:00 | 27205.53 | 27210.26 | 27205.53 | 27210.26 | 2.7091 | 147 | 2023-09-19 23:59:00 | 1695167940.0 | 1695168000.0 | BINANCE | BTC-USDT | 36190.46175 |

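The row-based window above equals 24 hours only if there is exactly one candle per minute. As a hedged sketch on synthetic data (the timestamps and volumes below are made up), this compares a row-count window with a time-based window, which may be more robust if your own data has gaps:

```python
import pandas as pd

# Synthetic minute candles with a gap before the last row (hypothetical data)
times = pd.to_datetime([
    '2023-08-20 00:00', '2023-08-20 00:01', '2023-08-20 00:02', '2023-08-20 00:10',
])
candles = pd.DataFrame({'origin_time': times, 'volume': [1.0, 2.0, 3.0, 4.0]})

# Row-based window: always sums the last 3 rows, regardless of elapsed time
candles['volume_rows'] = candles['volume'].rolling(3, min_periods=1).sum()
# Time-based window: sums only rows within the 3 minutes up to each timestamp
candles['volume_time'] = candles.rolling('3min', on='origin_time')['volume'].sum()
print(candles)
```

On the last row the two columns differ: the row-based sum still includes the candles from before the gap, while the time-based sum does not.
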

In [11]:
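# Merge tweet data and trading candle data
# For each tweet, look up the close at tweet time and the close 3 days later (nearest candle within 5 minutes)
df['future_date'] = df['date'] + datetime.timedelta(days = 3)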
merged = pd.merge_asof(df, candles[['close', 'volume', 'volume_24h', 'received_time']], left_on = 'date', right_on = 'received_time', direction = 'nearest', tolerance = pd.Timedelta('5min'))
merged = pd.merge_asof(merged, candles[['close', 'received_time']], left_on = 'future_date', right_on = 'received_time', direction = 'nearest', suffixes = ('', '_future'), tolerance = pd.Timedelta('5min'))
merged = merged.drop(columns = ['received_time', 'received_time_future', 'future_date']).dropna()

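The nearest-match join with a tolerance can be illustrated on a tiny synthetic example. This is a sketch with made-up prices; it only shows how `pd.merge_asof` treats a tweet that has a candle within 5 minutes versus one that does not.

```python
import pandas as pd

# Hypothetical tweets and candles; both frames must be sorted by their time key
tweets = pd.DataFrame({
    'date': pd.to_datetime(['2023-08-24 14:59:02', '2023-08-24 18:00:00']),
    'text': ['first', 'second'],
})
candles = pd.DataFrame({
    'received_time': pd.to_datetime([
        '2023-08-24 14:58:00', '2023-08-24 14:59:00', '2023-08-24 15:00:00',
    ]),
    'close': [100.0, 101.0, 102.0],
})

# The first tweet matches the 14:59:00 candle (2 seconds away);
# the second has no candle within 5 minutes, so its close is NaN
matched = pd.merge_asof(
    tweets, candles,
    left_on='date', right_on='received_time',
    direction='nearest', tolerance=pd.Timedelta('5min'),
)
print(matched)
```

Rows without a match carry NaN values, which is why the cell above ends with `.dropna()`.
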
## Trading simulation

In [12]:
def volume_based_sizing(volume):
	# Thresholds are based on the volume_24h histogram/percentiles
	# Sizing based on volatility is a common trick to improve stability, and volume is a common proxy for volatility
	if volume < 20_000:
		return 2
	elif volume < 30_000:
		return 1.5
	elif volume < 40_000:
		return 0.5
	else:
		return 0.25

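# BTC return over the 3-day holding period after each tweet
merged['btc_return'] = merged['close_future'] / merged['close'] - 1
merged['sizing'] = merged['volume_24h'].apply(volume_based_sizing)
# Our return: BTC return times trade direction (prediction) times position size
merged['our_return'] = merged['btc_return'] * merged['prediction'] * merged['sizing']
# Equity curve as a simple cumulative sum of per-tweet returns (no compounding)
merged['equity'] = merged['our_return'].cumsum()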
merged[merged.prediction != 0].drop(columns = 'url').head()

| | date | author | text | is_retweet | is_reply | prediction | close | volume | volume_24h | close_future | btc_return | sizing | our_return | equity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2023-08-24 14:59:02 | Vertox_DF | @OHare888 It's so over for you bro | False | True | 1 | 26116.05 | 18.03312 | 17613.46288 | 26145.28 | 0.00111924 | 2.0 | 0.00223847 | 0.00223847 |
| 9 | 2023-08-25 11:49:33 | Vertox_DF | @VexTxs truly beautiful. The perfect dump | False | True | 1 | 26092.01 | 2.98171 | 27792.49152 | 25967.0 | -0.00479112 | 1.5 | -0.00718668 | -0.00494821 |
| 18 | 2023-08-25 20:53:00 | Vertox_DF | The other day XRPUSD perp vs XRPUSDT perp on Bybit diverged over 50%! \n\nIf you were leveraged say 2-3x relatively early onto the trade would you have survived this trade without getting liquidated or similar?\n\n5/n https://t.co/I0Gf1c2vky | False | True | 1 | 26034.01 | 7.56852 | 28359.20599 | 25998.71 | -0.00135592 | 1.5 | -0.00203388 | -0.00698209 |
| 23 | 2023-08-26 06:09:53 | Vertox_DF | @icebergy_ It's so over | False | True | 1 | 26055.72 | 37.81869 | 26154.08846 | 26053.28 | -9.365e-05 | 1.5 | -0.00014047 | -0.00712256 |
| 27 | 2023-08-26 07:27:12 | Vertox_DF | @pepecoineth We so back https://t.co/2vQtYQ6UAb | False | True | -1 | 26071.99 | 2.79305 | 25640.36122 | 26034.99 | -0.00141915 | 1.5 | 0.00212872 | -0.00499384 |

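The arithmetic behind these columns can be checked by hand. The sketch below reuses the first two rows of the output above and adds one hypothetical third row; it also uses `np.select` as a vectorized stand-in for `volume_based_sizing`, which is an alternative suggested here rather than what the notebook does.

```python
import numpy as np
import pandas as pd

# First two rows reuse values from the output above; the last row is hypothetical
trades = pd.DataFrame({
    'close': [26116.05, 26092.01, 27000.0],
    'close_future': [26145.28, 25967.0, 27270.0],
    'prediction': [1, 1, -1],
    'volume_24h': [17613.46288, 27792.49152, 45000.0],
})

# Vectorized equivalent of volume_based_sizing: the first matching condition wins
trades['sizing'] = np.select(
    [trades['volume_24h'] < 20_000, trades['volume_24h'] < 30_000, trades['volume_24h'] < 40_000],
    [2.0, 1.5, 0.5],
    default=0.25,
)
trades['btc_return'] = trades['close_future'] / trades['close'] - 1
trades['our_return'] = trades['btc_return'] * trades['prediction'] * trades['sizing']
trades['equity'] = trades['our_return'].cumsum()
print(trades)
```

For the first row, `26145.28 / 26116.05 - 1` is about 0.00111924, and with a long position of size 2 the strategy return is about 0.00223847, matching the output above.
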

## Evaluation

In [13]:
# Benchmark: buy-and-hold BTC return since the first tweet
merged['btc_return_benchmark'] = merged['close'] / merged['close'].iloc[0] - 1
merged.set_index('date')[['equity', 'btc_return_benchmark']].iplot(yTitle = 'Return')
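
Besides the plot, a few summary numbers can help when reading the result. The following is only a sketch on hypothetical values; in the notebook, the `prediction` and `our_return` columns would come from `merged`, and the choice of metrics (trade count, hit rate, mean and total return) is an assumption, not part of the original analysis.

```python
import pandas as pd

# Hypothetical per-tweet results; in the notebook these columns live in `merged`
results = pd.DataFrame({
    'prediction': [1, 0, -1, 1],
    'our_return': [0.002, 0.0, 0.003, -0.007],
})

# Only tweets with a non-zero prediction count as trades
trades = results[results['prediction'] != 0]
summary = {
    'trades': len(trades),
    'hit_rate': (trades['our_return'] > 0).mean(),  # share of profitable trades
    'mean_return': trades['our_return'].mean(),
    'total_return': trades['our_return'].sum(),     # simple sum, no compounding
}
print(summary)
```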