# Unlocking Financial Frontiers: A Deep Dive into Revolutionizing Trading with Pretrained Deep Learning Models and Python Bots

#### by Atanas Vasev

# Introduction

This term paper introduces a pioneering method in the domain of trading, harnessing the capabilities of pretrained deep learning models intricately designed for the complexities of finance. Through the deliberate crafting of a strategic blueprint, the paper outlines the seamless integration of this intelligence into a Python-driven trading bot, subjected to meticulous backtesting using authentic market data. The primary objective is to unveil the transformative potential of deep learning, offering traders liberation from the tedious routine of perpetual financial news analysis.

Given the nature of this paper, situated within a Deep Learning course, it is imperative to acknowledge the formidable challenges in developing the underlying architecture of these models. The computational demands for training from scratch pose significant obstacles in terms of time and cost. While tools such as Google Colab offer a solution, the complexity of creating a neural network from scratch necessitates a judicious approach.

This leads us to the pragmatic choice of utilizing pretrained models. Acknowledging the challenges of constructing architectures from the ground up, this approach aims not only to streamline trading processes but also to enhance overall quality of life. By aligning with the fundamental principles of scientific and general progress, this term paper seeks to make our existence not only more efficient and productive but also undeniably better.

# 1. BERT - Bidirectional Encoder Representations from Transformers

BERT enables context awareness for sentences. It is one of the most popular state-of-the-art text embedding models, published by Google. BERT has caused a revolution in the world of NLP by providing superior results on many NLP tasks, such as question answering, sentence classification, and many more, compared to earlier methods. One of the reasons BERT is more successful is that it uses a context-based embedding model.

Consider the example below:

**Sentence 1: The python ate the rabbit**

**Sentence 2: Python is one of the most popular programming languages**

A context-free embedding model would assign the word python the same vector in both sentences, even though it refers to a snake in the first and to a programming language in the second. BERT looks at the whole sentence, works out which words python is related to, and creates an embedding of the word python based on that context. BERT does this by using the transformer, a state-of-the-art deep learning architecture that is mostly used for natural language processing. The original transformer architecture follows an encoder-decoder paradigm.

The encoder takes the input sentence, learns its representation, and sends that representation to the decoder. The decoder generates the output sentence. The transformer architecture stacks many encoder layers to generate the representation. BERT can be thought of as a transformer with only the encoder part. BERT comes in different configurations depending on how many encoder layers it uses: for example, BERT-base stacks 12 encoder layers and BERT-large stacks 24.

<img src="./IMG/BERT_Arhitecture.PNG" width="800">

The BERT model is pretrained on a large text corpus. What is pretraining? Pretraining is when we train a model, potentially with a very large number of parameters, on a huge dataset for a particular task and save the trained model. For any new task, instead of initializing a new model with random weights, we initialize it with the weights of the trained model and adjust them for the new task. This is helpful because for our own work we may not have easy access to huge volumes of training data, and we avoid the time and resources that training from scratch would require.

The BERT model is pre-trained using two tasks: masked language modelling and next sentence prediction. BERT models have been trained on BookCorpus (about 800 million words) and English Wikipedia (about 2.5 billion words), roughly 3.3 billion words in total. Many domain-specific models have emerged using BERT as the base and are being used for NLP tasks. Some of them are FinBERT for finance, BioBERT for biomedical text, VideoBERT for video captioning and classification, and ClinicalBERT for clinical notes, and more continue to appear. If you are looking for cutting-edge, pre-trained deep learning models for any domain, it is worth checking whether a DomainBERT model for that area exists.
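To make the masked language modelling objective more concrete, here is a minimal sketch of how a single training example could be corrupted. It assumes the 15% selection rate and the 80/10/10 replacement split described in the original BERT paper. The function name `mask_tokens`, the toy vocabulary, and the whitespace tokenization are illustrative assumptions, not BERT's actual WordPiece pipeline.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (corrupted_tokens, labels) for one masked-LM training example.

    labels[i] holds the original token where the model must predict it,
    and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK  # 80%: replace with the mask token
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels

tokens = "the python ate the rabbit".split()
toy_vocab = ["snake", "code", "market", "stock"]  # hypothetical vocabulary
print(mask_tokens(tokens, toy_vocab, seed=0))
```

During pretraining the model sees only the corrupted tokens and is scored on how well it recovers the entries stored in `labels`, which forces it to use the context on both sides of each position.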

### 1.1 FinBERT

FinBERT is a language model based on BERT. It further trains the BERT model on financial data. The additional training corpus is a set of 1.8M Reuters news articles and Financial PhraseBank. The main sentiment analysis dataset used is Financial PhraseBank, which consists of 4845 English sentences selected randomly from financial news found on the LexisNexis database. These sentences were then annotated by 16 people with backgrounds in finance and business.

<img src="./IMG/FinBERT.PNG" width="800">

### 1.2 Importance of sentiment analysis in Finance

Sentiment analysis is a very common task in NLP that aims to assign a "feeling" or an "emotion" to text. Typically, it predicts whether the sentiment is positive, negative, or neutral.
You often see sentiment analysis applied to social media responses to hot-button issues, or used to determine the success of an ad campaign. It is also promising in the financial domain, because changes in sentiment around a company could help predict a rise or fall in that company's stock.
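As a rough illustration of the last step of such a classifier, the sketch below turns three raw scores (logits) into probabilities with a softmax and picks the most likely label. The label order and the example logits are made up for illustration; a real model defines its own label mapping.

```python
import math

# Assumed label order; a real model defines its own mapping in its configuration.
LABELS = ("positive", "negative", "neutral")

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

label, prob = classify([2.1, -0.3, 0.4])  # made-up logits for one headline
print(label, round(prob, 3))
```

The probability attached to the winning label is useful later on, because a trading rule can choose to act only when the model is confident.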


### 1.3 Brief implementation

The FinBERT implementation relies on Hugging Face's pytorch_pretrained_bert library and its implementation of BERT for sequence classification tasks. To demonstrate FinBERT in action, I will use a financial news dataset from [Kaggle](https://www.kaggle.com/datasets/notlucasp/financial-news-headlines). I ran this code in Google Colab to avoid creating a new environment just for a quick demo of FinBERT in action.

The code is in the **[FinBERT_implementation](./FinBERT_implementation.ipynb)** notebook.

Here is the result of the sentiment analysis:

<img src="./IMG/FinBERT_Analysis.PNG" width="500">

We managed to do this in 20 lines of code and without any training on our part.
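For readers who want to try something similar outside the notebook, a comparable demo could look like the sketch below. It is not the notebook's code: it swaps in the `transformers` package and assumes the publicly hosted `ProsusAI/finbert` checkpoint. The helper name `score_headlines` and its `classifier` parameter are hypothetical and exist only so the function can be exercised without downloading the model.

```python
def score_headlines(headlines, classifier=None):
    """Label each headline as positive, negative, or neutral.

    `classifier` is any callable that maps a list of strings to a list of
    {"label": ..., "score": ...} dicts. When omitted, a FinBERT pipeline is
    loaded, which downloads the model on first use.
    """
    if classifier is None:
        # Assumption: the `transformers` package and the public
        # "ProsusAI/finbert" checkpoint, not the notebook's exact setup.
        from transformers import pipeline
        classifier = pipeline("sentiment-analysis", model="ProsusAI/finbert")
    headlines = list(headlines)
    results = classifier(headlines)
    return [
        (text, result["label"], round(result["score"], 3))
        for text, result in zip(headlines, results)
    ]
```

Calling `score_headlines(["Shares fall after weak earnings report"])` with no `classifier` argument loads the model and returns one `(headline, label, score)` tuple per input.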

### Conclusion

FinBERT makes sentiment analysis of financial news feeds much easier. The heavy lifting of training and testing a model on a very large financial corpus has already been done by the researchers, and the model has been made publicly available through Hugging Face. The rest of us can simply use it with very few lines of code to get fairly accurate results for financial sentiment analysis.

# 2. Trading Bot implementation

Some of the world's largest hedge funds are driven by algorithmic trading, outperforming the market through the utilization of cutting-edge algorithms. Now, let's endeavor to create our own.

The initial step involves constructing our foundational bot. We will use the [Alpaca trading API](https://alpaca.markets/), where we will actually test the bot.

*Note: the code is implemented in Visual Studio Code, but I will provide parts of it in the notebook and explain them.*

### 2.1 Installing dependencies and testing the first trade

Firstly, let's set up the necessary dependencies. Ensure you have a virtual environment with Python 3.10 installed, along with the following libraries: lumibot, timedelta, alpaca-trade-api, torch, torchvision, torchaudio, and transformers. Additionally, you'll need to create an account on the [Alpaca trading API](https://alpaca.markets/), which is free, and obtain API keys for testing the bot.
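One way to keep the API keys out of the source code is to read them from environment variables, as in the sketch below. The variable names `ALPACA_API_KEY` and `ALPACA_API_SECRET` and the helper `load_alpaca_config` are hypothetical. The dictionary keys follow the layout that Lumibot's Alpaca broker is assumed to expect, so check them against the Lumibot documentation for your version.

```python
import os

def load_alpaca_config(env=os.environ):
    """Build the credentials dict from environment variables.

    The variable names are hypothetical; the dict keys are an assumption
    about what Lumibot's Alpaca broker expects.
    """
    required = ("ALPACA_API_KEY", "ALPACA_API_SECRET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "API_KEY": env["ALPACA_API_KEY"],
        "API_SECRET": env["ALPACA_API_SECRET"],
        "PAPER": True,  # paper trading: simulated money only
    }
```

Keeping `PAPER` set to `True` means the bot only ever places simulated orders while it is being tested.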

Let's see the first trade of our bot:

**[First Buy Code](./First_Buy.py)**

<img src="./IMG/First_Buy.PNG" width="1000">

So we just bought 10 shares of SPY at a price of $469.49 on the start date we set. The backtest then shows how the market performs until the end date set in the code.

Another thing we can check is the **[tearsheet](./logs/MLTrader_2024-01-25_12-43-32_tearsheet.html)** of our test buy, where we can see more important information about the trade and the strategy.

Now, we have a bot that randomly acquires a few shares intermittently, a strategy that may not be optimal for profit generation. It is essential to introduce position sizing and limits to effectively manage our funds.

### 2.2 Position sizing and limits

In determining our optimal position size, we employ the cash-at-risk method, which captures the total monetary commitment made by an investor or trader to a specific trade. This metric represents the capital vulnerable to loss if the trade deviates from the expected outcome. In our approach, we've configured the parameter to be 50% of the available cash, and it can be adjusted to match risk appetite and trading preferences.

Moving forward, our strategy involves the implementation of stop loss and take profit parameters, which are fundamental tools in trading for managing risk and securing profits. A Stop Loss order automatically sells a security when it reaches a specified price, curbing potential losses. Conversely, a Take Profit order automatically sells a security at a predetermined price to lock in profits. Both orders provide a systematic framework for risk management and profit capture, fostering a disciplined and strategic trading approach. We set them accordingly: Take Profit at 20% and Stop Loss at 5%.
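The two thresholds can be turned into concrete price levels as in the sketch below. It assumes a long position and that both percentages are measured from the entry price; the function names `bracket_prices` and `exit_signal` are illustrative and are not taken from the project code.

```python
def bracket_prices(entry_price, take_profit=0.20, stop_loss=0.05):
    """Return (take_profit_price, stop_loss_price) for a long position."""
    if entry_price <= 0:
        raise ValueError("entry_price must be positive")
    return entry_price * (1 + take_profit), entry_price * (1 - stop_loss)

def exit_signal(current_price, entry_price, take_profit=0.20, stop_loss=0.05):
    """Tell which exit, if any, the current price would trigger."""
    tp_price, sl_price = bracket_prices(entry_price, take_profit, stop_loss)
    if current_price >= tp_price:
        return "take_profit"
    if current_price <= sl_price:
        return "stop_loss"
    return "hold"

tp, sl = bracket_prices(469.49)  # entry price taken from the first test trade
print(f"take profit at {tp:.2f}, stop loss at {sl:.2f}")
```

With a 20% target and a 5% stop, each trade risks one unit of loss for four units of potential gain, which is the asymmetry these two parameters encode.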

With these parameters in place, let's test the **[code](./Position_Sizing.py)**:

<img src="./IMG/Sizing.PNG" width="1000">

So we just bought 106 shares of SPY, an ETF that tracks the S&P 500 index. This is worth about half of our cash, as set by the cash-at-risk parameter.
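The share count can be checked with a short calculation. The sketch below assumes a starting cash balance of $100,000, which the text does not state, and reuses the $469.49 price from the first test trade. The function name `position_size` is illustrative, and rounding down is a design choice here so that the order never exceeds the cash at risk.

```python
import math

def position_size(cash, last_price, cash_at_risk=0.5):
    """Number of whole shares that spends at most cash * cash_at_risk."""
    if last_price <= 0:
        raise ValueError("last_price must be positive")
    if not 0 <= cash_at_risk <= 1:
        raise ValueError("cash_at_risk must be between 0 and 1")
    return math.floor(cash * cash_at_risk / last_price)

shares = position_size(cash=100_000, last_price=469.49)  # assumed starting cash
print(shares, "shares, costing", round(shares * 469.49, 2))
```

Under these assumptions the function returns 106 shares, which is consistent with the trade shown in the screenshot.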

### 2.3 News

Now that our bot is up and running, it's time to delve into the strategy and incorporate some cutting-edge Deep Learning elements. Forecasting market trends has historically been a laborious task, with traders dedicating countless hours to reading and analyzing news. This is where neural networks, specifically FinBERT, come into play. The objective is to extract insights from the news in the three days preceding our market actions and delegate the decision-making process to the model. FinBERT takes on the intricate task of deciphering the information and determining the optimal course of action.
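The idea can be sketched in two small pieces: a helper that computes the three-day news window, and a rule that turns the model's verdict into an action. Both function names, the example date, and the 0.9 confidence threshold are hypothetical choices for illustration, not values taken from the project.

```python
from datetime import date, timedelta

def news_window(today, days_back=3):
    """Return (start, end) ISO dates covering the days before `today`."""
    return (today - timedelta(days=days_back)).isoformat(), today.isoformat()

def decide(label, probability, threshold=0.9):
    """Map a sentiment verdict to an action; the threshold is a hypothetical choice."""
    if probability < threshold:
        return "hold"  # the model is not confident enough to act
    if label == "positive":
        return "buy"
    if label == "negative":
        return "sell"
    return "hold"  # neutral news gives no directional signal

start, end = news_window(date(2024, 1, 25))  # example date only
print(start, end, decide("positive", 0.97))
```

Requiring a high probability before acting keeps the bot out of the market when the news is mixed, at the cost of trading less often.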

# TODO explain the get news!

# References
[1] Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace by Ivan Goncharov, Jul 28, 2023

[2] Financial Sentiment Analysis using FinBert by Praveen Purohit, Dec, 2021