# Backtest the strategies

Use an LLM to predict a buy, sell, or hold recommendation for each company on a given date. Steps needed:

1. Load the LLM: start with the DeepSeek R1 Qwen model at 7B parameters, then try the quantised models
2. Step through each date and each financial statement to get a result
3. Log the results to a file and save it to S3 (the log also lets the run resume after a kernel crash)
4. Need a backtesting framework to apply the results
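
Step 3 can be sketched as an append-only JSON-lines log that is read back on start-up so finished work is skipped. This is a minimal local sketch, not the notebook's implementation: the file layout, `load_done`, `append_result`, `run_all`, and the stand-in `fake_predict` are all hypothetical names chosen for illustration.

```python
import json
import os
import tempfile

def load_done(log_path):
    """Return the set of (date, ticker) pairs already present in the log."""
    done = set()
    if os.path.exists(log_path):
        with open(log_path) as f:
            for line in f:
                line = line.strip()
                if line:
                    rec = json.loads(line)
                    done.add((rec["date"], rec["ticker"]))
    return done

def append_result(log_path, date, ticker, result):
    """Append one result as a single JSON line; the file is closed after each write."""
    with open(log_path, "a") as f:
        f.write(json.dumps({"date": date, "ticker": ticker, "result": result}) + "\n")

def run_all(log_path, tasks, predict):
    """Run predict() for every (date, ticker) not yet logged; return how many ran."""
    done = load_done(log_path)
    ran = 0
    for date, ticker in tasks:
        if (date, ticker) in done:
            continue
        append_result(log_path, date, ticker, predict(date, ticker))
        ran += 1
    return ran

# Demo with a stand-in predictor (hypothetical; the real one would call the LLM)
log_path = os.path.join(tempfile.mkdtemp(), "results.jsonl")
tasks = [("2020-01-31", "AON UN Equity"), ("2020-04-30", "AON UN Equity")]
fake_predict = lambda date, ticker: {"Decision": "HOLD", "confidence score": 50}

first_run = run_all(log_path, tasks[:1], fake_predict)  # simulates a crash after one task
second_run = run_all(log_path, tasks, fake_predict)     # resumes and runs only the rest
```

Writing one line per result means a crash loses at most the record in flight. The local file could then be copied to S3 periodically, for example with the `put` method of `S3FileSystem`, which is left out here so the sketch runs without credentials.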


## Load libraries needed

In [1]:
import json
import boto3
from s3fs import S3FileSystem
import os

import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from huggingface_hub import login
import torch
from accelerate import Accelerator

import pandas as pd

from IPython.display import Markdown, display

from helper import get_s3_folder
import s3_model
import company_data
from s3_model import S3ModelHelper

In [2]:
import importlib
importlib.reload(company_data)
importlib.reload(s3_model)

<module 's3_model' from '/project/s3_model.py'>

## Load the LLM

Models to test:
- Qwen (Qwen/Qwen2.5-7B-Instruct)
- Llama (meta-llama/Llama-3.2-3B-Instruct)
- DeepSeek (deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)

In [3]:
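# Log into Hugging Face

with open('pass.txt') as p:
    # Read only the first line; strip() removes the newline if there is one
    first_line = p.readline().strip()

# Keep the text after the first '=' (or the whole line if there is no '=')
hf_login = first_line.split('=', 1)[-1]
login(hf_login, add_to_git_credential=False)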

In [4]:
# Set up quantization
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4"
)

In [5]:
# Flag to download from Hugging Face again or use the model stored on S3
USE_HF = False
USE_QUANTIZATION = False

model_id = "meta-llama/Llama-3.2-3B-Instruct"
model_id_s3 = 'llama'

if USE_HF:
    # Load the model once; the pipeline below reuses it instead of loading a second copy
    if USE_QUANTIZATION:
        model = AutoModelForCausalLM.from_pretrained(model_id, device_map='auto', quantization_config=quant_config)
    else:
        model = AutoModelForCausalLM.from_pretrained(model_id, device_map='auto', torch_dtype=torch.bfloat16)
else:
    model_helper = s3_model.S3ModelHelper(s3_sub_folder='tmp/fs')
    if USE_QUANTIZATION:
        model = model_helper.load_model(model_id_s3, quant_config)
    else:
        model = model_helper.load_model(model_id_s3)

# The tokenizer always comes from Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build the pipeline from the already-loaded model in both cases
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

if not USE_HF:
    # Remove the local copy downloaded from S3 now that the model is in memory
    model_helper.clear_folder(model_id_s3)

Device set to use cuda:0


## Load Financial PIT dataset

In [6]:
## Load from S3 using the helper file
sec_helper = company_data.SecurityData('tmp/fs','data_quarterly_pit.json')
all_data = sec_helper.get_all_data()

In [22]:
# Use while developing: reload the module and reuse the already-loaded data
importlib.reload(company_data)
sec_helper = company_data.SecurityData('tmp/fs','data_quarterly_pit.json', all_data)

In [7]:
sec_helper.get_security_statement('2020-01-31','AON UN Equity','px')

Date,Price
2019-01-31,100.83
2019-02-28,101.25
2019-03-31,103.69
2019-04-30,118.13
2019-05-31,124.87
2019-06-30,127.68
2019-07-31,127.12
2019-08-31,129.44
2019-09-30,124.43
2019-10-31,125.22


In [48]:
system_prompt = "You are a financial analyst and must make a buy, sell or hold decision on a company based only on the provided datasets. \
        Compute common financial ratios and then determine the buy or sell decision. Explain your reasons in less than 500 words. Provide a \
        confidence score for how confident you are of the decision. If you are not confident then lower the confidence score. Provide your answer so it compiles to a JSON object. \
        Answer in the following JSON format only:\
        {'Decision': BUY, 'confidence score': 80, 'Reason': 'Gross profit and EPS have both increased over time'} \
        {'Decision': SELL, 'confidence score': 90, 'Reason': 'Price has declined and EPS is falling'}"
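
One thing worth checking about the example answers in this prompt: they use single quotes and a bare `BUY`, which Python's `json.loads` does not accept. The sketch below shows this and builds the same example with `json.dumps` instead. Whether this changes the model's answers was not tested here.

```python
import json

# The example answer as written in the prompt: single quotes and a bare BUY
prompt_style = "{'Decision': BUY, 'confidence score': 80, 'Reason': 'Gross profit and EPS have both increased over time'}"

try:
    json.loads(prompt_style)
    prompt_style_is_valid = True
except json.JSONDecodeError:
    prompt_style_is_valid = False

# The same example generated with json.dumps, which emits valid JSON
example = {
    "Decision": "BUY",
    "confidence score": 80,
    "Reason": "Gross profit and EPS have both increased over time",
}
valid_example = json.dumps(example)
round_trip = json.loads(valid_example)
```

Embedding `valid_example` in the system prompt would keep the examples consistent with the parser that reads the reply.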


In [49]:
prompt = sec_helper.get_prompt('2020-01-31','AON UN Equity', system_prompt)

In [50]:
prompt

[{'role': 'system',
  'content': "You are a financial analyst and must make a buy, sell or hold decision on a company based only on the provided datasets.         Compute common financial ratios and then determine the buy or sell decision. Explain your reasons in less than 500 words. Provide a         confidence score for how confident you are of the decision. If you are not confident then lower the confidence score. Provide your answer so it compiles to a JSON object.         Answer in the following JSON format only:        {'Decision': BUY, 'confidence score': 80, 'Reason': 'Gross profit and EPS have both increased over time'}         {'Decision': SELL, 'confidence score': 90, 'Reason': 'Price has declined and EPS is falling'}"},
 {'role': 'user',
  'content': 'Income Statement:                                                                 t           t-1           t-2           t-3           t-4           t-5\n01 Revenue (Adj)                                      9.687000e+08  9.4

## Run an example in LLM

Ran into an out-of-memory problem. Potential fixes:
1. Reduce the size of the model (quantize)
2. Explore multi-GPU inference
3. Reduce the number of prompt tokens

https://saturncloud.io/blog/how-to-solve-cuda-out-of-memory-error-in-pytorch/

https://huggingface.co/docs/accelerate/usage_guides/distributed_inference

https://medium.com/@geronimo7/llms-multi-gpu-inference-with-accelerate-5a8333e4c5db

The remaining problem is splitting a single prompt across multiple GPUs to compute one result. Tensor parallelism may help: https://huggingface.co/docs/transformers/main/en/perf_train_gpu_many#tensor-parallelism

nvidia-smi will show available GPUs on the system.
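
To get a feel for how much the first fix (quantizing) could save, here is a back-of-the-envelope estimate of the memory needed for the weights alone. The parameter count is an assumed round figure, and the estimate ignores activations, the KV cache, and the small overhead of quantization constants, so real usage will be higher.

```python
# Approximate bytes per parameter for common storage formats
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "nf4": 0.5}

def weight_memory_gib(n_params, dtype):
    """Approximate memory for the weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30

n_params = 3e9  # assumed round figure for a "3B" model
estimates = {dtype: weight_memory_gib(n_params, dtype) for dtype in BYTES_PER_PARAM}

for dtype, gib in estimates.items():
    print(f"{dtype:>9}: {gib:.2f} GiB")
```

Under these assumptions, 4-bit storage needs about a quarter of the bfloat16 footprint. The long prompt still adds to memory use at inference time, which is why reducing tokens is listed as a separate fix.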

In [76]:
accelerator = Accelerator()

In [77]:
a_model = accelerator.prepare(model)

In [52]:
tokens = tokenizer.apply_chat_template(prompt, tokenize=True)

In [53]:
len(tokens)

5165

In [27]:
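def format_json(llm_output):
    """Parse the fenced json block in the LLM reply, wherever it appears."""
    form = llm_output['content'].replace('\n','')
    # The reply may contain prose before the fence, so search for its start
    start = form.find('```json')
    eoj = form.find('}```', start)
    if start == -1 or eoj == -1:
        raise ValueError('No fenced JSON block found in LLM output')
    json_obj = json.loads(form[start + 7:eoj + 1])
    # Keep any text after the closing fence
    json_obj['AdditionalContext'] = form[eoj + 4:]
    return json_obj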

In [45]:
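# formatted_chat is only for inspecting the templated text; the pipeline
# applies the chat template itself when given a list of messages
formatted_chat = tokenizer.apply_chat_template(prompt, tokenize=False, add_generation_prompt=True)
outputs = pipeline(
    prompt,
    max_new_tokens=1000,
)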

test_output = outputs[0]['generated_text'][-1]
display(Markdown(test_output['content']))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


To make a buy, sell, or hold decision on the company, I will compute common financial ratios and analyze the data.

**Financial Ratios:**

1. **Gross Margin Ratio**: (Gross Profit / Revenue) = (1.878e+08 / 9.687e+08) = 0.193
2. **Operating Margin Ratio**: (Operating Income / Revenue) = (2.680e+07 / 9.687e+08) = 0.0277
3. **Return on Equity (ROE)**: (Net Income / Total Equity) = (1.010e+07 / 1.125e+09) = 0.0090
4. **Debt-to-Equity Ratio**: (Total Liabilities / Total Equity) = (3.259e+09 / 1.125e+09) = 2.89
5. **Current Ratio**: (Current Assets / Current Liabilities) = (1.010e+09 / 1.124e+09) = 0.90

**Analysis:**

Based on the financial ratios, the company has:

* A relatively high gross margin ratio of 0.193, indicating a good ability to maintain pricing power.
* A relatively low operating margin ratio of 0.0277, indicating a need to improve operational efficiency.
* A relatively low return on equity (ROE) of 0.0090, indicating a need to improve profitability.
* A relatively high debt-to-equity ratio of 2.89, indicating a high level of debt and potential risk.
* A relatively low current ratio of 0.90, indicating a need to improve liquidity.

**Decision:**

Based on the analysis, I would recommend a **BUY** decision with a confidence score of 80. The company's high gross margin ratio and relatively low debt-to-equity ratio are positive indicators. However, the low operating margin ratio and high debt-to-equity ratio are concerns that need to be addressed.

**JSON Response:**

```json
{
  "Decision": "BUY",
  "confidence score": 80,
  "Reason": "Gross profit and EPS have both increased over time, but there are concerns about operational efficiency and debt levels."
}
```

Note: The confidence score is subjective and based on my analysis of the data. It may vary depending on individual perspectives and market conditions.

In [46]:
#display(format_json(test_output))

In [47]:
test_output

{'role': 'assistant',
 'content': 'To make a buy, sell, or hold decision on the company, I will compute common financial ratios and analyze the data.\n\n**Financial Ratios:**\n\n1. **Gross Margin Ratio**: (Gross Profit / Revenue) = (1.878e+08 / 9.687e+08) = 0.193\n2. **Operating Margin Ratio**: (Operating Income / Revenue) = (2.680e+07 / 9.687e+08) = 0.0277\n3. **Return on Equity (ROE)**: (Net Income / Total Equity) = (1.010e+07 / 1.125e+09) = 0.0090\n4. **Debt-to-Equity Ratio**: (Total Liabilities / Total Equity) = (3.259e+09 / 1.125e+09) = 2.89\n5. **Current Ratio**: (Current Assets / Current Liabilities) = (1.010e+09 / 1.124e+09) = 0.90\n\n**Analysis:**\n\nBased on the financial ratios, the company has:\n\n* A relatively high gross margin ratio of 0.193, indicating a good ability to maintain pricing power.\n* A relatively low operating margin ratio of 0.0277, indicating a need to improve operational efficiency.\n* A relatively low return on equity (ROE) of 0.0090, indicating a need 

In [78]:
from accelerate import Accelerator
from accelerate.utils import gather_object

accelerator = Accelerator()

# each GPU creates a string
message=[ f"Hello this is GPU {accelerator.process_index}" ] 

# collect the messages from all GPUs
messages=gather_object(message)

# output the messages only on the main process with accelerator.print() 
accelerator.print(messages)

['Hello this is GPU 0']


In [81]:
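t = torch.cuda.get_device_properties(0).total_memory  # total memory on GPU 0, in bytes
r = torch.cuda.memory_reserved(0)   # bytes reserved by PyTorch's caching allocator
a = torch.cuda.memory_allocated(0)  # bytes currently occupied by tensors
f = r-a  # free inside reserved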

In [82]:
f

1251859968

In [83]:
torch.cuda.mem_get_info()

(124846080, 23609475072)
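
`torch.cuda.mem_get_info()` returns free and total device memory in bytes, in that order. A small helper (hypothetical, for illustration only) makes the pair above easier to read: about 0.12 GiB free out of roughly 22 GiB.

```python
def describe_gpu_memory(free_bytes, total_bytes):
    """Convert a (free, total) byte pair into GiB and a used fraction."""
    gib = 2**30
    used_bytes = total_bytes - free_bytes
    return {
        "free_gib": free_bytes / gib,
        "total_gib": total_bytes / gib,
        "used_fraction": used_bytes / total_bytes,
    }

# Values copied from the cell output above
summary = describe_gpu_memory(124846080, 23609475072)
print({name: round(value, 3) for name, value in summary.items()})
```

With over 99% of the device in use, there is almost no headroom for another generation call, which is consistent with the out-of-memory errors noted earlier.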

In [84]:
torch.cuda.empty_cache()