# Request Financial Statement Datasets
For the analysis in this report, we shall be requesting financial statement datasets for the Dow Jones index from 2020-01-01 to 2025-03-31. The helper file in the 'requesters' folder makes the requests and downloads the datasets into a format ready to use by the Strategy Construction module.

In [1]:
import bql
import os
import importlib
import json

import pandas as pd

import requesters.data_request_helper as helper
from utils.s3_helper import S3Helper
from requesters.company_data import SecurityData
from requesters.data_request_helper import FinancialDataRequester
from prompts import SYSTEM_PROMPTS

### Quarterly Data for the Dow Jones
Configure the variables for the index - INDU Index is the Bloomberg identifier for the Dow Jones Index, a filename to store the data in Bloomberg Lab S3 Storage, a reporting period which will be the quarterly reporting datasets, a start date and a set of rebalance dates.

The rebalance dates are used to check the index on a quarterly basis for changes.

In [5]:
# Index to use for point in time firms
index = 'INDU Index'
filename = 'data_quarterly.json'
reporting_period = 'Q'
start_date = '2020-01-01'

# rebalance dates for the index
rebalance_dates = ['2024-12-31',
        '2024-09-30',
        '2024-06-30',
        '2024-03-31',
        '2023-12-31',
        '2023-09-30',
        '2023-06-30',
        '2023-03-31',
        '2022-12-31',
        '2022-09-30',
        '2022-06-30',
        '2022-03-31',
        '2021-12-31',
        '2021-09-30',
        '2021-06-30',
        '2021-03-31',
        '2020-12-31',
        '2020-09-30',
        '2020-06-30',
        '2020-03-31',
        '2019-12-31',
        '2019-09-30',
        '2019-06-30',
        '2019-03-31']

In [28]:
# Set up the data_helper object using the Financial Data Requester which will format the datasets into the correct format for us to use in the strategy construction module.
data_helper = helper.FinancialDataRequester(index_id=index,
                                            dataset_name='quarterly_pit_indu_blended',
                                            rebalance_dates=rebalance_dates,
                                            reporting_frequency=reporting_period,
                                            start_date=start_date)

100%|██████████| 24/24 [1:04:26<00:00, 161.12s/it]


In [17]:
df_rebalance_dates = data_helper.get_rebalance_dates()

100%|██████████| 24/24 [00:16<00:00,  1.54it/s]

In [18]:
df_rebalance_dates

Unnamed: 0_level_0,Unnamed: 1_level_0,PERIOD_END_DATE
AS_OF_DATE,ID,Unnamed: 2_level_1
2020-01-07,GS UN Equity,2019-09-30
2020-01-07,NKE UN Equity,2019-11-30
2020-01-08,WBA UW Equity,2019-11-30
2020-01-08,WBA UQ Equity,2019-11-30
2020-01-14,JPM UN Equity,2019-12-31
...,...,...
2025-04-24,IBM UN Equity,2025-03-31
2025-04-24,INTC UQ Equity,2025-03-29
2025-04-24,INTC UW Equity,2025-03-29
2025-04-24,MRK UN Equity,2025-03-31


In [19]:
all_data = data_helper.create_financial_dataset()

100%|██████████| 24/24 [02:48<00:00,  7.01s/it]
100%|██████████| 24/24 [00:22<00:00,  1.06it/s]
100%|█████████▉| 456/457 [28:16<00:04,  4.03s/it]

#### Save to Bloomberg Lab S3 Storage

In [20]:
# Write the data to local ephemeral storage
local_file = '/tmp/dow_quarterly_ltm_v3.json'
with open(local_file, 'w') as f:
    json.dump(all_data, f)

In [21]:
# Create S3 Helper object
s3_helper = S3Helper('tmp/fs')

In [22]:
# Upload to Bloomberg Lab S3 Storage
s3_helper.add_file(local_filename=local_file)

### Annual Data for the Dow Jones

In [67]:
# Index to use for point in time firms
index = 'INDU Index'
filename = 'data_annual_pit_dow.json'
reporting_period = 'A'

# rebalance dates for the index
rebalance_dates = ['2024-12-31',
        '2024-09-30',
        '2024-06-30',
        '2024-03-31',
        '2023-12-31',
        '2023-09-30',
        '2023-06-30',
        '2023-03-31',
        '2022-12-31',
        '2022-09-30',
        '2022-06-30',
        '2022-03-31',
        '2021-12-31',
        '2021-09-30',
        '2021-06-30',
        '2021-03-31',
        '2020-12-31',
        '2020-09-30',
        '2020-06-30',
        '2020-03-31',
        '2019-12-31',
        '2019-09-30',
        '2019-06-30',
        '2019-03-31',]

In [68]:
data_helper = helper.FinancialDataRequester(index_id=index,
                                            dataset_name='annual_pit_indu_blended',
                                            rebalance_dates=rebalance_dates,
                                            reporting_frequency=reporting_period)

In [69]:
all_data = data_helper.create_financial_dataset()

100%|██████████| 24/24 [00:48<00:00,  2.04s/it]
 99%|█████████▉| 145/146 [08:51<00:03,  3.69s/it]

In [66]:
all_data['2020-04-24']['AXP UN Equity']['mt']

{'name': 'American Express Co', 'figi': 'BBG000BCQZS4', 'sector': 'Financials'}

### Request Data for Training

In [14]:
# select the index
training_index = 'SPX Index'
filename = 'data_quarterly_pit_spx_refresh_blended.json'
reporting_period = 'Q'
start_date = '2020-01-01'

# rebalance dates for the index
rebalance_dates = ['2024-12-31',
        '2024-09-30',
        '2024-06-30',
        '2024-03-31',
        '2023-12-31',
        '2023-09-30',
        '2023-06-30',
        '2023-03-31',
        '2022-12-31',
        '2022-09-30',
        '2022-06-30',
        '2022-03-31',
        '2021-12-31',
        '2021-09-30',
        '2021-06-30',
        '2021-03-31',
        '2020-12-31',
        '2020-09-30',
        '2020-06-30',
        '2020-03-31',
        '2019-12-31',
        '2019-09-30',
        '2019-06-30',
        '2019-03-31',]

In [16]:
data_helper = helper.FinancialDataRequester(index_id=index,
                                            dataset_name=filename,
                                            rebalance_dates=rebalance_dates,
                                            reporting_frequency=reporting_period,
                                            start_date=start_date)

In [17]:
training_data = data_helper.create_financial_dataset()

100%|██████████| 24/24 [00:32<00:00,  1.36s/it]
100%|█████████▉| 456/457 [29:01<00:03,  3.81s/it]

#### Save to S3

In [None]:
# Write the data to local ephemeral storage
local_file = f'/tmp/{filename}'
with open(local_file, 'w') as f:
    json.dump(all_data, f)

# Upload to Bloomberg Lab S3 Storage
s3_helper.add_file(local_filename=local_file)

### Example prompt
Below is an example of the Income Statement, Balance Sheet and Historical price data that has been generated from the SecurityData class. A more detailed breakdown of the Security Data class can be found in the Tests folder of this project.

In [2]:
security_data = SecurityData('tmp/fs','dow_quarterly_ltm_v3.json')

In [3]:
prompt = security_data.get_prompt('2020-01-08', 'WBA UW Equity', SYSTEM_PROMPTS['BASE']['prompt'])

In [4]:
prompt

[{'role': 'system',
  'content': "You are a financial analyst and must make a buy, sell or hold decision on a company based only on the provided datasets. Compute common financial ratios and then determine the buy or sell decision. Explain your reasons in less than 250 words. Provide a confidence score for how confident you are of the decision. If you are not confident then lower the confidence score. You must answer in a JSON format with a 'decision', 'confidence score' and 'reason'. Provide your answer in JSON format like the two examples: {'decision': BUY, 'confidence score': 80, 'reason': 'Gross profit and EPS have both increased over time'}, {'decision': SELL, 'confidence score': 90, 'reason': 'Price has declined and EPS is falling'} Company financial statements: {financials} "},
 {'role': 'user',
  'content': 'Income Statement:                                                        t           t-1           t-2           t-3           t-4           t-5\nitems                     