Make sure to enter the Singularity container and activate the conda environment before running this notebook:

```bash
singularity exec --nv \
    --overlay nscan-overlay.sqf:ro \
    /scratch/work/public/singularity/cuda12.1.1-cudnn8.9.0-devel-ubuntu22.04.2.sif \
    /bin/bash
```

```bash
source /ext3/env.sh
```

```bash
conda activate py311
```



In [5]:
from datasets import load_from_disk
import numpy as np
import torch
import torch._dynamo
torch._dynamo.config.suppress_errors = True
import json
from nscan import NSCAN
from nscan.utils import NewsReturnDataset
from nscan.config import DATA_DIR, PREPROCESSED_DATA_DIR, CHECKPOINT_DIR, RETURNS_DATA_DIR
from nscan.utils import load_returns_and_sp500_data
from nscan.backtesting import run_backtest
from evaluate_model import get_model_predictions


Load test data and model:

In [3]:
max_articles_per_day = 4 # low for demonstration purposes, but can be set higher

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    
# Load the preprocessed datasets and metadata
dataset_dict = load_from_disk(str(PREPROCESSED_DATA_DIR))  # This loads the DatasetDict
test_dataset = NewsReturnDataset(dataset_dict['test'], max_articles_per_day=max_articles_per_day)  # Get the test split

# Load checkpoint which contains model state
checkpoint = torch.load(CHECKPOINT_DIR / "best_model/epoch1_batch2999.pt", map_location=device)

# Load config
with open(CHECKPOINT_DIR / "best_model/params.json", 'r') as f:
    best_config = json.load(f)

# Initialize model with best config
model = NSCAN(
    num_stocks=best_config["num_stocks"],
    num_decoder_layers=best_config["num_decoder_layers"],
    num_heads=best_config["num_heads"],
    num_pred_layers=best_config["num_pred_layers"],
    attn_dropout=best_config["attn_dropout"],
    ff_dropout=best_config["ff_dropout"],
    encoder_name=best_config["encoder_name"],
    use_flash=False
)
model = torch.compile(model, mode='max-autotune-no-cudagraphs')
model.load_state_dict(checkpoint['model_state_dict'], strict=True)
model.to(device)

Some weights of RobertaModel were not initialized from the model checkpoint at FinText/FinText-Base-2007 and are newly initialized: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


OptimizedModule(
  (_orig_mod): NSCAN(
    (encoder): RobertaModel(
      (embeddings): RobertaEmbeddings(
        (word_embeddings): Embedding(50265, 768, padding_idx=1)
        (position_embeddings): Embedding(514, 768, padding_idx=1)
        (token_type_embeddings): Embedding(1, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): RobertaEncoder(
        (layer): ModuleList(
          (0-11): 12 x RobertaLayer(
            (attention): RobertaAttention(
              (self): RobertaSdpaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): RobertaSelfOutput(
                (dense): Linear(in_feature
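
The `OptimizedModule` wrapper in the output above is what `torch.compile` returns, and it holds the original model under `_orig_mod`. Since the cell compiles the model before calling `load_state_dict(..., strict=True)`, the checkpoint keys presumably carry an `_orig_mod.` prefix as well. If compilation is not wanted at inference time, one option is to strip that prefix and load the weights into a plain model. A minimal sketch, assuming that key layout; `strip_compile_prefix` is a hypothetical helper and not part of `nscan`:

```python
from collections import OrderedDict


def strip_compile_prefix(state_dict, prefix="_orig_mod."):
    """Return a copy of state_dict with a leading compile prefix removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only touch keys that actually start with the prefix; leave the rest unchanged.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Toy stand-in for checkpoint['model_state_dict']; real values would be tensors.
toy_state = {"_orig_mod.encoder.weight": 1, "_orig_mod.head.bias": 2, "step": 3}
print(list(strip_compile_prefix(toy_state)))
```

With the real checkpoint, the cleaned dictionary could then be passed to `load_state_dict` on an uncompiled `NSCAN` instance.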

Test the model on the test dataset:

In [6]:
test_results = get_model_predictions(model, test_dataset, device)
print(f"Test Loss: {test_results['test_loss']:.4f}")

Inferencing batch 0 of 46


W1220 14:49:16.642000 26084 Lib\site-packages\torch\_dynamo\convert_frame.py:1125] WON'T CONVERT forward C:\Users\mares\GitProjects\nscan\src\nscan\model\model.py line 254 
W1220 14:49:16.642000 26084 Lib\site-packages\torch\_dynamo\convert_frame.py:1125] due to: 
W1220 14:49:16.642000 26084 Lib\site-packages\torch\_dynamo\convert_frame.py:1125] Traceback (most recent call last):
W1220 14:49:16.642000 26084 Lib\site-packages\torch\_dynamo\convert_frame.py:1125]   File "c:\Users\mares\GitProjects\nscan\venv\Lib\site-packages\torch\_dynamo\output_graph.py", line 1446, in _call_user_compiler
W1220 14:49:16.642000 26084 Lib\site-packages\torch\_dynamo\convert_frame.py:1125]     compiled_fn = compiler_fn(gm, self.example_inputs())
W1220 14:49:16.642000 26084 Lib\site-packages\torch\_dynamo\convert_frame.py:1125]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
W1220 14:49:16.642000 26084 Lib\site-packages\torch\_dynamo\convert_frame.py:1125]   File "c:\Users\mares\GitProjects\nscan\

Inferencing batch 1 of 46
Inferencing batch 2 of 46
Inferencing batch 3 of 46
Inferencing batch 4 of 46
Inferencing batch 5 of 46
Inferencing batch 6 of 46
Inferencing batch 7 of 46
Inferencing batch 8 of 46
Inferencing batch 9 of 46
Inferencing batch 10 of 46
Inferencing batch 11 of 46
Inferencing batch 12 of 46
Inferencing batch 13 of 46
Inferencing batch 14 of 46
Inferencing batch 15 of 46
Inferencing batch 16 of 46
Inferencing batch 17 of 46
Inferencing batch 18 of 46
Inferencing batch 19 of 46
Inferencing batch 20 of 46
Inferencing batch 21 of 46
Inferencing batch 22 of 46
Inferencing batch 23 of 46
Inferencing batch 24 of 46
Inferencing batch 25 of 46
Inferencing batch 26 of 46
Inferencing batch 27 of 46
Inferencing batch 28 of 46
Inferencing batch 29 of 46
Inferencing batch 30 of 46
Inferencing batch 31 of 46
Inferencing batch 32 of 46
Inferencing batch 33 of 46
Inferencing batch 34 of 46
Inferencing batch 35 of 46
Inferencing batch 36 of 46
Inferencing batch 37 of 46
Inferencin

Save the results:

In [7]:
save_dict = {
    'predictions': test_results['predictions'].numpy(),
    'confidences': test_results['confidences'].numpy(),
    'returns': test_results['returns'].numpy(),
    'dates': test_results['dates'],
    'stock_indices': test_results['stock_indices'].numpy(),
    'test_loss': test_results['test_loss']
}
    
save_path = DATA_DIR / "evaluation_results.npz"
np.savez(save_path, **save_dict)
print(f"Saved evaluation results to {save_path}")

Saved evaluation results to C:\Users\mares\GitProjects\nscan\data\evaluation_results.npz


Run backtest:

In [9]:
# Load returns data
returns_by_year, sp500_by_year = load_returns_and_sp500_data([2023], RETURNS_DATA_DIR)

# Run backtest
backtest_results = run_backtest(
    test_results,
    returns_by_year,
    sp500_by_year
)


Year 2023:
Raw data NaN count: 1
Pivoted data NaN count: 424
Total cells: 121750
NaN percentage: 0.35%





Date: 2023-01-03
Portfolio Value: 100000.0
Buying stock 18163 (index = 155): 10000.0 units at price 1.0

Date: 2023-01-04
Portfolio Value: 100000.0
Buying stock 18163 (index = 155): 9956.648751336681 units at price 1.004354

Date: 2023-01-05
Portfolio Value: 99875.30945090001
Buying stock 18163 (index = 155): 10069.243408250106 units at price 0.99188494509

Date: 2023-01-06
Portfolio Value: 100110.48306695
Buying stock 18163 (index = 155): 9858.199862593718 units at price 1.0155047012874283

Date: 2023-01-09
Portfolio Value: 99985.59047081601
Buying stock 18163 (index = 155): 9967.646118384833 units at price 1.0031013268659037

Date: 2023-01-10
Portfolio Value: 99975.83025150705
Buying stock 18163 (index = 155): 9976.519939238422 units at price 1.0021112658562872

Date: 2023-01-11
Portfolio Value: 99894.94183608664
Buying stock 18163 (index = 155): 10049.831676326648 units at price 0.9939961688253829

Date: 2023-01-12
Portfolio Value: 99838.99225138978
Buying stock 18163 (index = 155)
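
The position sizes in this log are consistent with spending 10% of the current portfolio value on the selected stock each day. That fraction is inferred from the printed numbers only; the actual rule inside `run_backtest` may differ. A small sketch that reproduces two of the logged quantities under that assumption (`units_bought` and `ALLOCATION_FRACTION` are hypothetical names):

```python
import math

ALLOCATION_FRACTION = 0.10  # assumption inferred from the log, not read from run_backtest


def units_bought(portfolio_value, price, fraction=ALLOCATION_FRACTION):
    """Units purchased when a fixed fraction of the portfolio is spent at the given price."""
    return fraction * portfolio_value / price


# Portfolio values and prices copied from the 2023-01-04 and 2023-01-05 log entries.
print(units_bought(100000.0, 1.004354))
print(units_bought(99875.30945090001, 0.99188494509))
```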

Hmmm, we can see that the model is choosing to buy the same one stock every day. Why are we seeing this behavior?

Let's take a peek at the results:

In [13]:
with np.load(DATA_DIR / "evaluation_results.npz") as data:
    # Get list of all arrays in the file
    array_names = data.files
    print("-" * 40)

    # Loop through and display info about each array
    for name in array_names:
        array = data[name]
        print(f"\nArray name: {name}")
        print(f"Shape: {array.shape}")
        print(f"Data type: {array.dtype}")
        if name == "predictions":
            large_preds = np.sum(np.abs(array) > 0.01)
            print(f"Number of predictions > 0.01: {large_preds} (out of {array.size})")
        if name == "confidences":
            large_confs = np.sum(np.abs(array) > 0.01)
            print(f"Number of confidences > 0.01: {large_confs} (out of {array.size})")
        if name != "dates" and name != "test_loss":
            print(f"First few elements: {array[:20, :20]}")  # Show first elements
        elif name == "dates":
            print(f"First few elements: {array[:20]}")  # Show first elements
        # elif name == "test_loss":
        #     print(f"First few elements: {array}")

        # For small arrays, optionally show full contents
        if array.size < 20:  # Adjust this threshold as needed
            print("Full contents:")
            print(array)

----------------------------------------

Array name: predictions
Shape: (1442, 487)
Data type: float32
Number of predictions > 0.01: 0 (out of 702254)
First few elements: [[0.0007987  0.00079584 0.00080442 0.0007863  0.00079775 0.00079107
  0.00078678 0.0007844  0.00079346 0.00080824 0.00078917 0.00083351
  0.00081396 0.00078869 0.00078917 0.00080299 0.0007925  0.00079107
  0.00083303 0.00079155]
 [0.00080252 0.00079727 0.00080538 0.00078726 0.00079155 0.00079155
  0.00078678 0.00078535 0.0007863  0.00081444 0.0007863  0.00082922
  0.00081253 0.00079012 0.00078487 0.00079393 0.0007925  0.00079346
  0.00084019 0.00079346]
 [0.00080204 0.00079536 0.00080442 0.00078678 0.00079155 0.00079107
  0.00078678 0.00078487 0.00078869 0.00081205 0.00078678 0.00082922
  0.00081205 0.00078964 0.00078535 0.00079441 0.0007925  0.0007925
  0.00083971 0.00079346]
 [0.00079918 0.00079632 0.00080347 0.00078678 0.00079203 0.00078964
  0.0007863  0.00078487 0.00078773 0.00080967 0.0007863  0.00082493
  0.00

We can see that almost all of the confidences are 0. Confidences must sum to one for each article, so where are the positive confidences? Let's take a closer look:

In [17]:
# Number of days to print
n_days = 20
# Load the data
with np.load(DATA_DIR / "evaluation_results.npz") as data:
    confidences = data['confidences']
    stock_indices = data['stock_indices']

# Find where confidences are above 0.01
high_conf_mask = confidences > 0.01
high_conf_indices = np.where(high_conf_mask)

# Print results
print(f"Found {np.sum(high_conf_mask)} confidences > 0.01")
print("\nDetails of high confidence predictions:")
print("Day | Stock Index | Confidence")
print("-" * 30)
for day, stock in list(zip(high_conf_indices[0], high_conf_indices[1]))[:n_days]:
    print(f"{day:3d} | {stock_indices[day, stock]:11d} | {confidences[day, stock]:.4f}")

# Print summary statistics
unique_stocks = np.unique(stock_indices[high_conf_indices[0], high_conf_indices[1]])
print(f"\nUnique stocks with high confidence: {len(unique_stocks)}")
print("Stock indices:", unique_stocks)

Found 1442 confidences > 0.01

Details of high confidence predictions:
Day | Stock Index | Confidence
------------------------------
  0 |         155 | 1.0000
  1 |         155 | 1.0000
  2 |         155 | 1.0000
  3 |         155 | 1.0000
  4 |         155 | 1.0000
  5 |         155 | 1.0000
  6 |         155 | 1.0000
  7 |         155 | 1.0000
  8 |         155 | 1.0000
  9 |         155 | 1.0000
 10 |         155 | 1.0000
 11 |         155 | 1.0000
 12 |         155 | 1.0000
 13 |         155 | 1.0000
 14 |         155 | 1.0000
 15 |         155 | 1.0000
 16 |         155 | 1.0000
 17 |         155 | 1.0000
 18 |         155 | 1.0000
 19 |         155 | 1.0000

Unique stocks with high confidence: 1
Stock indices: [155]
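
The same collapse can be summarized per day without picking a threshold such as 0.01. The sketch below checks that each row sums to one, finds the column holding the most confidence, and computes the Shannon entropy, which is 0 when all mass sits on one stock. It runs on a synthetic array standing in for `confidences`; `confidence_concentration` is a hypothetical helper. Note that it returns column positions, which would still need to be mapped through `stock_indices` as in the cell above.

```python
import numpy as np


def confidence_concentration(confidences):
    """Summarize how concentrated each row of a (days, stocks) confidence array is."""
    row_sums = confidences.sum(axis=1)
    top_column = confidences.argmax(axis=1)
    top_weight = confidences.max(axis=1)
    # Shannon entropy in nats; clipping avoids log(0) for zero-confidence entries.
    safe = np.clip(confidences, 1e-12, 1.0)
    entropy = -(confidences * np.log(safe)).sum(axis=1)
    return row_sums, top_column, top_weight, entropy


# Synthetic stand-in: 3 days x 5 stocks, all confidence on column 2.
toy = np.zeros((3, 5), dtype=np.float32)
toy[:, 2] = 1.0
row_sums, top_column, top_weight, entropy = confidence_concentration(toy)
print(row_sums, top_column, top_weight, entropy)
```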


Every day the model is putting 100% of its confidence in one stock! Let's use the metadata to find out which stock it is:

In [19]:
metadata = torch.load(PREPROCESSED_DATA_DIR / "metadata.pt")
idx_to_symbol = {idx: symbol for symbol, idx in metadata['symbol_to_idx'].items()}

for stock in unique_stocks:
    print(f"Stock index: {stock}, PERMNO: {idx_to_symbol[stock]}")

Stock index: 155, PERMNO: 18163




Searching for "PERMNO: 18163" in any of the returns data files, we find that its ticker is "PG", that is, Procter & Gamble Co. On Yahoo Finance, we can see that this stock has a beta of 0.41. Low betas usually indicate stocks that are more stable and less volatile, and returns that stay closer to their mean are easier to predict. Maybe the model is hacking the loss function by selecting a more predictable stock?
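
For reference, beta is the covariance of a stock's returns with the market's returns divided by the variance of the market's returns. The sketch below estimates it on synthetic data; the return series, the true beta of 0.4, and the helper name `estimate_beta` are all illustrative assumptions and are not taken from the returns files used in this notebook.

```python
import numpy as np


def estimate_beta(stock_returns, market_returns):
    """Beta = Cov(stock, market) / Var(market), from two aligned return series."""
    cov = np.cov(stock_returns, market_returns, ddof=1)
    return cov[0, 1] / cov[1, 1]


rng = np.random.default_rng(0)
market = rng.normal(0.0, 0.01, size=2000)   # synthetic daily market returns
noise = rng.normal(0.0, 0.005, size=2000)   # stock-specific component
stock = 0.4 * market + noise                # constructed with a true beta of 0.4
print(estimate_beta(stock, market))
```

Beta only captures co-movement with the market. A stock's total volatility also includes its stock-specific component, which is why the volatility itself is worth checking directly.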

Let's take a look at the volatility of the stock:


In [22]:
from calculate_ticker_volatility import calculate_ticker_volatility

data_dir = RETURNS_DATA_DIR
start_year = 2006
end_year = 2021

volatility = calculate_ticker_volatility(data_dir, start_year, end_year)

print("Total number of stocks: ", len(volatility))

# Print the 30 least volatile stocks (sorted lowest to highest)
print("\nTicker Volatilities (sorted low to high):")
print(volatility[:30])

Total number of stocks:  854

Ticker Volatilities (sorted low to high):
        Volatility  Position
PERMNO                      
91380     0.000814         1
83693     0.002279         2
45671     0.002897         3
91388     0.003982         4
12456     0.004358         5
23318     0.006333         6
24360     0.007220         7
50032     0.008119         8
75333     0.008381         9
48960     0.008981        10
22947     0.009659        11
77063     0.009760        12
62770     0.009946        13
22032     0.009955        14
79718     0.010153        15
54690     0.010529        16
42083     0.010562        17
41443     0.010665        18
13821     0.010713        19
84001     0.010842        20
22111     0.010956        21
85238     0.011123        22
76090     0.011125        23
76171     0.011466        24
59379     0.011516        25
18163     0.011541        26
13856     0.011611        27
81126     0.011796        28
11308     0.011888        29
17750     0.011936        30


PG's PERMNO (18163) is number 26 on this list, out of 854 stocks, which puts it in roughly the lowest 3% by volatility. Indeed, the model is choosing a low-volatility stock!
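
The source of `calculate_ticker_volatility` is not shown here. One plausible way to build such a ranking is to take the standard deviation of each stock's daily returns and sort ascending. The sketch below does that on synthetic data and is only an assumption about the approach; the real helper may define volatility differently, and `rank_by_volatility` is a hypothetical name.

```python
import numpy as np


def rank_by_volatility(returns, ids):
    """Return (id, volatility, position) tuples sorted from lowest to highest volatility.

    returns: 2-D array of daily returns with shape (days, stocks); NaNs are ignored.
    ids: sequence of stock identifiers, one per column.
    """
    vol = np.nanstd(returns, axis=0, ddof=1)
    order = np.argsort(vol)
    return [(ids[j], float(vol[j]), pos) for pos, j in enumerate(order, start=1)]


rng = np.random.default_rng(1)
scales = np.array([0.03, 0.01, 0.02])                       # per-stock volatility of the toy data
toy_returns = rng.normal(0.0, 1.0, size=(500, 3)) * scales
toy_returns[0, 1] = np.nan                                  # a missing observation
ranking = rank_by_volatility(toy_returns, ["A", "B", "C"])
print(ranking)
```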

To summarize, the intent of the confidence scores is to allow the model to dynamically discover which stocks are impacted by the news article fed to the model, as these stocks would be more predictable than stocks for which the model was given no information. Instead, the model learned to hack the loss function by putting all of its confidence in a single low-volatility stock every day.
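
The loss function itself is not shown in this notebook. To illustrate why this kind of collapse can happen, the sketch below assumes a confidence-weighted squared error, where each day's loss is the sum over stocks of confidence times squared prediction error. Under that assumption the loss is linear in the confidences, so it is minimized by putting all the weight on whichever stock has the smallest expected error. With near-constant predictions, as in the predictions output earlier, that is a low-volatility stock. All data here is synthetic and `weighted_loss` is a hypothetical stand-in, not the NSCAN loss.

```python
import numpy as np


def weighted_loss(confidences, predictions, returns):
    """Assumed loss: mean over days of sum_i c_i * (pred_i - r_i)^2."""
    return float(np.mean(np.sum(confidences * (predictions - returns) ** 2, axis=1)))


rng = np.random.default_rng(0)
vols = np.array([0.03, 0.02, 0.005, 0.04])           # column 2 is the calmest stock
returns = rng.normal(0.0, 1.0, size=(5000, 4)) * vols
predictions = np.zeros_like(returns)                 # constant predictions for every stock

uniform = np.full_like(returns, 1.0 / 4)             # confidence spread evenly
one_hot = np.zeros_like(returns)
one_hot[:, np.argmin(vols)] = 1.0                    # all confidence on the calmest stock

print("uniform:", weighted_loss(uniform, predictions, returns))
print("one-hot on calmest stock:", weighted_loss(one_hot, predictions, returns))
```

This sketch does not explain why the model settled on stock 155 in particular rather than the very least volatile one; it only shows that such a loss would reward concentrating on a calm stock regardless of the news input.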