In [1]:
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split

Download the TRAM single-label data:

In [2]:
%%bash

mkdir -p ../data/input
wget -O ../data/input/single_label.json https://raw.githubusercontent.com/center-for-threat-informed-defense/tram/main/data/tram2-data/single_label.json

--2024-04-21 15:15:05--  https://raw.githubusercontent.com/center-for-threat-informed-defense/tram/main/data/tram2-data/single_label.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1024483 (1000K) [text/plain]
Saving to: ‘../data/input/single_label.json’


In this version, we treat the text, the technique and the document title as nodes, all three of them.
The ontology is then:

Nodes: 
    text, technique, doc_title
    
Relationships: 
    uses, found-in

Graph triple types will be:
    text uses technique
    text found-in doc_title
    technique found-in doc_title
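
As a minimal sketch, the ontology above can be written down as a small schema and used to check a triple type. The names `TRIPLE_TYPES` and `is_valid_triple_type` are hypothetical and are not used elsewhere in this notebook.

```python
# Hypothetical helper, for illustration only.
# Allowed (head type, relation, tail type) combinations from the ontology.
TRIPLE_TYPES = {
    ("text", "uses", "technique"),
    ("text", "found-in", "doc_title"),
    ("technique", "found-in", "doc_title"),
}


def is_valid_triple_type(head_type, relation, tail_type):
    """Return True if the combination is one of the three triple types."""
    return (head_type, relation, tail_type) in TRIPLE_TYPES


print(is_valid_triple_type("text", "uses", "technique"))
print(is_valid_triple_type("doc_title", "uses", "text"))
```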

In [3]:
data = pd.read_json('../data/input/single_label.json')

In [4]:
data

Unnamed: 0,text,label,doc_title
0,This file extracts credentials from LSASS simi...,T1003.001,NotPetya Technical Analysis A Triple Threat F...
1,It calls OpenProcess on lsass.exe with access ...,T1003.001,NotPetya Technical Analysis A Triple Threat F...
2,It spreads to Microsoft Windows machines using...,T1210,NotPetya Technical Analysis A Triple Threat F...
3,SMB exploitation via EternalBlue,T1210,NotPetya Technical Analysis A Triple Threat F...
4,SMBv1 Exploitation via EternalBlue,T1210,NotPetya Technical Analysis A Triple Threat F...
...,...,...,...
5084,collects local files and information from the ...,T1005,AA21076A TrickBot Malware
5085,uses HTTPS to communicate with its C2 servers,T1071.001,AA21076A TrickBot Malware
5086,samples have used HTTP over ports 447 and 8082...,T1071.001,AA21076A TrickBot Malware
5087,downloads several additional files and saves t...,T1105,AA21076A TrickBot Malware


Getting all unique labels, doc_titles and text:

In [5]:
all_techniques = data['label'].explode().dropna().unique()
all_techniques

array(['T1003.001', 'T1210', 'T1570', 'T1140', 'T1218.011', 'T1059.003',
       'T1057', 'T1518.001', 'T1106', 'T1082', 'T1016', 'T1078', 'T1047',
       'T1027', 'T1056.001', 'T1083', 'T1053.005', 'T1070.004', 'T1105',
       'T1090', 'T1005', 'T1574.002', 'T1071.001', 'T1484.001',
       'T1204.002', 'T1055', 'T1562.001', 'T1033', 'T1566.001', 'T1219',
       'T1547.001', 'T1021.001', 'T1543.003', 'T1569.002', 'T1036.005',
       'T1112', 'T1041', 'T1110', 'T1190', 'T1564.001', 'T1113',
       'T1573.001', 'T1095', 'T1552.001', 'T1012', 'T1074.001',
       'T1548.002', 'T1068', 'T1072', 'T1557.001'], dtype=object)

In [6]:
doc_titles = data['doc_title'].explode().dropna().unique()
doc_titles

array(['NotPetya Technical Analysis  A Triple Threat File Encryption MFT Encryption Credential Theft',
       'Earth Zhulong Familiar Patterns Target Southeast Asian Firms',
       'Malware Spotlight Camaro Dragons TinyNote Backdoor',
       'Rorschach  A New Sophisticated and Fast Ransomware  Check Point Research',
       'Bypassing Intel CET with Counterfeit Objects  OffSec',
       'Emotet Strikes Again  LNK File Leads to Domain Wide Ransomware  The DFIR Report',
       'Malware Analysis LummaC2 Stealer',
       'FedEx Phishing Campaign Abusing TrustedForm and PAAY',
       'Take a NetWalk on the Wild Side',
       'Malicious OAuth applications used to compromise email servers and spread spam  Microsoft Security Blog',
       'Nefilim Ransomware',
       'Deja Vu All Over Again Tax Scammers at Large',
       'Threat Assessment Black Basta Ransomware',
       'Hafniuminspired cyberattacks neutralized by AI',
       'eSentire Threat Intelligence Malware Analysis BatLoader',
       'Ea

In [7]:
text = data['text'].to_numpy()
text

array(['This file extracts credentials from LSASS similar to Mimikatz.',
       'It calls OpenProcess on lsass.exe with access flag set to VM_READ, and looks for the modules wdigest.dll and lsasrv.dll loaded in the lsass.exe process.',
       'It spreads to Microsoft Windows machines using several propagation methods, including the EternalBlue exploit for the CVE-2017-0144 vulnerability in the SMB service.',
       ..., 'samples have used HTTP over ports 447 and 8082 for C2.',
       "downloads several additional files and saves them to the victim's machine.",
       'uses a custom crypter leveraging Microsoft’s CryptoAPI to encrypt C2 traffic.'],
      dtype=object)

Collect them all in one array, in the order techniques, doc_titles, text.
There are 50 labels, 149 doc_titles and 5089 texts.

In [8]:
nodes = np.concatenate((all_techniques, doc_titles, text))
nodes

array(['T1003.001', 'T1210', 'T1570', ...,
       'samples have used HTTP over ports 447 and 8082 for C2.',
       "downloads several additional files and saves them to the victim's machine.",
       'uses a custom crypter leveraging Microsoft’s CryptoAPI to encrypt C2 traffic.'],
      dtype=object)

The node list will then have 
    0-49 techniques
    50-198 doc_titles 
    199-5287 text
    
Now to make the numeric triples, we will use the indexes of the nodes from the nodes list.
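
The index ranges follow directly from the group sizes. The sketch below derives them from toy stand-ins for the three arrays (the toy values are made up); with 50 techniques, 149 doc_titles and 5089 texts the same arithmetic gives 0-49, 50-198 and 199-5287.

```python
import numpy as np

# Toy stand-ins for all_techniques, doc_titles and text (made-up values).
techniques = np.array(["T1003.001", "T1210"], dtype=object)
titles = np.array(["Report A"], dtype=object)
sentences = np.array(["sentence one", "sentence two", "sentence three"], dtype=object)

nodes = np.concatenate((techniques, titles, sentences))

# Start offset of each group in the concatenated node list.
title_start = len(techniques)
text_start = len(techniques) + len(titles)

# Inclusive (first, last) index of each node type.
ranges = {
    "technique": (0, title_start - 1),
    "doc_title": (title_start, text_start - 1),
    "text": (text_start, len(nodes) - 1),
}
print(ranges)
```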


Let us say that of the two relationships, uses = 0 and found-in = 1

1. we make the triples for text uses technique
2. we make the triples for text found-in doc_title
3. we make the triples for technique found-in doc_title

In [9]:
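triples = []
tech2doc = []

# Columns of each row, in order: text, label (technique), doc_title.
np_data = data.to_numpy()

for row in np_data:
    # np.where(...)[0][0] is the first position of the string in the node list.
    text_index = np.where(nodes == row[0])[0][0]
    technique_index = np.where(nodes == row[1])[0][0]
    doc_title_index = np.where(nodes == row[2])[0][0]

    # Relation 0 = uses, relation 1 = found-in.
    triples.extend(
        ((text_index, 0, technique_index),
         (text_index, 1, doc_title_index))
    )
    tech2doc.append((technique_index, 1, doc_title_index))

# A technique can occur many times in one document; keep each pair only once.
tech2doc = np.unique(tech2doc, axis=0)
triples = np.array(triples)
triples = np.append(triples, tech2doc, axis=0)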
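
The loop above scans the whole node array three times per row. A possible alternative, sketched here under the assumption that the DataFrame has the columns `text`, `label` and `doc_title`, is to build a dictionary from node string to its first index once. The function name `build_triples` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def build_triples(data, nodes):
    """Build (head, relation, tail) index triples with a dictionary lookup."""
    # Keep the first index of each string, mirroring np.where(...)[0][0].
    node_index = {}
    for i, node in enumerate(nodes):
        node_index.setdefault(node, i)

    triples, tech2doc = [], set()
    columns = data[["text", "label", "doc_title"]]
    for text, label, doc_title in columns.itertuples(index=False):
        t, l, d = node_index[text], node_index[label], node_index[doc_title]
        triples.append((t, 0, l))   # text uses technique
        triples.append((t, 1, d))   # text found-in doc_title
        tech2doc.add((l, 1, d))     # technique found-in doc_title, deduplicated
    # sorted() gives the same row order as np.unique(..., axis=0).
    return np.array(triples + sorted(tech2doc))


# Toy data, made up for illustration.
toy_nodes = ["T1", "D", "a", "b"]
toy_data = pd.DataFrame(
    {"text": ["a", "b"], "label": ["T1", "T1"], "doc_title": ["D", "D"]}
)
toy_triples = build_triples(toy_data, toy_nodes)
print(toy_triples)
```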

In [10]:
triples

array([[199,   0,   0],
       [199,   1,  50],
       [200,   0,   0],
       ...,
       [ 48,   1, 170],
       [ 48,   1, 183],
       [ 49,   1, 170]])

In [11]:
len(triples)

11868

In [12]:
assert len(triples) == 2 * len(np_data) + len(tech2doc)
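
Beyond the length check, one could also verify that every triple respects the ontology, using the index ranges of the node list. This is only a sketch: `check_triple_ranges` is a hypothetical helper, and it assumes that no text string is identical to a technique id or a document title (otherwise the first-match lookup would place it in another range).

```python
import numpy as np


def check_triple_ranges(triples, n_techniques, n_titles, n_nodes):
    """Return True if all triples match the three allowed triple types."""
    text_start = n_techniques + n_titles
    head, rel, tail = triples[:, 0], triples[:, 1], triples[:, 2]

    def is_technique(x):
        return (x >= 0) & (x < n_techniques)

    def is_title(x):
        return (x >= n_techniques) & (x < text_start)

    def is_text(x):
        return (x >= text_start) & (x < n_nodes)

    # Relation 0 (uses): text -> technique.
    uses_ok = is_text(head) & is_technique(tail)
    # Relation 1 (found-in): text or technique -> doc_title.
    found_in_ok = (is_text(head) | is_technique(head)) & is_title(tail)
    valid = np.where(rel == 0, uses_ok, found_in_ok) & np.isin(rel, (0, 1))
    return bool(np.all(valid))


# Toy example with 2 techniques, 1 title and 3 texts (made-up indices).
good = np.array([[3, 0, 0], [3, 1, 2], [0, 1, 2]])
bad = np.array([[0, 0, 3]])
print(check_triple_ranges(good, 2, 1, 6))
print(check_triple_ranges(bad, 2, 1, 6))
```

For the data in this notebook the call would be `check_triple_ranges(triples, 50, 149, len(nodes))`.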

Split the triples into train and validation sets and save them to files:

In [13]:
%%bash

mkdir -p ../data/output/single

In [14]:
output = "../data/output/single"
pd.DataFrame(triples).to_csv(output + '/triples.txt', index=False, header=False, sep=' ')
train, valid = train_test_split(triples, test_size=0.05)
pd.DataFrame(train).to_csv(output + '/train.txt', index=False, header=False, sep=' ')
pd.DataFrame(valid).to_csv(output + '/valid.txt', index=False, header=False, sep=' ')
assert len(train) + len(valid) == len(triples)
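
Note that the split above is not seeded, so each run produces different train and validation files. If reproducibility matters, a fixed `random_state` can be passed; the value 42 and the toy array below are arbitrary assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in for the triples array: 100 rows of 3 integers.
toy_triples = np.arange(300).reshape(100, 3)

# The same random_state yields the same shuffled split on every call.
train_a, valid_a = train_test_split(toy_triples, test_size=0.05, random_state=42)
train_b, valid_b = train_test_split(toy_triples, test_size=0.05, random_state=42)

print(len(train_a), len(valid_a))
print(np.array_equal(valid_a, valid_b))
```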

Also split the nodes into train, validation and test sets for MLM.

Question: if we split the nodes, what happens to their indexes? They must not be shuffled, otherwise the triples will be wrong.
Look at KEPLER.
Answer: it is OK, KEPLER uses the whole nodes.bpe for the KE task.

In [15]:
def write_file(file_path, _list):
    with open(file_path, 'w') as f:
        for _row in _list:
            f.write(_row.replace("\n", r"\n").replace("\t", r"\t") + "\n")

In [16]:
n_train, n_test = train_test_split(nodes, test_size=0.2)

n_train, n_valid = train_test_split(n_train, test_size=0.05)

assert len(n_train) + len(n_test) + len(n_valid) == len(nodes)

write_file(output + '/nodes_train.txt', n_train)
write_file(output + '/nodes_valid.txt', n_valid)
write_file(output + '/nodes_test.txt', n_test)

Save the nodes to a file. This is somewhat tricky, since some of the node texts contain newline characters; write_file escapes them so that each node stays on exactly one line.

In [17]:
write_file(output + '/nodes.txt', nodes)
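
To check that the escaping keeps one node per line, the file can be read back and unescaped. The sketch below repeats `write_file` so that it runs on its own; `read_file` is a hypothetical inverse that is not part of the notebook. It is only a best-effort inverse: a text that already contains a literal backslash followed by `n` or `t` would be changed on the way back.

```python
import os
import tempfile


def write_file(file_path, rows):
    """Write one row per line, escaping newlines and tabs."""
    with open(file_path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(row.replace("\n", r"\n").replace("\t", r"\t") + "\n")


def read_file(file_path):
    """Hypothetical inverse of write_file: undo the escaping line by line."""
    with open(file_path, encoding="utf-8") as f:
        return [
            line.rstrip("\n").replace(r"\t", "\t").replace(r"\n", "\n")
            for line in f
        ]


samples = ["line one\nline two", "col a\tcol b", "plain"]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "nodes.txt")
    write_file(path, samples)
    restored = read_file(path)

# One line per node, and the original strings come back unchanged.
print(len(restored), restored == samples)
```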

Now we follow KEPLER's README.md and prepare the KE and MLM data from the above files.

    We will use the nodes...txt as our MLM data.
    We will use the triples...txt as our KE data.
    

We first install the local version of KEPLER, which is built by extending fairseq, and then start with the KE data preprocessing:

In [11]:
%%bash

cd ../..
python -m pip install --editable .

Obtaining file:///home/sougata/projects/MyKEPLER
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Installing collected packages: fairseq
  Attempting uninstall: fairseq
    Found existing installation: fairseq 0.8.0
    Uninstalling fairseq-0.8.0:
      Successfully uninstalled fairseq-0.8.0
  Running setup.py develop for fairseq
Successfully installed fairseq


1. Encode the entity descriptions with the GPT-2 BPE:

In [41]:
%%bash

mkdir -p ../data/gpt2_bpe
wget -O ../data/gpt2_bpe/encoder.json https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json
wget -O ../data/gpt2_bpe/vocab.bpe https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe

--2024-04-21 15:29:52--  https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 54.230.111.15, 54.230.111.84, 54.230.111.29, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|54.230.111.15|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1042301 (1018K) [text/plain]
Saving to: ‘../data/gpt2_bpe/encoder.json’


In [43]:
!ls

m_bpe_encoder.sh  pretrain.sh		       singleLabel2kepler_v2.py
pretrain.py	  singleLabel2kepler_v2.ipynb


In [50]:
%%bash

export LD_LIBRARY_PATH=/home/sougata/.local/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH

python ../../examples/roberta/multiprocessing_bpe_encoder.py \
    --encoder-json ../data/gpt2_bpe/encoder.json \
    --vocab-bpe ../data/gpt2_bpe/vocab.bpe \
    --inputs ../data/output/single/nodes.txt \
    --outputs ../data/output/single/nodes.bpe \
    --keep-empty \
    --workers 60
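
Because the triples refer to nodes by their position, the encoded file should have exactly one line per node, which is presumably why `--keep-empty` is passed. A quick check could look like the sketch below. The helper names are hypothetical, and the demo uses temporary files with made-up token ids rather than the real `nodes.txt` and `nodes.bpe`.

```python
import os
import tempfile


def count_lines(path):
    """Count the lines of a text file without loading it into memory."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def same_line_count(text_path, bpe_path):
    """Return True if the raw and the BPE-encoded file have equally many lines."""
    return count_lines(text_path) == count_lines(bpe_path)


with tempfile.TemporaryDirectory() as tmp:
    txt = os.path.join(tmp, "nodes.txt")
    bpe = os.path.join(tmp, "nodes.bpe")
    with open(txt, "w", encoding="utf-8") as f:
        f.write("first node\n\nthird node\n")
    with open(bpe, "w", encoding="utf-8") as f:
        # Placeholder token ids, not real GPT-2 BPE output.
        f.write("101 102\n\n103 102\n")
    aligned = same_line_count(txt, bpe)

print(aligned)
```

In this notebook the call would be `same_line_count('../data/output/single/nodes.txt', '../data/output/single/nodes.bpe')`.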

2. Do negative sampling and dump the whole training and validation data:

In [51]:
%%bash

python ../../examples/KEPLER/Pretrain/KGpreprocess.py --dumpPath ../data/output/single/KE1 \
    -ns 1 \
    --ent_desc ../data/output/single/nodes.bpe \
    --train ../data/output/single/train.txt \
    --valid ../data/output/single/valid.txt

2024-04-21 16:00:53.934957 load finish
2024-04-21 16:00:53.967690 preparation finished
2024-04-21 16:00:55.061555 training set finished
2024-04-21 16:00:55.121704 all finished


3. Then randomly split the KE training data into smaller parts, so that the number of training instances in each part aligns with the MLM training data.
For our case it will be just one split, since our data is small.

Question: what does negative_sampling_size = 1 do? It could be the number of false (corrupted) triples that are generated for each true triple.
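
For intuition, the sketch below shows a generic form of negative sampling as used with TransE-style models: for every true triple, `ns` corrupted heads and `ns` corrupted tails are drawn at random. This is an assumption about the general technique, not a transcription of KGpreprocess.py; the function `corrupt_triples` and the toy triples are hypothetical.

```python
import random


def corrupt_triples(triples, n_entities, ns=1, seed=0):
    """Draw ns false heads and ns false tails for every true triple."""
    rng = random.Random(seed)
    known = set(triples)
    neg_heads, neg_tails = [], []
    for h, r, t in triples:
        heads, tails = [], []
        # Resample until the corrupted triple is not a known true triple.
        # Assumes enough entities exist for a valid candidate to be found.
        while len(heads) < ns:
            cand = rng.randrange(n_entities)
            if (cand, r, t) not in known:
                heads.append(cand)
        while len(tails) < ns:
            cand = rng.randrange(n_entities)
            if (h, r, cand) not in known:
                tails.append(cand)
        neg_heads.append(heads)
        neg_tails.append(tails)
    return neg_heads, neg_tails


# Toy graph with 6 entities (made-up indices).
toy = [(3, 0, 0), (3, 1, 2), (4, 0, 1), (4, 1, 2), (0, 1, 2), (1, 1, 2)]
neg_heads, neg_tails = corrupt_triples(toy, n_entities=6, ns=1)
print(neg_heads)
print(neg_tails)
```

Under this reading, the negHead and negTail directories that are binarized in the next step would hold the descriptions of such corrupted entities.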

In [52]:
%%bash

python ../../examples/KEPLER/Pretrain/splitDump.py --Path ../data/output/single/KE1 \
    --split_size 6834352 \
    --negative_sampling_size 1

The data will be splited into 1 splits


4. We then binarize them for training:

In [54]:
%%bash

export LD_LIBRARY_PATH=/home/sougata/.local/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH

wget -O ../data/gpt2_bpe/dict.txt https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/dict.txt

KE_Data=../data/output/single/KE1_0/
for SPLIT in head tail negHead negTail;
  do
    python -m fairseq_cli.preprocess \
      --only-source \
      --srcdict ../data/gpt2_bpe/dict.txt \
      --trainpref ${KE_Data}${SPLIT}/train.bpe \
      --validpref ${KE_Data}${SPLIT}/valid.bpe \
      --destdir ${KE_Data}${SPLIT} \
      --workers 60; \
  done

--2024-04-21 16:02:24--  https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/dict.txt
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 54.230.111.15, 54.230.111.29, 54.230.111.20, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|54.230.111.15|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 603290 (589K) [text/plain]
Saving to: ‘../data/gpt2_bpe/dict.txt’


Namespace(no_progress_bar=False, log_interval=1000, log_format=None, tensorboard_logdir='', tbmf_wrapper=False, seed=1, cpu=False, fp16=False, memory_efficient_fp16=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, min_loss_scale=0.0001, threshold_loss_scale=None, user_dir=None, criterion='cross_entropy', tokenizer=None, bpe=None, optimizer='nag', lr_scheduler='fixed', task='translation', source_lang=None, target_lang=None, trainpref='../data/output/single/KE1_0/head/train.bpe', validpref='../data/output/single/KE1_0/head/valid.bpe', testpref=None, destdir='../data/output/single/KE1_0/head', thresholdtgt=0, thresholdsrc=0, tgtdict=None, srcdict='../data/gpt2_bpe/dict.txt', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='mmap', joined_dictionary=False, only_source=True, padding_factor=8, workers=60, bert=False)
| [None] Dictionary: 50263 types
| [None] ../data/output/single/KE1_0/head/train.bpe: 11274 sents, 136303 tokens, 0.0% replaced by <unk>
| [

We now start with MLM data preprocessing:


1. Now we encode nodes_train, nodes_valid and nodes_test with the GPT-2 BPE:
   (gpt2_bpe is already downloaded during the KE data preparation, we reuse that.)

In [56]:
%%bash

export LD_LIBRARY_PATH=/home/sougata/.local/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH

mkdir -p ../data/output/single/MLM

for SPLIT in train valid test; do \
    python -m examples.roberta.multiprocessing_bpe_encoder \
        --encoder-json ../data/gpt2_bpe/encoder.json \
        --vocab-bpe ../data/gpt2_bpe/vocab.bpe \
        --inputs ../data/output/single/nodes_${SPLIT}.txt \
        --outputs ../data/output/single/MLM/nodes_${SPLIT}.bpe \
        --keep-empty \
        --workers 60; \
done

2. We then preprocess/binarize the data using the GPT-2 fairseq dictionary:

In [57]:
%%bash

export LD_LIBRARY_PATH=/home/sougata/.local/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH

mkdir -p ../data/output/single/MLM-bin

python -m fairseq_cli.preprocess \
    --only-source \
    --srcdict ../data/gpt2_bpe/dict.txt \
    --trainpref ../data/output/single/MLM/nodes_train.bpe \
    --validpref ../data/output/single/MLM/nodes_valid.bpe \
    --testpref ../data/output/single/MLM/nodes_test.bpe \
    --destdir ../data/output/single/MLM-bin \
    --workers 60

Namespace(no_progress_bar=False, log_interval=1000, log_format=None, tensorboard_logdir='', tbmf_wrapper=False, seed=1, cpu=False, fp16=False, memory_efficient_fp16=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, min_loss_scale=0.0001, threshold_loss_scale=None, user_dir=None, criterion='cross_entropy', tokenizer=None, bpe=None, optimizer='nag', lr_scheduler='fixed', task='translation', source_lang=None, target_lang=None, trainpref='../data/output/single/MLM/nodes_train.bpe', validpref='../data/output/single/MLM/nodes_valid.bpe', testpref='../data/output/single/MLM/nodes_test.bpe', destdir='../data/output/single/MLM-bin', thresholdtgt=0, thresholdsrc=0, tgtdict=None, srcdict='../data/gpt2_bpe/dict.txt', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='mmap', joined_dictionary=False, only_source=True, padding_factor=8, workers=60, bert=False)
| [None] Dictionary: 50263 types
| [None] ../data/output/single/MLM/nodes_train.bpe: 4019 sents, 53423 toke

All preprocessing is done, now we try out training the model with our data.

We first download the pretrained models:

In [58]:
# %%bash

# mkdir -p ../data/keplerModels
# 
# if ! [ -f ../data/keplerModels/KEPLERforNLP.pt ]; then
#     wget -O ../data/keplerModels/KEPLERforNLP.pt https://cloud.tsinghua.edu.cn/seafhttp/files/a21e5254-ceac-4b88-88e9-8ec58cbe8a1a/KEPLERforNLP.pt
# fi
# if ! [ -f ../data/keplerModels/KEPLERforKE.pt ]; then
#     wget -O ../data/keplerModels/KEPLERforKE.pt https://cloud.tsinghua.edu.cn/seafhttp/files/a684dc30-6a1a-4613-97ad-0144ae84e1ca/KEPLERforKE.pt
# fi

CalledProcessError: Command 'b'\n# mkdir ../data/keplerModels\n# \n# if ! [ -f ../data/keplerModels/KEPLERforNLP.pt ]; then\n#     wget -o ../data/keplerModels/KEPLERforNLP.pt https://cloud.tsinghua.edu.cn/seafhttp/files/a21e5254-ceac-4b88-88e9-8ec58cbe8a1a/KEPLERforNLP.pt\n# fi\n# if ! [ -f ../data/keplerModels/KEPLERforKE.p ]; then\nwget -o ../data/keplerModels/KEPLERforKE.pt https://cloud.tsinghua.edu.cn/seafhttp/files/a684dc30-6a1a-4613-97ad-0144ae84e1ca/KEPLERforKE.pt\n# fi\n'' returned non-zero exit status 8.

In [None]:
# !mkdir -p ../data/keplerModels

In [None]:
# !wget -O ../data/keplerModels/KEPLERforNLP.pt https://cloud.tsinghua.edu.cn/seafhttp/files/70495ae5-48a0-48e4-9c1a-fe4893e80d3f/KEPLERforNLP.pt

In [None]:
# !wget -d -O ../data/keplerModels/KEPLERforKE.pt https://cloud.tsinghua.edu.cn/seafhttp/files/a3b23761-e0bd-4850-b8c4-3788ce6cea3f/KEPLERforKE.pt

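Then we train the model. ROBERTA_PATH points at the last checkpoint here; the commented-out line starts from the KEPLERforNLP.pt model instead: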

In [None]:
%%bash

export LD_LIBRARY_PATH=/home/sougata/.local/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH

TOTAL_UPDATES=125000                                    # Total number of training steps
WARMUP_UPDATES=10000                                    # Warmup the learning rate over this many updates
LR=6e-04                                                # Peak LR for polynomial LR scheduler.
NUM_CLASSES=2                           
MAX_SENTENCES=3                                         # Batch size.
NUM_NODES=1                                             # Number of machines
ROBERTA_PATH=../data/checkpoints/checkpoint_last.pt
# ROBERTA_PATH=../data/keplerModels/KEPLERforNLP.pt       # Path to the original roberta model
CHECKPOINT_PATH=../data/checkpoints                     # Directory to store the checkpoints
UPDATE_FREQ=`expr 784 / $NUM_NODES`                     # Increase the batch size

DATA_DIR=../data/output/single

#Path to the preprocessed KE dataset, each item corresponds to a data directory for one epoch
KE_DATA=$DATA_DIR/KE1_0:

DIST_SIZE=`expr $NUM_NODES`

python -m fairseq_cli.train $DATA_DIR/MLM-bin --KEdata $KE_DATA --restore-file $ROBERTA_PATH \
        --save-dir $CHECKPOINT_PATH \
        --max-sentences $MAX_SENTENCES \
        --tokens-per-sample 512 \
        --task MLMetKE \
        --sample-break-mode complete \
        --required-batch-size-multiple 1 \
        --arch roberta_base \
        --criterion MLMetKE \
        --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
        --optimizer adam --adam-betas "(0.9, 0.98)" --adam-eps 1e-06 \
        --clip-norm 0.0 \
        --lr-scheduler polynomial_decay --lr $LR --total-num-update $TOTAL_UPDATES --warmup-updates $WARMUP_UPDATES \
        --update-freq "$UPDATE_FREQ" \
        --negative-sample-size 1 --ke-model TransE \
        --init-token 0 \
        --separator-token 2 \
        --gamma 4 --nrelation 822 \
        --skip-invalid-size-inputs-valid-test \
        --fp16 --fp16-init-scale 2 --threshold-loss-scale 1 --fp16-scale-window 128 \
        --reset-optimizer --distributed-world-size "${DIST_SIZE}" --ddp-backend no_c10d --distributed-port 23456 \
        --log-format simple --log-interval 1 > out_single.log \
        #--relation-desc  #Add this option to encode the relation descriptions as relation embeddings (KEPLER-Rel in the paper)

	add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
	add_(Tensor other, *, Number alpha) (Triggered internally at ../torch/csrc/utils/python_arg_parser.cpp:1630.)
  exp_avg.mul_(beta1).add_(1 - beta1, grad)
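
To get a feel for the hyperparameters in the training cell, the sketch below works out the effective batch size and the learning rate at a few steps. It assumes one GPU per machine, and it assumes that the polynomial_decay scheduler warms up linearly and then decays linearly (power 1) to zero; the function `lr_at` is a hypothetical re-implementation for illustration, not fairseq code.

```python
# Values copied from the training cell.
MAX_SENTENCES = 3
NUM_NODES = 1
UPDATE_FREQ = 784 // NUM_NODES
WORLD_SIZE = NUM_NODES  # DIST_SIZE in the script; assumes one GPU per machine

# Sequences that contribute to a single optimizer update.
sequences_per_update = MAX_SENTENCES * UPDATE_FREQ * WORLD_SIZE


def lr_at(step, peak_lr=6e-4, warmup=10_000, total=125_000):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup:
        return peak_lr * step / warmup
    if step >= total:
        return 0.0
    return peak_lr * (1 - (step - warmup) / (total - warmup))


print(sequences_per_update)
for step in (0, 5_000, 10_000, 67_500, 125_000):
    print(step, lr_at(step))
```

With these settings each update accumulates gradients over 3 * 784 = 2352 sequences, which is why a single update takes a long time on one GPU.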


Evaluate the model:

In [12]:
%%bash

export LD_LIBRARY_PATH=/home/sougata/.local/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH

python -m fairseq_cli.eval_lm ../data/output/single/MLM-bin \
    --path ../data/checkpoints/checkpoint_last.pt \
    --sample-break-mode complete --max-tokens 3072 \
    --context-window 2560 --softmax-batch 1024

Traceback (most recent call last):
  File "/home/sougata/anaconda3/envs/mykepler/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/sougata/anaconda3/envs/mykepler/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/sougata/projects/MyKEPLER/fairseq_cli/eval_lm.py", line 227, in <module>
    cli_main()
  File "/home/sougata/projects/MyKEPLER/fairseq_cli/eval_lm.py", line 223, in cli_main
    main(args)
  File "/home/sougata/projects/MyKEPLER/fairseq_cli/eval_lm.py", line 59, in main
    models, args = checkpoint_utils.load_model_ensemble(
  File "/home/sougata/projects/MyKEPLER/fairseq/checkpoint_utils.py", line 157, in load_model_ensemble
    ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
  File "/home/sougata/projects/MyKEPLER/fairseq/checkpoint_utils.py", line 176, in load_model_ensemble_and_task
    model.load_state_dict(state['model'], st

Namespace(no_progress_bar=False, log_interval=1000, log_format=None, tensorboard_logdir='', tbmf_wrapper=False, seed=1, cpu=False, fp16=False, memory_efficient_fp16=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, min_loss_scale=0.0001, threshold_loss_scale=None, user_dir=None, criterion='cross_entropy', tokenizer=None, bpe=None, optimizer='nag', lr_scheduler='fixed', task='language_modeling', num_workers=1, skip_invalid_size_inputs_valid_test=False, max_tokens=3072, max_sentences=None, required_batch_size_multiple=8, dataset_impl=None, gen_subset='test', num_shards=1, shard_id=0, path='../data/checkpoints/checkpoint_last.pt', remove_bpe=None, quiet=False, model_overrides='{}', results_path=None, output_word_probs=False, output_word_stats=False, context_window=2560, softmax_batch=1024, momentum=0.99, weight_decay=0.0, force_anneal=None, lr_shrink=0.1, warmup_updates=0, data='../data/output/single/MLM-bin', sample_break_mode='complete', tokens_per_sample=102

CalledProcessError: Command 'b'\nexport LD_LIBRARY_PATH=/home/sougata/.local/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH\n\npython -m fairseq_cli.eval_lm ../data/output/single/MLM-bin \\\n    --path ../data/checkpoints/checkpoint_last.pt \\\n    --sample-break-mode complete --max-tokens 3072 \\\n    --context-window 2560 --softmax-batch 1024\n'' returned non-zero exit status 1.