# Training demo for AlloyGPT

## Notes:
> - [ ] This demo shows how to train AlloyGPT on a formatted language dataset based on Thermo-Calc data.
> - [ ] Due to Thermo-Calc license restrictions, you will be asked for a "dataset_key" to access the training dataset.


#### Bo Ni, Feb 18, 2025

## 1. Check the environment

In [1]:

import os, sys
print('Here is : \n', os.popen('pwd').read())
print('What we get in hardware: \n', os.popen('nvidia-smi').read())
kernel_name = os.path.basename(sys.executable.replace("/bin/python",""))
print ("The VirEnv kernel in action: ", kernel_name)
#
# # make it work for both py and notebook
# try:
#     from jupyter_client import kernelspec
#     spec = kernelspec.get_kernel_spec(kernel_name)
#     print("Path to it: ", spec.resource_dir)
# except Exception:
#     print ("This is supposed to be a .py run")
# # /path/to/my/kernel
import torch
print("What we have in software: \n Torch version:", torch.__version__)
print('Python: ', sys.version) # no switch case code
#
print('What hardware the software see:')
device = torch.device(
    "cuda:0" if torch.cuda.is_available() else "cpu"
)
print(device)
num_of_gpus = torch.cuda.device_count()
print("# of GPU", num_of_gpus)
#
torch.cuda.empty_cache()

Here is : 
 /trace/group/tmousavi/bni2/1_JupyterGit/workspace/AlloyGPT_InternalTest_0

What we get in hardware: 
 Fri Feb 21 16:24:06 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA A40                     On  | 00000000:25:00.0 Off |                    0 |
|  0%   30C    P0              69W / 300W |   8214MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+-



What we have in software: 
 Torch version: 2.6.0+cu124
Python:  3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0]
What hardware the software see:
cuda:0
# of GPU 1
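
The cell above shells out with `os.popen`, which returns an empty string when a tool such as `nvidia-smi` is not installed. Below is a minimal sketch of a more defensive variant using only the standard library. The helper name `run_cmd` is hypothetical and not part of AlloyGPT.

```python
import shutil
import subprocess
import sys


def run_cmd(cmd):
    """Run a command given as a list of strings.

    Returns the stripped stdout, or None if the executable is not found.
    """
    # shutil.which returns None when the executable is not on PATH
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout.strip()


# nvidia-smi is absent on CPU-only machines, so handle None explicitly
gpu_report = run_cmd(["nvidia-smi"])
if gpu_report is None:
    print("nvidia-smi not found; this looks like a CPU-only machine")
else:
    print(gpu_report)
print("Python executable:", sys.executable)
```

Returning `None` rather than an empty string makes the "tool missing" case distinguishable from "tool ran but printed nothing".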


## 2. Configure the task

In [2]:
# for debug
import importlib

In [3]:
# load in source code packages

import AlloyGPT.UtilityPack as UtilityPack
importlib.reload(UtilityPack)
#
import AlloyGPT.DataPack as DataPack
importlib.reload(DataPack)
#
import AlloyGPT.ModelPack as ModelPack
importlib.reload(ModelPack)
#
import AlloyGPT.TrainPack as TrainPack
importlib.reload(TrainPack)
#
import AlloyGPT.TestPack as TestPack
importlib.reload(TestPack)

<module 'AlloyGPT.TestPack' from '/trace/group/tmousavi/bni2/1_JupyterGit/workspace/AlloyGPT_InternalTest_0/AlloyGPT/TestPack.py'>

In [4]:
# load in additional packages

import pickle
from matplotlib import pyplot as plt
import seaborn as sns
from tqdm import tqdm
import math
import pandas as pd
# ++
import numpy as np
import getpass

In [5]:
# 
# code_dir = '/content/AlloyGPT_InternalTest_0/' # for colab run
# 
code_dir = './' # for local run

In [6]:


dataset_key = getpass.getpass(prompt="Please enter your key to the dataset: \n\n")

Please enter your key to the dataset: 

 ········


In [7]:
# control key

# ===============================================
# Global control key setup: may move into a yaml file
# ===============================================
# control key to be shared everywhere
#
CKeys = {}
CKeys['Working_Mode'] = 1 # 1 # 0 # 1 # 2
# 0: prepare dataset: run only on single node
# 1: training,
# 2: testing

# only for restart of the training part
CKeys['IF_FirstRun'] = 1 # 1 # 2
# 1: train for 1st run; 2: other training loop
CKeys['Resume_from_where'] = 'LAST' # 'BEST'
# where to pick up the checkpoint: 'LAST' or 'BEST'

CKeys['Problem_ID'] = 6
# 1: Transformer-based causal LM: GPT2
# 2: Mamba-based causal LM
# 3: T-based causal LM + HF dataset on AlloyLan
# 4: selfmade S6
# 5: BPE tokenizer + hf-S6
# 6: BPE tokenizer + GPT2

# On Debug
# CKeys['Debug']=1
CKeys['Debug']=0

if CKeys['Debug'] == 1:
    CKeys['Debug_Data'] = 1
    CKeys['Debug_Model'] = 1
    CKeys['Debug_Train'] = 1
    CKeys['Debug_Test'] = 1
    # for On cluster run
    CKeys['if_slient_run']=0
else:
    CKeys['Debug_Data']=0
    CKeys['Debug_Model']=0
    CKeys['Debug_Train']=0
    CKeys['Debug_Test']=0
    # for On cluster run
    CKeys['if_slient_run']=1

In [8]:
print ("Check CKeys: \n")
for this_key in CKeys.keys():
    print (f"{this_key}: {CKeys[this_key]}")

Check CKeys: 

Working_Mode: 1
IF_FirstRun: 1
Resume_from_where: LAST
Problem_ID: 6
Debug: 0
Debug_Data: 0
Debug_Model: 0
Debug_Train: 0
Debug_Test: 0
if_slient_run: 1
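
The header comment in the control-key cell suggests moving these keys into a YAML file. Below is a minimal sketch of that idea. It uses JSON from the standard library in place of YAML to avoid an extra dependency. The file name `ckeys.json` and the helpers `save_keys` and `load_keys` are hypothetical.

```python
import json
import os
import tempfile


def save_keys(keys, path):
    """Write a flat dict of control keys to a JSON file."""
    with open(path, "w") as handle:
        json.dump(keys, handle, indent=2)


def load_keys(path):
    """Read the control keys back as a dict."""
    with open(path) as handle:
        return json.load(handle)


# A subset of the control keys printed above
ckeys = {
    "Working_Mode": 1,
    "IF_FirstRun": 1,
    "Resume_from_where": "LAST",
    "Problem_ID": 6,
    "Debug": 0,
}

# Round-trip through a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp_dir:
    cfg_path = os.path.join(tmp_dir, "ckeys.json")
    save_keys(ckeys, cfg_path)
    restored = load_keys(cfg_path)

print(restored == ckeys)
```

A text config of this kind can be diffed and versioned alongside the notebook, which is harder with keys edited in place inside a cell.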


In [9]:
# problem review:
print ("Problem Review: ")
print('Problem type: ', CKeys['Problem_ID'])
print('Debug mode: ', CKeys['Debug'])
print('Working mode: ', CKeys['Working_Mode'])

Problem Review: 
Problem type:  6
Debug mode:  0
Working mode:  1


In [10]:
# ===============================================
# Parameter Keys: may think of moving this into
# a yaml file
# 1. set for the first time
# 2. reload in sequential runs
# ===============================================
#
PKeys = {}
PKeys['prefix']='Causal_LM_TwoWays'

PKeys['wk_path']='./training_resu/'
#
# may add keys for lower level
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||
# on data + tokenizer
Data_PKeys = {}
Data_PKeys['data_dir']=PKeys['wk_path']+'/0_dataprocess/'
Data_PKeys['tokenizer_dir']=Data_PKeys['data_dir']+'/tokenizer'
# problem-specified ones
Data_PKeys['tokenizer_type']='BPE_customerized_0' # 'Byte_Tokenizer'
# Data_PKeys['tokenizer_file']='/trace/group/tmousavi/bni2/1_JupyterGit/workspace/Test_NanoGPT_1/Local_Store/cBPE_tokenizer_ver1.json'
Data_PKeys['tokenizer_file']=code_dir+'/assets/cBPE_tokenizer_ver1.json'
Data_PKeys['train_ratio']=0.900
Data_PKeys['valid_ratio']=0.001
Data_PKeys['fix_random']=12345
Data_PKeys['context_length']=1024 # length of text window
# a batch size of 12 is not working on a Google Colab T4; use 4 instead
Data_PKeys['batch_size'] = 4 # 12 # 8 # 4 # 8 # 16 # 24 # 32
# use hf-dataset:
Data_PKeys['hf_data_repo'] = 'Bo-Ni/Al_alloy_CALPHAD_test_4_full' # 'Bo-Ni/Al_alloy_CALPHAD_test_3_full'
#
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||
# on model
Model_PKeys = {}
Model_PKeys['model_dir']=PKeys['wk_path']+'/1_model/'
Model_PKeys['model_type'] = 6 # 5: hfS6; 6: GPT2
Model_PKeys['model_args'] = dict(
    vocab_size=256, # len of the dict of the tokenizer
    n_layer = 36, # 24 # 6 # num of MHA blocks
    block_size = 1024,
    n_embd = 1024,
    # inside one MHA layer
    n_head = 16,
    bias = False,
    dropout = 0.2,
)
#
# ||||||||||||||||||||||||||||||||||||||||||||||||||||||
# on training
Train_PKeys = {}
Train_PKeys['out_dir'] = Model_PKeys['model_dir'] + "/training_dir/"
# + customize ones +
Train_PKeys['batch_size'] = Data_PKeys['batch_size']
Train_PKeys['block_size'] = Data_PKeys['context_length']
Train_PKeys['vocab_size'] = Model_PKeys['model_args']['vocab_size']
Train_PKeys['num_train_epochs'] = 1 if CKeys['Debug']==1 else 5 # 2
Train_PKeys['gradient_accumulation_steps'] = 8 # 40 # 1 # this is GAS (gradient accumulation steps)
#
# AdamW optimizer, may put this into model class
Train_PKeys['weight_decay'] = 1e-1
Train_PKeys['beta1'] = 0.9
Train_PKeys['beta2'] = 0.95
# 0.99  # make a bit bigger because number of tokens per iter is small
Train_PKeys['grad_clip'] = 1.0 # clip gradients at this value, or disable if == 0.0
#
# Learning Plan:
# (linear warmup) + (cosine schedule) for lr_decay_iters + constant at min_lr
# Train_PKeys['max_iters'] = math.ceil(
#     len(train_dataloader)/Train_PKeys['gradient_accumulation_steps']
# ) * Train_PKeys['num_train_epochs'] # in GAS
# Train_PKeys['lr_decay_iters'] = Train_PKeys['max_iters']
Train_PKeys['warmup_iters'] = 1_000 # for large one # in GAS
Train_PKeys['learning_rate'] = 6e-4
Train_PKeys['decay_lr'] = True
# whether to decay the learning rate
# True: linear warmup + cosine decay + constant
# False: constant
#
Train_PKeys['min_lr'] = Train_PKeys['learning_rate']*1.e-1
#
# reporting and recording
Train_PKeys['report_1_trai_loss_this_GAS'] = 2 if CKeys['Debug']==1 else 20
Train_PKeys['report_2_vali_pred_this_GAS'] = 2 if CKeys['Debug']==1 else 200
# save a checkpoint at every second validation report (same rule for debug and normal runs)
Train_PKeys['report_3_save_mode_this_GAS'] = \
    Train_PKeys['report_2_vali_pred_this_GAS']*2
#
# OTHERS: parallel setup
# DDP settings
Train_PKeys['backend'] = 'nccl' # 'nccl', 'gloo', etc.
# system
Train_PKeys['device'] = 'cuda' # examples: 'cpu', 'cuda', 'cuda:0', 'cuda:1' etc., or try 'mps' on macbooks
Train_PKeys['dtype'] = 'bfloat16' if torch.cuda.is_available() and torch.cuda.is_bf16_supported() else 'float16' # 'float32', 'bfloat16', or 'float16', the latter will auto implement a GradScaler
Train_PKeys['compile'] = True # use PyTorch 2.0 to compile the model to be faster
#

# ||||||||||||||||||||||||||||||||||||||||||||||||||||||
# on testing
Test_PKeys = {}
Test_PKeys['out_dir'] = Model_PKeys['model_dir'] + "/test_dir/"
#
# TBA: useful keys for testing
# do an initialization
Test_PKeys['num_samples'] = 2 # 10
# number of samples to draw
Test_PKeys['max_new_tokens'] = 1024 # 500
# number of tokens generated in each sample
Test_PKeys['temperature'] = 0.8
# 1.0 = no change, < 1.0 = less random, > 1.0 = more random, in predictions
Test_PKeys['top_k'] = 200
# retain only the top_k most likely tokens, clamp others to have 0 probability
Test_PKeys['seed'] = 1337
Test_PKeys['device'] = Train_PKeys['device']
# examples: 'cpu', 'cuda', 'cuda:0', 'cuda:1', etc.
Test_PKeys['dtype'] = 'bfloat16' if torch.cuda.is_available() and torch.cuda.is_bf16_supported() else 'float16'
# 'float32' or 'bfloat16' or 'float16'
Test_PKeys['compile'] = Train_PKeys['compile']

# pack keys for later
PKeys['Data_PKeys'] = Data_PKeys
PKeys['Model_PKeys'] = Model_PKeys
PKeys['Train_PKeys'] = Train_PKeys
PKeys['Test_PKeys'] = Test_PKeys
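
The "Learning Plan" comment in the cell above describes a linear warmup, then a cosine decay until `lr_decay_iters`, then a constant `min_lr`. Below is a minimal sketch of such a schedule. The function `get_lr` is hypothetical; the actual schedule inside `TrainPack` is not shown in this notebook. The defaults for `learning_rate`, `min_lr`, and `warmup_iters` mirror `Train_PKeys`, while `lr_decay_iters=10_000` is a placeholder because the notebook derives that value later from the dataloader length.

```python
import math


def get_lr(it, learning_rate=6e-4, min_lr=6e-5,
           warmup_iters=1_000, lr_decay_iters=10_000):
    """Learning rate at update step `it` (counted in GAS units)."""
    # 1) linear warmup from 0 to learning_rate
    if it < warmup_iters:
        return learning_rate * it / warmup_iters
    # 3) after the decay window, stay constant at min_lr
    if it >= lr_decay_iters:
        return min_lr
    # 2) cosine decay from learning_rate down to min_lr
    decay_ratio = (it - warmup_iters) / (lr_decay_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))  # goes 1 -> 0
    return min_lr + coeff * (learning_rate - min_lr)


for step in (0, 500, 1_000, 5_500, 10_000, 20_000):
    print(step, get_lr(step))
```

With `decay_lr = False` the schedule would reduce to returning `learning_rate` for every step.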


In [11]:
# ===============================================
# setup folder structures and parameter files
# place_holder
# ===============================================
print ("===============================================")
print ("setup folder structures and parameter files")
print ("===============================================")
#
# 1. global dir
print ('Working path exists or not: ', os.path.exists(PKeys['wk_path']))
if not os.path.exists(PKeys['wk_path']):
    UtilityPack.create_path(PKeys['wk_path'])

# 2. some global keys for recalling
# store info for later
PKeys['pk_data_pack']=PKeys['wk_path']+'/data_pack.pickle'
PKeys['pk_model_pack']=PKeys['wk_path']+'/model_pack.pickle'
PKeys['pk_train_pack']=PKeys['wk_path']+'/train_pack.pickle'


setup folder structures and parameter files
Working path exists or not:  True


In [12]:
print ("Working path: ", PKeys['wk_path'])

Working path:  ./training_resu/


In [13]:
#
# clean EVERYTHING in the dir if 1st
#
# if CKeys['Working_Mode']==1 and CKeys['IF_FirstRun']==1:
print ("===============================================")
print ("Clean EVERYTHING in the dir if 1st...")
print ("===============================================")
#
if CKeys['Working_Mode']==1 and CKeys['IF_FirstRun']==1:
    #
    if os.path.exists(PKeys['wk_path']):
        cmd_line=f"rm -r {PKeys['wk_path']}"
        print("clean the slate...")
        print(f"execute {cmd_line}")
        os.popen(cmd_line).read()
        #
    # create dir for working space
    UtilityPack.create_path(PKeys['wk_path'])

Clean EVERYTHING in the dir if 1st...
clean the slate...
execute rm -r ./training_resu/
Creating the given path...
Done.
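
The cleanup cell above deletes the working directory through a shell `rm -r` command, which only works where that command exists. Below is a minimal sketch of a portable alternative based on `shutil.rmtree`. The helper `reset_dir` is hypothetical, and the demonstration runs on a throwaway temporary directory rather than the real working path.

```python
import os
import shutil
import tempfile


def reset_dir(path):
    """Delete `path` if it exists, then recreate it as an empty directory."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)


# Build a fake working directory with leftovers from an earlier run
base = tempfile.mkdtemp()
work = os.path.join(base, "training_resu")
os.makedirs(os.path.join(work, "old_run"))

reset_dir(work)
remaining = os.listdir(work)  # contents after the reset
print(remaining)

shutil.rmtree(base)  # remove the throwaway directory
```

Because the path is passed as an argument instead of being interpolated into a shell string, paths containing spaces need no special quoting.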


## 3. On dataset

In [14]:
# load in
# use HF infrastructure
from datasets import (
    load_dataset,
    load_from_disk,
    load_dataset_builder,
    get_dataset_split_names,
    DatasetDict,
)
from torch.utils.data.dataloader import DataLoader

In [15]:
import datasets
print(datasets.__version__)

3.3.2


In [16]:
# ===================================================
# setup the dataset key
# ===================================================
#
if CKeys['Working_Mode']==1 and CKeys['IF_FirstRun']==1:
    print ("===================================================")
    print ("setup the dataset key")
    print ("===================================================")

    # first initialize the key
    DataKeys={}
    # pass on the keys
    print ("Get the following keys...")
    for this_key in PKeys['Data_PKeys'].keys():
        print ("{}: \n{}".format(this_key, PKeys['Data_PKeys'][this_key]))
        DataKeys[this_key] = PKeys['Data_PKeys'][this_key]

    print ("Create folders if needed...")
    # 1. create subdir for the data part
    UtilityPack.create_path(DataKeys['data_dir'])
    # tokenizer folder
    UtilityPack.create_path(DataKeys['tokenizer_dir'])

setup the dataset key
Get the following keys...
data_dir: 
./training_resu//0_dataprocess/
tokenizer_dir: 
./training_resu//0_dataprocess//tokenizer
tokenizer_type: 
BPE_customerized_0
tokenizer_file: 
.//assets/cBPE_tokenizer_ver1.json
train_ratio: 
0.9
valid_ratio: 
0.001
fix_random: 
12345
context_length: 
1024
batch_size: 
4
hf_data_repo: 
Bo-Ni/Al_alloy_CALPHAD_test_4_full
Create folders if needed...
Creating the given path...
Done.
Creating the given path...
Done.
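
The printed paths above contain doubled slashes (for example `./training_resu//0_dataprocess//tokenizer`) because directory names are concatenated with `+`. POSIX systems tolerate this, but the paths are harder to read and compare. Below is a minimal sketch of the same layout built with `os.path.join`. The variable names are illustrative only.

```python
import os

wk_path = "./training_resu/"

# os.path.join inserts exactly one separator between components
data_dir = os.path.join(wk_path, "0_dataprocess")
tokenizer_dir = os.path.join(data_dir, "tokenizer")
print(data_dir)
print(tokenizer_dir)

# normpath collapses repeated separators in an already-built path
legacy = "./training_resu//0_dataprocess//tokenizer"
print(os.path.normpath(legacy))
```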


In [17]:
# ===================================================
# process the dataset
# ===================================================
#
if CKeys['Working_Mode']==1 and CKeys['IF_FirstRun']==1:
    print ("===================================================")
    print ("process the dataset")
    print ("===================================================")
    # process
    # 1. download the dict from HF
    dataset_dict = load_dataset(
        DataKeys['hf_data_repo'],
        token=dataset_key,
        revision="main",
        
    )
    # # 1.5 alternative way:
    # # from datasets import load_from_disk
    # dataset_dict = load_from_disk(data_dict_path)

    print ("raw dataset:\n")
    print (dataset_dict)
    #
    # 2. separate into three splits: train, valid, test
    dataset_dict_sepe = DataPack.data_dict_seperate_train_vali_test(
        dataset_dict,
        train_ratio=DataKeys['train_ratio'],
        vali_ratio=DataKeys['valid_ratio'],
        if_seed=(DataKeys['fix_random']>0),
        seed=DataKeys['fix_random'] if DataKeys['fix_random']>0 else None
    )
    print ("seperate raw dataset:\n")
    print (dataset_dict_sepe)
    # 3. do statistics on the dataset
    num_key_list = [
        # composition
        'AlMolePC','NiMolePC','ErMolePC','ZrMolePC','YMolePC','YbMolePC',
        # structure
        'L12MolePC','ScheilL12MeltMolePC','ScheilL12MolePC',
        'ScheilTerneryMeltMolePC','ScheilTernaryMolePC',
        'ScheilAl3NiMeltMolePC','ScheilAl3NiMolePC',
        'ScheilAl3ZrMeltMolePC','ScheilAl3ZrMolePC',
        # properties
        'BulkResistivity','Misfit','CoarseningRate',
        'ScheilFRCutoff','ScheilFRMatrix','ScheilCSC','ScheilHCS'
    ]
    # draw_distri_of_dataset_dict(
    #     dataset_dict_sepe,
    #     num_key_list=num_key_list,
    #     save_path=DataKeys['data_dir'],
    #     if_slient_run=CKeys['if_slient_run'],
    # )

    # 4. create the tokenizer
    tokenizer = DataPack.build_tokenizer(
        tokenizer_type=DataKeys['tokenizer_type'],
        tokenizer_file=DataKeys['tokenizer_file'],
        seq_len=DataKeys['context_length'],
    )

    # 5. convert raw data into text data for GPT
    # a. number record ==> sentences
    sentence_dataset_dict_sepe = dataset_dict_sepe.map(
        # self-defined function
        DataPack.assemble_multi_sentence,
        # other keys besides the element in the processing function
        fn_kwargs={
            "key_list": ['Pred001', 'Gene001'],
        },
        batched= False, # True, # Note, sentences don't work for batched
        # only keep those for training by removing the old ones
        remove_columns=dataset_dict_sepe["train"].column_names,
    )
    # b. sentences ==> tokenized sentences
    tokenized_dataset_dict_sepe = sentence_dataset_dict_sepe.map(
        # self-defined function
        DataPack.tokenize_multi_sentence_with_BPE,
        # other keys
        fn_kwargs={
            "key_list": ['Pred001', 'Gene001'],
            "tokenizer": tokenizer,
            "context_length": DataKeys['context_length'],
        },
        #
        batched = True,
        # only keep those for training by removing the old ones
        remove_columns=sentence_dataset_dict_sepe["train"].column_names,
    )
    # c. convert into dataloader
    tokenized_dataset_dict_sepe.set_format(type="torch")
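    # build into pytorch objects
    # note: training batches keep a fixed order (shuffle=False),
    # while validation batches are reshuffled on each pass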
    #
    train_dataloader = DataLoader(
        tokenized_dataset_dict_sepe["train"],
        batch_size=DataKeys['batch_size'], # 32,
        shuffle=False,
    )
    #
    eval_dataloader = DataLoader(
        tokenized_dataset_dict_sepe["valid"],
        batch_size=DataKeys['batch_size'], # 32,
        shuffle=True,
    )
    # use for test
    #
    test_dataloader = DataLoader(
        tokenized_dataset_dict_sepe["test"],
        batch_size=DataKeys['batch_size'], # 32,
        shuffle=False,
    )

    # store the key and results
else:
    # load back DataKeys
    pass


process the dataset


Using the latest cached version of the dataset since Bo-Ni/Al_alloy_CALPHAD_test_4_full couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'default' at /trace/home/bni2/.cache/huggingface/datasets/Bo-Ni___al_alloy_calphad_test_4_full/default/0.0.0/96b80fc0f9e668d7916a6bc2208fa0130858e801 (last modified on Tue Feb 18 15:19:15 2025).


raw dataset:

DatasetDict({
    train: Dataset({
        features: ['AlMolePC', 'NiMolePC', 'ErMolePC', 'ZrMolePC', 'YMolePC', 'YbMolePC', 'L12MolePC', 'TerneryMolePC', 'Al3NiMolePC', 'Al3ZrMolePC', 'ScheilL12MolePC', 'ScheilTernaryMolePC', 'ScheilAl3NiMolePC', 'ScheilAl3ZrMolePC', 'NZ_BulkResistivity', 'NZ_Misfit', 'NZ_CoarseningMetric', 'NZ_Freezing_Range_From_fccAl', 'CSC', 'NZ_HCS'],
        num_rows: 523599
    })
})
seperate raw dataset:

DatasetDict({
    train: Dataset({
        features: ['AlMolePC', 'NiMolePC', 'ErMolePC', 'ZrMolePC', 'YMolePC', 'YbMolePC', 'L12MolePC', 'TerneryMolePC', 'Al3NiMolePC', 'Al3ZrMolePC', 'ScheilL12MolePC', 'ScheilTernaryMolePC', 'ScheilAl3NiMolePC', 'ScheilAl3ZrMolePC', 'NZ_BulkResistivity', 'NZ_Misfit', 'NZ_CoarseningMetric', 'NZ_Freezing_Range_From_fccAl', 'CSC', 'NZ_HCS'],
        num_rows: 471239
    })
    valid: Dataset({
        features: ['AlMolePC', 'NiMolePC', 'ErMolePC', 'ZrMolePC', 'YMolePC', 'YbMolePC', 'L12MolePC', 'TerneryMolePC', '

In [18]:
#
# save or recall necessary parts
#
if CKeys['Working_Mode']==1 and CKeys['IF_FirstRun']==1:
    print ("==========================================")
    print ("save the dataset and tokenizer....")
    print ("==========================================")

    # 1. on DataKeys
    with open(PKeys['pk_data_pack'], 'wb') as handle:
        pickle.dump(DataKeys, handle, protocol=pickle.HIGHEST_PROTOCOL)
    #
    # 2. on tokenizer
    # # --
    # tokenizer.save_pretrained(
    #     DataKeys['tokenizer_dir']
    # )
    # ++
    DataPack.save_tokenizer(
        DataKeys,
        tokenizer,
    )
    #
    # 3. dataloaders
    # for training
    torch.save(train_dataloader, DataKeys['data_dir']+'/train_dataloader.pt')
    torch.save(eval_dataloader, DataKeys['data_dir']+'/eval_dataloader.pt')
    torch.save(test_dataloader, DataKeys['data_dir']+'/test_dataloader.pt')

    # 4. something else: sentence_data_dict
    # hf_object
    sentence_dataset_dict_sepe.save_to_disk(
        DataKeys['data_dir']+'/sentence_dataset_dict_hf'
    )

else:
    print("==========================================")
    print ('This is not a data-prepare run')
    print ('Load back in the data packages...')
    print("==========================================")



    with open(PKeys['pk_data_pack'], 'rb') as handle:
        # data_pack = pickle.load(handle)
        DataKeys = pickle.load(handle)


    # on tokenizer
    # # --
    # from transformers import AutoTokenizer
    # tokenizer = AutoTokenizer.from_pretrained(
    #     DataKeys['tokenizer_dir']
    # )
    # ++
    tokenizer  = DataPack.reload_tokenizer(
        DataKeys
    )

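    # on dataloaders
    # weights_only=False is needed to unpickle whole DataLoader objects,
    # because torch>=2.6 defaults torch.load to weights_only=True;
    # only use it on files you created yourself
    train_dataloader = torch.load(DataKeys['data_dir']+'/train_dataloader.pt', weights_only=False)
    eval_dataloader = torch.load(DataKeys['data_dir']+'/eval_dataloader.pt', weights_only=False)
    test_dataloader = torch.load(DataKeys['data_dir']+'/test_dataloader.pt', weights_only=False)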

    # something else:
    sentence_dataset_dict_sepe = load_from_disk(
        DataKeys['data_dir']+'/sentence_dataset_dict_hf'
    )

    print ('Done.')


save the dataset and tokenizer....


Saving the dataset (0/2 shards):   0%|          | 0/471239 [00:00<?, ? examples/s]

Saving the dataset (0/1 shards):   0%|          | 0/523 [00:00<?, ? examples/s]

Saving the dataset (0/1 shards):   0%|          | 0/51837 [00:00<?, ? examples/s]
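
The example counts in the save log above (471239, 523, and 51837) are consistent with truncating `n_rows * ratio` for the train and validation splits and giving the remainder to the test split. Below is a minimal sketch of that arithmetic under this assumed rounding rule. The actual logic inside `DataPack.data_dict_seperate_train_vali_test` is not shown in this notebook, and `split_sizes` is a hypothetical helper.

```python
def split_sizes(n_rows, train_ratio, valid_ratio):
    """Return (n_train, n_valid, n_test) for a three-way split."""
    n_train = int(n_rows * train_ratio)   # truncate toward zero
    n_valid = int(n_rows * valid_ratio)
    n_test = n_rows - n_train - n_valid   # everything left over
    return n_train, n_valid, n_test


# 523599 rows in the raw dataset, ratios taken from Data_PKeys
sizes = split_sizes(523599, train_ratio=0.900, valid_ratio=0.001)
print(sizes)
```

Note that the test fraction is implicit: with these ratios roughly 9.9% of the rows end up in the test split.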

## 4. On model

In [19]:
if CKeys['Working_Mode']==1 and CKeys['IF_FirstRun']==1:

    print ("======================================")
    print ("Initialize the model key")
    print ("======================================")
    ModelKeys = {}
    # pass on the keys
    print ("Get the following keys...")
    for this_key in PKeys['Model_PKeys'].keys():
        print ("{}: \n{}".format(this_key, PKeys['Model_PKeys'][this_key]))
        ModelKeys[this_key] = PKeys['Model_PKeys'][this_key]

    # 1. create the folder
    UtilityPack.create_path(ModelKeys['model_dir'])
    # 2. deliver one key for model building
    model_args = ModelKeys['model_args']


    print("==========================================")
    print("Save the model config ...")
    print("==========================================")
    # on ModelKeys
    with open(PKeys['pk_model_pack'], 'wb') as handle:
        pickle.dump(ModelKeys, handle, protocol=pickle.HIGHEST_PROTOCOL)

else:
    # this can be: 2nd training run or test run
    # load in the keys

    print("==========================================")
    print ('This is not the first run')
    print ('Load back in the model packages...')
    print("==========================================")

    with open(PKeys['pk_model_pack'], 'rb') as handle:
        # data_pack = pickle.load(handle)
        ModelKeys = pickle.load(handle)

    # unpack the model_args key
    model_args = ModelKeys['model_args']
    print ("model_args: \n", model_args)

    # From Trainer to know what parameters to load in
    with open(PKeys['pk_train_pack'], 'rb') as handle:
        # data_pack = pickle.load(handle)
        TrainKeys = pickle.load(handle)
    # # unpack the model_args key
    # model_args = ModelKeys['model_args']
    print ("TrainKeys: \n", TrainKeys)


Initialize the model key
Get the following keys...
model_dir: 
./training_resu//1_model/
model_type: 
6
model_args: 
{'vocab_size': 256, 'n_layer': 36, 'block_size': 1024, 'n_embd': 1024, 'n_head': 16, 'bias': False, 'dropout': 0.2}
Creating the given path...
Done.
Save the model config ...


In [20]:
print (
    f"model dir: {ModelKeys['model_dir']}"
)

model dir: ./training_resu//1_model/


In [21]:
# ====================================================
# needed for the second run
# ====================================================
# load in the model if this is not first train
#
if CKeys['Working_Mode']>0 and CKeys['IF_FirstRun']>1:
    #
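    if CKeys['Resume_from_where'] == 'LAST':
        ckpt_name = TrainKeys['out_dir_last']+'Last_ckpt.pt'
    elif CKeys['Resume_from_where'] == 'BEST':
        ckpt_name = TrainKeys['out_dir_best']+'Best_ckpt.pt'
    else:
        # fail early instead of hitting an undefined ckpt_name below
        raise ValueError(f"Unknown Resume_from_where: {CKeys['Resume_from_where']}")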
    #
    print ("CK_PT: ", ckpt_name)
    #
    # prepare to resume training from a checkpoint
    checkpoint = torch.load(ckpt_name, map_location=TrainKeys['device'])
    checkpoint_model_args = checkpoint['model_args']
    #
    # force these config attributes to be equal otherwise we can't even resume training
    # the rest of the attributes (e.g. dropout) can stay as desired from command line
    #
    # For new models, this is just a check
    # |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    # problem-specified
    for k in checkpoint_model_args.keys():
        if model_args[k] != checkpoint_model_args[k]:
            print ("hard update needed on ", k)
            print ("old: ", model_args[k])
            print ("new: ", checkpoint_model_args[k])
            model_args[k] = checkpoint_model_args[k]
    print ("Updated model_args: ", model_args)
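
The loop in the cell above overwrites `model_args` with the values stored in the checkpoint, and it indexes `model_args[k]` directly, so a key present only in the checkpoint would raise a `KeyError`. Below is a minimal sketch of a tolerant, non-mutating version. The helper `reconcile_model_args` and the example values are hypothetical.

```python
def reconcile_model_args(model_args, checkpoint_model_args):
    """Return (merged_args, changed_keys); checkpoint values take precedence."""
    merged = dict(model_args)  # copy, so the caller's dict stays untouched
    changed = []
    for key, ckpt_value in checkpoint_model_args.items():
        if key not in merged or merged[key] != ckpt_value:
            changed.append(key)
            merged[key] = ckpt_value
    return merged, changed


# Illustrative values: the requested depth differs from the saved model
requested = {"n_layer": 24, "n_head": 16, "n_embd": 1024}
from_ckpt = {"n_layer": 36, "n_head": 16, "n_embd": 1024, "bias": False}

merged, changed = reconcile_model_args(requested, from_ckpt)
print("hard update needed on:", changed)
print("updated model_args:", merged)
```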



In [22]:
#
if CKeys['Working_Mode']>0:
    print ("=========================================")
    print ("Initialize the model from the scratch...")
    print ("=========================================")
    #
    model_config = ModelPack.build_model_config(
        CKeys,
        model_args
    )
    #
    model = ModelPack.build_model(
        CKeys,
        model_config,
    )

    # # may add another IF to distinguish the model types
    # #
    # # process: model configuration
    # gptconf = GPTConfig(**model_args)
    # print ("Recieve model config: \n", gptconf)
    # # process: model
    # model = GPT(gptconf)

    print ("\n\n")
    print (model)

Initialize the model from the scratch...
number of parameters: 453.32M



GPT(
  (transformer): ModuleDict(
    (wte): Embedding(256, 1024)
    (wpe): Embedding(1024, 1024)
    (drop): Dropout(p=0.2, inplace=False)
    (h): ModuleList(
      (0-35): 36 x MHA_Block(
        (ln_1): LayerNorm()
        (attn): CausalSelfAttention(
          (c_attn): Linear(in_features=1024, out_features=3072, bias=False)
          (c_proj): Linear(in_features=1024, out_features=1024, bias=False)
          (attn_dropout): Dropout(p=0.2, inplace=False)
          (resid_dropout): Dropout(p=0.2, inplace=False)
        )
        (ln_2): LayerNorm()
        (mlp): MLP(
          (c_fc): Linear(in_features=1024, out_features=4096, bias=False)
          (gelu): GELU(approximate='none')
          (c_proj): Linear(in_features=4096, out_features=1024, bias=False)
          (dropout): Dropout(p=0.2, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm()
  )
  (lm_head): Linear(in_features=1024, out_features
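
The reported size of 453.32M parameters can be reproduced by hand from the layer shapes in the printout. The sketch below makes three assumptions that are not confirmed by this notebook: the `lm_head` weight is shared with `wte`, each `LayerNorm` holds only a weight vector (since `bias=False`), and the position embedding `wpe` is left out of the reported count. The function `count_gpt_params` is hypothetical.

```python
def count_gpt_params(n_layer, n_embd, vocab_size, block_size):
    """Return (count_without_wpe, count_with_wpe) for a bias-free GPT-2 style model."""
    attn = n_embd * 3 * n_embd + n_embd * n_embd     # c_attn + c_proj
    mlp = n_embd * 4 * n_embd + 4 * n_embd * n_embd  # c_fc + c_proj
    norms = 2 * n_embd                               # ln_1 + ln_2 weights
    blocks = n_layer * (attn + mlp + norms)
    wte = vocab_size * n_embd                        # assumed shared with lm_head
    ln_f = n_embd
    wpe = block_size * n_embd
    without_wpe = blocks + wte + ln_f
    return without_wpe, without_wpe + wpe


without_wpe, with_wpe = count_gpt_params(
    n_layer=36, n_embd=1024, vocab_size=256, block_size=1024
)
print(f"without position embedding: {without_wpe / 1e6:.2f}M")
print(f"with position embedding:    {with_wpe / 1e6:.2f}M")
```

Almost all of the parameters sit in the 36 blocks. With a 256-entry vocabulary, the token embedding contributes only about 0.26M.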

In [23]:
print ("===================================================")
print ("Load back the model if this's not the 1st training.")
print ("===================================================")
#
if CKeys['Working_Mode']>0 and CKeys['IF_FirstRun']>1:
    # Here, for GPT, we use checkpoint to load back the model
    print ("Loading the saved model...")
    #
    state_dict = checkpoint['model']
    # NOTE: the saved keys may carry an unwanted prefix.
    # A model wrapped by torch.compile keeps its parameters under '_orig_mod.',
    # so a checkpoint saved from the compiled module has that prefix on every key.
    unwanted_prefix = '_orig_mod.'
    for k,v in list(state_dict.items()):
        if k.startswith(unwanted_prefix):
            state_dict[k[len(unwanted_prefix):]] = state_dict.pop(k)
    #
    # load back the previous breaking point
    print ("Load in the model...")
    model.load_state_dict(state_dict)
    print ("Update some other records...")
    # clean
    state_dict = None
    print ("Done")
    # TBA

Load back the model if this's not the 1st training.
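
The key-renaming loop in the cell above edits the state dictionary in place. Below is a minimal sketch of an equivalent that returns a new dictionary and leaves the input untouched. Plain strings stand in for tensors so the example runs without PyTorch, and the helper `strip_prefix` is hypothetical.

```python
def strip_prefix(state_dict, prefix="_orig_mod."):
    """Return a copy of state_dict with `prefix` removed from matching keys."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Strings stand in for weight tensors in this illustration
demo = {
    "_orig_mod.transformer.wte.weight": "tensor-0",
    "lm_head.weight": "tensor-1",
}
cleaned = strip_prefix(demo)
print(sorted(cleaned))
```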


In [24]:

# consider loading back the trained model
print ('===========================================================')
print ('Consider loading back the trained model IF available...')
print ('===========================================================')
#
if CKeys['Working_Mode']>0 and CKeys['IF_FirstRun']>1:
    # Here, for GPT, we use checkpoint to load back the model
    print ("Loading the training history...")
    # ref-code:
    # checkpoint = {
    #                 'model': raw_model.state_dict(),
    #                 'optimizer': optimizer.state_dict(),
    #                 'model_args': model_args,
    #                 'completed_updating_steps': completed_updating_steps,
    #                 'step_num': this_step,
    #                 'iter_num_at_best_loss': GAS_at_best_val_loss,
    #                 'best_val_loss': best_val_loss
    #             }
    best_val_loss_0 = checkpoint['best_val_loss']
    GAS_at_best_val_loss_0 = checkpoint['iter_num_at_best_loss'] # -100
    # finished_steps_0 = checkpoint['iter_num']*TrainKeys['gradient_accumulation_steps']
    finished_steps_0 = checkpoint['step_num']
    completed_updating_steps_0 = checkpoint['completed_updating_steps']
    #
    print ("On the read checkpoint:")
    print (f"step_num: {finished_steps_0}")
    print (f"GAS_num: {completed_updating_steps_0}")
    print (f"best_val_loss: {best_val_loss_0}")
    print (f"GAS_at_best_val_loss: {GAS_at_best_val_loss_0}")
    # TBA
    # # cleaning
    # checkpoint=None

Consider loading back the trained model IF available...


## 5. On training

In [25]:
# load in
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.distributed import init_process_group, destroy_process_group

In [26]:
# Training mode
if CKeys['Working_Mode']==1 and CKeys['IF_FirstRun']==1:

    print ('===========================================================')
    print ('Initialize training at 1st run ...')
    print ('===========================================================')

    TrainKeys = {}
    # pass on the keys
    print ("Get the following keys...")
    for this_key in PKeys['Train_PKeys'].keys():
        print ("{}: \n{}".format(this_key, PKeys['Train_PKeys'][this_key]))
        TrainKeys[this_key] = PKeys['Train_PKeys'][this_key]

    # 1. create dir
    UtilityPack.create_path(TrainKeys['out_dir'])
    # two subdir just for model checkpoint
    TrainKeys['out_dir_last'] = TrainKeys['out_dir'] + "last_check/"
    TrainKeys['out_dir_best'] = TrainKeys['out_dir'] + "best_check/"
    UtilityPack.create_path(TrainKeys['out_dir_last'])
    UtilityPack.create_path(TrainKeys['out_dir_best'])
    #
    # Some that can only be defined here
    TrainKeys['max_iters'] = math.ceil(
        len(train_dataloader)/TrainKeys['gradient_accumulation_steps']
    ) * TrainKeys['num_train_epochs']  # in GAS
    TrainKeys['lr_decay_iters'] = TrainKeys['max_iters']

    # 2. process: problem-specific ones
    # ||||||||||||||||||||||||||||||||||||||||||||||||||||||
    # initialize the records
    best_val_loss_0 = 1.E09
    GAS_at_best_val_loss_0 = -100
    finished_steps_0 = 0
    completed_updating_steps_0 = 0

    # TrainKeys['best_val_loss'] = 1.e09
    # TrainKeys['GAS_at_best_val_loss'] = -100

    # =========================================================
    # below are not from PKeys
    # secondary ones
    # files: use fixed ones
    TrainKeys['1_train_loss.log']=TrainKeys['out_dir']+'1_train_loss.log'
    TrainKeys['2_vali_loss.log']=TrainKeys['out_dir']+'2_vali_loss.log'
    TrainKeys['2_vali_gene.log']=TrainKeys['out_dir']+'2_vali_gene.log'
    # # --
    # TrainKeys['3_save_model.log']=TrainKeys['out_dir']+'3_save_model.log'
    # ++
    TrainKeys['3_save_model_last.log']=TrainKeys['out_dir_last']+'3_save_model_last.log'
    TrainKeys['3_save_model_best.log']=TrainKeys['out_dir_best']+'3_save_model_best.log'

    # 3. save the TrainKeys
    print("==========================================")
    print("Save the train config ...")
    print("==========================================")

    # on TrainKeys
    with open(PKeys['pk_train_pack'], 'wb') as handle:
        pickle.dump(TrainKeys, handle, protocol=pickle.HIGHEST_PROTOCOL)

elif CKeys['Working_Mode']==1 and CKeys['IF_FirstRun']>1:

    print("==========================================")
    print ('This is not the first run')
    print ('Load back in the train packages...')
    print ('Done in the previous block with ModelKeys')
    print("==========================================")

else:
    # this is a test mode: do whatever is needed
    pass

Initialize training at 1st run ...
Get the following keys...
out_dir: 
./training_resu//1_model//training_dir/
batch_size: 
4
block_size: 
1024
vocab_size: 
256
num_train_epochs: 
5
gradient_accumulation_steps: 
8
weight_decay: 
0.1
beta1: 
0.9
beta2: 
0.95
grad_clip: 
1.0
warmup_iters: 
1000
learning_rate: 
0.0006
decay_lr: 
True
min_lr: 
5.9999999999999995e-05
report_1_trai_loss_this_GAS: 
20
report_2_vali_pred_this_GAS: 
200
report_3_save_mode_this_GAS: 
400
backend: 
nccl
device: 
cuda
dtype: 
bfloat16
compile: 
True
Creating the given path...
Done.
Creating the given path...
Done.
Creating the given path...
Done.
Save the train config ...
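
The `max_iters` value of 147265 that appears in the config printout counts optimizer updates rather than micro-batches: one update happens per `gradient_accumulation_steps` batches. A minimal sketch of that arithmetic follows. Here `n_batches` stands in for `len(train_dataloader)`, and 235,620 is a hypothetical placeholder chosen only because it reproduces the logged value (any count from 235,617 to 235,624 gives the same result).

```python
import math

def compute_max_iters(n_batches: int, grad_accum_steps: int, num_epochs: int) -> int:
    """Total optimizer updates: ceil(batches / accumulation steps) per epoch, times epochs."""
    updates_per_epoch = math.ceil(n_batches / grad_accum_steps)
    return updates_per_epoch * num_epochs

# Hypothetical batch count; the real value is len(train_dataloader).
n_batches = 235_620
max_iters = compute_max_iters(n_batches, grad_accum_steps=8, num_epochs=5)
print(max_iters)
```

Since `lr_decay_iters` is set equal to `max_iters`, the learning-rate decay spans the whole run.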


In [27]:
if CKeys['Working_Mode']==1:
    print ("Check the status before the preparation...")
    print ("allow_tf32 in cuda: \n",
           torch.backends.cuda.matmul.allow_tf32)
    print ("allow_tf32 in cudnn: \n",
           torch.backends.cudnn.allow_tf32)

Check the status before the preparation...
allow_tf32 in cuda: 
 False
allow_tf32 in cudnn: 
 True


In [28]:

if CKeys['Working_Mode']==1 and CKeys['Debug_Train']==1:
    print ({**TrainKeys, **ModelKeys})
    print (TrainKeys['dtype'])
    print (TrainKeys['compile'])
    # print (TrainKeys['ddp']) # this one comes only after processing
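
The merged dictionary `{**TrainKeys, **ModelKeys}` follows standard Python unpacking rules: when both dictionaries hold the same key, the right-hand one wins. A small sketch with made-up stand-in dictionaries (the contents of the real `ModelKeys` are not shown in this section, so the overlap here is an assumption):

```python
# Stand-in dictionaries; the values are illustrative, not the notebook's.
train_keys = {"block_size": 1024, "learning_rate": 6e-4}
model_keys = {"block_size": 512, "n_layer": 12}

# Later unpacking overrides earlier keys, so model_keys wins on 'block_size'.
config = {**train_keys, **model_keys}

# Keys present in both dictionaries are worth checking before merging.
shared = sorted(set(train_keys) & set(model_keys))
conflicts = [k for k in shared if train_keys[k] != model_keys[k]]
print(config)
print("conflicting keys:", conflicts)
```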

In [29]:
if CKeys['Working_Mode']==1:
    # prepare for training
    # ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    # This block is problem-specific
    #
    # before ddp, make a record of model+trainer:
    config = {**TrainKeys, **ModelKeys}
    print ("old config: \n", config)
    #
    # XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    # ATTENTION: after this step, ranks matter
    # XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    #
    TrainKeys, ctx = TrainPack.initialize_train_fun(TrainKeys)
    # this call updates TrainKeys depending on whether this is the master_process,
    # and creates the output folder if it is
    #
    #
    print ("Is this a DDP run: ", TrainKeys['ddp'])
    #
    # crop down the model block size if desired, using model surgery
    if TrainKeys['block_size'] < model.config.block_size:
        print ("crop down the model block size")
        model.crop_block_size(TrainKeys['block_size'])
        model_args['block_size'] = TrainKeys['block_size'] # so that the checkpoint will have the right value
    #
    # move the model to the device
    model.to(TrainKeys['device'])
    #
    # initialize a GradScaler. If enabled=False scaler is a no-op
    scaler = torch.cuda.amp.GradScaler(enabled=(TrainKeys['dtype'] == 'float16'))
    #
    # optimizer: initialize
    optimizer = model.configure_optimizers(
        TrainKeys['weight_decay'],
        TrainKeys['learning_rate'],
        (TrainKeys['beta1'], TrainKeys['beta2']),
        TrainKeys['device_type']
    )
    # load it back
    if CKeys['IF_FirstRun'] !=1:
        print ("load in optimizer from a previous run")
        optimizer.load_state_dict(checkpoint['optimizer'])
    #
    # release the checkpoint reference to free memory after a resumed run;
    # on the 1st run this only initializes the name
    checkpoint = None
    #
    # compile the model
    if TrainKeys['compile']:
        print("compiling the model... (takes a ~minute)")
        # https://stackoverflow.com/questions/62691279/how-to-disable-tokenizers-parallelism-true-false-warning
        # import os
        os.environ["TOKENIZERS_PARALLELISM"] = "true" # "false"

        unoptimized_model = model
        model = torch.compile(model) # requires PyTorch 2.0
    # wrap model into DDP container
    if TrainKeys['ddp']:
        print ("adjust model through DDP...")
        model = DDP(model, device_ids=[TrainKeys['ddp_local_rank']])
    #
    # logging
    config = {**TrainKeys, **ModelKeys}
    print ("new config: \n", config)
    # if TrainKeys['wandb_log'] and TrainKeys['master_process']:
    #     import wandb
    #     wandb.init(
    #         project=TrainKeys['wandb_project'],
    #         name=TrainKeys['wandb_run_name'],
    #         config=config,
    #     )

old config: 
 {'out_dir': './training_resu//1_model//training_dir/', 'batch_size': 4, 'block_size': 1024, 'vocab_size': 256, 'num_train_epochs': 5, 'gradient_accumulation_steps': 8, 'weight_decay': 0.1, 'beta1': 0.9, 'beta2': 0.95, 'grad_clip': 1.0, 'warmup_iters': 1000, 'learning_rate': 0.0006, 'decay_lr': True, 'min_lr': 5.9999999999999995e-05, 'report_1_trai_loss_this_GAS': 20, 'report_2_vali_pred_this_GAS': 200, 'report_3_save_mode_this_GAS': 400, 'backend': 'nccl', 'device': 'cuda', 'dtype': 'bfloat16', 'compile': True, 'out_dir_last': './training_resu//1_model//training_dir/last_check/', 'out_dir_best': './training_resu//1_model//training_dir/best_check/', 'max_iters': 147265, 'lr_decay_iters': 147265, '1_train_loss.log': './training_resu//1_model//training_dir/1_train_loss.log', '2_vali_loss.log': './training_resu//1_model//training_dir/2_vali_loss.log', '2_vali_gene.log': './training_resu//1_model//training_dir/2_vali_gene.log', '3_save_model_last.log': './training_resu//1_mode

  scaler = torch.cuda.amp.GradScaler(enabled=(TrainKeys['dtype'] == 'float16'))


num decayed parameter tensors: 146, with 454,295,552 parameters
num non-decayed parameter tensors: 73, with 74,752 parameters
using fused AdamW: True
compiling the model... (takes a ~minute)




new config: 
 {'out_dir': './training_resu//1_model//training_dir/', 'batch_size': 4, 'block_size': 1024, 'vocab_size': 256, 'num_train_epochs': 5, 'gradient_accumulation_steps': 8, 'weight_decay': 0.1, 'beta1': 0.9, 'beta2': 0.95, 'grad_clip': 1.0, 'warmup_iters': 1000, 'learning_rate': 0.0006, 'decay_lr': True, 'min_lr': 5.9999999999999995e-05, 'report_1_trai_loss_this_GAS': 20, 'report_2_vali_pred_this_GAS': 200, 'report_3_save_mode_this_GAS': 400, 'backend': 'nccl', 'device': 'cuda', 'dtype': 'bfloat16', 'compile': True, 'out_dir_last': './training_resu//1_model//training_dir/last_check/', 'out_dir_best': './training_resu//1_model//training_dir/best_check/', 'max_iters': 147265, 'lr_decay_iters': 147265, '1_train_loss.log': './training_resu//1_model//training_dir/1_train_loss.log', '2_vali_loss.log': './training_resu//1_model//training_dir/2_vali_loss.log', '2_vali_gene.log': './training_resu//1_model//training_dir/2_vali_gene.log', '3_save_model_last.log': './training_resu//1_mode
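
`TrainPack.initialize_train_fun` is not shown here, but the keys it adds (`ddp`, `ddp_local_rank`, `master_process`, `device_type`) match a common pattern in which a launcher such as `torchrun` sets `RANK`, `LOCAL_RANK`, and `WORLD_SIZE`. The sketch below is an assumption about that pattern, not the package's code; `detect_ddp` is a hypothetical helper.

```python
import os

def detect_ddp(env=None, default_device="cuda"):
    """Derive rank-related settings from launcher environment variables."""
    env = os.environ if env is None else env
    rank = int(env.get("RANK", -1))
    ddp = rank != -1  # torchrun exports RANK; a plain run does not
    if ddp:
        local_rank = int(env["LOCAL_RANK"])
        device = f"cuda:{local_rank}"
    else:
        local_rank = 0
        device = default_device
    return {
        "ddp": ddp,
        "ddp_local_rank": local_rank,
        "master_process": rank in (-1, 0),  # only rank 0 (or a single process) writes logs
        "device": device,
        "device_type": "cuda" if "cuda" in device else "cpu",
    }

single = detect_ddp(env={})  # single-process run, as in this notebook
multi = detect_ddp(env={"RANK": "1", "LOCAL_RANK": "1", "WORLD_SIZE": "2"})
print(single)
print(multi)
```

In a single-process notebook session `ddp` is false and the process is its own master, which agrees with the `master_process: True` reported in this run.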

In [30]:
if CKeys['Working_Mode']==1:
    print ("Check the status After the preparation...")
    print ("allow_tf32 in cuda: \n",
           torch.backends.cuda.matmul.allow_tf32)
    print ("allow_tf32 in cudnn: \n",
           torch.backends.cudnn.allow_tf32)

Check the status After the preparation...
allow_tf32 in cuda: 
 True
allow_tf32 in cudnn: 
 True


In [31]:

if CKeys['Working_Mode']==1:
    print ("IF this is master_process: ", TrainKeys['master_process'])

IF this is master_process:  True


In [32]:
#
# prepare to record the training history
#
if CKeys['Working_Mode']==1 and CKeys['IF_FirstRun'] == 1:
    print ("==============================================")
    print ("prepare for the 1st training run ...")
    print ("==============================================")
    #
    # ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    # This is problem-specific
    #
    # for 1st run, initialize the logs
    if TrainKeys['master_process']:
        # 1. train loss
        top_line = "epoch,step,GAS,wei_loss_trai,plain_loss_trai,lr\n"
        UtilityPack.add_one_line_to_file(
            file_name=TrainKeys['1_train_loss.log'],
            this_line=top_line,
            mode='w', # if it exists, erase it
        )
        # 2. eval loss
        # f"epoch: %d, step: %d, GAS: %d, wei_loss/trai: %f, plain_loss/trai: %f, loss/eval: %f, lr: %f\n"
        top_line = "epoch,step,GAS,wei_loss_trai,plain_loss_trai,loss_eval,lr\n"
        UtilityPack.add_one_line_to_file(
            file_name=TrainKeys['2_vali_loss.log'],
            this_line=top_line,
            mode='w', # if it exists, erase it
        )
        #
        # 3. predict lines
        UtilityPack.add_one_line_to_file(
            file_name=TrainKeys['2_vali_gene.log'],
            this_line='\n',
            mode='w', # if it exists, erase it
        )



prepare for the 1st training run ...
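
Because each loss log starts with a comma-separated header, it can be read back as a table once training lines are appended. The sketch assumes that data rows follow the header order, which this section does not confirm. `add_one_line_to_file` below is a hypothetical stand-in for the `UtilityPack` helper, and the sample row reuses the numbers from the first training report in this notebook.

```python
import csv
import os
import tempfile

def add_one_line_to_file(file_name, this_line, mode="a"):
    """Hypothetical stand-in: write one line, truncating the file when mode='w'."""
    with open(file_name, mode) as f:
        f.write(this_line)

log_path = os.path.join(tempfile.mkdtemp(), "1_train_loss.log")
add_one_line_to_file(log_path, "epoch,step,GAS,wei_loss_trai,plain_loss_trai,lr\n", mode="w")
# Assumed row format: values in header order.
add_one_line_to_file(log_path, "0,160,20,1.268766,1.152305,0.000012\n", mode="a")

with open(log_path, newline="") as f:
    rows = list(csv.DictReader(f))
print(rows[0]["GAS"], rows[0]["wei_loss_trai"])
```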


In [33]:

if CKeys['Working_Mode']==1:
    print (device)
    # print (TrainKeys['ddp'])
    # print (ddp)
    print (TrainKeys['gradient_accumulation_steps'])
    print (TrainKeys['device_type'])
    print (TrainKeys['device'])
    print (TrainKeys['warmup_iters'])
    print (TrainKeys['learning_rate'])
    print (TrainKeys['decay_lr'])
    print (TrainKeys['master_process'])

cuda:0
8
cuda
cuda
1000
0.0006
True
True
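
The `lr` values in the training reports further down are consistent with a linear warmup: at GAS 20 the rate is 0.0006 × 20 / 1000 = 0.000012. The schedule itself lives inside `TrainPack`, so the sketch below is an assumption: linear warmup followed by cosine decay to `min_lr`, a scheme commonly paired with these hyperparameter names. Only the warmup part can be checked against the logs shown here.

```python
import math

def get_lr(it, learning_rate=6e-4, min_lr=6e-5, warmup_iters=1000, lr_decay_iters=147265):
    """Assumed schedule: linear warmup, then cosine decay down to min_lr."""
    if it < warmup_iters:
        return learning_rate * it / warmup_iters
    if it > lr_decay_iters:
        return min_lr
    decay_ratio = (it - warmup_iters) / (lr_decay_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))  # goes from 1 to 0
    return min_lr + coeff * (learning_rate - min_lr)

for gas in (20, 200, 1000, 147265):
    print(gas, f"{get_lr(gas):.6f}")
```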


In [34]:


if CKeys['Working_Mode']==1:
    # input
    # test prompts for training
    test_prompts_during_train = [
        "{Task:Pred001}={Composition:[Al:+9.440e+01,Ni:+3.552e+00,Er:+4.962e-01,Zr:+3.085e-01,Y:+7.322e-01,Yb:+5.108e-01]}=>{Structure:",
        "{Task:Gene001}={Property:[BulkResistivity:+4.514e-01,Misfit:+9.770e-01,CoarseningRate:+2.164e+00,ScheilFRCutoff:+1.412e+02,ScheilFRMatrix:+1.699e+01,ScheilCSC:+2.172e-01,ScheilHCS:+3.067e-01]}=>{Structure",
    ]
    # updated prompts for the remade dataset (these override the list above)
    test_prompts_during_train = [
        "{Task:Pred001}={Composition:[(Al):+9.388e+01,(Ni):+3.656e+00,(Er):+1.439e+00,(Zr):+3.195e-01,(Y):+4.847e-01,(Yb):+2.209e-01]}=>{Structure:",
        "{Task:Gene001}={Property:[DiffusionResistivity:+1.696e-01,Misfit:+1.257e+00,CoarseningMetric:+4.000e+00,FreezingRange:+1.318e-01,CrackSusceptibilityCoefficient:+0.000e+00,HotCrackingSusceptibility:+0.000e+00]}=>{Structure:"
    ]
    # key tokens
    target_keywords = [
        "0","1","2","3","4","5","6","7","8","9",
        ".","+","-","e",
    ]
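
Each prompt is a chain of `{Name:...}` fields joined by `=` and `=>`, with numeric entries written as `key:+d.ddde+dd`. A minimal parsing sketch, assuming that layout holds for every record; `parse_fields` is a hypothetical helper and not part of `TrainPack`:

```python
import re

FIELD = re.compile(r"\{(\w+):(\[[^\]]*\]|[^\[\]{}]*)\}")
ENTRY = re.compile(r"\(?([\w%]+)\)?:([+-]\d\.\d+e[+-]\d+)")

def parse_fields(text):
    """Return {field name: value}, where bracketed lists become {key: float} dicts."""
    out = {}
    for name, body in FIELD.findall(text):
        if body.startswith("["):
            out[name] = {k: float(v) for k, v in ENTRY.findall(body)}
        else:
            out[name] = body
    return out

prompt = ("{Task:Pred001}={Composition:[(Al):+9.388e+01,(Ni):+3.656e+00,(Er):+1.439e+00,"
          "(Zr):+3.195e-01,(Y):+4.847e-01,(Yb):+2.209e-01]}=>{Structure:")
parsed = parse_fields(prompt)
print(parsed["Task"], parsed["Composition"])
print("total:", round(sum(parsed["Composition"].values()), 3))
```

The unclosed `{Structure:` field at the end of the prompt is skipped, since it is the part the model is expected to complete. The same helper can be applied to a generated completion to check that every number is well formed.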


In [35]:
# prepare training
#
if CKeys['Working_Mode']==1:
    #
    # 1. key tokens
    keytoken_ids = TrainPack.translate_words_w_tokenizers(
        tokenizer,
        target_keywords,
    )
    print (keytoken_ids)
    #
    # 2. translate that into weight
    wei_list_for_all_vocab = TrainPack.build_weight_list_for_vocab(
        model.config.vocab_size,
        keytoken_ids,
        wei_0=1.,
        wei_1=2.,
    ).to(device)
    print (wei_list_for_all_vocab.shape)

[129, 120, 121, 122, 123, 124, 125, 126, 127, 128, 130, 131, 132, 158]
torch.Size([256])
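
The weight vector printed above has one entry per vocabulary id. Judging from the arguments, `wei_0=1.` is the default weight and `wei_1=2.` is applied to the ids of digits, signs, `.` and `e`, which would explain why a weighted training loss is reported alongside the plain one. The sketch below illustrates that idea in plain Python; both functions are hypothetical re-implementations, and the way `TrainPack` normalizes the weighted loss is an assumption.

```python
def build_weight_list(vocab_size, key_ids, wei_0=1.0, wei_1=2.0):
    """Weight wei_1 for key token ids, wei_0 for every other id."""
    weights = [wei_0] * vocab_size
    for i in key_ids:
        weights[i] = wei_1
    return weights

def weighted_nll(token_nll, target_ids, weights):
    """Mean of per-token negative log-likelihoods, each scaled by its target's weight."""
    scaled = [weights[t] * loss for t, loss in zip(target_ids, token_nll)]
    return sum(scaled) / len(scaled)

key_ids = [129, 120, 121, 122, 123, 124, 125, 126, 127, 128, 130, 131, 132, 158]
weights = build_weight_list(256, key_ids)

# Toy example: three target tokens, the middle one is a key token (id 120).
targets = [65, 120, 66]
nll = [0.2, 0.5, 0.1]
plain = sum(nll) / len(nll)
weighted = weighted_nll(nll, targets, weights)
print(len(weights), sum(weights))
print(plain, weighted)
```

With all weights at least 1, the weighted loss is never below the plain loss, matching the ordering of `wei_loss/trai` and `plain_loss/trai` in the training reports. In the notebook the weights are held in a tensor on the training device rather than a list.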


In [None]:
if CKeys['Working_Mode'] == 1:
    # if torch.compile (TorchDynamo) fails on some graphs, fall back to eager mode instead of raising
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
    
    print ("Start the training loop...")
    eval_size = 20

    TrainPack.training_loop_AR_LM(
        #
        model,
        tokenizer,
        ctx,
        optimizer,
        TrainKeys,
        train_dataloader,
        eval_dataloader,
        eval_size,
        wei_list_for_all_vocab,
        scaler,
        test_prompts_during_train,
        #
        best_val_loss_0,
        GAS_at_best_val_loss_0,
        finished_steps_0,
        completed_updating_steps_0,
        model_args,
    )

Start the training loop...
####################

epoch: 0, step: 160, GAS: 20, wei_loss/trai: 1.268766, plain_loss/trai: 1.152305, lr: 0.000012

####################

epoch: 0, step: 320, GAS: 40, wei_loss/trai: 0.671931, plain_loss/trai: 0.545766, lr: 0.000024

####################

epoch: 0, step: 480, GAS: 60, wei_loss/trai: 0.301311, plain_loss/trai: 0.185495, lr: 0.000036

####################

epoch: 0, step: 640, GAS: 80, wei_loss/trai: 0.315458, plain_loss/trai: 0.194371, lr: 0.000048

####################

epoch: 0, step: 800, GAS: 100, wei_loss/trai: 0.378221, plain_loss/trai: 0.248343, lr: 0.000060

####################

epoch: 0, step: 960, GAS: 120, wei_loss/trai: 0.295191, plain_loss/trai: 0.180656, lr: 0.000072

####################

epoch: 0, step: 1120, GAS: 140, wei_loss/trai: 0.295083, plain_loss/trai: 0.181643, lr: 0.000084

####################

epoch: 0, step: 1280, GAS: 160, wei_loss/trai: 0.390396, plain_loss/trai: 0.258548, lr: 0.000096

####################

e

  8%|▊         | 20/262 [00:17<03:37,  1.12it/s] 


epoch: 0, step: 1600, GAS: 200, wei_loss/trai: 0.306828, plain_loss/trai: 0.189326, loss/eval: 0.236367, lr: 0.000120

0: 
{Task:Pred001}={Composition:[(Al):+9.388e+01,(Ni):+3.656e+00,(Er):+1.439e+00,(Zr):+3.195e-01,(Y):+4.847e-01,(Yb):+2.209e-01]}=>{Structure:[AsBuilt_L12Mol%:+8.258e+01,AsBuilt_TernaryMol%:+6.593e+01,AsBuilt_Al3NiMol%:+0.371e+01,AsBuilt_Al3ZrMol%:+5.306e+00,L12Mol%:+1.782e+01,TernaryMol%:+0.398e+00,Al3NiMol%:+3.658e+00,Al3ZrMol%:+8.900e+00]}=>{Property:[DiffusionResistivity:+1.083e-01,Misfit:+1.260e+00,CoarseningMetric:+1.132e+00,FreezingRange:+1.307e-00,CrackSusceptibilityCoefficient:+2.615e-01,HotCrackingSusceptibility:+1.496e-01]}

1: 
{Task:Gene001}={Property:[DiffusionResistivity:+1.696e-01,Misfit:+1.257e+00,CoarseningMetric:+4.000e+00,FreezingRange:+1.318e-01,CrackSusceptibilityCoefficient:+0.000e+00,HotCrackingSusceptibility:+0.000e+00]}=>{Structure:[AsBuilt_L1e+00,L12Mol%:+01,AsBuilt_TernaryMol%:+4.76e+00,Al3NiMol%:+1.000e+01,Al3ZrMol%:+0.621e+00]}=>{Property: