<h1>CS689A: Computational Linguistics for Indian Languages</h1>
<h2>Assignment-2</h2>
<h2>Avnish Tripathi 22111014</h2>
<h3>Question-2-a</h3>

**Dependencies:** `numpy, conllu, datasets, transformers, seqeval`

seqeval is a Python framework for sequence labeling evaluation. seqeval can evaluate the performance of chunking tasks such as named-entity recognition, part-of-speech tagging, semantic role labeling and so on.

### Creating train-validation-test splits
> Replace the input file name below with the treebank file you want to split

**Note: Skip this section if the data is already available as train/validation/test splits**

In [1]:
from conllu import parse
# Read the whole treebank file; the context manager closes the handle afterwards
with open("hi_hdtb-ud-dev.conllu", "r", encoding="utf-8") as f:
    data_file = f.read()
sentences = parse(data_file)
len(sentences)

1659

In [3]:
sentences[0]

TokenList<रामायण, काल, में, भगवान, राम, के, पुत्र, कुश, की, राजधानी, कुशावती, को, 483, ईसा, पूर्व, बुद्ध, ने, अपने, अंतिम, विश्राम, के, लिए, चुना, ।, metadata={sent_id: "dev-s1", text: "रामायण काल में भगवान राम के पुत्र कुश की राजधानी कुशावती को 483 ईसा पूर्व बुद्ध ने अपने अंतिम विश्राम के लिए चुना ।"}>

In [4]:
sentences[0][0]

{'id': 1,
 'form': 'रामायण',
 'lemma': 'रामायण',
 'upos': 'PROPN',
 'xpos': 'NNPC',
 'feats': {'Case': 'Nom', 'Gender': 'Masc', 'Number': 'Sing', 'Person': '3'},
 'head': 2,
 'deprel': 'compound',
 'deps': None,
 'misc': {'Vib': '0',
  'Tam': '0',
  'ChunkId': 'NP',
  'ChunkType': 'child',
  'Translit': 'rāmāyaṇa'}}

In [5]:
sentences = [sentence.serialize() for sentence in sentences]

In [7]:
#sentences

In [8]:
from sklearn.model_selection import train_test_split
## Split sizes may be adjusted here: 20% test, then 10% of the remainder for validation
train, test = train_test_split(sentences, test_size=0.20)
train, val = train_test_split(train, test_size=0.10)

In [9]:
len(train),len(val),len(test)

(1194, 133, 332)
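
The split sizes above can be checked by hand. The sketch below reproduces the arithmetic, assuming (as scikit-learn does for a fractional `test_size`) that the held-out part is rounded up and the remainder goes to training. The helper name `split_sizes` is hypothetical and only used for illustration.

```python
import math

def split_sizes(n_samples, test_size=0.20, val_size=0.10):
    """Return (train, val, test) sizes for two chained fractional splits."""
    # First split: hold out the test set (fraction rounded up)
    n_test = math.ceil(test_size * n_samples)
    n_remaining = n_samples - n_test
    # Second split: hold out validation from what is left
    n_val = math.ceil(val_size * n_remaining)
    n_train = n_remaining - n_val
    return n_train, n_val, n_test

print(split_sizes(1659))  # (1194, 133, 332)
```

Because no `random_state` is passed to `train_test_split` above, the sizes are reproducible but the sentences that land in each split change from run to run; passing a fixed `random_state` would make the split itself repeatable.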

> Storing the outputs in `data/` folder

In [10]:
import os

## Create the data/ directory if it does not exist
os.makedirs('data', exist_ok=True)

## Write each split to data/<split>.conllu
for split, name in zip([train, val, test], ['train', 'validation', 'test']):
    with open(f'data/{name}.conllu', 'w', encoding='utf-8') as fp:
        fp.write(''.join(split))

### Tokenization

This section tokenizes the data for the UPOS tagging task.

> Loading data

In [11]:
from datasets import load_dataset

**Note: If the above convention for saving files was not followed, the file `ConlluIndic.py` needs to be edited so that `__TRAINING_FILE__` etc. point to the correct files**

In [15]:
dataset = load_dataset('ConlluIndic.py')
dataset

Downloading and preparing dataset conllu_indic/conlluindic to C:/Users/pc/.cache/huggingface/datasets/conllu_indic/conlluindic/1.0.0/c9e09ccab0932c0febc47671ade20124d5711a11252234fc13369eae14b5014e...


                                                                   

Dataset conllu_indic downloaded and prepared to C:/Users/pc/.cache/huggingface/datasets/conllu_indic/conlluindic/1.0.0/c9e09ccab0932c0febc47671ade20124d5711a11252234fc13369eae14b5014e. Subsequent calls will reuse this data.


100%|██████████| 3/3 [00:00<00:00, 63.82it/s]


DatasetDict({
    train: Dataset({
        features: ['id', 'tokens', 'lemmas', 'upos_tags'],
        num_rows: 1194
    })
    validation: Dataset({
        features: ['id', 'tokens', 'lemmas', 'upos_tags'],
        num_rows: 133
    })
    test: Dataset({
        features: ['id', 'tokens', 'lemmas', 'upos_tags'],
        num_rows: 332
    })
})

In [16]:
dataset['train'][0]

{'id': '0',
 'tokens': ['उन्होंने',
  'कहा',
  'कि',
  'जहाँ',
  'तक',
  'पीएसयू',
  'पर',
  'सरकारी',
  'नियंत्रण',
  'क़ायम',
  'है',
  'और',
  'सार्वजनिक',
  'क्षेत्र',
  'की',
  'विशेषताओं',
  'पर',
  'कोई',
  'असर',
  'नहीं',
  'पड़ता',
  '।'],
 'lemmas': ['वह',
  'कह',
  'कि',
  'जहाँ',
  'तक',
  'पीएसयू',
  'पर',
  'सरकारी',
  'नियंत्रण',
  'कायम',
  'है',
  'और',
  'सार्वजनिक',
  'क्षेत्र',
  'का',
  'विशेषता',
  'पर',
  'कोई',
  'असर',
  'नहीं',
  'पड़',
  '।'],
 'upos_tags': [10,
  15,
  13,
  10,
  1,
  7,
  1,
  0,
  7,
  0,
  3,
  4,
  7,
  7,
  1,
  7,
  1,
  10,
  7,
  9,
  15,
  12]}

In [17]:
# UPOS mapping (label id = position in this list):
# 'ADJ','ADP','ADV','AUX','CCONJ','DET','INTJ','NOUN','NUM','PART','PRON','PROPN','PUNCT','SCONJ','SYM','VERB','X','_'
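
To read the integer `upos_tags` shown earlier, each id can be looked up in the label list from the comment above. This is a minimal sketch that assumes the order in that comment matches the label order defined in `ConlluIndic.py`; the first tokens of the sample sentence are consistent with that assumption. The helper name `decode_upos` is hypothetical.

```python
# Label order copied from the comment in the cell above (assumed to match
# the label order defined in ConlluIndic.py).
UPOS_LABELS = ['ADJ', 'ADP', 'ADV', 'AUX', 'CCONJ', 'DET', 'INTJ', 'NOUN', 'NUM',
               'PART', 'PRON', 'PROPN', 'PUNCT', 'SCONJ', 'SYM', 'VERB', 'X', '_']

def decode_upos(tag_ids):
    """Map integer tag ids back to UPOS label strings."""
    return [UPOS_LABELS[i] for i in tag_ids]

# First three tokens and tag ids of dataset['train'][0] as printed above
tokens = ['उन्होंने', 'कहा', 'कि']
tag_ids = [10, 15, 13]
for token, tag in zip(tokens, decode_upos(tag_ids)):
    print(token, tag)
```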

> Tokenization

Like any other neural network, a transformer cannot process raw text directly; the text must first be converted into numbers. This preprocessing step is called tokenization. The tokenizer does the following:

- Splits the input text into tokens, which may be words, subwords, or individual characters.
- Maps each token to a unique integer ID.
- Adds any extra inputs the model needs (for example special tokens and the attention mask).

Preprocessing and tokenization must be done exactly as they were when the model was pre-trained. Since we are using a pre-trained model, we load its matching tokenizer with the `AutoTokenizer` class.
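
The three tokenizer steps described above can be illustrated with a toy example. This is only a sketch: the vocabulary, the `##` continuation marker, and the helper names `split_word` and `toy_encode` are made up for illustration and are not the vocabulary or algorithm used by the `ai4bharat/indic-bert` tokenizer.

```python
# Toy vocabulary: purely illustrative, NOT the real IndicBERT vocabulary.
TOY_VOCAB = {"[CLS]": 0, "[SEP]": 1, "[UNK]": 2,
             "token": 3, "##ization": 4, "works": 5}

def split_word(word):
    """Step 1: greedily split a word into the longest known subwords."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in TOY_VOCAB:
                break
            end -= 1
        if end == start:  # no known piece: the whole word becomes [UNK]
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces

def toy_encode(words):
    """Steps 2 and 3: map subwords to ids and add the extra model inputs."""
    tokens, word_ids = ["[CLS]"], [None]
    for idx, word in enumerate(words):
        for piece in split_word(word):
            tokens.append(piece)
            word_ids.append(idx)  # remember which word each subword came from
    tokens.append("[SEP]")
    word_ids.append(None)
    input_ids = [TOY_VOCAB[t] for t in tokens]
    return {"input_ids": input_ids,
            "attention_mask": [1] * len(input_ids),
            "word_ids": word_ids}

print(toy_encode(["tokenization", "works"]))
```

The `word_ids` list records which original word each subword belongs to, which is the information needed later to align word-level tags with subword tokens.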

In [None]:
from transformers import AutoTokenizer

model_checkpoint = 'ai4bharat/indic-bert'

tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

Reference: https://huggingface.co/docs/transformers/model_doc/auto

The code above imports the `AutoTokenizer` class from the transformers library and sets the model checkpoint name. When it runs, the tokenizer for `ai4bharat/indic-bert` is downloaded and cached for later use.

In [19]:
def align_labels_with_tokens(labels, word_ids):
    new_labels = []
    current_word = None
    for word_id in word_ids:
        if word_id != current_word:
            # Start of a new word!
            current_word = word_id
            label = -100 if word_id is None else labels[word_id]
            new_labels.append(label)
        elif word_id is None:
            # Special token
            new_labels.append(-100)
        else:
            # Same word as previous token: repeat the word's UPOS tag
            # (no B-/I- relabelling is needed for UPOS, unlike NER)
            label = labels[word_id]
            new_labels.append(label)

    return new_labels

In [20]:
def tokenize_and_align_labels(examples):
    tokenized_inputs = tokenizer(
        examples["tokens"], truncation=True, is_split_into_words=True
    )
    all_labels = examples["upos_tags"]
    new_labels = []
    for i, labels in enumerate(all_labels):
        word_ids = tokenized_inputs.word_ids(i)
        new_labels.append(align_labels_with_tokens(labels, word_ids))

    tokenized_inputs["labels"] = new_labels
    return tokenized_inputs

In [21]:
tokenized_datasets = dataset.map(
    tokenize_and_align_labels,
    batched=True,
    remove_columns=dataset["train"].column_names,
)

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


In [32]:
tokenized_datasets['train']['labels'][0]

[-100,
 10,
 10,
 15,
 13,
 10,
 1,
 7,
 7,
 7,
 1,
 0,
 0,
 7,
 7,
 7,
 0,
 0,
 3,
 4,
 7,
 7,
 7,
 7,
 7,
 1,
 7,
 7,
 7,
 7,
 1,
 10,
 7,
 9,
 9,
 15,
 12,
 -100]

In [33]:
# -100 marks the special tokens at the start and end of the sequence.
# -100 is used because the loss function ignores positions labelled -100.
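
The alignment can be checked on a small hand-made case. The sketch below uses a condensed function that should behave the same as `align_labels_with_tokens` above (every subword receives its word's tag, special tokens receive -100); the `word_ids` list is invented for illustration and assumes the first word is split into two subwords.

```python
def align_labels(labels, word_ids):
    """Give each subword its word's label; special tokens (None) get -100."""
    return [-100 if word_id is None else labels[word_id] for word_id in word_ids]

word_labels = [10, 15, 12]                 # one UPOS id per word
word_ids = [None, 0, 0, 1, 2, None]        # hypothetical subword-to-word map
aligned = align_labels(word_labels, word_ids)
print(aligned)  # [-100, 10, 10, 15, 12, -100]

# Only positions whose label is not -100 contribute to the loss
scored_positions = [i for i, label in enumerate(aligned) if label != -100]
print(scored_positions)  # [1, 2, 3, 4]
```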

In [36]:
tokenized_datasets.save_to_disk('D:/data/tokenized_datasets/')  # Saving Tokenized dataset

### Fine-tuning and Evaluation

BERT (Bidirectional Encoder Representations from Transformers) is a large neural network architecture whose parameter count ranges from roughly 100 million to over 300 million. Training such a model from scratch on a small dataset would result in overfitting. The checkpoint used here, `ai4bharat/indic-bert`, is loaded as `AlbertForTokenClassification` and has about 33 million trainable parameters (see the logs below).

It is therefore better to start from a model that was pre-trained on a huge dataset and then train it further on our relatively small dataset. This process is known as fine-tuning.

In [37]:
from transformers import AutoTokenizer, AutoModelForTokenClassification, DataCollatorForTokenClassification, TrainingArguments, Trainer
from datasets import load_metric, load_from_disk, load_dataset
import numpy as np

> Loads the model, tokenizer, data collator, metric and tokenized datasets

**Note: Don't forget to edit the file paths if the previous conventions were not followed**

In [38]:
model_checkpoint = 'ai4bharat/indic-bert'


tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
tokenized_datasets = load_from_disk('D:/data/tokenized_datasets/')



data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer)
metric = load_metric('seqeval')


dataset = load_dataset('ConlluIndic.py')

upos_feature = dataset['train'].features['upos_tags']
label_names = upos_feature.feature.names



# Map integer label ids to label names and back
id2label = {i: label for i, label in enumerate(label_names)}
label2id = {label: i for i, label in id2label.items()}


model = AutoModelForTokenClassification.from_pretrained(model_checkpoint,id2label=id2label,label2id=label2id)

  metric = load_metric('seqeval')
Found cached dataset conllu_indic (C:/Users/pc/.cache/huggingface/datasets/conllu_indic/conlluindic/1.0.0/c9e09ccab0932c0febc47671ade20124d5711a11252234fc13369eae14b5014e)
100%|██████████| 3/3 [00:00<00:00, 91.76it/s]
Some weights of the model checkpoint at ai4bharat/indic-bert were not used when initializing AlbertForTokenClassification: ['predictions.decoder.bias', 'predictions.LayerNorm.weight', 'sop_classifier.classifier.bias', 'sop_classifier.classifier.weight', 'predictions.LayerNorm.bias', 'predictions.dense.weight', 'predictions.dense.bias', 'predictions.bias', 'predictions.decoder.weight']
- This IS expected if you are initializing AlbertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing AlbertForTokenClassification from the checkpoint of a model that y

Data collators are objects that will form a batch by using a list of dataset elements as input. These elements are of the same type as the elements of train_dataset or eval_dataset.

To be able to build batches, data collators may apply some processing (like padding). Some of them (like DataCollatorForLanguageModeling) also apply some random data augmentation (like random masking) on the formed batch.
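
As a rough illustration of the padding step, the sketch below pads a batch of tokenized examples to the length of the longest one. It assumes that labels are padded with -100 so padded positions are ignored by the loss, and it uses a placeholder `pad_token_id` of 0; the function name `pad_batch` is hypothetical, and the real collator also converts the result to tensors.

```python
def pad_batch(features, pad_token_id=0, label_pad_id=-100):
    """Pad every example in the batch to the length of the longest one."""
    max_len = max(len(f["input_ids"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        n_real = len(f["input_ids"])
        n_pad = max_len - n_real
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * n_pad)
        # 1 for real tokens, 0 for padding
        batch["attention_mask"].append([1] * n_real + [0] * n_pad)
        # padded label positions are ignored by the loss
        batch["labels"].append(f["labels"] + [label_pad_id] * n_pad)
    return batch

examples = [
    {"input_ids": [2, 11, 12, 3], "labels": [-100, 7, 15, -100]},
    {"input_ids": [2, 21, 3], "labels": [-100, 10, -100]},
]
print(pad_batch(examples))
```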

> Metric computation

In [39]:
def compute_metrics(eval_preds):
    logits, labels = eval_preds
    predictions = np.argmax(logits, axis=-1)

    # Remove ignored index (special tokens) and convert to labels
    true_labels = [[label_names[l] for l in label if l != -100] for label in labels]
    true_predictions = [
        [label_names[p] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(predictions, labels)
    ]
    all_metrics = metric.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": all_metrics["overall_precision"],
        "recall": all_metrics["overall_recall"],
        "f1": all_metrics["overall_f1"],
        "accuracy": all_metrics["overall_accuracy"],
    }
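
To make the masking in `compute_metrics` concrete, here is a small pure-Python sketch of the same idea: take the argmax of the logits at each position, skip positions labelled -100, and compare the rest with the gold labels. The logits and labels are invented toy values, and `token_accuracy` is a hypothetical helper, not part of seqeval.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)

def token_accuracy(logits, labels, ignore_index=-100):
    """Accuracy over positions whose gold label is not ignore_index."""
    correct = total = 0
    for sent_logits, sent_labels in zip(logits, labels):
        for token_logits, gold in zip(sent_logits, sent_labels):
            if gold == ignore_index:
                continue  # special or padded position: not scored
            total += 1
            correct += int(argmax(token_logits) == gold)
    return correct / total

# One toy sentence: 4 positions, 3 classes
toy_logits = [[[0.1, 0.2, 0.3], [2.0, 0.1, 0.0], [0.0, 1.5, 0.2], [0.3, 0.2, 0.1]]]
toy_labels = [[-100, 0, 2, -100]]
print(token_accuracy(toy_logits, toy_labels))  # 0.5
```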


In [43]:
tokenized_datasets

DatasetDict({
    train: Dataset({
        features: ['input_ids', 'token_type_ids', 'attention_mask', 'labels'],
        num_rows: 1194
    })
    validation: Dataset({
        features: ['input_ids', 'token_type_ids', 'attention_mask', 'labels'],
        num_rows: 133
    })
    test: Dataset({
        features: ['input_ids', 'token_type_ids', 'attention_mask', 'labels'],
        num_rows: 332
    })
})

> Defining the trainer

Set `fp16=True` when training on a GPU to enable mixed-precision training.

`batch_size` can also be increased if memory allows.

In [107]:
batch_size = 32
args = TrainingArguments(
    output_dir=f"model/upos",
    overwrite_output_dir=True,
    evaluation_strategy="epoch",
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    save_strategy="epoch",
    learning_rate=4e-4,
    num_train_epochs=10,
    weight_decay=0.004,
    save_total_limit=1,
    #fp16= True,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    # eval_dataset=tokenized_datasets["test"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
)


PyTorch: setting up devices
The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).


> Train the model

In [105]:
trainer.train()

***** Running training *****
  Num examples = 1194
  Num Epochs = 10
  Instantaneous batch size per device = 32
  Total train batch size (w. parallel, distributed & accumulation) = 32
  Gradient Accumulation steps = 1
  Total optimization steps = 380
  Number of trainable parameters = 32866834


| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|
| 1 | No log | 0.67827 | 0.687408 | 0.72932 | 0.707744 | 0.776244 |
| 2 | No log | 0.491261 | 0.784104 | 0.796893 | 0.790447 | 0.842428 |
| 3 | No log | 0.43823 | 0.813272 | 0.818641 | 0.815947 | 0.859084 |
| 4 | No log | 0.401737 | 0.833207 | 0.853592 | 0.843276 | 0.87574 |
| 5 | No log | 0.382244 | 0.860202 | 0.857864 | 0.859032 | 0.890861 |
| 6 | No log | 0.419318 | 0.863531 | 0.869903 | 0.866705 | 0.892615 |
| 7 | No log | 0.380274 | 0.874758 | 0.878835 | 0.876792 | 0.900723 |
| 8 | No log | 0.421361 | 0.872504 | 0.88233 | 0.877389 | 0.901381 |
| 9 | No log | 0.432009 | 0.882148 | 0.886602 | 0.88437 | 0.905325 |
| 10 | No log | 0.429093 | 0.887941 | 0.88932 | 0.88863 | 0.907955 |


***** Running Evaluation *****
  Num examples = 133
  Batch size = 32
  _warn_prf(average, modifier, msg_start, len(result))
Saving model checkpoint to model/upos\checkpoint-38
Configuration saved in model/upos\checkpoint-38\config.json
Model weights saved in model/upos\checkpoint-38\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-38\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-38\special_tokens_map.json
***** Running Evaluation *****
  Num examples = 133
  Batch size = 32
  _warn_prf(average, modifier, msg_start, len(result))
Saving model checkpoint to model/upos\checkpoint-76
Configuration saved in model/upos\checkpoint-76\config.json
Model weights saved in model/upos\checkpoint-76\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-76\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-76\special_tokens_map.json
Deleting older checkpoint [model\upos\checkpoint-38] due to args.save_total_limit


Saving model checkpoint to model/upos\checkpoint-114
Configuration saved in model/upos\checkpoint-114\config.json
Model weights saved in model/upos\checkpoint-114\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-114\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-114\special_tokens_map.json
Deleting older checkpoint [model\upos\checkpoint-76] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 133
  Batch size = 32
  _warn_prf(average, modifier, msg_start, len(result))
Saving model checkpoint to model/upos\checkpoint-152
Configuration saved in model/upos\checkpoint-152\config.json
Model weights saved in model/upos\checkpoint-152\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-152\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-152\special_tokens_map.json
Deleting older checkpoint [model\upos\checkpoint-114] due to args.save_total_limit
***** Running Evaluation *****
 

Saving model checkpoint to model/upos\checkpoint-228
Configuration saved in model/upos\checkpoint-228\config.json
Model weights saved in model/upos\checkpoint-228\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-228\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-228\special_tokens_map.json
Deleting older checkpoint [model\upos\checkpoint-190] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 133
  Batch size = 32
  _warn_prf(average, modifier, msg_start, len(result))
Saving model checkpoint to model/upos\checkpoint-266
Configuration saved in model/upos\checkpoint-266\config.json
Model weights saved in model/upos\checkpoint-266\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-266\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-266\special_tokens_map.json
Deleting older checkpoint [model\upos\checkpoint-228] due to args.save_total_limit
***** Running Evaluation *****


Saving model checkpoint to model/upos\checkpoint-342
Configuration saved in model/upos\checkpoint-342\config.json
Model weights saved in model/upos\checkpoint-342\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-342\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-342\special_tokens_map.json
Deleting older checkpoint [model\upos\checkpoint-304] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 133
  Batch size = 32
  _warn_prf(average, modifier, msg_start, len(result))
Saving model checkpoint to model/upos\checkpoint-380
Configuration saved in model/upos\checkpoint-380\config.json
Model weights saved in model/upos\checkpoint-380\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-380\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-380\special_tokens_map.json
Deleting older checkpoint [model\upos\checkpoint-342] due to args.save_total_limit


Training completed. Do not fo

TrainOutput(global_step=380, training_loss=0.2645432723195929, metrics={'train_runtime': 5714.8017, 'train_samples_per_second': 2.089, 'train_steps_per_second': 0.066, 'total_flos': 39144345836976.0, 'train_loss': 0.2645432723195929, 'epoch': 10.0})
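
The step counts in the log follow from the dataset size and batch size. The sketch below reproduces them, assuming a single device and no gradient accumulation (as the log reports); `optimization_steps` is a hypothetical helper name.

```python
import math

def optimization_steps(num_examples, batch_size, num_epochs):
    """Return (steps per epoch, total steps); the last partial batch counts as a step."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * num_epochs

print(optimization_steps(1194, 32, 10))  # (38, 380)
```

This matches the checkpoint names (`checkpoint-38`, `checkpoint-76`, and so on up to `checkpoint-380`). With the default `logging_steps` of 500 in `TrainingArguments`, a 380-step run never reaches a logging step, which is the likely reason the Training Loss column shows "No log".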

> Evaluate the model

**Note: Report the final evaluation results on the test dataset. To do this, recreate the trainer with `eval_dataset` set to the test split, as in the next cell**

In [None]:
batch_size = 32
args = TrainingArguments(
    output_dir=f"model/upos",
    overwrite_output_dir=True,
    evaluation_strategy="epoch",
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    save_strategy="epoch",
    learning_rate=4e-4,
    num_train_epochs=10,
    weight_decay=0.004,
    save_total_limit=1,
    #fp16= True,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    # eval_dataset=tokenized_datasets["validation"],
    eval_dataset=tokenized_datasets["test"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
)


In [None]:
trainer.evaluate()

# Question - 2b

## Precision, Recall, F-score for Validation dataset (Micro)

'precision': 0.8879410624272974,
'recall': 0.8893203883495145,
'f-score': 0.8886301901435778

## Precision, Recall, F-score for Test dataset (Micro)

'precision': 0.8805875848996999,
'recall': 0.8915720454181992,
'f-score': 0.8860457724094087


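## Precision, Recall, F-score averaged over the Validation and Test datasets (Macro)

Here P1 and R1 are the validation precision and recall, and P2 and R2 are the test precision and recall.

### Macro-average precision = (P1+P2)/2
### Macro-average recall = (R1+R2)/2

### The macro-average F-score is the harmonic mean of these two figures.

Note: "macro" is used here for the unweighted mean over the two datasets. In the usual classification sense, macro-averaging is the unweighted mean of the per-class (per-tag) scores.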



In [111]:
precision1 =  0.8879410624272974
recall1 = 0.8893203883495145
precision2 = 0.8805875848996999
recall2 = 0.8915720454181992 

macro_precision = (precision1 + precision2)/2
macro_recall = (recall1 + recall2)/2
macro_f_score = 2*(macro_precision * macro_recall) /(macro_precision + macro_recall)
print(" Macro-average Precision: ",macro_precision)
print(" Macro-average Recall: ",macro_recall)
print(" Macro-average F-score: ",macro_f_score)


 Macro-average Precision:  0.8842643236634986
 Macro-average Recall:  0.890446216883857
 Macro-average F-score:  0.887344503502727
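
For comparison, macro-averaging in the per-class sense computes precision, recall and F-score separately for each tag and then takes their unweighted mean. The sketch below shows this on invented toy tags, not on the model's predictions; `per_class_prf` and `macro_average` are hypothetical helper names.

```python
def per_class_prf(gold, pred):
    """Precision, recall and F1 for each label that occurs in gold or pred."""
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores

def macro_average(scores):
    """Unweighted mean of per-class precision, recall and F1."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))

# Toy data, invented for illustration only
gold = ["NOUN", "NOUN", "VERB", "ADJ"]
pred = ["NOUN", "VERB", "VERB", "ADJ"]
macro_p, macro_r, macro_f1 = macro_average(per_class_prf(gold, pred))
print(round(macro_p, 4), round(macro_r, 4), round(macro_f1, 4))
```

Every class counts equally in this average, so rare tags influence the result as much as frequent ones, which is the main difference from the micro-averaged figures reported above.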


In [44]:
batch_size = 16
args = TrainingArguments(
    output_dir=f"model/upos",
    overwrite_output_dir=True,
    evaluation_strategy="epoch",
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    save_strategy="epoch",
    learning_rate=4e-4,
    num_train_epochs=2,
    weight_decay=0.004,
    save_total_limit=1,
    #fp16= True,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    # eval_dataset=tokenized_datasets["test"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
)

PyTorch: setting up devices
The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).


In [45]:
trainer.train()

***** Running training *****
  Num examples = 1194
  Num Epochs = 2
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 150
***** Running Evaluation *****
  Num examples = 133
  Batch size = 16

  _warn_prf(average, modifier, msg_start, len(result))
Saving model checkpoint to model/upos\checkpoint-75
Configuration saved in model/upos\checkpoint-75\config.json


{'eval_loss': 1.1313750743865967, 'eval_precision': 0.5687263556116016, 'eval_recall': 0.5403354632587859, 'eval_f1': 0.5541675199672331, 'eval_accuracy': 0.6081632653061224, 'eval_runtime': 24.3899, 'eval_samples_per_second': 5.453, 'eval_steps_per_second': 0.369, 'epoch': 1.0}


Model weights saved in model/upos\checkpoint-75\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-75\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-75\special_tokens_map.json
Deleting older checkpoint [model\upos\checkpoint-42] due to args.save_total_limit
***** Running Evaluation *****
  Num examples = 133
  Batch size = 16

  _warn_prf(average, modifier, msg_start, len(result))
Saving model checkpoint to model/upos\checkpoint-150
Configuration saved in model/upos\checkpoint-150\config.json


{'eval_loss': 0.7866435647010803, 'eval_precision': 0.6163682864450127, 'eval_recall': 0.6737220447284346, 'eval_f1': 0.6437702728486929, 'eval_accuracy': 0.7336197636949516, 'eval_runtime': 24.4271, 'eval_samples_per_second': 5.445, 'eval_steps_per_second': 0.368, 'epoch': 2.0}


Model weights saved in model/upos\checkpoint-150\pytorch_model.bin
tokenizer config file saved in model/upos\checkpoint-150\tokenizer_config.json
Special tokens file saved in model/upos\checkpoint-150\special_tokens_map.json
Deleting older checkpoint [model\upos\checkpoint-75] due to args.save_total_limit


Training completed. Do not forget to share your model on huggingface.co/models =)


                                             
100%|██████████| 150/150 [17:06<00:00,  6.84s/it]

{'train_runtime': 1026.8184, 'train_samples_per_second': 2.326, 'train_steps_per_second': 0.146, 'train_loss': 1.2835548909505208, 'epoch': 2.0}





TrainOutput(global_step=150, training_loss=1.2835548909505208, metrics={'train_runtime': 1026.8184, 'train_samples_per_second': 2.326, 'train_steps_per_second': 0.146, 'train_loss': 1.2835548909505208, 'epoch': 2.0})