----------------------------------------
Task Parameters
----------------------------------------
Training Language: en
Evaluation Languages: ('en',)
EUROVOC Concepts: level_3 (567)
Max Document Length: 512
----------------------------------------
Model Parameters
----------------------------------------
Bert Model: xlm-roberta-base
Frozen Layers: 0
Use Adapters: False
LayerNorm ONLY (LNFIT): False
Bottle-neck Size: 256
----------------------------------------
Training Parameters
----------------------------------------
Epochs: 70
Batch Size: 8
Learning Rate: 3e-05
Label Smoothing: 0.0
EarlyStop Monitor: val_rp
----------------------------------------
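The task parameters above map onto a fairly standard Hugging Face data pipeline. The sketch below is a reconstruction under stated assumptions, not the experiment's actual loader: the 'language' and 'label_level' keyword names for the multi_eurlex script are guesses (the "custom data configuration" lines in the loading log below suggest kwargs roughly like these were passed), and encode_example is a hypothetical helper. The 567 level-3 EUROVOC concepts and the 512-token limit come from the parameters above.

    import numpy as np
    from datasets import load_dataset
    from transformers import AutoTokenizer

    NUM_LABELS = 567   # level_3 EUROVOC concepts (Task Parameters)
    MAX_LENGTH = 512   # Max Document Length (Task Parameters)

    # Keyword names are assumptions about the multi_eurlex script's interface;
    # the run below fetched that script from the datasets master branch.
    dataset = load_dataset('multi_eurlex', language='en', label_level='level_3')
    tokenizer = AutoTokenizer.from_pretrained('xlm-roberta-base')

    def encode_example(example):
        # Hypothetical helper: truncate/pad to 512 tokens, multi-hot the labels.
        enc = tokenizer(example['text'], truncation=True,
                        padding='max_length', max_length=MAX_LENGTH)
        labels = np.zeros(NUM_LABELS, dtype=np.float32)
        labels[example['labels']] = 1.0   # 'labels' holds gold concept indices
        enc['labels'] = labels
        return enc

    train_set = dataset['train'].map(encode_example)        # 55000 documents
    dev_set = dataset['validation'].map(encode_example)     # 5000 documents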
Couldn't find file locally at /trainman-mount/trainman-k8s-storage-8d05e340-710b-47e3-8dca-932ef5eca277/semantic_learning/downstream_tasks/multi-eurlex/experiments/multi_eurlex/multi_eurlex.py, or remotely at https://raw.githubusercontent.com/huggingface/datasets/1.11.0/datasets/multi_eurlex/multi_eurlex.py. The file was picked from the master branch on github instead at https://raw.githubusercontent.com/huggingface/datasets/master/datasets/multi_eurlex/multi_eurlex.py.
Using custom data configuration default-0468e26c422c587d
Reusing dataset multi_eurlex (/root/.cache/huggingface/datasets/multi_eurlex/default-0468e26c422c587d/1.0.0/8ec8b79877a517369a143ead6679d1788d13e51cf641ed29772f4449e8364fb6)
Using custom data configuration default-0b80c1f4c7e7ea09
Reusing dataset multi_eurlex (/root/.cache/huggingface/datasets/multi_eurlex/default-0b80c1f4c7e7ea09/1.0.0/8ec8b79877a517369a143ead6679d1788d13e51cf641ed29772f4449e8364fb6)
55000 documents will be used for training
5000 documents will be used for development
Model: "BertClassifier"
____________________________________________________________________________________________________
Layer (type)                               Output Shape              Param #
====================================================================================================
tfxlm_roberta_model (TFXLMRobertaModel)    multiple                  278043648
dropout_37 (Dropout)                       multiple                  0
classifier (Dense)                         multiple                  436023
====================================================================================================
Total params: 278,479,671
Trainable params: 278,479,671
Non-trainable params: 0
____________________________________________________________________________________________________
----------------------------------------------------------------------------------------------------
Training History
----------------------------------------------------------------------------------------------------
             Loss      Val Loss   RP        Val RP
EPOCH #1 :   0.70654   0.70390    0.00000   0.00909
EPOCH #2 :   0.69967   0.69715    0.00000   0.01497
EPOCH #3 :   0.68365   0.64677    0.00909   0.01422
EPOCH #4 :   0.66976   0.64405    0.00000   0.00588
EPOCH #5 :   0.66649   0.64676    0.01250   0.02010
EPOCH #6 :   0.64259   0.58437    0.02361   0.00588
EPOCH #7 :   0.60391   0.58496    0.00000   0.00588
EPOCH #8 :   0.59998   0.60895    0.01250   0.00000
EPOCH #9 :   0.60050   0.62296    0.00909   0.00000
EPOCH #10:   0.57687   0.61215    0.02159   0.00588
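The model summary above fixes the architecture: an xlm-roberta-base encoder (278,043,648 parameters), dropout, and a single dense output layer over the 567 labels. The dense layer's count checks out: 768 x 567 weights + 567 biases = 436,023, and 278,043,648 + 436,023 = 278,479,671, matching the reported total. Below is a minimal Keras sketch consistent with those counts and with the training parameters; the pooling strategy, dropout rate, sigmoid output, and the custom 'rp' metric wiring are assumptions, not confirmed details of the experiment code.

    import tensorflow as tf
    from transformers import TFXLMRobertaModel

    class BertClassifier(tf.keras.Model):
        def __init__(self, num_labels=567, dropout_rate=0.1):
            super().__init__(name='BertClassifier')
            self.bert = TFXLMRobertaModel.from_pretrained('xlm-roberta-base')
            self.dropout = tf.keras.layers.Dropout(dropout_rate)  # rate assumed
            # 768 x 567 + 567 = 436,023 parameters, as in the summary above.
            self.classifier = tf.keras.layers.Dense(num_labels, activation='sigmoid',
                                                    name='classifier')

        def call(self, inputs, training=False):
            # Pooled <s> representation; the summary does not pin down the
            # pooling strategy, so this is one plausible reading.
            pooled = self.bert(inputs, training=training).pooler_output
            return self.classifier(self.dropout(pooled, training=training))

    model = BertClassifier()
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),           # Learning Rate
                  loss=tf.keras.losses.BinaryCrossentropy(label_smoothing=0.0))     # Label Smoothing
    # Early stopping on validation R-Precision; 'rp' would be a custom Keras
    # metric, logged per epoch as 'rp'/'val_rp' in the history above.
    early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_rp', mode='max',
                                                  restore_best_weights=True)

With 70 epochs requested but only 10 logged, the run presumably stopped early on the val_rp monitor (its best value, 0.02010, came at epoch 5).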
----------------------------------------------------------------------------------------------------
Evaluation Metrics
----------------------------------------------------------------------------------------------------
Development
----------------------------------------------------------------------------------------------------
"en":
    R-Precision: 2.01
    NDCG@1: 0.00   NDCG@2: 0.00   NDCG@3: 2.35   NDCG@4: 1.95   NDCG@5: 1.70
    R@1:    0.00   R@2:    0.00   R@3:    0.59   R@4:    0.59   R@5:    0.59
----------------------------------------------------------------------------------------------------
Test
----------------------------------------------------------------------------------------------------
"en":
    R-Precision: 5.64
    NDCG@1: 0.00   NDCG@2: 0.00   NDCG@3: 2.35   NDCG@4: 1.95   NDCG@5: 1.70
    R@1:    0.00   R@2:    0.00   R@3:    0.71   R@4:    0.71   R@5:    0.71
----------------------------------------------------------------------------------------------------
Training time: 00:02:47 (hh:mm:ss)
Training + Evaluation time: 00:03:04 (hh:mm:ss)
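For reference, the ranking metrics in this report can be computed from per-document score arrays and multi-hot gold-label arrays as sketched below. This is a minimal numpy sketch rather than the experiment's evaluation code: the helper names are hypothetical, NDCG uses binary relevance with a log2 position discount, and all three metrics are averaged over documents and scaled to percentages to match the tables above.

    import numpy as np

    def r_precision(y_true, y_score):
        # For a document with R gold labels, the precision of its top-R
        # ranked labels; averaged over documents with at least one label.
        vals = []
        for t, s in zip(y_true, y_score):
            r = int(t.sum())
            if r == 0:
                continue
            top_r = np.argsort(-s)[:r]
            vals.append(t[top_r].sum() / r)
        return 100 * float(np.mean(vals))

    def recall_at_k(y_true, y_score, k):
        # R@k: fraction of a document's gold labels recovered in the top k.
        vals = []
        for t, s in zip(y_true, y_score):
            r = int(t.sum())
            if r == 0:
                continue
            top_k = np.argsort(-s)[:k]
            vals.append(t[top_k].sum() / r)
        return 100 * float(np.mean(vals))

    def ndcg_at_k(y_true, y_score, k):
        # Binary-relevance NDCG@k: DCG over the top k ranked labels with a
        # log2(i + 1) discount, normalized by the ideal DCG for R gold labels.
        vals = []
        for t, s in zip(y_true, y_score):
            r = int(t.sum())
            if r == 0:
                continue
            top_k = np.argsort(-s)[:k]
            dcg = (t[top_k] / np.log2(np.arange(2, k + 2))).sum()
            ideal = (1.0 / np.log2(np.arange(2, min(r, k) + 2))).sum()
            vals.append(dcg / ideal)
        return 100 * float(np.mean(vals))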