CyberAgentAILab/japanese-nli-model

Japanese Natural Language Inference Model

This repository provides the code for a Japanese NLI model, a fine-tuned masked language model.

Performance

In terms of overall accuracy, the model performed comparably to the results reported in the JGLUE [Kurihara et al. 2022] and JSICK [Yanaka and Mineshima 2022] papers:

Model                             JGLUE-JNLI valid [%]   JSICK test [%]
[Kurihara et al. 2022]            91.9                   N/A
[Yanaka and Mineshima 2022]       N/A                    89.1
ours using both JNLI and JSICK    90.9                   89.0

References

Appendix: Hyperparameters

random seeds

Yes, we tested only a single run :(

import random
import numpy as np
import torch
torch.manual_seed(0)
random.seed(0)
np.random.seed(0)

dataset order

  1. JSICK
  2. JGLUE

labels

We converted the string labels to integers using the following mapping:

label2int = {"contradiction": 0, "entailment": 1, "neutral": 2}
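As a minimal sketch of how this mapping might be applied, the snippet below encodes a list of string labels and decodes them back. The helper names `encode_labels` and `decode_labels`, the inverse mapping `int2label`, and the example labels are assumptions made for illustration; they are not taken from this repository.

```python
label2int = {"contradiction": 0, "entailment": 1, "neutral": 2}

# Inverse mapping (hypothetical helper, not part of the repository).
int2label = {index: label for label, index in label2int.items()}


def encode_labels(labels):
    """Convert string labels to integer class ids (KeyError on unknown labels)."""
    return [label2int[label] for label in labels]


def decode_labels(indices):
    """Convert integer class ids back to string labels."""
    return [int2label[index] for index in indices]


# Hypothetical example labels.
example = ["entailment", "neutral", "contradiction"]
encoded = encode_labels(example)
print(encoded)
print(decode_labels(encoded))
```

Raising a `KeyError` on an unknown label is a deliberate choice here: a misspelled or missing label fails loudly instead of being silently mapped to a class.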

CrossEncoder

We emulated an effective batch size of 128 with gradient accumulation: a per-step batch_size of 32 accumulated over 4 steps (32 * 4 = 128).

batch_size=32,
shuffle=True,
epochs=3,
accumulation_steps=4,
optimizer_params={'lr': 5e-5},
warmup_steps=math.ceil(0.1 * len(data)),
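The following toy example illustrates why accumulating gradients over 4 micro-batches of 32 can stand in for a single batch of 128. It is a sketch under stated assumptions, not the repository's training code: the one-parameter linear model, the synthetic data, and the function names `mean_grad` and `accumulated_grad` are all hypothetical. For a loss averaged over examples, scaling each micro-batch gradient by 1 / accumulation_steps and summing reproduces the full-batch gradient.

```python
import math
import random


def mean_grad(w, batch):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, batch_size=32, accumulation_steps=4):
    """Sum micro-batch gradients, each scaled by 1 / accumulation_steps."""
    total = 0.0
    for step in range(accumulation_steps):
        micro_batch = data[step * batch_size:(step + 1) * batch_size]
        total += mean_grad(w, micro_batch) / accumulation_steps
    return total


# Synthetic data (assumption for illustration): y is roughly 3 * x.
rng = random.Random(0)
data = []
for _ in range(128):
    x = rng.uniform(-1.0, 1.0)
    data.append((x, 3.0 * x + rng.gauss(0.0, 0.1)))

w = 0.5
full_batch = mean_grad(w, data)            # one batch of 128
accumulated = accumulated_grad(w, data)    # 4 micro-batches of 32
print(math.isclose(full_batch, accumulated, rel_tol=1e-9))
```

In a typical PyTorch loop the same idea is expressed by dividing the loss by the number of accumulation steps before calling `backward()`, and calling `optimizer.step()` only after every fourth micro-batch.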