
RobertaDot_NLL_LN class not defined? #13

Closed
ylwangy opened this issue Jul 20, 2021 · 5 comments


ylwangy commented Jul 20, 2021

Hi, jingtao

I find that the architecture of the released ADORE model is not defined in your code.

The config.json file indicates that the model architecture is "RobertaDot_NLL_LN"; however, it does not seem to be defined in model.py.

jingtaozhan (Owner) commented Jul 20, 2021

Oh, sorry for the confusion. This is quite tricky.
In fact, RobertaDot_NLL_LN is defined in a previous work. To enable a fair comparison, we use the same model architecture and the same warmup model. Our released RobertaDot is identical to RobertaDot_NLL_LN except for the name. The RobertaDot_NLL_LN entry in config.json comes from the warmup model released by that previous work, and I did not change it.
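
For example, something along these lines should load the released checkpoint with the RobertaDot class from model.py (the path below is only illustrative); from_pretrained loads the weights into whichever class you call it on and does not use the "architectures" string in config.json to pick the class:

# Minimal sketch: loading the released checkpoint with this repo's RobertaDot.
# The directory path is illustrative; point it at wherever the checkpoint was downloaded.
from transformers import RobertaConfig
from model import RobertaDot  # the class defined in this repo's model.py

ckpt_dir = "./adore-checkpoint"  # illustrative path
config = RobertaConfig.from_pretrained(ckpt_dir)
# The "RobertaDot_NLL_LN" entry in config.json is ignored here; the weights
# load into RobertaDot because the architecture is the same.
model = RobertaDot.from_pretrained(ckpt_dir, config=config)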

ylwangy (Author) commented Jul 20, 2021

Thanks. And when we train ADORE, which model should we pass as the init_path argument?

config = RobertaConfig.from_pretrained(args.init_path)
model = RobertaDot.from_pretrained(args.init_path, config=config)

Should init_path be the STAR model we trained previously (say, the checkpoint you released? Its model architecture is RobertaDot_InBatch, not RobertaDot), or another PLM such as roberta_base?

jingtaozhan (Owner) commented Jul 20, 2021

ADORE further finetunes the query encoder and uses a frozen document encoder.
In our experiments, we use the frozen document encoder as initialization.

jingtaozhan (Owner) commented Jul 20, 2021

The so-called RobertaDot_InBatch only means that we use a trick similar to in-batch training. The actual model architecture is still the same as RobertaDot.
In all our experiments, we use the same model architecture, defined as RobertaDot.
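
Concretely, the initialization described above would look like the snippet in your question, with init_path pointing at the STAR checkpoint (the path below is only illustrative):

# Minimal sketch, assuming the previously trained STAR checkpoint is used as
# the frozen document encoder that initializes ADORE training.
from transformers import RobertaConfig
from model import RobertaDot  # this repo's model.py

init_path = "./star-checkpoint"  # illustrative path to the STAR model
config = RobertaConfig.from_pretrained(init_path)
# That checkpoint's config.json may say "RobertaDot_InBatch", but the weights
# load into RobertaDot because the architecture is identical.
model = RobertaDot.from_pretrained(init_path, config=config)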

ylwangy (Author) commented Jul 20, 2021

I see. Thanks, I'll try it.

ylwangy closed this as completed Jul 20, 2021