My goal here is to cluster sentences. For this purpose, I chose to use similarities between sentence embeddings across all my sentences. Unfortunately, CamemBERT does not seem well suited to that task, and fine-tuning FlauBERT could be a solution.
So, thanks to @formiel, I managed to fine-tune FlauBERT on an NLI dataset.
My question is about that fine-tuning: what exactly is the output? I only got a few files in the dump_path:
train.log ==> logs of the training
params.pkl ==> parameters of the training
test.pred.0 ==> predictions on the test dataset after the first epoch
valid.pred.0 ==> predictions on the validation dataset after the first epoch
test.pred.1 ==> etc.
I wonder whether, after fine-tuning FlauBERT, I could use it to compute new sentence embeddings (as with FlauBERT before fine-tuning). So where is the new FlauBERT model trained on the NLI dataset, and how do I use it to compute embeddings?
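For context, here is a minimal sketch of the pooling step I have in mind once a fine-tuned model is available: turning per-token hidden states into one sentence vector by mean pooling over non-padding tokens, then comparing sentences with cosine similarity for clustering. The arrays below are dummy stand-ins; in practice they would come from a FlauBERT forward pass.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average per-token vectors into one sentence embedding.

    token_embeddings: (seq_len, hidden_dim) array, e.g. the last hidden
    states produced by a (fine-tuned) FlauBERT model for one sentence.
    attention_mask: (seq_len,) array, 1 for real tokens, 0 for padding.
    Padding positions are excluded from the average.
    """
    mask = attention_mask[:, None].astype(float)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()

def cosine_similarity(a, b):
    """Cosine similarity between two sentence embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Dummy hidden states for a 3-token sequence (last token is padding).
hidden = np.array([[1.0, 0.0],
                   [3.0, 0.0],
                   [99.0, 99.0]])  # padding row, ignored by the mask
mask = np.array([1, 1, 0])
sentence_vec = mean_pool(hidden, mask)  # -> [2.0, 0.0]
```

A similarity matrix built this way over all sentence pairs can then be fed to any standard clustering algorithm (e.g. agglomerative clustering).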
Thanks in advance
The current fine-tuning code of XLM does not save the best models (I'm sorry you spent time fine-tuning without ending up with a trained model). I had not noticed this issue because I only needed the validation scores and prediction files.
I can add some code to save the best model and load it for computing embeddings. However, since I plan to migrate all the fine-tuning tasks to Hugging Face's transformers (for ease of comparison with other methods), it's difficult for me to do this now. Could you please wait and use the transformers pipeline instead? I will push an update in the next few days for the XNLI task (with the weights of the fine-tuned model, so that you won't have to train it again).