
Can't achieve the scores in paper #3

Closed
PaffxAroma opened this issue Jul 22, 2021 · 9 comments

Comments


PaffxAroma commented Jul 22, 2021

I'm trying to reproduce the paper, but I can't reach 0.85 ACC running this code on GoogleNews-S. All hyperparameters are set to the same values as in the paper, and the data is augmented with contextual augmentation. The runs show that it is not the model but the representation clustered with KMeans that performs better, at 0.75 ACC, while the model only reaches 0.62 ACC. When I increase the clustering-head learning rate, the model's result still stays around 0.62. What should I do to improve this?

PaffxAroma changed the title from "Can't reach the results in paper" to "Can't achieve the scores in paper" on Jul 22, 2021
@yanhan19940405

I also did not achieve good results. After visualizing the data embedding space, I found that the sentence embedding matrix generated by SCCL does not show clear discrimination between clusters. Did you make any progress on this afterwards?
[image: visualization of the sentence embedding space produced by SCCL]

@Dejiao2018

Thanks for your interest in our work @PaffxAroma. To your questions:

1. The reported accuracy on GoogleNews-S is 83.1, not 85.
2. As we stated in the paper, ACC is reported as the KMeans clustering result on the learned representations (a minimal sketch of this evaluation is below).
3. You should also check how the clustering accuracy changes along the training process; an arbitrarily long training run will result in degenerated performance.
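For reference, a minimal sketch of this evaluation protocol (illustrative only, not the exact code in this repo; it assumes scikit-learn, scipy, and integer ground-truth labels):

```python
# Minimal sketch: KMeans on the learned embeddings, ACC via Hungarian matching.
import numpy as np
from sklearn.cluster import KMeans
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """ACC under the best one-to-one mapping of predicted clusters to labels."""
    n = max(y_pred.max(), y_true.max()) + 1
    count = np.zeros((n, n), dtype=np.int64)
    for p, t in zip(y_pred, y_true):
        count[p, t] += 1
    row, col = linear_sum_assignment(count.max() - count)  # maximize matched pairs
    return count[row, col].sum() / y_pred.size

def kmeans_acc(embeddings, y_true, n_clusters):
    """Cluster the embeddings with KMeans and score against ground-truth labels."""
    y_pred = KMeans(n_clusters=n_clusters, n_init=20).fit_predict(embeddings)
    return clustering_accuracy(y_true, y_pred)
```

Tracking `kmeans_acc` at fixed intervals during training also addresses point 3: stop when the curve plateaus instead of training for an arbitrarily long time.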

@yanhan19940405, can you provide more context about your plot? Is it a t-SNE visualization? If so, why is there only one color? Please refer to my answer to your original question. Thanks.


yanhan19940405 commented Jul 30, 2021 via email



Dejiao2018 commented Jul 30, 2021

Please refer to Table 3 in our paper. Back translation did not perform well in our experiments, so we do not recommend it for contrastive-learning-based short text clustering.
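For concreteness, a minimal contextual-augmentation sketch (illustrative only; it assumes the nlpaug library, and the model and substitution rate here are assumptions, not the paper's exact configuration):

```python
# Illustrative contextual augmentation with nlpaug; the model and aug_p used
# here are assumptions, not the paper's exact settings.
import nlpaug.augmenter.word as naw

# Substitute ~20% of the tokens using a BERT masked-LM to create an augmented view.
augmenter = naw.ContextualWordEmbsAug(
    model_path="bert-base-uncased", action="substitute", aug_p=0.2
)

text = "pandemic forces tech conference to go fully remote this year"
augmented = augmenter.augment(text)  # recent nlpaug versions return a list
print(augmented)
```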

Also, for your problem, in addition to the data augmentation, I do encourage you to check possible causes 1) and 2) in my response to your original question #4, which are more likely to be the source of the problems you encountered.

As for the sentence embedding matrix, shouldn't it be (m, 768) instead, where m indicates the batch size? BERT-flow seems to focus on pairwise semantic similarity only, and I'm not sure whether its claims generalize to categorical data. I would encourage a t-SNE plot of the (Distil)BERT embeddings instead; a minimal sketch follows.
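(Illustrative sketch only; it assumes scikit-learn and matplotlib, embeddings of shape (m, 768), and integer cluster labels.)

```python
# Illustrative t-SNE plot of sentence embeddings, colored per cluster so that
# any separation (or lack of it) between clusters is visible.
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(embeddings, labels, out_path="tsne.png"):
    coords = TSNE(n_components=2, init="pca", random_state=0).fit_transform(embeddings)
    plt.figure(figsize=(6, 6))
    plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=4)
    plt.title("t-SNE of sentence embeddings")
    plt.savefig(out_path, dpi=150)
```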


yanhan19940405 commented Jul 30, 2021 via email

@yanhan19940405

Sorry, I just noticed your last reply. Yes, 128 is the embedding size, obtained by a linear transformation from the 768 dimensions. m represents the total number of samples (sorry, I did not state that clearly).


rajat-tech-002 commented Aug 16, 2021


@Dejiao2018, the idea in the paper is quite good; I like the approach.
So, are all the results reported in the paper obtained with the BERT embeddings and KMeans rather than with the clustering head? What was the reason for not reporting results with the clustering head? Was the ACC with the clustering head always lower than with KMeans? Thanks.

@1085737319


What are the parameter settings for the SearchSnippets dataset?
