Hi!
I have a question regarding the full DSC dataset. The CTR paper states that each domain has 2,500 positive and 2,500 negative reviews for training, but the dataset itself (at least the 10 domains chosen in the paper) contains at most 2,000 of each, and fewer in some domains. Is this expected?
Also, the two scenarios (dil_classification, til_classification) are not completely clear to me. If I understood the code correctly, DIL doesn't use task ids while TIL does. Which scenario should be used with the DSC dataset?
Finally, we have been able to reproduce some results on the DSC dataset, mostly within a couple of points of the table (in this repo and in the CTR paper), but frozen-BERT NCL consistently scores 10-15% higher than reported. We currently get an average accuracy of 0.8772 over 5 runs with different sequence seeds. Any idea why this naive approach would outperform the reported numbers?