Changes on #619 #638

Merged
merged 5 commits into from
Jan 12, 2023

sanagno (Collaborator) commented Jan 11, 2023

@ekurtulus I made some changes on #619. Feel free to edit/remove anything.

theblackcat102 (Collaborator) left a comment

Code overall doesn't have much issue. Any result on polyloss-1?

@theblackcat102 theblackcat102 merged commit f58f473 into main Jan 12, 2023
@theblackcat102 theblackcat102 deleted the ekurtulus/main branch January 12, 2023 00:23
ekurtulus (Contributor) commented Jan 12, 2023

> Code overall doesn't have much issue. Any result on polyloss-1?

Yes, I ran some experiments fine-tuning RoBERTa-Large on GLUE. In that case, PolyLoss improved the results quite a bit:
[image]
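
For readers unfamiliar with the Poly-1 variant asked about above: it adds one extra term to cross-entropy, giving `-log(p_t) + epsilon * (1 - p_t)`, where `p_t` is the predicted probability of the correct class. The following is a minimal pure-Python sketch for a single example, not the code from this PR; the function name `poly1_cross_entropy` and the default `epsilon=1.0` are assumptions made for illustration.

```python
import math


def poly1_cross_entropy(logits, target, epsilon=1.0):
    """Poly-1 loss for one example: cross-entropy + epsilon * (1 - p_t).

    `logits` is a list of floats, `target` is the index of the correct class.
    The name and the default epsilon are illustrative assumptions.
    """
    # Log-softmax computed stably by subtracting the largest logit first.
    m = max(logits)
    log_total = math.log(sum(math.exp(z - m) for z in logits))
    log_p_t = (logits[target] - m) - log_total

    p_t = math.exp(log_p_t)   # probability assigned to the correct class
    cross_entropy = -log_p_t  # standard cross-entropy term
    return cross_entropy + epsilon * (1.0 - p_t)
```

With `epsilon=0.0` this reduces to plain cross-entropy, so the extra term can be switched off to compare against a baseline.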

Also, there is Sharpness-Aware Minimization (SAM), which we could use during fine-tuning. There is a paper showing that it improves generalization. To implement this, the _inner_training_loop function of the Hugging Face Trainer must be overridden. I can do this, but I did not want to implement it before getting feedback on the idea.
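
To make the idea concrete, here is a rough sketch of a single SAM update on a toy problem. It is plain Python rather than a Trainer override, and `sam_step`, `quad_grad`, `lr`, and `rho` are hypothetical names and values chosen for illustration. The point it shows is that each update needs two gradient evaluations, one at the current weights and one at a perturbed point, which is why the training loop itself has to change.

```python
import math


def quad_grad(w):
    """Gradient of the toy loss sum(w_i ** 2); a stand-in for a real model."""
    return [2.0 * wi for wi in w]


def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM update on a list of floats (illustrative sketch only)."""
    # First gradient evaluation, at the current weights.
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # zero gradient: no perturbation direction, no update

    # Ascent step: move distance rho along the normalized gradient.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]

    # Second gradient evaluation, at the perturbed weights.
    g_adv = grad_fn(w_adv)

    # Descent step: apply that gradient to the ORIGINAL weights.
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


new_w = sam_step([1.0, 0.0], quad_grad)
```

In a real fine-tuning run the two gradient evaluations would each be a forward and backward pass over the same batch, so a step costs roughly twice as much as a standard one.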

sanagno (Collaborator, Author) commented Jan 12, 2023

Very cool results! Haven't tried SAM before, but I guess it doesn't hurt to try. I would say it is not a priority, but we can parallelize things for sure!

@sanagno sanagno added the ml label Jan 12, 2023
4 participants