Unbalanced classes #272

Open
arthurdouillard opened this issue Jun 16, 2022 · 0 comments


From @umbertocappellazzo:

Well, the split into train, test and valid has been made by the authors who created the corpus and I don't know whether they crafted then different sets. Since I'm the first to use FSC in a CL scenario, I think it could be ok to proceed in this way, and I understand your rigorousness for this matter. So, you have the last word about this.
I'll take advantage of this thread to ask one thing: does Continuum handle unbalanced classes for rehearsal? I had a look and I suppose not, but I want to be sure. If the dataset contains unbalanced classes, it isn't fair to keep the same number of samples for each class. If Continuum doesn't cover this case, I can come up with a solution for my project and then open a PR (if you think it's worth it).

I'd see two solutions:

  • either use a sampler given to the data loader to {over,under}-sample classes
  • or use a custom RehearsalMemory where you'd allow sampling a different amount of samples per class (not sure this very particular case is worth adding to Continuum though)
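
To make the two options a bit more concrete, here is a rough, library-agnostic sketch in plain Python. It is not Continuum's API: `inverse_frequency_weights` and `proportional_quotas` are hypothetical helper names made up for illustration. The first computes per-sample weights for {over,under}-sampling, the second splits a fixed memory budget across classes in proportion to their frequency, which is one possible reading of "a different amount of samples per class".

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-sample weights such that every class has the same total weight.

    Hypothetical helper: drawing indices with these weights (with
    replacement) over-samples rare classes and under-samples frequent ones.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def proportional_quotas(labels, memory_size):
    """Split a memory budget across classes in proportion to class frequency.

    Hypothetical helper: each class seen gets at least one slot, the rest
    of the budget is shared proportionally, and slots left over after
    rounding down go to the classes with the largest remainders.
    """
    counts = Counter(labels)
    if memory_size < len(counts):
        raise ValueError("memory_size must be at least the number of classes")
    total = sum(counts.values())
    spare = memory_size - len(counts)  # budget left after one slot per class
    # Integer (quotient, remainder) of each class's proportional share.
    shares = {c: divmod(spare * n, total) for c, n in counts.items()}
    quotas = {c: 1 + q for c, (q, _) in shares.items()}
    leftover = memory_size - sum(quotas.values())
    by_remainder = sorted(shares, key=lambda c: shares[c][1], reverse=True)
    for c in by_remainder[:leftover]:
        quotas[c] += 1
    return quotas
```

For the sampler route, the weights could be handed to something like PyTorch's `torch.utils.data.WeightedRandomSampler(weights, num_samples, replacement=True)` and that sampler passed to the data loader. For the memory route, the quotas would replace the fixed per-class count inside a custom memory class; how that plugs into `RehearsalMemory` is left open here.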