Remove hideInDatasets for multimodal tasks #1495

merveenoyan · 2025-05-26T10:38:23Z

More and more datasets are showing up for multimodal tasks, and some authors are picking the wrong task tags because hideInDatasets is true for these tasks, so this PR removes those flags.
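To make the effect of the flag concrete, here is a minimal sketch of how a hideInDatasets-style flag could gate which task tags are offered to dataset authors. The TaskInfo shape, the TASKS table, and both helper functions are hypothetical and written for illustration; they are not the repository's actual types or code, and the flag values shown for the "before" state are an assumption based on the tasks discussed in this PR.

```typescript
// Hypothetical, simplified task metadata (illustration only).
interface TaskInfo {
  name: string;
  // When true, the task is not offered as a task tag for datasets.
  hideInDatasets?: boolean;
}

// Assumed state before this PR: three multimodal tasks are hidden,
// while video-text-to-text is already visible.
const TASKS: Record<string, TaskInfo> = {
  "image-text-to-text": { name: "Image-Text-to-Text", hideInDatasets: true },
  "any-to-any": { name: "Any-to-Any", hideInDatasets: true },
  "visual-document-retrieval": { name: "Visual Document Retrieval", hideInDatasets: true },
  "video-text-to-text": { name: "Video-Text-to-Text" },
};

// Task ids that would be offered when tagging a dataset.
function datasetTaskOptions(tasks: Record<string, TaskInfo>): string[] {
  return Object.keys(tasks).filter((id) => !tasks[id].hideInDatasets);
}

// Returns a copy of the table in which the listed tasks are no longer hidden.
function showInDatasets(
  tasks: Record<string, TaskInfo>,
  ids: string[],
): Record<string, TaskInfo> {
  const result: Record<string, TaskInfo> = {};
  for (const id of Object.keys(tasks)) {
    const unhide = ids.indexOf(id) !== -1;
    result[id] = unhide ? { ...tasks[id], hideInDatasets: false } : tasks[id];
  }
  return result;
}

const visibleBefore = datasetTaskOptions(TASKS);
const visibleAfter = datasetTaskOptions(
  showInDatasets(TASKS, ["image-text-to-text", "any-to-any", "visual-document-retrieval"]),
);
console.log(visibleBefore.length, visibleAfter.length);
```

In this sketch only video-text-to-text is selectable at first, which mirrors the concern that authors fall back to a wrong tag when the right one is not offered.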

Vaibhavs10

Do we have any intuition of how many such cases there are?

pcuenca · 2025-05-26T14:55:38Z

Yes, a few examples could be great for better understanding. I see, for example, this one that could possibly be assigned an image-text-to-text tag, but I wonder if other VQA datasets, such as the Cauldron, should have the same.

pcuenca

Spoke offline with Merve and had another look at things.

I'd be supportive of merging, given that:

- The tasks affected by this PR have non-empty datasets: image-text-to-text (90), any-to-any (13), visual-document-retrieval (8)
- video-text-to-text is already displayed in the filter bar
- The tasks here are incomplete (I'm not sure if they are populated from this file or not)
- @merveenoyan's intuition is that people pick from whatever options are offered/visible

But please, let's wait for vb to come back and see if he has additional insight!

julien-c

no objection!

Vaibhavs10

Thanks for pulling the numbers! My only suggestion would be to tag a few more datasets for the following:

any-to-any (13), visual-document-retrieval (8)

at least so we have one page full of datasets.

pcuenca · 2025-06-20T12:58:29Z

Can we maybe merge this PR? We can always iterate later.

merveenoyan · 2025-06-20T13:04:44Z

Sorry, with the releases I couldn't work on this. @pcuenca, I'm currently opening automatic PRs to a lot of models; I think it's OK to merge this.

merveenoyan · 2025-06-20T13:50:34Z

I have opened more than 100 PRs, merging this, thanks a ton!

remove flags

d7e264b

merveenoyan requested review from SBrandeis, gary149, Wauplin, julien-c, pcuenca and ngxson as code owners May 26, 2025 10:38

Vaibhavs10 reviewed May 26, 2025

pcuenca approved these changes Jun 4, 2025

julien-c approved these changes Jun 4, 2025

Vaibhavs10 approved these changes Jun 5, 2025

Merge branch 'main' into mm-datasets

3abcaf0

Vaibhavs10 approved these changes Jun 20, 2025

merveenoyan merged commit 0e2b369 into main Jun 20, 2025
4 of 5 checks passed

merveenoyan deleted the mm-datasets branch June 20, 2025 13:50
