
FLAVA code #1219

Open
sameeravithana opened this issue Mar 25, 2022 · 4 comments

Comments

@sameeravithana

The original FLAVA paper [1] cites MMF for the implementation. We want to check whether the FLAVA implementation is accessible in this codebase.

[1] Singh, Amanpreet, et al. "FLAVA: A Foundational Language And Vision Alignment Model." arXiv preprint arXiv:2112.04482 (2021).

@apsdehal
Contributor

Hi,

The FLAVA codebase is on track to be released via the torchmultimodal library. I will reply to this issue by the end of this week with further instructions.

@PeterDykas

Why are there going to be two different repositories for multimodal models? What will the difference be between TorchMultimodal and MMF?

@kartikayk

Thanks for the question! We will have more detailed communication around this, but a quick note here. MMF currently supports text + image understanding tasks, with some initial support for video understanding models added recently. We have received feedback from the community that MMF is slowly becoming over-engineered and that the layers of inheritance make it hard to use components outside of MMF. It's also getting harder to add support for new tasks (e.g., generation), support recent trends like model scaling, and extend to new modalities (audio, for example).

As we rethink the multimodal ecosystem in PyTorch, we will look to evolve MMF into a library for text + image understanding (refactor the models into PyTorch components, deprecate the trainers and config system, etc.) and provide more general support for combining modalities and tasks through TorchMultimodal. Our goal is to provide a collection of examples in TorchMultimodal that bring together components and infrastructure from all over the ecosystem, including MMF, for training multitask multimodal models at scale. As such, TorchMultimodal is designed with extensibility and composability in mind, which makes it easy to add new modalities (and tasks) or reuse components in other frameworks. The first example of this is the official release of FLAVA in TorchMultimodal. We don't plan on adding this to MMF.
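
To make the composability point concrete, usage might eventually look something like the sketch below (the import path, constructor, and call signature are illustrative assumptions, not a finalized API):

```python
# Minimal sketch; the names below are assumptions, not a confirmed API.
import torch

# Hypothetical import path for a FLAVA model with a classification head.
from torchmultimodal.models.flava.model import flava_model_for_classification

# Build FLAVA with a classification head, e.g. for a binary task.
model = flava_model_for_classification(num_classes=2)
model.eval()

# Dummy multimodal inputs: a batch of token ids and a 224x224 RGB image.
text = torch.randint(0, 30522, (1, 77))   # (batch, seq_len) token ids
image = torch.randn(1, 3, 224, 224)       # (batch, channels, H, W)
labels = torch.tensor([1])

with torch.no_grad():
    # Hypothetical forward signature returning logits (and loss when labels are given).
    output = model(text=text, image=image, labels=labels)
print(output.logits.shape)  # expected: (1, num_classes)
```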

As I mentioned, we will share a more detailed communication around this soon!

@PeterDykas

Thanks for the reply, that makes sense. Looking forward to the FLAVA implementation in TorchMultimodal.
