Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)
This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.
Multimodal Co-Attention Transformer for Survival Prediction in Gigapixel Whole Slide Images - ICCV 2021
Code for selecting an action based on multimodal inputs; here the inputs are voice and text.
MIntRec: A New Dataset for Multimodal Intent Recognition (ACM MM 2022)
Creating multimodal multitask models
Multimodal sentiment analysis using hierarchical fusion with context modeling
[CVAMD 2021] "End-to-End Learning of Fused Image and Non-Image Feature for Improved Breast Cancer Classification from MRI"
Few-shot malware classification framework using fused (hybrid) features from static and dynamic analysis
This repository contains the dataset and baselines described in the paper: M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations
FusionBrain Challenge 2.0: creating a multimodal multitask model
Deep-HOSeq: Deep Higher-Order Sequence Fusion for Multimodal Sentiment Analysis.
Multimodal sentiment analysis
Official implementation of "Multi-scale Bottleneck Transformer for Weakly Supervised Multimodal Violence Detection"
Source code for the paper by Alfreds Lapkovskis, Natalia Nefedova & Ali Beikmohammadi (2024): Automatic Fused Multimodal Deep Learning for Plant Identification
A generalized self-supervised training paradigm for unimodal and multimodal alignment and fusion.
Code and data for the paper "Inferring Climate Change Stances from Multimodal Tweets", accepted to the short paper track of SIGIR 2024
Source code of a sample iOS app for the paper by Alfreds Lapkovskis, Natalia Nefedova & Ali Beikmohammadi (2024): Automatic Fused Multimodal Deep Learning for Plant Identification
[FR|EN - Trio] 2023 - 2024 Centrale Méditerranée AI Master | Multimodal transcription with text, audio and video
Repo for "Centaur: Robust Multimodal Fusion for Human Activity Recognition"