The Multimodal Model for Vietnamese Visual Question Answering (ViVQA)
Updated Jul 29, 2024 - Python
Enhancing Vietnamese VQA through Curriculum Learning on Raw and Augmented Text Representations (AAAIW 2025)