MultiCLIP: A framework for multimodal, multilabel, multistage classification built on advanced pretrained models such as CLIP and BLIP.
An API for automated disease detection and report generation from medical images.
Python code for vision tasks using the Microsoft Phi-3 Vision model via the Hugging Face library. It demonstrates generating textual responses from image content, integrating a vision-language model for tasks such as image analysis and descriptive text generation.
Visual Question Answering project as a part of 11-777 course requirements
A repository for the article "Semiotically-grounded distant viewing of diagrams: insights from two multimodal corpora" published in Digital Scholarship in the Humanities (2022)
A repository for the article "Corpus-based insights into multimodality and genre in primary school science diagrams" published in Visual Communication (2023)
Predicting adult-site user numbers from multimodal sources (image, text, and tags).
A Python script to automatically upload multimodal data into a repovizz repository, developed under the TELMI project at the MTG, Universitat Pompeu Fabra.
Repository for the conference article "Enhancing the AI2 Diagrams dataset using Rhetorical Structure Theory", published in the Proceedings of the 11th International Language Resources and Evaluation Conference.
Vision Language Dataset Construction Library for Remote Sensing Domain
Jittor reimplementation of DiverseSampling (MM22)
Analysing Adversarial Loss of Social GAN
[FR|EN - Trio] 2023 - 2024 Centrale Méditerranée AI Master | Multimodal transcription with text, audio, and video
[CVMI 2022] Multimodal Controller for Generative Models
An exploration of generating multi-sentence image descriptions by leveraging the latent dependencies between visual concepts in an image and their textual counterparts.
Experiments with Multi-Modal Causal Attention combined with Multi-Grouped Query Attention.
Integrating machine learning and multimodal neuroimaging to detect schizophrenia at the level of the individual.
[IN PROGRESS] Multimodal feature extraction modules to ease research and improve reproducibility.
Code and Dataset for paper "On the Role of Images for Analyzing Claims in Social Media" @2nd International Workshop on Cross-lingual Event-centric Open Analytics (CLEOPATRA) co-located with The Web Conf 2021