An easy implementation of Faster R-CNN (https://arxiv.org/pdf/1506.01497.pdf) in PyTorch.
Updated Jul 3, 2020 - Python
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)
An easy implementation of FPN (https://arxiv.org/pdf/1612.03144.pdf) in PyTorch.
Real-time semantic image segmentation on mobile devices
Image captioning with an LSTM or Transformer in PyTorch
Convert segmentation binary mask images to COCO JSON format.
PyTorch implementation of the paper "Self-critical Sequence Training for Image Captioning"
PyTorch implementation of the paper "Fine-Grained Image Captioning with Global-Local Discriminative Objective"
Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval [ECCV 2020]
Image caption generation using a GRU-based attention mechanism
Microsoft COCO: Common Objects in Context for Hugging Face Datasets
Caption generation from images using topics as additional guiding inputs.
An end-to-end vision and language model incorporating explicit knowledge graphs and OOD-detection.
Mixed vision-language Attention Model that gets better by making mistakes
Analysis of Image Captioning Models
Augment the MS COCO training set while training NIC
Create a YOLO-format subset of the COCO dataset
A simple Python API (built on top of TensorFlow) for neural image captioning with MSCOCO data.
COCO-Stuff dataset for Hugging Face Datasets
Image captioning with pretrained encoder on MSCOCO.