An easy implementation of Faster R-CNN (https://arxiv.org/pdf/1506.01497.pdf) in PyTorch.
Updated Jul 3, 2020 - Python
Using an LSTM or Transformer to solve image captioning in PyTorch
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)
An easy implementation of FPN (https://arxiv.org/pdf/1612.03144.pdf) in PyTorch.
Real-time semantic image segmentation on mobile devices
Convert segmentation binary mask images to COCO JSON format.
A PyTorch implementation of “Fine-Grained Image Captioning with Global-Local Discriminative Objective”
Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval [ECCV 2020]
PyTorch implementation of the paper "Self-critical Sequence Training for Image Captioning"
Caption generation from images using topics as additional guiding inputs.
Image caption generation using GRU-based attention mechanism
Create a YOLO-format subset of the COCO dataset
Microsoft COCO: Common Objects in Context for Hugging Face Datasets
Analysis of Image Captioning Models
A simple Python API (built on top of TensorFlow) for neural image captioning with MSCOCO data.
Mixed vision-language Attention Model that gets better by making mistakes
A helper library for easily converting MSCOCO-format data using the loading script of Hugging Face Datasets.
Object Detection Dataset Format Converter
COCO-Stuff dataset for Hugging Face Datasets
Image captioning with pretrained encoder on MSCOCO.