Meshed-Memory Transformer for Image Captioning. CVPR 2020
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019
A neural network that generates captions for an image using a CNN encoder and an RNN decoder with beam search.
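Beam search, as used by the CNN+RNN captioner above, keeps the k highest-scoring partial captions at each decoding step instead of greedily taking the single best token. A minimal self-contained sketch (the `step_fn` and toy vocabulary here are hypothetical stand-ins; in a real captioner `step_fn` would wrap the RNN decoder conditioned on CNN image features):

```python
import heapq
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    """Decode a sequence with beam search.

    step_fn(seq) must return a list of (token, log_prob) continuations
    for the partial sequence `seq`.
    """
    beams = [(0.0, [start_token])]  # (cumulative log-prob, sequence)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == end_token:          # finished beams carry over unchanged
                candidates.append((score, seq))
                continue
            for tok, logp in step_fn(seq):
                candidates.append((score + logp, seq + [tok]))
        # keep only the top-k scoring sequences
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
        if all(seq[-1] == end_token for _, seq in beams):
            break
    return max(beams, key=lambda c: c[0])[1]

# Toy "language model" table standing in for a trained decoder
def toy_step(seq):
    table = {
        "<s>":  [("a", math.log(0.6)), ("the", math.log(0.4))],
        "a":    [("cat", math.log(0.7)), ("dog", math.log(0.3))],
        "the":  [("cat", math.log(0.5)), ("dog", math.log(0.5))],
        "cat":  [("sits", math.log(0.9)), ("<end>", math.log(0.1))],
        "dog":  [("runs", math.log(0.9)), ("<end>", math.log(0.1))],
        "sits": [("<end>", math.log(1.0))],
        "runs": [("<end>", math.log(1.0))],
    }
    return table[seq[-1]]

print(beam_search(toy_step, "<s>", "<end>"))
```

With a beam width of 1 this reduces to greedy decoding; wider beams trade compute for a better chance of finding the globally highest-probability caption.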
A modular library built on top of Keras and TensorFlow to generate a caption in natural language for any input image.
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. CVPR 2023
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision
Display an image and text file side-by-side for easy manual caption editing.
A tool for downloading from public image boards (which allow scraping), previewing your images & tags, and editing your images & tags. Additional tabs let you download other desired code repositories as well as state-of-the-art diffusion and CLIP models. Custom datasets can be added!
A simple implementation of neural image caption generator
CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022
Code for our recently published attack, FDA: Feature Disruptive Attack. Colab notebook: https://colab.research.google.com/drive/1WhkKCrzFq5b7SNrbLUfdLVo5-WK5mLJh
End-to-end deep learning model for image captioning
Caption generator for live camera feed
A Qt5 GUI to assist in captioning images for Stable Diffusion
Evaluating Visual Fidelity of Image Descriptions
Smart-I is an android application aimed at helping the visually impaired using artificial intelligence and cloud computing.
oCaption: Leveraging OpenAI's GPT-4 Vision for Advanced Image Captioning
An aid for the blind. This AI describes the surroundings, identifies who is in front of the user (if that person is known to the AI, via facial recognition), and reads out any written text (optical character recognition).
BlipProcessor and BlipForConditionalGeneration are classes from the Hugging Face Transformers library for BLIP, a transformer-based vision-language model for conditional text generation such as image captioning.
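A minimal sketch of captioning an image with these two classes, assuming the `transformers` and `Pillow` packages are installed (the checkpoint name and sample image URL are illustrative choices, not mandated by any repo above):

```python
import requests
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Load the pretrained BLIP captioning checkpoint and its matching processor
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

# Any RGB image works; here we fetch a sample COCO photo
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Preprocess the image, generate token ids, and decode them to text
inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
caption = processor.decode(out[0], skip_special_tokens=True)
print(caption)
```

Passing an optional text prompt to the processor (e.g. `text="a photography of"`) switches BLIP from unconditional to prompted captioning.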
Public repo for the paper: "COSMic: A Coherence-Aware Generation Metric for Image Descriptions" by Mert İnan, Piyush Sharma, Baber Khalid, Radu Soricut, Matthew Stone, Malihe Alikhani