🖼️ Image captioning and visual question answering with BLIP and LLaVA, with a reproducible setup and demos.
nlp benchmark computer-vision spice pytorch coco image-captioning pretrained-models vlm visual-question-answering visual-question huggingface hugging-face vqav2 llm vision-language-model llava pycocoevalcap
Updated Sep 8, 2025 - Python