Developed an image captioning system using the BLIP model to generate detailed, context-aware captions. Achieved an average BLEU score of 0.72, providing rich descriptions that enhance accessibility and inclusivity.
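A corpus metric like the BLEU 0.72 reported above compares generated captions against reference captions via clipped n-gram precision and a brevity penalty. A minimal single-reference sentence-BLEU sketch (pure Python; the example sentences are hypothetical, and real evaluations would use a library implementation with proper smoothing):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified BLEU: geometric mean of clipped 1..max_n-gram precisions
    times a brevity penalty. Zero-count precisions fall back to a tiny
    epsilon instead of proper smoothing, so scores skew low on short text."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# An identical caption scores 1.0; a close paraphrase scores in between.
print(sentence_bleu("a dog is running in the park", "a dog runs in the park"))
```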
Bootstrapping Language-Image Pretraining (BLIP): leverages both text and image data to improve AI models' understanding and generation of image descriptions, bridging the gap between natural language and visual content (NLP and computer vision).
This project processes videos by extracting frames, generating detailed visual descriptions for each frame using the BLIP model, and then summarizing these descriptions with the BART model.
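The first stage of that pipeline, picking which frames to caption, can be sketched as a pure function over the video's frame count and frame rate (function and parameter names here are hypothetical; the actual project may sample differently):

```python
def sample_frame_indices(total_frames, fps, every_n_seconds=2.0):
    """Return evenly spaced frame indices: one frame every
    `every_n_seconds` of video. Each selected frame would then be
    decoded, captioned with BLIP, and the captions concatenated and
    passed to BART for summarization."""
    step = max(int(round(fps * every_n_seconds)), 1)
    return list(range(0, total_frames, step))

# A 10-second clip at 30 fps, sampled every 2 seconds -> 5 frames.
print(sample_frame_indices(300, 30))
```

Keeping sampling separate from decoding lets the captioning stage process only a handful of frames per video instead of every frame.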
CaptionCraft is an innovative tool that leverages generative AI to create catchy, engaging captions for any image. Users can upload an image or provide an image URL, and CaptionCraft will analyze the visual content to generate a tailored caption that makes your images stand out.
This project develops a simple toy programming language dubbed "Blip". Its initial phase (Phase A) focuses on parsing and translating basic straight-line code built from fundamental assignment (var, set) and output (output, text) commands.
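A straight-line language with just those four commands can be interpreted in a single pass, one line per command. The semantics below are assumed for illustration, since the project's actual grammar is not shown here:

```python
def run_blip(source):
    """Interpret straight-line Blip code, one command per line.
    Assumed (hypothetical) semantics:
      var x 5      -- declare x with integer value 5
      set x 7      -- reassign an already-declared x
      output x     -- emit the value of x
      text hello   -- emit the literal words
    Returns the list of emitted lines."""
    env = {}
    out = []
    for lineno, line in enumerate(source.splitlines(), 1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        cmd, args = parts[0], parts[1:]
        if cmd == "var":
            name, value = args
            if name in env:
                raise NameError(f"line {lineno}: '{name}' already declared")
            env[name] = int(value)
        elif cmd == "set":
            name, value = args
            if name not in env:
                raise NameError(f"line {lineno}: '{name}' not declared")
            env[name] = int(value)
        elif cmd == "output":
            out.append(str(env[args[0]]))
        elif cmd == "text":
            out.append(" ".join(args))
        else:
            raise SyntaxError(f"line {lineno}: unknown command '{cmd}'")
    return out

print(run_blip("var x 5\nset x 7\noutput x\ntext all done"))
```

Because the code is straight-line (no branches or loops), a single environment dictionary and a top-to-bottom scan are all the interpreter needs.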
MultiCLIP: A framework for multimodal, multi-label, multi-stage classification utilizing advanced pretrained models like CLIP and BLIP.
PictoSymphony is an innovative image processing application that leverages cutting-edge technologies to provide a unique and artistic experience. It combines the power of Salesforce's blip-image-captioning-large model for image description with CompVis's stable-diffusion-v1-4 for image generation.