You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is an innovative project aimed at enhancing the visual experience for individuals with impairments. Leveraging machine learning and natural language processing, this repository houses the codebase for generating efficient and coherent natural language descriptions of captured images. The project integrates seamlessly with image recognition,
[CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades conventional diffusion model with additional semantic prior.