<a href="https://colab.research.google.com/github/codeREXus/langchain-learnings/blob/main/mini_projs/image_captioning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🖼️ Image Captioning with Salesforce BLIP

In this mini project, we build an **image captioning system** that generates a short descriptive caption for an input image.  
We use the **Salesforce BLIP (Bootstrapping Language-Image Pre-training) model** and wrap it in a simple **Gradio interface** for interactive use.

---

## 📌 Project Overview
- Load the **pre-trained BLIP captioning model** (`Salesforce/blip-image-captioning-base`) and its processor with Hugging Face Transformers.
- Provide an input image.
- Generate a **natural language caption** that describes the image.
- Launch a **Gradio demo** for an easy-to-use interface.
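
The load, preprocess, generate, and decode steps above can be sketched as one small function. This is a minimal sketch rather than the notebook's own implementation: the name `caption_with`, the optional `prompt` argument, and the default of 30 for `max_new_tokens` are illustrative assumptions. It expects `processor` and `model` to be an already-loaded `BlipProcessor` and `BlipForConditionalGeneration` pair. Passing a text prefix alongside the image asks BLIP to continue that prefix instead of captioning from scratch.

```python
from typing import Any, Optional


def caption_with(processor: Any, model: Any, image: Any,
                 prompt: Optional[str] = None, max_new_tokens: int = 30) -> str:
    """Run one captioning pass with an already-loaded processor/model pair.

    `caption_with`, `prompt`, and the default `max_new_tokens` are
    illustrative choices, not part of the original notebook.
    """
    # Step 1: turn the image (and the optional text prefix) into model inputs.
    if prompt is None:
        inputs = processor(images=image, return_tensors="pt")
    else:
        inputs = processor(images=image, text=prompt, return_tensors="pt")
    # Step 2: generate token ids; max_new_tokens caps the caption length.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Step 3: decode the first generated sequence back into plain text.
    return processor.decode(output_ids[0], skip_special_tokens=True).strip()
```

With the notebook's `processor` and `model` in scope, `caption_with(processor, model, image)` gives an unconditional caption, while `caption_with(processor, model, image, prompt="a photography of")` steers the output to start with that prefix.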

---

## 🛠️ Tools & Libraries Used
- **[Transformers](https://huggingface.co/docs/transformers/index)** – For the BLIP model & processor  
- **[PyTorch](https://pytorch.org/)** – Deep learning backend  
- **[Gradio](https://www.gradio.app/)** – Web interface for interaction  
- **[Pillow](https://pillow.readthedocs.io/)** – Image objects passed from Gradio to the model  




In [None]:
%%capture
!pip install transformers torch gradio

In [None]:
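import gradio as gr
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

MODEL_ID = "Salesforce/blip-image-captioning-base"

# Download (or load from the local cache) the processor and model weights once.
processor = BlipProcessor.from_pretrained(MODEL_ID)
model = BlipForConditionalGeneration.from_pretrained(MODEL_ID)

def generate_caption(image: Image.Image) -> str:
    """Generate a caption for a PIL image with BLIP."""
    # Preprocess the image into PyTorch tensors ("pt") the model can consume.
    inputs = processor(images=image, return_tensors="pt")
    # Generate caption token ids with the model's default generation settings.
    outputs = model.generate(**inputs)
    # Decode the first (and only) sequence, dropping special tokens.
    caption = processor.decode(outputs[0], skip_special_tokens=True)
    return caption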

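def caption_image(image):
    """
    Gradio callback: takes a PIL Image (or None) and returns a caption
    or a readable error message.
    """
    # Gradio passes None when the user submits without uploading an image.
    if image is None:
        return "Please upload an image first."
    try:
        return generate_caption(image)
    except Exception as e:
        # Show the failure in the UI instead of raising.
        return f"An error occurred: {e}"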

iface = gr.Interface(
    fn=caption_image,
    inputs=gr.Image(type="pil"),
    outputs="text",
    title="Image Captioning with BLIP",
    description="Upload an image to generate a caption."
)

iface.launch(server_name="127.0.0.1", server_port=7860)