# 🖼️📝 Image Caption Generator
This activity uses BLIP, a pre-trained vision-language model loaded through Hugging Face `transformers`, to generate a text caption for an image.

## ✅ Skills Covered
- Vision-language model usage
- Image understanding
- Caption generation
- Using `transformers` for image-to-text tasks

In [None]:
# 📦 Install required libraries (torch is needed because the model runs on PyTorch tensors)
!pip install transformers torch pillow requests --quiet

In [None]:
# 📥 Load an image from the web
from PIL import Image
import requests
from transformers import BlipProcessor, BlipForConditionalGeneration

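url = "https://images.pexels.com/photos/1108099/pexels-photo-1108099.jpeg"
response = requests.get(url, stream=True, timeout=30)
response.raise_for_status()  # stop early if the download failed
image = Image.open(response.raw).convert("RGB")  # make sure the image has 3 RGB channels
display(image)  # render inline in the notebook; image.show() opens an external viewer instead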

In [None]:
# 🧠 Generate caption using BLIP model
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

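# Preprocess the image into PyTorch tensors ("pt")
inputs = processor(images=image, return_tensors="pt")
# Generate caption token IDs; max_new_tokens caps the caption length
out = model.generate(**inputs, max_new_tokens=30)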

print("\nGenerated Caption:", processor.decode(out[0], skip_special_tokens=True))
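As an optional extension, the preprocess, generate, and decode steps above can be wrapped in one reusable helper. This is a minimal sketch rather than part of the original activity: `caption_image` and its `prompt` and `max_new_tokens` parameters are hypothetical names chosen for illustration. The `prompt` option assumes the BLIP processor accepts a `text` argument alongside `images`, which is how the model card for `Salesforce/blip-image-captioning-base` demonstrates conditional captioning.

```python
def caption_image(image, processor, model, prompt=None, max_new_tokens=30):
    """Return a caption for `image`.

    `processor` and `model` are expected to be the BLIP objects loaded above.
    `prompt` (hypothetical parameter) optionally gives the caption a starting phrase.
    """
    if prompt is None:
        # Unconditional captioning: the model describes the image freely
        inputs = processor(images=image, return_tensors="pt")
    else:
        # Conditional captioning: the caption continues from `prompt`
        inputs = processor(images=image, text=prompt, return_tensors="pt")

    # Generate token IDs, then decode the first (and only) sequence to text
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True).strip()


# Usage, assuming `image`, `processor`, and `model` from the cells above:
# print(caption_image(image, processor, model))
# print(caption_image(image, processor, model, prompt="a photo of"))
```

Passing the processor and model in as arguments means they are loaded once and reused across many images, which avoids repeating the slow `from_pretrained` calls.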