<a href="https://colab.research.google.com/github/SanketDevmunde/GEN_AI_Assignment/blob/main/GenAi_6%267_new.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



### 📝 **Assignment Summary: AI Image Generation & Captioning**

This project chains two hosted AI services into a small text-to-image-to-caption pipeline:

1. **Image Generation**  
   - Uses Stability AI's `Stable Diffusion 3.5 Large` model, served through the Hugging Face Inference API.  
   - Generates an image based on a text prompt (e.g., *"Doraemon and Nobita in anime style"*).  
   - The output image is saved as a PNG file and displayed using Python's PIL (Pillow) library.

2. **Image Captioning**  
   - Uses Google's `Gemini 1.5 Flash` model.  
   - Takes the generated image and produces a one-sentence description.  
   - Demonstrates a multimodal workflow: the output of a text-to-image model becomes the input of a vision-language model.

3. **Secure API Management**  
   - API keys (for Hugging Face and Gemini) are stored in Google Colab Secrets and read at runtime with `userdata.get`, so they never appear in the notebook source.
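
The notebook reads its keys with `userdata.get`, which is only available inside a Colab runtime. As a minimal sketch, the hypothetical helper `load_secret` below (not part of the original notebook) shows one way to keep the same code usable outside Colab: it falls back to an environment variable of the same name. That fallback is an assumption about how the code might be run locally, not something the assignment requires.

```python
import os


def load_secret(name: str) -> str:
    """Return a secret from Colab Secrets, or from an environment variable outside Colab.

    `load_secret` is a hypothetical helper introduced for illustration.
    """
    try:
        # google.colab can only be imported inside a Colab runtime.
        from google.colab import userdata
        value = userdata.get(name)
    except ImportError:
        # Assumption: outside Colab the key is exported as an environment variable.
        value = os.environ.get(name)

    if not value:
        raise RuntimeError(f"Secret {name!r} is not set")
    return value
```

With this helper, the key-loading lines would read `HF_API_TOKEN = load_secret('HF_TOKEN')` and `GEMINI_API_KEY = load_secret('Gemini_API_Key')`. Failing early with a clear message is usually easier to debug than a later authorization error from either API.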

---


In [4]:
import requests
from PIL import Image
from io import BytesIO
import google.generativeai as genai
from google.colab import userdata

# 🔐 Load API keys from Colab Secrets (notebook access must be enabled for each secret)
HF_API_TOKEN = userdata.get('HF_TOKEN')            # Hugging Face
GEMINI_API_KEY = userdata.get('Gemini_API_Key')    # Gemini

In [5]:
# -------------------- IMAGE GENERATION (Hugging Face) --------------------
from IPython.display import display  # renders PIL images inline in a notebook

HF_API_URL = "https://api-inference.huggingface.co/models/stabilityai/stable-diffusion-3.5-large"
headers = {"Authorization": f"Bearer {HF_API_TOKEN}"}
prompt = "Doraemon, a cute blue robot cat from Japanese anime, standing next to Nobita, a young schoolboy wearing glasses, in a colorful cartoon city background, anime style"

# Send the prompt to Hugging Face; the timeout keeps the cell from hanging indefinitely
response = requests.post(HF_API_URL, headers=headers, json={"inputs": prompt}, timeout=120)

# Stop early on any non-200 status and surface the API's error message
if response.status_code != 200:
    raise RuntimeError(f"❌ Hugging Face API Error: {response.status_code} - {response.text}")

# On success the response body holds the raw image bytes
image = Image.open(BytesIO(response.content))
image_path = "hf_generated_image.png"
image.save(image_path)
# image.show() launches an external viewer, which typically shows nothing in Colab
display(image)
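
The request above fails on the first non-200 status. The hosted Inference API commonly answers `503` while a model is still loading, so a short retry loop can help. The sketch below is one possible approach, not part of the original notebook: `post_with_retry`, its parameters, and the default wait time are all hypothetical names and values chosen for illustration.

```python
import time


def post_with_retry(url, headers, payload, retries=3, wait_seconds=10.0,
                    post=None, sleep=time.sleep):
    """POST a JSON payload, retrying while the server answers 503.

    `post` and `sleep` are injectable so the loop can be exercised without
    network access; by default `requests.post` and `time.sleep` are used.
    """
    if post is None:
        import requests  # imported lazily so the helper can be tested offline
        post = requests.post

    response = None
    for attempt in range(retries):
        response = post(url, headers=headers, json=payload, timeout=120)
        if response.status_code != 503:
            break  # success or a non-retryable error: hand it back to the caller
        if attempt < retries - 1:
            sleep(wait_seconds)  # give the model time to finish loading
    return response
```

In the cell above, the call would become `response = post_with_retry(HF_API_URL, headers, {"inputs": prompt})`, with the existing status check left unchanged so that other errors still raise.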


In [6]:
# -------------------- IMAGE CAPTIONING (Gemini) --------------------

# Configure Gemini API
genai.configure(api_key=GEMINI_API_KEY)

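def generate_caption_with_gemini(image_path):
    """Generate a one-sentence caption for the image at image_path using Gemini 1.5 Flash.

    Returns the caption text, or an error string if the request fails.
    """
    try:
        image = Image.open(image_path)
        model = genai.GenerativeModel("gemini-1.5-flash")

        # Multimodal prompt: the PIL image followed by the text instruction
        response = model.generate_content([
            image,
            "Describe this image in one sentence."
        ])

        if response and hasattr(response, "text"):
            return response.text.strip()

        return "No caption generated."

    except Exception as e:
        # Errors are returned as text so the cell still prints something readable
        return f"❌ Error: {e}"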

# 🔎 Generate caption
caption = generate_caption_with_gemini(image_path)
print(f"🖼️ Generated Caption: {caption}")

🖼️ Generated Caption: A stylized image shows Doraemon, the famous robotic cat, standing next to Nobita Nobi in a vibrant, bustling city street.
