<a href="https://colab.research.google.com/github/AnkitaGuptaIndia/Course_6_Image_Caption/blob/main/Image_Caption/Image_Caption_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Creating a clone of the Repo

In [1]:
!git clone https://github.com/AnkitaGuptaIndia/Course_6_Image_Caption.git

Cloning into 'Course_6_Image_Caption'...
remote: Enumerating objects: 21, done.
remote: Counting objects: 100% (21/21), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 21 (delta 5), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (21/21), 166.69 KiB | 2.98 MiB/s, done.
Resolving deltas: 100% (5/5), done.


In [2]:
%cd Course_6_Image_Caption

/content/Course_6_Image_Caption


In [3]:
!ls

image_1.jpg  Image_Caption  notebooks  README.md  requirements.txt


## Basic Image Captioning Code - 1

In [None]:
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image

In [None]:
# Initialize the BLIP processor and captioning model (base checkpoint) from Hugging Face
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

In [6]:
# Load an image
image = Image.open("image_1.jpg")

In [7]:
# Prepare the Image
inputs = processor(image, return_tensors="pt")

In [8]:
# Generate caption
outputs = model.generate(**inputs)
caption = processor.decode(outputs[0], skip_special_tokens=True)

In [9]:
print("Generated Caption:", caption)

Generated Caption: a view of the mountains from an airplane


## Basic Image Captioning Code - 2

In [16]:
import requests

In [None]:
# Initialize the BLIP processor and captioning model (large checkpoint) from Hugging Face
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-large")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-large")

In [17]:
img_url = 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'
raw_image = Image.open(requests.get(img_url, stream=True).raw).convert('RGB')
# The captioning checkpoint continues this text as a prefix; it does not answer it as a question
question = "What is in the image?"

In [20]:
inputs = processor(raw_image, question, return_tensors="pt")
out = model.generate(**inputs)
answer = processor.decode(out[0], skip_special_tokens=True)
print(f"Answer: {answer}")

Answer: what is in the image? a woman and her dog on the beach


In [21]:
inputs = processor(image, question, return_tensors="pt")
out = model.generate(**inputs)
answer = processor.decode(out[0], skip_special_tokens=True)
print(f"Answer: {answer}")

Answer: what is in the image? a view of a mountain range
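
Both outputs above begin with a lowercased copy of the prompt, which suggests the captioning checkpoint continues the text rather than answering it. The following is a minimal sketch for separating the continuation from the prompt, assuming the decoded string starts with the prompt in lowercase. `strip_prompt` is a hypothetical helper written for illustration, not part of `transformers`.

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Return the part of `decoded` that follows `prompt` (case-insensitive match)."""
    normalized_prompt = prompt.strip().lower()
    text = decoded.strip()
    if text.lower().startswith(normalized_prompt):
        # Drop the echoed prompt and any whitespace that follows it
        return text[len(normalized_prompt):].strip()
    # Prompt not found at the start: leave the text unchanged
    return text


decoded = "what is in the image? a woman and her dog on the beach"
print(strip_prompt(decoded, "What is in the image?"))
```

If the prompt is not echoed at the start of the decoded text, the helper returns the text unchanged, so it should be safe to apply to unconditional captions as well.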


## Gradio Introduction

In [22]:
import gradio as gr

In [23]:
def greet(name, intensity):
  return "Hello, " + name + "!" * int(intensity)

demo = gr.Interface(
    fn = greet,
    inputs = ["text", "slider"],
    outputs = ["text"]
)
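
Before wiring `greet` into the interface, it can help to call it directly. The sketch below repeats the function from the cell above and shows how the expression evaluates: `*` binds tighter than `+`, so only the `"!"` is repeated. The sample names and intensity values are made up for illustration.

```python
def greet(name, intensity):
    # "*" binds tighter than "+", so only "!" is repeated
    return "Hello, " + name + "!" * int(intensity)


print(greet("Ada", 3))    # Hello, Ada!!!
print(greet("Ada", 0))    # Hello, Ada
print(greet("Ada", 2.7))  # int() truncates 2.7 to 2, giving Hello, Ada!!
```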

In [None]:
demo.launch(server_name="127.0.0.1", server_port=7860)

In [41]:
demo.close()

Closing server running on port: 7860


## Gradio For Image Captioning

In [25]:
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

In [26]:
def generate_caption(image):
  inputs = processor(images=image, return_tensors="pt")
  outputs = model.generate(**inputs)
  caption = processor.decode(outputs[0], skip_special_tokens=True)
  return caption

In [27]:
def caption_image(image):
  try:
    caption = generate_caption(image)
    return caption
  except Exception as e:
    return f"An error occurred: {str(e)}"

In [29]:
iface = gr.Interface(
    fn = caption_image,
    inputs = gr.Image(type="pil"),
    outputs="text",
    title = "Image Captioning with BLIP",
    description = "Upload an image to generate a caption."
)

In [32]:
iface.launch(server_name="127.0.0.1", server_port=5000)

It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://185d261e2e361dbc59.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




In [40]:
iface.close()

Closing server running on port: 5000


## Image Classification in PyTorch

In [33]:
import torch

In [None]:
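# Load a pretrained ResNet-18 in eval mode (this rebinds `model`, replacing the BLIP model loaded earlier)
model = torch.hub.load('pytorch/vision:v0.6.0', 'resnet18', pretrained=True).eval()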

In [35]:
from torchvision import transforms

In [36]:
# Download human readable labels for ImageNet
response = requests.get("https://git.io/JJkYN")
labels = response.text.split("\n")
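
Note that `split("\n")` keeps an empty string at the end of the list when the text ends with a newline. This is harmless here because only the first 1000 entries are indexed later, but a slightly more defensive parse might look like the sketch below. The sample text is made up, and whether the downloaded file actually ends with a newline is an assumption.

```python
def parse_labels(text: str) -> list[str]:
    """Split newline-separated label text into clean, non-empty labels."""
    return [line.strip() for line in text.splitlines() if line.strip()]


sample = "tench\ngoldfish\ngreat white shark\n"
print(sample.split("\n"))    # ['tench', 'goldfish', 'great white shark', '']
print(parse_labels(sample))  # ['tench', 'goldfish', 'great white shark']
```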

In [37]:
def predict(inp):
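  # Convert the PIL image to a float tensor and add a batch dimension
  inp = transforms.ToTensor()(inp).unsqueeze(0)
  with torch.no_grad():
    # Softmax turns the 1000 ImageNet logits into probabilities
    prediction = torch.nn.functional.softmax(model(inp)[0], dim=0)
    # Map each label to its probability so gr.Label can rank them
    confidences = {labels[i]: float(prediction[i]) for i in range(1000)}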
  return confidences
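
To make the return value of `predict` more concrete, here is a small standard-library sketch of the same two steps: a softmax over raw scores, followed by picking the highest-confidence labels (roughly what a label display limited to three classes would show). The labels and scores are made up, and `softmax` and `top_k` are illustrative helpers rather than PyTorch or Gradio functions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(confidences, k=3):
    """Return the k (label, confidence) pairs with the highest confidence."""
    return sorted(confidences.items(), key=lambda item: item[1], reverse=True)[:k]


labels = ["lion", "cheetah", "tiger", "leopard"]  # made-up labels
logits = [2.0, 1.0, 0.5, -1.0]                    # made-up raw scores
confidences = dict(zip(labels, softmax(logits)))

for label, score in top_k(confidences):
    print(f"{label}: {score:.3f}")
```

Subtracting the maximum before exponentiating does not change the result but avoids overflow for large scores.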

In [38]:
iface_2 = gr.Interface(
    fn = predict,
    inputs = gr.Image(type = "pil"),
    outputs = gr.Label(num_top_classes=3),
    examples = ["/content/lion.jpg", "/content/cheetah.jpg"]
)

In [42]:
iface_2.launch(server_name="127.0.0.1", server_port=7860)

It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://fccd52ccf0e5face71.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




In [43]:
iface_2.close()

Closing server running on port: 7860


## Pushing the code back to the repo

In [10]:
!git add .

In [11]:
!git config --global user.email "ankitagpt32.ag@gmail.com"
!git config --global user.name "AnkitaGuptaIndia"

In [12]:
!git commit -m "Basic Image Generation Code"

On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean


In [14]:
!git push <token_url>

Everything up-to-date
