In [1]:
import torch

## Setting up the image classification model

In [2]:
model = torch.hub.load('pytorch/vision:v0.6.0', 'resnet18', pretrained=True).eval()

Downloading: "https://github.com/pytorch/vision/zipball/v0.6.0" to /root/.cache/torch/hub/v0.6.0.zip
Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /root/.cache/torch/hub/checkpoints/resnet18-5c106cde.pth
100%|██████████| 44.7M/44.7M [00:00<00:00, 180MB/s]


## Defining the predict function

Define a function that takes the user input (here, an image) and returns the prediction. The prediction should be returned as a dictionary whose keys are class names and whose values are confidence probabilities. We will load the human-readable class names from a text file, downloaded in the next cell.

In [5]:
import requests
from PIL import Image
from torchvision import transforms

In [26]:
# Get human-readable labels

response = requests.get("https://git.io/JJkYN")
labels = response.text.split("\n")
print(labels[2])
print("Length of labels: ",len(labels))

great white shark
Length of labels:  1001
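
The count is 1001 rather than 1000. A likely explanation, assumed here rather than confirmed by the output above, is that the file ends with a newline, so `split("\n")` leaves one empty string at the end of the list. A minimal sketch with a made-up three-line string (a stand-in, not the real label file) shows the effect and how `str.splitlines()` avoids it:

```python
# Hypothetical stand-in for response.text: three labels and a trailing newline.
sample_text = "tench\ngoldfish\ngreat white shark\n"

split_labels = sample_text.split("\n")   # keeps an empty string after the final newline
clean_labels = sample_text.splitlines()  # does not produce that trailing empty entry

print(len(split_labels), repr(split_labels[-1]))
print(len(clean_labels), clean_labels[2])
```

If that is what happens with the real file, the extra entry is harmless in this notebook, because `predict` only reads indices 0 to 999.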


The predict function is designed to take an input image, preprocess it, pass it through a pretrained neural network model, and return the model's prediction confidences for each class.

**transforms.ToTensor():** This converts the input image (a PIL Image or a NumPy array) to a PyTorch tensor. For 8-bit images, the pixel values are scaled from 0–255 to the range 0 to 1.

**.unsqueeze(0):** This adds a new dimension at position 0, converting the tensor from shape (C, H, W) to (1, C, H, W). This is necessary because PyTorch models typically expect a batch dimension as the first dimension. Here, C is the number of channels (e.g., 3 for RGB images), H is the height, and W is the width.

**with torch.no_grad():** This context manager disables gradient calculation, which saves memory and computation during inference, since we don't need to backpropagate.

**model(inp):** This passes the preprocessed image tensor through the neural network model. The model outputs a tensor of logits (raw prediction values), one per class.

**torch.nn.functional.softmax(..., dim=0):** The softmax function is applied to the logits to convert them into probabilities. Because `model(inp)[0]` selects the single image in the batch, the result is a 1-D tensor with one entry per class, so dim=0 is the class dimension.
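
To make the softmax step concrete, here is a small sketch in plain Python rather than PyTorch. The three logits are made-up values, and the helper `softmax` is written only for illustration; it is meant to mirror what `torch.nn.functional.softmax` computes over a 1-D tensor, not to replace it.

```python
import math

def softmax(logits):
    """Illustrative softmax over a flat list of logits."""
    # Subtracting the maximum keeps exp() from overflowing; it does not change the result.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three classes.
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])
print(round(sum(probs), 6))
```

The resulting values are all positive, sum to 1, and keep the same ordering as the logits, which is why they can be read as per-class confidences.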




In [35]:
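def predict(inp):
  # Convert the PIL image to a float tensor and add a batch dimension: (1, C, H, W)
  inp = transforms.ToTensor()(inp).unsqueeze(0)
  with torch.no_grad():  # inference only, so no gradients are needed
    # model(inp) has shape (1, 1000); [0] selects the logits for the single image
    prediction = torch.nn.functional.softmax(model(inp)[0], dim=0)
    # Map each of the 1000 class names to its probability
    confidences = {labels[i]: float(prediction[i]) for i in range(1000)}
  return confidences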
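
`predict` returns a probability for every one of the 1000 classes, while the `gr.Label(num_top_classes=3)` output used with it displays only the three highest. As a rough sketch of that selection in plain Python, using a made-up confidences dictionary and a hypothetical helper `top_k` (Gradio's own implementation may differ):

```python
def top_k(confidences, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Made-up values standing in for the output of predict(); not real model output.
example = {
    "tabby": 0.61,
    "tiger cat": 0.22,
    "Egyptian cat": 0.09,
    "lynx": 0.05,
    "goldfish": 0.03,
}

for label, prob in top_k(example):
    print(f"{label}: {prob:.2f}")
```

Returning the full dictionary from `predict` keeps the function independent of how many classes the interface chooses to show.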

## Gradio Interface

In [28]:
!pip install gradio




In [29]:
import gradio as gr

In [36]:
gr.Interface(fn=predict,
             inputs=gr.Image(type='pil'),
             outputs=gr.Label(num_top_classes=3),
             examples=['/content/360_F_179693709_ObozAm8nXPQJ5bPvgC529qdu8uBChJAv.jpg',
                       '/content/FELV-cat.jpg']).launch()  # use launch(debug=True) to see errors in Colab

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://a1a9aebfe39f230df9.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


