# Plant Verification Model

### Overview:

This Jupyter notebook demonstrates how OpenAI's CLIP (Contrastive Language–Image Pretraining) model can verify whether an image contains a plant. Each cell uses CLIP's image-text similarity scores to compare an image against two text descriptions, separating images of plants from images of other subjects.

### Environment Setup:

- The notebook requires Python 3.x.
- Necessary libraries: Pillow (imported as PIL), requests, transformers, and PyTorch (needed because the processor is called with `return_tensors="pt"`).
- An internet connection is required to download the model weights and retrieve the images.
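
As a minimal sketch of checking this setup before running the cells, the snippet below reports which modules cannot be found. The helper name `missing_modules` and the list `REQUIRED_MODULES` are hypothetical, and `torch` is included on the assumption that the notebook's `return_tensors="pt"` calls need PyTorch.

```python
import importlib.util

# Import names assumed from the notebook's imports; torch is an assumption
# based on the notebook requesting PyTorch tensors with return_tensors="pt".
REQUIRED_MODULES = ["PIL", "requests", "transformers", "torch"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

Note that the `PIL` module is provided by the Pillow package, so the install name differs from the import name.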

### Workflow:

- **Model Initialization**: Load the CLIP model and its processor.
- **Image Processing**: Retrieve and process two images: one of a plant and one of a non-plant subject (a dog).
- **Input Preparation**: Pair each image with the same two text descriptions ("the subject of the photo is a plant" and "the subject of the photo is not a plant"), so that one description is accurate and the other is not.
- **Model Prediction**: Use the CLIP model to score the similarity between the image and each text description.
- **Result Analysis**: Compare the two resulting probabilities to check whether the model picks the correct description for each image.

In [5]:
from PIL import Image
import requests

from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")


In [21]:
# Download the plant image and open it with Pillow.
url = "https://hips.hearstapps.com/hmg-prod/images/indoor-plants-1643136651.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Pair the image with one matching and one non-matching description.
inputs = processor(text=["the subject of the photo is a plant", "the subject of the photo is not a plant"], images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # image-text similarity scores, shape (1, 2)
probs = logits_per_image.softmax(dim=1)  # normalize the scores over the two descriptions

In [22]:
probs[0][0] > probs[0][1]

tensor(True)
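
The comparison above relies on softmax turning the two similarity scores into probabilities that sum to 1. The following is a minimal pure-Python sketch of that step; the logit values are hypothetical and are not taken from the model's actual output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for ["... is a plant", "... is not a plant"].
example_logits = [21.3, 17.8]
example_probs = softmax(example_logits)

# The first description wins exactly when its probability is larger.
plant_wins = example_probs[0] > example_probs[1]
print(example_probs, plant_wins)
```

Because softmax preserves the ordering of its inputs, comparing the two probabilities gives the same answer as comparing the two logits directly; the probabilities are mainly useful when a confidence value is wanted.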

In [23]:
# Repeat the same check with a non-plant image (a dog).
url = "https://hips.hearstapps.com/hmg-prod/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Use the same two descriptions as for the plant image.
inputs = processor(text=["the subject of the photo is a plant", "the subject of the photo is not a plant"], images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # image-text similarity scores, shape (1, 2)
probs = logits_per_image.softmax(dim=1)  # normalize the scores over the two descriptions

In [24]:
probs[0][0] > probs[0][1]

tensor(False)
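
Both checks use a strict comparison, so a 51% probability counts as a plant just as much as a 99% one. One possible refinement, sketched below, is to return an "uncertain" label when the probability is close to 0.5. The function name `classify_plant` and the `margin` parameter are hypothetical and not part of the notebook.

```python
def classify_plant(p_plant, margin=0.1):
    """Map the probability of the "is a plant" description to a label.

    margin is a hypothetical tolerance around 0.5, chosen for illustration.
    """
    if not 0.0 <= p_plant <= 1.0:
        raise ValueError("p_plant must be a probability in [0, 1]")
    if p_plant >= 0.5 + margin:
        return "plant"
    if p_plant <= 0.5 - margin:
        return "not plant"
    return "uncertain"


# Hypothetical probabilities, for illustration only.
for p in (0.97, 0.55, 0.08):
    print(p, classify_plant(p))
```

With the notebook's tensors, the input could be obtained as `probs[0][0].item()`, which converts the single-element tensor to a Python float.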