## 📚 1. Installing Essential Libraries

We'll start by installing the necessary libraries. The **`transformers`** library covers more than text: it also ships state-of-the-art computer vision models, including the segmentation models used in this notebook.

- **`transformers`**: The main library from Hugging Face, providing the `pipeline` API for easy access to pre-trained models.
- **`sentencepiece` & `sacremoses`**: Common dependencies in the Hugging Face ecosystem, used for tokenization in text-based models. The segmentation task itself does not rely on them.

In [None]:
!pip install -U transformers sentencepiece sacremoses
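
After installation, it can help to confirm which versions ended up in the environment. The following is a minimal sketch using only the standard library. The helper name `installed_versions` is hypothetical, and the package list is an assumption that mirrors what this notebook uses (including `Pillow` and `requests`, which are imported in a later cell).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages this notebook relies on (an assumed list; adjust it to your setup).
report = installed_versions(
    ["transformers", "sentencepiece", "sacremoses", "Pillow", "requests"]
)
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

Any entry reported as `not installed` can be added to the `pip install` command above.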

## 📂 2. Setting a Custom Cache Directory (Optional)

Hugging Face models, especially for vision, can be quite large. This code block sets a custom directory for the library to download and cache these models.

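By setting the **`HF_HOME`** environment variable before `transformers` is imported, you can better manage your computer's storage and keep all downloaded model assets in one organized place. Setting it after the import may have no effect, because the cache location is resolved when the library loads.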

In [None]:
import os

# Raw string so the backslashes in the Windows path are not treated as escape sequences
new_cache_dir = r"X:\AI-learin\courss\Fine-Tuning-LLM-with-HuggingFace-main\models"
os.environ['HF_HOME'] = new_cache_dir
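
For a setup that works across operating systems, the same idea can be sketched with `pathlib`. This is only an illustration: the helper name `set_hf_home` is hypothetical, and the temporary directory is a placeholder rather than a recommended cache location.

```python
import os
import tempfile
from pathlib import Path


def set_hf_home(cache_dir):
    """Create cache_dir if needed and point HF_HOME at it.

    Intended to be called before importing transformers, since the
    Hugging Face libraries resolve their cache location at import time.
    """
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HOME"] = str(path)
    return path


# Placeholder location for illustration; substitute your own directory.
cache_path = set_hf_home(Path(tempfile.gettempdir()) / "hf_cache_demo")
print(os.environ["HF_HOME"])
```

Creating the directory up front makes a mistyped drive or path show up as an immediate error rather than at download time.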

## 📦 3. Importing Computer Vision and Helper Libraries

For this task, we need libraries to handle images and web requests, in addition to our `transformers` pipeline.

- **`Image` from `PIL`**: Pillow, the maintained fork of the Python Imaging Library (PIL), is the de facto standard for opening and manipulating images in Python.
- **`requests`**: A simple HTTP library that we'll use to download an image from a URL.
- **`pipeline` from `transformers`**: Our main tool for running models on specific tasks with just a few lines of code.

In [None]:
from PIL import Image
import requests
from transformers import pipeline

## 🖼️ 4. Fetching and Loading an Image

Before we can perform segmentation, we need an image. This code fetches an image from a URL and loads it into a format our pipeline can work with.

The code defines the image **`url`**, uses **`requests.get()`** to download the raw image data, and then uses **`Image.open()`** to load that data as a Pillow `Image` object.

In [None]:
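url = "https://img.freepik.com/free-photo/young-bearded-man-with-striped-shirt_273609-5677.jpg"

# Stream the download and stop on HTTP errors before trying to decode the image
response = requests.get(url, stream=True, timeout=30)
response.raise_for_status()
image = Image.open(response.raw)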

## 🎨 5. Performing Image Segmentation

This is where we perform **Image Segmentation**. Unlike image classification (which tells you *what* is in an image), segmentation tells you *where* it is by classifying every single pixel.

1.  **Creating the Pipeline**: We initialize a pipeline for **`"image-segmentation"`**.
2.  **Model Selection**: We use a specialized model, **`"mattmdjaga/segformer_b2_clothes"`**. This is a SegFormer model that has been fine-tuned to identify and outline different types of clothing and body parts on a person (such as hair, face, and arms).
3.  **Inference**: We pass our `image` to the `segmenter`.
4.  **Output**: The result is a list of dictionaries. Each dictionary represents one detected segment (e.g., 'Upper-clothes') and contains its **`label`**, a **`score`** (which is `None` for semantic segmentation models such as SegFormer), and a **`mask`**. The mask is a Pillow image of the same size as the input, in which the pixels belonging to that segment are white and all others are black.
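
To make "classifying every single pixel" concrete, here is a toy sketch of how a grid of per-pixel class ids can be split into one binary mask per class, in the same list-of-dictionaries shape described above. It is not the library's implementation: nested lists stand in for image data, and the helper name `split_into_masks` and the three labels are assumptions chosen for illustration.

```python
def split_into_masks(label_map, id2label):
    """Turn a 2D grid of class ids into one binary mask per class present."""
    results = []
    present = sorted({class_id for row in label_map for class_id in row})
    for class_id in present:
        # 255 where the pixel belongs to this class, 0 everywhere else
        mask = [[255 if cell == class_id else 0 for cell in row] for row in label_map]
        # score is left as None, since semantic segmentation has no per-segment score
        results.append({"label": id2label[class_id], "score": None, "mask": mask})
    return results


# Toy 3x4 "image" in which every pixel already carries a class id.
id2label = {0: "Background", 1: "Hair", 2: "Upper-clothes"}
label_map = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [2, 2, 2, 2],
]

for item in split_into_masks(label_map, id2label):
    print(item["label"], item["mask"])
```

Each pixel ends up white in exactly one mask, which is why the masks together cover the whole image.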

In [None]:
# Only one checkpoint is used at a time; the alternatives are left commented out
# model = "briaai/RMBG-2.0"
# model = "nvidia/segformer-b0-finetuned-ade-512-512"
model = "mattmdjaga/segformer_b2_clothes"

segmenter = pipeline("image-segmentation", model=model)
outputs = segmenter(image)

outputs

## 🎭 6. Visualizing a Single Mask

To better understand the output, we can isolate and view a single mask. This line of code accesses one of the items from the `outputs` list (in this case, the fourth item at index 3) and extracts its **`'mask'`**. Which label sits at that index depends on what the model found in the image, so check `outputs[3]['label']` to see what the mask represents.

Displaying this mask shows a black-and-white image where the white pixels mark the location and shape of the detected segment, providing a clear visualization of the segmentation result.

In [None]:
outputs[3]['mask']
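
Because the position of a segment in the list can change from image to image, looking a mask up by its label is often more robust than using a fixed index. Below is a minimal sketch. The helper name `mask_for_label` is hypothetical, and `demo_outputs` is stand-in data with placeholder strings where the real results would hold Pillow images.

```python
def mask_for_label(outputs, label):
    """Return the mask of the first segment whose label matches, else None."""
    for segment in outputs:
        if segment["label"] == label:
            return segment["mask"]
    return None


# Stand-in results for illustration; real masks would be Pillow images.
demo_outputs = [
    {"label": "Background", "score": None, "mask": "<background mask>"},
    {"label": "Hair", "score": None, "mask": "<hair mask>"},
]

print([segment["label"] for segment in demo_outputs])  # labels that are available
print(mask_for_label(demo_outputs, "Hair"))
```

With the real results, `mask_for_label(outputs, "Hair")` would take the place of `outputs[3]['mask']`, assuming the model produced a segment with that label.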