# 📊 Code for ranking images against text with Florence-2

* takes a **`/content/images` folder** of saved PNG/JPG files,

* reads **`prompts.csv`** or `prompts.json` with the columns/keys
  `image, prompt, prompt2, negative, negative2`,

* computes four scores for each image row:

| model                   | what it computes                                     | why                              |
|-------------------------|------------------------------------------------------|----------------------------------|
| **SigLIP 2 (ViT-L/14)** | global cos(I, T)                                     | robust multilingual "semantics"  |
| **Florence-2**          | number of prompt phrases found in the image          | finer-grained local alignment    |
| **CLIP-IQA (ViT-B/16)** | probability that the photo is "good quality"         | penalty for artifacts            |
| **DINOv2 (ViT-L/14)**   | visual "aesthetic score" (ℓ₂ norm of the CLS token)  | extra "detail/quality" signal    |

* builds the ensemble:
  `S = α · SigLIP – α · SigLIP_neg + β · Florence – β · Florence_neg + γ · IQA + δ · DINO`,

* writes `ranking.csv` with the final image order.

| cell   | what it does                                                                 |
|--------|------------------------------------------------------------------------------|
| **0**  | installs dependencies: PyTorch, Transformers, CLIP-IQA, Florence-2, etc.     |
| **1**  | loads the models: SigLIP2, Florence-2, DINOv2, CLIP-IQA                      |
| **2**  | wrapper functions that return the individual scores                          |
| **3**  | the `rank_folder` function: ranks images by the aggregated score             |
| **4**  | runs it on a folder and a prompt file and saves the result to `ranking.csv`  |

> **Tip on weights:**
> For stability, normalize the scores (a `z-score` per column); this makes α, β, γ, δ easier to tune.
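A minimal sketch of the ensemble formula combined with the z-score tip, assuming the per-image scores are already in a pandas `DataFrame`. The column names (`siglip`, `siglip_neg`, `florence`, `florence_neg`, `iqa`, `dino`), the helper names, and the sample numbers are hypothetical illustrations, not the actual output schema of `rank_folder`.

```python
import pandas as pd

# Hypothetical column names, chosen only for this illustration.
SCORE_COLUMNS = ["siglip", "siglip_neg", "florence", "florence_neg", "iqa", "dino"]


def zscore(col: pd.Series) -> pd.Series:
    """Standardize a column; a (near-)constant column maps to all zeros."""
    std = col.std(ddof=0)
    if not std > 1e-12:  # also catches NaN
        return pd.Series(0.0, index=col.index)
    return (col - col.mean()) / std


def ensemble_score(df, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """S = α·(SigLIP − SigLIP_neg) + β·(Florence − Florence_neg) + γ·IQA + δ·DINO."""
    z = df[SCORE_COLUMNS].apply(zscore)
    return (
        alpha * (z["siglip"] - z["siglip_neg"])
        + beta * (z["florence"] - z["florence_neg"])
        + gamma * z["iqa"]
        + delta * z["dino"]
    )


# Made-up scores for three images.
scores = pd.DataFrame({
    "image": ["a.png", "b.png", "c.png"],
    "siglip": [0.30, 0.20, 0.10],
    "siglip_neg": [0.05, 0.05, 0.05],
    "florence": [0.8, 0.5, 0.2],
    "florence_neg": [0.0, 0.0, 0.0],
    "iqa": [0.9, 0.6, 0.3],
    "dino": [40.0, 38.0, 36.0],
})
scores["total"] = ensemble_score(scores)
ranking = scores.sort_values("total", ascending=False).reset_index(drop=True)
print(ranking[["image", "total"]])
```

Standardizing first puts cosine similarities (roughly 0 to 1) and CLS norms (tens) on the same scale, so a weight of 1.0 means the same thing for every column.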

---

### 🧠 What the models compute:

* **SigLIP 2**: global semantic similarity between the image and the whole prompt;
* **Florence-2**: phrase grounding, i.e. the share of prompt words/phrases for which regions were found in the image;
* **CLIP-IQA**: the probability that the image is good quality (vs "bad photo");
* **DINOv2**: the L2 norm of the CLS vector, which correlates with visual expressiveness.
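To make three of these quantities concrete, here is a small NumPy sketch that works on already-extracted vectors. It does not call any of the models; the function names and the toy inputs are assumptions for illustration only, and the project's own wrappers may compute these differently.

```python
import numpy as np


def cosine_similarity(image_emb, text_emb) -> float:
    """cos(I, T) between an image embedding and a text embedding."""
    image_emb = np.asarray(image_emb, dtype=np.float64)
    text_emb = np.asarray(text_emb, dtype=np.float64)
    denom = np.linalg.norm(image_emb) * np.linalg.norm(text_emb)
    return float(image_emb @ text_emb / denom) if denom > 0 else 0.0


def grounded_fraction(phrases, regions_by_phrase) -> float:
    """Share of prompt phrases that have at least one detected region."""
    if not phrases:
        return 0.0
    found = sum(1 for phrase in phrases if regions_by_phrase.get(phrase))
    return found / len(phrases)


def cls_norm(cls_vector) -> float:
    """L2 norm of a CLS vector."""
    return float(np.linalg.norm(np.asarray(cls_vector, dtype=np.float64)))


# Toy inputs: real embeddings have hundreds of dimensions.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(grounded_fraction(
    ["mountains", "river"],
    {"mountains": [(0, 0, 10, 10)], "river": []},  # boxes as (x1, y1, x2, y2)
))
print(cls_norm([3.0, 4.0]))
```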

---

Upload the images to `/content/images` and prepare a `prompts.csv` like this:

```csv
image,prompt,prompt2,negative,negative2
0001.png,"A serene landscape with mountains","A calm river under pink sunset","low quality","distortions"
0002.png,"A serene landscape with mountains","","low quality",""
```
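A file in this shape could be loaded and checked roughly as follows. This is a sketch under stated assumptions: `load_prompts` is a hypothetical helper (not the project's own loader), and the JSON variant is assumed to be a list of objects with the same keys.

```python
import io
import json
from pathlib import Path

import pandas as pd

PROMPT_COLUMNS = ["image", "prompt", "prompt2", "negative", "negative2"]


def load_prompts(source) -> pd.DataFrame:
    """Read prompts from a .json path, a .csv path, or a CSV text buffer."""
    if isinstance(source, (str, Path)) and str(source).lower().endswith(".json"):
        # Assumption: the JSON file holds a list of {"image": ..., "prompt": ...} records.
        records = json.loads(Path(source).read_text(encoding="utf-8"))
        df = pd.DataFrame(records)
    else:
        # keep_default_na=False keeps empty cells as "" instead of NaN.
        df = pd.read_csv(source, dtype=str, keep_default_na=False)

    missing = [c for c in ("image", "prompt") if c not in df.columns]
    if missing:
        raise ValueError(f"prompts file lacks required columns: {missing}")

    # Optional columns default to empty strings.
    for column in PROMPT_COLUMNS:
        if column not in df.columns:
            df[column] = ""
    return df[PROMPT_COLUMNS].fillna("").astype(str)


sample_csv = io.StringIO(
    "image,prompt,prompt2,negative,negative2\n"
    '0001.png,"A serene landscape with mountains","A calm river under pink sunset","low quality","distortions"\n'
    '0002.png,"A serene landscape with mountains","","low quality",""\n'
)
prompts = load_prompts(sample_csv)
print(prompts)
```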

After running all the cells you get `ranking.csv` with the images in ranked order.
"""

In [1]:
!pip install --upgrade pip setuptools wheel



In [3]:
%cd rank_images_project
!pip install --pre \
  --extra-index-url https://download.pytorch.org/whl/nightly/cu121 \
  -e .

/content/rank_images_project
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/nightly/cu121
Obtaining file:///content/rank_images_project
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: rank_images
  Building editable for rank_images (pyproject.toml) ... done
  Created wheel for rank_images: filename=rank_images-0.1.0-0.editable-py3-none-any.whl size=5813 sha256=2e986b338e27210b628ea47fd6665fca30bec14e9dc2149d35ce4716a69270c0
  Stored in directory: /tmp/pip-ephem-wheel-cache-c_0dahjd/wheels/18/44/46/5787710d98191fa8e596c65b5c1e1e623d554349b5f4ae47df
Successfully built rank_images
Installing collected packages: rank_images
Successfully installed rank_images-0.1.0


In [1]:
%cd rank_images_project
!ls -R

/content/rank_images_project
.:
data  pyproject.toml  README.md  requirements.txt  setup.py  src

./data:
demo_images

./data/demo_images:
1010.png  222.png  444.png  666.png  888.png  prompts.json
111.png   333.png  555.png  777.png  999.png  ranking.csv

./src:
rank_images  rank_images.egg-info

./src/rank_images:
cli.py	   data_processing.py  __init__.py  models.py	 ranking.py
config.py  device_utils.py     metrics.py   __pycache__

./src/rank_images/__pycache__:
cli.cpython-311.pyc		 __init__.cpython-311.pyc
config.cpython-311.pyc		 metrics.cpython-311.pyc
data_processing.cpython-311.pyc  models.cpython-311.pyc
device_utils.cpython-311.pyc	 ranking.cpython-311.pyc

./src/rank_images.egg-info:
dependency_links.txt  PKG-INFO	    SOURCES.txt
entry_points.txt      requires.txt  top_level.txt


After a successful install you can try out the CLI:

In [2]:
!rank-images --help

2025-07-23 19:06:08.586050: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-07-23 19:06:08.604081: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1753297568.625368   27150 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1753297568.631820   27150 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-07-23 19:06:08.653585: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instr

In [3]:
!rank-images --demo


In [4]:
!rank-images data/demo_images --prompts data/demo_images/prompts.json


In [3]:
# First cell of the Jupyter notebook
import logging
import sys

# Configure logging: drop any existing root handlers and log INFO and above to stdout
root_logger = logging.getLogger()
root_logger.handlers.clear()

handler = logging.StreamHandler(sys.stdout)
# The format can be adjusted for the notebook if desired
formatter = logging.Formatter("[%(levelname)s] %(name)s: %(message)s")
handler.setFormatter(formatter)
root_logger.addHandler(handler)
root_logger.setLevel(logging.INFO)

# Suppress noise: third-party libraries only report WARNING and above
noisy_loggers = [
    "urllib3", "PIL", "matplotlib", "transformers", "torch", "torchvision",
    "torchaudio", "tokenizers", "datasets", "huggingface_hub", "filelock",
    "fsspec", "asyncio", "openai", "httpx", "httpcore", "tensorflow", "timm"
]
for lib_name in noisy_loggers:
    logging.getLogger(lib_name).setLevel(logging.WARNING)

# rank_images messages are shown from INFO up, other libraries from WARNING up
print("Logging configured.")

Logging configured.


In [None]:
%cd rank_images_project
import sys
from pathlib import Path
# Assume the current directory is the project root
project_root = Path.cwd()
sys.path.insert(0, str(project_root / "src"))  # Add src to the import path

from rank_images.models import load_models
from rank_images.ranking import rank_folder

# Load the models once
print("Loading models...")
load_models()
print("Models loaded.")


In [None]:
# Run the ranking
demo_images_dir = project_root / "data" / "demo_images"
prompts_path = str(demo_images_dir / "prompts.json")  # path passed as str

print("Starting ranking...")
result_df = rank_folder(
    img_dir=demo_images_dir,
    # A single prompt string applied to every image in the folder
    prompts_in="An advertising image for a credit card, featuring prominent 3D rendered numbers representing a high deposit percentage. The scene is set against a stunning natural backdrop of a serene blue ocean meeting majestic, sun-drenched mountains. The overall style should be a vibrant and dynamic 3D render, capturing the feeling of opportunity and natural beauty."
    # prompts_in=prompts_path  # alternative: path to the prompts file, as str
)
print("Ranking finished.")
print(result_df.head())

In [None]:
# In a Jupyter notebook cell
import sys
from pathlib import Path

%cd rank_images_project
# Assume the current directory is the project root
project_root = Path.cwd()
sys.path.insert(0, str(project_root / "src"))  # Add src to the import path

# --- Logging setup (if not configured yet) ---
import logging
# Note: basicConfig does nothing if the root logger already has handlers
# (pass force=True to override an earlier configuration).
logging.basicConfig(level=logging.DEBUG)  # or INFO
# ---

from rank_images.models import load_models
from rank_images.ranking import rank_folder

# Load the models once; this must happen before rank_folder is called
print("Loading models...")
load_models()
print("Models loaded.")

# Run the ranking
demo_images_dir = project_root / "data" / "demo_images"
# prompts_path = str(demo_images_dir / "prompts.json")  # path passed as str

print("Starting ranking...")
result_df = rank_folder(
    img_dir=demo_images_dir,
    # prompts_in=prompts_path  # path to the prompts file, as str
    # Or plain text, as in the previous example:
    prompts_in="An advertising image for a credit card, featuring prominent 3D rendered numbers representing a high deposit percentage. The scene is set against a stunning natural backdrop of a serene blue ocean meeting majestic, sun-drenched mountains. The overall style should be a vibrant and dynamic 3D render, capturing the feeling of opportunity and natural beauty."
)
print("Ranking finished.")
print(result_df.head())

/content/rank_images_project
Loading models...


Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
