StyleID : A Perception-Aware Dataset and Metric for Stylization-Agnostic Facial Identity Recognition
SIGGRAPH 2026 / ACM TOG Journal Track
StyleID is a CLIP-based image encoder designed to produce identity embeddings that remain robust under stylization.
It can be used for identity similarity measurement, retrieval, evaluation, and identity-aware conditioning in generative pipelines.
- (April 23) Quick-start recognition model released
- (April 24) arXiv preprint released
- Dataset release
- Data generation code release
```python
import torch
from transformers import CLIPModel, CLIPProcessor
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

model = CLIPModel.from_pretrained("kwanY/styleid").to(device)
processor = CLIPProcessor.from_pretrained("kwanY/styleid")

img = Image.open("example.jpg").convert("RGB")
inputs = processor(images=img, return_tensors="pt").to(device)

with torch.no_grad():
    emb = model.get_image_features(**inputs)

# L2-normalize before computing similarities (optional but recommended)
emb = emb / emb.norm(dim=-1, keepdim=True)
```
- Not suitable for images with multiple faces
- A rough center crop around the face is recommended for better performance
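Since a rough center crop around the face improves performance, a minimal sketch of a centered square crop is shown below (the helper name and the `scale` parameter are illustrative; any face-detection-based crop would also work):

```python
from PIL import Image


def center_crop_square(img: Image.Image, scale: float = 0.8) -> Image.Image:
    """Crop a centered square covering `scale` of the image's shorter side."""
    w, h = img.size
    side = int(min(w, h) * scale)
    left = (w - side) // 2
    top = (h - side) // 2
    return img.crop((left, top, left + side, top + side))
```

The cropped image can then be passed to the processor in place of the full frame.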
StyleID embeddings can be used for:
- Identity similarity comparison
- Image retrieval
- Stylized identity evaluation
- Identity-aware conditioning for generative models
- Research on face recognition under domain shift and stylization
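For identity similarity comparison, cosine similarity between L2-normalized embeddings is the standard choice. A minimal sketch (the function name is illustrative), applicable to `emb` tensors produced by the quick-start snippet:

```python
import torch


def identity_similarity(emb_a: torch.Tensor, emb_b: torch.Tensor) -> float:
    """Cosine similarity between two identity embeddings of shape (1, D).

    Normalization is applied here, so pre-normalized embeddings are also fine.
    """
    a = emb_a / emb_a.norm(dim=-1, keepdim=True)
    b = emb_b / emb_b.norm(dim=-1, keepdim=True)
    return float((a * b).sum(dim=-1))
```

A higher score indicates a more similar identity; for retrieval, the same score can rank a gallery of embeddings against a query.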
- StyleID is released for non-commercial research use.
- Do not use the FFHQ-derived data for biometric human recognition.
If you find this work useful, please cite the paper:
