Status: Open
Labels: type:feature (New feature or request)
Description
I am interested in contributing to Keras by implementing DINOv3 (self-DIstillation with NO labels, v3), a state-of-the-art self-supervised Vision Transformer, as an example/tutorial. Before proceeding, I would like to confirm that this aligns with the project's goals, and to ask whether there are existing implementations or guidelines I should be aware of.
Why DINOv3?
- State-of-the-art performance: DINOv3 achieves top-tier results on a range of vision tasks without requiring labeled data, making it a valuable addition to the Keras examples.
- Versatility: Its learned features serve as a strong backbone for tasks like image classification, segmentation, and object detection.
- Alignment with Keras 3: Given Keras 3's multi-backend support (TensorFlow, JAX, PyTorch), a DINOv3 example would showcase the framework's flexibility.
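On the multi-backend point: Keras 3 dispatches `keras.ops` calls to whichever backend is active, so a single DINOv3 example could run unchanged on all three frameworks. A minimal sketch of how a reader would select the backend (the environment variable must be set before `keras` is first imported):

```python
import os

# Choose the Keras 3 backend before importing keras.
# Valid values are "tensorflow", "jax", and "torch".
os.environ["KERAS_BACKEND"] = "jax"

# import keras  # uncomment once the chosen backend is installed;
#                # keras.ops calls will then dispatch to JAX
print(os.environ["KERAS_BACKEND"])
```

This is the standard Keras 3 mechanism for backend selection; the example script itself would not need any backend-specific branches.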
 
Implementation Plan:
- Model architecture: Implement the Vision Transformer (ViT) backbone trained with the DINOv3 self-supervised objective.
- Training: Train on standard datasets such as CIFAR-10 or ImageNet.
- Backend compatibility: Ensure the implementation runs on the TensorFlow, JAX, and PyTorch backends.
- Documentation: Provide clear usage instructions, including training and evaluation scripts.
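To make the plan concrete, here is a backend-agnostic NumPy sketch of the core self-distillation loss used by the DINO family: cross-entropy between sharpened, centered teacher targets and student predictions. The `softmax` and `dino_loss` helpers are illustrative names, and the temperatures (0.1 student, 0.04 teacher) follow the original DINO defaults; exact DINOv3 hyperparameters would come from the paper. The same math maps one-to-one onto `keras.ops`.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between sharpened teacher targets and student
    predictions, as in the DINO family of methods.

    student_logits, teacher_logits: (batch, dim) projection-head outputs.
    center: (dim,) running mean of teacher outputs, subtracted to
    avoid representation collapse.
    """
    # Teacher targets: centered, sharpened with a low temperature.
    teacher_probs = softmax((teacher_logits - center) / teacher_temp)
    # Student predictions at a higher temperature; epsilon guards log(0).
    student_log_probs = np.log(softmax(student_logits / student_temp) + 1e-9)
    # In a real training loop the teacher side carries no gradient and
    # is updated as an EMA of the student weights.
    return -(teacher_probs * student_log_probs).sum(axis=-1).mean()

rng = np.random.default_rng(0)
student = rng.normal(size=(4, 16))
teacher = rng.normal(size=(4, 16))
center = np.zeros(16)
loss = dino_loss(student, teacher, center)  # positive scalar cross-entropy
```

A Keras 3 version would replace the NumPy calls with their `keras.ops` counterparts inside a custom `train_step`, which is what keeps the example portable across backends.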
 