A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but training-effective) batch of data is generated from the large dataset.

peterljq/Tutorial-of-Data-Distillation-and-Condensation

Tutorial-of-Data-Distillation-and-Condensation

Main Idea

Data Distillation and Condensation (DDC) is a data-centric task in which a representative (i.e., small but training-effective) batch of data is generated from a large dataset. Models trained on this small batch can reach test performance similar to that of models trained on the full dataset. In some cases, the distilled images preserve certain semantic aspects of the annotated objects in the full dataset, which makes them interpretable to human users. A brief demonstration of the task is shown below:

DDC Basic Pipeline
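
To make the pipeline concrete, here is a minimal toy sketch in NumPy. It is not the method of any particular paper: it assumes a simple "match the per-class mean" objective on 2-D Gaussian data and a nearest-centroid classifier, whereas published DDC methods typically match gradients, training trajectories, or deep features. All names (`make_toy_dataset`, `condense`, `fit_centroids`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_toy_dataset(n_per_class=500, seed=0):
    """Two Gaussian blobs in 2-D, standing in for the 'large dataset'."""
    rng = np.random.default_rng(seed)
    centers = np.array([[-2.0, 0.0], [2.0, 0.0]])
    X = np.concatenate([rng.normal(c, 1.0, size=(n_per_class, 2)) for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y

def condense(X, y, per_class=5, steps=200, lr=0.5, seed=0):
    """Learn `per_class` synthetic points per class (assumed objective: mean matching)."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    # Synthetic points start as random noise and are then optimized.
    X_syn = rng.normal(size=(len(classes) * per_class, X.shape[1]))
    y_syn = np.repeat(classes, per_class)
    for _ in range(steps):
        for c in classes:
            mask = y_syn == c
            gap = X_syn[mask].mean(axis=0) - X[y == c].mean(axis=0)
            # Gradient of 0.5 * ||gap||^2 w.r.t. each synthetic point is gap / per_class.
            X_syn[mask] -= lr * gap / per_class
    return X_syn, y_syn

def fit_centroids(X, y):
    """'Train' a nearest-centroid classifier: one mean vector per class."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def predict(model, X):
    classes, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def accuracy(model, X, y):
    return float((predict(model, X) == y).mean())

# Compare a model trained on the full data with one trained on the condensed set.
X_train, y_train = make_toy_dataset(seed=0)
X_test, y_test = make_toy_dataset(seed=1)
X_syn, y_syn = condense(X_train, y_train)

acc_full = accuracy(fit_centroids(X_train, y_train), X_test, y_test)
acc_small = accuracy(fit_centroids(X_syn, y_syn), X_test, y_test)
print(f"full data: {len(X_train)} samples, test accuracy {acc_full:.3f}")
print(f"condensed: {len(X_syn)} samples, test accuracy {acc_small:.3f}")
```

In this toy setting the two accuracies should be nearly identical, because a nearest-centroid classifier depends only on the class means that the condensed set was optimized to reproduce. With deep networks the relationship is far less direct, which is why the choice of matching objective is the main design question in DDC.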

Some recommended works in this domain include:

Vision Approaches

NLP Approaches

Privacy

(ICML 2022 Oral) Privacy for Free: How does Dataset Condensation Help Privacy? [ArXiv] [Poster]

Research Groups
