Data Distillation and Condensation (DDC) is a data-centric task in which a representative (i.e., small but training-effective) set of samples is generated from a large dataset. Models trained on this small set can reach test performance similar to that of models trained on the full dataset. The distilled images sometimes preserve certain semantic aspects of the annotated objects in the full dataset, making them interpretable to human users. A brief demonstration of the task is shown below:
*Figure: DDC basic pipeline*
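As a toy illustration of the "small but training-effective" idea, consider distilling a linear-regression dataset. This is only a hedged sketch, not any of the methods cited below: real DDC methods optimize synthetic image pixels iteratively, whereas for linear least squares a synthetic set that matches the full-data gradient at every model weight can be built in closed form, since matching only requires reproducing the moments `X.T @ X` and `X.T @ y`.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Full" dataset: 1000 samples of a noisy linear relation.
n, d = 1000, 5
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

# Distill to m = d = 5 synthetic points. The mean gradient of the squared
# loss is g(w) = X.T @ (X @ w - y) / n, so a synthetic set (Xs, ys) matches
# it for *every* w iff
#   Xs.T @ Xs / m == X.T @ X / n   and   Xs.T @ ys / m == X.T @ y / n.
m = d
L = np.linalg.cholesky(X.T @ X)                    # X.T @ X = L @ L.T
Xs = np.sqrt(m / n) * L.T                          # m x d synthetic inputs
ys = np.sqrt(m / n) * np.linalg.solve(L, X.T @ y)  # m synthetic labels

# Models fit on the two datasets coincide: 5 synthetic points are
# training-equivalent to the 1000 real ones for this model class.
w_full, *_ = np.linalg.lstsq(X, y, rcond=None)
w_syn, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
print(np.max(np.abs(w_full - w_syn)))  # negligible: identical fits
```

The closed form exists only because the model is linear; for neural networks the works below instead optimize the synthetic set by gradient descent, e.g. by matching training gradients or whole training trajectories.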
Some recommended works in this domain include:
- (2018) Dataset Distillation [ArXiv] [Code] [Project]
- (CVPR 2022) Dataset Distillation by Matching Training Trajectories [ArXiv] [Code] [Project] [Workshop Version]
- (ICML 2022 Oral) Privacy for Free: How does Dataset Condensation Help Privacy? [ArXiv] [Poster]