Models

Diffusers contains pretrained models for popular algorithms, as well as modules for building the next generation of diffusion models. The primary function of these models is to denoise an input sample by modeling the distribution $p_\theta(\mathbf{x}_{t-1}|\mathbf{x}_t)$. The models are built on the base class ['ModelMixin'], a torch.nn.Module with basic functionality for saving and loading models both locally and from the Hugging Face Hub.
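
To make the denoising target concrete, here is one common way to write a single reverse step. This is an assumption for illustration, following the Gaussian parameterization used in DDPM-style models. The text above does not commit to a specific form, and the symbols $\boldsymbol{\mu}_\theta$ and $\boldsymbol{\Sigma}_\theta$ are introduced here for the mean and covariance predicted by the network:

$$
p_\theta(\mathbf{x}_{t-1}|\mathbf{x}_t) = \mathcal{N}\left(\mathbf{x}_{t-1};\ \boldsymbol{\mu}_\theta(\mathbf{x}_t, t),\ \boldsymbol{\Sigma}_\theta(\mathbf{x}_t, t)\right)
$$

Under this reading, a model takes the noisy sample $\mathbf{x}_t$ and the timestep $t$ as input, and its output determines the parameters of the distribution over the slightly less noisy sample $\mathbf{x}_{t-1}$.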

API

Models should provide the forward function and the initialization of the model. All saving, loading, and utilities should live in the base ['ModelMixin'] class.
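
The division of responsibilities described above can be sketched with a dependency-free toy. This is not the Diffusers implementation: `ToyModelMixin`, `ToyDenoiser`, `save_to_dir`, and `load_from_dir` are hypothetical names, and plain Python lists and JSON stand in for tensors and checkpoint files. It only illustrates the pattern of a concrete model defining initialization and `forward` while the base class owns persistence.

```python
import json
import tempfile
from pathlib import Path


class ToyModelMixin:
    """Hypothetical stand-in for a base class that owns saving and loading."""

    def save_to_dir(self, directory):
        # Persist the constructor arguments and the parameters as plain JSON.
        path = Path(directory)
        path.mkdir(parents=True, exist_ok=True)
        state = {"config": self.config, "params": self.params}
        (path / "model.json").write_text(json.dumps(state))

    @classmethod
    def load_from_dir(cls, directory):
        # Rebuild the model from its saved config, then restore its parameters.
        state = json.loads((Path(directory) / "model.json").read_text())
        model = cls(**state["config"])
        model.params = state["params"]
        return model


class ToyDenoiser(ToyModelMixin):
    """A concrete model defines only initialization and forward."""

    def __init__(self, scale=0.5):
        self.config = {"scale": scale}
        self.params = {"scale": scale}

    def forward(self, sample, timestep):
        # Toy prediction: a scaled copy of the input (timestep is unused here).
        return [self.params["scale"] * x for x in sample]


# Round trip: save with the base-class utility, reload, then call forward.
with tempfile.TemporaryDirectory() as tmp:
    model = ToyDenoiser(scale=0.25)
    model.save_to_dir(tmp)
    restored = ToyDenoiser.load_from_dir(tmp)

prediction = restored.forward([1.0, 2.0, 4.0], timestep=10)
```

Because `ToyDenoiser` contains no persistence code of its own, a second model class could reuse the same save and load logic simply by inheriting from the mixin, which is the motivation for keeping those utilities in the base class.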

Examples

The ['UNetModel'] was proposed in TODO and has been used in paper1, paper2, paper3.
Extensions of the ['UNetModel'] include the ['UNetGlideModel'], which uses attention and timestep embeddings for the GLIDE paper, the ['UNetGradTTS'] model from this paper for text-to-speech, the ['UNetLDMModel'] for latent diffusion models in this paper, and the ['TemporalUNet'] used for time-series prediction in this reinforcement learning paper.
TODO: mention VAE / SDE score estimation
