Models

Diffusers contains pretrained models for popular algorithms and modules for creating the next set of diffusion models. The primary function of these models is to denoise an input sample by modeling the distribution $p_\theta(x_{t-1} \mid x_t)$. The models are built on the base class ['ModelMixin'], a torch.nn.Module with basic functionality for saving and loading models both locally and from the Hugging Face Hub.
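
For context, one common way to make this distribution concrete is a Gaussian whose mean is predicted by the network. This is an assumption borrowed from the DDPM formulation, not something this page specifies, and the fixed variance $\sigma_t^2$ is likewise only one possible choice:

$$
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 I\big)
$$

Under that assumption, a model's forward pass takes the noisy sample $x_t$ and the timestep $t$ and returns the quantity (for example, the predicted noise) from which $\mu_\theta(x_t, t)$ is computed.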

API

Models should provide a forward method and the model's initialization (__init__). All saving, loading, and utility code should live in the base ['ModelMixin'] class.
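
The split described above can be sketched with a toy example. This is a minimal, standard-library-only illustration of the pattern, not the library's implementation: ToySaveLoadMixin, ToyDenoiser, the config.json file name, and the JSON storage format are all hypothetical, and the real base class is a torch.nn.Module rather than a plain Python class.

```python
import json
import os
import tempfile


class ToySaveLoadMixin:
    """Hypothetical stand-in for a base class that owns saving and loading."""

    config_name = "config.json"  # assumed file name, for illustration only

    def save_pretrained(self, save_directory):
        # Persist both the constructor arguments and the "weights".
        os.makedirs(save_directory, exist_ok=True)
        path = os.path.join(save_directory, self.config_name)
        with open(path, "w") as f:
            json.dump({"init_kwargs": self.init_kwargs, "weights": self.weights}, f)

    @classmethod
    def from_pretrained(cls, save_directory):
        # Rebuild the model from its stored arguments, then restore the weights.
        path = os.path.join(save_directory, cls.config_name)
        with open(path) as f:
            state = json.load(f)
        model = cls(**state["init_kwargs"])
        model.weights = state["weights"]
        return model


class ToyDenoiser(ToySaveLoadMixin):
    """The model itself only defines its initialization and its forward pass."""

    def __init__(self, scale=0.5):
        self.init_kwargs = {"scale": scale}
        self.weights = {"scale": scale}

    def forward(self, sample, timestep):
        # Toy "denoising": shrink each value, more strongly at larger timesteps.
        factor = self.weights["scale"] ** timestep
        return [x * factor for x in sample]


model = ToyDenoiser(scale=0.5)
with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)
    restored = ToyDenoiser.from_pretrained(tmp)

output = restored.forward([2.0, -4.0], timestep=1)
print(output)
```

Keeping persistence in the shared base class means a new model only has to describe its own parameters and computation, and every model is saved and loaded the same way.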

Examples

  • The ['UNetModel'] was proposed in TODO and has been used in paper1, paper2, paper3.
  • Extensions of the ['UNetModel'] include the ['UNetGlideModel'], which uses attention and timestep embeddings for the GLIDE paper; the ['UNetGradTTS'] model from this paper for text-to-speech; the ['UNetLDMModel'] for latent-diffusion models in this paper; and the ['TemporalUNet'], used for time-series prediction in this reinforcement learning paper.
  • TODO: mention VAE / SDE score estimation