
Add UNet 1d for RL model for planning + colab #105

Merged · 69 commits merged into main · Nov 14, 2022

Conversation

@natolambert (Contributor) commented Jul 19, 2022

Creating the PR to add back in the RL code when ready.
This contains the model to run inference for Janner et al.'s Diffuser, with the accompanying colab.

work in progress

TODO:

  • make sure old tests pass
  • make sure colab still works

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@ezhang7423

Hi, are there any updates on this?

@natolambert
Contributor Author

Hi @ezhang7423, we have mostly paused merging this until Stable Diffusion calms down (which can be changed if there is interest!).

That being said, this PR is very stable. What are you interested in? Have you looked at the colab? Happy to discuss anything here!

@ezhang7423

Thank you so much for the helpful response! I'm curious whether the 1D convolution class is actually necessary. Theoretically, I think you could do everything with a 2D convolution with a non-square kernel. Would it be easier to just create a wrapper class on top of the existing models that does this?
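
To illustrate that equivalence concretely (a sketch, not code from this PR): a Conv2d with a (1, k) kernel, applied to the input with a dummy height dimension, reproduces a Conv1d exactly once the weights are shared.

import torch
import torch.nn as nn

# Sketch: emulate Conv1d with a Conv2d whose kernel is (1, k).
x = torch.randn(2, 14, 16)  # [batch, channels, length]

conv1d = nn.Conv1d(14, 32, kernel_size=5, padding=2)
conv2d = nn.Conv2d(14, 32, kernel_size=(1, 5), padding=(0, 2))

with torch.no_grad():
    conv2d.weight.copy_(conv1d.weight.unsqueeze(2))  # [out, in, k] -> [out, in, 1, k]
    conv2d.bias.copy_(conv1d.bias)

out_1d = conv1d(x)                          # [2, 32, 16]
out_2d = conv2d(x.unsqueeze(2)).squeeze(2)  # add, then drop, a dummy height axis
assert torch.allclose(out_1d, out_2d, atol=1e-6)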

I'm also wondering if you have the code for training one of these models from scratch.

@natolambert
Contributor Author

Ah, so this model is mostly a direct port of the original code. We would definitely welcome community contributions to help reuse the existing model components.

Part of the reason we did not is that training is complex. It involves a lot more RL infrastructure that we don't really intend to support in diffusers. I think the first step would be to re-train in another repo with diffusers as a dependency.

I'm happy to discuss further if this is something you want to work on, but it's not really on my runway (because I know it will be very challenging, given my experience in RL).

@natolambert changed the title from "[post release] re-add RL model code" to "RL model for planning + colab" on Sep 9, 2022
@github-actions

github-actions bot commented Oct 3, 2022

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions bot added the "stale" label (issues that haven't received updates) on Oct 3, 2022
@natolambert
Contributor Author

Fighting the stale bot! We haven't forgotten about this, and actually moved it up the priority list today. Soon!

@natolambert removed the "stale" label on Oct 3, 2022
@natolambert
Contributor Author

natolambert commented Oct 3, 2022

Okay, I just updated the Colab too so it works.

Now a couple final thoughts:

  1. I will add some documentation placeholders.
  2. When released, I will add some issues people can address to unify some similar code snippets (like Upsample1D, RearrangeDim, and Conv1D); a sketch of one such module follows after this list.
  3. I will update the colab in huggingface/notebooks when we release it and update links in the repo to that more permanent link.
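
For context, a hedged sketch of the kind of small helper slated for unification (assumed details; the hypothetical class below only approximates the real Upsample1D):

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hedged sketch (assumed details): nearest-neighbor upsampling along the
# length axis of a [batch, channels, length] tensor, with an optional conv.
class Upsample1DSketch(nn.Module):
    def __init__(self, channels: int, use_conv: bool = False):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, 3, padding=1) if use_conv else None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.interpolate(x, scale_factor=2.0, mode="nearest")
        return self.conv(x) if self.conv is not None else x

out = Upsample1DSketch(14, use_conv=True)(torch.randn(1, 14, 16))  # [1, 14, 32]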

t = self.time_mlp(timestep)
h = []

for resnet, resnet2, downsample in self.downs:
Contributor

Let's really try to mirror the design in src/diffusers/models/unet_2d.py here

Contributor Author

I'll try and make it work with the get_block function. The parameter re-naming script is getting quite ugly so we'll see how far I make it 🤗.
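
For readers following along, a minimal sketch of the string-dispatch pattern being discussed (hypothetical toy version; the names and block types are placeholders, not the final API):

import torch
import torch.nn as nn

# Toy get_block-style factory: map a block-type string from the config to a
# module, so 1D and 2D UNets can share one construction pattern.
def get_down_block(down_block_type: str, in_channels: int, out_channels: int) -> nn.Module:
    if down_block_type == "DownBlock1D":
        return nn.Conv1d(in_channels, out_channels, kernel_size=3, stride=2, padding=1)
    raise ValueError(f"Unknown down_block_type: {down_block_type}")

block = get_down_block("DownBlock1D", 14, 32)
print(block(torch.randn(1, 14, 16)).shape)  # torch.Size([1, 32, 8])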

@patrickvonplaten (Contributor) left a comment

Could we try to morph the design more into what we have in unet_2d.py?

Also, to me the unet looks quite general, so instead of having it in a specific unet_rl.py, I think we could put it in a more general unet_1d.py class, no? What do you think @natolambert?

@natolambert
Contributor Author

natolambert commented Oct 4, 2022

I agree. I was expecting to spend some time improving code quality; I just wanted to know where to start, @patrickvonplaten. I will take a deeper look and ask questions when I'm stuck.

Most of the code is still in the form of the original.

@natolambert
Contributor Author

natolambert commented Oct 6, 2022

@patrickvonplaten regarding your last comment above too, yes I think this can become unet_1d. The code specific to RL was only used during training and has all been removed.

The tricky thing is that the model here uses three-dimensional tensors [batch x time_dimension x state_dimension], so that may be different from some 1D use cases. Maybe we have some optional batching? Or the RL model becomes a wrapper around a 1D unet?
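
To make the shape question concrete (a sketch with assumed sizes, not code from the PR):

import torch

# Diffuser-style planning sample: [batch, time_dimension, state_dimension],
# where the per-step features stack states and actions (sizes assumed here).
batch, horizon, state_dim, action_dim = 4, 32, 11, 3
trajectories = torch.randn(batch, horizon, state_dim + action_dim)

# A generic 1D unet expecting [batch, channels, length] would instead treat
# the per-step features as channels and the horizon as length:
unet_input = trajectories.permute(0, 2, 1)  # [4, 14, 32]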

@patrickvonplaten
Contributor

If it's a very special model, then indeed let's put it in its own unet class :-) Maybe easier to begin with anyways

@HuggingFaceDocBuilder

HuggingFaceDocBuilder commented Nov 14, 2022

The documentation is not available anymore as the PR was closed or merged.

@@ -42,7 +42,7 @@ Training examples show how to pretrain or fine-tune diffusion models for a varie
| [**Text-to-Image fine-tuning**](./text_to_image) | ✅ | ✅ |
| [**Textual Inversion**](./textual_inversion) | ✅ | - | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_textual_inversion_training.ipynb)
| [**Dreambooth**](./dreambooth) | ✅ | - | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_dreambooth_training.ipynb)

| [**Reinforcement Learning for Control**](https://github.com/huggingface/diffusers/blob/main/examples/rl/run_diffusers_locomotion.py) | - | - | coming soon.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool!

import d4rl # noqa
import gym
import tqdm
from diffusers.experimental import ValueGuidedRLPipeline
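
A hedged sketch of how these imports fit together (the checkpoint id and call signature below are assumptions drawn from the example script, not guaranteed):

# Assumed usage sketch: plan actions with the value-guided pipeline.
env = gym.make("hopper-medium-v2")  # environment registered by the d4rl import
pipeline = ValueGuidedRLPipeline.from_pretrained(
    "bglick13/hopper-medium-v2-value-function-hor32",  # assumed checkpoint id
    env=env,
)

obs = env.reset()
for _ in tqdm.tqdm(range(100)):
    action = pipeline(obs, planning_horizon=32)  # assumed signature
    obs, reward, done, info = env.step(action)
    if done:
        break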
Contributor

perfect :-)

self.act = None
if act_fn == "silu":
    self.act = nn.SiLU()
self.linear_2 = nn.Linear(time_embed_dim, time_embed_dim)
if act_fn == "mish":
Contributor

Suggested change:
-   if act_fn == "mish":
+   elif act_fn == "mish":
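
Spelled out as a runnable fragment (a sketch; the mish branch body is truncated in the excerpt, so nn.Mish() is an assumption):

import torch.nn as nn

# Sketch of the corrected selection: mutually exclusive branches via elif.
def make_act(act_fn: str):
    if act_fn == "silu":
        return nn.SiLU()
    elif act_fn == "mish":
        return nn.Mish()  # assumed branch body, not shown in the excerpt
    return None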

@@ -62,14 +62,21 @@ def get_timestep_embedding(


class TimestepEmbedding(nn.Module):
-   def __init__(self, channel: int, time_embed_dim: int, act_fn: str = "silu"):
+   def __init__(self, in_channels: int, time_embed_dim: int, act_fn: str = "silu", out_dim: int = None):
Contributor

We never use the keyword argument "channel=" anywhere, no?

def __init__(self, inp_channels, out_channels, embed_dim, kernel_size=5):
    super().__init__()

    self.blocks = nn.ModuleList(
Contributor

Not a big fan of using blocks here, as it makes it very hard to adapt the class to future models. Sorry, this is a bit annoying to do when converting the checkpoint, but could we maybe instead call the layers self.conv_in and self.conv_out? The idea is that if in the future there are checkpoints which have an intermediate conv layer, it'd be much easier to adapt this layer without breaking the previous checkpoints.
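
A sketch of the suggested renaming (hypothetical layer shapes; only the names conv_in and conv_out come from the comment):

import torch.nn as nn

# Hypothetical sketch: named attributes instead of an indexed ModuleList, so a
# future intermediate layer can be inserted without invalidating old
# state_dict keys like "blocks.0.weight".
class OutBlockSketch(nn.Module):
    def __init__(self, inp_channels, out_channels, embed_dim, kernel_size=5):
        super().__init__()
        self.conv_in = nn.Conv1d(inp_channels, embed_dim, kernel_size, padding=kernel_size // 2)
        self.conv_out = nn.Conv1d(embed_dim, out_channels, 1)

    def forward(self, x):
        return self.conv_out(self.conv_in(x))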

@@ -1,3 +1,17 @@
# Copyright 2022 The HuggingFace Team. All rights reserved.
Contributor

thanks!

@@ -54,16 +75,20 @@ def __init__(
    out_channels: int = 2,
    extra_in_channels: int = 0,
    time_embedding_type: str = "fourier",
    freq_shift: int = 0,
Contributor

This we cannot remove anymore sadly as we have dance diffusion now which relies on this config parameter https://huggingface.co/harmonai/maestro-150k/blob/main/unet/config.json#L35 -> could we leave the name as is here please :-) Otherwise we'd break dance diffusion 😅

Contributor

If you feel super strongly about it we could also adapt the config of dance diffusion online and leave as is

    flip_sin_to_cos: bool = True,
    use_timestep_embedding: bool = False,
    downscale_freq_shift: float = 0.0,
Contributor

We'll have to use freq_shift I'm afraid

    act_fn: str = None,
    norm_num_groups: int = 8,
    layers_per_block: int = 1,
    always_downsample: bool = False,
Contributor

not a huge fan of the naming here, "always": what does "always" mean? Maybe "downsample_each_block" or something instead?


# 5. post-process
if self.out_block:
    sample = self.out_block(sample, timestep_embed)

if not return_dict:
Contributor

Very cool adaptations here! We just need to make sure the dance diffusion tests still pass :-)

Contributor Author

Yeah, I had been testing along the way; will double-check before merge!

for layer in self.final_block:
    hidden_states = layer(hidden_states)

return hidden_states
Contributor

that works for me!

Contributor

Nice naming :-)

@@ -374,11 +598,71 @@ def __init__(self, in_channels, out_channels, mid_channels=None):

    self.resnets = nn.ModuleList(resnets)

-   def forward(self, hidden_states, res_hidden_states_tuple):
+   def forward(self, hidden_states, res_hidden_states_tuple, temb=None):
Contributor

ok for me!

@@ -204,6 +204,7 @@ def _get_variance(self, t, predicted_variance=None, variance_type=None):
# for rl-diffuser https://arxiv.org/abs/2205.09991
elif variance_type == "fixed_small_log":
    variance = torch.log(torch.clamp(variance, min=1e-20))
    variance = torch.exp(0.5 * variance)
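
Reading the change (an inference from the diff, not a claim made in the thread): the clamp guards against log(0), and exp(0.5 * log(variance)) equals sqrt(variance), so this branch returns a standard deviation rather than a variance.

import torch

# Sketch of what the "fixed_small_log" branch computes:
variance = torch.tensor([1e-30, 0.04, 0.25])
log_var = torch.log(torch.clamp(variance, min=1e-20))  # clamp guards log(0)
std = torch.exp(0.5 * log_var)                         # == sqrt(clamped variance)
print(std)  # tensor([1.0000e-10, 2.0000e-01, 5.0000e-01])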
Contributor

Uff that's a pretty big change - do we need to do this? Is this not affecting any of our DDPM checkpoints/tests?

Contributor

@patrickvonplaten I added this variance_type condition specifically for the RL implementation. Double checked that nothing else in the code base uses it.


def test_forward_with_norm_groups(self):
    # Not implemented yet for this UNet
    pass
Contributor

nice tests!

@patrickvonplaten (Contributor) left a comment

Very cool! Everything looks more or less good to me, just 2 questions:

  1. Are we sure we're not breaking dance diffusion here? Let's make sure the tests pass, and even if they do, we have to adapt all the configs online if we want to change the naming.
  2. Are the changes to the DDPM scheduler safe? Would be nice to also run the DDPM pipeline tests here to be sure.

It'd be super nice if we could rename the one "self.blocks" parameter to "self.conv_in" and "self.conv_out" as this is more extendable.

examples/rl/README.md (outdated; resolved)
@natolambert merged commit 7c5fef8 into main on Nov 14, 2022
@ezhang7423

Congrats!! Thank you for all the great work.

@patrickvonplaten
Contributor

Great work - thanks a lot :-)

@kashif deleted the rl branch on March 19, 2023 12:20
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* re-add RL model code

* match model forward api

* add register_to_config, pass training tests

* fix tests, update forward outputs

* remove unused code, some comments

* add to docs

* remove extra embedding code

* unify time embedding

* remove conv1d output sequential

* remove sequential from conv1dblock

* style and deleting duplicated code

* clean files

* remove unused variables

* clean variables

* add 1d resnet block structure for downsample

* rename as unet1d

* fix renaming

* rename files

* add get_block(...) api

* unify args for model1d like model2d

* minor cleaning

* fix docs

* improve 1d resnet blocks

* fix tests, remove permuts

* fix style

* add output activation

* rename flax blocks file

* Add Value Function and corresponding example script to Diffuser implementation (huggingface#884)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* update post merge of scripts

* add mdiblock / outblock architecture

* Pipeline cleanup (huggingface#947)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

* clean up comments

* convert older script to using pipeline and add readme

* rename scripts

* style, update tests

* delete unet rl model file

* remove imports in src

Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* Update src/diffusers/models/unet_1d_blocks.py

* Update tests/test_models_unet.py

* RL Cleanup v2 (huggingface#965)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

* clean up comments

* convert older script to using pipeline and add readme

* rename scripts

* style, update tests

* delete unet rl model file

* remove imports in src

* add specific vf block and update tests

* style

* Update tests/test_models_unet.py

Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* fix quality in tests

* fix quality style, split test file

* fix checks / tests

* make timesteps closer to main

* unify block API

* unify forward api

* delete lines in examples

* style

* examples style

* all tests pass

* make style

* make dance_diff test pass

* Refactoring RL PR (huggingface#1200)

* init file changes

* add import utils

* finish cleaning files, imports

* remove import flags

* clean examples

* fix imports, tests for merge

* update readmes

* hotfix for tests

* quality

* fix some tests

* change defaults

* more mps test fixes

* unet1d defaults

* do not default import experimental

* defaults for tests

* fix tests

* fix-copies

* fix

* changes per Patrik's comments (huggingface#1285)

* changes per Patrik's comments

* update conversion script

* fix renaming

* skip more mps tests

* last test fix

* Update examples/rl/README.md

Co-authored-by: Ben Glickenhaus <benglickenhaus@gmail.com>