
Role of coefficients #6

Closed
NotNANtoN opened this issue Nov 3, 2021 · 3 comments

Comments

@NotNANtoN

Hello again,

in the lines I've marked you initialize a set of coefficients to optimize over. As far as I can see, these are not mentioned in the paper. The coefficients are multiplied by the direction per source image, so I gather that you want to optimize a different scale of the direction vector for each source image. I have some questions about this:

  1. Did you try it without these coefficients?
  2. What values do the coefficients converge to? Do they stay close to 1?
  3. You re-initialize the Adam optimizer for the coefficients for every step within the optimization, hence drastically changing the behavior of the optimizer. Is this intended or a misplacement? If it is intended, what is it used for?

Thanks again for your work! I hope I am not too picky on this - I'm just curious about the topic of semantics in these latent spaces :-)

coefficients = [None] * NUM_IMAGES
for n in range(NUM_IMAGES):
    # one learnable scale coefficient per source image, initialized to 1
    coefficient = torch.ones(1).to("cuda")
    coefficient.requires_grad = True
    coefficients[n] = coefficient
opt_loss = torch.Tensor([float("Inf")]).to("cuda")
pbar = tqdm(range(args.step))
for i in pbar:
    # calculate the learning rate for the latent optimizer
    t = i / args.step
    lr = get_lr(t, args.lr)
    optimizer.param_groups[0]["lr"] = lr
    # the Adam optimizer for the coefficients is re-created on every step here
    optimizer_coeffs = optim.Adam(coefficients, lr=args.lr, weight_decay=0.01)
    loss = torch.zeros(1).cuda()
    target_semantic = torch.zeros(1).cuda()

@hila-chefer
Owner

Hi @NotNANtoN :)
First, please feel free to ask anything, I'm happy to answer :)
Indeed, we do not mention the optimization of coefficients in our paper, since it's a short 4-page paper (+ no supplementary).
In addition, as you observed, the coefficients are used to allow for finer manipulation of each source. Intuitively, if the source and target are semantically close (say both have a beard), we would want to apply a smaller change to the source to resemble the target.
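
Concretely, each coefficient just scales the shared direction before it is added to the corresponding source latent. A rough illustrative sketch (the names source_latents, direction, generator, clip_loss, and target_image are placeholders, not the exact identifiers from the repo):

# illustrative: each source image gets its own learned scale for the shared direction
for n in range(NUM_IMAGES):
    manipulated_latent = source_latents[n] + coefficients[n] * direction
    edited_image = generator(manipulated_latent)           # synthesize the edited source
    loss = loss + clip_loss(edited_image, target_image)    # CLIP-space distance to the target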

  1. Yes, a few times; overall, the results were very similar.
  2. From what I observed, they usually end up in the range 0.5–1.8, which is also around the range we provide in our notebooks :)
  3. You are absolutely right, this is a misplacement. The optimizer should be initialized once per direction, not once per step. I'll fix this in my next code update (soon), thanks for the catch! In the meantime, to address 3, I tried our joker training with the fix, and as you can see in the image below (results on the training set with the optimized coefficients), the difference in results isn't major. A minimal sketch of the intended fix is included at the end of this comment.

I hope I was able to answer all your questions :)

[image: joker training results on the training set with the optimized coefficients]
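
For reference, a minimal sketch of the intended fix, assuming the same variable names as in the snippet quoted above: the Adam optimizer for the coefficients is created once, before the step loop, instead of inside it.

# intended fix: build the coefficient optimizer once (per direction), not on every step
optimizer_coeffs = optim.Adam(coefficients, lr=args.lr, weight_decay=0.01)
pbar = tqdm(range(args.step))
for i in pbar:
    # learning-rate schedule for the latent optimizer stays unchanged
    t = i / args.step
    lr = get_lr(t, args.lr)
    optimizer.param_groups[0]["lr"] = lr
    # ... compute the loss as before, then step both optimizer and optimizer_coeffs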

@hila-chefer
Owner

Hi @NotNANtoN, I’m closing this issue due to inactivity, but feel free to reopen if necessary.

@NotNANtoN
Author

Thanks a lot, your answers were very insightful :)
