dead zone near origin of latent space #12
Comments
Here is how to load a model:
Thanks @udibr, that worked great! So I stand corrected: I tried to replicate this sampling near the origin, but did not see any difference. In fact, strangely, I don't see any difference regardless of range. Some examples below; keep in mind the RNG is run from the same state.

Here is the default range:
    random_zmb = floatX(np_rng.uniform(-1., 1., size=(nvis, nz)))

Here is a scaled range:
    random_zmb = floatX(np_rng.uniform(-0.01, 0.01, size=(nvis, nz)))

Here is a scaled and translated range:
    random_zmb1 = floatX(np_rng.uniform(0.25, 0.26, size=(nvis, nz)))

And just in case you think my code is having no effect, here is a constant range:
    random_zmb1 = floatX(np_rng.uniform(0.0, 0.0, size=(nvis, nz)))

This seems to imply that the absolute magnitude and position of the vectors don't matter at all; what matters is their direction from their mean. This is surprising to me. I'll definitely have to rethink what a slice of the latent space should be, and this might have implications for what constitutes a random walk.
Here is one last one: a linear path from the (-1, -1, ...) corner of the hypercube to the (1, 1, ...) corner (here nz=100):

    random_zmb = floatX(np_rng.uniform(-1., 1., size=(nvis, nz)))
    for i in range(nvis):
        # note: frac runs from 0 to 1, so as written each row is a point
        # on the diagonal between the origin and the (1, 1, ...) corner
        frac = i / (nvis - 1.0)
        for j in range(nz):
            random_zmb[i][j] = frac

I like this one because it shows:
This reminds me of Hinton's famous document clustering plot (fig 4C in http://www.cs.toronto.edu/~hinton/science.pdf ). Each cluster is a ray from the center, and the distance from the center can be interpreted as how sure the cluster is. A "random walk" will be a circular path around the center.
It won't be a circular path in this case because Z is sampled from a uniform distribution.
@dribnet The generator/sample code currently uses the minibatch to calculate statistics for batchnorm. This is the reason why changing the scale and mean of the sampled Zs has no effect: batchnorm shifts and scales everything back to zero mean and unit variance.

The current batchnorm code supports using cached/computed inference values, so modifying the generator to pass in u (mean) and s (variance) to each call of batchnorm should fix the scale issues. The hack that also works is to pass in a large number of "random" samples alongside the points you want to sample; this was done for some of the figures in the paper. You should keep the ratio of visualization samples to random samples low to avoid significantly changing the batchnorm statistics.

This may or may not explain the dead zone. I'm out of town right now, but when I'm back late this weekend I can take a look on my end. Retraining with Z sampled from a unit sphere, or just random normed vectors, may fix the issue. If I remember correctly, some of this was experimented with in Ian's original code base.
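The batchnorm behaviour described in this comment can be reproduced outside the model. The following is a minimal NumPy sketch, not the repository's Theano code: the `batchnorm` function here is a simplified stand-in without gain or bias, and `W`, `nh`, `z_ref`, and the helper `padded` are hypothetical names chosen for illustration. It checks that minibatch statistics erase the scale and shift of Z, and that either fixed statistics or padding the batch with random samples makes the scale matter again.

```python
import numpy as np

def batchnorm(x, u=None, s=None, eps=1e-12):
    """Simplified batchnorm (no gain/bias). Uses minibatch statistics
    unless a fixed mean u and variance s are passed in."""
    if u is None:
        u = x.mean(axis=0)
    if s is None:
        s = x.var(axis=0)
    return (x - u) / np.sqrt(s + eps)

rng = np.random.RandomState(0)
nvis, nz, nh = 16, 100, 32           # nh is a hypothetical hidden width
W = rng.normal(size=(nz, nh))        # hypothetical first-layer weights
z = rng.uniform(-1.0, 1.0, size=(nvis, nz))

# Minibatch statistics: rescaled or shifted inputs normalize to the same values.
h_default = batchnorm(z @ W)
h_scaled = batchnorm((0.01 * z) @ W)             # like uniform(-0.01, 0.01)
h_shifted = batchnorm((0.255 + 0.005 * z) @ W)   # like uniform(0.25, 0.26)
print(np.allclose(h_default, h_scaled, atol=1e-6))
print(np.allclose(h_default, h_shifted, atol=1e-6))

# Fixed statistics estimated once from a large reference batch:
# now the scale of z changes the normalized activations.
z_ref = rng.uniform(-1.0, 1.0, size=(4096, nz))
u_fixed = (z_ref @ W).mean(axis=0)
s_fixed = (z_ref @ W).var(axis=0)
f_default = batchnorm(z @ W, u_fixed, s_fixed)
f_scaled = batchnorm((0.01 * z) @ W, u_fixed, s_fixed)
print(np.allclose(f_default, f_scaled, atol=1e-3))

def padded(points, filler):
    """The 'hack': normalize the points of interest together with many
    random samples, then keep only the rows for the points of interest."""
    batch = np.vstack([points, filler])
    return batchnorm(batch @ W)[:len(points)]

p_default = padded(z, z_ref)
p_scaled = padded(0.01 * z, z_ref)
print(np.allclose(p_default, p_scaled, atol=1e-3))
```

In this sketch the padding batch is 256 times larger than the visualized batch, so the visualized points barely move the statistics; a smaller ratio would let them pull the mean and variance toward themselves.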
W(kx) = k * W(x) and BN(kx) = sign(k) * BN(x), so generator(kx) = generator(x) if k > 0 and you take the tanh from the end of the generator
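The identity above can be spelled out in one step. This is a sketch under simplifying assumptions that are not stated in the thread: the statistics are taken per feature over a minibatch of size n, the small epsilon in the denominator is ignored, and the learned gain and bias are left out (with them, the sign flip applies only to the normalized term, and the case k > 0 is unaffected).

```latex
\begin{align*}
h_i &= W z_i, \qquad
\mu = \frac{1}{n}\sum_{i=1}^{n} h_i, \qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (h_i - \mu)^2 \\
z_i \mapsto k z_i \;&\Rightarrow\;
h_i \mapsto k h_i, \quad \mu \mapsto k\mu, \quad \sigma \mapsto |k|\,\sigma \\
\frac{k h_i - k\mu}{|k|\,\sigma}
&= \operatorname{sign}(k)\,\frac{h_i - \mu}{\sigma}
\end{align*}
```

For k > 0 the first batchnorm output is therefore unchanged, and every later layer sees exactly the same input.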
Thanks @Newmu, this makes sense now. I can try regenerating my images using your hack soon and compare results. And maybe if I look at the BN code, I can figure out a more principled way to add just a few extra samples per sample that I want (e.g., maybe: -sample, sample scaled to unit, -sample scaled to unit).
Could someone explain why the middle output images are different in the following cases?

Number of samples to visualize:
    nvis = 5

Generator:
    def gen(Z, w, g, b, w2, g2, b2, w3, g3, b3, w4, g4, b4, wx):
        h = relu(batchnorm(T.dot(Z, w), g=g, b=b))
        h = h.reshape((h.shape[0], ngf*8, 4, 4))
        ...

Visualize samples:
    color_grid_vis(inverse_transform(samples), (1, nvis), 'samples.png')

Case 1 (similar to what @dribnet posted above). Inputs:
    z = floatX(np_rng.uniform(0.0, 0.0, size=(nvis, nz)))

Since

Case 2 (almost the same as what @dribnet posted above). Inputs:
    step = 2.0 / (nvis - 1)
    z = floatX(np_rng.uniform(0.0, 0.0, size=(nvis, nz)))
    for i in range(nvis):
        v = -1.0 + step * i
        for j in range(nz):
            # assign into z, the array created above
            z[i][j] = v

Let

The values of
After working more with interpolation, I think this result is likely due to the fact that random distributions in high-dimensional spaces are shaped more like hyperspheres, and so areas close to the origin are extremely unlikely. This is true whether the prior is uniform or Gaussian, though it might be possible to construct a prior where this is not the case. So I'm happy to close this issue. A longer writeup of this reasoning is in this issue: soumith/dcgan.torch#14. @Newmu and @udibr, I would be interested in your feedback on this idea. Note that this result also has consequences for the best way to do interpolation and averaging (e.g., smiling woman) in the latent space.
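The claim about high-dimensional priors is easy to check numerically. The following is a small NumPy sketch, with the sample count `n` and the seed chosen arbitrarily for illustration: it draws latent vectors with nz=100 from a uniform and a Gaussian prior and looks at how their distances from the origin are distributed. It also measures the linear midpoint of two uniform samples, which is one way to see why interpolation and averaging are affected.

```python
import numpy as np

rng = np.random.RandomState(0)
nz, n = 100, 10000   # n is an arbitrary sample count for this sketch

z_uniform = rng.uniform(-1.0, 1.0, size=(n, nz))
z_normal = rng.normal(0.0, 1.0, size=(n, nz))

# Distance of each sample from the origin.
norm_u = np.linalg.norm(z_uniform, axis=1)
norm_g = np.linalg.norm(z_normal, axis=1)

# For the uniform prior, E[||z||^2] = nz / 3, so norms cluster near sqrt(nz / 3).
print("uniform  mean/std/min:", norm_u.mean(), norm_u.std(), norm_u.min())
print("gaussian mean/std/min:", norm_g.mean(), norm_g.std(), norm_g.min())
print("sqrt(nz / 3):", np.sqrt(nz / 3.0))

# Linear midpoints of pairs of uniform samples sit closer to the origin
# than the samples themselves.
half = n // 2
midpoints = 0.5 * (z_uniform[:half] + z_uniform[half:])
norm_mid = np.linalg.norm(midpoints, axis=1)
print("midpoint mean norm:", norm_mid.mean())
```

None of the sampled vectors land anywhere near the origin, so a generator trained on such samples has essentially never seen inputs from that region.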
In looking at manifolds in my own experiments, I have noticed a consistent "dead zone" near the origin of the latent space. Here is an example generated with faces/train_uncond_dcgan.py and z=100:

I can post the math later, but suffice it to say that the area near the center of the image is proportionally near zero in all z dimensions.

My strong suspicion is that this could be replicated by replacing this line in train_uncond_dcgan.py:
with
and seeing if this results in poor quality samples. I can follow up and try this if that is useful. I haven't done so yet because I first need to implement the load operation to use one of the models that is being saved each epoch.

This isn't causing me any consternation, but I thought I would mention it since it's an unexpected curiosity, and so might be a bug or might just be something I don't understand about the nature of this latent space.