I am attempting to use this cGAN implementation to build a visual neural model of a specific person from photos or videos. In one experiment I removed a feature from the input image (say, an eye) and then watched what it takes for the cGAN to recover the missing feature through learning. While monitoring the progress of such a training session, I was surprised to see that the cGAN appears to move one or more whole eyes around, seemingly searching for a fit, like this:
My question is: what mechanism in this cGAN implementation would explain this behavior? I can imagine that the whole eye is represented as a high-level feature in an upper layer of the generator, and that the fractionally-strided convolutions allow that feature to be translated. But I am not sure why we sometimes see two eyes, why the fitting process appears somewhat aimless and is often off the mark by quite a bit (it did take quite a while, with the eye being moved all over the place during training), or why the cGAN would consider adding an eye far from the correct position worth trying, since the error from that would seem fairly high.
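The translation intuition above can be sketched with a toy example. In a transposed (fractionally-strided) convolution, each coarse-grid activation stamps a scaled copy of a learned kernel into the output, so moving one high-level unit by one coarse position moves the whole template by `stride` pixels, and two active units produce two copies of the template. This is only an illustrative sketch in NumPy; the 1-D "eye" kernel and the function name are hypothetical, not taken from the actual pix2pix code:

```python
import numpy as np

def transposed_conv1d(x, kernel, stride=2):
    """Toy fractionally-strided (transposed) 1-D convolution:
    each input activation stamps a scaled copy of `kernel`
    into the output, `stride` positions apart."""
    out = np.zeros(stride * (len(x) - 1) + len(kernel))
    for i, a in enumerate(x):
        out[i * stride : i * stride + len(kernel)] += a * kernel
    return out

# A hypothetical learned "eye" template in the last generator layer.
eye = np.array([1.0, 3.0, 1.0])

# One active high-level unit at coarse position 2 ...
z1 = np.zeros(8); z1[2] = 1.0
# ... the same unit shifted to coarse position 5 ...
z2 = np.zeros(8); z2[5] = 1.0
# ... and two units active at once.
z3 = z1 + z2

y1 = transposed_conv1d(z1, eye)
y2 = transposed_conv1d(z2, eye)
y3 = transposed_conv1d(z3, eye)

# Shifting the coarse activation by 3 units shifts the whole
# template by 3 * stride = 6 output positions.
assert np.allclose(y1[4:7], eye)
assert np.allclose(y2[10:13], eye)
# Two active units -> two copies of the template ("two eyes").
assert np.allclose(y3, y1 + y2)
```

Under this picture, "moving the eye around" would correspond to the generator shifting which coarse-grid unit is active, which is a cheap, local change in feature space even if it lands far off the mark in pixel space.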
Does anybody have insight into this?
@phillipi You can find my dataset HERE. Somehow GitHub won't allow me to attach it as a zip file here.
You can also find more details about my experiment in my blog HERE, mainly in the Experiment #2 section. There I have described some other observed phenomena, such as the ghostly cross-identity repair of a missing feature, the wrong eye being repaired, etc.