
Does SIREN only work well with over-parameterised networks? #31

Open

MengZephyr opened this issue Dec 18, 2020 · 2 comments

Comments

@MengZephyr

Hello,

I spent a week testing SIREN and trying to introduce it into my system. To my knowledge, a ReLU network is at least able to produce a result with averaged or balanced patterns from the training data.

I did a small test: jointly training a neural feature image and a CNN auto-encoder with sine as the activation. With just 2 cat images as the reconstruction target, to my surprise, an over-parameterized network together with the 2 corresponding neural images very quickly (about 1000 iterations) produced results with beautiful high-frequency hair patterns. Then I began to reduce the size and dimension of the neural image. The loss still decreased quickly at first, but at some iteration it suddenly jumped up and gave a very poor result. Here I attach the intermediate results:
[attached image: intermediate reconstruction results]

Perhaps I am thinking naively, but is it because the sine activation is very sensitive to the gradient step, so that any wrong step may lead the result into a bad local minimum? Have you tested the generalisation of SIREN? Or does SIREN only work well with over-parameterised networks?

@VovaTch

VovaTch commented Feb 15, 2022

I've only come across SIREN very recently, but I've been experimenting independently with a similar type of network, so maybe I can help. Yes, this is a problem I've encountered too, where my fitted images go haywire and dissolve into noise. What worked for me is a specific training setup: OneCycleLR (PyTorch) in conjunction with AdamW. Using weight decay and amsgrad seems essential (usually a 1e-3 max learning rate and 1e-6 weight decay), otherwise the output does indeed dissolve into noise. Clipping the gradient norm to about 0.1 to 1 also helps; a minimal sketch of this setup is below.

Keep in mind that, mathematically, smaller images tend to require higher-frequency sinusoids to fit, and the derivative of a high-frequency sinusoid can grow very large if it is not multiplied by a small constant to compensate (the derivative of sin(ωx) is ω·cos(ωx), so gradients scale with ω).
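Not my exact code, just a minimal PyTorch sketch of the recipe described above (AdamW with weight decay and amsgrad, OneCycleLR, and gradient-norm clipping); the function name fit_siren and the default hyper-parameters are placeholders for illustration, not anything from the SIREN repository.

import torch
from torch import nn

def fit_siren(model, coords, target, n_iters=1000):
    # model: a SIREN network, coords: input coordinates, target: pixel values to fit
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3,
                                  weight_decay=1e-6, amsgrad=True)
    scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=1e-3,
                                                    total_steps=n_iters)
    loss_fn = nn.MSELoss()
    for _ in range(n_iters):
        pred = model(coords)
        loss = loss_fn(pred, target)
        optimizer.zero_grad()
        loss.backward()
        # clip the gradient norm; high-frequency sine layers can produce very large gradients
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.5)
        optimizer.step()
        scheduler.step()
    return model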

@ivanstepanovftw

Try setting a smaller c in the initializer. It was originally proposed as 6; a lower value like 4 or 3.2 helps to stabilize the gradient flow, e.g.

import math

self.c = 4  # originally 6; lowering to 4 or 3.2 narrows the initialization range

if self.is_first:
    # first layer: weights uniform in [-1/fan_in, 1/fan_in]
    bound = 1 / fan_in
else:
    # hidden layers: weights uniform in [-sqrt(c/fan_in)/omega_0, sqrt(c/fan_in)/omega_0]
    bound = math.sqrt(self.c / fan_in) / self.omega_0
x.uniform_(-bound, bound)  # x is the layer's weight tensor
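For context, here is a self-contained sketch of a sine layer using this initialization. It follows the layout of the common SIREN reference implementations, but the class name SineLayer, the c argument, and omega_0 = 30 are assumptions for illustration, not this repository's exact code.

import math
import torch
from torch import nn

class SineLayer(nn.Module):
    def __init__(self, in_features, out_features, is_first=False, omega_0=30.0, c=4.0):
        super().__init__()
        self.is_first = is_first
        self.omega_0 = omega_0
        self.c = c  # 6 in the original scheme; 4 or 3.2 gives a narrower init range
        self.linear = nn.Linear(in_features, out_features)
        self.init_weights()

    def init_weights(self):
        fan_in = self.linear.in_features
        with torch.no_grad():
            if self.is_first:
                bound = 1 / fan_in
            else:
                bound = math.sqrt(self.c / fan_in) / self.omega_0
            self.linear.weight.uniform_(-bound, bound)

    def forward(self, x):
        # sin(omega_0 * (Wx + b))
        return torch.sin(self.omega_0 * self.linear(x))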
