We are calculating the empirical NTK here. That means we need to sample both network initializations and input samples, and this randomness means no two NTK calculations give exactly the same result. However, as shown in our Figure 1, the general trend is that good architectures have smaller NTK condition numbers.
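To make the source of the variation concrete, here is a minimal NumPy sketch. It is not the code from this repository: the tiny two-layer ReLU network, the function name `ntk_condition_number`, and the sizes `n`, `d`, `h` are all assumptions chosen for illustration. It builds the empirical NTK as `J @ J.T`, where `J` is the Jacobian of the scalar output with respect to all parameters, and reports the ratio of the largest to the smallest eigenvalue.

```python
import numpy as np

def ntk_condition_number(seed, n=16, d=32, h=64):
    """Condition number of the empirical NTK of a toy net f(x) = w2 . relu(W1 x).

    Hypothetical helper for illustration only; sizes are arbitrary assumptions.
    """
    rng = np.random.default_rng(seed)
    # Both the input batch and the weights are sampled, as in an empirical NTK.
    x = rng.standard_normal((n, d))
    W1 = rng.standard_normal((h, d)) * np.sqrt(2.0 / d)  # Kaiming-style scaling
    w2 = rng.standard_normal(h) * np.sqrt(2.0 / h)

    pre = x @ W1.T                    # (n, h) pre-activations
    act = np.maximum(pre, 0.0)        # (n, h), equals df/dw2 for each sample
    mask = (pre > 0).astype(float)    # ReLU derivative
    # df/dW1 for each sample: outer product of (mask * w2) with x -> (n, h, d)
    grad_W1 = (mask * w2)[:, :, None] * x[:, None, :]

    # Per-sample Jacobian over all parameters, then the n x n kernel.
    J = np.concatenate([grad_W1.reshape(n, -1), act], axis=1)
    ntk = J @ J.T
    eig = np.linalg.eigvalsh(ntk)     # eigenvalues in ascending order
    return eig[-1] / eig[0]

# Different seeds give different condition numbers; the same seed repeats exactly.
conds = np.array([ntk_condition_number(s) for s in range(5)])
print("per-seed condition numbers:", conds)
print("mean over seeds:", conds.mean())
```

In this sketch, fixing the seed makes a single run repeatable, and averaging over several seeds is one way to get a steadier number to compare across architectures. Whether that matches how the numbers in Figure 1 were produced is not something this example claims.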
Hi there,
I am trying to reproduce the NTK and LRC functions that you have in your code. When I run the NTK for 3 (or 5) repeated runs with the same settings and input model, I get vastly different results, i.e.:
I would love to get a better sense of what the NTK actually does and how we can get consistent results.
Also, do we need to initialize with Kaiming? What is the point of this initialization, and is there an alternative (e.g. Xavier, zero, none)?