Hi, I have been working with the library, especially with NOTEARS. While running experiments I observed that the structure learned from raw data can differ from the structure learned from normalized data (MinMaxScaler). Is there any reason for the method to be affected by scale? Reading the paper, I couldn't figure out a reason. Or is there some mistake in my application?
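Roughly, my experiment looks like the minimal sketch below. The toy data and the causalnex-style `from_pandas` / `w_threshold` call are my assumptions here; substitute whatever NOTEARS entry point you are using.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Assumed entry point: the causalnex NOTEARS wrapper.
from causalnex.structure.notears import from_pandas

rng = np.random.default_rng(0)

# Toy linear SEM (x -> y -> z), with y deliberately put on a large
# scale to mimic variables measured in different units.
n = 500
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = 0.5 * y + rng.normal(size=n)
df = pd.DataFrame({"x": x, "y": 100 * y, "z": z})

# Structure learned on the raw data (w_threshold prunes small weights).
sm_raw = from_pandas(df, w_threshold=0.3)

# Structure learned on MinMax-normalised data.
df_scaled = pd.DataFrame(MinMaxScaler().fit_transform(df), columns=df.columns)
sm_scaled = from_pandas(df_scaled, w_threshold=0.3)

print("raw edges:   ", sorted(sm_raw.edges))
print("scaled edges:", sorted(sm_scaled.edges))
```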
Thanks for the question, it's quite an interesting point.
The original paper doesn't say much about this. However, you can interpret the Frobenius-norm loss as the negative log-likelihood of a multivariate normal distribution with equal variance across variables, so in theory, normalising to unit variance as in StandardScaler makes sense.
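To spell out that interpretation (this is just the standard Gaussian likelihood, nothing specific to the implementation): for a linear SEM $X = XW + E$ with noise variance $\sigma^2$ shared by all variables,

$$
-\log p(\mathbf{X} \mid W) \;=\; \frac{1}{2\sigma^2}\,\lVert \mathbf{X} - \mathbf{X}W \rVert_F^2 + \mathrm{const},
$$

so minimising the Frobenius-norm loss implicitly assumes equal residual variances, which is only plausible once the columns are on comparable scales.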
The problem is that the weights are penalised by their absolute size, through both the lasso regularisation and the DAG constraint, so the learned structure depends on the scale of the data. Normalising to unit variance converts the "unit" of the weights to standard-deviation differences: if X -> Y, a 1 SD change in X causes a W SD change in Y. In theory this should mitigate the problem.
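As a one-line check of that unit conversion (writing $\sigma_X$, $\sigma_Y$ for the marginal standard deviations): for a single edge $Y = wX + \varepsilon$, standardising both variables, $\tilde{X} = X/\sigma_X$ and $\tilde{Y} = Y/\sigma_Y$, gives

$$
\tilde{Y} \;=\; \Big(w\,\frac{\sigma_X}{\sigma_Y}\Big)\,\tilde{X} \;+\; \frac{\varepsilon}{\sigma_Y},
$$

so the coefficient the lasso actually penalises, $\tilde{w} = w\,\sigma_X/\sigma_Y$, is dimensionless and no longer depends on the original measurement units.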
However, and consistent with your observation, on generated data we see that this can reduce the accuracy of recovering the "true" model, i.e. we learn a different structure. I don't have a good explanation for that yet. Removing the mean certainly helps, but with regard to the variance it remains an open question.