Hi, I have been working with the library, especially with NOTEARS. While running experiments I observed that the structure learned from raw data can differ from the structure learned from normalized data (MinMaxScaler). Is there any reason for the method to be affected by scale? Reading the paper, I couldn't figure out a reason. Or is there some mistake in my application?
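Roughly, my experiment looks like the minimal sketch below. The toy data and the causalnex-style `from_pandas` / `w_threshold` call are my assumptions here; substitute whatever NOTEARS entry point you are using.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Assumed entry point: the causalnex NOTEARS wrapper.
from causalnex.structure.notears import from_pandas

rng = np.random.default_rng(0)

# Toy linear SEM (x -> y -> z), with y deliberately put on a large
# scale to mimic variables measured in different units.
n = 500
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = 0.5 * y + rng.normal(size=n)
df = pd.DataFrame({"x": x, "y": 100 * y, "z": z})

# Structure learned on the raw data (w_threshold prunes small weights).
sm_raw = from_pandas(df, w_threshold=0.3)

# Structure learned on MinMax-normalised data.
df_scaled = pd.DataFrame(MinMaxScaler().fit_transform(df), columns=df.columns)
sm_scaled = from_pandas(df_scaled, w_threshold=0.3)

print("raw edges:   ", sorted(sm_raw.edges))
print("scaled edges:", sorted(sm_scaled.edges))
```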
Thanks for the question, it's quite an interesting point.
The original paper doesn't say much about this. However, you can interpret the Frobenius-norm loss as the negative log-likelihood of a multivariate normal distribution with equal variance across variables, so in theory, normalising to unit variance as in StandardScaler makes sense.
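To spell out that interpretation (this is just the standard Gaussian likelihood, nothing specific to the implementation): for a linear SEM $X = XW + E$ with noise variance $\sigma^2$ shared by all variables,

$$
-\log p(\mathbf{X} \mid W) \;=\; \frac{1}{2\sigma^2}\,\lVert \mathbf{X} - \mathbf{X}W \rVert_F^2 + \mathrm{const},
$$

so minimising the Frobenius-norm loss implicitly assumes equal residual variances, which is only plausible once the columns are on comparable scales.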
The problem is that the weights are penalised by their absolute size, through both the lasso regularisation and the DAG constraint, so the learned structure depends on the scale of the data. Normalising to unit variance converts the "unit" of the weights to standard-deviation differences: if X -> Y, a 1 SD change in X causes a W SD change in Y. In theory this should mitigate the problem.
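As a one-line check of that unit conversion (writing $\sigma_X$, $\sigma_Y$ for the marginal standard deviations): for a single edge $Y = wX + \varepsilon$, standardising both variables, $\tilde{X} = X/\sigma_X$ and $\tilde{Y} = Y/\sigma_Y$, gives

$$
\tilde{Y} \;=\; \Big(w\,\frac{\sigma_X}{\sigma_Y}\Big)\,\tilde{X} \;+\; \frac{\varepsilon}{\sigma_Y},
$$

so the coefficient the lasso actually penalises, $\tilde{w} = w\,\sigma_X/\sigma_Y$, is dimensionless and no longer depends on the original measurement units.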
However, and consistent with your observation, on generated data we see that this can reduce the accuracy of recovering the "true" model, i.e. we learn a different structure. I don't have a good explanation for that yet. Removing the mean certainly helps, but with regard to the variance it remains an open question.