Normalization of data #43

Closed
GiovaniValdrighi opened this issue May 1, 2020 · 1 comment

@GiovaniValdrighi

Hi, I have been working with the library, especially with NOTEARS. While running experiments, I observed that the structure learned from raw data can differ from the structure learned from normalized data (MinMaxScaler). Is there any reason for the method to be affected by scale? From reading the paper I couldn't figure out a reason. Or is there some mistake in my application?

@qbphilip
Contributor

qbphilip commented May 1, 2020

Hello Giovani,

Thanks for the question, it's quite an interesting point.

The original paper doesn't say much. However, you can interpret the Frobenius-norm loss as the negative log-likelihood of a multivariate normal noise model with equal variance for every variable. So in theory, normalisation to unit variance, as in StandardScaler, makes sense.
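For reference, this interpretation can be written out as follows. It is a sketch in assumed notation (n samples, d variables, an acyclic weight matrix W and one shared noise variance σ²), not a quotation from the paper:

```latex
X = XW + Z, \qquad Z_{ij} \sim \mathcal{N}(0, \sigma^2) \ \text{i.i.d.}

-\log p(X \mid W) = \frac{1}{2\sigma^2} \lVert X - XW \rVert_F^2 + \frac{nd}{2} \log\left(2\pi\sigma^2\right)
```

For a fixed σ², the second term is constant in W, so minimising the Frobenius-norm loss amounts to maximum likelihood under the equal-variance assumption.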

The problem is that the weights are penalised by absolute size through the Lasso regularisation and the DAG constraint. Normalising to unit variance converts the "unit" of the weights to standard deviations: if X -> Y, a 1 SD change in X causes a W SD change in Y, which in theory should mitigate this problem.
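To make the change of units concrete, here is a minimal sketch in plain Python. It is not CausalNex or NOTEARS code: the single edge X -> Y, the simulated data and all function names are assumptions for illustration. It fits an ordinary least-squares slope on raw, standardised and min-max-scaled data and checks that the slope is rescaled by the ratio of the column scales.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope b of the least-squares line y ~ a + b * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardize(v):
    """Zero mean, unit (population) standard deviation."""
    m, s = statistics.fmean(v), statistics.pstdev(v)
    return [(a - m) / s for a in v]


def minmax(v):
    """Rescale to the interval [0, 1]."""
    lo, hi = min(v), max(v)
    return [(a - lo) / (hi - lo) for a in v]


# Hypothetical data: X lives on a large scale, the raw edge weight is small.
rng = random.Random(0)
x = [rng.gauss(0.0, 10.0) for _ in range(2000)]
y = [0.05 * a + rng.gauss(0.0, 1.0) for a in x]

slope_raw = ols_slope(x, y)
slope_std = ols_slope(standardize(x), standardize(y))
slope_mm = ols_slope(minmax(x), minmax(y))

# Ratios of the column scales that each transformation divides by.
sd_ratio = statistics.pstdev(x) / statistics.pstdev(y)
range_ratio = (max(x) - min(x)) / (max(y) - min(y))

print("raw slope:         ", slope_raw)
print("standardised slope:", slope_std, "=", slope_raw * sd_ratio)
print("min-max slope:     ", slope_mm, "=", slope_raw * range_ratio)
```

The same edge therefore gets a different numerical weight under each scaling, so anything that acts on absolute weight size (an L1 penalty, or a cut-off on small weights) treats it differently. This only illustrates the mechanism on one edge; it does not show which scaling recovers the right graph.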

However, and similar to your observation, on generated data we see that this can reduce the accuracy of learning the "true" model, i.e. we learn a different structure. I don't really have a good explanation though. Removing the mean certainly helps, but with regard to the variance it's an open question.
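On the mean, one plausible reason centring helps: assuming the linear loss has no intercept term (as in the X = XW form), any non-zero mean has to be absorbed by the weights. A small sketch with a single edge and made-up numbers, again plain Python rather than NOTEARS itself:

```python
import random
import statistics


def slope_through_origin(x, y):
    """Least-squares slope of y ~ b * x, i.e. a fit without an intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def center(v):
    """Subtract the sample mean."""
    m = statistics.fmean(v)
    return [a - m for a in v]


# Hypothetical data: true slope 0.5, but both variables have non-zero means.
rng = random.Random(1)
x = [rng.gauss(5.0, 1.0) for _ in range(2000)]
y = [2.0 + 0.5 * a + rng.gauss(0.0, 0.1) for a in x]

slope_uncentered = slope_through_origin(x, y)
slope_centered = slope_through_origin(center(x), center(y))

print("no intercept, raw data:     ", slope_uncentered)
print("no intercept, centred data: ", slope_centered)
```

On the raw data the fitted slope is pulled well away from 0.5 because it also has to account for the offset; after centring it is close to 0.5. No comparably simple argument is offered here for the variance.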

What do you think?

Philip

@qbphilip qbphilip added the question Further information is requested label May 1, 2020
@qbphilip qbphilip self-assigned this May 1, 2020