some clarifications about the output from a quantitative model prediction #10

pritwish · 2019-04-10T08:43:51Z

Hi! I am doing quantitative prediction and have a few questions:
a. Is there a place where I can get the coefficients and intercepts for models other than the top-ranked model?
b. Does it do scaling and standardization internally?
c. Does it treat the values as absolute values internally? I ran two datasets whose values were the same in magnitude but differed in sign (positive versus negative) in some instances, and the output models were the same. This could be a special case of my dataset, though.
d. I understand that the output model should be put in the form y' = mX + c, where X is the value evaluated from the descriptor, and that this gives the predicted output variable. Is there any way to change the linear function to a different function, say a second-order polynomial?
e. Are two different descriptors linked with each other in any way (both when they are part of a multi-dimensional descriptor and when they are not)? A really naive question, but it bugs me a lot. :p

rouyang2017 · 2019-04-10T14:32:03Z

a. No, SISSO.out reports them only for the top-ranked model. However, you can get them for other models this way:

1. Get the Feature_IDs for any model you want from the folder "models".
2. Find the feature formulas corresponding to those IDs in the file "Uspace.name" in the folder "feature_space". The ID is the line number in Uspace.name. E.g., ID:50 means the feature at line 50 of Uspace.name.
3. Find the feature data corresponding to those IDs in the file "Uspace_pxxx.dat" (here pxxx denotes property xxx) in the folder "feature_space". If you are doing single-task learning (one target property), the first column is the original data of your target property, the second column is feature ID:1, the third column is feature ID:2, and so on.
4. Copy these features into a new train.dat and set rung=0 (and adjust nsf and subs_sis accordingly) in SISSO.in. Then run SISSO to get a new SISSO.out.

So, in short, take the features of the model you want and run SISSO again, without further feature transformation, to get the coefficients.
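The column and line lookups described above can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the data file is assumed to be whitespace-separated, and the header and sample names written by `to_train_dat` are placeholders, so check the train.dat layout expected by your SISSO version before using the result.

```python
def lookup_formulas(name_lines, feature_ids):
    """Return the Uspace.name line for each feature ID (ID = 1-based line number)."""
    return [name_lines[fid - 1].strip() for fid in feature_ids]


def select_features(data_lines, feature_ids):
    """Pick the target and the requested feature columns from Uspace_pxxx.dat rows.

    Single-task layout as described in this thread: column 0 is the target
    property, so feature ID k sits at zero-based column index k.
    Assumes whitespace-separated numeric values (an assumption, not confirmed).
    """
    selected = []
    for line in data_lines:
        if not line.strip():
            continue  # skip blank lines
        values = [float(v) for v in line.split()]
        selected.append([values[0]] + [values[fid] for fid in feature_ids])
    return selected


def to_train_dat(selected, feature_ids, sample_prefix="s"):
    """Format rows as text for a new train.dat.

    The header and the sample names (s1, s2, ...) are hypothetical placeholders.
    """
    header = " ".join(["sample", "target"] + [f"feat_{fid}" for fid in feature_ids])
    rows = [
        " ".join([f"{sample_prefix}{i}"] + [str(v) for v in row])
        for i, row in enumerate(selected, start=1)
    ]
    return "\n".join([header] + rows) + "\n"
```

In practice `name_lines` and `data_lines` would come from reading feature_space/Uspace.name and feature_space/Uspace_pxxx.dat with `open(...).readlines()`, and `feature_ids` from the model file you picked in the "models" folder.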

rouyang2017 · 2019-04-10T14:58:33Z

b. No scaling is done internally during feature construction, so that the physical meaning of the primary features is preserved.
c. No. Perhaps your models are insensitive to that feature? I would inspect the models to see why changing the sign of that feature's values does not change the results.
d. Nothing in SISSO restricts the models to being polynomial. However, I expect polynomial models to appear if they are really important (strongly correlated with your data). Alternatively, you can use only power operators (^2, ^3, ...) for feature construction, right?
e. Good question. This also bugs me a lot :), and I think it is a general issue for any data-driven method, not just SISSO. We make some remarks in the long paragraph "General remarks on the descriptor-property relationship identified by SISSO" of the SISSO paper.
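To illustrate point d: a model that is linear in power-transformed features is a polynomial in the original variable. The sketch below is plain Python, not SISSO code; the function names and the coefficient values are made up for illustration.

```python
def power_features(x, powers=(1, 2)):
    """Build features x^p for each requested power, e.g. [x, x^2]."""
    return [x ** p for p in powers]


def predict(intercept, coefficients, features):
    """Evaluate a linear model y' = c + sum_i m_i * feature_i."""
    return intercept + sum(m * f for m, f in zip(coefficients, features))


# Hypothetical coefficients: y' = 1 + 2*x + 3*x^2, evaluated at x = 2.
y = predict(1.0, [2.0, 3.0], power_features(2.0))
```

The fit itself stays linear in the coefficients; only the features change, which is why restricting feature construction to power operators is one way to obtain a polynomial form.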

pritwish closed this as completed Jan 22, 2020

jungsdao mentioned this issue Dec 27, 2020

About cross validation procedure in SISSO #35

Closed
