time indices, concatenated blocks with model_output with stats_only=FALSE #37

Closed
nonlinearnature opened this issue May 6, 2019 · 1 comment

Comments

@nonlinearnature

If I remember correctly, there was briefly a "short_output" argument that truncated the model_output data.frame to just the pred set. Since it was removed, the code appears to pad with NaNs so that NROW(model_output) = NROW(block), but it does not put the NaN rows in the right place and does not give them time indices.

block <- data.frame(time=1:10,x=sin((1:10)/pi),y=cos((1:10)/pi))
out <- block_lnlp(block,tp=2,columns=c("x","y"),target_column = "x",stats_only = FALSE)

out$model_output[[1]]

   time         obs      pred     pred_var
1     3  0.81627311 0.9655879 0.0003981692
2     4  0.95605566 0.9135897 0.0072468536
3     5  0.99978466 0.9284073 0.0024075621
4     6  0.94306673 0.8425109 0.0241078565
5     7  0.79160024 0.7335355 0.0534826032
6     8  0.56060280 0.5976110 0.0790172007
7     9  0.27328240 0.3439944 0.1140423509
8    10 -0.04149429 0.3950724 0.0318934212
9   NaN         NaN       NaN          NaN
10  NaN         NaN       NaN          NaN

Additionally, giving a split library does not appear to stop block_lnlp from making predictions across the gap.

out <- block_lnlp(block,lib=rbind(c(1,5),c(6,10)),tp=2,columns=c("x","y"),target_column = "x",stats_only = FALSE)
out$model_output[[1]]
   time         obs      pred    pred_var
1     3  0.81627311 0.9593595 0.003831564
2     4  0.95605566 0.8972787 0.011777949
3     5  0.99978466 0.8827621 0.014781495
4     6  0.94306673 0.8959425 0.031103864
5     7  0.79160024 0.5932581 0.058161169
6     8  0.56060280 0.2666997 0.076164443
7     9  0.27328240 0.2824588 0.104255915
8    10 -0.04149429 0.3657842 0.024814822
9   NaN         NaN       NaN         NaN
10  NaN         NaN       NaN         NaN

If you do something similar with simplex(), however, you get the correct breaks in the predictions, corresponding to the breaks given in the library.

out_simplex <- simplex(block$x,lib=rbind(c(1,5),c(6,10)),E=1,tp=2,stats_only = FALSE)
out_simplex$model_output[[1]]
   time         obs        pred     pred_var
1     3  0.81627311  0.42321690 0.2476161338
2     4  0.95605566 -0.03897166 0.0007877017
3     5  0.99978466  0.27779014 0.0012748452
4     6  0.94306673         NaN          NaN
5     7  0.79160024         NaN          NaN
6     8  0.56060280  0.67176509 0.1307101216
7     9  0.27328240  0.99722448 0.0011178303
8    10 -0.04149429  0.95403246 0.0013772935
9   NaN         NaN         NaN          NaN
10  NaN         NaN         NaN          NaN

(The padding at the end of the time series is still wrong, though.)
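To make the expected behaviour concrete, here is a minimal base-R sketch of which target times should receive a prediction when the library is split. `valid_targets` is a hypothetical helper, not an rEDM function, and it assumes E=1 (no embedding lags), so a forecast is only valid when the origin row and the row tp steps ahead fall in the same library segment.

```r
# Hypothetical helper (not part of rEDM): list the target rows that can be
# predicted without crossing a library break. Assumes E = 1, i.e. no lags.
valid_targets <- function(lib, tp) {
  targets <- integer(0)
  for (k in seq_len(nrow(lib))) {
    first <- lib[k, 1]
    last <- lib[k, 2]
    # A segment only yields forecasts if it spans at least tp steps.
    if (last - first >= tp) {
      targets <- c(targets, seq(first + tp, last))
    }
  }
  targets
}

lib <- rbind(c(1, 5), c(6, 10))
ok <- valid_targets(lib, tp = 2)
# Rows of the block that should hold NaN predictions under this rule.
should_be_nan <- setdiff(1:10, ok)
```

For the library above this marks times 3, 4, 5, 8, 9 and 10 as predictable, which is the pattern in the simplex() output: times 6 and 7 are NaN there, whereas block_lnlp fills them in.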

Finally, as a double check: if you put a break in the time series itself, the NaNs for that break are correct.

block_broken <- data.frame(time=1:10,x=sin((1:10)/pi),y=cos((1:10)/pi))
block_broken[5,c('x','y')] <- NA
out_broken <- block_lnlp(block_broken,tp=2,columns=c("x","y"),target_column = "x",stats_only = FALSE)
out_broken$model_output[[1]]
   time         obs      pred    pred_var
1     3  0.81627311 0.9443627 0.003857110
2     4  0.95605566 0.8381385 0.006627352
3     5          NA 0.9284073 0.002407562
4     6  0.94306673 0.7074185 0.055627882
5     7  0.79160024       NaN         NaN
6     8  0.56060280 0.3496210 0.111945403
7     9  0.27328240 0.3071343 0.114579057
8    10 -0.04149429 0.3781828 0.030487316
9   NaN         NaN       NaN         NaN
10  NaN         NaN       NaN         NaN
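The same check can be written down for the NA case. The sketch below uses a hypothetical helper, `na_targets` (not an rEDM function), and assumes a column named `time` with unit steps: any row with an NA in the embedding columns cannot serve as a forecast origin, so the target tp steps later should be NaN.

```r
# Hypothetical helper (not part of rEDM): target times whose forecast origin
# contains an NA in any of the embedding columns.
na_targets <- function(block, columns, tp) {
  origin_has_na <- !complete.cases(block[, columns, drop = FALSE])
  targets <- block$time[origin_has_na] + tp
  # Drop targets that fall beyond the end of the observed series.
  targets[targets <= max(block$time)]
}

block_broken <- data.frame(time = 1:10, x = sin((1:10) / pi), y = cos((1:10) / pi))
block_broken[5, c("x", "y")] <- NA
expected_nan <- na_targets(block_broken, columns = c("x", "y"), tp = 2)
```

Here the only flagged target is time 7, which is the row where pred is NaN in the output above.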

@SoftwareLiteracy
Contributor

Thanks for illustrating these issues.

Since this was done on an old version (pre 1.7.5) and covers multiple issues, I'm going to close it and open a new issue focusing on the library break prediction output.

Please note that we recommend using the new API, as it is consistent across implementations (C++, Python, R) and interfaces directly with cppEDM. Nonetheless, we certainly want to address compatibility with the legacy (0.7.X) API.

I believe that the NaN alignment issue has been resolved. Here is the output from version 1.8:

> block <- data.frame( time=1:10, x=sin((1:10)/pi), y=cos((1:10)/pi) )
> out <- block_lnlp( block, tp=2, columns=c("x","y"), target_column = "x", stats_only = FALSE )
> out $ model_output
   Index Observations Predictions Pred_Variance Const_Predictions
1      1      0.31296         NaN           NaN               NaN
2      2      0.59448         NaN           NaN               NaN
3      3      0.81627     0.96559     0.0003982           0.31296
4      4      0.95606     0.91359     0.0072469           0.59448
5      5      0.99978     0.92841     0.0024076           0.81627
6      6      0.94307     0.84251     0.0241079           0.95606
7      7      0.79160     0.79119     0.0389745           0.99978
8      8      0.56060     0.59761     0.0790172           0.94307
9      9      0.27328     0.34399     0.1140424           0.79160
10    10     -0.04149     0.39507     0.0318934           0.56060
11    11          NaN     0.09529     0.0413000           0.27328
12    12          NaN     0.17593     0.0557830          -0.04149
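The alignment in the 1.8 output above can be sketched in base R. `expected_layout` is a hypothetical helper, not the rEDM implementation, and it assumes evenly spaced time indices and one forecast per input row: the first tp rows have no prediction, and tp extra rows are appended for forecasts that run past the last observation.

```r
# Hypothetical helper (not part of rEDM): lay out forecasts made tp steps
# ahead so each prediction sits on the time index it targets.
# pred_from_row[i] is the forecast made from row i, targeting row i + tp.
expected_layout <- function(time, obs, pred_from_row, tp) {
  n <- length(time)
  step <- time[2] - time[1]  # assumes evenly spaced time indices
  data.frame(
    time = c(time, time[n] + step * seq_len(tp)),
    obs  = c(obs, rep(NaN, tp)),            # no observations beyond the data
    pred = c(rep(NaN, tp), pred_from_row)   # no forecasts for the first tp rows
  )
}

x <- sin((1:10) / pi)
# Stand-in forecast values for illustration only, not real model output.
dummy_pred <- seq_len(10) / 10
lay <- expected_layout(1:10, x, dummy_pred, tp = 2)
```

With 10 input rows and tp=2 this gives 12 rows, with NaN predictions at times 1 and 2 and NaN observations at times 11 and 12, matching the shape of the table above.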
