[R] Consider making model structure compatible with Rdata #362
Comments
This might involve a somewhat more complicated wrapping of the model as an S4 structure, so that symbol and arg.params can be retrieved through property functions, as in Rcpp. That would allow nullptr checking and recovery. |
@topepo I guess this thread will be related to what you are trying to do |
Just thought about this for a little while. What about saving the model as a JSON string in Rdata? When loading, we parse the JSON string. This can be done by adding two helper functions. |
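The two-helper idea above could look roughly like the following. This is only a sketch under assumptions: the names model_to_plain and plain_to_model are hypothetical, and it assumes the fitted model is a list with symbol, arg.params and aux.params fields of class "MXFeedForwardModel". It is not necessarily how the package implements it.

```r
# Hypothetical helper: turn a fitted model into plain R objects
# (a JSON string plus ordinary arrays), which save() can store safely.
model_to_plain <- function(model) {
  list(
    symbol_json = model$symbol$as.json(),             # network graph as JSON
    arg.params  = lapply(model$arg.params, as.array), # NDArray -> R array
    aux.params  = lapply(model$aux.params, as.array)
  )
}

# Hypothetical helper: rebuild the model from the plain form in a new session.
# Assumes the mxnet package is installed; nothing here runs until it is called.
plain_to_model <- function(plain) {
  model <- list(
    symbol     = mxnet::mx.symbol.load.json(plain$symbol_json),
    arg.params = lapply(plain$arg.params, mxnet::mx.nd.array),
    aux.params = lapply(plain$aux.params, mxnet::mx.nd.array)
  )
  structure(model, class = "MXFeedForwardModel")
}
```

Usage would be to call model_to_plain(model) before save(), and plain_to_model() after load() in the new session, so that no external pointer ever ends up in the Rdata file.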
My viewpoint is mostly around prediction using the model object so I'll focus on how we could export the prediction function. I can think of two options:
Thanks, Max |
It is never hard to explicitly save something that is serializable in Rdata form. We have ways to save the graph to a JSON string, and ndarrays (parameters) to raw type in R. The problem is eagerly saving things to the object every time an object is generated, which could be costly for objects such as arrays (it means dumping things from GPU to CPU every time an operation is calculated).
|
Let me write |
@thirdwing - Is this implemented? Can we close this issue? |
I don't think so (at least as of Apr 14). What I was requesting was to be able to save the fitted model object (in its native class) so that it can be re-used in future sessions. A workflow like:
model <- mx.mlp(data = x, label = y)
save(model, file = "model.RData")
q("no")
## new R session
load("model.RData")
predict(model, newdata)
Having an export function doesn't really solve that problem. Maybe an intermediate step of model <- serialize(model) prior to saving would be a solution. |
@topepo I have added two helper functions. The network symbol and parameters will be saved using I restart the R session after saving, so the external pointer won't work. Can you give some advice on this?
require(mlbench)
require(mxnet)
data(Sonar, package = "mlbench")
Sonar[,61] <- as.numeric(Sonar[,61])-1
train.ind <- c(1:50, 100:150)
train.x <- data.matrix(Sonar[train.ind, 1:60])
train.y <- Sonar[train.ind, 61]
test.x <- data.matrix(Sonar[-train.ind, 1:60])
test.y <- Sonar[-train.ind, 61]
mx.set.seed(0)
model <- mx.mlp(train.x, train.y, hidden_node=10, out_node=2, out_activation="softmax",
num.round=20, array.batch.size=15, learning.rate=0.07, momentum=0.9,
eval.metric=mx.metric.accuracy)
mx.model.save.RData(model = model, filename = "test.RData")
#### restart the R session
require(mlbench)
require(mxnet)
data(Sonar, package = "mlbench")
Sonar[,61] <- as.numeric(Sonar[,61])-1
train.ind <- c(1:50, 100:150)
train.x <- data.matrix(Sonar[train.ind, 1:60])
train.y <- Sonar[train.ind, 61]
test.x <- data.matrix(Sonar[-train.ind, 1:60])
test.y <- Sonar[-train.ind, 61]
model <- mx.model.load.RData("test.RData")
preds <- predict(model, test.x)
pred.label <- max.col(t(preds)) - 1
table(pred.label, test.y) |
This will help a lot. Has this been merged upstream and are binaries ready? Also, people have really started to use RDS files or RData. Extending this functionality to RDS would be huge if you can. |
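On the RDS point: once a model is reduced to plain R objects there are no external pointers left, so saveRDS()/readRDS() should work the same way as save()/load(). A minimal sketch, where the list below is only a placeholder for a serialized model (the field names and the JSON string are assumptions, not a real network):

```r
# Plain-R stand-in for a serialized model: a JSON string for the network
# plus ordinary arrays for the parameters (no external pointers).
plain_model <- list(
  symbol_json = "{\"nodes\": []}",   # placeholder, not a real network
  arg.params  = list(fc1_weight = matrix(0.5, nrow = 2, ncol = 3))
)

path <- tempfile(fileext = ".rds")
saveRDS(plain_model, path)    # RDS stores one object, without its name
restored <- readRDS(path)     # the caller picks the name when reading
unlink(path)                  # clean up the temporary file
```

The practical difference from RData is that readRDS() returns the object instead of restoring it under its original name.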
@jaredlander This has been merged in #6494 . After |
OK, will look into this. Not sure that function is in version 0.10.1. |
…#6494) * [R] mx.serialize/mx.unserialize (close apache#362)
* More Windows compile fixes * Expand timer delay in order to reduce error due to slow CI machine (Mac) * fix signed/unsigned warnings
Currently most things are Rcpp based, which means they are not compatible with Rdata. This is due to a limitation of R's current serialization system, which does not support a customized loading function.
The only way to make things Rdata compatible is to eagerly dump state into raw form every time a new object is returned and, in all functions, check the externalptr; if it is null, load things from the raw state. This is how things are handled in xgboost.
This approach is dumb: it costs a lot of overhead and does not make sense for the low-level API. However, it might be possible to support it for the model structure at https://github.com/dmlc/mxnet/blob/master/R-package/R/model.R#L71. That way users could at least save the model in Rdata. Things still won't be perfect, but this might be helpful.
This is not an urgent thing, but it may be worth considering.
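The eager-dump-and-restore pattern described above can be illustrated with a toy in base R. This is a sketch only: a session-local registry stands in for the native state behind an externalptr, and all names (new_model, ensure_handle, get_weights) are made up for the illustration.

```r
# Toy stand-in for native state: a session-local registry. A real package
# would hold an externalptr, which comes back as a null pointer after load().
.live <- new.env()

new_model <- function(weights) {
  id <- paste0("h", length(ls(.live)) + 1)
  assign(id, weights, envir = .live)   # the "native" state
  # Eagerly dump the state to a raw vector so the object survives save()/load().
  list(handle = id, raw = serialize(weights, NULL))
}

# Every function that touches the model calls this first.
ensure_handle <- function(model) {
  if (!exists(model$handle, envir = .live, inherits = FALSE)) {
    # Handle is dead (new session): rebuild the state from the raw dump.
    assign(model$handle, unserialize(model$raw), envir = .live)
  }
  model
}

get_weights <- function(model) {
  model <- ensure_handle(model)
  get(model$handle, envir = .live, inherits = FALSE)
}

m <- new_model(c(0.1, 0.2, 0.3))
rm(list = ls(.live), envir = .live)   # simulate restarting R: handles are gone
w <- get_weights(m)                   # restored from m$raw on first use
```

The overhead mentioned above is visible in new_model(): the raw dump is produced on every construction, whether or not the object is ever saved.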