Add .toPMML spark methods for MLlib into sparklyr #60
Comments
You should be able to do this 'by hand', with something like:
Note that Is this a common enough action that it would make sense to expose a function / API on the |
I'll keep playing with Instead of saving to xml, I just tried the simplest option, to print to the console. I created a kmeans cluster model:
Should conceptually print the PMML model to the console, but instead I get:
I'm sure many use cases will have people keep the model in spark and make new predictions there. In which case |
Gotcha -- it sounds like we should think of implementing something like I'm less sure about the Spark log problem -- perhaps that file is being created by Spark, which is running with elevated or alternate permissions, and hence isn't user accessible. @javierluraschi, does that sound correct? |
The log problem is a red herring (on Windows we can't currently access the In terms of saving and loading models, I believe that Spark 2.0 now enables On Thu, Jun 30, 2016 at 11:16 PM, Andrew Taylor notifications@github.com
|
@TaylorAndrew The call should not contain a |
Javier, shouldn't our model objects implement the spark_jobj S3 method so On Fri, Jul 1, 2016 at 7:02 AM, Javier Luraschi notifications@github.com
|
@jjallaire yes, that would be a good addition. Opened: #61 |
Hrm, to my surprise, even though this appears to be exported on the Scala side, it seems like we can't access it from the RBackendHandler. Perhaps because PMMLExportable is part of the so-called 'developer API'? We might have to dig a bit more to make this work, unfortunately. |
Ahhh, I know what's going on. We use the It looks like it's still possible to save these ML model objects; just not as PMML. :-/ |
Marking as feature request since as Kevin mentions, we are based on the newer BTW. Wouldn't it be possible to get the predictive model and manually map the output into the pmml package? |
Maybe, but I would strongly prefer filing a feature request on the Spark side to add this, rather than trying to implement it ourselves. |
K, marking as feature request here as well. |
This would be a good feature to add. Also the ability to save and load spark models, which is one of the main features in 2.0. |
@kevinushey @javierluraschi perhaps the easiest way for this would be to leverage this: |
@kevinushey Are there any updates on exporting sparklyr models to PMML? Is this already supported by the package? |
Unfortunately no. As far as I can see, these still haven't been ported from https://issues.apache.org/jira/browse/SPARK-11171 As far as I can see, the work here has unfortunately fallen off the radar somewhat. |
@mrjoseph84 just curious what's your use case with pmml? |
@kevinushey Thanks for the update and links - I will be monitoring these issues |
Closing this since this is unlikely to be supported in Spark and we support saving/loading pipelines and models now. |
I'm trying to export my models as PMML. In Scala one would use
myModel.toPMML("./myModelPMML.xml")
I'm not currently seeing a way to do this, nor a way to 'inject' raw Scala so that I could do it myself, similar to how I can use
ft_sql_transformer()
to submit raw SQL to the SparkSQL database instead of using dplyr language. Is either of these options possible currently, or are they on the horizon?
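For reference, here is a minimal sketch of what the Scala side of this could look like. It assumes the RDD-based `org.apache.spark.mllib` API, whose `KMeansModel` implements `PMMLExportable`, and a local Spark context. The object name, app name, and data values are made up for illustration; only the output path comes from the call quoted above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

// Hypothetical object name; a sketch, not a definitive implementation.
object KMeansPmmlSketch {
  def main(args: Array[String]): Unit = {
    // Local context for illustration only; master and app name are placeholders.
    val conf = new SparkConf().setMaster("local[*]").setAppName("kmeans-pmml-sketch")
    val sc = new SparkContext(conf)

    // Tiny made-up data set as an RDD of mllib vectors.
    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0),
      Vectors.dense(0.1, 0.2),
      Vectors.dense(9.0, 9.1),
      Vectors.dense(9.2, 8.9)
    ))

    // RDD-based k-means: k = 2 clusters, at most 20 iterations.
    val model = KMeans.train(points, 2, 20)

    // KMeansModel mixes in PMMLExportable, so both overloads are available.
    println(model.toPMML())           // PMML document returned as a String
    model.toPMML("./myModelPMML.xml") // PMML document written to a local file

    sc.stop()
  }
}
```

Note that this relies on the `mllib` model class; models from the DataFrame-based `org.apache.spark.ml` package are a different set of classes, so the same call is not guaranteed to exist there.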