Openscoring for Regression PMML model prediction #8

Closed
taiwotman opened this issue Jan 10, 2019 · 3 comments

taiwotman commented Jan 10, 2019

I am trying to run a linear regression model using my input and PMML files; unfortunately, it seems Openscoring only supports classification models.

from openscoring import Openscoring

def openscoring():
    os = Openscoring("http://localhost:8080/openscoring")
    # A dictionary of user-specified parameters
    kwargs = {"auth": ("admin", "adminadmin")}
    pmml_file = "./resources/LinearRegression.pmml"
    input_file = "./resources/test.csv"
    output_file = "./resources/result.csv"

    os.deployFile("regression", pmml_file, **kwargs)
    os.evaluateCsvFile("regression", input_file, output_file)

    os.undeploy("regression", **kwargs)

if __name__ == "__main__":
    openscoring()

My input file just contains class

And I get the following information on the SERVER terminal

org.openscoring.service.ModelResource evaluate
INFO: Returned EvaluationResponse{id=null, result={class=null}}

vruusmann (Member) commented

it seems Openscoring only supports classification models.

Openscoring is a thin REST wrapper around the JPMML-Evaluator library: https://github.com/jpmml/jpmml-evaluator

See the list of supported and unsupported models under the features section:
https://github.com/jpmml/jpmml-evaluator#features

The RegressionModel element, in both its regression and classification variants, is one of the simplest model types, and is definitely 100% supported.

And I get the following information on the SERVER terminal

See the log messages right before and after that INFO message. There should be a description of the associated EvaluationRequest object (were all the inputs correctly received?), and if some Java exception was thrown (in case of unsupported PMML markup, it should be either an org.jpmml.evaluator.UnsupportedElementException or o.j.e.UnsupportedAttributeException), its full stack trace.

My input file just contains class

I would recommend that you first run your PMML + CSV combo using the org.jpmml.evaluator.EvaluationExample command-line application as described here:
https://github.com/jpmml/jpmml-evaluator#example-applications
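
For instance, the check can be driven from a Python script by shelling out to that example application. The sketch below is based on the README linked above; the jar file name and the --model/--input/--output options are assumptions that may differ between JPMML-Evaluator versions, so adjust the paths to your local build.

import subprocess

# Sketch only: replace the jar path with your local JPMML-Evaluator example jar.
subprocess.run([
    "java", "-cp", "pmml-evaluator-example-executable.jar",
    "org.jpmml.evaluator.EvaluationExample",
    "--model", "./resources/LinearRegression.pmml",
    "--input", "./resources/test.csv",
    "--output", "./resources/cli_result.csv",
], check=True)

If the command-line application produces the expected predictions, the problem is on the Openscoring request side rather than in the PMML document itself.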

taiwotman (Author) commented Jan 10, 2019

Thanks for your reply. I will test it out using JPMML, albeit my simple project is in Python. In fact, I see that the server successfully reads the arguments, but the regression output is None.
Could you please provide an illustration of how the Python client can be used for a regression task, like the Iris data classification example? Also, what does the first parameter in os.evaluateCsvFile represent - is it the model or the target name? Yours was Iris.

vruusmann (Member) commented

Could you please provide an illustration of how the Python client can be used for a regression task, like the Iris data classification example?

The type of mining function - regression, classification, clustering, etc. - is a PMML-internal implementation detail, which is not reflected in the Openscoring server or client APIs. In other words, you should be able to deploy and score any PMML document using a standardized workflow/code, without bothering about what's actually inside it.
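
For example, the following minimal sketch scores a regression model with the Python client using the same deploy/evaluate/undeploy sequence as the Iris example; only the input field names change to match whatever your PMML document declares (the names "x1" and "x2" below are hypothetical).

from openscoring import Openscoring

os = Openscoring("http://localhost:8080/openscoring")
kwargs = {"auth": ("admin", "adminadmin")}

os.deployFile("regression", "./resources/LinearRegression.pmml", **kwargs)

# Single-record scoring; the call is identical for classification models,
# only the field names and the returned target value differ.
arguments = {"x1": 1.0, "x2": 2.0}
print(os.evaluate("regression", arguments))

os.undeploy("regression", **kwargs)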

Also, what does the first parameter in os.evaluateCsvFile represent - is it the model or the target name? Yours was Iris

It's the model's identifier.

You assign the identifier using the deployFile command. Whatever ID you choose, you need to keep using the same ID (case sensitive) throughout the remainder of your Python script. If you are pulling models from an external source where they already have identifiers, then your best bet is to keep (re)using the same identifiers with Openscoring. For example, you can replace "Iris" with "Iris_Species_20190111-v1".
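
As a sketch, using one stable identifier throughout (the ID string below is a made-up example; any identifier works as long as it is reused exactly):

from openscoring import Openscoring

os = Openscoring("http://localhost:8080/openscoring")
kwargs = {"auth": ("admin", "adminadmin")}

# Any stable, case-sensitive identifier works; reuse it for every call.
model_id = "LinearRegression_20190110-v1"

os.deployFile(model_id, "./resources/LinearRegression.pmml", **kwargs)
os.evaluateCsvFile(model_id, "./resources/test.csv", "./resources/result.csv")
os.undeploy(model_id, **kwargs)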
