-
Notifications
You must be signed in to change notification settings - Fork 256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integer constants that don't fit into java.lang.Integer value range #107
Comments
If I read this PMML fragment correctly, then this <Constant dataType="string">7680517950697</Constant> The JPMML-Converter library currently sets the
In PMML there is only one integer data type. It was my arbitrary decision to map it to This decision can be changed any time, but based on the "principle of least astonishment", it should happen during a major version upgrade (eg. from 1.4.X to 1.5.X), not between minor version upgrades (eg. from 1.4.1 to 1.4.2).
Moved to a separate issue jpmml/jpmml-sparkml#41 |
Yes, the constant element would be String (from StringIndexerModel). Setting the data type in jpmml-sparkml for constants would also definitely make sense. My first thought was this would bloat the PMML file, but considering those constants are probably limited in numbers it wouldn't matter that much. This would probably solve my problem best. The decision to take Integer rather than Long is OK for me, I didn't find any Information about the length of Integer in the PMML-Spec, so it is perfectly valid and no problem for me. But it should be consistent. |
You can always post-process PMML class model objects using the Visitor API. For example, setting the Visitor visitor = new org.jpmml.model.visitors.AbstractSimpleVisitor(){
@Override
public VisitorAction visit(Constant constant){
if(constant.getDataType() == null){
if(isStringValue(constant.getValue())){
constant.setDataType(DataType.STRING);
}
}
}
};
PMML pmml = ...;
visitor.applyTo(pmml); |
Oh that's a good idea, I'll try this on Monday! Thanks! |
Thanks Villu! |
Hi Villu
I found a small inconsistency in TypeUtil which causes Problems with with large numeric String constants.
In the method getConstantDataType (Line 862) the String is checked against Long and Returns INTEGER aus Datatype.
Int the methods cast/toInteger the DataType INTEGER a range check for integer is done and if whithin range it will be transformed to an Integer.
In my opinion getConstantDataType should check against Integer (Integer.parseInt instead of Long.parseLong).
JPMML-Evaluator used: 1.4.0 (the official one, not my java6 backport)
As for the PMML: this has been generated with jpmml-sparkml 1.3.3
The PMML Fragment causing trouble looks like this:
<DerivedField name="handleInvalid(my_field)" optype="categorical" dataType="string"> <Apply function="if"> <Apply function="isIn"> <FieldRef field="my_field"/> <Constant>7680517950697</Constant>
it's generated by a StringIndexer with setHandleInvalid("keep").
btw: it would also be nice if setHandleInvalid("skip") would be supported by jpmml-sparkml
The text was updated successfully, but these errors were encountered: