[CARBONDATA-2876]Fix Avro decimal datatype with precision and scale #2687
Conversation
Retest this please
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8257/
a6e5149 to aece78c (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8260/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/189/
out = (decimalValue.round(
    new MathContext(((LogicalTypes.Decimal) avroField.getLogicalType()).getPrecision())))
    .setScale(((LogicalTypes.Decimal) avroField.getLogicalType()).getScale(),
        RoundingMode.HALF_UP);
Replace the BigDecimal conversion with the code below at all places in the code:
BigDecimal bigDecimal = new BigDecimal(new String(((ByteBuffer) fieldValue).array(), CarbonCommonConstants.DEFAULT_CHARSET_CLASS)).setScale(dimension.getColumnSchema().getScale(), RoundingMode.HALF_UP);
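For reference, a minimal self-contained sketch of the rounding being discussed (the helper name `fromAvroBytes` and the sample values are hypothetical, not the PR's actual code): Avro stores a decimal's unscaled value as two's-complement bytes, and the writer rebuilds the `BigDecimal` and rounds it to the schema's declared precision and scale with HALF_UP.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.math.RoundingMode;
import java.nio.ByteBuffer;

public class AvroDecimalSketch {
  // Hypothetical helper illustrating the conversion under review:
  // rebuild the BigDecimal from the unscaled two's-complement bytes,
  // then round to the declared precision and scale (HALF_UP).
  static BigDecimal fromAvroBytes(ByteBuffer buf, int precision, int scale) {
    BigDecimal raw = new BigDecimal(new BigInteger(buf.array()), scale);
    return raw.round(new MathContext(precision)).setScale(scale, RoundingMode.HALF_UP);
  }

  public static void main(String[] args) {
    // 12.80 with scale 2 is stored as the unscaled value 1280
    ByteBuffer buf = ByteBuffer.wrap(BigInteger.valueOf(1280).toByteArray());
    System.out.println(fromAvroBytes(buf, 4, 2)); // prints 12.80
  }
}
```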
aece78c to 576a9e3 (Compare)
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/227/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8297/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/6/
576a9e3 to ecec394 (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/119/
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8357/
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/287/
ecec394 to 142e9b4 (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/121/
142e9b4 to 21ab8f5 (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/122/
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8360/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/290/
21ab8f5 to 8869fe5 (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/138/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/306/
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8376/
8869fe5 to cd64696 (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/147/
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8386/
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/315/
cd64696 to 08fc944 (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/155/
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8394/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/323/
checkExistence(sql("select * from sdkOutputTable"), true, "12.80")
}

test("test logical type decimal with data having greater precision") {
Please add a test to verify boundary values such as negative and very large BigDecimal values (big precision needs a long to store the unscaled value).
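The boundary cases the comment asks for can be sketched like this (illustrative values only, not the PR's actual test data): a negative decimal, whose sign lives in the two's-complement unscaled bytes, and a precision whose unscaled value no longer fits in an int.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class DecimalBoundarySketch {
  public static void main(String[] args) {
    // Negative value: unscaled value -1280 with scale 2.
    BigDecimal negative = new BigDecimal(new BigInteger("-1280"), 2);
    // 13-digit unscaled value: exceeds int range (at most 10 digits),
    // so the unscaled value must be held in a long or BigInteger.
    BigDecimal large = new BigDecimal(new BigInteger("9999999999999"), 3);
    System.out.println(negative);          // -12.80
    System.out.println(large);             // 9999999999.999
    System.out.println(large.precision()); // 13
  }
}
```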
store/sdk/src/main/java/org/apache/carbondata/sdk/file/AvroCarbonWriter.java
| "name": "Union_data3",
| "fields": [
| {
| "name": "emp_id",
Please use related names for the fields; emp_id and union types are not related.
 * @return
 */
private boolean checkFieldValueType(Schema.Type type, Object fieldValue) {
private boolean checkFieldValueType(Schema.Type type, Object fieldValue, Schema unionField) {
Rename method to validateUnionFieldValue()
| "type": "record",
| "fields": [
| { "name": "street", "type": "string"},
| { "name": "city", "type": {"type" : "bytes",
A city field with a decimal type is not a correct example.
@@ -525,6 +559,11 @@ private static Field prepareFields(Schema.Field avroField) {
int precision = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getPrecision();
int scale = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getScale();
return new Field(fieldName, DataTypes.createDecimalType(precision, scale));
} else if (logicalType == null && childSchema.getObjectProp("logicalType")
- Why is this specific check required? If this case represents an invalid schema, Avro should validate it and throw an error. If Avro treats this case just like plain bytes, we can do the same; Carbon need not have this specific check.
- In the else case we should throw an exception, or leave it to the default case to throw the "bytes is not supported" exception.
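For context on the case being debated: when an Avro schema carries an invalid `logicalType` annotation (e.g. a non-positive precision), Avro's parser drops it silently, so `getLogicalType()` returns null while the raw string survives in `getObjectProp("logicalType")`. A minimal stand-in for that decision, using a plain Map in place of the Avro Schema (all names here are hypothetical, not the PR's code):

```java
import java.util.HashMap;
import java.util.Map;

public class LogicalTypeCheckSketch {
  // parsedLogicalType stands in for schema.getLogicalType();
  // rawProps stands in for the schema's raw JSON properties.
  static boolean declaredButInvalidDecimal(Object parsedLogicalType, Map<String, Object> rawProps) {
    // Non-null parsed type: Avro accepted the annotation, handle it normally.
    // Null parsed type but raw "logicalType" still present: the author declared
    // "decimal" but got it wrong; the reviewer argues this should fail loudly
    // (or fall through to the unsupported-bytes default) rather than get a
    // Carbon-specific silent path.
    return parsedLogicalType == null && "decimal".equals(rawProps.get("logicalType"));
  }

  public static void main(String[] args) {
    Map<String, Object> props = new HashMap<>();
    props.put("logicalType", "decimal"); // e.g. declared with precision <= 0
    System.out.println(declaredButInvalidDecimal(null, props));         // true
    System.out.println(declaredButInvalidDecimal(new Object(), props)); // false
  }
}
```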
@@ -621,6 +660,11 @@ private static StructField prepareSubFields(String fieldName, Schema childSchema
int precision = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getPrecision();
int scale = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getScale();
return new StructField(fieldName, DataTypes.createDecimalType(precision, scale));
} else if (logicalType == null && childSchema.getObjectProp("logicalType")
Consider the previous comment in this case also.
@@ -714,6 +758,11 @@ private static DataType getMappingDataTypeForCollectionRecord(Schema childSchema
int precision = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getPrecision();
int scale = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getScale();
return DataTypes.createDecimalType(precision, scale);
} else if (logicalType == null && childSchema.getObjectProp("logicalType")
Consider the previous comment for decimal here also.
08fc944 to 9634c4f (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/175/
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8414/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/343/
case BYTES:
  // DECIMAL type is defined in Avro as a BYTE type with the logicalType property
  // set to "decimal" and a specified precision and scale
  // As binary type is not supported yet,value will be null
Please remove this comment; it's misleading.
// set to "decimal" and a specified precision and scale
// As binary type is not supported yet,value will be null
if (logicalType instanceof LogicalTypes.Decimal) {
  BigDecimal dataValue = new BigDecimal(new BigInteger(((ByteBuffer) fieldValue).array()),
Please extract the logic of constructing the BigDecimal value and its validation into a separate method, and reuse it in both places.
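A sketch of the extraction this comment asks for (the helper name `buildDecimal` and the exceeding-precision check are assumptions for illustration, not the PR's final code):

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class DecimalHelperSketch {
  // Shared helper: build the BigDecimal once from the unscaled
  // two's-complement bytes and validate it against the declared
  // precision, instead of duplicating the construction at each call site.
  static BigDecimal buildDecimal(ByteBuffer fieldValue, int precision, int scale) {
    BigDecimal value = new BigDecimal(new BigInteger(fieldValue.array()), scale);
    if (value.precision() > precision) {
      throw new IllegalArgumentException(
          "value " + value + " exceeds declared precision " + precision);
    }
    return value;
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.wrap(BigInteger.valueOf(1280).toByteArray());
    System.out.println(buildDecimal(buf, 4, 2)); // 12.80
  }
}
```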
Seq(Row(Row(Row(null,null),Row("ab")))))
}

test("test union with multiple Enum type") {
Can you add a test case that reads using the data source file format? (Syntax: "using carbon" with a schema; refer to SparkCarbonDataSourceTest.)
This will help users see how to define a schema for Avro logical types and unions.
9634c4f to d38b4a7 (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/178/
LGTM
d38b4a7 to d673027 (Compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/179/
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8418/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/347/
LGTM
What is this PR for?
1. Add precision and scale to the field value for the Avro decimal logical type.
2. If the Avro schema is a union type with multiple records or multiple enums, add a check for the schema.
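The union check in point 2 can be illustrated roughly as follows (a stand-in operating on branch type names; the PR's actual implementation works on Avro Schema objects, and the method name here is hypothetical):

```java
import java.util.Arrays;
import java.util.List;

public class UnionCheckSketch {
  // A union with more than one record branch (or more than one enum branch)
  // is ambiguous for Carbon's schema mapping, so such schemas are rejected.
  static boolean hasAmbiguousUnion(List<String> branchTypes) {
    long records = branchTypes.stream().filter("record"::equals).count();
    long enums = branchTypes.stream().filter("enum"::equals).count();
    return records > 1 || enums > 1;
  }

  public static void main(String[] args) {
    System.out.println(hasAmbiguousUnion(Arrays.asList("record", "record", "int"))); // true
    System.out.println(hasAmbiguousUnion(Arrays.asList("record", "enum", "int")));   // false
  }
}
```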
Any interfaces changed?
Any backward compatibility impacted?
Document update required?
Testing done
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.