
[CARBONDATA-2876] Fix Avro decimal datatype with precision and scale #2687

Closed
wants to merge 1 commit

Conversation

Indhumathi27
Contributor

@Indhumathi27 Indhumathi27 commented Sep 3, 2018

What is this PR for?

1. Add precision and scale to the field value for the Avro decimal logical type.
2. If the Avro schema is a union type with multiple records or multiple enums, add a check for the schema.

  • Any interfaces changed?

  • Any backward compatibility impacted?

  • Document update required?

  • Testing done

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
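As background, here is a minimal stdlib-only sketch of the conversion this PR deals with. The class and method names are illustrative, not CarbonData's actual code; only the Avro wire format (two's-complement unscaled bytes, with precision and scale declared in the schema) is taken from the Avro specification:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.math.RoundingMode;
import java.nio.ByteBuffer;

public class AvroDecimalSketch {

    // Avro serializes a decimal logical type as the two's-complement,
    // big-endian bytes of the unscaled value; precision and scale come
    // from the schema, not from the bytes themselves.
    static BigDecimal toDecimal(ByteBuffer fieldValue, int precision, int scale) {
        BigDecimal value = new BigDecimal(new BigInteger(fieldValue.array()), scale);
        // Round to the declared precision, then force the declared scale.
        return value.round(new MathContext(precision))
            .setScale(scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Unscaled value 1280 with scale 2 represents 12.80.
        ByteBuffer bytes = ByteBuffer.wrap(BigInteger.valueOf(1280).toByteArray());
        System.out.println(toDecimal(bytes, 4, 2)); // prints 12.80
    }
}
```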

@Indhumathi27
Contributor Author

Retest this please

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8257/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8260/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/189/

out = (decimalValue.round(
    new MathContext(((LogicalTypes.Decimal) avroField.getLogicalType()).getPrecision())))
    .setScale(((LogicalTypes.Decimal) avroField.getLogicalType()).getScale(),
        RoundingMode.HALF_UP);
Contributor

Replace the BigDecimal conversion with the code below at all places in the code:
BigDecimal bigDecimal = new BigDecimal(new String(((ByteBuffer) fieldValue).array(), CarbonCommonConstants.DEFAULT_CHARSET_CLASS)).setScale(dimension.getColumnSchema().getScale(), RoundingMode.HALF_UP);
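For context, a self-contained sketch of the string-based conversion this comment suggests. `StringDecimalSketch` is a hypothetical name, and UTF-8 stands in for CarbonData's `CarbonCommonConstants.DEFAULT_CHARSET_CLASS`:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class StringDecimalSketch {

    // Reads the field bytes as the textual form of the number and then
    // applies the column's scale with HALF_UP rounding.
    static BigDecimal fromTextBytes(ByteBuffer fieldValue, int scale) {
        String text = new String(fieldValue.array(), StandardCharsets.UTF_8);
        return new BigDecimal(text).setScale(scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        ByteBuffer bytes = ByteBuffer.wrap("12.805".getBytes(StandardCharsets.UTF_8));
        System.out.println(fromTextBytes(bytes, 2)); // prints 12.81
    }
}
```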

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/227/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8297/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/6/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/119/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8357/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/287/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/121/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/122/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8360/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/290/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/138/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/306/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8376/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/147/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8386/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/315/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/155/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8394/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/323/

checkExistence(sql("select * from sdkOutputTable"), true, "12.80")
}

test("test logical type decimal with data having greater precision") {
Contributor

Please add a test to verify boundary values, such as negative and large BigDecimal values (a big precision needs a long to store the unscaled value).
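A stdlib-only illustration (not part of the project's test suite; the values are arbitrary examples) of why a large precision pushes the unscaled value beyond int range:

```java
import java.math.BigDecimal;

public class DecimalBoundarySketch {
    public static void main(String[] args) {
        // An int can only hold unscaled values up to 2147483647 (about
        // 9-10 digits), so higher precision needs a long (or BigInteger).
        BigDecimal large = new BigDecimal("12345678901234.56");  // precision 16
        BigDecimal negative = new BigDecimal("-9999999999.99");  // precision 12
        System.out.println(large.unscaledValue());               // 1234567890123456
        System.out.println(large.unscaledValue().bitLength() > 31); // true: too big for int
        System.out.println(negative.precision());                // 12
    }
}
```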

| "name": "Union_data3",
| "fields": [
| {
| "name": "emp_id",
Contributor

Please use related names for the field; emp_id and union types are not related.

* @return
*/
private boolean checkFieldValueType(Schema.Type type, Object fieldValue) {
private boolean checkFieldValueType(Schema.Type type, Object fieldValue, Schema unionField) {
Contributor

Rename method to validateUnionFieldValue()

| "type": "record",
| "fields": [
| { "name": "street", "type": "string"},
| { "name": "city", "type": {"type" : "bytes",
Contributor

city as a decimal is not a correct example.

@@ -525,6 +559,11 @@ private static Field prepareFields(Schema.Field avroField) {
int precision = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getPrecision();
int scale = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getScale();
return new Field(fieldName, DataTypes.createDecimalType(precision, scale));
} else if (logicalType == null && childSchema.getObjectProp("logicalType")
Contributor

  1. Why is this specific check required?
    If this case is for an invalid schema, Avro should validate it and throw an error. If Avro considers this case just like bytes, we can also do the same; Carbon need not have this specific check.

  2. In the else case we should throw an exception, or leave it to the default case to throw a "bytes is not supported" exception.

@@ -621,6 +660,11 @@ private static StructField prepareSubFields(String fieldName, Schema childSchema
int precision = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getPrecision();
int scale = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getScale();
return new StructField(fieldName, DataTypes.createDecimalType(precision, scale));
} else if (logicalType == null && childSchema.getObjectProp("logicalType")
Contributor

Consider the previous comment in this case also.

@@ -714,6 +758,11 @@ private static DataType getMappingDataTypeForCollectionRecord(Schema childSchema
int precision = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getPrecision();
int scale = ((LogicalTypes.Decimal) childSchema.getLogicalType()).getScale();
return DataTypes.createDecimalType(precision, scale);
} else if (logicalType == null && childSchema.getObjectProp("logicalType")
Contributor

Consider the previous comment for decimal here also.

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/175/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8414/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/343/

case BYTES:
// DECIMAL type is defined in Avro as a BYTE type with the logicalType property
// set to "decimal" and a specified precision and scale
// As binary type is not supported yet,value will be null
Contributor

Please remove the comment. It's misleading.

// set to "decimal" and a specified precision and scale
// As binary type is not supported yet,value will be null
if (logicalType instanceof LogicalTypes.Decimal) {
BigDecimal dataValue = new BigDecimal(new BigInteger(((ByteBuffer) fieldValue).array()),
Contributor

Please extract the logic of constructing the BigDecimal value and its validation into a separate method, and reuse it in both places.
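One possible shape for such an extracted helper, as a stdlib-only sketch; the class and method names are hypothetical, and the real method would take its precision and scale from the Avro schema's decimal logical type:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class DecimalHelperSketch {

    // Builds a BigDecimal from Avro's two's-complement unscaled bytes and
    // rejects values that do not fit the declared precision.
    static BigDecimal buildAndValidate(ByteBuffer fieldValue, int precision, int scale) {
        BigDecimal value = new BigDecimal(new BigInteger(fieldValue.array()), scale);
        if (value.precision() > precision) {
            throw new IllegalArgumentException(
                "Value " + value + " exceeds declared precision " + precision);
        }
        return value;
    }

    public static void main(String[] args) {
        ByteBuffer bytes = ByteBuffer.wrap(BigInteger.valueOf(-1280).toByteArray());
        System.out.println(buildAndValidate(bytes, 4, 2)); // prints -12.80
    }
}
```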

Seq(Row(Row(Row(null,null),Row("ab")))))
}

test("test union with multiple Enum type") {
Contributor

Can you add a test case that reads using the data source file format (syntax "using carbon" with a schema; refer to SparkCarbonDataSourceTest)?
This will help users understand how to define a schema for Avro logical types and unions.

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/178/

@KanakaKumar
Contributor

LGTM

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/179/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8418/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/347/

@manishgupta88
Contributor

LGTM

@asfgit asfgit closed this in 9ebab57 Sep 10, 2018