[CARBONDATA-2876] AVRO datatype support through SDK #2671
Conversation
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6462/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8149/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/78/
Force-pushed eae8752 to 00a23e1
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6466/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8153/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/82/
if (logicalType instanceof LogicalTypes.Decimal) {
  out = new BigDecimal(new String(((ByteBuffer) fieldValue).array(),
      CarbonCommonConstants.DEFAULT_CHARSET_CLASS));
  ;
Remove this semicolon
      CarbonCommonConstants.DEFAULT_CHARSET_CLASS));
  ;
} else {
  out = fieldValue;
The else-case handling is not required, since the Binary data type is not supported. Write a proper comment stating that only the decimal logical type is supported for the Avro Byte data type. Once the binary data type is supported, we can add the else block.
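A minimal, self-contained sketch of the string-decoding path shown in this diff, without the else block, as the review suggests. `StandardCharsets.UTF_8` stands in for `CarbonCommonConstants.DEFAULT_CHARSET_CLASS`, and the class and method names are hypothetical:

```java
import java.math.BigDecimal;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DecimalBytesDemo {
  // Only the decimal logical type is handled for the Avro bytes type;
  // binary is not supported yet, so there is no else branch.
  static Object toDecimal(Object fieldValue) {
    // The diff decodes the byte array as a string and parses it as BigDecimal.
    return new BigDecimal(
        new String(((ByteBuffer) fieldValue).array(), StandardCharsets.UTF_8));
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.wrap("12.34".getBytes(StandardCharsets.UTF_8));
    System.out.println(toDecimal(buf));
  }
}
```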
  }
}
Object[] values;
values = new Object[countIfNotNull];
- Rename countIfNotNull to notNullUnionFieldsCount
- Merge the above two lines into 'Object[] values = new Object[countIfNotNull];'
- Check the union behavior when the union contains only the NULL type, and handle it if any special handling is required
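A small sketch of the first two suggestions, using branch type names as stand-ins for Avro `Schema` objects (the method and class names are hypothetical). A union of only NULL naturally yields a zero-length array, which is the special case to check:

```java
import java.util.Arrays;
import java.util.List;

public class UnionCountDemo {
  // Stand-in: union branches are represented by type-name strings.
  static Object[] allocateForUnion(List<String> unionBranchTypes) {
    // Merged declaration with the suggested rename
    int notNullUnionFieldsCount =
        (int) unionBranchTypes.stream().filter(t -> !t.equals("NULL")).count();
    // A NULL-only union produces a zero-length array
    return new Object[notNullUnionFieldsCount];
  }

  public static void main(String[] args) {
    System.out.println(allocateForUnion(Arrays.asList("NULL", "INT", "STRING")).length);
    System.out.println(allocateForUnion(Arrays.asList("NULL")).length);
  }
}
```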
int j = 0;
for (Schema unionField : fieldsUnion) {
  if (!unionField.getType().equals(Schema.Type.NULL)) {
    values[j] = avroFieldToObjectForUnionType(unionField, fieldValue, avroField);
- Try to reuse the code as much as possible. You can extract the primitive-type computation into a separate method and override only the complex types as different methods for union and the rest of the data types
- Write a function for the union type to identify the schema for which the computation needs to be done. Do not call the computation for all the union types, as only one type of value exists at a time
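A sketch of the second suggestion: resolve the single matching union branch first, then run the conversion once, instead of converting for every branch. The enum and method names are hypothetical stand-ins for Avro's `Schema.Type` and value matching:

```java
public class UnionDispatchDemo {
  // Stand-in for Avro Schema.Type values.
  enum Kind { INT, STRING, BOOLEAN }

  // Hypothetical type check, analogous to a checkFieldValueType helper.
  static boolean matches(Kind kind, Object value) {
    switch (kind) {
      case INT: return value instanceof Integer;
      case STRING: return value instanceof CharSequence;
      case BOOLEAN: return value instanceof Boolean;
      default: return false;
    }
  }

  // Identify the one branch the runtime value belongs to; only one branch
  // of a union can hold the value at a time.
  static Kind resolveUnionBranch(Kind[] union, Object value) {
    for (Kind k : union) {
      if (matches(k, value)) {
        return k;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Kind[] union = { Kind.INT, Kind.STRING };
    System.out.println(resolveUnionBranch(union, "abc"));
  }
}
```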
@Indhumathi27, please modify the code as per the comments; then we can continue with further review of the code
Force-pushed 00a23e1 to 9ed6bf7
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8182/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/111/
    values[j] = avroFieldToObjectForUnionType(unionField, fieldValue, avroField);
  } else {
    values[j] = null;
  }
- Remove the else block
- Combine the above two if conditions into one using the && operator
- Break the loop once the if check succeeds
      CarbonCommonConstants.DEFAULT_CHARSET_CLASS));
} else {
  // As binary type is not supported yet, fill value as null
  out = null;
Remove this else block
// Get union types and store as Struct<type>
ArrayList<StructField> unionFields = new ArrayList<>();
for (Schema avroSubField : avroField.schema().getTypes()) {
  StructField unionField = prepareSubFields(avroField.name() + i++, avroSubField);
Check for a NULL schema here
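A sketch of the requested NULL-schema check when preparing sub-fields for the union branches, with branch type-name strings standing in for Avro `Schema` objects (class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

public class NullBranchFilterDemo {
  // Prepare struct sub-field names for the non-NULL union branches only.
  static List<String> prepareUnionFields(String fieldName, List<String> branches) {
    List<String> unionFields = new ArrayList<>();
    int i = 0;
    for (String branch : branches) {
      if (!"NULL".equals(branch)) { // the requested NULL-schema check
        unionFields.add(fieldName + i++);
      }
    }
    return unionFields;
  }

  public static void main(String[] args) {
    System.out.println(prepareUnionFields("f", List.of("NULL", "INT", "STRING")));
  }
}
```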
    unionFields.add(unionField);
  }
}
if (unionFields.size() != 0) {
Replace 'unionFields.size() != 0' with '!unionFields.isEmpty()'
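The two checks are equivalent; `isEmpty()` just states the intent directly. A trivial sketch (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

public class IsEmptyDemo {
  // Reads more clearly than unionFields.size() != 0
  static boolean hasUnionFields(List<?> unionFields) {
    return !unionFields.isEmpty();
  }

  public static void main(String[] args) {
    System.out.println(hasUnionFields(new ArrayList<String>()));
  }
}
```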
Force-pushed 9ed6bf7 to 66ce1a4
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8188/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/117/
  if (notNullUnionFieldsCount != 1) {
    j++;
  }
}
Modify the code as below:
int j = 0;
for (Schema unionField : unionFields) {
  if (unionField.getType().equals(Schema.Type.NULL)) {
    continue;
  }
  if (checkFieldValueType(unionField.getType(), fieldValue)) {
    values[j] = avroFieldToObjectForUnionType(unionField, fieldValue, avroField);
    break;
  }
  j++;
}
Force-pushed 66ce1a4 to 371c699
LGTM... can be merged once the build passes
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/8197/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/126/
This PR adds support for writing the following Avro data types to the Carbon format through the SDK. Please refer to JIRA CARBONDATA-2876 for further details.
Any interfaces changed?
Any backward compatibility impacted?
Document update required?
Testing done
Test file has been added
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.