[CARBONDATA-3017] Map DDL Support #2980
Conversation
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1657/
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9917/
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1869/
(String[]) configuration.getDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS);
Queue<String> complexDelimiters = new LinkedList<>();
for (int i = 0; i < 4; i++) {
In “i < 4”, the 4 is not clear; it is better to replace it with a named constant.

Done.
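As a sketch of that suggestion (the class and constant names are illustrative, not CarbonData's actual identifiers), the magic number becomes a named constant:

```java
import java.util.LinkedList;
import java.util.Queue;

public class DelimiterQueueSketch {
  // Hypothetical constant naming the number of complex-delimiter levels;
  // CarbonData would define this alongside its other load constants.
  public static final int NUMBER_OF_COMPLEX_DELIMITERS = 4;

  public static Queue<String> buildQueue(String[] tempComplexDelimiters) {
    Queue<String> complexDelimiters = new LinkedList<>();
    for (int i = 0; i < NUMBER_OF_COMPLEX_DELIMITERS; i++) {
      complexDelimiters.add(tempComplexDelimiters[i]);
    }
    return complexDelimiters;
  }
}
```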
@@ -338,12 +338,15 @@ public static CarbonLoadModel getLoadModel(Configuration conf) throws IOExceptio
SKIP_EMPTY_LINE,
carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SKIP_EMPTY_LINE)));

String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":");
String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":" + "," + "\\\\003");
Move all the default delimiters to constants and use them everywhere.

done
@@ -0,0 +1,2 @@
1\002Nalla\0012\002Singh\0011\002Gupta\0014\002Kumar
10\002Nallaa\00120\002Sissngh\001100\002Gusspta\00140\002Kumar
Move the above newly added files to spark-common-test

done
String[] split = complexDelim.split(",");
model.setComplexDelimiterLevel1(split[0]);
if (split.length > 1) {
  model.setComplexDelimiterLevel2(split[1]);
}
if (split.length > 2) {
  model.setComplexDelimiterLevel3(split[2]);
}
Modify the above code as below:

if (split.length > 2) {
  model.setComplexDelimiterLevel2(split[1]);
  model.setComplexDelimiterLevel3(split[2]);
} else if (split.length > 1) {
  model.setComplexDelimiterLevel2(split[1]);
}

done
if (dimensionRawColumnChunk.getDataChunkV3
    .isSetLocal_dictionary) {
  isLocalDictionaryGenerated = true
}
You can directly use a Scala filter operation to check for the local dictionary.

done
@@ -188,11 +188,13 @@ case class CarbonLoadDataCommand(
val carbonLoadModel = new CarbonLoadModel()
val tableProperties = table.getTableInfo.getFactTable.getTableProperties
val optionsFinal = LoadOption.fillOptionWithDefaultValue(options.asJava)
// These two delimiters are non configurable and hardcoded for map type
optionsFinal.put("complex_delimiter_level_3", "\003")
optionsFinal.put("complex_delimiter_level_4", "\004")
Remove hardcoded values. Create one ComplexDelimiterEnum and use the values from there.

done
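A minimal sketch of what such an enum could look like. The name ComplexDelimiterEnum comes from the comment above; the constant names are assumptions, and the values mirror the defaults visible in this PR ("$", ":", "\\\\003", "\\\\004"):

```java
// Hypothetical enum centralizing the default complex-type delimiters,
// so no caller needs to hardcode them.
public enum ComplexDelimiterEnum {
  COMPLEX_DELIMITER_LEVEL_1("$"),
  COMPLEX_DELIMITER_LEVEL_2(":"),
  COMPLEX_DELIMITER_LEVEL_3("\\\\003"),
  COMPLEX_DELIMITER_LEVEL_4("\\\\004");

  private final String value;

  ComplexDelimiterEnum(String value) {
    this.value = value;
  }

  public String value() {
    return value;
  }
}
```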
new String[] { loadModel.getComplexDelimiterLevel1(),
    loadModel.getComplexDelimiterLevel2() });
new String[] { loadModel.getComplexDelimiterLevel1(), loadModel.getComplexDelimiterLevel2(),
    loadModel.getComplexDelimiterLevel3(), loadModel.getComplexDelimiterLevel4() });
Instead of adding this many delimiter methods in the load model, create a list or array of complex delimiters in the load model and then use it here.

done
@@ -631,7 +651,7 @@ public void setFactTimeStamp(long factTimeStamp) {
}

public String[] getDelimiters() {
return new String[] { complexDelimiterLevel1, complexDelimiterLevel2 };
return new String[] { complexDelimiterLevel1, complexDelimiterLevel2, complexDelimiterLevel3 };
Modify this class, and make the related changes in other classes, as per the above comment.

done
optionsFinal.put(
    "complex_delimiter_level_3",
    Maps.getOrDefault(options, "complex_delimiter_level_3", "\\\\003"));
Remove hardcoding

done
String nullFormat, int depth) {
if (depth > 2) {
  return null;
I think we should throw an exception with a proper error message if the depth is more than 2.

done
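A sketch of how the silent null return could become a descriptive failure. The class name, exception type, and message are assumptions, not the actual CarbonData code:

```java
public class NestingDepthCheck {
  // Hypothetical limit mirroring the "depth > 2" check in the snippet above
  private static final int MAX_SUPPORTED_DEPTH = 2;

  // Fail fast with a clear message instead of returning null
  public static void validate(int depth) {
    if (depth > MAX_SUPPORTED_DEPTH) {
      throw new UnsupportedOperationException(
          "Complex types are supported only up to " + MAX_SUPPORTED_DEPTH
              + " levels of nesting, but requested depth is " + depth);
    }
  }
}
```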
(String[]) configuration.getDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS);
Queue<String> complexDelimiters = new LinkedList<>();
for (int i = 0; i < 4; i++) {
Avoid using hardcoded values like 4 in loops while coding; instead, iterate over tempComplexDelimiters.length.

done
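A minimal sketch of the length-based loop; variable names follow the snippet above, while the class name is illustrative:

```java
import java.util.LinkedList;
import java.util.Queue;

public class DelimiterIterationSketch {
  public static Queue<String> toQueue(String[] tempComplexDelimiters) {
    Queue<String> complexDelimiters = new LinkedList<>();
    // Loop bound tied to the input array, not a hardcoded 4
    for (int i = 0; i < tempComplexDelimiters.length; i++) {
      complexDelimiters.add(tempComplexDelimiters[i]);
    }
    return complexDelimiters;
  }
}
```

This also keeps the code correct if the number of delimiter levels changes again later.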
Force-pushed from 4cc8ba1 to 8c186cc
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1705/
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9965/
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1915/
Force-pushed from 473f695 to 0ace143
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1916/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1706/
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9966/
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1917/
Force-pushed from 0ace143 to d57e326
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1712/
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9972/
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1923/
@@ -65,8 +65,7 @@
private String csvHeader;
private String[] csvHeaderColumns;
private String csvDelimiter;
private String complexDelimiterLevel1;
private String complexDelimiterLevel2;
private ArrayList<String> complexDelimiters = new ArrayList<>();
We can do lazy initialization in the setter method. This will avoid extra memory being consumed for non-complex-type schemas.

done
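A sketch of the lazy-initialization pattern being suggested; the class and method names are illustrative, not the actual CarbonLoadModel API:

```java
import java.util.ArrayList;
import java.util.List;

public class LazyDelimiterModelSketch {
  // Stays null for non-complex schemas, so no list is ever allocated for them
  private List<String> complexDelimiters;

  public void addComplexDelimiter(String delimiter) {
    if (complexDelimiters == null) {
      complexDelimiters = new ArrayList<>();
    }
    complexDelimiters.add(delimiter);
  }

  public List<String> getComplexDelimiters() {
    return complexDelimiters;
  }
}
```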
return new String[] { complexDelimiterLevel1, complexDelimiterLevel2 };
String[] delimiters = new String[complexDelimiters.size()];
delimiters = complexDelimiters.toArray(delimiters);
return delimiters;
This method is not required. Conversion can be done wherever required.
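A sketch of that call-site conversion, assuming the model exposes the delimiter list directly; the helper name is hypothetical:

```java
import java.util.List;

public class DelimiterConversionSketch {
  // Convert the list to an array where needed, instead of via a
  // dedicated getDelimiters() method on the load model
  public static String[] delimitersAsArray(List<String> complexDelimiters) {
    return complexDelimiters.toArray(new String[0]);
  }
}
```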
Force-pushed from 1d18435 to b0fa65c
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1716/
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9976/
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1928/
Force-pushed from b0fa65c to 7843195
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1719/
LGTM. Check for test case failures, if any, and fix.
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9979/
Force-pushed from 7843195 to 72ec8c3
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1931/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1722/
Force-pushed from 72ec8c3 to c8ffa07
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1723/
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9983/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1935/
Force-pushed from c8ffa07 to 0a84d52
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1727/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1939/
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9987/
Force-pushed from 0a84d52 to db02b72
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1757/
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1968/
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10016/
Force-pushed from db02b72 to fe4fd98
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1758/
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10017/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1969/
@manishnalla1994: Please rebase and remove the commit for PR 2979, as it is already merged.
Force-pushed from fe4fd98 to d200a65
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1764/
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10023/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1976/
LGTM
Support Create DDL for Map type. This closes #2980
Support Create DDL for Map type.
This PR is dependent on PR#2979 for the change of delimiters.
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
Any interfaces changed?
Any backward compatibility impacted?
Document update required?
Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.