Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CARBONDATA-3196] [CARBONDATA-3203]Fixed Compaction for Complex types with Dictionary Include and also supported Compaction for restructured table #3022

Closed

Conversation

manishnalla1994
Copy link
Contributor

@manishnalla1994 manishnalla1994 commented Dec 24, 2018

Problem1: Compaction Failing for Complex datatypes with Dictionary Include as KeyGenenrator was not being set in model for Dictionary Include Complex Columns and dictionary include complex columns were not handled for finding cardinality.

Solution: Handled both these issues by setting KeyGenerator and storing cardinality of Complex dictionary include columns.

Problem2: Compaction was failing for restructured table containing dictionary include complex columns.

Solution: Handled complex columns for this case by inserting correct indices of the columns.

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed?

  • Any backward compatibility impacted?

  • Document update required?

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@manishnalla1994 manishnalla1994 force-pushed the ComplexCompactionIssue branch 2 times, most recently from 5e975de to 0045616 Compare December 24, 2018 12:32
@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1929/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1930/

@CarbonDataQA
Copy link

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10183/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2140/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1952/

@@ -1068,4 +1068,40 @@ class TestCompactionComplexType extends QueryTest with BeforeAndAfterAll {
sql("Drop table if exists adaptive")
}

test("Test major compaction for struct of array type") {
sql("DROP TABLE IF EXISTS carbon")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Include dropping of table in afterAll also
  2. Give the test case description as below
    Test major compaction with dictionary include for struct of array type

new int[segmentProperties.getDimColumnsCardinality().length + segmentProperties
.getComplexDimColumnCardinality().length];
for (int i = 0; i < segmentProperties.getDimColumnsCardinality().length; i++) {
dimAndComplexColumnCardinality[i] = segmentProperties.getDimColumnsCardinality()[i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check for a resturcture drop column case when few loads are done and then dictionary column is dropped and compaction is triggered. In that case segmentProperties will contain cardinality of dropped column also check finally what is the schema and cardinality written during compaction

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The restructure case is not handled for complex types compaction. I have raised the JIRA issue and I will handle it. Please find the JIRA link : 'https://issues.apache.org/jira/browse/CARBONDATA-3203' .

@@ -699,4 +701,33 @@ public static boolean isRawDataRequired(CarbonDataLoadConfiguration configuratio
return iterators;
}

public static int[] calcDimensionLengths(int numberOfSortColumns, int[] complexCardinality) {
if (!(numberOfSortColumns > 0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Rewrite the condition if (!(numberOfSortColumns > 0)) as if (numberOfSortColumns == 0)
  2. The functionality of this method is not clear. Add a comment to explain the logic explanation and use of this method

for (int eachDimLen : complexCardinality) {
if (eachDimLen != 0) dimsLenList.add(eachDimLen);
}
int[] dimLens = new int[dimsLenList.size()];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversion from arrayList to array can be done directly

return dimLens;
}

public static KeyGenerator[] createKeyGeneratorForComplexDimension(int numberOfSortColumns,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a method comment to explain the logic and method usage

@CarbonDataQA
Copy link

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10205/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2227/

@manishnalla1994 manishnalla1994 changed the title [CARBONDATA-3196] Fixed Compaction for Complex types with Dictionary Include [CARBONDATA-3196] [CARBONDATA-3203]Fixed Compaction for Complex types with Dictionary Include and also support Compaction for restructured table Dec 27, 2018
@manishnalla1994 manishnalla1994 changed the title [CARBONDATA-3196] [CARBONDATA-3203]Fixed Compaction for Complex types with Dictionary Include and also support Compaction for restructured table [CARBONDATA-3196] [CARBONDATA-3203]Fixed Compaction for Complex types with Dictionary Include and also supported Compaction for restructured table Dec 27, 2018
@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2257/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2051/

@CarbonDataQA
Copy link

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10303/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2062/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2266/

@CarbonDataQA
Copy link

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10315/

@manishgupta88
Copy link
Contributor

LGTM

@asfgit asfgit closed this in 7c4e79f Dec 28, 2018
asfgit pushed a commit that referenced this pull request Jan 21, 2019
… with Dictionary Include and also supported Compaction for restructured table

Problem1: Compaction Failing for Complex datatypes with Dictionary Include as KeyGenenrator was not being set in model for Dictionary Include Complex
Columns and dictionary include complex columns were not handled for finding cardinality.

Solution: Handled both these issues by setting KeyGenerator and storing cardinality of Complex dictionary include columns.

Problem2: Compaction was failing for restructured table containing dictionary include complex columns.

Solution: Handled complex columns for this case by inserting correct indices of the columns.

This closes #3022
qiuchenjian pushed a commit to qiuchenjian/carbondata that referenced this pull request Jun 14, 2019
… with Dictionary Include and also supported Compaction for restructured table

Problem1: Compaction Failing for Complex datatypes with Dictionary Include as KeyGenenrator was not being set in model for Dictionary Include Complex
Columns and dictionary include complex columns were not handled for finding cardinality.

Solution: Handled both these issues by setting KeyGenerator and storing cardinality of Complex dictionary include columns.

Problem2: Compaction was failing for restructured table containing dictionary include complex columns.

Solution: Handled complex columns for this case by inserting correct indices of the columns.

This closes apache#3022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants