[CARBONDATA-3134] fixed null values when cachelevel is set as blocklet #2956

Closed

kunal642
Contributor

Problem:
For each blocklet, an object of SegmentPropertiesAndSchemaHolder is created to store the schema used for the query. This object is created only if no other blocklet has the same schema. To check the schema we compare List<ColumnSchema>, but since the equals method in ColumnSchema does not take columnUniqueId into account, the comparison wrongly treats the schemas as equal and the newly restructured blocklet reuses the schema of the old blocklet. As a result the newly added column is ignored, because the old blocklet schema marks that column as deleted (alter drop).

Solution:
Instead of checking equality through the existing equals and hashCode, write a new implementation of both that compares based on columnUniqueId.
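
As a rough sketch of the idea (not the actual patch): hashCodeWithColumnId is the helper referenced in the diff below, while equalsWithColumnId is a hypothetical name used here only for illustration. The columnUniqueId-aware helpers on ColumnSchema could look like this:

// Sketch only. Assumes java.util.Objects is imported; the existing
// equals()/hashCode() of ColumnSchema are left unchanged.
public int hashCodeWithColumnId() {
  // include the unique id so a column that was dropped and re-added with the
  // same name (but a new id) no longer hashes like the old column
  return 31 * hashCode() + (getColumnUniqueId() == null ? 0 : getColumnUniqueId().hashCode());
}

public boolean equalsWithColumnId(ColumnSchema other) {  // hypothetical name
  return other != null && equals(other)
      && Objects.equals(getColumnUniqueId(), other.getColumnUniqueId());
}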

Be sure to complete the following checklist to help us incorporate your contribution quickly and easily:

  • Any interfaces changed?

  • Any backward compatibility impacted?

  • Document update required?

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1547/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1759/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9807/

@kunal642 force-pushed the bug/CARBONDATA-3134 branch 2 times, most recently from 5468f98 to ff20746 on November 27, 2018 at 11:25
@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1551/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9811/

.equals(columnCardinality, other.columnCardinality);
}

private boolean checkColumnSchemaEquality(List<ColumnSchema> obj1, List<ColumnSchema> obj2) {
List<ColumnSchema> clonedObj1 = new ArrayList<>(obj1);
Contributor

You can add a length check in the first line of the method: if the lengths of the two lists are not the same, we can return false right away.

Contributor

I think the checkColumnSchemaEquality method needs to consider "obj2 == null".

Contributor Author

done
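
For reference, after addressing the two comments above the list comparison could look roughly like this (a sketch built from the hunk quoted in this thread, not the merged code; it assumes java.util.ArrayList, Collections, Comparator, List and Objects are available):

private boolean checkColumnSchemaEquality(List<ColumnSchema> obj1, List<ColumnSchema> obj2) {
  // early exits suggested above: null check and size check
  if (obj1 == null || obj2 == null || obj1.size() != obj2.size()) {
    return false;
  }
  // sort copies by columnUniqueId so the element-wise comparison is order independent
  List<ColumnSchema> clonedObj1 = new ArrayList<>(obj1);
  List<ColumnSchema> clonedObj2 = new ArrayList<>(obj2);
  Comparator<ColumnSchema> byColumnId = new Comparator<ColumnSchema>() {
    @Override public int compare(ColumnSchema o1, ColumnSchema o2) {
      return o1.getColumnUniqueId().compareTo(o2.getColumnUniqueId());
    }
  };
  Collections.sort(clonedObj1, byColumnId);
  Collections.sort(clonedObj2, byColumnId);
  for (int i = 0; i < clonedObj1.size(); i++) {
    // compare the schema fields and the columnUniqueId, not just the default equals()
    if (!clonedObj1.get(i).equals(clonedObj2.get(i)) || !Objects.equals(
        clonedObj1.get(i).getColumnUniqueId(), clonedObj2.get(i).getColumnUniqueId())) {
      return false;
    }
  }
  return true;
}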

-    return tableIdentifier.hashCode() + columnsInTable.hashCode() + Arrays
+    int hashCode = 0;
+    for (ColumnSchema columnSchema: columnsInTable) {
+      hashCode += columnSchema.hashCodeWithColumnId();
Contributor

Rename the variable to allColumnsHashCode.

Contributor Author

done
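
After the rename, the hash computation could look roughly like this (a sketch from the hunk above; the field names tableIdentifier, columnsInTable and columnCardinality come from the quoted lines, and the truncated "+ Arrays" term is assumed to be Arrays.hashCode(columnCardinality)):

public int hashCode() {
  // sum the per-column hashes that include columnUniqueId
  int allColumnsHashCode = 0;
  for (ColumnSchema columnSchema : columnsInTable) {
    allColumnsHashCode += columnSchema.hashCodeWithColumnId();
  }
  // assumption: the truncated "+ Arrays" term is Arrays.hashCode(columnCardinality)
  return tableIdentifier.hashCode() + allColumnsHashCode + Arrays.hashCode(columnCardinality);
}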

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1763/

return o1.getColumnUniqueId().compareTo(o2.getColumnUniqueId());
}
});
Collections.sort(clonedObj2, new Comparator<ColumnSchema>() {
Contributor

You can factor out the duplicated code of the two comparators, since they are identical.

Contributor Author

done
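
One way to remove the duplication flagged above is to define the comparator once and reuse it for both sorts, for example (sketch only; the constant name and its placement are hypothetical):

// defined once, e.g. as a constant in the holder class (hypothetical placement)
private static final Comparator<ColumnSchema> COLUMN_UNIQUE_ID_COMPARATOR =
    new Comparator<ColumnSchema>() {
      @Override public int compare(ColumnSchema o1, ColumnSchema o2) {
        return o1.getColumnUniqueId().compareTo(o2.getColumnUniqueId());
      }
    };

// then both sorts share it:
Collections.sort(clonedObj1, COLUMN_UNIQUE_ID_COMPARATOR);
Collections.sort(clonedObj2, COLUMN_UNIQUE_ID_COMPARATOR);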

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1557/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1769/

@CarbonDataQA

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9816/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1561/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9819/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1772/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1562/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9820/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1773/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1565/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9823/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1776/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1567/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1778/

@CarbonDataQA

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9825/

@kunal642
Contributor Author

@manishgupta88 Please review

@manishgupta88
Contributor

LGTM

@asfgit closed this in a5f080b on Nov 28, 2018
asfgit pushed a commit that referenced this pull request Nov 30, 2018

This closes #2956