Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CARBONDATA-3223] Fixed Wrong Datasize and Indexsize calculation for old store using Show Segments #3047

Closed

Conversation

manishnalla1994
Copy link
Contributor

@manishnalla1994 manishnalla1994 commented Jan 2, 2019

Problem: Table Created and Loading on older version(1.1) was showing data-size and index-size 0B when refreshed on new version. This was because when the data-size was coming as "null" we were not computing it, directly assigning 0 value to it.

Solution: Showing the old datasize and indexsize as NA.

Also refactored SetQuerySegment code for better understandability.

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed?

  • Any backward compatibility impacted?

  • Document update required?

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2124/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2330/

@CarbonDataQA
Copy link

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10378/

(dataIndexSize.get(CarbonCommonConstants.CARBON_TOTAL_DATA_SIZE).toLong,
dataIndexSize.get(CarbonCommonConstants.CARBON_TOTAL_INDEX_SIZE).toLong)
} else {
(load.getDataSize.toLong,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if one of load.getDataSize and load.getIndexSize is null, it will throw exception, i think this scene should be considered

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, fixed it now.

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2135/

@CarbonDataQA
Copy link

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10389/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2341/

@@ -46,9 +47,9 @@ object CarbonStore {

def showSegments(
limit: Option[String],
tablePath: String,
carbonTable: CarbonTable,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move carbonTable as the first argument of method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (null == load.getDataSize || null == load.getIndexSize) {
// If either of datasize or indexsize comes to be null the we calculate the correct
// size and assign
val dataIndexSize = CarbonUtil.calculateDataIndexSize(carbonTable, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Boolean flag in the method call is to update the data and index size in the table status file. Pass the flag as true so that it computes the size and update the table status file. This will avoid calculation for each Show Segment call

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2143/

@manishgupta88
Copy link
Contributor

LGTM...can be merged once build passes

@CarbonDataQA
Copy link

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10397/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2349/

@manishnalla1994
Copy link
Contributor Author

retest this please

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2361/

if (null == load.getDataSize || null == load.getIndexSize) {
// If either of datasize or indexsize comes to be null the we calculate the correct
// size and assign
val dataIndexSize = CarbonUtil.calculateDataIndexSize(carbonTable, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Show segments is a read only query. I think we should not perform write operation in a query.
So, I feel its better to calculate every time and show OR just display as not available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it is a metadata function, we are just computing it once and saving it while passing TRUE in 'calculateDataIndexSize' this function. So the value computed can be used afterwards also.

@CarbonDataQA
Copy link

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10401/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2148/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2165/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2379/

@CarbonDataQA
Copy link

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10420/

@manishgupta88
Copy link
Contributor

LGTM

@asfgit asfgit closed this in 72da334 Jan 7, 2019
asfgit pushed a commit that referenced this pull request Jan 21, 2019
…old store using Show Segments

Problem: Table Created and Loading on older version(1.1) was showing data-size and index-size 0B when refreshed on new version. This was
because when the data-size was coming as "null" we were not computing it, directly assigning 0 value to it.

Solution: Showing the old datasize and indexsize as NA.

Also refactored SetQuerySegment code for better understandability.

This closes #3047
qiuchenjian pushed a commit to qiuchenjian/carbondata that referenced this pull request Jun 14, 2019
…old store using Show Segments

Problem: Table Created and Loading on older version(1.1) was showing data-size and index-size 0B when refreshed on new version. This was
because when the data-size was coming as "null" we were not computing it, directly assigning 0 value to it.

Solution: Showing the old datasize and indexsize as NA.

Also refactored SetQuerySegment code for better understandability.

This closes apache#3047
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants