[CARBONDATA-3223] Fixed Wrong Datasize and Indexsize calculation for old store using Show Segments #3047

manishnalla1994 · 2019-01-02T12:46:09Z

Problem: A table created and loaded on an older version (1.1) showed a data size and index size of 0B when refreshed on the new version. This happened because when the stored data size came back as "null" we did not compute it and assigned 0 to it directly.

Solution: Show the data size and index size of such old-store segments as NA.

Also refactored the SetQuerySegment code for better readability.

CarbonDataQA · 2019-01-02T13:01:43Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2124/

CarbonDataQA · 2019-01-02T13:53:35Z

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2330/

CarbonDataQA · 2019-01-02T14:10:11Z

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10378/

qiuchenjian · 2019-01-03T00:55:39Z

integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala

+              (dataIndexSize.get(CarbonCommonConstants.CARBON_TOTAL_DATA_SIZE).toLong,
+                dataIndexSize.get(CarbonCommonConstants.CARBON_TOTAL_INDEX_SIZE).toLong)
+            } else {
+              (load.getDataSize.toLong,


If either load.getDataSize or load.getIndexSize is null, this will throw an exception. I think this case should be handled.

Yes, fixed it now.
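The null case raised above can be sketched as follows. This is only an illustration, not the code merged in this PR: `LoadEntry` and `storedSizes` are hypothetical names, and the one assumption taken from the diff is that both sizes are strings that may be null.

```scala
object NullSafeSizes {
  // Hypothetical stand-in for a load metadata entry. As with the getters in
  // the diff, both sizes are strings and may be null for an old store.
  final case class LoadEntry(dataSize: String, indexSize: String)

  // Returns Some((dataSize, indexSize)) only when both values are present,
  // so a single null can never reach `.toLong`.
  def storedSizes(load: LoadEntry): Option[(Long, Long)] =
    for {
      data <- Option(load.dataSize)   // Option(null) is None
      index <- Option(load.indexSize)
    } yield (data.toLong, index.toLong)
}
```

A caller can then treat `None` as "size missing" and decide separately whether to compute it or report it as unavailable.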

CarbonDataQA · 2019-01-03T04:54:24Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2135/

CarbonDataQA · 2019-01-03T05:53:22Z

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10389/

CarbonDataQA · 2019-01-03T05:54:21Z

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2341/

manishgupta88 · 2019-01-03T06:53:57Z

integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala

@@ -46,9 +47,9 @@ object CarbonStore {

  def showSegments(
      limit: Option[String],
-      tablePath: String,
+      carbonTable: CarbonTable,


Move carbonTable to be the first argument of the method.

manishgupta88 · 2019-01-03T07:06:59Z

integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala

+            if (null == load.getDataSize || null == load.getIndexSize) {
+              // If either of datasize or indexsize comes to be null the we calculate the correct
+              // size and assign
+              val dataIndexSize = CarbonUtil.calculateDataIndexSize(carbonTable, false)


The boolean flag in this method call controls whether the computed data and index sizes are written to the table status file. Pass the flag as true so that the sizes are computed once and the table status file is updated. This avoids recalculating them on every Show Segments call.
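The trade-off behind the flag can be illustrated with a small sketch. Everything here is hypothetical (`TableStatus`, `scan`, `sizesFor`, and the fixed sizes are stand-ins, not CarbonData APIs); it only shows how persisting the result avoids repeating the scan.

```scala
object PersistFlagSketch {
  // Hypothetical in-memory stand-in for the table status file.
  final class TableStatus(var sizes: Option[(Long, Long)] = None)

  // Counts how often the (assumed expensive) size scan runs.
  var scans = 0

  // Hypothetical scan; a real one would walk the segment files.
  private def scan(): (Long, Long) = {
    scans += 1
    (1024L, 64L)
  }

  // Returns stored sizes if present; otherwise scans, and stores the
  // result only when `persist` is true.
  def sizesFor(status: TableStatus, persist: Boolean): (Long, Long) =
    status.sizes.getOrElse {
      val computed = scan()
      if (persist) status.sizes = Some(computed)
      computed
    }
}
```

With `persist = false` every call scans again; with `persist = true` only the first call scans, at the cost of a write during a query.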

CarbonDataQA · 2019-01-03T09:01:56Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2143/

manishgupta88 · 2019-01-03T09:31:35Z

LGTM...can be merged once build passes

CarbonDataQA · 2019-01-03T10:21:19Z

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10397/

CarbonDataQA · 2019-01-03T10:25:41Z

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2349/

manishnalla1994 · 2019-01-03T10:26:08Z

retest this please

CarbonDataQA · 2019-01-03T11:55:52Z

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2361/

KanakaKumar · 2019-01-03T12:06:48Z

integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala

+            if (null == load.getDataSize || null == load.getIndexSize) {
+              // If either of datasize or indexsize comes to be null the we calculate the correct
+              // size and assign
+              val dataIndexSize = CarbonUtil.calculateDataIndexSize(carbonTable, true)


Show Segments is a read-only query. I think we should not perform a write operation in a query.
So I feel it is better to either calculate the sizes every time and show them, or just display them as not available.

Since this is a metadata function, we compute the sizes only once and save them by passing TRUE to 'calculateDataIndexSize'. The computed values can then be reused afterwards.
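The read-only alternative raised in this thread, showing a missing size as not available, can be sketched as below. `SizeDisplay.display` is a hypothetical helper, and the plain byte-count format is an assumption. The actual Show Segments output format is not shown in this thread.

```scala
object SizeDisplay {
  // Renders a stored size for display without writing anything back.
  // A null value (old store) becomes "NA" instead of a misleading "0B".
  def display(raw: String): String =
    Option(raw).map(v => s"${v.toLong}B").getOrElse("NA")
}
```

This keeps a genuine size of 0 distinguishable from a size that was never recorded.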

CarbonDataQA · 2019-01-03T12:07:59Z

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10401/

CarbonDataQA · 2019-01-03T12:18:46Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2148/

CarbonDataQA · 2019-01-04T11:22:07Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2165/

CarbonDataQA · 2019-01-04T12:22:48Z

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2379/

CarbonDataQA · 2019-01-04T12:27:37Z

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10420/

manishgupta88 · 2019-01-07T04:58:55Z

LGTM

…old store using Show Segments. Problem: A table created and loaded on an older version (1.1) showed a data size and index size of 0B when refreshed on the new version. This happened because when the stored data size came back as "null" we did not compute it and assigned 0 to it directly. Solution: Show the data size and index size of such old-store segments as NA. Also refactored the SetQuerySegment code for better readability. This closes #3047

qiuchenjian reviewed Jan 3, 2019

manishnalla1994 force-pushed the Datasize0Issue branch from 6bf65d7 to 7380eaa Compare January 3, 2019 04:43

manishgupta88 suggested changes Jan 3, 2019

manishnalla1994 force-pushed the Datasize0Issue branch from 7380eaa to 31c682a Compare January 3, 2019 08:48

KanakaKumar reviewed Jan 3, 2019

manishnalla1994 force-pushed the Datasize0Issue branch from 31c682a to 49bd919 Compare January 4, 2019 11:05

Fixed Wrong Datasize and Indexsize calculation for old store

c78f9df

manishnalla1994 force-pushed the Datasize0Issue branch from 49bd919 to c78f9df Compare January 4, 2019 11:07

asfgit closed this in 72da334 Jan 7, 2019
