[CARBONDATA-3054] Fix Dictionary file cannot be read in S3a #2876

ajantha-bhat · 2018-10-29T12:29:09Z

[CARBONDATA-3054] Fix Dictionary file cannot be read in S3a with CarbonDictionaryDecoder.doConsume() codeGen

problem: In an S3a environment, when querying data that has dictionary files, the dictionary file cannot be read through the CarbonDictionaryDecoder.doConsume() codeGen path even though the file is present.

cause: The CarbonDictionaryDecoder.doConsume() codeGen path does not set the hadoop conf in the thread-local variable; only doExecute() sets it. Hence, when getDictionaryWrapper() is called from the doConsume() codeGen path, the fileExists() check in AbstractDictionaryCache.getDictionaryMetaCarbonFile() returns false.

solution: In the CarbonDictionaryDecoder.doConsume() codeGen path, set the hadoop conf in the thread-local variable.
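
To illustrate the cause described above, here is a minimal, self-contained sketch of how a configuration held in a thread-local is invisible to a thread that never set it. This is not CarbonData code: SessionConf, DictionaryStore, the example path, and the rule that a file is "found" only when credentials are present in the thread-local conf are all assumptions made for illustration.

```scala
import java.util.concurrent.{Callable, Executors}

// Hypothetical stand-in for a thread-local configuration holder
// (not CarbonData's ThreadLocalSessionInfo).
object SessionConf {
  private val holder = new ThreadLocal[Map[String, String]] {
    override def initialValue(): Map[String, String] = Map.empty
  }
  def set(conf: Map[String, String]): Unit = holder.set(conf)
  def get: Map[String, String] = holder.get()
}

// Hypothetical file check: it "finds" an s3a file only if the calling
// thread's conf carries credentials (a simplification for this sketch).
object DictionaryStore {
  def fileExists(path: String): Boolean =
    path.startsWith("s3a://") && SessionConf.get.contains("fs.s3a.access.key")
}

object ThreadLocalConfDemo {
  // Returns (result on a worker thread without conf, result after setting conf on it).
  def run(): (Boolean, Boolean) = {
    val conf = Map("fs.s3a.access.key" -> "dummy-key")
    val path = "s3a://example-bucket/col.dictmeta" // illustrative path only
    // Set on the caller thread only; the worker thread does not inherit it.
    SessionConf.set(conf)
    val pool = Executors.newSingleThreadExecutor()
    try {
      val withoutConf = pool.submit(new Callable[Boolean] {
        override def call(): Boolean = DictionaryStore.fileExists(path)
      }).get()
      val withConf = pool.submit(new Callable[Boolean] {
        override def call(): Boolean = {
          SessionConf.set(conf) // the fix: set the conf on the thread that reads the file
          DictionaryStore.fileExists(path)
        }
      }).get()
      (withoutConf, withConf)
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    val (before, after) = run()
    println(s"fileExists without thread-local conf: $before")
    println(s"fileExists with thread-local conf: $after")
  }
}
```

In this sketch the first check runs on a thread whose thread-local is still empty, so the file is reported as missing; once the same thread sets the conf, the check succeeds.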

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

Any interfaces changed? NA
Any backward compatibility impacted? NA
Document update required? NA
Testing done? Done
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

ajantha-bhat · 2018-10-29T12:29:26Z

@ravipesala , @kunal642 : please review

CarbonDataQA · 2018-10-29T13:38:28Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1122/

CarbonDataQA · 2018-10-29T14:02:03Z

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1334/

CarbonDataQA · 2018-10-29T15:02:04Z

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9386/

ajantha-bhat · 2018-10-29T15:08:55Z

retest this please

CarbonDataQA · 2018-10-29T16:13:28Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1126/

CarbonDataQA · 2018-10-29T16:20:26Z

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1339/

CarbonDataQA · 2018-10-29T18:02:03Z

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9390/

ajantha-bhat · 2018-10-30T05:01:10Z

retest this please

CarbonDataQA · 2018-10-30T05:14:43Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1135/

CarbonDataQA · 2018-10-30T06:13:04Z

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1347/

CarbonDataQA · 2018-10-30T06:14:37Z

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9399/

CarbonDataQA · 2018-10-30T06:50:02Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1140/

CarbonDataQA · 2018-10-30T07:19:39Z

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9405/

CarbonDataQA · 2018-10-30T07:49:53Z

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1352/

manishgupta88 · 2018-10-30T08:13:47Z

integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDictionaryDecoder.scala


      val exprs = child.output.map { exp =>
        ExpressionCanonicalizer.execute(BindReferences.bindReference(exp, child.output))
      }
      ctx.currentVars = input
      val resultVars = exprs.zipWithIndex.map { case (expr, index) =>
        if (dicts(index) != null) {
+          ThreadLocalSessionInfo.setConfigurationToCurrentThread(conf.value.value)


Remove this call from here.

manishgupta88 · 2018-10-30T08:15:33Z

integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDictionaryDecoder.scala


  var dictionary: Dictionary = null

  var dictionaryLoader: DictionaryLoader = _

  def getDictionaryValueForKeyInBytes (surrogateKey: Int): Array[Byte] = {
    if (dictionary == null) {
+      ThreadLocalSessionInfo.getOrCreateCarbonSessionInfo().getNonSerializableExtraInfo


Move this into the doConsume method, inside if (CarbonDictionaryDecoder.isRequiredToDecode(getDictionaryColumnIds)) {, as the wrapper is not the correct place for setting ThreadLocalSessionInfo.

It cannot be moved. If it is moved, there is a serialization problem with code gen.

manishgupta88 · 2018-10-30T09:54:06Z

LGTM...can be merged once build passes

CarbonDataQA · 2018-10-30T11:41:09Z

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9418/

CarbonDataQA · 2018-10-30T11:47:47Z

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1366/

CarbonDataQA · 2018-10-30T12:04:32Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1154/

CarbonDataQA · 2018-10-30T15:37:48Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1165/

CarbonDataQA · 2018-10-30T16:35:20Z

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9429/

CarbonDataQA · 2018-10-30T16:35:32Z

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1378/

manishgupta88 · 2018-10-31T05:16:04Z

LGTM

CarbonDataQA · 2018-10-31T05:52:40Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1173/

CarbonDataQA · 2018-10-31T06:49:48Z

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9437/

CarbonDataQA · 2018-10-31T07:07:45Z

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1386/

ajantha-bhat changed the title ~~[CARBONDATA-3054] Fix Dictionary file cannot be read in S3a with CarbonDictionaryDecoder.doConsume() codeGen~~ [CARBONDATA-3054] Fix Dictionary file cannot be read in S3a Oct 29, 2018

ajantha-bhat force-pushed the master_new branch from 4928236 to 9563057 Compare October 30, 2018 06:38

manishgupta88 reviewed Oct 30, 2018

manishgupta88 suggested changes Oct 30, 2018

[CARBONDATA-3054] Fix Dictionary file cannot be read in S3a with CarbonDictionaryDecoder.doConsume() codeGen

6f1da20

ajantha-bhat force-pushed the master_new branch from 9563057 to 6f1da20 Compare October 30, 2018 09:49

test case fix

4a369d4

added comments

3ca0258

asfgit closed this in bcf3e0f Oct 31, 2018
