
[CARBONDATA-3337] Implemented Hadoop RPC framework for index server #3171

Conversation


@kunal642 kunal642 commented Apr 3, 2019

Be sure to complete the following checklist to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed?

  • Any backward compatibility impacted?

  • Document update required?

  • Testing done
    Please provide details on:
    - Have new unit test cases been added? If not, why are no new tests required?
    - How was it tested? Please attach the test report.
    - Is it a performance-related change? Please attach the performance test report.
    - Any additional information to help reviewers test this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2838/

@CarbonDataQA

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/11098/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3069/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2839/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2840/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3071/

@CarbonDataQA

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/11100/

.setBindAddress(serverIp)
.setPort(serverPort)
.setProtocol(classOf[ServerInterface]).build
server.start()
Contributor


Does the start function have a return code to judge the status of starting?

Contributor Author


No, the return type is void.
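Since Hadoop's RPC server start() returns void rather than a status code, a caller that needs a health signal has to track it separately. The sketch below is purely illustrative (ToyRpcServer and isRunning are hypothetical names, not CarbonData or Hadoop APIs); it shows one way to expose a liveness flag around a void start():

```java
// Illustrative sketch only: a void start() mirrored by an explicit state flag.
// ToyRpcServer and isRunning() are hypothetical; a real Hadoop RPC server
// would bind sockets and spawn handler threads inside start().
public class ToyRpcServer {
    private volatile boolean running = false;

    // Mirrors the void signature of a Hadoop-style Server#start().
    public void start() {
        running = true; // stand-in for the real startup work
    }

    public boolean isRunning() {
        return running;
    }

    public static void main(String[] args) {
        ToyRpcServer server = new ToyRpcServer();
        server.start();
        System.out.println(server.isRunning()); // prints: true
    }
}
```

With this pattern, callers check isRunning() (or catch an exception thrown by start()) instead of relying on a return code that does not exist.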

@@ -2104,4 +2104,17 @@ private CarbonCommonConstants() {
*/
public static final String CARBON_QUERY_DATAMAP_BLOOM_CACHE_SIZE_DEFAULT_VAL = "512";

}
public static final String CARBON_INDEX_SERVER_POLICY = "carbon.index.server.policy";
Contributor

@jackylk jackylk Apr 23, 2019


Please write a description for this config in a comment: in what scenario the config is used, why a user would configure it, what the valid values are, whether it is dynamically configurable, etc.
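A minimal sketch of the kind of documented constant the reviewer is asking for. The javadoc wording and the class name IndexServerConstants are assumptions for illustration; only the property key itself comes from this PR:

```java
// Illustrative only: a configuration constant documented in the style the
// reviewer requests. The javadoc text is a sketch, not the project's wording.
public class IndexServerConstants {
    /**
     * Controls where datamap pruning runs when an index server is deployed,
     * so that pruning load can be moved off the driver.
     * Valid values (as suggested by this PR's constants): "driver",
     * "distributed", "embedded". Assumed to be read once at startup.
     */
    public static final String CARBON_INDEX_SERVER_POLICY = "carbon.index.server.policy";

    public static void main(String[] args) {
        System.out.println(CARBON_INDEX_SERVER_POLICY); // prints the property key
    }
}
```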


public static final String CARBON_DRIVER_PRUNE_POLICY = "driver";

public static final String CARBON_INDEX_SERVER_IP = "carbon.index.server.ip";
Contributor

@jackylk jackylk Apr 23, 2019


Please write comments for all new configurations. The same comments should also go into the documentation.

@@ -1601,4 +1601,47 @@ private void validateDetailQueryBatchSize() {
}
}
}
}

public String getIndexServerPolicy() {
Contributor


Please add comments for all public functions.


public static final String CARBON_DISTRIBUTED_PRUNE_POLICY = "distributed";

public static final String CARBON_EMBEDDED_PRUNE_POLICY = "embedded";
Contributor


create a "CARBON_INDEX_SERVER_POLICY_DEFAULT" constant

LOGGER.info("The configured value for " + CarbonCommonConstants.CARBON_INDEX_SERVER_POLICY
+ " is not valid, therefore taking " + CarbonCommonConstants.CARBON_EMBEDDED_PRUNE_POLICY
+ " as the index server prune policy");
return CarbonCommonConstants.CARBON_EMBEDDED_PRUNE_POLICY;
Contributor


use CARBON_INDEX_SERVER_POLICY_DEFAULT
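The reviewer's suggestion can be sketched as follows: funnel the fallback through a single CARBON_INDEX_SERVER_POLICY_DEFAULT constant instead of repeating the "embedded" literal at each fallback site. The PrunePolicy class and validate method are hypothetical names for illustration; the three policy values come from this PR:

```java
// Sketch of the suggested refactoring: one default constant as the single
// source of truth for the fallback. PrunePolicy/validate are hypothetical.
public class PrunePolicy {
    public static final String CARBON_DRIVER_PRUNE_POLICY = "driver";
    public static final String CARBON_DISTRIBUTED_PRUNE_POLICY = "distributed";
    public static final String CARBON_EMBEDDED_PRUNE_POLICY = "embedded";
    // The constant the reviewer asks for; every fallback path uses it.
    public static final String CARBON_INDEX_SERVER_POLICY_DEFAULT =
        CARBON_EMBEDDED_PRUNE_POLICY;

    public static String validate(String configured) {
        if (CARBON_DRIVER_PRUNE_POLICY.equals(configured)
            || CARBON_DISTRIBUTED_PRUNE_POLICY.equals(configured)
            || CARBON_EMBEDDED_PRUNE_POLICY.equals(configured)) {
            return configured;
        }
        // Invalid or missing value: fall back to the shared default.
        return CARBON_INDEX_SERVER_POLICY_DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(validate("unknown")); // prints: embedded
    }
}
```

Changing the default later then means editing one constant rather than hunting down every literal.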

import org.apache.carbondata.core.indexstore.ExtendedBlocklet
import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf

class DistributedDataMapJob extends AbstractDataMapJob {
Contributor


Why is this class in the spark2 integration module? Can't it be put in carbondata-core?

import org.apache.carbondata.spark.rdd.CarbonRDD


class DistributedPruneRDD(@transient private val ss: SparkSession,
Contributor


Please write a comment.


override def internalCompute(split: Partition,
context: TaskContext): Iterator[(String, ExtendedBlocklet)] = {
Nil.iterator
Contributor


No implementation?

throw new InvalidConfigurationException("Spark master URL is not set.")
}
val spark = SparkSession
.builder().config(new SparkConf())
Contributor


Please move the config calls together.

}

object IndexServer extends ServerInterface {
val prunePolicy: String = CarbonProperties.getInstance().getIndexServerPolicy
Contributor


Use these variables in main directly; no need to declare local variables.

}

def getClient: ServerInterface = {
import org.apache.hadoop.ipc.RPC
Contributor


Please move the import to the head of the file.


private val serverPort: Int = CarbonProperties.getInstance().getIndexServerPort

def getSplits(request: DistributableDataMapFormat): Array[(String, ExtendedBlocklet)] = {
Contributor


no override?

@kunal642
Contributor Author

@jackylk This is the old PR. Can you review the new PR #3177 instead of this one? I will handle the comments from this PR in #3177.

@kunal642 kunal642 closed this Apr 23, 2019