[HOTFIX] Fixed join query performance issue

Problem:
Join queries take more time with Carbon because some of them do not reuse the exchange plan (scan + shuffle). ReuseExchange checks whether two plans produce the same result, and for Carbon this check always returns false: SparkCarbonTableFormat does not override the equals method, so the comparison fails.

Solution:
Added an equals method to SparkCarbonTableFormat.

Tested with a TPCH query.
Query: select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_orderkey in ( select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 300 ) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate;

This closes #2650
apache · Aug 26, 2018 · f81543e
1 parent 17a4b48
commit f81543e
Showing 1 changed file with 1 addition and 0 deletions.
diff --git a/...k2/src/main/scala/org/apache/spark/sql/execution/datasources/SparkCarbonTableFormat.scala b/...k2/src/main/scala/org/apache/spark/sql/execution/datasources/SparkCarbonTableFormat.scala
@@ -221,6 +221,7 @@ with Serializable {
 
     }
   }
+  override def equals(other: Any): Boolean = other.isInstanceOf[SparkCarbonTableFormat]
 }
 
 case class CarbonSQLHadoopMapReduceCommitProtocol(jobId: String, path: String, isAppend: Boolean)
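
The reasoning in the commit message can be illustrated with a minimal, Spark-free sketch. It assumes that plan comparison eventually falls back to `equals` on the file format object held by each scan. All names below (`FormatWithoutEquals`, `FormatWithEquals`, `ScanNode`, `ReuseDemo`) are hypothetical stand-ins, not Spark or CarbonData classes. The `hashCode` override is an addition for consistency with `equals` and is not part of this commit.

```scala
// Hypothetical stand-ins for illustration; not Spark or CarbonData classes.

// Inherits reference equality from AnyRef, like the format class before the fix.
class FormatWithoutEquals

class FormatWithEquals {
  // Any two instances are treated as interchangeable, mirroring the one-line fix.
  override def equals(other: Any): Boolean = other.isInstanceOf[FormatWithEquals]
  // Kept consistent with equals (an addition in this sketch, not in the commit).
  override def hashCode(): Int = classOf[FormatWithEquals].hashCode
}

// A case class compares its fields with ==, so the format's equals decides the outcome.
final case class ScanNode(table: String, format: AnyRef)

object ReuseDemo {
  // Simplified stand-in for a "do these two plans give the same result?" check.
  def sameResult(a: ScanNode, b: ScanNode): Boolean = a == b

  // Two scans of the same table, each holding its own format instance.
  val withoutOverride: Boolean =
    sameResult(ScanNode("lineitem", new FormatWithoutEquals),
               ScanNode("lineitem", new FormatWithoutEquals))

  val withOverride: Boolean =
    sameResult(ScanNode("lineitem", new FormatWithEquals),
               ScanNode("lineitem", new FormatWithEquals))

  def main(args: Array[String]): Unit = {
    println(s"reusable without equals override: $withoutOverride")
    println(s"reusable with equals override:    $withOverride")
  }
}
```

In this sketch the two `lineitem` scans match only when the format class overrides `equals`, which is the condition the commit message gives for the exchange to be reused.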