[HOTFIX]Fixed Join Query Performance issue
Problem:
Join queries take more time on Carbon because, for some of them, the exchange plan (scan + shuffle) is not reused. ReuseExchange checks whether two plans produce the same result; for Carbon this check always returned false because the equals method of SparkCarbonTableFormat was not overridden, so the comparison failed.

Solution: Added an equals method to SparkCarbonTableFormat so that two instances compare as equal.
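
The effect of the fix can be illustrated with a minimal sketch. It assumes, as the problem statement describes, that the plan comparison ultimately depends on equality of the file format object held by the relation. `Format`, `PlainFormat`, `FixedFormat`, `Relation` and `ReuseSketch` are hypothetical stand-ins, not the real Spark or CarbonData classes, and the `hashCode` override is an addition of this sketch that is not part of the commit.

```scala
// Hypothetical stand-ins; not the real Spark or CarbonData classes.
trait Format

// No equals override: falls back to reference equality.
class PlainFormat extends Format

// Mirrors the one-line fix: every instance of the class is equal to every other.
class FixedFormat extends Format {
  override def equals(other: Any): Boolean = other.isInstanceOf[FixedFormat]
  // Keeps hashCode consistent with equals (assumption of this sketch, not in the commit).
  override def hashCode(): Int = classOf[FixedFormat].hashCode
}

// Case-class equality compares fields with ==, so it delegates to Format.equals.
case class Relation(table: String, format: Format)

object ReuseSketch {
  // Simplified stand-in for the "same result" check between two plans.
  def samePlan(a: Relation, b: Relation): Boolean = a == b

  def main(args: Array[String]): Unit = {
    // Two scans of the same table, each built with a fresh format instance.
    println(samePlan(Relation("lineitem", new PlainFormat), Relation("lineitem", new PlainFormat)))
    println(samePlan(Relation("lineitem", new FixedFormat), Relation("lineitem", new FixedFormat)))
  }
}
```

With the plain format the two relations never compare equal, so a reuse check built on that comparison cannot match them; with the overridden equals they do.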

Tested with a TPC-H query.
Query:
select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_orderkey in (
    select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 300
  )
  and c_custkey = o_custkey
  and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
order by o_totalprice desc, o_orderdate;

This closes #2650
kumarvishal09 authored and ravipesala committed Aug 26, 2018
1 parent 17a4b48 commit f81543e
Showing 1 changed file with 1 addition and 0 deletions.
@@ -221,6 +221,7 @@ with Serializable {

}
}
override def equals(other: Any): Boolean = other.isInstanceOf[SparkCarbonTableFormat]
}

case class CarbonSQLHadoopMapReduceCommitProtocol(jobId: String, path: String, isAppend: Boolean)