[SPARK-17247][SQL]: when calculating size of a relation from hdfs, the size calculation should be aborted once size is established to be > broadcast threshold. #14817
Conversation
@Parth-Brahmbhatt we are currently working on Cost Based Optimization in Spark. An important input will be the actual size of the table; having partial statistics (what you are suggesting) will not make this as good as it can be. The other issue is that the size of the table is currently used to determine whether we do a hash join or a sort merge join: in order to do a hash join, the build side must be at least 3x smaller than the stream side, and the build side should not occupy more memory per worker than the broadcast threshold. This change would break that too.
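For illustration, a minimal sketch of the two size checks described above; the names (`PlanStats`, `JoinSizing`) are hypothetical, not Spark's actual JoinSelection code, and the threshold mirrors the 10 MB default of spark.sql.autoBroadcastJoinThreshold rather than anything from this PR:

```scala
// Stand-in for a plan's size estimate.
case class PlanStats(sizeInBytes: BigInt)

object JoinSizing {
  // Mirrors the 10 MB default of spark.sql.autoBroadcastJoinThreshold.
  val broadcastThreshold: Long = 10L * 1024 * 1024

  // Broadcast hash join: every worker holds a full copy of the build side,
  // so it must fit under the broadcast threshold.
  def canBroadcast(build: PlanStats): Boolean =
    build.sizeInBytes <= broadcastThreshold

  // Hash join vs. sort merge join: the build side must also be at least
  // 3x smaller than the stream side.
  def muchSmaller(build: PlanStats, stream: PlanStats): Boolean =
    build.sizeInBytes * 3 <= stream.sizeInBytes
}
```

Both checks depend on the size estimate being trustworthy, which is why an overestimate merely disables broadcasting while an underestimate can cause an OOM.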
@hvanhovell The behavior when fallbackToHdfs is not enabled (and by default it is not enabled, for performance reasons) is to return the value specified via spark.sql.defaultSizeInBytes (default Long.MaxValue), which is also wrong, and I am fine with returning that value instead of returning the partial value. This patch is trying to reduce the perf penalty that we have to pay when the fallback is enabled; in that case, trying to get an accurate size is just too expensive when one of the two tables being joined is huge. I could also add a config to enable/disable this behavior, but not even having the option just makes this fallback useless in most cases.
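A minimal sketch of the fallback behavior described here, assuming the two config values have already been resolved; the method name is illustrative, not the actual relation code:

```scala
// With the fallback off, the table reports defaultSizeInBytes
// (Long.MaxValue by default), which effectively rules out broadcasting it.
def estimatedSizeInBytes(
    fallBackToHdfs: Boolean,   // e.g. spark.sql.statistics.fallBackToHdfs
    defaultSizeInBytes: Long,  // e.g. spark.sql.defaultSizeInBytes
    sizeFromHdfs: => Long      // the (possibly expensive) filesystem scan
  ): Long = {
  if (fallBackToHdfs) sizeFromHdfs else defaultSizeInBytes
}
```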
Code under review: `private def getSize(f: Path): Long = {`
I see what you're doing but this method no longer really does what it says. I think this is going to be error-prone.
Renamed the method, added documentation, and changed it to return the default size in bytes when size > broadcast threshold, instead of returning any value > broadcast threshold.
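A hedged sketch of the early-abort idea under discussion; the method and parameter names are illustrative, not the PR's exact code. Traversal stops as soon as the running total passes the threshold, and the default size is reported instead of a partial value:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

import scala.collection.mutable

def sizeCappedAtThreshold(
    fs: FileSystem,
    root: Path,
    threshold: Long,
    defaultSizeInBytes: Long): Long = {
  var total = 0L
  val stack = mutable.Stack(fs.getFileStatus(root))
  while (stack.nonEmpty && total <= threshold) {
    val status = stack.pop()
    if (status.isDirectory) {
      // Defer subdirectories instead of recursing, so we can bail out early.
      stack.pushAll(fs.listStatus(status.getPath))
    } else {
      total += status.getLen
    }
  }
  // Past the threshold, the exact size no longer matters for the broadcast
  // decision, so report the default rather than a misleading partial sum.
  if (total > threshold) defaultSizeInBytes else total
}
```

With this shape, a table far over the threshold stops listing after only a handful of files, which is the perf win the description claims.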
@hvanhovell can you also point me at the design doc / discussion thread for the CBO work? Thanks.
Can one of the committers take a look at this PR?
@Parth-Brahmbhatt here is the CBO ticket: https://issues.apache.org/jira/browse/SPARK-16026 Could you explain why this is so slow? Is this because of listing the files? Or because of the amount? We might be able to speed up the first one, see AlterTableRecoverPartitionsCommand.
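For reference, a rough sketch in the spirit of AlterTableRecoverPartitionsCommand's parallel scan, not its actual code; the helper name is illustrative:

```scala
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

def listLeafFiles(fs: FileSystem, dir: Path): Seq[FileStatus] = {
  val (subDirs, files) = fs.listStatus(dir).partition(_.isDirectory)
  // .par fans the recursive listing out over the default fork-join pool,
  // so the slow (and, on S3, paginated) listStatus calls overlap.
  files.toSeq ++ subDirs.par.flatMap(d => listLeafFiles(fs, d.getPath)).seq
}
```

Parallelism hides per-call latency but still visits every file, which is the limitation raised below.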
@hvanhovell it's because of listing, and it gets worse as the number of files increases.
@Parth-Brahmbhatt would the approach taken in AlterTableRecoverPartitionsCommand work for you?
@hvanhovell I will take a look at it and update this PR.
@hvanhovell I looked at AlterTableRecoverPartitionsCommand; while the parallelism in listing could help, it will still incur a huge perf penalty. We have tables with millions of partitions, and we use S3 for storage, where listing is more expensive. I think it is much better to just stop listing once we know the stat, which is used only for join optimization, won't meet the threshold, and I don't see the downside compared to what we currently offer.
Can someone please review this PR? Thanks.
Requesting review one more time.
@Parth-Brahmbhatt I am very curious why you have millions of partitions. What is the use case? You will be in a world of hurt as soon as you do any listing. I am not going to merge this PR as-is, as it clashes with the work on CBO. I would support an ANALYZE TABLE command that allows for partial scanning.
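For context, the ANALYZE TABLE syntax Spark SQL already supports, assuming a SparkSession named `spark` and an illustrative table name; a partial-scan variant as suggested above would be a hypothetical extension of this command:

```scala
// Full scan: computes size and row count for the table.
spark.sql("ANALYZE TABLE events COMPUTE STATISTICS")
// NOSCAN: records only the total size in bytes, skipping the full scan.
spark.sql("ANALYZE TABLE events COMPUTE STATISTICS NOSCAN")
```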
Can one of the admins verify this patch?
@hvanhovell We have tables with 5-6 partition columns and data going back 4-5 years, and given that our data is stored in S3, the listing is paginated. If you want to wait until the CBO work is done, that is fine; we can resume reviewing then.
Thanks for reporting it! After CBO, the relation size is no longer used only for deciding whether a table can be broadcast. Maybe we can close this PR now?
What changes were proposed in this pull request?
When calculating the size of a relation from HDFS, the size calculation should be aborted once the size is established to be > the broadcast threshold. This is really helpful when the fallback config is enabled and one of the tables in the query is partitioned and fairly large.
How was this patch tested?
Unit test modified.