[SPARK-1468] Modify the partition function used by partitionBy. #371

tyro89 · 2014-04-09T21:08:45Z

Make partitionBy use a tweaked version of hash as its default partition function
since the python hash function does not consistently assign the same value
to None across python processes.

Associated JIRA at https://issues.apache.org/jira/browse/SPARK-1468
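A minimal sketch of the idea, not the code from this pull request: the helper names `none_safe_hash` and `partition_index`, and the choice of 0 as the fixed value for `None`, are assumptions made for illustration. It only shows how pinning `None` to a constant makes the partition index independent of the process that computes it.

```python
def none_safe_hash(key):
    """Hash `key`, pinning None to a fixed value (hypothetical helper)."""
    if key is None:
        # Assumed constant; any fixed integer would serve the same purpose.
        return 0
    return hash(key)


def partition_index(key, num_partitions, partition_func=none_safe_hash):
    """Map a key to a partition index in [0, num_partitions)."""
    # Python's % with a positive modulus never returns a negative number.
    return partition_func(key) % num_partitions
```

With this sketch, `partition_index(None, n)` is the same in every process for a given `n`, because the result no longer depends on whatever value the interpreter assigns to `hash(None)`.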


AmplabJenkins · 2014-04-09T21:12:23Z

Merged build triggered.

AmplabJenkins · 2014-04-09T21:12:30Z

Merged build started.

AmplabJenkins · 2014-04-09T22:07:39Z

Merged build finished.

AmplabJenkins · 2014-04-09T22:07:39Z

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13962/

tyro89 · 2014-04-10T14:58:57Z

Not sure why the build is failing, as I'm pretty sure this change doesn't touch either of those two things.

pwendell · 2014-04-10T17:33:56Z

Jenkins, retest this please.

AmplabJenkins · 2014-04-10T17:38:12Z

Merged build triggered.

AmplabJenkins · 2014-04-10T17:38:21Z

Merged build started.

AmplabJenkins · 2014-04-10T18:17:38Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-04-10T18:17:38Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14003/

pwendell · 2014-04-10T22:22:54Z

@tyro89 Thanks for the fix, makes sense. Would you mind creating a JIRA for this on the Spark issue tracker? Also if there is a symptom or error that this causes that would be helpful to know (I'd guess it's just seeing the None key in multiple places on the reduce side of the shuffle).

Otherwise if people run into this it will be hard for them to learn where/when it was fixed.
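The symptom guessed at above can be illustrated with a small simulation that is not taken from Spark. Each simulated map-side process uses its own value for `hash(None)` (the numbers 1001 and 2002 are made up, and `make_process_hash` and `route` are hypothetical helpers), so the same `None` key is routed to different reduce partitions.

```python
def make_process_hash(none_hash_value):
    """Build a hash function simulating one Python process whose
    hash(None) equals `none_hash_value` (a made-up number)."""
    def process_hash(key):
        return none_hash_value if key is None else hash(key)
    return process_hash


def route(records, hash_func, num_partitions):
    """Group (key, value) records by their target partition index."""
    partitions = {}
    for key, value in records:
        index = hash_func(key) % num_partitions
        partitions.setdefault(index, []).append((key, value))
    return partitions


NUM_PARTITIONS = 4
# Two simulated processes that disagree on hash(None).
process_a = make_process_hash(1001)  # 1001 % 4 == 1
process_b = make_process_hash(2002)  # 2002 % 4 == 2

from_a = route([(None, "a1"), (3, "a2")], process_a, NUM_PARTITIONS)
from_b = route([(None, "b1"), (3, "b2")], process_b, NUM_PARTITIONS)

# Collect every partition index that received a None key.
none_partitions = {
    index
    for routed in (from_a, from_b)
    for index, records in routed.items()
    if any(key is None for key, _ in records)
}
```

Here `none_partitions` contains two different indices, while the integer key 3 lands in the same partition from both simulated processes.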

tyro89 · 2014-04-10T23:02:40Z

@pwendell opened jira https://issues.apache.org/jira/browse/SPARK-1468

mateiz · 2014-06-03T19:06:59Z

Jenkins, test this please

mateiz · 2014-06-03T19:07:37Z

Sorry for the delay, just re-testing this before merging it.

AmplabJenkins · 2014-06-03T19:08:02Z

Merged build triggered.

AmplabJenkins · 2014-06-03T19:19:03Z

Merged build started.

AmplabJenkins · 2014-06-03T20:14:01Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-06-03T20:14:01Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15395/

Make partitionBy use a tweaked version of hash as its default partition function since the python hash function does not consistently assign the same value to None across python processes.

Associated JIRA at https://issues.apache.org/jira/browse/SPARK-1468

Author: Erik Selin <erik.selin@jadedpixel.com>

Closes #371 from tyro89/consistent_hashing and squashes the following commits:

201c301 [Erik Selin] Make partitionBy use a tweaked version of hash as its default partition function since the python hash function does not consistently assign the same value to None across python processes.

(cherry picked from commit 8edc9d0)
Signed-off-by: Matei Zaharia <matei@databricks.com>

mateiz · 2014-06-03T20:33:57Z

Thanks Erik! Merged this into branch-0.9, 1.0 and master.





Make partitionBy use a tweaked version of hash as its default partition function since the python hash function does not consistently assign the same value to None across python processes.

201c301

tyro89 changed the title ~~Modify the partition function used by partitionBy.~~ [SPARK-1468] Modify the partition function used by partitionBy. Apr 10, 2014

asfgit closed this in 8edc9d0 Jun 3, 2014

mccheah pushed a commit to mccheah/spark that referenced this pull request Oct 3, 2018

Merge pull request apache#371 from palantir/jt/fix-case-sensitive-literals [SPARK-24151] fix case sensitive literals

9c32b5b

arjunshroff pushed a commit to arjunshroff/spark that referenced this pull request Nov 24, 2020

[MSPARK-331] Remove snapshot versions of mapr dependencies from Spark-2.3.1 (apache#371)

0b21a8a
