[SPARK-19660][CORE][SQL] Replace the configuration property names that are deprecated in the version of Hadoop 2.6 #16990

Closed
wants to merge 7 commits into from

Conversation

@SparkQA

SparkQA commented Feb 19, 2017

Test build #73127 has finished for PR 16990 at commit b13ca22.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@srowen srowen left a comment

I agree with the principle, and all the changes do look as described. I am familiar with some of the properties and yes these are their newer counterparts. I agree we should avoid use of deprecated properties where possible.

@SparkQA

SparkQA commented Feb 19, 2017

Test build #73128 has finished for PR 16990 at commit 39b3778.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@felixcheung felixcheung left a comment

ok, LGTM

@felixcheung
Member

[info] *** 34 TESTS FAILED ***
[error] Failed: Total 2311, Failed 34, Errors 0, Passed 2277, Ignored 597
[error] Failed tests:
[error] 	org.apache.spark.sql.hive.execution.HiveCompatibilitySuite
[error] (hive/test:test) sbt.TestsFailedException: Tests unsuccessful

@wangyum
Member Author

wangyum commented Feb 20, 2017

I'm working on the test failures.

@SparkQA

SparkQA commented Feb 21, 2017

Test build #73205 has started for PR 16990 at commit d8e5862.

@SparkQA

SparkQA commented Feb 21, 2017

Test build #73207 has finished for PR 16990 at commit 97734c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Feb 22, 2017

@srowen @felixcheung
The golden-file name depends on the SQL query text (it embeds the query's MD5), see:
https://github.com/apache/spark/blob/v2.1.0/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveComparisonTest.scala#L314

For example, the MD5 of set mapred.reduce.tasks=31 is 83c59d378571a6e487aa20217bd87817, while the MD5 of set mapreduce.job.reduces=31 is be2c0b32a02a1154bfdee1a52530f387.

So I renamed the following files:

mv input12_hadoop20-0-db1cd54a4cb36de2087605f32e41824f                          input12_hadoop20-0-2b9ccaa793eae0e73bf76335d3d6880
mv join14_hadoop20-1-db1cd54a4cb36de2087605f32e41824f                           join14_hadoop20-1-2b9ccaa793eae0e73bf76335d3d6880
mv auto_join14_hadoop20-2-db1cd54a4cb36de2087605f32e41824f                      auto_join14_hadoop20-2-2b9ccaa793eae0e73bf76335d3d6880
mv groupby4_noskew-2-83c59d378571a6e487aa20217bd87817                           groupby4_noskew-2-be2c0b32a02a1154bfdee1a52530f387
mv groupby4_map-2-83c59d378571a6e487aa20217bd87817                              groupby4_map-2-be2c0b32a02a1154bfdee1a52530f387                        
mv groupby4_map_skew-2-83c59d378571a6e487aa20217bd87817                         groupby4_map_skew-2-be2c0b32a02a1154bfdee1a52530f387                   
mv groupby7_map-3-83c59d378571a6e487aa20217bd87817                              groupby7_map-3-be2c0b32a02a1154bfdee1a52530f387                        
mv groupby2_limit-0-83c59d378571a6e487aa20217bd87817                            groupby2_limit-0-be2c0b32a02a1154bfdee1a52530f387                      
mv groupby6_map_skew-2-83c59d378571a6e487aa20217bd87817                         groupby6_map_skew-2-be2c0b32a02a1154bfdee1a52530f387                   
mv groupby5_map_skew-2-83c59d378571a6e487aa20217bd87817                         groupby5_map_skew-2-be2c0b32a02a1154bfdee1a52530f387                   
mv groupby5_noskew-2-83c59d378571a6e487aa20217bd87817                           groupby5_noskew-2-be2c0b32a02a1154bfdee1a52530f387                     
mv groupby2_map-2-83c59d378571a6e487aa20217bd87817                              groupby2_map-2-be2c0b32a02a1154bfdee1a52530f387                        
mv groupby7_noskew-3-83c59d378571a6e487aa20217bd87817                           groupby7_noskew-3-be2c0b32a02a1154bfdee1a52530f387                     
mv groupby1_map_skew-2-83c59d378571a6e487aa20217bd87817                         groupby1_map_skew-2-be2c0b32a02a1154bfdee1a52530f387                   
mv groupby8_map-2-83c59d378571a6e487aa20217bd87817                              groupby8_map-2-be2c0b32a02a1154bfdee1a52530f387                        
mv groupby6_noskew-2-83c59d378571a6e487aa20217bd87817                           groupby6_noskew-2-be2c0b32a02a1154bfdee1a52530f387                     
mv groupby7_map_skew-2-83c59d378571a6e487aa20217bd87817                         groupby7_map_skew-2-be2c0b32a02a1154bfdee1a52530f387                   
mv groupby7_noskew_multi_single_reducer-2-83c59d378571a6e487aa20217bd87817      groupby7_noskew_multi_single_reducer-2-be2c0b32a02a1154bfdee1a52530f387
mv groupby_map_ppr-2-83c59d378571a6e487aa20217bd87817                           groupby_map_ppr-2-be2c0b32a02a1154bfdee1a52530f387                     
mv groupby8_map_skew-2-83c59d378571a6e487aa20217bd87817                         groupby8_map_skew-2-be2c0b32a02a1154bfdee1a52530f387                   
mv groupby6_map-2-83c59d378571a6e487aa20217bd87817                              groupby6_map-2-be2c0b32a02a1154bfdee1a52530f387                        
mv groupby1_noskew-2-83c59d378571a6e487aa20217bd87817                           groupby1_noskew-2-be2c0b32a02a1154bfdee1a52530f387                     
mv groupby1_limit-0-83c59d378571a6e487aa20217bd87817                            groupby1_limit-0-be2c0b32a02a1154bfdee1a52530f387                      
mv groupby5_map-2-83c59d378571a6e487aa20217bd87817                              groupby5_map-2-be2c0b32a02a1154bfdee1a52530f387                        
mv groupby7_map_multi_single_reducer-2-83c59d378571a6e487aa20217bd87817         groupby7_map_multi_single_reducer-2-be2c0b32a02a1154bfdee1a52530f387   
mv groupby1_map-2-83c59d378571a6e487aa20217bd87817                              groupby1_map-2-be2c0b32a02a1154bfdee1a52530f387                        
mv groupby2_noskew-2-83c59d378571a6e487aa20217bd87817                           groupby2_noskew-2-be2c0b32a02a1154bfdee1a52530f387                     
mv groupby8_noskew-2-83c59d378571a6e487aa20217bd87817                           groupby8_noskew-2-be2c0b32a02a1154bfdee1a52530f387                     
mv groupby2_map_skew-2-83c59d378571a6e487aa20217bd87817                         groupby2_map_skew-2-be2c0b32a02a1154bfdee1a52530f387   
mv groupby1-3-d57ed4bbfee1ffaffaeba0a4be84c31d                                  groupby1-3-c8478dac3497697b4375ee35118a5c3e
mv merge2-2-c95dc367df88c9e5cf77157f29ba2daf                                    merge2-2-6142f47d3fcdd4323162014d5eb35e07
mv combine2-2-c95dc367df88c9e5cf77157f29ba2daf                                  combine2-2-6142f47d3fcdd4323162014d5eb35e07
mv combine1-2-c95dc367df88c9e5cf77157f29ba2daf                                  combine1-2-6142f47d3fcdd4323162014d5eb35e07
mv combine1-3-6e53a3ac93113f20db3a12f1dcf30e86                                  combine1-3-10266e3d5dd4c841c0d65030b1edba7c
mv combine2-3-6e53a3ac93113f20db3a12f1dcf30e86                                  combine2-3-10266e3d5dd4c841c0d65030b1edba7c
mv merge2-3-6e53a3ac93113f20db3a12f1dcf30e86                                    merge2-3-10266e3d5dd4c841c0d65030b1edba7c
mv combine1-4-84967075baa3e56fff2a23f8ab9ba076                                  combine1-4-9cbd6d400fb6c3cd09010e3dbd76601
mv combine2-4-84967075baa3e56fff2a23f8ab9ba076                                  combine2-4-9cbd6d400fb6c3cd09010e3dbd76601
mv merge2-4-84967075baa3e56fff2a23f8ab9ba076                                    merge2-4-9cbd6d400fb6c3cd09010e3dbd76601
mv merge2-5-2ee5d706fe3a3bcc38b795f6e94970ea                                    merge2-5-1ba2d6f3bb3348da3fee7fab4f283f34
mv combine1-5-2ee5d706fe3a3bcc38b795f6e94970ea                                  combine1-5-1ba2d6f3bb3348da3fee7fab4f283f34
mv combine2-5-2ee5d706fe3a3bcc38b795f6e94970ea                                  combine2-5-1ba2d6f3bb3348da3fee7fab4f283f34
mv groupby1-5-dd7bf298b8c921355edd8665c6b0c168                                  groupby1-5-c9cee6382b64bd3d71177527961b8be2
mv leftsemijoin_mr-7-8e9c2969b999557363e40f9ebb3f6d7c                           leftsemijoin_mr-7-6b9861b999092f1ea4fa1fd27a666af6
mv rcfile_lazydecompress-5-68975193b30cb34102b380e647d8d5f4                     rcfile_lazydecompress-5-dd959af1968381d0ed90178d349b01a7
mv input_testsequencefile-0-68975193b30cb34102b380e647d8d5f4                    input_testsequencefile-0-dd959af1968381d0ed90178d349b01a7
mv input_testsequencefile-1-1c0f3be2d837dee49312e0a80440447e                    input_testsequencefile-1-ddbb8d5e5dc0988bda96ac2b4aec8f94
mv rcfile_lazydecompress-11-3708198aac609695b22e19e89306034c                    rcfile_lazydecompress-11-25715870c569b0f8c3d483e3a38b3199
mv input_testsequencefile-5-3708198aac609695b22e19e89306034c                    input_testsequencefile-5-25715870c569b0f8c3d483e3a38b3199
mv parallel-0-23a4feaede17467a8cc26e4d86ec30f9                                  parallel-0-6dc30e2de057022e63bd2a645fbec4c2
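
The renames above follow from the naming scheme: each golden file name ends in a hash of the query text, so changing the property inside a set statement changes the name. A minimal Python sketch of that relationship, assuming the suffix is the hex MD5 of the UTF-8 query text. golden_file_name is a hypothetical helper, and the exact string normalization done by HiveComparisonTest is not reproduced here, so the digests it prints may not match the ones listed above:

```python
import hashlib


def golden_file_name(test_case, index, query):
    """Build '<test_case>-<index>-<md5 of query>' (hypothetical helper)."""
    digest = hashlib.md5(query.encode("utf-8")).hexdigest()
    return "{}-{}-{}".format(test_case, index, digest)


old_name = golden_file_name("groupby4_noskew", 2, "set mapred.reduce.tasks=31")
new_name = golden_file_name("groupby4_noskew", 2, "set mapreduce.job.reduces=31")

# Same test case and statement index, different query text, so the suffix differs.
print(old_name)
print(new_name)
```

Because the hash covers the whole statement, every test whose script contains a renamed property needs its golden file renamed, even though the expected output is unchanged.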

Member

@srowen srowen left a comment

Looks good to me. Given how it touches Hadoop config, maybe @vanzin or @steveloughran has a comment

@steveloughran
Contributor

steveloughran commented Feb 25, 2017

LGTM, though you'd need full test coverage to verify that there isn't a typo in any of the strings. This is why, although Spark has adopted the more readable inline strings, I'm more of a fan of "refer to the constant", both for spelling and for the ability to locate uses through the IDE.

That said, in HDFS-9301, HDFS-10610 and HDFS-6418 I have expressed my concerns about HDFS constants, and invariably encountered resistance to fixing regressions. Core and YARN are stable, and I'll happily revert anything there if people complain.

Note also that deprecation warnings go to a special log, "org.apache.hadoop.conf.Configuration.deprecation". Set that to WARN and you don't get told off any more.
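
As a sketch of that point, assuming a log4j 1.x properties file such as Spark's conf/log4j.properties (the file location is an assumption; the logger name is the one quoted above), raising that logger's threshold could look like:

```properties
# Hadoop reports deprecated keys through this logger at INFO level;
# raising the threshold to WARN hides those messages.
log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=WARN
```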

Finally, I still have no idea why HDFS-531 changed fs.default.name to a new mixed case string. It does, well, nothing.

Member

@srowen srowen left a comment

I went over this again to check more carefully and have two small questions.

@@ -1515,12 +1515,12 @@ def test_oldhadoop(self):

conf = {
"mapred.output.format.class": "org.apache.hadoop.mapred.SequenceFileOutputFormat",
Member

I'm not sure what this key was supposed to be before; maybe mapreduce.outputformat.class? But it can be mapreduce.job.outputformat.class now?

Member Author

@wangyum wangyum Feb 27, 2017

@srowen
mapred.output.format.class maps to the old API and mapreduce.job.outputformat.class maps to the new API. See:
https://github.com/wangyum/spark/blob/97734c5af3df4e6525e8015459af16ab193dfc24/python/pyspark/tests.py#L1664-L1679
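
To make the old/new API distinction concrete, here is a small Python sketch. The two property keys are the ones named in this thread; the new-API class name is the standard Hadoop one and is not quoted in the thread. output_api is a hypothetical helper that only inspects a dict and does not call Spark:

```python
# Job confs keyed the way each Hadoop output API expects.
OLD_API_CONF = {
    "mapred.output.format.class":
        "org.apache.hadoop.mapred.SequenceFileOutputFormat",
}
NEW_API_CONF = {
    "mapreduce.job.outputformat.class":
        "org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat",
}


def output_api(conf):
    """Report which Hadoop output API a job conf targets (hypothetical)."""
    if "mapreduce.job.outputformat.class" in conf:
        return "new"  # such a conf would go to rdd.saveAsNewAPIHadoopDataset
    if "mapred.output.format.class" in conf:
        return "old"  # such a conf would go to rdd.saveAsHadoopDataset
    return "unknown"


print(output_api(OLD_API_CONF))
print(output_api(NEW_API_CONF))
```

The point is that mapred.output.format.class is not simply a deprecated spelling of the new key: the two keys select different APIs, so the old-API test keeps the old key.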

@@ -19,7 +19,7 @@ select * from (select a.key from smb_bucket_1 a join smb_bucket_2 b on (a.key =

set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
-set hive.mapred.reduce.tasks.speculative.execution=false;
+set hive.mapreduce.job.reduces.speculative.execution=false;
Member

Is this supposed to be mapreduce.reduce.speculative? I'm looking at https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
Or maybe the hive.* version is different?

Contributor

Looks like it's hive.mapred.reduce.tasks.speculative.execution in the Hive wiki: https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties

But it's probably best to pull in a Hive developer, maybe @jcamachor. Jesus: could you look at the Hive config options and make sure they are the current set?

@steveloughran, I checked the code, and the property name in Hive is hive.mapred.reduce.tasks.speculative.execution.

Contributor

thanks

@wangyum
Member Author

wangyum commented Feb 27, 2017

OK. I have reverted set hive.mapreduce.job.reduces.speculative.execution=false to set hive.mapred.reduce.tasks.speculative.execution=false.
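
One way to avoid this kind of over-eager rename is to match whole property keys rather than substrings. A minimal Python sketch, not part of this patch: DEPRECATED_KEYS and modernize are hypothetical names, and the three pairs are taken from this thread and the Hadoop DeprecatedProperties page linked above:

```python
# Deprecated Hadoop key -> replacement.
DEPRECATED_KEYS = {
    "mapred.reduce.tasks": "mapreduce.job.reduces",
    "mapred.reduce.tasks.speculative.execution": "mapreduce.reduce.speculative",
    "fs.default.name": "fs.defaultFS",
}


def modernize(conf):
    """Return a copy of conf with exact-match deprecated keys renamed."""
    return {DEPRECATED_KEYS.get(key, key): value for key, value in conf.items()}


updated = modernize({
    "mapred.reduce.tasks": "31",
    # A Hive property that merely contains a deprecated Hadoop name.
    "hive.mapred.reduce.tasks.speculative.execution": "false",
})
print(updated)
```

Because the lookup uses the whole key, hive.mapred.reduce.tasks.speculative.execution is left untouched, whereas a plain substring replacement of mapred.reduce.tasks would have produced the hive.mapreduce.job.reduces.speculative.execution name that was reverted here.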

@SparkQA

SparkQA commented Feb 27, 2017

Test build #73511 has finished for PR 16990 at commit 21956db.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Feb 28, 2017

I merged this to master, but the script gave an error from git. I had experienced some intermittent Github errors, but I also have a new environment. The commit looks correct but hasn't synced immediately to Github. Not sure what's happened, but will monitor it.

@asfgit asfgit closed this in 9b8eca6 Feb 28, 2017
6 participants