Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-3393] [SQL] Align the log4j configuration for Spark & SparkSQLCLI #2263

Closed
wants to merge 1 commit into from

Conversation

chenghao-intel
Copy link
Contributor

User may be confused for the HQL logging & configurations, we'd better provide a default templates.

Both files are copied from Hive.

@chenghao-intel
Copy link
Contributor Author

@liancheng, this is not change of any code, but kind of hint for the user of HQL, I was asked couple of times for the configuration files, hope it helpful. Any comments? :)

# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The license header is not exactly the same as other files, I'm not sure but maybe RAT may reject this one.

@liancheng
Copy link
Contributor

I know that hive-log4j.properties.template is copied from Hive, but this file really makes me itchy...

Besides that, I think these are generally good to have, thanks!

log4j.category.JPOX.General=ERROR,DRFA
log4j.category.JPOX.Enhancer=ERROR,DRFA
log4j.category.scheduler.DAGScheduler=ERROR,DRFA

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line

@chenghao-intel
Copy link
Contributor Author

Thanks for reviewing this, you're right, the log4j configuration template mainly for Hive usage (particularly for MapReduce), I've updated the code and keep it simple for demo purpose.

I also checked the other template files under conf, seems the licence header is not necessary for .properties files. Let's do it the same way?

@liancheng
Copy link
Contributor

Not all .properties files are ignored by RAT, it's controlled by .rat-excludes, you may add hive-log4j.properties.template there.

@chenghao-intel
Copy link
Contributor Author

Thank you @liancheng . I didn't notice that file previously. :)

test this please.

@marmbrus
Copy link
Contributor

marmbrus commented Sep 9, 2014

Its not really clear to me what this buys over the log4j.template that is already there. I'm also not sure what the xsl file is for. Finally an empty hive-site.xml file only saves you from adding <configuration> which most users are probably going to copy from wherever they found the hive options they want at. Given that I think this might be more clutter than its worth. What do you think?

@chenghao-intel
Copy link
Contributor Author

@marmbrus those files are just a hint for users if they want to change some of the default settings. Probably not everybody knows what files exactly they should put under the folder conf/
e.g.
hive-log4j.properties.template => hive-log4j.properties (not log4j.properties see https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala#L81)
hive-site.xml.template is for people familiar with Shark, they probably confused with the Hive dependency means. In Shark, Hive is also required to be installed, and the hive-site.xml is placed under $HIVE/conf.
configuration.xsl is the xml style file for display the hive-site.xml on browser.(http://www.w3.org/Style/XSL/WhatIsXSL.html)

I agree it's probably more clutter if we add those 3 files, but at least we should keep the hive-log4j.properties.template, or we have to change the code SparkSQLCLIDriver.scala to load log4j.properties instead of the hive-log4j.properties. What do you think?

@liancheng
Copy link
Contributor

I'd agree that at least hive-log4j.properties.template should be good to have. Partly because I myself had once been confused by this a lot...

@chenghao-intel
Copy link
Contributor Author

Thank you @liancheng , I've updated the code.

@SparkQA
Copy link

SparkQA commented Sep 9, 2014

QA tests have started for PR 2263 at commit e027d23.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 9, 2014

QA tests have finished for PR 2263 at commit e027d23.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Copy link
Contributor

Oh, okay thanks for explaining this! My inclination would be to update our code to read from log4j.properties instead, unless there is some compelling reason not to. Otherwise, can we maybe add a comment to the template file that explains what it is for and why its needed in addition to the standard config file?

@chenghao-intel
Copy link
Contributor Author

2 logj4j properties files does make people confusing, I will try to update the code, let's see if we can find a simple way to reuse the log4j.properties.

@chenghao-intel chenghao-intel changed the title [SPARK-3393] [SQL] add configuration template for HQL users [SPARK-3393] [SQL] Align the log4j configuration for Spark & SparkSQLCLI Sep 12, 2014
@chenghao-intel
Copy link
Contributor Author

@liancheng @marmbrus I've updated the code by removing the log4j initialization code in SparkSQLCLIDriver.scala, now it will load the log4j.properties by default, and it works fine in my cluster testing.

@SparkQA
Copy link

SparkQA commented Sep 12, 2014

QA tests have started for PR 2263 at commit 53bffa9.

  • This patch merges cleanly.

@liancheng
Copy link
Contributor

Does this mean Hive log will never be initialized? Won't this prevent us from checking Hive logs?

@SparkQA
Copy link

SparkQA commented Sep 12, 2014

QA tests have finished for PR 2263 at commit 53bffa9.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@chenghao-intel
Copy link
Contributor Author

The Hive LogUtils just load the specified .properties file for Log4J, see https://github.com/apache/hive/blob/branch-0.12/common/src/java/org/apache/hadoop/hive/common/LogUtils.java#L117-L126

If we remove the logic in SparkSQLCliDriver, Log4J will load the log4j.properties under the classpath automatically.

@marmbrus
Copy link
Contributor

@liancheng so is this ready to merge?

@marmbrus
Copy link
Contributor

@liancheng ping

@liancheng
Copy link
Contributor

Sorry, this LGTM.

@marmbrus
Copy link
Contributor

Thanks! Merged to master.

@asfgit asfgit closed this in 7364fa5 Sep 26, 2014
@chenghao-intel chenghao-intel deleted the hive_template branch October 9, 2014 04:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants