SPARK-2000: cannot connect to cluster in standalone mode when running spark-shell on one of the cluster nodes without the master option #952
Conversation
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
@CrazyJvm
hi @witgo
SparkSubmitArguments.scala#L127 We can refer to the code in MasterArguments.scala.
Hi witgo, I still cannot figure out what you mean... could you please give me some more detailed clues?
```scala
// Global defaults. These should be kept to a minimum to avoid confusing behavior.
master = Option(master).getOrElse {
  if (System.getenv("SPARK_MASTER_HOST") != null) {
    val host = System.getenv("SPARK_MASTER_HOST")
    val port = Option(System.getenv("SPARK_MASTER_PORT")).getOrElse("7077")
    s"spark://$host:$port"
  } else {
    "local[*]"
  }
}
```
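The fallback order suggested above can be sketched as a small standalone function. This is a hypothetical helper, not the actual SparkSubmitArguments code; the environment is passed in as a Map so the resolution logic is easy to exercise in isolation:

```scala
// Sketch of the proposed master-resolution fallback (hypothetical helper).
// Precedence: explicit --master value > SPARK_MASTER_HOST/PORT env vars > local[*].
object MasterResolver {
  def resolveMaster(explicit: Option[String], env: Map[String, String]): String =
    explicit.getOrElse {
      env.get("SPARK_MASTER_HOST") match {
        case Some(host) =>
          // Default to the standard standalone master port when unset.
          val port = env.getOrElse("SPARK_MASTER_PORT", "7077")
          s"spark://$host:$port"
        case None =>
          "local[*]"
      }
    }
}
```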
Thanks, witgo. I think your suggestion may be better than my current solution, since we would not need to modify the shell script.
… read "SPARK_MASTER_IP".
Wouldn't this be fixed by having conf/spark-defaults.conf set correctly on each cluster node? I don't think we should look at the environment here; we can just recommend creating this config file.
@mateiz Yes, I agree. I was motivated by http://spark.apache.org/docs/latest/spark-standalone.html, which says: "Note that if you are running spark-shell from one of the spark cluster machines, the bin/spark-shell script will automatically set MASTER from the SPARK_MASTER_IP and SPARK_MASTER_PORT variables in conf/spark-env.sh."
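The spark-env.sh mechanism the docs describe looks roughly like this (host name is hypothetical):

```shell
# conf/spark-env.sh -- example values; bin/spark-shell derives MASTER from these
SPARK_MASTER_IP=master-host
SPARK_MASTER_PORT=7077
```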
Oh, I see. It would be better if you send a patch to the guide then -- just tell users to add this stuff into the .conf file. |
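The config-file approach suggested here would be a one-line entry per node (the host name below is a placeholder):

```
# conf/spark-defaults.conf -- example; replace master-host with your master's address
spark.master    spark://master-host:7077
```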
QA tests have started for PR 952. This patch merges cleanly. |
OK, so I will close this PR and send another patch to the guide instead. Thanks for the discussion.
QA results for PR 952: |
JIRA: https://issues.apache.org/jira/browse/SPARK-2000