
SPARK-2000:cannot connect to cluster in Standalone mode when run spark-shell in one of the cluster node without master option #952

Closed
wants to merge 4 commits

Conversation

CrazyJvm
Contributor

@CrazyJvm CrazyJvm commented Jun 3, 2014

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15389/

@witgo
Contributor

witgo commented Jun 3, 2014

@CrazyJvm
I think we should also modify SparkSubmitArguments.scala

@CrazyJvm
Contributor Author

CrazyJvm commented Jun 3, 2014

hi @witgo
Why do we need to modify this? "spark.master" is in spark-defaults.conf, so I think there's no problem.
Any suggestions are appreciated if you have any. :)

@witgo
Contributor

witgo commented Jun 3, 2014

SparkSubmitArguments.scala#L127

We can refer to the code of MasterArguments.scala

@CrazyJvm
Contributor Author

CrazyJvm commented Jun 3, 2014

Hi @witgo, I still can't figure out what you mean. Could you please give me some more detailed clues?

@witgo
Contributor

witgo commented Jun 4, 2014

```scala
// Global defaults. These should be kept to a minimum to avoid confusing behavior.
master = Option(master).getOrElse {
  if (System.getenv("SPARK_MASTER_HOST") != null) {
    val host = System.getenv("SPARK_MASTER_HOST")
    val port = Option(System.getenv("SPARK_MASTER_PORT")).getOrElse("7077").toInt
    s"spark://$host:$port"
  } else {
    "local[*]"
  }
}
```
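The fallback logic proposed above can be sketched as a small standalone function. This is only an illustration, not the actual Spark code: the `ResolveMaster` object and its `resolve` signature are hypothetical, and the environment is passed in as a plain `Map` so the logic can be exercised without touching real environment variables.

```scala
// Hypothetical sketch of the proposed master-resolution fallback:
// prefer an explicit master, then SPARK_MASTER_HOST/SPARK_MASTER_PORT
// from the (injected) environment, then local[*].
object ResolveMaster {
  def resolve(explicit: Option[String], env: Map[String, String]): String =
    explicit.getOrElse {
      env.get("SPARK_MASTER_HOST") match {
        case Some(host) =>
          // Default standalone master port is 7077.
          val port = env.getOrElse("SPARK_MASTER_PORT", "7077")
          s"spark://$host:$port"
        case None =>
          "local[*]"
      }
    }

  def main(args: Array[String]): Unit = {
    println(resolve(None, Map("SPARK_MASTER_HOST" -> "node1"))) // spark://node1:7077
    println(resolve(Some("yarn"), Map.empty))                   // yarn
    println(resolve(None, Map.empty))                           // local[*]
  }
}
```

An explicit `--master` value always wins; the environment is only consulted as a fallback, which matches the precedence in the snippet above.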

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@CrazyJvm
Contributor Author

CrazyJvm commented Jun 4, 2014

Thanks, @witgo. I think your suggestion may be better than my current solution, since we don't need to modify the shell script.
Another problem is that spark-shell cannot read spark-env.sh when submitting, because it does not source 'load-spark-env.sh'.
I will modify and test. Thanks a lot.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15429/

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15437/

@CrazyJvm CrazyJvm changed the title cannot connect to cluster in Standalone mode when run spark-shell in one of the cluster node without master option #SPARK2000 SPARK2000:cannot connect to cluster in Standalone mode when run spark-shell in one of the cluster node without master option Jun 6, 2014
@CrazyJvm CrazyJvm changed the title SPARK2000:cannot connect to cluster in Standalone mode when run spark-shell in one of the cluster node without master option SPARK-2000:cannot connect to cluster in Standalone mode when run spark-shell in one of the cluster node without master option Jun 6, 2014
@mateiz
Contributor

mateiz commented Jul 29, 2014

Wouldn't this be fixed by having conf/spark-defaults.conf set correctly on each cluster node? I don't think we should look at the environment here; we can just recommend creating this config file.

@CrazyJvm
Contributor Author

@mateiz Yes, I agree. I was motivated by http://spark.apache.org/docs/latest/spark-standalone.html, which says: "Note that if you are running spark-shell from one of the spark cluster machines, the bin/spark-shell script will automatically set MASTER from the SPARK_MASTER_IP and SPARK_MASTER_PORT variables in conf/spark-env.sh."
So should I modify the guide rather than the code?

@mateiz
Contributor

mateiz commented Jul 29, 2014

Oh, I see. It would be better to send a patch to the guide then: just tell users to add these settings to the .conf file.
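A minimal conf/spark-defaults.conf along those lines might look like the following (the host name is hypothetical; 7077 is the default standalone master port):

```
spark.master    spark://master-node:7077
```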

@SparkQA

SparkQA commented Jul 30, 2014

QA tests have started for PR 952. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17400/consoleFull

@CrazyJvm
Contributor Author

OK, I will close this PR and send another patch to the guide instead. Thanks for the discussion.

@CrazyJvm CrazyJvm closed this Jul 30, 2014
@SparkQA

SparkQA commented Jul 30, 2014

QA results for PR 952:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17400/consoleFull

ghost pushed a commit to dbtsai/spark that referenced this pull request Jul 7, 2016
…logRelation

#### What changes were proposed in this pull request?
Different from the other leaf nodes, `MetastoreRelation` and `SimpleCatalogRelation` have a pre-defined `alias`, which is used to change the qualifier of the node. However, based on the existing alias handling, alias should be put in `SubqueryAlias`.

This PR is to separate alias handling from `MetastoreRelation` and `SimpleCatalogRelation` to make it consistent with the other nodes. It simplifies the signature and conversion to a `BaseRelation`.

For example, below is an example query for `MetastoreRelation`,  which is converted to a `LogicalRelation`:
```SQL
SELECT tmp.a + 1 FROM test_parquet_ctas tmp WHERE tmp.a > 2
```

Before changes, the analyzed plan is
```
== Analyzed Logical Plan ==
(a + 1): int
Project [(a#951 + 1) AS (a + 1)apache#952]
+- Filter (a#951 > 2)
   +- SubqueryAlias tmp
      +- Relation[a#951] parquet
```
After changes, the analyzed plan becomes
```
== Analyzed Logical Plan ==
(a + 1): int
Project [(a#951 + 1) AS (a + 1)apache#952]
+- Filter (a#951 > 2)
   +- SubqueryAlias tmp
      +- SubqueryAlias test_parquet_ctas
         +- Relation[a#951] parquet
```

**Note: the optimized plans are the same.**

For `SimpleCatalogRelation`, the existing code always generates two Subqueries. Thus, no change is needed.

#### How was this patch tested?
Added test cases.

Author: gatorsmile <gatorsmile@gmail.com>

Closes apache#14053 from gatorsmile/removeAliasFromMetastoreRelation.