
Conversation

@yutoacts (Contributor)
@srowen (Member) commented Jul 29, 2021

This doesn't quite feel like the right place to document this. How about docs/spark-standalone.md in the main Spark project docs?

@HyukjinKwon (Member)

I actually suggested avoiding documenting it in the main docs because local-cluster is a test-only mode. But I am fine with putting it in docs/spark-standalone.md too. It looks like Tom wants to have it in the main docs as well.

@srowen (Member) commented Jul 29, 2021

I see, if this is really intended as a developer tool, this would be the right place. The very old SPARK-595 thread suggests it isn't totally for testing.

@tgravescs (Contributor)

I guess it isn't that big a deal, because we don't document it now and that doesn't seem to have been a big problem. But I assume this issue was filed for a reason, and my concern is that people know about and use local-cluster mode, so why not just clarify what it's for? There are two options: one is to obscure it by not documenting it, which works for some people, but it doesn't work for others who know about it or find it but don't realize it's for unit testing only. If we document all the run modes in a common place, it seems like it would be easier for users to find.
While it does run like standalone mode, it doesn't make sense to me to put it under the standalone mode docs; at least I wouldn't think to look there for it.

@tgravescs (Contributor)

Note that if others disagree, I'm fine with leaving it here in the developer docs. I would rather see it go either in the common docs, where we describe all run modes, or in the developer docs to obscure it from users.

@HyukjinKwon (Member)

My only concern about documenting it in the main docs is that it forces us to investigate and document it together whenever a cluster-related feature (like archives, resource profiles, etc.) is added, which gives developers some extra investigation overhead; see, for example, this instance in the main documentation (https://spark.apache.org/docs/latest/configuration.html#custom-resource-scheduling-and-configuration-overview).

If we document it once, and explicitly say there's no guarantee about such features since it's a test-only mode, I'm fine with doing it in the main docs too.

@HyukjinKwon (Member)

I am fine either way, no big deal. I will defer to @srowen and @tgravescs.

@yutoacts (Contributor, Author) commented Jul 30, 2021

Thanks for the suggestions. If it goes in the main docs, should it be documented in docs/spark-standalone.md or docs/submitting-applications.md (as in my initial PR: apache/spark#33537)? IMHO, documenting it in docs/spark-standalone.md might confuse people about standalone mode.


<p>When launching applications with spark-submit, in addition to the options listed in
<a href="https://spark.apache.org/docs/latest/submitting-applications.html#master-urls">Master URLs</a>
, set the local-cluster option to emulate a distributed cluster in a single JVM.</p>
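For illustration, a minimal sketch of what the documented usage looks like, assuming the `local-cluster[numWorkers,coresPerWorker,memoryPerWorkerMB]` master URL form; the example class and jar path below are placeholders, not from this thread:

```shell
# Emulate a distributed cluster in a single JVM: 2 workers,
# 1 core and 1024 MB of memory per worker. Test-only mode --
# not intended for production deployments.
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master "local-cluster[2,1,1024]" \
  path/to/spark-examples.jar \
  100
```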
(Contributor)

shouldn't we explicitly say this is for unit testing only?

(Contributor, Author)

I thought it wasn't necessary since it's on a developer-tools page, but I'm totally fine with explicitly saying that here.
BTW, if local-cluster mode ends up documented in the main docs (apache/spark#33537), should it still be documented here?

@yutoacts (Contributor, Author)

It ended up as apache/spark#33537.

@yutoacts yutoacts closed this Aug 19, 2021
@yutoacts yutoacts deleted the SPARK-36335 branch September 7, 2021 08:39
