[SPARK-46749][DOCS] Document SPARK_LOG_* and SPARK_PID_DIR #44774

Closed
dongjoon-hyun wants to merge 1 commit


@dongjoon-hyun (Member) commented Jan 17, 2024

What changes were proposed in this pull request?

This PR documents the following three environment variables for Spark Standalone clusters.

  • SPARK_LOG_DIR
  • SPARK_LOG_MAX_FILES
  • SPARK_PID_DIR
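
As a minimal sketch of how these three variables might be set, the fragment below could go in `conf/spark-env.sh`. The paths and the value `10` are hypothetical examples chosen for illustration, not values recommended by this PR.

```shell
# conf/spark-env.sh (hypothetical example values, adjust for your hosts)

# Directory where the Spark daemons write their log files.
export SPARK_LOG_DIR=/var/log/spark

# Maximum number of rotated log files kept per daemon.
export SPARK_LOG_MAX_FILES=10

# Directory where the Spark daemons store their pid files.
export SPARK_PID_DIR=/var/run/spark
```

Leaving any of these unset keeps the defaults described in this PR.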

Why are the changes needed?

Until now, users have had to read the spark-env.sh.template or spark-daemon.sh files to find the descriptions and default values of these variables. They should be documented officially.

conf/spark-env.sh.template:

# - SPARK_LOG_DIR Where log files are stored. (Default: ${SPARK_HOME}/logs)
# - SPARK_LOG_MAX_FILES Max log files of Spark daemons can rotate to. Default is 5.
# - SPARK_PID_DIR Where the pid file is stored. (Default: /tmp)

sbin/spark-daemon.sh:

# SPARK_LOG_DIR Where log files are stored. ${SPARK_HOME}/logs by default.
# SPARK_LOG_MAX_FILES Max log files of Spark daemons can rotate to. Default is 5.
# SPARK_MASTER host:path where spark code should be rsync'd from
# SPARK_PID_DIR The pid files are stored. /tmp by default.
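
The defaults quoted above can be illustrated with ordinary shell parameter expansion. This is only a sketch: `print_spark_daemon_env` is a hypothetical helper written for this example, not a function in Spark, and it is not the actual logic in `spark-daemon.sh`.

```shell
# Hypothetical helper (not part of Spark): print the effective value of each
# variable, falling back to the documented default when it is unset or empty.
print_spark_daemon_env() {
  # Abort with a message if SPARK_HOME is missing, since the log default
  # is derived from it.
  : "${SPARK_HOME:?SPARK_HOME must be set}"

  echo "SPARK_LOG_DIR=${SPARK_LOG_DIR:-${SPARK_HOME}/logs}"
  echo "SPARK_LOG_MAX_FILES=${SPARK_LOG_MAX_FILES:-5}"
  echo "SPARK_PID_DIR=${SPARK_PID_DIR:-/tmp}"
}
```

With `SPARK_HOME=/opt/spark` (an assumed install path) and the three variables unset, calling `print_spark_daemon_env` prints `SPARK_LOG_DIR=/opt/spark/logs`, `SPARK_LOG_MAX_FILES=5`, and `SPARK_PID_DIR=/tmp`.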

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Generated the HTML docs.

[Image: Screenshot 2024-01-17 at 10 38 09 AM]

Was this patch authored or co-authored using generative AI tooling?

No.

github-actions bot added the DOCS label on Jan 17, 2024
@HyukjinKwon (Member)

Merged to master.

@dongjoon-hyun (Member, Author)

Thank you, @HyukjinKwon !

szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Feb 7, 2024
### What changes were proposed in this pull request?

This PR documents the following three environment variables for `Spark Standalone` clusters.
- SPARK_LOG_DIR
- SPARK_LOG_MAX_FILES
- SPARK_PID_DIR

### Why are the changes needed?

Until now, users have had to read the `spark-env.sh.template` or `spark-daemon.sh` files to find the descriptions and default values of these variables. They should be documented officially.

https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/conf/spark-env.sh.template#L67-L69

https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/sbin/spark-daemon.sh#L25-L28

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Generated the HTML docs.

![Screenshot 2024-01-17 at 10 38 09 AM](https://github.com/apache/spark/assets/9700541/7b6106dc-5105-4653-94aa-0fc05af5a762)

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#44774 from dongjoon-hyun/SPARK-46749.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>