[SPARK-42731][CONNECT][DOCS] Document Spark Connect configurations #40416

Closed
HyukjinKwon wants to merge 1 commit into apache:master from HyukjinKwon:SPARK-42731

Member

@HyukjinKwon HyukjinKwon commented Mar 14, 2023

What changes were proposed in this pull request?

This PR proposes to document the configuration of Spark Connect defined in https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala

Why are the changes needed?

To let users know which configurations are supported for Spark Connect.

Does this PR introduce any user-facing change?

Yes, it documents the configurations for Spark Connect.

How was this patch tested?

Linters in CI should verify this change.

Also manually built the docs as below:

Screen Shot 2023-03-14 at 8 24 51 PM

@github-actions github-actions bot added the DOCS label Mar 14, 2023
<td>
4m
</td>
<td>When using Apache Arrow, limit the maximum size of one arrow batch that can be sent from server side to client side. Currently, we conservatively use 70% of it because the size is not accurate but estimated.</td>
Contributor

limit the maximum size of one arrow batch that can be sent from server side to client side. Currently, we conservatively use 70% of it because the size is not accurate but estimated.

->
limit the maximum size of each arrow batch sent from the server to the client. Currently, we are using a conservative estimate of 70% of the maximum size, since the actual size cannot be accurately determined.

Member Author

This actually matches the docs in https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala. I think we had better do that separately; otherwise it will require a full build because of the code change.

Please feel free to make a minor PR if you're interested in this :-).
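
For readers following the thread above, here is a minimal sketch of the arithmetic behind the "70%" wording, applied to the documented default of `4m`. It is not Spark's implementation: the helper names `parse_size` and `effective_batch_limit` are hypothetical, and the sketch assumes that `4m` means 4 MiB (binary units) and that the 70% factor is applied by simple integer scaling.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of Spark.

# Assumption: size suffixes are binary units (k = KiB, m = MiB, g = GiB).
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(value: str) -> int:
    """Convert a size string such as '4m' into a number of bytes."""
    text = value.strip().lower()
    if len(text) < 2 or text[-1] not in _UNITS:
        raise ValueError(f"expected a size with a k/m/g suffix, got {value!r}")
    return int(text[:-1]) * _UNITS[text[-1]]


def effective_batch_limit(max_batch_size: str, safety_percent: int = 70) -> int:
    """Return the conservative byte budget: safety_percent of the configured maximum."""
    return parse_size(max_batch_size) * safety_percent // 100


print(effective_batch_limit("4m"))  # prints 2936012
```

Under these assumptions, a configured maximum of `4m` leaves roughly 2.8 MiB of estimated Arrow data per batch, which is the headroom the description refers to.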

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

@HyukjinKwon
Member Author

Merged to master and branch-3.4.

HyukjinKwon added a commit that referenced this pull request Mar 14, 2023
### What changes were proposed in this pull request?

This PR proposes to document the configuration of Spark Connect defined in https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala

### Why are the changes needed?

To let users know which configurations are supported for Spark Connect.

### Does this PR introduce _any_ user-facing change?

Yes, it documents the configurations for Spark Connect.

### How was this patch tested?

Linters in CI should verify this change.

Also manually built the docs as below:

![Screen Shot 2023-03-14 at 8 24 51 PM](https://user-images.githubusercontent.com/6477701/224986645-3e3abfe3-4f6b-4810-8887-24cf24532f5e.png)

Closes #40416 from HyukjinKwon/SPARK-42731.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit e986fb0)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
@HyukjinKwon HyukjinKwon deleted the SPARK-42731 branch January 15, 2024 00:52