Skip to content

Comments

[MINOR][CONNECT][DOC] Add information on how to regenerate proto for python client#41015

Closed
juliuszsompolski wants to merge 3 commits intoapache:masterfrom
juliuszsompolski:proto-regen-doc
Closed

[MINOR][CONNECT][DOC] Add information on how to regenerate proto for python client#41015
juliuszsompolski wants to merge 3 commits intoapache:masterfrom
juliuszsompolski:proto-regen-doc

Conversation

@juliuszsompolski
Copy link
Contributor

What changes were proposed in this pull request?

Figuring out how to generate connect grpc proto in python was surprisingly hard to figure out for me (not knowing much about python development though), so adding it to the README.

Why are the changes needed?

Improving internal documentation.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Not applicable.

@@ -21,6 +21,17 @@ user experience across all languages. Please follow the below guidelines:

Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.

Copy link
Contributor

@nija-at nija-at May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to add a README in this folder and link to this README?

[non blocking]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that folder is rst based and published to the web documentation and not markdown based, so I think README.md doesn't fit there.
But I was surprised that the documentation there doesn't mention getting dependencies from dev/requirements.txt. But I'm not really familiar with python dev and these docs, maybe someone more familiar should have a better idea what would be the best place for it? @HyukjinKwon ?

juliuszsompolski and others added 2 commits May 2, 2023 16:54
Co-authored-by: Niranjan Jayakar <nija@databricks.com>
Generate the Python files by running:

```
dev/connect-gen-protos.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know is it specific to me but this command fails on my site with error:

+ buf generate --debug -vvv
...
Failure: 403 Forbidden

A workaround is to fork https://github.com/HyukjinKwon/my-github-actions and run the workflow:
Screen Shot 2023-03-24 at 8 38 18 PM
and download the results:
Screen Shot 2023-03-24 at 8 38 45 PM

@MaxGekk
Copy link
Member

MaxGekk commented May 3, 2023

+1, LGTM. Merging to master.
Thank you, @juliuszsompolski and @grundprinzip @nija-at for review.

@MaxGekk MaxGekk changed the title [CONNECT][DOC] Add information on how to regenerate proto for python client [MINOR][CONNECT][DOC] Add information on how to regenerate proto for python client May 3, 2023
@MaxGekk MaxGekk closed this in c609ae7 May 3, 2023
Install [buf](https://github.com/bufbuild/buf)

```
brew install bufbuild/buf/buf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generate the Python files by running:

```
dev/connect-gen-protos.sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@juliuszsompolski was not able to find them. So I suspect they are not in a very discoverable spot.

Dedupe in a follow up sounds good. We can also change these to be links to the correct place.

To generate the Python client code from the proto files:

First, make sure to have a Python environment with the installed dependencies.
Specifically, install `black` and dependencies from the "Spark Connect python proto generation plugin (optional)" section.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no biggie but I suspect we won't necessarily need to mention this

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's clean up and dedup a little bit next time when we touch here.

LuciferYang pushed a commit to LuciferYang/spark that referenced this pull request May 10, 2023
…python client

### What changes were proposed in this pull request?

Figuring out how to generate connect grpc proto in python was surprisingly hard to figure out for me (not knowing much about python development though), so adding it to the README.

### Why are the changes needed?

Improving internal documentation.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Not applicable.

Closes apache#41015 from juliuszsompolski/proto-regen-doc.

Authored-by: Juliusz Sompolski <julek@databricks.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants