[SPARK-56135][DOCS] Add option to use venv in PySpark docs #54943

Closed
Yicong-Huang wants to merge 3 commits into apache:master from Yicong-Huang:SPARK-56135/add-venv-docs

@Yicong-Huang
Contributor

What changes were proposed in this pull request?

Add a venv section in the PySpark documentation as an alternative to Conda for environment management:

  • python/docs/source/getting_started/install.rst - new "Using venv" section alongside "Using Conda"
  • python/docs/source/development/contributing.rst - new "venv" section alongside Conda and pip under Environment Setup

Why are the changes needed?

The PySpark docs currently only mention pip (without isolation) and Conda for environment setup. Python's built-in venv module is a lightweight alternative that doesn't require installing Conda. Adding this option gives users more choices for setting up their development environment.
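
As a rough illustration of the isolation that venv provides (a sketch for POSIX shells, not the exact text added to the docs; the `.venv` directory name is only a convention), the interpreter can confirm that it is running inside the environment once it has been activated:

```shell
# Create a virtual environment in ./.venv using only the standard library.
python3 -m venv .venv

# Activate it for the current shell session (POSIX shells).
. .venv/bin/activate

# Inside a virtual environment, sys.prefix differs from sys.base_prefix.
python -c 'import sys; print(sys.prefix != sys.base_prefix)'
```

With the environment active, `pip install pyspark` installs PySpark into `.venv` rather than into the system interpreter, and `deactivate` returns the shell to its previous state.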

Does this PR introduce any user-facing change?

Documentation only.

How was this patch tested?

N/A (documentation change).

Was this patch authored or co-authored using generative AI tooling?

Yes.


# Python 3.10+ is required
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
Contributor

Can we not mention Windows?
We don't even run tests on Windows.
This comment suggests PySpark is well supported on Windows, which I don't think is true.

@zhengruifeng
Contributor

merged to master

terana pushed a commit to terana/spark that referenced this pull request Mar 23, 2026
### What changes were proposed in this pull request?

Add a `venv` section in the PySpark documentation as an alternative to Conda for environment management:

- `python/docs/source/getting_started/install.rst` - new "Using venv" section alongside "Using Conda"
- `python/docs/source/development/contributing.rst` - new "venv" section alongside Conda and pip under Environment Setup

### Why are the changes needed?

The PySpark docs currently only mention pip (without isolation) and Conda for environment setup. Python's built-in `venv` module is a lightweight alternative that doesn't require installing Conda. Adding this option gives users more choices for setting up their development environment.

### Does this PR introduce _any_ user-facing change?

Documentation only.

### How was this patch tested?

N/A (documentation change).

### Was this patch authored or co-authored using generative AI tooling?

Yes.

Closes apache#54943 from Yicong-Huang/SPARK-56135/add-venv-docs.

Authored-by: Yicong Huang <17627829+Yicong-Huang@users.noreply.github.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>