[SPARK-46131][PYTHON][INFRA] Install torchvision for Python 3.12 build
### What changes were proposed in this pull request?

This PR adds `torchvision` into the testing image for Python 3.12.

### Why are the changes needed?

To let the Python 3.12 build proceed so that the remaining failures become visible. The build currently fails as shown below: https://github.com/apache/spark/actions/runs/7006848931/job/19059702169#step:12:4236

```
======================================================================
ERROR [0.001s]: test_end_to_end_run_distributedly (pyspark.ml.tests.connect.test_parity_torch_distributor.TorchDistributorDistributedUnitTestsOnConnect.test_end_to_end_run_distributedly)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/ml/torch/tests/test_distributor.py", line 495, in test_end_to_end_run_distributedly
    train_fn = create_training_function(self.mnist_dir_path)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/spark/spark/python/pyspark/ml/torch/tests/test_distributor.py", line 60, in create_training_function
    from torchvision import transforms, datasets
ModuleNotFoundError: No module named 'torchvision'

======================================================================
ERROR [0.001s]: test_end_to_end_run_locally (pyspark.ml.tests.connect.test_parity_torch_distributor.TorchDistributorLocalUnitTestsIIOnConnect.test_end_to_end_run_locally)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/ml/torch/tests/test_distributor.py", line 402, in test_end_to_end_run_locally
    train_fn = create_training_function(self.mnist_dir_path)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/spark/spark/python/pyspark/ml/torch/tests/test_distributor.py", line 60, in create_training_function
    from torchvision import transforms, datasets
ModuleNotFoundError: No module named 'torchvision'

======================================================================
ERROR [0.001s]: test_end_to_end_run_locally (pyspark.ml.tests.connect.test_parity_torch_distributor.TorchDistributorLocalUnitTestsOnConnect.test_end_to_end_run_locally)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/ml/torch/tests/test_distributor.py", line 402, in test_end_to_end_run_locally
    train_fn = create_training_function(self.mnist_dir_path)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/spark/spark/python/pyspark/ml/torch/tests/test_distributor.py", line 60, in create_training_function
    from torchvision import transforms, datasets
ModuleNotFoundError: No module named 'torchvision'

----------------------------------------------------------------------
Ran 23 tests in 50.860s
```

This PR fixes these failures by installing `torchvision` in the image.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually tested.
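
The commit does not describe how the manual test was run. As one possible sanity check (an assumption, not the author's procedure), a short script run with `python3.12` inside the image could confirm that the modules the failing tests import can be located. The helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be located.

    Uses importlib.util.find_spec, which looks a module up without
    executing it. Only top-level names are expected here.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical usage inside the Python 3.12 test image:
#   missing_modules(["torch", "torchvision", "torcheval"])
# An empty list would mean all three packages can be found.
```

This only shows that the packages are discoverable on the import path. It does not replace running the `test_distributor.py` tests themselves.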

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44045 from HyukjinKwon/SPARK-46131.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
HyukjinKwon committed Nov 28, 2023
1 parent 158f876 commit 984e797
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions dev/infra/Dockerfile
@@ -138,4 +138,5 @@ RUN python3.12 -m pip install numpy 'pyarrow>=14.0.0' 'six==1.16.0' 'pandas<=2.1
RUN python3.12 -m pip install 'grpcio==1.59.3' 'grpcio-status==1.59.3' 'protobuf==4.25.1' 'googleapis-common-protos==1.56.4'
# TODO(SPARK-46078) Use official one instead of nightly build when it's ready
RUN python3.12 -m pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
RUN python3.12 -m pip install torchvision --index-url https://download.pytorch.org/whl/cpu
RUN python3.12 -m pip install torcheval
