[SPARK-46745][INFRA] Purge pip cache in dockerfile
### What changes were proposed in this pull request?
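Purge the pip cache in `dev/infra/Dockerfile` after the package installs for each Python version (3.9, 3.10, 3.11, 3.12), and remove the step in `build_and_test.yml` that uninstalled the ML testing libraries to free disk space.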

### Why are the changes needed?
To save roughly 4-5 GB of disk space (`df -h` reports usage on `/` dropping from 70G to 65G):

before

https://github.com/zhengruifeng/spark/actions/runs/7541725028/job/20530432798

```
#45 [39/39] RUN df -h
#45 0.090 Filesystem      Size  Used Avail Use% Mounted on
#45 0.090 overlay          84G   70G   15G  83% /
#45 0.090 tmpfs            64M     0   64M   0% /dev
#45 0.090 shm              64M     0   64M   0% /dev/shm
#45 0.090 /dev/root        84G   70G   15G  83% /etc/resolv.conf
#45 0.090 tmpfs           7.9G     0  7.9G   0% /proc/acpi
#45 0.090 tmpfs           7.9G     0  7.9G   0% /sys/firmware
#45 0.090 tmpfs           7.9G     0  7.9G   0% /proc/scsi
#45 DONE 2.0s
```

after

https://github.com/zhengruifeng/spark/actions/runs/7549204209/job/20552796796

```
#48 [42/43] RUN python3.12 -m pip cache purge
#48 0.670 Files removed: 392
#48 DONE 0.7s

#49 [43/43] RUN df -h
#49 0.075 Filesystem      Size  Used Avail Use% Mounted on
#49 0.075 overlay          84G   65G   19G  79% /
#49 0.075 tmpfs            64M     0   64M   0% /dev
#49 0.075 shm              64M     0   64M   0% /dev/shm
#49 0.075 /dev/root        84G   65G   19G  79% /etc/resolv.conf
#49 0.075 tmpfs           7.9G     0  7.9G   0% /proc/acpi
#49 0.075 tmpfs           7.9G     0  7.9G   0% /sys/firmware
#49 0.075 tmpfs           7.9G     0  7.9G   0% /proc/scsi
```
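
The saving can be read off the `Used` column of the `overlay` rows in the two logs above. As a minimal sketch of that arithmetic (the helper name `used_g` is hypothetical, and it assumes the column is printed in whole `G` units, as it is here):

```shell
# Hypothetical helper: print the "Used" column of one `df -h` data line,
# with the trailing "G" stripped. Assumes the value is a whole number of G.
used_g() {
  printf '%s\n' "$1" | awk '{ sub(/G$/, "", $3); print $3 }'
}

# The overlay rows from the "before" and "after" logs, without the build-step prefix.
before='overlay          84G   70G   15G  83% /'
after='overlay          84G   65G   19G  79% /'

saved=$(( $(used_g "$before") - $(used_g "$after") ))
echo "saved ${saved}G"   # prints "saved 5G"
```

Because `df -h` rounds to whole units at this size, the result is only accurate to about 1G, which is consistent with the "4-5" range quoted above.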
### Does this PR introduce _any_ user-facing change?
No, infra-only.

### How was this patch tested?
CI.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #44768 from zhengruifeng/infra_docker_cleanup.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
zhengruifeng authored and dongjoon-hyun committed Jan 18, 2024
1 parent 44d2c86 commit da0c31c
Showing 2 changed files with 7 additions and 5 deletions.
4 changes: 0 additions & 4 deletions .github/workflows/build_and_test.yml
@@ -417,10 +417,6 @@ jobs:
 - name: Free up disk space
 shell: 'script -q -e -c "bash {0}"'
 run: |
-if [[ "$MODULES_TO_TEST" != *"pyspark-ml"* ]] && [[ "$BRANCH" != "branch-3.5" ]]; then
-# uninstall libraries dedicated for ML testing
-python3.9 -m pip uninstall -y torch torchvision torcheval torchtnt tensorboard mlflow deepspeed
-fi
 if [ -f ./dev/free_disk_space_container ]; then
 ./dev/free_disk_space_container
 fi
8 changes: 7 additions & 1 deletion dev/infra/Dockerfile
@@ -19,7 +19,7 @@
 # See also in https://hub.docker.com/_/ubuntu
 FROM ubuntu:focal-20221019

-ENV FULL_REFRESH_DATE 20231117
+ENV FULL_REFRESH_DATE 20240117

 ENV DEBIAN_FRONTEND noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN true
@@ -104,6 +104,7 @@ RUN python3.9 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP
 # Add torch as a testing dependency for TorchDistributor and DeepspeedTorchDistributor
 RUN python3.9 -m pip install 'torch<=2.0.1' torchvision --index-url https://download.pytorch.org/whl/cpu
 RUN python3.9 -m pip install deepspeed torcheval
+RUN python3.9 -m pip cache purge

 # Install Python 3.10 at the last stage to avoid breaking Python 3.9
 RUN add-apt-repository ppa:deadsnakes/ppa
@@ -114,6 +115,7 @@ RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
 RUN python3.10 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP_PKGS
 RUN python3.10 -m pip install 'torch<=2.0.1' torchvision --index-url https://download.pytorch.org/whl/cpu
 RUN python3.10 -m pip install deepspeed torcheval
+RUN python3.10 -m pip cache purge

 # Install Python 3.11 at the last stage to avoid breaking the existing Python installations
 RUN add-apt-repository ppa:deadsnakes/ppa
@@ -124,6 +126,7 @@ RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.11
 RUN python3.11 -m pip install $BASIC_PIP_PKGS unittest-xml-reporting $CONNECT_PIP_PKGS
 RUN python3.11 -m pip install 'torch<=2.0.1' torchvision --index-url https://download.pytorch.org/whl/cpu
 RUN python3.11 -m pip install deepspeed torcheval
+RUN python3.11 -m pip cache purge

 # Install Python 3.12 at the last stage to avoid breaking the existing Python installations
 RUN add-apt-repository ppa:deadsnakes/ppa
@@ -137,3 +140,6 @@ RUN python3.12 -m pip install $BASIC_PIP_PKGS $CONNECT_PIP_PKGS lxml
 RUN python3.12 -m pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
 RUN python3.12 -m pip install torchvision --index-url https://download.pytorch.org/whl/cpu
 RUN python3.12 -m pip install torcheval
+RUN python3.12 -m pip cache purge
+
+RUN df -h
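
The Dockerfile change repeats the same purge step once per interpreter. Purely as an illustration of that pattern, and not what this commit does, the step could be expressed as a small shell helper. `purge_pip_caches` is a hypothetical name, and the `|| true` guard is a defensive assumption so that a failed purge does not abort the surrounding script.

```shell
# Hypothetical helper: purge the pip cache for each interpreter passed as an argument.
purge_pip_caches() {
  for py in "$@"; do
    if command -v "$py" >/dev/null 2>&1; then
      # `pip cache purge` removes all items from pip's cache.
      "$py" -m pip cache purge || true
    else
      echo "skip $py: not installed"
    fi
  done
}

# Example call mirroring the interpreters used in the Dockerfile (left commented out
# so that sourcing this snippet does not touch a local cache):
# purge_pip_caches python3.9 python3.10 python3.11 python3.12
```

The commit keeps one explicit `RUN ... pip cache purge` line directly after each interpreter's installs, which keeps every purge next to the installs it cleans up after.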
