[Working again] New CI #2173

Merged: 31 commits, Nov 20, 2023
Commits
5a75304 Try merge tests (muellerzr, Oct 26, 2023)
0eb6fae Fix (muellerzr, Oct 26, 2023)
47ce4a6 Checkout branch (muellerzr, Oct 26, 2023)
95e26e9 Fix pip install (muellerzr, Oct 26, 2023)
64b0595 rebase (muellerzr, Nov 1, 2023)
c335510 Colons (muellerzr, Nov 1, 2023)
0ea347c right one (muellerzr, Nov 1, 2023)
d0c90ac use master (muellerzr, Nov 1, 2023)
e090dcc Rm (muellerzr, Nov 1, 2023)
1fe4f5e Add needs (muellerzr, Nov 1, 2023)
de8fc59 Better clean (muellerzr, Nov 1, 2023)
be067d3 always (muellerzr, Nov 1, 2023)
bb1a57f Forgot other (muellerzr, Nov 1, 2023)
4af61d0 test on AWS (glegendre01, Nov 2, 2023)
a381293 update all labels (glegendre01, Nov 2, 2023)
6511794 fix multi-gpu working directory (glegendre01, Nov 2, 2023)
8e0c4f9 limit to 2 GPU (glegendre01, Nov 2, 2023)
938b550 force run on kube (glegendre01, Nov 2, 2023)
34b0fa3 Merge branch 'main' into new-CI (glegendre01, Nov 6, 2023)
114bcab move build docker image to new ci (glegendre01, Nov 6, 2023)
09356c5 test build on CPU instance (glegendre01, Nov 6, 2023)
20f7803 move build docker image release to new ci (glegendre01, Nov 6, 2023)
230b5a6 move scheduled slow tests to new ci (glegendre01, Nov 14, 2023)
aef0413 move integration test to new ci (glegendre01, Nov 14, 2023)
b5071ef Merge branch 'main' into new-CI (muellerzr, Nov 17, 2023)
56d1206 Comments (muellerzr, Nov 20, 2023)
0653214 Right CPU tags (muellerzr, Nov 20, 2023)
2c4ac5d Right machines (muellerzr, Nov 20, 2023)
49fbe33 PR comments (muellerzr, Nov 20, 2023)
dad64a7 Fix issues (muellerzr, Nov 20, 2023)
ebfa910 Some trailers (muellerzr, Nov 20, 2023)
4 changes: 2 additions & 2 deletions .github/workflows/build-docker-images-release.yml
@@ -21,7 +21,7 @@ jobs:

   version-cpu:
     name: "Latest Accelerate CPU [version]"
-    runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
+    runs-on: [self-hosted, intel-cpu, 8-cpu, ci]
     needs: get-version
     steps:
       - name: Set up Docker Buildx
@@ -41,7 +41,7 @@ jobs:

   version-cuda:
     name: "Latest Accelerate GPU [version]"
-    runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
+    runs-on: [self-hosted, single-gpu, nvidia-gpu, t4, ci]
     needs: get-version
     steps:
       - name: Set up Docker Buildx
15 changes: 2 additions & 13 deletions .github/workflows/build_docker_images.yml
@@ -11,19 +11,9 @@ concurrency:
   cancel-in-progress: false

 jobs:
-  clean-storage:
-    name: "Clean docker image storage"
-    runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
-    steps:
-      - name: Clean storage
-        run: |
-          docker image prune --all -f --filter "until=48h"
-          docker system prune --all -f --filter "until=48h"
-
   latest-cpu:
     name: "Latest Accelerate CPU [dev]"
-    runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
-    needs: clean-storage
+    runs-on: [self-hosted, intel-cpu, 8-cpu, ci]
     steps:
       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v2
@@ -41,8 +31,7 @@ jobs:

   latest-cuda:
     name: "Latest Accelerate GPU [dev]"
-    runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
-    needs: clean-storage
+    runs-on: [self-hosted, nvidia-gpu, t4, ci]
     steps:
       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v2
23 changes: 15 additions & 8 deletions .github/workflows/nightly.yml
@@ -13,7 +13,7 @@ env:

 jobs:
   run_all_tests_single_gpu:
-    runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
+    runs-on: [self-hosted, single-gpu, nvidia-gpu, t4, ci]
     env:
       CUDA_VISIBLE_DEVICES: "0"
       TEST_TYPE: "single_gpu"
@@ -22,37 +22,40 @@ jobs:
       options: --gpus all --shm-size "16gb"
     defaults:
       run:
-        working-directory: accelerate/
         shell: bash
     steps:
       - name: Update clone & pip install
         run: |
           source activate accelerate
-          git config --global --add safe.directory '*'
-          git fetch && git checkout ${{ github.sha }}
+          git clone https://github.com/huggingface/accelerate;
+          cd accelerate;
+          git checkout ${{ github.sha }};
           pip install -e . --no-deps
           pip install pytest-reportlog tabulate

       - name: Run test on GPUs
+        working-directory: accelerate
         run: |
           source activate accelerate
           make test

       - name: Run examples on GPUs
+        working-directory: accelerate
         if: always()
         run: |
           source activate accelerate
           pip uninstall comet_ml -y
           make test_examples

       - name: Generate Report
+        working-directory: accelerate
         if: always()
         run: |
           pip install slack_sdk tabulate
           python utils/log_reports.py >> $GITHUB_STEP_SUMMARY

   run_all_tests_multi_gpu:
-    runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
+    runs-on: [self-hosted, multi-gpu, nvidia-gpu, t4, ci]
     env:
       CUDA_VISIBLE_DEVICES: "0,1"
       TEST_TYPE: "multi_gpu"
@@ -61,38 +64,42 @@ jobs:
       options: --gpus all --shm-size "16gb"
     defaults:
       run:
-        working-directory: accelerate/
         shell: bash
     steps:
       - name: Update clone
         run: |
           source activate accelerate
-          git config --global --add safe.directory '*'
-          git fetch && git checkout ${{ github.sha }}
+          git clone https://github.com/huggingface/accelerate;
+          cd accelerate;
+          git checkout ${{ github.sha }};
           pip install -e . --no-deps
           pip install pytest-reportlog tabulate

       - name: Run core and big modeling tests on GPUs
+        working-directory: accelerate
         run: |
           source activate accelerate
           make test_core
           make test_big_modeling
           make test_cli

       - name: Run Integration tests on GPUs
+        working-directory: accelerate
         if: always()
         run: |
           source activate accelerate
           make test_integrations

       - name: Run examples on GPUs
+        working-directory: accelerate
         if: always()
         run: |
           source activate accelerate
           pip uninstall comet_ml -y
           make test_examples

       - name: Generate Report
+        working-directory: accelerate
         if: always()
         run: |
           pip install slack_sdk tabulate
59 changes: 34 additions & 25 deletions .github/workflows/run_merge_tests.yml
@@ -10,80 +10,89 @@ env:

jobs:
run_all_tests_single_gpu:
runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
runs-on: [self-hosted, single-gpu, nvidia-gpu, t4, ci]
env:
CUDA_VISIBLE_DEVICES: "0"
container:
image: huggingface/accelerate-gpu:latest
options: --gpus all --shm-size "16gb"
defaults:
run:
working-directory: accelerate/
shell: bash
steps:
- name: Update clone & pip install
- name: Install accelerate
run: |
source activate accelerate
git config --global --add safe.directory '*'
git fetch && git checkout ${{ github.sha }}
pip install -e .[testing,test_trackers] -U
pip install pytest-reportlog tabulate
source activate accelerate;
git clone https://github.com/huggingface/accelerate;
cd accelerate;
git checkout ${{ github.sha }};
pip install -e .[testing,test_trackers] -U;
pip install pytest-reportlog tabulate ;

- name: Run CLI tests
- name: Run CLI tests (use make cli)
working-directory: accelerate
run: |
source activate accelerate
source activate accelerate;
make test_cli

- name: Run test on GPUs
working-directory: accelerate
if: always()
run: |
source activate accelerate
source activate accelerate;
make test
- name: Run examples on GPUs
working-directory: accelerate
if: always()
run: |
source activate accelerate
pip uninstall comet_ml -y
source activate accelerate;
pip uninstall comet_ml -y;
make test_examples

- name: Generate Report
working-directory: accelerate
if: always()
run: |
pip install tabulate
pip install tabulate;
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY

run_all_tests_multi_gpu:
runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
runs-on: [self-hosted, multi-gpu, nvidia-gpu, t4, ci]
env:
CUDA_VISIBLE_DEVICES: 0,1
container:
image: huggingface/accelerate-gpu:latest
options: --gpus all --shm-size "16gb"
defaults:
run:
working-directory: accelerate/
shell: bash
steps:
- name: Update clone
run: |
source activate accelerate
git config --global --add safe.directory '*'
git fetch && git checkout ${{ github.sha }}
pip install -e .[testing,test_trackers] -U
source activate accelerate;
git clone https://github.com/huggingface/accelerate;
cd accelerate;
git checkout ${{ github.sha }};
pip install -e .[testing,test_trackers] -U;
pip install pytest-reportlog tabulate

- name: Run test on GPUs
working-directory: accelerate
run: |
source activate accelerate
source activate accelerate;
make test

- name: Run examples on GPUs
working-directory: accelerate
if: always()
run: |
source activate accelerate
pip uninstall comet_ml -y
source activate accelerate;
pip uninstall comet_ml -y;
make test_examples

- name: Generate Report
working-directory: accelerate
if: always()
run: |
pip install tabulate
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
source activate accelerate;
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
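
Note: the change repeated across these workflows is to drop the job-level `working-directory: accelerate/` default, clone the repository inside the first step, and set `working-directory: accelerate` on each later step. A minimal sketch of that shape follows. The job name `example_job` and both step names are hypothetical and not taken from the PR; the runner labels, image, and commands are copied from the diffs. It assumes the container image provides a conda environment named `accelerate`, as the `source activate accelerate` lines suggest.

```yaml
jobs:
  example_job:  # hypothetical job name, for illustration only
    runs-on: [self-hosted, single-gpu, nvidia-gpu, t4, ci]
    container:
      image: huggingface/accelerate-gpu:latest
      options: --gpus all --shm-size "16gb"
    defaults:
      run:
        shell: bash  # no job-level working-directory here
    steps:
      - name: Clone and install  # hypothetical step name
        run: |
          source activate accelerate
          git clone https://github.com/huggingface/accelerate
          cd accelerate
          git checkout ${{ github.sha }}
          pip install -e .[testing]

      - name: Run tests  # hypothetical step name
        working-directory: accelerate  # directory created by the previous step
        run: |
          source activate accelerate
          make test
```

Setting `working-directory` per step rather than under `defaults.run` is presumably needed because a job-level default would also apply to the first step, before the `accelerate` directory has been cloned.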
44 changes: 22 additions & 22 deletions .github/workflows/self_hosted_integration_tests.yml
@@ -25,7 +25,7 @@ jobs:
     container:
       image: huggingface/accelerate-gpu:latest
       options: --gpus all --shm-size "16gb"
-    runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
+    runs-on: [self-hosted, multi-gpu, nvidia-gpu, t4, ci]
     strategy:
       fail-fast: false
       matrix:
@@ -34,22 +34,22 @@ jobs:
           "0,1"
         ]
     steps:
-      - name: Update accelerate clone and pip install
-        working-directory: accelerate/
-        run:
+      - name: Install transformers
+        run: |
           source activate accelerate;
-          git config --global --add safe.directory '*';
-          git checkout main && git fetch && git checkout ${{ github.sha }};
-          pip install -e .;
+          git clone https://github.com/huggingface/transformers --depth 1;
+          cd transformers;
+          pip install .[torch,deepspeed-testing];
+          cd ..;

-      - name: Update transformers clone & pip install
-        working-directory: transformers/
+      - name: Install accelerate
         run: |
-          source activate accelerate
-          git config --global --add safe.directory '*'
-          git checkout main && git pull
-          pip install .[torch,deepspeed-testing]
-          pip uninstall comet_ml wandb -y
+          source activate accelerate;
+          git clone https://github.com/huggingface/accelerate;
+          cd accelerate;
+          git checkout ${{ github.sha }} ;
+          pip install -e .[testing];
+          cd ..;

       - name: Show installed libraries
         run: |
@@ -89,20 +89,20 @@ jobs:
     container:
       image: huggingface/accelerate-gpu:latest
       options: --gpus all --shm-size "16gb"
-    runs-on: [self-hosted, docker-gpu, multi-gpu, gcp]
+    runs-on: [self-hosted, multi-gpu, nvidia-gpu, t4, ci]
     strategy:
       fail-fast: false
     steps:
-      - name: Update accelerate clone and pip install
-        working-directory: accelerate/
+      - name: Install accelerate
         run:
           source activate accelerate;
-          git config --global --add safe.directory '*';
-          git checkout main && git fetch && git checkout ${{ github.sha }};
-          pip install -e .;
+          git clone https://github.com/huggingface/accelerate;
+          cd accelerate;
+          git checkout ${{ github.sha }};
+          pip install -e .[testing];
+          cd ..

-      - name: Update skorch clone & pip install
-        working-directory: skorch/
+      - name: Install skorch
         run: |
           source activate accelerate
           git config --global --add safe.directory '*'