Merged
174 commits
682b487
test
philschmid Feb 5, 2024
43dd281
to 4.36
philschmid Feb 5, 2024
6584b73
build image
philschmid Feb 5, 2024
1cda02f
fxi
philschmid Feb 5, 2024
b262224
Move GPU to EKS
glegendre01 Feb 13, 2024
8271cc7
cuda 12, remove conda
Feb 13, 2024
3887e21
Merge remote-tracking branch 'origin/move-gpu-to-eks' into cuda12
rafaelpierrehf Feb 13, 2024
f514a5e
integ test 2.0
rafaelpierrehf Feb 14, 2024
b164b31
2.0
rafaelpierrehf Feb 14, 2024
70d6003
fix
rafaelpierrehf Feb 14, 2024
b1cc6a2
indent
rafaelpierrehf Feb 14, 2024
c728190
docker buildx
rafaelpierrehf Feb 14, 2024
b5ba045
depends on
rafaelpierrehf Feb 14, 2024
10b62b7
name
rafaelpierrehf Feb 14, 2024
383dab3
indent
rafaelpierrehf Feb 14, 2024
0d162de
name
rafaelpierrehf Feb 14, 2024
b22370b
trigger
rafaelpierrehf Feb 14, 2024
db90673
colon
rafaelpierrehf Feb 14, 2024
90875ba
v4
rafaelpierrehf Feb 14, 2024
9066cc8
download
rafaelpierrehf Feb 14, 2024
6fa3cb0
ls
rafaelpierrehf Feb 14, 2024
667299a
indent
rafaelpierrehf Feb 14, 2024
e64a76a
cache
rafaelpierrehf Feb 14, 2024
6b6b33c
v4
rafaelpierrehf Feb 14, 2024
9c223fe
revert
rafaelpierrehf Feb 14, 2024
6036a44
path
rafaelpierrehf Feb 14, 2024
3731fd4
slash
rafaelpierrehf Feb 14, 2024
ecde720
tar
rafaelpierrehf Feb 14, 2024
85a2996
path
rafaelpierrehf Feb 14, 2024
cf2b0ae
reduce image size
rafaelpierrehf Feb 15, 2024
356c813
test_integ_new
rafaelpierrehf Feb 15, 2024
c2f6618
tenacity
rafaelpierrehf Feb 15, 2024
79ee67e
retry if
rafaelpierrehf Feb 15, 2024
12af852
retry config
rafaelpierrehf Feb 15, 2024
8697723
uv & venv
Feb 16, 2024
3587eda
fast unit tests passing
Feb 17, 2024
e0f5ea2
pass short unit
rafaelpierrehf Feb 17, 2024
cd50871
tensorflow
rafaelpierrehf Feb 17, 2024
600edb0
tox
rafaelpierrehf Feb 17, 2024
7e57085
cpu images
rafaelpierrehf Feb 19, 2024
16dc0f4
conversational passing integration
rafaelpierrehf Feb 19, 2024
286a877
tox multiprocess
rafaelpierrehf Feb 19, 2024
54d110b
remove tf from integration test in tox.ini
rafaelpierrehf Feb 19, 2024
749093b
local container tests
rafaelpierrehf Feb 20, 2024
3daec64
torch integ local passing
rafaelpierrehf Feb 20, 2024
de58ba5
tf local pass
rafaelpierrehf Feb 20, 2024
bbdd3a0
tf remote pass
rafaelpierrehf Feb 20, 2024
dde132e
tox
rafaelpierrehf Feb 20, 2024
daeae06
require_tf
rafaelpierrehf Feb 20, 2024
3c17452
workflow
rafaelpierrehf Feb 20, 2024
e01ea5c
gpu integ
rafaelpierrehf Feb 20, 2024
09adb51
unit
rafaelpierrehf Feb 20, 2024
6e11450
log level
rafaelpierrehf Feb 20, 2024
591ae0a
verbose
rafaelpierrehf Feb 20, 2024
73ae3fe
ffmpeg
rafaelpierrehf Feb 20, 2024
fa24df0
update
rafaelpierrehf Feb 20, 2024
2e5efd0
level:
rafaelpierrehf Feb 20, 2024
65c6f16
debug
rafaelpierrehf Feb 20, 2024
3014c04
true
rafaelpierrehf Feb 20, 2024
cff49c9
install command
rafaelpierrehf Feb 20, 2024
ca4a964
deps
rafaelpierrehf Feb 20, 2024
70fb401
torch
rafaelpierrehf Feb 20, 2024
1cecf47
runs on
rafaelpierrehf Feb 20, 2024
231efa5
unit
rafaelpierrehf Feb 20, 2024
c094365
workflow
rafaelpierrehf Feb 20, 2024
5f35e46
install
rafaelpierrehf Feb 20, 2024
e2691f5
cuda
rafaelpierrehf Feb 20, 2024
60cc692
cuda & transformers
rafaelpierrehf Feb 20, 2024
5c7d2db
dependencies
rafaelpierrehf Feb 20, 2024
6397b4c
nvidia & cache
rafaelpierrehf Feb 20, 2024
78d79da
cuda drivers
rafaelpierrehf Feb 20, 2024
c80d1aa
whisper tiny pass
rafaelpierrehf Feb 20, 2024
0ed51c1
pass
rafaelpierrehf Feb 21, 2024
73ba40b
pass
rafaelpierrehf Feb 21, 2024
29809bf
tf pass
rafaelpierrehf Feb 21, 2024
e4976a3
run unit tests inside docker
rafaelpierrehf Feb 21, 2024
edf8b98
tox
rafaelpierrehf Feb 21, 2024
45d5154
dockerfile
rafaelpierrehf Feb 21, 2024
ef8d5cf
uv
rafaelpierrehf Feb 21, 2024
8674cf0
docker images
rafaelpierrehf Feb 21, 2024
5988226
cache
rafaelpierrehf Feb 21, 2024
f517ef2
push
rafaelpierrehf Feb 21, 2024
eb2ac68
local registry
rafaelpierrehf Feb 21, 2024
f0aff35
make build
rafaelpierrehf Feb 21, 2024
dc9f4e4
container name
rafaelpierrehf Feb 21, 2024
264d6dd
dry run
rafaelpierrehf Feb 21, 2024
2787c23
remove -it
rafaelpierrehf Feb 21, 2024
7efa257
echo
rafaelpierrehf Feb 21, 2024
478e4a0
integration
rafaelpierrehf Feb 22, 2024
c92d27f
debug
rafaelpierrehf Feb 22, 2024
2bd0851
conversational
rafaelpierrehf Feb 22, 2024
51b2bc6
debug
rafaelpierrehf Feb 22, 2024
9e39ba2
device
rafaelpierrehf Feb 22, 2024
199099c
from_env
rafaelpierrehf Feb 22, 2024
e037c1a
debug level
rafaelpierrehf Feb 22, 2024
4387c80
socket
rafaelpierrehf Feb 22, 2024
d3b66f3
exception when starting container
rafaelpierrehf Feb 22, 2024
c658ad6
error
rafaelpierrehf Feb 22, 2024
f9e7daa
permissions
rafaelpierrehf Feb 22, 2024
bd8302f
order
rafaelpierrehf Feb 22, 2024
d2bc1b5
isolate
rafaelpierrehf Feb 22, 2024
4544b98
dry run
rafaelpierrehf Feb 22, 2024
bc6c5de
dry run
rafaelpierrehf Feb 22, 2024
fbfc7f8
fix dry run params
rafaelpierrehf Feb 22, 2024
0db0518
quotes
rafaelpierrehf Feb 22, 2024
be92d7c
check path
rafaelpierrehf Feb 22, 2024
d58ec57
backslash
rafaelpierrehf Feb 22, 2024
3049fed
change path
rafaelpierrehf Feb 22, 2024
c8945bc
host path
rafaelpierrehf Feb 22, 2024
825b933
look into cache
rafaelpierrehf Feb 23, 2024
741d4d0
path
rafaelpierrehf Feb 23, 2024
c5c4ed5
cache
rafaelpierrehf Feb 23, 2024
d828f61
env vars for cache
rafaelpierrehf Feb 23, 2024
1e592b0
dry run
rafaelpierrehf Feb 23, 2024
c51df3a
add volume
rafaelpierrehf Feb 23, 2024
6857477
path
rafaelpierrehf Feb 23, 2024
0aa64e0
cache dry run
rafaelpierrehf Feb 23, 2024
cce368b
install cli
rafaelpierrehf Feb 23, 2024
aa94250
model dir
rafaelpierrehf Feb 23, 2024
919ac71
config
rafaelpierrehf Feb 24, 2024
80bac49
-n 10
rafaelpierrehf Feb 24, 2024
bf8c429
pass cpu
rafaelpierrehf Feb 24, 2024
c46e85b
dry run local cpu
rafaelpierrehf Feb 24, 2024
b26522a
format
rafaelpierrehf Feb 24, 2024
a707458
review
rafaelpierrehf Feb 26, 2024
7673903
.vscode
rafaelpierrehf Feb 26, 2024
fac74d5
venv
rafaelpierrehf Feb 26, 2024
455c38e
-n 4
rafaelpierrehf Feb 26, 2024
027a781
readme.md
rafaelpierrehf Feb 26, 2024
b3c9905
contributing
rafaelpierrehf Feb 26, 2024
d9455ef
paths ignore
rafaelpierrehf Feb 26, 2024
68268c1
py version
rafaelpierrehf Feb 26, 2024
0149d03
comments
rafaelpierrehf Feb 26, 2024
557bd1b
comments
rafaelpierrehf Feb 26, 2024
6e34590
dialog model
rafaelpierrehf Feb 26, 2024
e8e896f
dockerfile
rafaelpierrehf Feb 26, 2024
2afeaad
dockerfile
rafaelpierrehf Feb 26, 2024
073f358
tox
rafaelpierrehf Feb 26, 2024
a77ed50
unit tests
rafaelpierrehf Feb 27, 2024
ab1f3f2
pip
rafaelpierrehf Feb 27, 2024
c353728
readme
rafaelpierrehf Feb 27, 2024
a2c3442
unit
rafaelpierrehf Feb 27, 2024
167018c
cache
rafaelpierrehf Feb 27, 2024
0ac2960
hub cache
rafaelpierrehf Feb 27, 2024
7b922a4
remove cache
rafaelpierrehf Feb 27, 2024
59db104
unit cache
rafaelpierrehf Feb 27, 2024
9d294d5
cache
rafaelpierrehf Feb 27, 2024
88787e4
passenv
rafaelpierrehf Feb 27, 2024
b1ee387
cleanup
rafaelpierrehf Feb 27, 2024
f3051ec
comments
rafaelpierrehf Feb 27, 2024
606e410
fix
rafaelpierrehf Feb 27, 2024
ef15995
remove tox
rafaelpierrehf Feb 27, 2024
a6f0781
fix
rafaelpierrehf Feb 27, 2024
68a87c1
path
rafaelpierrehf Feb 27, 2024
040d581
concurrency
rafaelpierrehf Feb 27, 2024
605c7f3
fix
rafaelpierrehf Feb 28, 2024
c7a3cd0
runs on;
rafaelpierrehf Feb 28, 2024
0cccf69
cpu
rafaelpierrehf Feb 28, 2024
8d8e68a
unit tests
rafaelpierrehf Feb 28, 2024
3996bd4
-r
rafaelpierrehf Feb 28, 2024
7251139
ignore
rafaelpierrehf Feb 28, 2024
819cd33
path
rafaelpierrehf Feb 28, 2024
b11741f
cache
rafaelpierrehf Feb 28, 2024
e8cab4b
backslash
rafaelpierrehf Feb 28, 2024
35f92bc
st, diffusers
rafaelpierrehf Feb 28, 2024
00503c3
cache test dir
rafaelpierrehf Feb 28, 2024
b34d991
gpus
rafaelpierrehf Feb 28, 2024
d8a60d1
custom pipeline path
rafaelpierrehf Feb 28, 2024
5b55a66
fix
rafaelpierrehf Feb 28, 2024
088a2d8
payload
rafaelpierrehf Feb 28, 2024
c628acb
final comments
rafaelpierrehf Feb 28, 2024
50bea98
Update dockerfiles/pytorch/Dockerfile
rafaelpierrehf Feb 28, 2024
0b93a74
Update README.md
rafaelpierrehf Feb 28, 2024
0096a3e
Update dockerfiles/pytorch/Dockerfile
rafaelpierrehf Feb 28, 2024
10 changes: 10 additions & 0 deletions .dockerignore
@@ -0,0 +1,10 @@
.github
.pytest_cache
.ruff_cache
.tox
.venv
.gitignore
makefile
__pycache__
tests
.vscode
5 changes: 3 additions & 2 deletions .github/workflows/build-container.yaml
@@ -19,7 +19,8 @@ jobs:
     uses: ./.github/workflows/docker-build-action.yaml
     with:
       image: inference-pytorch-cpu
-      dockerfile: dockerfiles/pytorch/cpu/Dockerfile
+      dockerfile: dockerfiles/pytorch/Dockerfile
+      build_args: "BASE_IMAGE=ubuntu:22.04"
     secrets:
       TAILSCALE_AUTHKEY: ${{ secrets.TAILSCALE_AUTHKEY }}
       REGISTRY_USERNAME: ${{ secrets.REGISTRY_USERNAME }}
@@ -28,7 +29,7 @@ jobs:
     uses: ./.github/workflows/docker-build-action.yaml
     with:
       image: inference-pytorch-gpu
-      dockerfile: dockerfiles/pytorch/gpu/Dockerfile
+      dockerfile: dockerfiles/pytorch/Dockerfile
Contributor Author

This looks to be the wrong place.

Contributor

rafaelpierrehf Feb 28, 2024

We have a single image now. The only difference between GPU and CPU is the base image; the CUDA development image is the default base image.

e.g. for CPU:

 build_args: "BASE_IMAGE=ubuntu:22.04"
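
The single-Dockerfile approach described in this comment can be sketched roughly as follows. This is only an illustration, not the repository's build code: it assumes the shared Dockerfile declares an `ARG BASE_IMAGE` with a CUDA default (that Dockerfile is not shown in this part of the diff), and the helper name `docker_build_cmd` is hypothetical.

```python
from typing import List, Optional


def docker_build_cmd(image: str, dockerfile: str, base_image: Optional[str] = None) -> List[str]:
    """Assemble a `docker build` argv; base_image=None keeps the Dockerfile's default."""
    cmd = ["docker", "build", "-f", dockerfile, "-t", image]
    if base_image is not None:
        # Override the (assumed) `ARG BASE_IMAGE` declared in the Dockerfile.
        cmd += ["--build-arg", f"BASE_IMAGE={base_image}"]
    cmd.append(".")  # build context: repository root
    return cmd


# GPU build: no override, so the Dockerfile's default base image is used.
gpu_cmd = docker_build_cmd("inference-pytorch-gpu", "dockerfiles/pytorch/Dockerfile")
# CPU build: same Dockerfile, different base image.
cpu_cmd = docker_build_cmd("inference-pytorch-cpu", "dockerfiles/pytorch/Dockerfile", "ubuntu:22.04")

print(" ".join(gpu_cmd))
print(" ".join(cpu_cmd))
```

Both images share one Dockerfile path; only the CPU command carries a `--build-arg`.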

     secrets:
       TAILSCALE_AUTHKEY: ${{ secrets.TAILSCALE_AUTHKEY }}
       REGISTRY_USERNAME: ${{ secrets.REGISTRY_USERNAME }}
116 changes: 0 additions & 116 deletions .github/workflows/gpu-integ-test.yaml

This file was deleted.

51 changes: 0 additions & 51 deletions .github/workflows/integ-test.yaml

This file was deleted.

69 changes: 69 additions & 0 deletions .github/workflows/integration-test-action.yaml
@@ -0,0 +1,69 @@
on:
  workflow_call:
    inputs:
      region:
        type: string
        required: false
        default: "us-east-1"
      hf_home:
        required: false
        type: string
        default: "/mnt/hf_cache/"
      hf_hub_cache:
        required: false
        type: string
        default: "/mnt/hf_cache/hub"
      run_slow:
        required: false
        type: string
        default: "True"
      test_path:
        type: string
        required: true
      test_parallelism:
        type: string
        required: false
        default: "4"
      build_img_cmd:
        type: string
        required: false
        default: "make inference-pytorch-gpu"
      log_level:
        type: string
        required: false
        default: "ERROR"
      log_format:
        type: string
        required: false
        default: "%(asctime)s %(levelname)s %(module)s:%(lineno)d %(message)s"
      runs_on:
        type: string
        required: false
        default: '["single-gpu", "nvidia-gpu", "t4", "ci"]'

jobs:
  pytorch-integration-tests:
    runs-on: ${{ fromJson(inputs.runs_on) }}
    env:
      AWS_REGION: ${{ inputs.region }}
      HF_HOME: ${{ inputs.hf_home }}
      HF_HUB_CACHE: ${{ inputs.hf_hub_cache }}
      RUN_SLOW: ${{ inputs.run_slow }}
    steps:
      - uses: actions/checkout@v4.1.1
      - name: Docker Setup Buildx
        uses: docker/setup-buildx-action@v3.0.0
      - name: Docker Build
        run: ${{ inputs.build_img_cmd }}
      - name: Set up Python 3.11
        uses: actions/setup-python@v2
        with:
          python-version: 3.11
      - name: Install dependencies
        run: pip install ".[torch, test]"
      - name: Run local integration tests
        run: |
          python -m pytest \
          ${{ inputs.test_path }} -n ${{ inputs.test_parallelism }} \
          --log-cli-level='${{ inputs.log_level }}' \
          --log-format='${{ inputs.log_format }}'
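
The `runs_on` input above is a string holding a JSON array, which the workflow decodes with `fromJson` to obtain the runner labels. A minimal Python sketch of that decoding step follows; `decode_runs_on` is a hypothetical helper, and Python's `json` module stands in for GitHub's expression parser, whose exact leniency this diff does not show. A strict parser such as Python's rejects single-quoted strings, so a value like `['ci']` would need double quotes there.

```python
import json
from typing import List

# Default taken from the workflow input above.
DEFAULT_RUNS_ON = '["single-gpu", "nvidia-gpu", "t4", "ci"]'


def decode_runs_on(raw: str) -> List[str]:
    """Decode a JSON-encoded list of runner labels and validate its shape."""
    labels = json.loads(raw)
    if not isinstance(labels, list) or not all(isinstance(label, str) for label in labels):
        raise ValueError(f"runs_on must be a JSON array of strings, got: {raw!r}")
    return labels


labels = decode_runs_on(DEFAULT_RUNS_ON)
print(labels)

# Strict JSON only allows double-quoted strings.
try:
    decode_runs_on("['ci']")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False
print(single_quotes_ok)
```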
44 changes: 44 additions & 0 deletions .github/workflows/integration-test.yaml
@@ -0,0 +1,44 @@
name: Run Integration Tests

on:
  push:
    paths-ignore:
      - 'README.md'
      - '.github/workflows/unit-test.yaml'
      - '.github/workflows/quality.yaml'
    branches:
      - main
  pull_request:
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  pytorch-integration-local-gpu:
    name: Local Integration Tests - GPU
    uses: ./.github/workflows/integration-test-action.yaml
    with:
      test_path: "tests/integ/test_pytorch_local_gpu.py"
      build_img_cmd: "make inference-pytorch-gpu"
  pytorch-integration-remote-gpu:
    name: Remote Integration Tests - GPU
    uses: ./.github/workflows/integration-test-action.yaml
    with:
      test_path: "tests/integ/test_pytorch_remote_gpu.py"
      build_img_cmd: "make inference-pytorch-gpu"
  pytorch-integration-remote-cpu:
    name: Remote Integration Tests - CPU
    uses: ./.github/workflows/integration-test-action.yaml
    with:
      test_path: "tests/integ/test_pytorch_remote_cpu.py"
      build_img_cmd: "make inference-pytorch-cpu"
      runs_on: "['ci']"
  pytorch-integration-local-cpu:
    name: Local Integration Tests - CPU
    uses: ./.github/workflows/integration-test-action.yaml
    with:
      test_path: "tests/integ/test_pytorch_local_cpu.py"
      build_img_cmd: "make inference-pytorch-cpu"
      runs_on: "['ci']"
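
The `concurrency` block in this workflow groups runs by workflow name plus `github.head_ref || github.run_id`. The sketch below illustrates how that key behaves, assuming `github.head_ref` is only populated for pull-request events and is empty otherwise. `concurrency_group` is a hypothetical helper that mimics the expression, and the branch name and run IDs are made up for the example.

```python
def concurrency_group(workflow: str, head_ref: str, run_id: int) -> str:
    """Mimic `${{ github.workflow }}-${{ github.head_ref || github.run_id }}`."""
    # `a || b` falls back to b when a is empty, like Python's `or` on strings.
    return f"{workflow}-{head_ref or run_id}"


WORKFLOW = "Run Integration Tests"

# Two pull-request runs from the same (hypothetical) branch share one group,
# so `cancel-in-progress: true` lets the newer run cancel the older one.
pr_first = concurrency_group(WORKFLOW, "my-feature", 101)
pr_second = concurrency_group(WORKFLOW, "my-feature", 102)

# Runs without a head ref fall back to the unique run ID,
# so each gets its own group and none is cancelled.
push_first = concurrency_group(WORKFLOW, "", 103)
push_second = concurrency_group(WORKFLOW, "", 104)

print(pr_first, pr_second, push_first, push_second, sep="\n")
```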
6 changes: 4 additions & 2 deletions .github/workflows/quality.yaml
@@ -2,6 +2,8 @@ name: Quality Check

 on:
   push:
+    paths-ignore:
+      - 'README.md'
     branches:
       - main
   pull_request:
@@ -16,10 +18,10 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v2
-      - name: Set up Python 3.9
+      - name: Set up Python 3.11
         uses: actions/setup-python@v2
         with:
-          python-version: 3.9
+          python-version: 3.11
       - name: Install Python dependencies
         run: pip install -e .[quality]
       - name: Run Quality check
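
The `paths-ignore` filter added under `push` skips the workflow only when every changed file matches an ignored path. A minimal sketch of that rule is shown below, restricted to literal paths such as `README.md`. Real GitHub filters use glob patterns, which this sketch does not implement, and `should_run` is a hypothetical helper that assumes at least one changed file.

```python
from typing import Iterable, Set


def should_run(changed_files: Iterable[str], paths_ignore: Set[str]) -> bool:
    """Return True if at least one changed file falls outside the ignore set."""
    return any(path not in paths_ignore for path in changed_files)


IGNORED = {"README.md"}

# A docs-only push touches nothing but ignored paths, so the workflow is skipped.
docs_only = should_run(["README.md"], IGNORED)
# A push that also touches a non-ignored file still triggers the workflow.
mixed = should_run(["README.md", "dockerfiles/pytorch/Dockerfile"], IGNORED)

print(docs_only, mixed)
```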