Upgrade to Cuda12 and latest versions #46
Merged
174 commits
682b487 test (philschmid)
43dd281 to 4.36 (philschmid)
6584b73 build image (philschmid)
1cda02f fxi (philschmid)
b262224 Move GPU to EKS (glegendre01)
8271cc7 cuda 12, remove conda
3887e21 Merge remote-tracking branch 'origin/move-gpu-to-eks' into cuda12 (rafaelpierrehf)
f514a5e integ test 2.0 (rafaelpierrehf)
b164b31 2.0 (rafaelpierrehf)
70d6003 fix (rafaelpierrehf)
b1cc6a2 indent (rafaelpierrehf)
c728190 docker buildx (rafaelpierrehf)
b5ba045 depends on (rafaelpierrehf)
10b62b7 name (rafaelpierrehf)
383dab3 indent (rafaelpierrehf)
0d162de name (rafaelpierrehf)
b22370b trigger (rafaelpierrehf)
db90673 colon (rafaelpierrehf)
90875ba v4 (rafaelpierrehf)
9066cc8 download (rafaelpierrehf)
6fa3cb0 ls (rafaelpierrehf)
667299a indent (rafaelpierrehf)
e64a76a cache (rafaelpierrehf)
6b6b33c v4 (rafaelpierrehf)
9c223fe revert (rafaelpierrehf)
6036a44 path (rafaelpierrehf)
3731fd4 slash (rafaelpierrehf)
ecde720 tar (rafaelpierrehf)
85a2996 path (rafaelpierrehf)
cf2b0ae reduce image size (rafaelpierrehf)
356c813 test_integ_new (rafaelpierrehf)
c2f6618 tenacity (rafaelpierrehf)
79ee67e retry if (rafaelpierrehf)
12af852 retry config (rafaelpierrehf)
8697723 uv & venv
3587eda fast unit tests passing
e0f5ea2 pass short unit (rafaelpierrehf)
cd50871 tensorflow (rafaelpierrehf)
600edb0 tox (rafaelpierrehf)
7e57085 cpu images (rafaelpierrehf)
16dc0f4 conversational passing integration (rafaelpierrehf)
286a877 tox multiprocess (rafaelpierrehf)
54d110b remove tf from integration test in tox.ini (rafaelpierrehf)
749093b local container tests (rafaelpierrehf)
3daec64 torch integ local passing (rafaelpierrehf)
de58ba5 tf local pass (rafaelpierrehf)
bbdd3a0 tf remote pass (rafaelpierrehf)
dde132e tox (rafaelpierrehf)
daeae06 require_tf (rafaelpierrehf)
3c17452 workflow (rafaelpierrehf)
e01ea5c gpu integ (rafaelpierrehf)
09adb51 unit (rafaelpierrehf)
6e11450 log level (rafaelpierrehf)
591ae0a verbose (rafaelpierrehf)
73ae3fe ffmpeg (rafaelpierrehf)
fa24df0 update (rafaelpierrehf)
2e5efd0 level: (rafaelpierrehf)
65c6f16 debug (rafaelpierrehf)
3014c04 true (rafaelpierrehf)
cff49c9 install command (rafaelpierrehf)
ca4a964 deps (rafaelpierrehf)
70fb401 torch (rafaelpierrehf)
1cecf47 runs on (rafaelpierrehf)
231efa5 unit (rafaelpierrehf)
c094365 workflow (rafaelpierrehf)
5f35e46 install (rafaelpierrehf)
e2691f5 cuda (rafaelpierrehf)
60cc692 cuda & transformers (rafaelpierrehf)
5c7d2db dependencies (rafaelpierrehf)
6397b4c nvidia & cache (rafaelpierrehf)
78d79da cuda drivers (rafaelpierrehf)
c80d1aa whisper tiny pass (rafaelpierrehf)
0ed51c1 pass (rafaelpierrehf)
73ba40b pass (rafaelpierrehf)
29809bf tf pass (rafaelpierrehf)
e4976a3 run unit tests inside docker (rafaelpierrehf)
edf8b98 tox (rafaelpierrehf)
45d5154 dockerfile (rafaelpierrehf)
ef8d5cf uv (rafaelpierrehf)
8674cf0 docker images (rafaelpierrehf)
5988226 cache (rafaelpierrehf)
f517ef2 push (rafaelpierrehf)
eb2ac68 local registry (rafaelpierrehf)
f0aff35 make build (rafaelpierrehf)
dc9f4e4 container name (rafaelpierrehf)
264d6dd dry run (rafaelpierrehf)
2787c23 remove -it (rafaelpierrehf)
7efa257 echo (rafaelpierrehf)
478e4a0 integration (rafaelpierrehf)
c92d27f debug (rafaelpierrehf)
2bd0851 conversational (rafaelpierrehf)
51b2bc6 debug (rafaelpierrehf)
9e39ba2 device (rafaelpierrehf)
199099c from_env (rafaelpierrehf)
e037c1a debug level (rafaelpierrehf)
4387c80 socket (rafaelpierrehf)
d3b66f3 exception when starting container (rafaelpierrehf)
c658ad6 error (rafaelpierrehf)
f9e7daa permissions (rafaelpierrehf)
bd8302f order (rafaelpierrehf)
d2bc1b5 isolate (rafaelpierrehf)
4544b98 dry run (rafaelpierrehf)
bc6c5de dry run (rafaelpierrehf)
fbfc7f8 fix dry run params (rafaelpierrehf)
0db0518 quotes (rafaelpierrehf)
be92d7c check path (rafaelpierrehf)
d58ec57 backslash (rafaelpierrehf)
3049fed change path (rafaelpierrehf)
c8945bc host path (rafaelpierrehf)
825b933 look into cache (rafaelpierrehf)
741d4d0 path (rafaelpierrehf)
c5c4ed5 cache (rafaelpierrehf)
d828f61 env vars for cache (rafaelpierrehf)
1e592b0 dry run (rafaelpierrehf)
c51df3a add volume (rafaelpierrehf)
6857477 path (rafaelpierrehf)
0aa64e0 cache dry run (rafaelpierrehf)
cce368b install cli (rafaelpierrehf)
aa94250 model dir (rafaelpierrehf)
919ac71 config (rafaelpierrehf)
80bac49 -n 10 (rafaelpierrehf)
bf8c429 pass cpu (rafaelpierrehf)
c46e85b dry run local cpu (rafaelpierrehf)
b26522a format (rafaelpierrehf)
a707458 review (rafaelpierrehf)
7673903 .vscode (rafaelpierrehf)
fac74d5 venv (rafaelpierrehf)
455c38e -n 4 (rafaelpierrehf)
027a781 readme.md (rafaelpierrehf)
b3c9905 contributing (rafaelpierrehf)
d9455ef paths ignore (rafaelpierrehf)
68268c1 py version (rafaelpierrehf)
0149d03 comments (rafaelpierrehf)
557bd1b comments (rafaelpierrehf)
6e34590 dialog model (rafaelpierrehf)
e8e896f dockerfile (rafaelpierrehf)
2afeaad dockerfile (rafaelpierrehf)
073f358 tox (rafaelpierrehf)
a77ed50 unit tests (rafaelpierrehf)
ab1f3f2 pip (rafaelpierrehf)
c353728 readme (rafaelpierrehf)
a2c3442 unit (rafaelpierrehf)
167018c cache (rafaelpierrehf)
0ac2960 hub cache (rafaelpierrehf)
7b922a4 remove cache (rafaelpierrehf)
59db104 unit cache (rafaelpierrehf)
9d294d5 cache (rafaelpierrehf)
88787e4 passenv (rafaelpierrehf)
b1ee387 cleanup (rafaelpierrehf)
f3051ec comments (rafaelpierrehf)
606e410 fix (rafaelpierrehf)
ef15995 remove tox (rafaelpierrehf)
a6f0781 fix (rafaelpierrehf)
68a87c1 path (rafaelpierrehf)
040d581 concurrency (rafaelpierrehf)
605c7f3 fix (rafaelpierrehf)
c7a3cd0 runs on; (rafaelpierrehf)
0cccf69 cpu (rafaelpierrehf)
8d8e68a unit tests (rafaelpierrehf)
3996bd4 -r (rafaelpierrehf)
7251139 ignore (rafaelpierrehf)
819cd33 path (rafaelpierrehf)
b11741f cache (rafaelpierrehf)
e8cab4b backslash (rafaelpierrehf)
35f92bc st, diffusers (rafaelpierrehf)
00503c3 cache test dir (rafaelpierrehf)
b34d991 gpus (rafaelpierrehf)
d8a60d1 custom pipeline path (rafaelpierrehf)
5b55a66 fix (rafaelpierrehf)
088a2d8 payload (rafaelpierrehf)
c628acb final comments (rafaelpierrehf)
50bea98 Update dockerfiles/pytorch/Dockerfile (rafaelpierrehf)
0b93a74 Update README.md (rafaelpierrehf)
0096a3e Update dockerfiles/pytorch/Dockerfile (rafaelpierrehf)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| .github | ||
| .pytest_cache | ||
| .ruff_cache | ||
| .tox | ||
| .venv | ||
| .gitignore | ||
| makefile | ||
| __pycache__ | ||
| tests | ||
| .vscode |
This file was deleted.
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| on: | ||
|   workflow_call: | ||
|     inputs: | ||
|       region: | ||
|         type: string | ||
|         required: false | ||
|         default: "us-east-1" | ||
|       hf_home: | ||
|         required: false | ||
|         type: string | ||
|         default: "/mnt/hf_cache/" | ||
|       hf_hub_cache: | ||
|         required: false | ||
|         type: string | ||
|         default: "/mnt/hf_cache/hub" | ||
|       run_slow: | ||
|         required: false | ||
|         type: string | ||
|         default: "True" | ||
|       test_path: | ||
|         type: string | ||
|         required: true | ||
|       test_parallelism: | ||
|         type: string | ||
|         required: false | ||
|         default: "4" | ||
|       build_img_cmd: | ||
|         type: string | ||
|         required: false | ||
|         default: "make inference-pytorch-gpu" | ||
|       log_level: | ||
|         type: string | ||
|         required: false | ||
|         default: "ERROR" | ||
|       log_format: | ||
|         type: string | ||
|         required: false | ||
|         default: "%(asctime)s %(levelname)s %(module)s:%(lineno)d %(message)s" | ||
|       runs_on: | ||
|         type: string | ||
|         required: false | ||
|         default: '["single-gpu", "nvidia-gpu", "t4", "ci"]' | ||
| | ||
| jobs: | ||
|   pytorch-integration-tests: | ||
|     runs-on: ${{ fromJson(inputs.runs_on) }} | ||
|     env: | ||
|       AWS_REGION: ${{ inputs.region }} | ||
|       HF_HOME: ${{ inputs.hf_home }} | ||
|       HF_HUB_CACHE: ${{ inputs.hf_hub_cache }} | ||
|       RUN_SLOW: ${{ inputs.run_slow }} | ||
|     steps: | ||
|       - uses: actions/checkout@v4.1.1 | ||
|       - name: Docker Setup Buildx | ||
|         uses: docker/setup-buildx-action@v3.0.0 | ||
|       - name: Docker Build | ||
|         run: ${{ inputs.build_img_cmd }} | ||
|       - name: Set up Python 3.11 | ||
|         uses: actions/setup-python@v2 | ||
|         with: | ||
|           python-version: 3.11 | ||
|       - name: Install dependencies | ||
|         run: pip install ".[torch, test]" | ||
|       - name: Run local integration tests | ||
|         run: | | ||
|           python -m pytest \ | ||
|             ${{ inputs.test_path }} -n ${{ inputs.test_parallelism }} \ | ||
|             --log-cli-level='${{ inputs.log_level }}' \ | ||
|             --log-format='${{ inputs.log_format }}' |
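
The `runs_on` input above is declared as a string, so the list of runner labels is JSON-encoded and only becomes a list once `fromJson` decodes it for `runs-on`. A rough Python analogue of that decoding step is sketched below. `parse_runner_labels` is a hypothetical helper, not part of this PR, and Python's `json.loads` is a strict JSON parser: the single-quote rejection it shows is a property of strict JSON and is not a claim about how GitHub's `fromJson` behaves.

```python
import json


def parse_runner_labels(raw: str) -> list[str]:
    """Decode a JSON-encoded list of runner labels (hypothetical helper)."""
    labels = json.loads(raw)
    if not isinstance(labels, list) or not all(isinstance(x, str) for x in labels):
        raise ValueError(f"expected a JSON list of strings, got: {raw!r}")
    return labels


# The default value of the `runs_on` input decodes to four labels.
default_labels = parse_runner_labels('["single-gpu", "nvidia-gpu", "t4", "ci"]')
print(default_labels)

# A strict JSON parser rejects single-quoted strings such as "['ci']".
try:
    parse_runner_labels("['ci']")
except json.JSONDecodeError as err:
    print("strict JSON rejects single quotes:", err.msg)
```

Keeping the label list as a JSON string is what lets a caller swap the whole runner selection through a single string input.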
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,44 @@ | ||
| name: Run Integration Tests | ||
| | ||
| on: | ||
|   push: | ||
|     paths-ignore: | ||
|       - 'README.md' | ||
|       - '.github/workflows/unit-test.yaml' | ||
|       - '.github/workflows/quality.yaml' | ||
|     branches: | ||
|       - main | ||
|   pull_request: | ||
|   workflow_dispatch: | ||
| | ||
| concurrency: | ||
|   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }} | ||
|   cancel-in-progress: true | ||
| | ||
| jobs: | ||
|   pytorch-integration-local-gpu: | ||
|     name: Local Integration Tests - GPU | ||
|     uses: ./.github/workflows/integration-test-action.yaml | ||
|     with: | ||
|       test_path: "tests/integ/test_pytorch_local_gpu.py" | ||
|       build_img_cmd: "make inference-pytorch-gpu" | ||
|   pytorch-integration-remote-gpu: | ||
|     name: Remote Integration Tests - GPU | ||
|     uses: ./.github/workflows/integration-test-action.yaml | ||
|     with: | ||
|       test_path: "tests/integ/test_pytorch_remote_gpu.py" | ||
|       build_img_cmd: "make inference-pytorch-gpu" | ||
|   pytorch-integration-remote-cpu: | ||
|     name: Remote Integration Tests - CPU | ||
|     uses: ./.github/workflows/integration-test-action.yaml | ||
|     with: | ||
|       test_path: "tests/integ/test_pytorch_remote_cpu.py" | ||
|       build_img_cmd: "make inference-pytorch-cpu" | ||
|       runs_on: "['ci']" | ||
|   pytorch-integration-local-cpu: | ||
|     name: Local Integration Tests - CPU | ||
|     uses: ./.github/workflows/integration-test-action.yaml | ||
|     with: | ||
|       test_path: "tests/integ/test_pytorch_local_cpu.py" | ||
|       build_img_cmd: "make inference-pytorch-cpu" | ||
|       runs_on: "['ci']" |
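
The `concurrency` block in the workflow above keys its group on `github.head_ref || github.run_id`. `github.head_ref` is only set for pull request events, so runs for the same PR branch share a group and `cancel-in-progress` cancels the older one. Other events fall back to the unique `run_id` and never collide. A small Python sketch of how that key resolves follows; the function name and the sample run ids are illustrative assumptions, not part of the PR.

```python
def concurrency_group(workflow: str, head_ref: str, run_id: int) -> str:
    """Mimic `${{ github.workflow }}-${{ github.head_ref || github.run_id }}`."""
    # An empty head_ref is falsy, so `or` falls through to the run id,
    # which mirrors how `||` picks the first truthy operand in an expression.
    return f"{workflow}-{head_ref or run_id}"


# Two runs triggered by pushes to the same pull request branch.
pr_first = concurrency_group("Run Integration Tests", "cuda12", 101)
pr_second = concurrency_group("Run Integration Tests", "cuda12", 102)

# Two runs triggered by events with no head_ref (for example a push to main).
push_first = concurrency_group("Run Integration Tests", "", 103)
push_second = concurrency_group("Run Integration Tests", "", 104)

print(pr_first, pr_second)      # identical keys: the newer run cancels the older
print(push_first, push_second)  # distinct keys: neither run is cancelled
```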
This looks to be the wrong place.
We have one image now. The only diff between `gpu` and `cpu` is the base image. CUDA Development is the default base image. E.g. for CPU: