remove inf1 support and upgrade some package versions #785

Merged 1 commit on Jun 1, 2023
2 changes: 1 addition & 1 deletion .github/workflows/docker-nightly-publish.yml
@@ -15,7 +15,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
arch: [ cpu, cpu-full, deepspeed, pytorch-inf1, pytorch-inf2, pytorch-cu118, fastertransformer ]
arch: [ cpu, cpu-full, deepspeed, pytorch-inf2, pytorch-cu118, fastertransformer ]
steps:
- uses: actions/checkout@v3
- name: Login to Docker
73 changes: 1 addition & 72 deletions .github/workflows/integration.yml
@@ -33,19 +33,9 @@ jobs:
--fail \
| jq '.token' | tr -d '"' )
./start_instance.sh action_graviton $token djl-serving
- name: Create new Inferentia instance
id: create_inf
run: |
cd /home/ubuntu/djl_benchmark_script/scripts
token=$( curl -X POST -H "Authorization: token ${{ secrets.ACTION_RUNNER_PERSONAL_TOKEN }}" \
https://api.github.com/repos/deepjavalibrary/djl-serving/actions/runners/registration-token \
--fail \
| jq '.token' | tr -d '"' )
./start_instance.sh action_inf $token djl-serving
outputs:
gpu_instance_id: ${{ steps.create_gpu.outputs.action_gpu_instance_id }}
aarch64_instance_id: ${{ steps.create_aarch64.outputs.action_graviton_instance_id }}
inf_instance_id: ${{ steps.create_inf.outputs.action_inf_instance_id }}

cpu-test:
runs-on: ubuntu-latest
@@ -153,65 +143,6 @@ jobs:
name: ${{ matrix.arch }}-logs
path: tests/integration/logs/

inferentia-test:
runs-on: [ self-hosted, inf ]
timeout-minutes: 30
needs: create-runners
steps:
- uses: actions/checkout@v3
- name: Clean env
run: |
yes | docker system prune -a --volumes
sudo rm -rf /home/ubuntu/actions-runner/_work/_tool/Java_Corretto_jdk/
echo "wait dpkg lock..."
while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do sleep 5; done
- name: Set up JDK 11
uses: actions/setup-java@v3
with:
distribution: 'corretto'
java-version: 11
- uses: actions/cache@v3
with:
path: ~/.gradle/caches
key: ${{ runner.os }}-gradle-${{ hashFiles('**/*.gradle*') }}
restore-keys: |
${{ runner.os }}-gradle-
- name: Install DJL-Bench
working-directory: benchmark
run: ./gradlew installOnLinux
- name: Build container name
run: ./serving/docker/scripts/docker_name_builder.sh pytorch-inf1 ${{ github.event.inputs.djl-version }}
- name: Download models and dockers
working-directory: tests/integration
run: |
docker pull deepjavalibrary/djl-serving:$DJLSERVING_DOCKER_TAG
mkdir logs
./download_models.sh pytorch-inf1
- name: Test Pytorch model
working-directory: tests/integration
run: |
./launch_container.sh deepjavalibrary/djl-serving:$DJLSERVING_DOCKER_TAG $PWD/models pytorch-inf1 \
serve -m test::PyTorch=file:/opt/ml/model/resnet18_inf1_1_12.tar.gz?model_name=resnet18_inf1_1_12
./test_client.sh image/jpg models/kitten.jpg
docker rm -f $(docker ps -aq)
- name: Test Python mode
working-directory: tests/integration
run: |
./launch_container.sh deepjavalibrary/djl-serving:$DJLSERVING_DOCKER_TAG $PWD/models pytorch-inf1 \
serve -m test::Python=file:/opt/ml/model/resnet18_inf1_1_12.tar.gz
./test_client.sh image/jpg models/kitten.jpg
docker rm -f $(docker ps -aq)
- name: On fail step
if: ${{ failure() }}
working-directory: tests/integration
run: |
cat logs/serving.log
- name: Upload test logs
uses: actions/upload-artifact@v3
with:
name: pytorch-inf1-logs
path: tests/integration/logs/

gpu-test:
runs-on: [ self-hosted, gpu ]
timeout-minutes: 30
@@ -336,7 +267,7 @@ jobs:
stop-runners:
if: always()
runs-on: [ self-hosted, scheduler ]
needs: [ create-runners, inferentia-test, aarch64-test, gpu-test ]
needs: [ create-runners, aarch64-test, gpu-test ]
steps:
- name: Stop all instances
run: |
@@ -345,5 +276,3 @@ jobs:
./stop_instance.sh $instance_id
instance_id=${{ needs.create-runners.outputs.aarch64_instance_id }}
./stop_instance.sh $instance_id
instance_id=${{ needs.create-runners.outputs.inf_instance_id }}
./stop_instance.sh $instance_id
2 changes: 1 addition & 1 deletion README.md
@@ -82,7 +82,7 @@ To see examples, see the [starting page](serving/docs/starting.md).
### More examples

- [Serving a Python model](https://github.com/deepjavalibrary/djl-demo/tree/master/huggingface/python)
- [Serving on Inf1 EC2 instance](https://github.com/deepjavalibrary/djl-demo/tree/master/huggingface/inferentia)
- [Serving on Inferentia EC2 instance](https://github.com/deepjavalibrary/djl-demo/tree/master/huggingface/inferentia)
- [Serving with docker](https://github.com/deepjavalibrary/djl-serving/tree/master/serving/docker)

### More command line options
4 changes: 2 additions & 2 deletions serving/docker/README.md
@@ -51,6 +51,6 @@ docker run -it --runtime=nvidia --shm-size 2g -v $PWD:/opt/ml/model -p 8080:8080
mkdir models
cd models

curl -O https://resources.djl.ai/test-models/pytorch/bert_qa_inf1.tar.gz
docker run --device /dev/neuron0 -it --rm -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.22.1-inf1
curl -O https://resources.djl.ai/test-models/pytorch/resnet18_inf2_2_4.tar.gz
docker run --device /dev/neuron0 -it --rm -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.22.1-pytorch-inf2
```
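Before running the inf2 container above, it is worth checking that the host actually exposes a Neuron device; a missing `--device` target is an easy-to-miss failure. A minimal pre-flight sketch, assuming the device node is `/dev/neuron0` as in the `--device` flag above (hosts with multiple accelerators expose `/dev/neuron1`, `/dev/neuron2`, and so on):

```shell
# Check for the Neuron device node before launching the container.
# /dev/neuron0 is an assumption taken from the --device flag above.
if [ -e /dev/neuron0 ]; then
  device_flags="--device /dev/neuron0"
  echo "Neuron device found"
else
  device_flags=""
  echo "no /dev/neuron0 found; is this an Inf2 instance?"
fi

# Dry-run: echo the docker command instead of executing it.
echo docker run $device_flags -it --rm -v "$PWD:/opt/ml/model" -p 8080:8080 \
  deepjavalibrary/djl-serving:0.22.1-pytorch-inf2
```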
5 changes: 0 additions & 5 deletions serving/docker/docker-compose.yml
@@ -22,11 +22,6 @@ services:
context: .
dockerfile: deepspeed.Dockerfile
image: "deepjavalibrary/djl-serving:${RELEASE_VERSION}deepspeed${NIGHTLY}"
pytorch-inf1:
build:
context: .
dockerfile: pytorch-inf1.Dockerfile
image: "deepjavalibrary/djl-serving:${RELEASE_VERSION}pytorch-inf1${NIGHTLY}"
pytorch-cu118:
build:
context: .
4 changes: 2 additions & 2 deletions serving/docker/fastertransformer.Dockerfile
@@ -19,8 +19,8 @@ ARG torch_wheel="https://aws-pytorch-unified-cicd-binaries.s3.us-west-2.amazonaw
ARG ft_wheel="https://publish.djl.ai/fastertransformer/fastertransformer-0.23.0-py3-none-any.whl"
ARG tb_wheel="https://publish.djl.ai/tritonserver/r23.04/tritontoolkit-23.4-py3-none-any.whl"
ARG ompi_version=4.1.4
ARG transformers_version=4.27.3
ARG accelerate_version=0.17.1
ARG transformers_version=4.29.2
ARG accelerate_version=0.19.0
ARG bitsandbytes_version=0.38.1

EXPOSE 8080
4 changes: 2 additions & 2 deletions serving/docker/pytorch-cu118.Dockerfile
@@ -14,8 +14,8 @@ ARG version=11.8.0-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:$version as base

ARG djl_version=0.23.0~SNAPSHOT
ARG torch_version=2.0.0
ARG torch_vision_version=0.15.1
ARG torch_version=2.0.1
ARG torch_vision_version=0.15.2
ARG python_version=3.9

RUN mkdir -p /opt/djl/conf && \
60 changes: 0 additions & 60 deletions serving/docker/pytorch-inf1.Dockerfile

This file was deleted.

31 changes: 0 additions & 31 deletions serving/docker/scripts/install_inferentia.sh

This file was deleted.

2 changes: 1 addition & 1 deletion serving/docker/scripts/pull_and_retag.sh
@@ -2,7 +2,7 @@

version=$1
repo=$2
images="cpu aarch64 cpu-full pytorch-inf1 pytorch-inf2 pytorch-cu118 deepspeed fastertransformer"
images="cpu aarch64 cpu-full pytorch-inf2 pytorch-cu118 deepspeed fastertransformer"

for image in $images; do
if [[ ! "$version" == "nightly" ]]; then
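The diff only shows the opening of the retag loop, so here is a dry-run sketch of how the whole of `pull_and_retag.sh` plausibly works after this change. The target repo value and the nightly tag convention are assumptions, not taken from the script:

```shell
#!/usr/bin/env bash
# Dry-run sketch: echoes the docker commands instead of running them.
version=0.22.1                                   # $1 in the real script
repo=example.registry.local/djl-serving          # $2; hypothetical target repo
images="cpu aarch64 cpu-full pytorch-inf2 pytorch-cu118 deepspeed fastertransformer"

for image in $images; do
  if [[ ! "$version" == "nightly" ]]; then
    tag="${version}-${image}"                    # release tags like 0.22.1-cpu
  else
    tag="${image}-nightly"                       # assumed nightly convention
  fi
  echo docker pull "deepjavalibrary/djl-serving:${tag}"
  echo docker tag "deepjavalibrary/djl-serving:${tag}" "${repo}:${tag}"
done
```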
2 changes: 1 addition & 1 deletion serving/docker/scripts/security_patch.sh
@@ -8,6 +8,6 @@ if [[ "$IMAGE_NAME" == "deepspeed" ]] || \
[[ "$IMAGE_NAME" == "pytorch-cu118" ]] || \
[[ "$IMAGE_NAME" == "fastertransformer" ]]; then
apt-get upgrade -y dpkg e2fsprogs libdpkg-perl libpcre2-8-0 libpcre3 openssl libsqlite3-0 libsepol1 libdbus-1-3 curl
elif [[ "$IMAGE_NAME" == "cpu" ]] || [[ "$IMAGE_NAME" == "pytorch-inf1" ]]; then
elif [[ "$IMAGE_NAME" == "cpu" ]]; then
apt-get upgrade -y libpcre2-8-0 libdbus-1-3 curl
fi
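The branching in `security_patch.sh` can also be read as a dispatch table from image name to package list. A dry-run sketch that mirrors the script's logic after this change (package lists copied from the diff; the `case` form and the `$1` default are illustrative, not the script's actual structure):

```shell
# Select the package list to upgrade based on the image being built.
IMAGE_NAME="${1:-cpu}"   # default to cpu for this sketch

case "$IMAGE_NAME" in
  deepspeed|pytorch-cu118|fastertransformer)
    pkgs="dpkg e2fsprogs libdpkg-perl libpcre2-8-0 libpcre3 openssl libsqlite3-0 libsepol1 libdbus-1-3 curl" ;;
  cpu)
    pkgs="libpcre2-8-0 libdbus-1-3 curl" ;;
  *)
    pkgs="" ;;               # other images get no extra patching
esac

echo apt-get upgrade -y $pkgs   # dry-run; the real script runs this directly
```

Note that with pytorch-inf1 removed, the `cpu` branch is the only one left using the short package list.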
11 changes: 2 additions & 9 deletions tests/integration/download_models.sh
@@ -2,7 +2,7 @@

set -e

platform=$1 # expected values are "cpu" "cpu-full" "pytorch-cu118" "pytorch-inf1" "aarch64"
platform=$1 # expected values are "cpu" "cpu-full" "pytorch-cu118" "pytorch-inf2" "aarch64"

rm -rf models
mkdir models && cd models
@@ -22,10 +22,6 @@ aarch_models_urls=(
"https://resources.djl.ai/test-models/onnxruntime/resnet18-v1-7.zip"
)

inf1_models_urls=(
"https://resources.djl.ai/test-models/pytorch/resnet18_inf1_1_12.tar.gz"
)

inf2_models_urls=(
"https://resources.djl.ai/test-models/pytorch/resnet18_inf2_2_4.tar.gz"
)
@@ -45,17 +41,14 @@ case $platform in
cpu | cpu-full | pytorch-cu118)
download "${general_platform_models_urls[@]}"
;;
pytorch-inf1)
download "${inf1_models_urls[@]}"
;;
pytorch-inf2)
download "${inf2_models_urls[@]}"
;;
aarch64)
download "${aarch_models_urls[@]}"
;;
*)
echo "Bad argument. Expecting one of the values: cpu, cpu-full, pytorch-cu118, pytorch-inf1, pytorch-inf2, aarch64"
echo "Bad argument. Expecting one of the values: cpu, cpu-full, pytorch-cu118, pytorch-inf2, aarch64"
exit 1
;;
esac
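The `case` statement dispatches each platform's URL array to a `download` helper that the diff does not show. A hypothetical dry-run sketch of that helper and the inf2 path through it (the real implementation may differ, e.g. it likely fetches and unpacks the archives):

```shell
# Hypothetical sketch of the `download` helper used by download_models.sh.
downloaded=0
download() {
  local url
  for url in "$@"; do
    echo "would fetch $url"      # real script would run something like: curl -fSLO "$url"
    downloaded=$((downloaded + 1))
  done
}

# The inf2 branch, with the URL list from the diff above.
inf2_models_urls=(
  "https://resources.djl.ai/test-models/pytorch/resnet18_inf2_2_4.tar.gz"
)
download "${inf2_models_urls[@]}"
```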