Upgrade to CUDA 12 and latest versions #46
Conversation
Didn't we discuss removing tox? Using a tool that nobody on the team knows and that is used in only a single project is not a valid option, especially since everything can be achieved inside our environment with existing tools. Please remove it from the CI and from the instructions on how to run tests.
Feel free to keep it in your personal environment, but for the project we want to stay lean and within the scope of what other team members know.
I didn't do a full review.
  with:
    image: inference-pytorch-gpu
-   dockerfile: dockerfiles/pytorch/gpu/Dockerfile
+   dockerfile: dockerfiles/pytorch/Dockerfile
This looks to be the wrong place.
We have one image now. The only difference between GPU and CPU is the base image; the CUDA development image is the default.
e.g. for CPU:
build_args: "BASE_IMAGE=ubuntu:22.04"
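A minimal sketch of what the CPU build could then look like in the workflow, assuming it mirrors the GPU job above (the CPU image name and the exact inputs are illustrative assumptions, not taken from this PR). The Dockerfile would presumably declare an ARG BASE_IMAGE defaulting to the CUDA development image and consume it in its FROM line:

  with:
    image: inference-pytorch-cpu                # hypothetical CPU image name
    dockerfile: dockerfiles/pytorch/Dockerfile  # same single Dockerfile as the GPU build
    build_args: "BASE_IMAGE=ubuntu:22.04"       # overrides the default CUDA base image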
philschmid left a comment
Left some comments. Why are we using venv now? I thought we used plain Python? Isn't that why we removed conda?
-        converted_input = Conversation(
-            inputs["text"],
-            past_user_inputs=inputs.get("past_user_inputs", []),
-            generated_responses=inputs.get("generated_responses", []),
-        )
-        prediction = pipeline(converted_input, *args, **kwargs)
-        return {
-            "generated_text": prediction.generated_responses[-1],
-            "conversation": {
-                "past_user_inputs": prediction.past_user_inputs,
-                "generated_responses": prediction.generated_responses,
-            },
-        }
+        logging.info(f"Inputs: {inputs}")
+        logging.info(f"Args: {args}")
+        logging.info(f"KWArgs: {kwargs}")
+        prediction = pipeline(inputs, *args, **kwargs)
+        logging.info(f"Prediction: {prediction}")
+        return list(prediction)
If the conversational task is not needed in that format, can we remove the whole wrap_pipeline, since it's not used?
philschmid left a comment
Almost done. Left some suggestions on the Dockerfile and had one question related to the line length in one file.
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
No description provided.