
Conversation

@philschmid
Contributor

No description provided.

Contributor Author

@philschmid philschmid left a comment


Didn't we discuss removing tox? Using a tool that nobody on the team knows, and that is used in only a single project, is not a valid option, especially since everything is realizable inside our environment with existing tools. Please remove it from the CI and from the instructions on how to run tests.

Feel free to keep it in your personal environment, but for the project we want to stay lean and within the scope of what other team members know.

I didn't do a full review.

 with:
   image: inference-pytorch-gpu
-  dockerfile: dockerfiles/pytorch/gpu/Dockerfile
+  dockerfile: dockerfiles/pytorch/Dockerfile
Contributor Author


This looks to be the wrong place.

Contributor

@rafaelpierrehf rafaelpierrehf Feb 28, 2024


We have one image now. The only diff between GPU and CPU is the base image; the CUDA development image is the default base image.

e.g. for CPU:

 build_args: "BASE_IMAGE=ubuntu:22.04"
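
For context, a minimal sketch of how such a parameterized Dockerfile could look (the CUDA base tag and the setup steps below are illustrative assumptions, not the actual dockerfiles/pytorch/Dockerfile):

 # Default to a CUDA development base image; CPU builds override it at build time,
 # e.g. with --build-arg BASE_IMAGE=ubuntu:22.04
 ARG BASE_IMAGE=nvidia/cuda:12.1.0-devel-ubuntu22.04
 FROM ${BASE_IMAGE}

 # Shared setup for both CPU and GPU variants (illustrative only)
 RUN apt-get update && apt-get install -y --no-install-recommends \
         python3 python3-pip \
     && rm -rf /var/lib/apt/lists/*

 WORKDIR /app
 COPY . /app
 RUN pip3 install --no-cache-dir .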

Contributor Author

@philschmid philschmid left a comment


Left some comments. Why are we using venv now? I thought we were using plain Python? Isn't that why we removed conda?

Comment on lines 78 to 84
converted_input = Conversation(
    inputs["text"],
    past_user_inputs=inputs.get("past_user_inputs", []),
    generated_responses=inputs.get("generated_responses", []),
)
prediction = pipeline(converted_input, *args, **kwargs)
return {
    "generated_text": prediction.generated_responses[-1],
    "conversation": {
        "past_user_inputs": prediction.past_user_inputs,
        "generated_responses": prediction.generated_responses,
    },
}
logging.info(f"Inputs: {inputs}")
logging.info(f"Args: {args}")
logging.info(f"KWArgs: {kwargs}")
prediction = pipeline(inputs, *args, **kwargs)
logging.info(f"Prediction: {prediction}")
return list(prediction)
Contributor Author


If the conversational output is not needed in that format, can we remove the whole wrap_pipeline since it's not used?

Contributor Author

@philschmid philschmid left a comment


Almost done; I left some suggestions on the Dockerfile and had one question related to the line length in one file.

rafaelpierrehf and others added 3 commits February 28, 2024 17:40
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
@rafaelpierrehf rafaelpierrehf merged commit de600c4 into main Feb 28, 2024
@rafaelpierrehf rafaelpierrehf deleted the cuda12 branch February 28, 2024 16:54