Use logging.warning instead of warnings.warn in pipeline.__call__ #29717

tokestermw · 2024-03-18T20:03:46Z

What does this PR do?

Use HF logging instead of warnings.warn for the following warning:

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset

When using an HF pipeline for inference, this warning is shown repeatedly, and the logging can get expensive. If we move to HF logging, the warning can be controlled with transformers.logging.set_verbosity instead of the warnings package, whose filters apply to every library that uses the warnings package.
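The difference in scope can be sketched with the standard library alone. This is an illustration under assumptions: plain `logging` loggers named `lib_a` and `lib_b` (hypothetical names) stand in for library loggers, and `make_logger` is a hypothetical helper, not part of `transformers`.

```python
import io
import logging
import warnings


def make_logger(name: str, stream: io.StringIO) -> logging.Logger:
    """Create a logger that writes to `stream` (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(logging.StreamHandler(stream))
    logger.setLevel(logging.WARNING)
    logger.propagate = False  # keep output out of the root logger
    return logger


out_a, out_b = io.StringIO(), io.StringIO()
lib_a = make_logger("lib_a", out_a)
lib_b = make_logger("lib_b", out_b)

# Raising the level on one logger silences only that library.
lib_a.setLevel(logging.ERROR)
lib_a.warning("noisy hint from lib_a")
lib_b.warning("useful warning from lib_b")

# A blanket warnings filter, by contrast, hides warnings from every library.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore")
    warnings.warn("noisy hint from lib_a", UserWarning)
    warnings.warn("useful warning from lib_b", UserWarning)
```

Here `out_a` stays empty while `out_b` still receives its message, whereas the blanket filter drops both `warnings.warn` calls.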

There is a heuristic in #26527. I'm not quite sure whether it applies in this case.

What does that mean for developers of the library? We should respect the following heuristic:
 - `warnings` should be favored for developers of the library and libraries dependent on `transformers`
 - `logging` should be used for end-users of the library using it in every-day projects
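As a rough illustration of the heuristic quoted above, the sketch below pairs a developer-facing deprecation (via `warnings`) with an end-user runtime hint (via `logging`). The names `my_library`, `old_api`, and `run_inference` are hypothetical and only serve the example.

```python
import io
import logging
import warnings

logger = logging.getLogger("my_library")  # hypothetical library name
log_stream = io.StringIO()
logger.addHandler(logging.StreamHandler(log_stream))
logger.setLevel(logging.WARNING)
logger.propagate = False


def old_api():
    """Hypothetical deprecated function: the audience is developers."""
    warnings.warn("old_api is deprecated; use new_api", FutureWarning, stacklevel=2)
    return 42


def run_inference():
    """Hypothetical runtime hint: the audience is end users."""
    logger.warning("Running sequentially; consider batching your inputs.")
    return "done"


# Developers can record or escalate the warning in their own test suite.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api()

status = run_inference()
```

The deprecation can be caught programmatically by dependent libraries, while the runtime hint goes through the logger, where an end user can adjust its level.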

Maybe another solution is to warn just once.

Before submitting

 - This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
 - Did you read the contributor guideline, Pull Request section?
 - Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
 - Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
 - Did you write any new necessary tests?

Who can review?

@LysandreJik

amyeroberts

Thanks for adding this! Looks OK to me - suggestion to use warning_once instead

amyeroberts · 2024-03-18T21:14:53Z

src/transformers/pipelines/base.py

@@ -1164,7 +1164,7 @@ def __call__(self, inputs, *args, num_workers=None, batch_size=None, **kwargs):

        self.call_count += 1
        if self.call_count > 10 and self.framework == "pt" and self.device.type == "cuda":
-            warnings.warn(
+            logger.warning(


I'd recommend using warning_once here. This way, the warning will only display the first time it's hit, so it shouldn't spam the logs. This will hopefully discourage having to change the logging level to get it to disappear (who knows what other useful warnings you might be suppressing! :D )

Suggested change

-            logger.warning(
+            logger.warning_once(
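To make the effect of the suggestion concrete, here is a minimal sketch of a warn-once helper built on `functools.lru_cache`. It is an assumption-laden illustration, not necessarily how `logger.warning_once` is implemented in `transformers`. The logger name `warn_once_demo` is hypothetical.

```python
import functools
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("warn_once_demo")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.WARNING)
logger.propagate = False


@functools.lru_cache(maxsize=None)
def warning_once(message: str) -> None:
    """Emit `message` at WARNING level the first time it is seen.

    The cache key is the message string, so repeated calls with the same
    text are no-ops, while a different message still gets through.
    """
    logger.warning(message)


# Simulate many sequential pipeline calls hitting the same warning.
for _ in range(100):
    warning_once("You seem to be using the pipelines sequentially on GPU.")
warning_once("A different warning.")

lines = stream.getvalue().splitlines()
```

With this approach the log level stays untouched, so other warnings from the same logger remain visible.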


Use logging.warning instead of warnings.warn in pipeline.__call__

b47c9be

amyeroberts approved these changes Mar 18, 2024

Update src/transformers/pipelines/base.py

1d586d2

amyeroberts merged commit 484e10f into huggingface:main Mar 19, 2024
20 checks passed
