BatchFeature performance improvement: convert List[np.ndarray] to np.ndarray before converting to PyTorch tensors #14307
Comments
Thanks for reporting this. Could it be that PyTorch only added this warning in 1.10?
Yes, the problem is longstanding but the warning is new in 1.10. Here's the commit where it was added:
I am getting the same warning on this line with v4.16.2: presumably stemming from these lines - which look identical to those in @eladsegal's PR above:
@sgugger this warning is also triggered when using the Trainer at:
I'm using PyTorch 1.11 and Transformers v4.20.1
@sgugger I'm getting lots and lots of this warning all the time, which makes troubleshooting pretty hard. The Jupyter interface has issues because the output gets very big after a while.
Python 3.10.6
There is no use commenting on an issue that was resolved without providing a code reproducer. You should open a new issue and follow the template :-)
@sgugger but what if the issue turns out to be only partially resolved? I think my example of the lines shows that the PR potentially only fixed one occurrence of this issue but missed others? Do you think it is better to make a new issue in that case rather than re-open the original one?
You should definitely open a new one with a code sample that shows the problem: tokenizers do not return NumPy arrays but lists of token IDs, so even if the line is the same as what was impacted in this PR, it doesn't mean there is a problem to fix either.
🚀 Feature request
@NielsRogge, @sgugger
When using a FeatureExtractor for images and passing List[np.ndarray] with return_tensors="pt", the following warning is outputted:
As reported in pytorch/pytorch#13918, a significant performance improvement can be obtained by using torch.tensor on a numpy.ndarray instead of on List[numpy.ndarray].
I think a possible solution would be #14306:
transformers/src/transformers/feature_extraction_utils.py
Lines 136 to 144 in 05fed8b
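To illustrate the idea, here is a minimal sketch of the conversion described above. It is not necessarily the code in #14306: the helper name to_single_ndarray and the dummy image shapes are assumptions made for this example. A list of equally shaped arrays is stacked into one np.ndarray first, and only then passed to torch.tensor. The torch import is guarded so the NumPy part can be run on its own.

```python
import numpy as np


def to_single_ndarray(value):
    """Hypothetical helper: stack a non-empty list/tuple of np.ndarray into one array.

    Any other input (e.g. a list of token IDs) is returned unchanged.
    """
    if isinstance(value, (list, tuple)) and len(value) > 0 and isinstance(value[0], np.ndarray):
        return np.array(value)
    return value


# Dummy batch: 8 "images" of shape (channels, height, width); shapes are made up.
images = [np.zeros((3, 4, 4), dtype=np.float32) for _ in range(8)]

# Stack once in NumPy instead of handing the list straight to torch.tensor.
batch = to_single_ndarray(images)

try:
    import torch

    # torch.tensor on a single ndarray avoids the slow list-of-ndarrays path.
    pixel_values = torch.tensor(batch)
except ImportError:
    # PyTorch is not installed; only the NumPy stacking step is demonstrated.
    pixel_values = None
```

Inputs that are not lists of arrays pass through untouched, so a helper like this would leave tokenizer outputs (plain lists of token IDs) as they are.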