CUDA device-side assert is thrown unpredictably #1158
Comments
Is it possible you are running out of memory? Another possibility is to get a more descriptive error by turning off asynchronous kernel launches with CUDA_LAUNCH_BLOCKING=1, although that will be slower: https://discuss.pytorch.org/t/how-to-fix-cuda-error-device-side-assert-triggered-error/137553
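For reference, a minimal sketch of enabling this from inside Python rather than on the command line (only an assumption about how the server is launched; the variable just has to be set before any CUDA work happens):

import os

# Read when the CUDA context is created, so set it before the first CUDA call
# (safest: before importing torch). Kernel launches then run synchronously and
# the Python stack trace points at the call that actually failed.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported only after the variable is set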
What version of PyTorch are you using? What version of CUDA? We recently updated the way things are put on the GPU in an attempt to make it easier to choose the device being used. I actually kind of doubt it will help, but you never know. You'd need to install our dev branch to try it out. Is the problem repeatable in any way? If you give it the exact same input, does it crash the next time around?
The log contains this error before the random errors start to occur:
The last line is repeated about 200 times with different block and thread numbers. It is caused by passing strings that are too long to the transformers library. Could this be the cause of the Stanza error too? This is just one error caused by one illegal input, but the Stanza errors occur randomly for all inputs regardless of their size. Is it possible that this error puts the GPU into some kind of error state in which it's unable to process valid inputs?
I confirmed that the error above is the cause of the problem, so I'm closing this issue. I'm still puzzled as to how an error in one library can cause errors in another. If you have an idea how to prevent this type of problem, please tell me.
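One possible mitigation, sketched below purely as an assumption (it is not confirmed anywhere in this thread): once a device-side assert fires, the CUDA context of that process is generally unusable, so running the GPU work in a worker process confines the damage to the worker.

import multiprocessing as mp

def _translate_worker(queue, sentences):
    # Hypothetical worker: builds the pipeline and translates in its own process.
    import torch
    from transformers.pipelines import pipeline
    translator = pipeline(
        "translation_en_to_es",
        model="Helsinki-NLP/opus-mt-en-es",
        device=0 if torch.cuda.is_available() else -1,
    )
    queue.put(translator(sentences))

def translate_isolated(sentences, timeout=600):
    # "spawn" gives the worker a fresh interpreter and its own CUDA context.
    ctx = mp.get_context("spawn")
    queue = ctx.Queue()
    proc = ctx.Process(target=_translate_worker, args=(queue, sentences))
    proc.start()
    try:
        # Read before join: a worker killed by a CUDA assert never fills the
        # queue, so this times out instead of hanging forever.
        result = queue.get(timeout=timeout)
    except Exception:
        result = None
    proc.join()
    return result

Reloading the model on every call is slow, so a realistic variant would keep a long-lived worker and restart it only after a failure.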
While an error is expected if you put text that is too long for the transformer through the Stanza models built on transformers, the tokenizer is not built that way, so that's kind of strange. Thanks for the update.
I'll leave it open for now in case I have time to investigate this particular circumstance in the near future.
I'll try this week to see if I can get the bug reproduced in a minimal test program. I extracted the calls related to translation from our program; they look like this:

import traceback

import stanza
import torch
from transformers.pipelines import pipeline

source_lang = "en"
target_lang = "es"
texts = [" ".join(map(str, range(1000))), "Example text."]

nlp = stanza.Pipeline(lang=source_lang, processors="tokenize")
translator = pipeline(
    f"translation_{source_lang}_to_{target_lang}",
    model=f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}",
    device=0 if torch.cuda.is_available() else -1,
    batch_size=24,
)

for text in texts:
    try:
        sents = nlp(text).sentences
        translations = translator([sent.text for sent in sents])
        print(translations)
    except Exception:
        traceback.print_exc()

If the bug occurs, the second text will fail as well, not just the first. If I can't reproduce the bug, there might be something else in our program that causes this.
I ran the above script on the Amazon server we use, and it produces this output. Note that the first error is for translating the too-long text, but the second error is for translating a valid sentence. Somehow the error from the transformers library causes Stanza to fail on subsequent calls even when they are valid.
I tried with CUDA_LAUNCH_BLOCKING=1 too, and the stack traces were the same. I use these libraries:
I tried updating the libraries, and the error changed. The second sentence still fails even though it shouldn't, but the error is different:

$ CUDA_LAUNCH_BLOCKING=1 python /tmp/test.py
2023-01-19 13:24:20 INFO: Checking for updates to resources.json in case models have been updated. Note: this behavior can be turned off with download_method=None or download_method=DownloadMethod.REUSE_RESOURCES
Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/main/resources_1.4.1.json: 193kB [00:00, 91.9MB/s]
2023-01-19 13:24:20 INFO: Loading these models for language: en (English):
========================
| Processor | Package |
------------------------
| tokenize | combined |
========================
2023-01-19 13:24:20 INFO: Use device: gpu
2023-01-19 13:24:20 INFO: Loading: tokenize
2023-01-19 13:24:22 INFO: Done loading processors!
/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/models/marian/tokenization_marian.py:194: UserWarning: Recommended: pip install sacremoses.
warnings.warn("Recommended: pip install sacremoses.")
Token indices sequence length is longer than the specified maximum sequence length for this model (1678 > 512). Running this sequence through the model will result in indexing errors
Your input_length: 1678 is bigger than 0.9 * max_length: 512. You might consider increasing your max_length manually, e.g. translator('...', max_length=400)
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [34,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [35,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [36,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [37,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [38,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [39,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [40,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [41,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [42,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [43,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [44,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [45,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [46,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [47,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [48,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [49,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [50,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [51,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [52,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [53,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [54,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [55,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [56,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [57,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [58,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [59,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [60,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [61,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [62,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [14,0,0], thread: [63,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
File "/tmp/test.py", line 23, in <module>
translations = translator([sent.text for sent in sents])
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/pipelines/text2text_generation.py", line 366, in __call__
return super().__call__(*args, **kwargs)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/pipelines/text2text_generation.py", line 165, in __call__
result = super().__call__(*args, **kwargs)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/pipelines/base.py", line 1063, in __call__
outputs = [output for output in final_iterator]
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/pipelines/base.py", line 1063, in <listcomp>
outputs = [output for output in final_iterator]
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
item = next(self.iterator)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/pipelines/pt_utils.py", line 125, in __next__
processed = self.infer(item, **self.params)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/pipelines/base.py", line 990, in forward
model_outputs = self._forward(model_inputs, **forward_params)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/pipelines/text2text_generation.py", line 187, in _forward
output_ids = self.model.generate(**model_inputs, **generate_kwargs)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/generation/utils.py", line 1368, in generate
inputs_tensor, model_kwargs, model_input_name
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/generation/utils.py", line 601, in _prepare_encoder_decoder_kwargs_for_generation
model_kwargs["encoder_outputs"]: ModelOutput = encoder(**encoder_kwargs)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/models/marian/modeling_marian.py", line 759, in forward
attention_mask = _expand_mask(attention_mask, inputs_embeds.dtype)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/transformers/models/marian/modeling_marian.py", line 104, in _expand_mask
inverted_mask = 1.0 - expanded_mask
File "/tmp/tmp-venv/lib64/python3.7/site-packages/torch/_tensor.py", line 39, in wrapped
return f(*args, **kwargs)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/torch/_tensor.py", line 834, in __rsub__
return _C._VariableFunctions.rsub(self, other)
RuntimeError: CUDA error: device-side assert triggered
Traceback (most recent call last):
File "/tmp/test.py", line 22, in <module>
sents = nlp(text).sentences
File "/tmp/tmp-venv/lib64/python3.7/site-packages/stanza/pipeline/core.py", line 408, in __call__
return self.process(doc, processors)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/stanza/pipeline/core.py", line 397, in process
doc = process(doc)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/stanza/pipeline/tokenize_processor.py", line 91, in process
num_workers = self.config.get('num_workers', 0))
File "/tmp/tmp-venv/lib64/python3.7/site-packages/stanza/models/tokenization/utils.py", line 264, in output_predictions
pred = np.argmax(trainer.predict(batch), axis=2)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/stanza/models/tokenization/trainer.py", line 69, in predict
pred = self.model(units, features)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/stanza/models/tokenization/model.py", line 52, in forward
inp, _ = self.rnn(emb)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/tmp/tmp-venv/lib64/python3.7/site-packages/torch/nn/modules/rnn.py", line 775, in forward
self.dropout, self.training, self.bidirectional, self.batch_first)
RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR

Library versions this time:

stanza==1.4.2
tokenizers==0.13.2
torch==1.13.1
transformers==4.25.1
Thanks for investigating. One thing you can do to avoid this state is to tokenize the text with the transformer's tokenizer, then check it against the maximum length of the model, throwing out sentences that are too long.
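A minimal sketch of that check, assuming the same Marian model as the test script above (the helper name is hypothetical):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-es")
max_len = tokenizer.model_max_length  # 512 for this model, per the warning in the log

def keep_translatable(sentences):
    # tokenizer() adds special tokens by default, so the count is conservative.
    return [s for s in sentences if len(tokenizer(s)["input_ids"]) <= max_len]

# usage: translations = translator(keep_translatable([sent.text for sent in sents]))

Truncating instead of discarding is another option, depending on whether partial translations are acceptable.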
Describe the bug
I'm using Stanza to do sentence splitting and other preprocessing as part of a machine translation pipeline. At random times, my server starts throwing errors for about half of the requests. The problem vanishes after the server is restarted. The error is always the same:
Most of the time there are no errors. Since the errors happen in production and at random times, I haven't been able to reproduce or debug them properly, and I'm unsure how I should proceed.
To Reproduce
I don't know how to reproduce this, as it happens randomly.
My code is something like this (non-relevant parts redacted):
Only one stanza.Pipeline object is created by the server process.
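As a hypothetical sketch of that setup (this is not the redacted code, just the shape it describes):

import stanza

# One pipeline for the whole server process, created once at startup.
nlp = stanza.Pipeline(lang="en", processors="tokenize")

def handle_translation_request(text):
    # Every incoming request reuses the same pipeline object.
    return [sent.text for sent in nlp(text).sentences]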
Expected behavior
There should be no errors.
Environment (please complete the following information):
The server is an Amazon EC2 instance.