Matcher OOM after pathological pattern #3541
Comments
Thanks for the report! Just to double-check: did you actually run out of memory during matching or not?
Thanks for the quick answer! Just to be sure, we did not have an explicit
we just had this bad allocation error
Coming back to this issue, as we are running into a lot of allocation errors on our documents, with several different patterns. I don't know whether the error is connected to issue #3618 and the solution that was proposed in spacy extreme, but it seems to be linked either to too many potential matches in a sentence (e.g. patterns that are too complex and generate many candidate matches) or to documents simply being too long (we tend to have long documents, which we segment into shorter sentences that can still be long to parse).
Do you think there's a connection between this issue and the ones raised in #3618? We didn't have such errors before upgrading to spaCy 2.1; before that we were running spaCy 2.0.16. We get better results with the new version, but we can't use it in production because of these memory allocation errors. I wasn't able to produce another version of one of the patterns that raised the error, because of their size (we use several alternatives of very large patterns in our matchers), so please let me know if I can provide any additional information or details about the error.
Hello @ines, I don't know if there has been any progress on this issue, but it is causing errors we cannot work around in matchers for tasks that used to run smoothly. We have tried a lot of different approaches, especially with multiprocessing, to avoid these errors, but we can't find a way to resolve the problem. The odd part is that some of these errors happen on rather short sentences, so it seems strange to me that all the memory would be consumed on such short spans of text. Thanks a lot for the great work on spaCy, hope this issue is solvable :)
Hi all, I can confirm: I'm running into the same issue when using many Matchers. It seems that the issue is especially present in cases where the processed text contains German umlauts (
It can be tricky to track down these memory errors, so thanks for your patience on this.
After running the code a bit, I do think this is just a pathological pattern, unfortunately. I don't think we can really do anything about this, so I wouldn't class it as a bug: it's just the nature of the operator semantics that you'll be able to construct patterns like this.
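To give a rough sense of why stacked `*` operators can blow up, here is a small sketch in plain Python. It does not use spaCy and is not how the Matcher is implemented; it assumes a simplified model in which several consecutive `*` slots can all match the same kind of token, and it counts how many ways a run of tokens can be divided among those slots. The function names are hypothetical and only serve this illustration.

```python
from math import comb


def count_assignments(n_tokens: int, n_star_slots: int) -> int:
    """Closed form (stars and bars): number of ways to divide a run of
    n_tokens interchangeable tokens among n_star_slots consecutive `*` slots,
    where each slot may take zero or more tokens."""
    if n_star_slots == 0:
        return 1 if n_tokens == 0 else 0
    return comb(n_tokens + n_star_slots - 1, n_star_slots - 1)


def count_by_enumeration(n_tokens: int, n_star_slots: int) -> int:
    """Brute-force check: let the first slot take k tokens, then recurse
    on the remaining tokens and slots."""
    if n_star_slots == 0:
        return 1 if n_tokens == 0 else 0
    return sum(
        count_by_enumeration(n_tokens - k, n_star_slots - 1)
        for k in range(n_tokens + 1)
    )


if __name__ == "__main__":
    # The count grows quickly with both sentence length and number of `*` slots.
    for n_tokens in (10, 20):
        for n_slots in (2, 5):
            print(n_tokens, n_slots, count_assignments(n_tokens, n_slots))
```

Under this simplified model, 10 tokens spread over 5 `*` slots already give 1001 distinct assignments, and 20 tokens give 10626. A matcher that keeps every partial assignment alive at once would need memory in proportion, which is consistent with allocation errors showing up even on fairly short sentences.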
…erators (explosion#3949) * Add regression test for issue explosion#3541 * Add comment on bugfix * Remove incorrect test * Un-xfail test
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
First of all, thanks a lot for all the great new pattern matching features in version 2.1, they are amazing! However, I keep getting an error with several patterns that I had to update from version 2.0 to 2.1. Here's the description:
How to reproduce the behaviour
The following code produces an error that might be caused by the many `*` operators in the pattern (maybe because of too many combinations of matches?):
The error:
Your Environment