StackOverflowError in FEEL engine #7758
Comments
/cc @saig0
@npepinpe you can get it via zdb from the state.
@npepinpe the BPMN and the process variables are interesting for the analysis. A process variable with a huge list could cause a stack overflow error.
I have the processes used, however not the variables, as the data is not there anymore. See: https://drive.google.com/drive/folders/1VipxIaUsp6O5yBa6OdFh16Z83-WaQElH?usp=sharing
@npepinpe thanks. In the process, I see multi-instance activities with FEEL filter expressions. I can reproduce the stack overflow exception if the list contains more than 1,500 items. We would need to adjust the FEEL engine to work with larger list values.
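Roughly, a reproduction sketch (assuming the feel-scala `FeelEngine#evalExpression` API; the expression and list size are illustrative, not the exact ones from the reported processes):

```scala
import org.camunda.feel.FeelEngine

object FilterOverflowRepro extends App {
  val engine = new FeelEngine()

  // an input list larger than ~1,500 entries, as in the report above
  val items: List[Int] = (1 to 2000).toList

  // a FEEL filter expression like the ones used in the multi-instance
  // input collections; before the fix this blew the stack on long lists
  val result = engine.evalExpression("items[item > 0]", Map("items" -> items))

  println(result)
}
```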
I hadn't really thought about this, but do we generally know the impact of evaluating expressions on large collections, in terms of memory, latency, or, as we saw, a possible stack overflow 😄? Do we currently recommend anything to users regarding this in the documentation? What would be the best practice here? /cc @falko as this might be interesting for you
No. Currently, we don't have a benchmark for it 😅
No. This topic is not discussed in the docs.
While I have you here, what would be your recommendation when it comes to such things? Is there a workaround? 🙃
A workaround would be smaller lists, i.e. splitting the list into smaller pieces. In combination with a process, we could use a multi-instance activity to split the list into pieces (e.g., using the sublist() function).
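A sketch of the idea (the chunk size of 100 is arbitrary, and for brevity the expression assumes the list length divides evenly into chunks):

```scala
import org.camunda.feel.FeelEngine

object ChunkedList extends App {
  val engine = new FeelEngine()
  val items = (1 to 2000).toList

  // FEEL expression that splits `items` into chunks of 100 using the
  // built-in sublist(list, start, length) function (1-based start index);
  // each chunk could then drive one multi-instance iteration
  val chunks = engine.evalExpression(
    "for i in 1..ceiling(count(items) / 100) return sublist(items, (i - 1) * 100 + 1, 100)",
    Map("items" -> items)
  )

  println(chunks)
}
```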
Lists should have been limited to 100 entries. I'll double-check if larger lists have been tested.
I've snapshotted the data and added it to the folder for the faulty partitions where the error occurred. EDIT: my assumption is that, with the data, you can easily reproduce it locally and find the variables/process combination triggering this. Let me know if you need anything else.
I got confirmation that the issue occurred again after playing with 500+ lines.
I created an issue in the FEEL engine to address the problem: camunda/feel-scala#350
The issue is fixed in the FEEL engine. We need to build a new release of the FEEL engine and include it.
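For illustration only (this is not necessarily the actual patch in camunda/feel-scala#350): the failure class here is plain head-recursion over a list, which costs one stack frame per element, while a tail-recursive rewrite runs in constant stack space:

```scala
import scala.annotation.tailrec

object RecursionStyles {
  // naive recursion: one stack frame per element; a list of a few
  // thousand entries can overflow the default JVM thread stack
  def filterNaive[A](list: List[A])(p: A => Boolean): List[A] = list match {
    case Nil => Nil
    case head :: tail =>
      if (p(head)) head :: filterNaive(tail)(p) else filterNaive(tail)(p)
  }

  // tail-recursive rewrite: constant stack depth for any list length
  def filterSafe[A](list: List[A])(p: A => Boolean): List[A] = {
    @tailrec
    def loop(rest: List[A], acc: List[A]): List[A] = rest match {
      case Nil          => acc.reverse
      case head :: tail => loop(tail, if (p(head)) head :: acc else acc)
    }
    loop(list, Nil)
  }
}
```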
Let's start the release train 🚈 shuushuu
Yes 😅 I'll build a new FEEL release as soon as possible. But I need to merge some PRs first.
Describe the bug
We ran into an issue while load testing where every once in a while we'd get a StackOverflowError. It seems once this happens, the partition becomes unhealthy, but no leader change occurs since only the StreamProcessor is unhealthy, so processing effectively stops.
Logs
To Reproduce
I unfortunately don't have the processes yet, as curator was configured too aggressively and we couldn't pull them from there. I'll try to get them soon.
Expected behavior
No StackOverflowError - although it made me wonder: if it's expected that some `Error`s are thrown as well when executing the engine, should consumers be catching them? We typically only catch `Exception`/`RuntimeException`, but if it's expected that we catch `Error` as well (or some subset), please let us know.
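For reference, a sketch of what catching the `Error` defensively would look like on the JVM (again assuming the feel-scala API as above; whether doing this is advisable is exactly the open question here):

```scala
import org.camunda.feel.FeelEngine
import scala.util.control.NonFatal

object SafeEval extends App {
  val engine = new FeelEngine()
  val items = (1 to 5000).toList

  try {
    engine.evalExpression("items[item > 0]", Map("items" -> items))
  } catch {
    // StackOverflowError extends Error, not Exception, so neither
    // NonFatal(_) nor Java's `catch (Exception e)` will match it
    case e: StackOverflowError => println(s"Error escaped the engine: $e")
    case NonFatal(e)           => println(s"ordinary exception: $e")
  }
}
```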
Environment