Error "ShardCollectContext for {0,1,2} already added" in low-memory situations #15518
Comments
On the repository where we originally observed and reported the problem ..., we have now simply increased the heap size assigned to CrateDB on the CI runner, and we believe the error will go away without further ado.
Thank you for providing steps to reproduce, @amotl. I can reproduce this locally and confirm that the cause is a duplicate
Thank you. So, do you think the corresponding patch you are preparing may already resolve this problem?
I worked out that PR simply by reading the code (without a reproducible scenario). I would still like to clarify my finding, but that will come after producing a fix for this.
Thank you for reporting, @amotl. The fix will be available with the next hotfix release.
CrateDB version
latest/nightly
NB: The issue exists for a longer time already, i.e. it has not been introduced recently.
CrateDB setup information
In this context, a "low-memory situation" is created by loading a sufficient volume of data while running with a 512 MB heap size, per the default configuration.
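To confirm which heap size a running node actually uses, the sys.nodes table can be queried. The following is a minimal sketch, not part of the original report: the function names and the threshold are hypothetical, and the cursor is assumed to be any DB-API style cursor, such as one from the CrateDB Python driver.

```python
from typing import Any, List, Tuple

# sys.nodes reports heap figures in bytes.
HEAP_QUERY = "SELECT name, heap['max'], heap['used'] FROM sys.nodes ORDER BY name"

MIB = 1024 * 1024


def heap_report(cursor: Any) -> List[Tuple[str, float, float]]:
    """Return (node name, max heap in MiB, used heap in MiB) for each node."""
    cursor.execute(HEAP_QUERY)
    return [
        (name, heap_max / MIB, heap_used / MIB)
        for name, heap_max, heap_used in cursor.fetchall()
    ]


def nodes_below(
    report: List[Tuple[str, float, float]], threshold_mib: float = 1024.0
) -> List[str]:
    """List nodes whose max heap is below a (hypothetical) threshold in MiB."""
    return [name for name, heap_max, _ in report if heap_max < threshold_mib]
```

With the CrateDB Python driver installed, a cursor obtained from crate.client.connect("http://localhost:4200").cursor() could be passed to heap_report; the URL is an assumption about a local setup.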
Problem description
Description
We discovered an edge case where CrateDB may not be able to detect a low-memory situation through the corresponding circuit breaker mechanics. It then responds with an error message that does not make clear where the problem originates.
Example
Reference
ShardCollectContext for {0,2} already added
mlflow-cratedb#53
Observations
The problem happens when loading a reasonable amount of data into the table metrics, quickly succeeded by a DELETE FROM metrics operation.
Steps to Reproduce
We tried to isolate the problem by means of a corresponding self-contained Python program, shared as cratedb_heap_exchaust_weird_error.py, but failed. [1]
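For orientation, the load-then-delete pattern described under Observations can be sketched as follows. This is not the shared program: the column names, row counts, and function names are hypothetical, and the cursor is assumed to be a DB-API style cursor using the qmark parameter style. Per the report, a program of this shape did not trigger the ShardCollectContext error.

```python
from typing import Any, Iterator, List, Tuple

# Hypothetical columns; the real schema of the metrics table is not shown in the report.
INSERT_SQL = "INSERT INTO metrics (metric_name, metric_value, step) VALUES (?, ?, ?)"
DELETE_SQL = "DELETE FROM metrics"

Row = Tuple[str, float, int]


def make_rows(count: int) -> List[Row]:
    """Generate synthetic metric rows."""
    return [(f"metric-{i % 10}", float(i), i) for i in range(count)]


def chunked(rows: List[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_then_delete(cursor: Any, row_count: int = 100_000, batch_size: int = 10_000) -> int:
    """Bulk-insert rows in batches, then immediately delete everything.

    Returns the number of insert batches that were sent.
    """
    batches = 0
    for batch in chunked(make_rows(row_count), batch_size):
        cursor.executemany(INSERT_SQL, batch)
        batches += 1
    # The delete follows the load without any pause, as in the observation.
    cursor.execute(DELETE_SQL)
    return batches
```

Passing the cursor in keeps the sketch independent of a running server; the defaults for row_count and batch_size are placeholders and would need tuning against the configured heap size.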
What does work well to reproduce the error is simply running two test cases of the MLflow adapter for CrateDB. Here is a quick walkthrough:
Run CrateDB with low heap size
Set up development sandbox
Invoke test cases
Actual Result
The CrateDB Python driver raises an exception like:
Expected Result
The CrateDB Python driver responds with an error message that better indicates the problem, like OutOfMemoryError[Java heap space], or with other exceptions like the CircuitBreaker types, which also lead the user more easily to the right root cause: the solution is simply to add memory.
Footnotes
[1] By using that program, which intends to emulate MLflow test case behaviour, we have only been able to trip sound error responses like OutOfMemoryError[Java heap space] from CrateDB, some of them even crashing the process, and some of them tripped by the circuit breaker operating correctly, as observed in the CrateDB log output.