[Bug]: System continues to create empty segments after upgrade to 2.4 #33646
Comments
I guess the empty segments actually are not "empty"; they are compacted segments pending indexing. In Milvus 2.3.12 the default maxSize of a segment is 512MB, while in Milvus 2.4.4 it changed to 1GB, so after the upgrade Milvus is trying to compact the segments up to the new maxSize. If you check the compaction tasks or indexing tasks, I believe you will see the new tasks. /assign @Archalbc
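For reference, the segment size cap discussed above is controlled by `dataCoord.segment.maxSize` in the Milvus configuration (value in MB). A sketch of the relevant fragment, assuming the stock 2.4 default; verify the exact key path against the values.yaml of your deployed chart version:

```yaml
# milvus.yaml (or the equivalent override in a Helm values.yaml)
dataCoord:
  segment:
    maxSize: 1024   # MB; 2.4 default. 2.3.x shipped 512; the reporter overrides this to 2048.
```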
Hey, sorry, I completely forgot to say that we have already had maxSegmentSize at 2GB since 2.3.12, as our querynodes have 32GB. (I'm updating the initial post)
Hello, here is my values.yaml for the milvus configuration:
vector dimension: 256
Because each time you flush (or auto-flush happens), your segment count grows by 32, and after compaction the number decreases.
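To make the arithmetic concrete, here is a toy model (not Milvus internals) of why the segment count jumps on every flush and falls again after compaction, assuming each flush seals 32 new segments (mirroring the per-flush growth mentioned above; the constant is illustrative, not a real Milvus parameter) and compaction later removes a net number of small sealed segments by merging them:

```python
# Toy model of segment-count dynamics: each flush seals a batch of
# segments; compaction later merges small sealed segments, reducing
# the total count. Purely illustrative.

SEGMENTS_PER_FLUSH = 32  # per-flush growth observed in the thread

def segments_after(flushes: int, removed_by_compaction: int = 0) -> int:
    """Sealed-segment count after `flushes` flushes, minus the net
    number of segments compaction has removed so far."""
    return flushes * SEGMENTS_PER_FLUSH - removed_by_compaction

# Three auto-flushes with no compaction yet:
print(segments_after(3))        # -> 96
# After compaction has removed a net 90 segments by merging them:
print(segments_after(3, 90))    # -> 6
```

The point is only that a sawtooth segment-count graph is normal; segments that are never cleaned up, as reported here, are not.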
We don't call flush in the code, we let the system auto-flush. |
If you only have one collection and one partition, there seems to be no reason you would see 32 growing segments. Did you do an import or insert? Can you provide full logs for debugging?
I think we need logs, and also information about what operation you performed when you saw a segment with 0 entities.
I guess this might not be related to the SDK version, but we need more clues about why the log is created.
@Archalbc I did not reproduce this issue in my Milvus, but I can try a few more times. If it reproduces for you, could you please attach an etcd backup for investigation? See https://github.com/milvus-io/birdwatcher for details on how to back up etcd with birdwatcher.
Hi,
Sorry for the late answer! We're not using bulk insert; we upsert each time.
Upsert seems to be the issue, because each upsert also causes one delete. The segment-number increase might also be related to L0 delete segments, but this might be as expected and won't affect search performance.
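To illustrate the point about upserts (a toy model, not Milvus internals): an upsert on an existing primary key is effectively a delete of the old row plus an insert of the new one, so a steady upsert workload keeps generating delete records, which Milvus 2.4 buffers (roughly, in L0 delete segments) until compaction applies them:

```python
# Toy model: upsert = delete(old row, if it exists) + insert(new row).
# `deletes` counts buffered delete records that a real system would
# hold until compaction applies them.

class ToyCollection:
    def __init__(self):
        self.rows = {}      # primary key -> payload
        self.deletes = 0    # buffered delete records awaiting compaction

    def upsert(self, pk, value):
        if pk in self.rows:      # existing key: the old version must be deleted
            self.deletes += 1
        self.rows[pk] = value

    def compact(self):
        """Compaction applies the buffered deletes and clears the backlog."""
        self.deletes = 0

c = ToyCollection()
for _ in range(3):               # upsert the same key three times
    c.upsert("doc-1", object())
print(len(c.rows), c.deletes)    # -> 1 2  (one live row, two buffered deletes)
c.compact()
print(c.deletes)                 # -> 0
```

This is why an upsert-heavy workload can show a growing number of (delete-only) segments between compactions even though the live row count stays flat.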
I can understand that upsert operations are somewhat tricky, but I don't understand why the cluster's behavior changed. Everything looked fine on 2.3.12, and I doubt the cluster will be OK with an ever-growing number of segments (you can see in the graph that it never stops creating segments, and they are never cleaned up). How can I securely provide you the etcd backup without exposing it here?
please send it to my mail: yanliang.qiao@zilliz.com |
Is there an existing issue for this?
Environment
Current Behavior
After upgrading our Milvus cluster from 2.3.12 to 2.4.4, the system started to continuously create empty segments in all our collections.
[Screenshots: metrics graphs showing the segment count growing continuously after the upgrade]
This bug should be fixed since 2.3.15 in: #32553.
We decided to downgrade to 2.3.17, but unfortunately we hit another issue there: our index versions had been upgraded to "4", and querynodes on 2.3.17 were not able to load them.
Sounds related to: #33242
Timeline:
The client is using Go SDK 2.3.2; we don't think that's the issue (?).
Schema:
[Screenshot: collection schema]
Expected Behavior
Steps To Reproduce
Milvus Log
Sorry, I don't have logs anymore since we downgraded everything.
Anything else?
No response