[Bug]: Duplicate data is automatically added #34121
Comments
The title and description of this issue contain Chinese. Please use English to describe your issue. |
Could you just show your code? |
My guess is that you have multiple processes inserting into the same collection without being aware of it. |
@SunilWang Was this Milvus instance newly created? Could you please refer to this doc to export the complete Milvus logs for investigation? /assign @SunilWang |
@yanliang567 I used the visualization tool zilliz/attu to create the Collection |
@xiaofan-luan The amount of data keeps increasing even after my program has stopped. Here is the code:
async function handleOne(task: AigcTaskList) {
  try {
    const generateImgs = task.generateImgs
    const rowData = []
    for (let i = 0; i < generateImgs.length; i++) {
      const imgUrl = generateImgs[i]
      // task.generateVideos
      // Check whether a row with this image URL already exists.
      const { data } = await milvusClient.query({
        collection_name: collection_name,
        filter: `imgUrl == "${imgUrl}"`,
        output_fields: ['id', 'vector', 'taskId', 'taskIndex', 'imgUrl', 'video'],
        // output_fields: ['id', 'vector', 'imgUrl'],
      })
      // Skip images that are already stored.
      if (data.length > 0) {
        continue
      }
      const vector = await getImageVector(imgUrl)
      // Insert one row per image; the id is derived from the task id and the image index.
      const res = await milvusClient.insert({
        collection_name: collection_name,
        data: [{
          id: `${task.id}_${i}`,
          taskId: task.id,
          vector,
          imgUrl,
          taskIndex: i,
          video: '',
        }],
      })
    }
  } catch (error) {
    console.log(error)
  }
} |
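The query-then-insert pattern in handleOne is a check followed by an act, so two overlapping calls for the same imgUrl could both pass the check before either insert lands. Below is a minimal sketch of an in-process guard, assuming everything runs in a single Node process. UrlClaims is a hypothetical helper, not part of the Milvus SDK, and it would not help if separate processes write to the same collection.

```typescript
// Hypothetical helper (not part of the Milvus SDK): remembers which image URLs
// have already been claimed by some call in this process.
class UrlClaims {
  private readonly claimed = new Set<string>()

  // Returns true only for the first caller that claims this URL.
  tryClaim(imgUrl: string): boolean {
    if (this.claimed.has(imgUrl)) {
      return false
    }
    this.claimed.add(imgUrl)
    return true
  }

  // Releases a claim, e.g. when the insert failed and should be retried later.
  release(imgUrl: string): void {
    this.claimed.delete(imgUrl)
  }
}

// Possible use inside the loop of handleOne (sketch only):
//   if (!claims.tryClaim(imgUrl)) continue
//   try { /* query, getImageVector, insert */ } catch (e) { claims.release(imgUrl); throw e }
```

Because tryClaim runs synchronously, no other call in the same event loop can claim the same URL between the check and the add.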
How did you deploy Milvus? Please try to export and share the logs. |
Is it possible that my consistency level has something to do with it? |
In the duplicated rows, the vector values differ. The field type is Float Vector (512). Example vector value: [-0.11212158203125,-0.408447265625,-0.257568359375,0.2296142578125,0.04913330078125,0.2054443359375,0.60546875,0.1519775390625,0.42236328125,-0.1748046875,0.418212890625,-0.325439453125,0.125732421875, ...] Why does inserting one row produce multiple rows with different vectors? |
@SunilWang Could you please update the log level to debug, reproduce the issue, and export the full Milvus logs as commented above? |
@yanliang567 How can I get the log to you? |
If it is big, you could share it on a cloud drive, or send it to me (yanliang.qiao@zilliz.com). |
# Download the installation script
$ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh
# Start the Docker container
$ bash standalone_embed.sh start
I am deploying with this script. How do I change the log level? |
Use docker logs to export the logs to a file. |
The log has been sent to your email address. Please check. |
You can use the following RESTful request to change the log level to debug,
then verify that the log level has changed:
|
The current log level is info by default. Do you need me to change it? |
Unless there are more clues, I don't think this is a Milvus issue. Please investigate on your own. |
Again, this is not related to Milvus or to the way Milvus is deployed. |
I printed console logs, and I can confirm that handleOne saved each vector only once. |
You can add a log statement on each line.
I guess you just need to have some patience and debug. There is no magic here. |
256 rows is a very small amount of data, and it did not even trigger compaction. |
milvusClient.insert is indeed only called 256 times. I have sent the complete log to (yanliang.qiao@zilliz.com) |
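One way to back a claim like "insert was called 256 times" with evidence is to wrap the call in a counter. A minimal sketch follows; withCallCounter and its log parameter are hypothetical names introduced here for illustration, and the wrapper works for any function, not only milvusClient.insert.

```typescript
// Hypothetical helper: wraps any function so that every call is counted and logged.
function withCallCounter<A extends unknown[], R>(
  label: string,
  fn: (...args: A) => R,
  log: (msg: string) => void,
): { wrapped: (...args: A) => R; getCalls: () => number } {
  let calls = 0
  const wrapped = (...args: A): R => {
    calls += 1
    log(`${label} call #${calls}`)
    return fn(...args)
  }
  return { wrapped, getCalls: () => calls }
}

// Possible use with the client from the thread (sketch only):
//   const counted = withCallCounter('insert', (req: InsertRequest) => milvusClient.insert(req), console.log)
//   await counted.wrapped({ collection_name, data: [row] })
//   console.log(counted.getCalls())
// InsertRequest stands for whatever request type the client expects.
```

Comparing getCalls() with the row count reported for the collection shows whether the extra rows come from this code path or from somewhere else.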
@SunilWang let's double confirm a few things:
|
1. We can see the data growth through the visualization tool Attu. |
I did not find the collection name testMaLiangVault, and I am not able to reproduce the issue in house. |
Thank you very much for your technical support during this time. My approach is to clean up duplicate data with a scheduled task, which temporarily solves the problem for now. You can close the current issue. |
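The thread does not show how the scheduled cleanup works. As a rough sketch of one possible selection step, the function below picks which rows to delete given rows that were already queried from the collection. It assumes each row has a distinct id; if the duplicates share the same primary key, deleting by id would remove every copy, so this would not apply as-is. Row and findDuplicateIds are hypothetical names.

```typescript
// Hypothetical row shape: only the fields needed to detect duplicates.
interface Row {
  id: string
  imgUrl: string
}

// Keeps the first row seen for each imgUrl and returns the ids of all later
// rows with the same imgUrl, i.e. the candidates for deletion.
function findDuplicateIds(rows: Row[]): string[] {
  const seenUrls = new Set<string>()
  const duplicateIds: string[] = []
  for (const row of rows) {
    if (seenUrls.has(row.imgUrl)) {
      duplicateIds.push(row.id)
    } else {
      seenUrls.add(row.imgUrl)
    }
  }
  return duplicateIds
}
```

The returned ids could then be passed to whatever delete call the cleanup task uses.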
Is there an existing issue for this?
Environment
Current Behavior
Two things can happen:
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response