[Bug]: Duplicate data is automatically added #34121

Closed
1 task done
SunilWang opened this issue Jun 25, 2024 · 33 comments
Labels
kind/bug: Issues or changes related to a bug
triage/needs-information: Indicates an issue needs more information in order to work on it.

Comments

@SunilWang

SunilWang commented Jun 25, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:2.4.5
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2): "@zilliz/milvus2-sdk-node": "^2.4.3",
- OS(Ubuntu or CentOS): CentOS
- CPU/Memory: 80 CPU cores, 128 GB
- GPU: 3090
- Others:

Current Behavior

I see two problems:

  1. After I inserted 256 rows with the Node.js SDK, deleted all of them, and then inserted the data again, the collection contained many duplicate rows.
  2. Even stranger, I created a Collection with the same schema as others in the same database, and as soon as it was created, new data kept appearing, even though I was not inserting anything.

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

@SunilWang added the kind/bug and needs-triage labels Jun 25, 2024
Contributor

The title and description of this issue contains Chinese. Please use English to describe your issue.

@SunilWang changed the title from [Bug]: 会自动增加重复数据 to [Bug]: Duplicate data is automatically added Jun 25, 2024
@xiaofan-luan
Contributor

Could you show your code?

@xiaofan-luan
Contributor

My guess is that you have multiple processes inserting into the same collection without being aware of it.

@yanliang567
Contributor

@SunilWang Was this Milvus instance newly created? Could you please refer to this doc to export the full Milvus logs for investigation?
For Milvus installed with docker-compose, you can use docker-compose logs > milvus.log to export the logs.
Sharing the names of the affected collections would also help us investigate the issue.

/assign @SunilWang

@yanliang567 added the triage/needs-information label and removed the needs-triage label Jun 25, 2024
@SunilWang
Author

@yanliang567 I used the visualization tool zilliz/attu to create the Collection

@SunilWang
Author

@xiaofan-luan Even after my program has stopped, the amount of data continues to increase. Here is my code:

async function handleOne(task: AigcTaskList) {
    try {
        const generateImgs = task.generateImgs
        const rowData = []

        for (let i = 0; i < generateImgs.length; i++) {
            const imgUrl = generateImgs[i]
            // task.generateVideos

            const { data } = await milvusClient.query({
                collection_name: collection_name,
                filter: `imgUrl == "${imgUrl}"`,
                output_fields: ['id', 'vector', 'taskId', 'taskIndex', 'imgUrl', 'video'],
                // output_fields: ['id', 'vector', 'imgUrl'],
            })

            if(data.length > 0){
                continue
            }

            const vector = await getImageVector(imgUrl)

            const res = await milvusClient.insert({
                collection_name: collection_name,
                data: [{
                    id: `${task.id}_${i}`,
                    taskId: task.id,
                    vector,
                    imgUrl,
                    taskIndex: i,
                    video: '',
                }],
            })
        }
    }catch (error){
        console.log(error)
    }
}

@yanliang567
Contributor

> @yanliang567 I used the visualization tool zilliz/attu to create the Collection

How did you deploy Milvus? Please try to export and share the logs.

@SunilWang
Author

@yanliang567

[2024/06/25 03:27:21.021 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:27:21.021 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=e33a36544b85782d25206d2be5357caf] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700520451211269}]"]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450612404343019873] [indexName=] [timestamp=0]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545230] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545314] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450612404343020293] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450612404343019873] [indexName=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450697823444118235] [indexName=] [timestamp=0]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118802] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118430] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450697823444118235] [indexName=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450612404344463315] [indexName=] [timestamp=0]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404344463315] [indexID=450612404344463454] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:21.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=4d401cc8462a3ad49631cab1f73da7c9] [collectionID=450612404344463315] [indexName=]
[2024/06/25 03:27:23.177 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [traceID=abea55c8e6f7dc601c5f379ade63f0c5] [collectionID=450697823444118235] [partitionID=450697823444118236] [channelName=by-dev-rootcoord-dml_2_450697823444118235v0] [count=1] ["segment level"=Legacy]
[2024/06/25 03:27:23.177 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:27:23.177 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=abea55c8e6f7dc601c5f379ade63f0c5] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700521014820870}]"]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450612404343019873] [indexName=] [timestamp=0]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450612404343020293] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545230] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545314] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450612404343019873] [indexName=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450612404344463315] [indexName=] [timestamp=0]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404344463315] [indexID=450612404344463454] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450612404344463315] [indexName=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450697823444118235] [indexName=] [timestamp=0]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118430] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118802] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:27:24.734 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=762341feeb3375dec64e4d4c95073c73] [collectionID=450697823444118235] [indexName=]
[2024/06/25 03:27:25.432 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [traceID=7ff69fb866f2be9e53a802c978dc51ee] [collectionID=450697823444118235] [partitionID=450697823444118236] [channelName=by-dev-rootcoord-dml_2_450697823444118235v0] [count=1] ["segment level"=Legacy]
[2024/06/25 03:27:25.433 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:27:25.433 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=7ff69fb866f2be9e53a802c978dc51ee] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700521604907013}]"]
[2024/06/25 03:27:25.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=7e46127d7b578fa4d7954505cc328808] [collectionID=450612404344463315]
[2024/06/25 03:27:25.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=7ce0a70d1d9bb19cd2b49a8e24a364cc] [collectionID=450612404344463315]
[2024/06/25 03:27:25.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=89fd0cd6710bae455952a5f77f2c83f6] [collectionID=450697823444118235]
[2024/06/25 03:27:25.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=1cd35e0c15d761d84d74fb05f388d323] [collectionID=450612404343019873]
[2024/06/25 03:27:25.924 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=6942d0e9237537ab322c870bdcca769e] [collectionID=450612404344463315] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286035924679317]
[2024/06/25 03:27:25.924 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286025925894218] [newVersion=1719286035924679317] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [observers/target_observer.go:493] ["observer trigger update current target"] [collectionID=450612404344463315]
[2024/06/25 03:27:25.925 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=d3d725d6ab3d842e9286c95808cd2457] [collectionID=450612404343019873] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286035924743885]
[2024/06/25 03:27:25.925 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286025925844734] [newVersion=1719286035924743885] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=640f7187fefe819280a49eec08ea8f5c] [collectionID=450612404344463315] [partitionIDs="[]"]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450612404344463315] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [posTs=450700517082660866] [posTime=2024/06/25 03:27:10.131 +00:00]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=640f7187fefe819280a49eec08ea8f5c] [collectionID=450612404344463315] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=718809d78e1a97f8580458949f7c53c8] [collectionID=450697823444118235] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286035924849768]
[2024/06/25 03:27:25.925 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286025925918709] [newVersion=1719286035924849768] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=58469440cfb6716568516a0a1cf71a52] [collectionID=450612404343019873] [partitionIDs="[]"]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450612404343019873] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [posTs=450700517082660866] [posTime=2024/06/25 03:27:10.131 +00:00]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=58469440cfb6716568516a0a1cf71a52] [collectionID=450612404343019873] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=b98a71d4f3b8cb61dddce3c4dd7f37ef] [collectionID=450697823444118235]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=c26d22bc65eba4708c1b859df1ee3e36] [collectionID=450697823444118235] [partitionIDs="[]"]
[2024/06/25 03:27:25.925 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450697823444118235] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:27:25.926 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [posTs=450700484668817410] [posTime=2024/06/25 03:25:06.482 +00:00]
[2024/06/25 03:27:25.926 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=c26d22bc65eba4708c1b859df1ee3e36] [collectionID=450697823444118235] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:27:25.927 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=0778caed8f23fea228e6759891e72fb0] [collectionID=450612404343019873]
[2024/06/25 03:27:27.723 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [traceID=3f8ed8088194cae1be8a957f88f50e93] [collectionID=450697823444118235] [partitionID=450697823444118236] [channelName=by-dev-rootcoord-dml_2_450697823444118235v0] [count=1] ["segment level"=Legacy]
[2024/06/25 03:27:27.724 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:27:27.724 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=3f8ed8088194cae1be8a957f88f50e93] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700522207576069}]"]

@SunilWang
Author

[2024/06/25 03:29:04.703 +00:00] [INFO] [datacoord/meta.go:1131] ["meta update: add allocation - complete"] [segmentID=450697823444119330]
[2024/06/25 03:29:04.703 +00:00] [INFO] [datacoord/services.go:226] ["success to assign segments"] [traceID=cb73a5b4728f0032095060028d6d5778] [collectionID=450697823444118235] [assignments="[{\"SegmentID\":450697823444119330,\"NumOfRows\":1,\"ExpireTime\":450700547635806213}]"]
[2024/06/25 03:29:05.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=5413d7cc64f959bc496467442fc9f85c] [collectionID=450612404344463315]
[2024/06/25 03:29:05.924 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=2525a6ac820def73cdc3e336a10933d8] [collectionID=450612404344463315]
[2024/06/25 03:29:05.925 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=094b5eaffc38e26fbc1c4c9f19b89fc6] [collectionID=450612404344463315] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286135925850912]
[2024/06/25 03:29:05.925 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286125926009796] [newVersion=1719286135925850912] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:29:05.925 +00:00] [INFO] [observers/target_observer.go:493] ["observer trigger update current target"] [collectionID=450612404344463315]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=ccc5c6bb41d15870190ce77b6e1930ca] [collectionID=450697823444118235]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=10b08654249cc3493eb1136b6b16ad9f] [collectionID=450612404343019873]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=448e9aa90174d59f4949f19701b46d05] [collectionID=450612404344463315] [partitionIDs="[]"]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450612404344463315] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] [posTs=450700532837515266] [posTime=2024/06/25 03:28:10.231 +00:00]
[2024/06/25 03:29:05.925 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=448e9aa90174d59f4949f19701b46d05] [collectionID=450612404344463315] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_1_450612404344463315v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=96010468c384314b47a3b627c39590ef] [collectionID=450697823444118235] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286135925395773]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=0b23b9e2e46066465620c92406651691] [collectionID=450697823444118235]
[2024/06/25 03:29:05.926 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286125926088090] [newVersion=1719286135925395773] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [querynodev2/services.go:1306] ["sync action"] [traceID=7dd0718ff12534065923ef3648403371] [collectionID=450612404343019873] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [currentNodeID=5] [Action=UpdateVersion] [TargetVersion=1719286135925790361]
[2024/06/25 03:29:05.926 +00:00] [INFO] [delegator/distribution.go:299] ["Update readable segment version"] [oldVersion=1719286125925845758] [newVersion=1719286135925790361] [growingSegmentNum=1] [sealedSegmentNum=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=1aa3ec0ad2c5fcec5fe69448f89ebb9d] [collectionID=450612404343019873] [partitionIDs="[]"]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450612404343019873] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] [posTs=450700532837515266] [posTime=2024/06/25 03:28:10.231 +00:00]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/services.go:820] ["get recovery info request received"] [traceID=1f2d93777c8f72df2d699d78b3ff1ea6] [collectionID=450697823444118235] [partitionIDs="[]"]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=1aa3ec0ad2c5fcec5fe69448f89ebb9d] [collectionID=450612404343019873] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_0_450612404343019873v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/handler.go:117] [GetQueryVChanPositions] [collectionID=450697823444118235] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [numOfSegments=4] ["indexed segment"=3]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/handler.go:302] ["channel seek position set from channel checkpoint meta"] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] [posTs=450700484668817410] [posTime=2024/06/25 03:25:06.482 +00:00]
[2024/06/25 03:29:05.926 +00:00] [INFO] [datacoord/services.go:835] ["datacoord append channelInfo in GetRecoveryInfo"] [traceID=1f2d93777c8f72df2d699d78b3ff1ea6] [collectionID=450697823444118235] [partitionIDs="[]"] [channel=by-dev-rootcoord-dml_2_450697823444118235v0] ["# of unflushed segments"=1] ["# of flushed segments"=0] ["# of dropped segments"=0] ["# of indexed segments"=0] ["# of l0 segments"=3]
[2024/06/25 03:29:05.927 +00:00] [INFO] [datacoord/index_service.go:924] ["List index success"] [traceID=630c603f6a4b0cbe80753c2edcdcc5ff] [collectionID=450612404343019873]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450612404343019873] [indexName=] [timestamp=0]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450612404343020293] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545230] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404343019873] [indexID=450674857463545314] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450612404343019873] [indexName=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450697823444118235] [indexName=] [timestamp=0]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118430] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450697823444118235] [indexID=450697823444118802] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450697823444118235] [indexName=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:682] ["receive DescribeIndex request"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450612404344463315] [indexName=] [timestamp=0]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:603] ["completeIndexInfo success"] [collectionID=450612404344463315] [indexID=450612404344463454] [totalRows=0] [indexRows=0] [pendingIndexRows=0] [state=Finished] [failReason=]
[2024/06/25 03:29:06.733 +00:00] [INFO] [datacoord/index_service.go:730] ["DescribeIndex success"] [traceID=41bcb7e728e84f7be0c8e80a0e365be7] [collectionID=450612404344463315] [indexName=]
[2024/06/25 03:29:06.949 +00:00] [INFO] [datacoord/services.go:197] ["handle assign segment request"] [traceID=adf7ff55d3cde56f2acc1624bcd7724f] [collectionID=450697823444118235] [partitionID=450697823444118236] [channelName=by-dev-rootcoord-dml_2_450697823444118235v0] [count=1] ["segment level"=Legacy]

@SunilWang
Author

Is it possible that my consistency level has something to do with it?
I am using the default: Bounded

@SunilWang
Author

I found that the vector values differ between the duplicate rows.

My type is: Float Vector (512)

Vector value: [-0.11212158203125,-0.408447265625,-0.257568359375,0.2296142578125,0.04913330078125,0.2054443359375,0.60546875,0.1519775390625,0.42236328125,-0.1748046875,0.418212890625,-0.325439453125,0.125732421875, ...]

Why does inserting data once produce multiple rows with different vectors?

@SunilWang
Author

SunilWang commented Jun 25, 2024

image

I can confirm that I inserted only one piece of data

@SunilWang
Author

image

@yanliang567
Contributor

@SunilWang Could you please set the log level to debug, reproduce the issue, and export the full Milvus logs as described above?

@SunilWang
Author

@yanliang567 How can I get the log to you?

@yanliang567
Contributor

If it is large, you could share it via a cloud drive, or send it to me (yanliang.qiao@zilliz.com).

@SunilWang
Author

# Download the installation script
$ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh

# Start the Docker container
$ bash standalone_embed.sh start

I am deploying by script, how do I change the log level?

@yanliang567
Contributor

> # Download the installation script
> $ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh
>
> # Start the Docker container
> $ bash standalone_embed.sh start
>
> I am deploying by script, how do I change the log level?

Use docker logs to export the logs to a file.

@SunilWang
Author

> > # Download the installation script
> > $ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh
> >
> > # Start the Docker container
> > $ bash standalone_embed.sh start
> >
> > I am deploying by script, how do I change the log level?
>
> Use docker logs to export the logs to a file.

The log has been sent to your email address. Please check.

@LoveEachDay
Contributor

You can use the following RESTful request to change the log level to info:

curl -X PUT -H "Content-Type: application/json" localhost:9091/log/level -d '{"level": "info"}'

Then verify that the log level changed:

curl -i http://localhost:9091/log/level

@SunilWang
Author

> You can use the following RESTful request to change the log level to info:
>
> curl -X PUT -H "Content-Type: application/json" localhost:9091/log/level -d '{"level": "info"}'
>
> Then verify that the log level changed:
>
> curl -i http://localhost:9091/log/level

The log level is already info by default. Do I need to change it?

@xiaofan-luan
Contributor

> image
>
> I can confirm that I inserted only one piece of data

Both the IDs and the vectors are different. I do believe you will need to check your deployment.

@xiaofan-luan
Contributor

Unless there are more clues, I don't think this is a Milvus issue. Please investigate it on your own.
You can enable the access log, where you should see multiple insert requests. The monitoring system should show the same thing.

@SunilWang
Author

> > image
> > I can confirm that I inserted only one piece of data
>
> Both the IDs and the vectors are different. I do believe you will need to check your deployment.

# Download the installation script
$ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh

# Start the Docker container
$ bash standalone_embed.sh start

I am deploying by script, so there can only be one node.
I also connect by IP and port: 10.253.xxx.xxx:19530

@xiaofan-luan
Contributor

Again, this is not related to Milvus or the way Milvus is deployed.
Please carefully check how your code writes to Milvus, especially how many times you call the handleOne function.

@SunilWang
Author

> Again, this is not related to Milvus or the way Milvus is deployed.
> Please carefully check how your code writes to Milvus, especially how many times you call the handleOne function.

I printed the console log, and I can confirm that the handleOne function saved each vector only once.

@xiaofan-luan
Contributor

You can add a log statement on each line and check:

  1. How many images do you have?
  2. How many times did you call milvusClient.insert?
  3. If you stop all the tests, do you still see any insertions into Milvus? Does the entity count still increase?

I guess you just need some patience to debug this. There is no magic here.
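
The second check in the list above (counting milvusClient.insert calls) could be done with a thin wrapper around the client. This is only a sketch: InsertClient, InsertRequest, and withInsertCounter are hypothetical names, and the interface models just the collection_name and data fields that handleOne passes to insert, not the full SDK type.

```typescript
// Hypothetical minimal shape of the client; only insert() is modeled here.
interface InsertRequest {
  collection_name: string;
  data: Record<string, unknown>[];
}

interface InsertClient {
  insert(req: InsertRequest): Promise<unknown>;
}

interface InsertStats {
  calls: number; // number of insert() invocations
  rows: number; // total rows across all invocations
}

// Wraps a client so that every insert is counted and logged before it is forwarded.
function withInsertCounter(
  client: InsertClient,
  log: (msg: string) => void = console.log,
): { client: InsertClient; stats: InsertStats } {
  const stats: InsertStats = { calls: 0, rows: 0 };
  const counted: InsertClient = {
    async insert(req) {
      // The counters are updated synchronously, before the request is sent.
      stats.calls += 1;
      stats.rows += req.data.length;
      log(
        `${new Date().toISOString()} insert #${stats.calls} into ` +
          `${req.collection_name}: ${req.data.length} row(s)`,
      );
      return client.insert(req);
    },
  };
  return { client: counted, stats };
}
```

Using the wrapped client in handleOne in place of milvusClient, and then comparing stats.calls and stats.rows with the entity count shown in Attu, would indicate whether the extra rows come from this process or from somewhere else.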

@xiaofan-luan
Contributor

256 rows is a very small amount of data; it does not even trigger compaction.

@SunilWang
Author

> You can add a log statement on each line and check:
>
>   1. How many images do you have?
>   2. How many times did you call milvusClient.insert?
>   3. If you stop all the tests, do you still see any insertions into Milvus? Does the entity count still increase?
>
> I guess you just need some patience to debug this. There is no magic here.

milvusClient.insert is indeed called only 256 times. I have sent the complete log to yanliang.qiao@zilliz.com.

@yanliang567
Contributor

@SunilWang let's double-check a few things:

  1. Do you mean that after you stop the client insert scripts, the number of entities in the collection keeps growing? How did you observe the increase?
  2. Which collection has this issue? Please share the names.
  3. When did you start the insert script, and when did you stop it? Please share a rough timeline.

@SunilWang
Author

SunilWang commented Jun 27, 2024

> @SunilWang let's double-check a few things:
>
>   1. Do you mean that after you stop the client insert scripts, the number of entities in the collection keeps growing? How did you observe the increase?
>   2. Which collection has this issue? Please share the names.
>   3. When did you start the insert script, and when did you stop it? Please share a rough timeline.

1. We can see the data growth through the visualization tool Attu.
2. Database: testMaLiangVault, Collection: testMaLiangVault
3. I have forgotten the exact time; it was around June 25, 2024, from 2 p.m. to 6 p.m.

image

@yanliang567
Contributor

I did not find the collection name testMaLiangVault, and I am not able to reproduce the issue in-house.

@SunilWang
Author

> I did not find the collection name testMaLiangVault, and I am not able to reproduce the issue in-house.

Thank you very much for your technical support during this time.

My workaround is to clean up duplicate data with a scheduled task, which solves the problem for now.

You can close the current issue.
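
The scheduled cleanup mentioned above was not shared in the thread. A minimal sketch of what its core could look like, assuming that rows count as duplicates when they share an imgUrl and that the primary key is the string id field from handleOne. Row, selectDuplicateIds, and buildDeleteFilter are hypothetical names.

```typescript
// Hypothetical row shape: only the fields needed to detect duplicates.
interface Row {
  id: string;
  imgUrl: string;
}

// Returns the ids of rows to delete, keeping the first row seen for each imgUrl.
function selectDuplicateIds(rows: Row[]): string[] {
  const seen = new Set<string>();
  const toDelete: string[] = [];
  for (const row of rows) {
    if (seen.has(row.imgUrl)) {
      toDelete.push(row.id);
    } else {
      seen.add(row.imgUrl);
    }
  }
  return toDelete;
}

// Builds a boolean expression of the form: id in ["a", "b"].
// Assumes the primary key field is named id and holds strings.
function buildDeleteFilter(ids: string[]): string {
  const quoted = ids.map((id) => JSON.stringify(id));
  return `id in [${quoted.join(", ")}]`;
}
```

A scheduled task could query the id and imgUrl fields, pass the rows through selectDuplicateIds, and, only when the result is non-empty, use the expression from buildDeleteFilter as the filter of a delete request. How the delete is issued depends on the SDK version, so the exact call should be checked against the SDK documentation.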

4 participants