
[Bug]: compaction failed because of imported segment's state is not flushed #35349

Open
sunby opened this issue Aug 7, 2024 · 3 comments
Labels
kind/bug (Issues or changes related to a bug), triage/accepted (Indicates an issue or PR is ready to be actively worked on)

Comments

sunby (Contributor) commented Aug 7, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master/2.4
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

The imported segment's state is not changed to Flushed, so compaction on it fails.

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

@sunby sunby added kind/bug Issues or changes related to a bug and needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 7, 2024
@yanliang567 yanliang567 assigned sunby and unassigned yanliang567 Aug 8, 2024
@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 8, 2024
aping commented Aug 9, 2024

Yeah, I ran into the same issue. Steps to repro:

Milvus

Milvus 2.4.3 (milvusdb/milvus:v2.4.3), running in cluster mode.

Write data

First create a collection, then import data with bulk insert. The code is like:

from pymilvus import BulkFileType, RemoteBulkWriter, utility

writer = RemoteBulkWriter(
    schema=collection.schema,
    remote_path="/bulk-writes",
    connect_param=s3_conn,  # a RemoteBulkWriter.S3ConnectParam for the bucket
    file_type=BulkFileType.PARQUET,
    chunk_size=768 * 1024 * 1024,
)

for ...:                    # iterate over the source rows
    writer.append_row(...)

writer.commit()

utility.do_bulk_insert(
    collection_name=collection_name,
    files=[file],           # one of the files written by the writer
)

Observation

I was expecting compaction to kick in, but the number of segment files did not change, so I checked the logs.

datacoord's log says:

[2024/08/09 07:05:07.809 +00:00] [WARN] [datacoord/session_manager.go:203] ["failed to execute compaction"] [node=3112] [error="segment with flushed state not found: segment not found[segment=450726929646550253]"] [planID=450726929697259555]

datanode's log says

[2024/08/09 07:05:07.809 +00:00] [WARN] [datanode/services.go:225] ["compaction plan contains segment which is not flushed"] [traceID=111152661c2b2462cff77a5c6d56703f] [planID=450726929697259555] [segmentID=450726929646550253]

But when I check the segment in birdwatcher, it shows as Flushed:

SegmentID: 450726929646550253 State: Flushed, Row Count:97580

Workaround

For now, after all the writes are done, restarting the datanodes makes compaction start working.
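One possible reading of these symptoms (the persisted meta that birdwatcher reads says Flushed, the datanode rejects the plan, and a restart fixes it) is that the datanode's in-memory view of the segment went stale and only a restart reloads it. This is an interpretation of the logs, not a confirmed root cause; the sketch below models that hypothesis with entirely made-up names, not Milvus internals:

```python
class MetaStore:
    """Stands in for the persisted meta (what birdwatcher reads)."""
    def __init__(self):
        self.state = {}

class DataNode:
    """Caches segment states in memory; goes stale if it misses an update."""
    def __init__(self, meta):
        self.meta = meta
        self.cache = dict(meta.state)  # snapshot taken at startup

    def can_compact(self, segment_id):
        # The node consults its cache, not the persisted meta.
        return self.cache.get(segment_id) == "Flushed"

    def restart(self):
        self.cache = dict(self.meta.state)  # reload, as the workaround does

meta = MetaStore()
meta.state[450726929646550253] = "Importing"
node = DataNode(meta)

# The import finishes and the persisted state flips to Flushed,
# but the node never observes the transition.
meta.state[450726929646550253] = "Flushed"
assert not node.can_compact(450726929646550253)  # compaction plan rejected

node.restart()
assert node.can_compact(450726929646550253)      # compaction proceeds
```

Under this model, both log lines and the birdwatcher output are consistent: datacoord/datanode act on the stale cached state while the meta is already correct.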

xiaofan-luan (Contributor) commented:

@sunby @bigsheeper is this the same issue? memory state is consistent with

xiaofan-luan (Contributor) commented:

is there a reason for that? what will be the expected fix?
