bug: encounter "writer has not been closed or aborted, must be a bug"  #5101

@hzxa21

Description

Describe the bug

Version: 0.47.2
With the following writer configurations, we encounter "writer has not been closed or aborted, must be a bug".

    let writer = op
        .clone()
        .layer(TimeoutLayer::new().with_io_timeout(Duration::from_millis(
            config.retry.streaming_upload_attempt_timeout_ms,
        )))
        .layer(
            RetryLayer::new()
                .with_min_delay(Duration::from_millis(1000))
                .with_max_delay(Duration::from_millis(10000))
                .with_max_times(3)
                .with_factor(2.0)
                .with_jitter(),
        )
        .writer_with(&path)
        .concurrent(8)
        .executor(Executor::with(monitored_execute))
        .await?;
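
For reference, here is a rough sketch of the retry delays that the configuration above implies, ignoring jitter. The helper `backoff_schedule` is hypothetical and only mirrors the builder parameters (`min_delay`, `max_delay`, `factor`, `max_times`); it is not OpenDAL's implementation, and it assumes `max_times` counts retries. The 1.6s delay in the log is presumably the 1s base delay plus a random jitter component.

```rust
use std::time::Duration;

/// Hypothetical helper: delay before each retry for a plain exponential
/// backoff (no jitter), capped at `max_delay`.
fn backoff_schedule(
    min_delay: Duration,
    max_delay: Duration,
    factor: f32,
    max_times: usize,
) -> Vec<Duration> {
    let mut delays = Vec::with_capacity(max_times);
    let mut current = min_delay;
    for _ in 0..max_times {
        // Never wait longer than the configured cap.
        delays.push(current.min(max_delay));
        current = current.mul_f32(factor);
    }
    delays
}

fn main() {
    // Same numbers as the RetryLayer configuration in this report.
    let delays = backoff_schedule(
        Duration::from_millis(1000),
        Duration::from_millis(10000),
        2.0,
        3,
    );
    for (i, d) in delays.iter().enumerate() {
        println!("retry {} after {:?}", i + 1, d);
    }
}
```

Under these assumptions a failing close() can be attempted up to four times in total, with base waits of 1s, 2s, and 4s between attempts.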

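It seems that this happens when the opendal retry is triggered on writer.close(). In the logs below, the first close attempt hits the IO timeout, the retried close then fails with NoSuchUpload, and the "writer has not been closed or aborted" warning follows immediately.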
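
One possible explanation, which this report does not confirm, is that the first multipart completion request actually succeeded on the server while the client gave up waiting, so the retried request found no open upload. The toy model below sketches that sequence with the standard library only. `MockServer`, `CloseError`, and `close_with_retry` are hypothetical names for illustration and are not OpenDAL or S3 APIs.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum CloseError {
    /// The client stopped waiting; the server-side outcome is unknown.
    Timeout,
    /// The server no longer knows the upload id.
    NoSuchUpload,
}

/// Toy stand-in for an S3-compatible server tracking multipart uploads.
struct MockServer {
    open_uploads: HashSet<String>,
    completed: HashSet<String>,
}

impl MockServer {
    /// Completes the upload. When `client_times_out` is true the server
    /// still finishes the work, but the client observes a timeout.
    fn complete(&mut self, upload_id: &str, client_times_out: bool) -> Result<(), CloseError> {
        if !self.open_uploads.remove(upload_id) {
            return Err(CloseError::NoSuchUpload);
        }
        self.completed.insert(upload_id.to_string());
        if client_times_out {
            Err(CloseError::Timeout)
        } else {
            Ok(())
        }
    }
}

/// Retries only on timeout, one attempt per entry in `timeouts`.
fn close_with_retry(
    server: &mut MockServer,
    upload_id: &str,
    timeouts: &[bool],
) -> Result<(), CloseError> {
    let mut last = Err(CloseError::Timeout);
    for &times_out in timeouts {
        last = server.complete(upload_id, times_out);
        if last != Err(CloseError::Timeout) {
            break;
        }
    }
    last
}

fn main() {
    let mut server = MockServer {
        open_uploads: HashSet::from(["upload-1".to_string()]),
        completed: HashSet::new(),
    };
    // First attempt times out on the client side, second attempt is the retry.
    let result = close_with_retry(&mut server, "upload-1", &[true, false]);
    println!("close result: {:?}", result);
    println!("object completed on server: {}", server.completed.contains("upload-1"));
}
```

In this model the caller sees NoSuchUpload even though the object was completed, which would match the order of the log lines. If that is what happens here, retrying close() blindly is unsafe because the completion request is not idempotent.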

Steps to Reproduce

Expected Behavior

Additional Context

Logs:

2024-09-04T07:42:02.770667396Z WARN opendal::layers::retry: will retry after 1.604547739s because: Unexpected (temporary) at Writer::close, context: { timeout: 10 } => io operation timeout reached

2024-09-04T07:42:04.377279911Z WARN opendal::services: service=s3 operation=Writer::close path=xxx -> data close failed: NotFound (permanent) at Writer::close, context: { uri: ..., response: Parts { status: 404, version: HTTP/1.1, headers: {"accept-ranges": "bytes", "cache-control": "no-cache", "content-length": "467", "content-security-policy": "block-all-mixed-content", "content-type": "application/xml", "server": "MinIO", "strict-transport-security": "max-age=31536000; includeSubDomains", "vary": "Origin", "vary": "Accept-Encoding", "x-accel-buffering": "no", "x-amz-id-2": "..."} }, service: s3, path: xxx, written: 138426184 } => S3Error { code: "NoSuchUpload", message: "The specified multipart upload does not exist. The upload ID may be invalid, or the upload may have been aborted or completed.", resource: "xxx", request_id: "xxx" }

2024-09-04T07:42:04.377323167Z WARN opendal::layers::complete: writer has not been closed or aborted, must be a bug

2024-09-04T07:42:04.37733207Z ERROR risingwave_object_store::object: streaming_upload_finish failed error=NotFound (persistent) at Writer::close, context: { uri:..., response: Parts { status: 404, version: HTTP/1.1, headers: {"accept-ranges": "bytes", "cache-control": "no-cache", "content-length": "467", "content-security-policy": "block-all-mixed-content", "content-type": "application/xml", "server": "MinIO", "strict-transport-security": "max-age=31536000; includeSubDomains", "vary": "Origin", "vary": "Accept-Encoding", "x-accel-buffering": "no", "x-amz-id-2": "..."} }, service: s3, path: ... } => S3Error { code: "NoSuchUpload", message: "The specified multipart upload does not exist. The upload ID may be invalid, or the upload may have been aborted or completed.", resource: "...", request_id: "..." }

Are you willing to submit a PR to fix this bug?

  • Yes, I would like to submit a PR.

Metadata

Labels

bug (Something isn't working)
