Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: revert opendal 0.46 #4098

Closed
wants to merge 1 commit into from

Conversation

WenyXu
Copy link
Member

@WenyXu WenyXu commented Jun 3, 2024

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

#2662

What's changed and what's your intention?

Revert opendal 0.46

After upgrading to opendal 0.46, it incurs the EntityTooSmall error during closing the writer in the flush/compaction job.
The opendal's FuturesAsyncWriter doesn't work well with our BufferedWriter🥲. Maybe we should use the chunk method provided by opendal instead of implementing a BufferedWriter .

i.e.

                store
                    .writer_with(&path)
                    .chunk(DEFAULT_WRITE_BUFFER_SIZE)
                    .concurrent(concurrency)
                    .await

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.

@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Jun 3, 2024
Copy link
Contributor

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate what is the issue introduced by OpenDAL 0.46?

@tisonkun
Copy link
Contributor

tisonkun commented Jun 3, 2024

Generally we fix issues instead of reverting. Unless there is an emergency to release or it's the dependency's fault that the new version contains bugs.

A PR without a description to revert commits looks confusing.

@fengjiachun
Copy link
Collaborator

We need to explain why we're doing this. @WenyXu

Copy link

codecov bot commented Jun 3, 2024

Codecov Report

Attention: Patch coverage is 81.95489% with 48 lines in your changes are missing coverage. Please review.

Project coverage is 85.21%. Comparing base (4e5dd1e) to head (7e89345).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4098      +/-   ##
==========================================
- Coverage   85.47%   85.21%   -0.27%     
==========================================
  Files         991      991              
  Lines      173053   172997      -56     
==========================================
- Hits       147920   147420     -500     
- Misses      25133    25577     +444     

@WenyXu WenyXu marked this pull request as draft June 3, 2024 14:53
@WenyXu
Copy link
Member Author

WenyXu commented Jun 3, 2024

Generally we fix issues instead of reverting. Unless there is an emergency to release or it's the dependency's fault that the new version contains bugs.

A PR without a description to revert commits looks confusing.

Yes, I will try to remove the BufferedWriter tomorrow. 🥲 cc @v0y4g3r

@Xuanwo
Copy link
Contributor

Xuanwo commented Jun 3, 2024

Revert opendal 0.46

After upgrading to opendal 0.46, it incurs the EntityTooSmall error during closing the writer in the flush/compaction job. The opendal's FuturesAsyncWriter doesn't work well with our BufferedWriter🥲. Maybe we should use the chunk method provided by opendal instead of implementing a BufferedWriter .

Hi, seems better to adapt new API instead of reverting?

@Xuanwo
Copy link
Contributor

Xuanwo commented Jun 3, 2024

The opendal's FuturesAsyncWriter doesn't work well with our BufferedWriter

Would you like to raise a bug for this? BufferedWriter should work just fine like before with FuturesAsyncWriter. What's buf size do you set for your BufferedWriter?

@WenyXu
Copy link
Member Author

WenyXu commented Jun 3, 2024

Hi, seems better to adapt new API instead of reverting?

Yes, I will adapt it by simplifying/removing the BufferedWriter.

Would you like to raise a bug for this? BufferedWriter should work just fine like before with FuturesAsyncWriter. What's buf size do you set for your BufferedWriter?

We found this bug while investigating another bug, I'm still trying to figure out why it happens 😢

@Xuanwo
Copy link
Contributor

Xuanwo commented Jun 3, 2024

We found this bug while investigating another bug, I'm still trying to figure out why it happens 😢

The OOM maybe related to https://opendal.apache.org/docs/rust/opendal/docs/rfcs/rfc_4638_executor/index.html

@WenyXu
Copy link
Member Author

WenyXu commented Jun 3, 2024

The opendal's FuturesAsyncWriter doesn't work well with our BufferedWriter

Would you like to raise a bug for this? BufferedWriter should work just fine like before with FuturesAsyncWriter. What's buf size do you set for your BufferedWriter?

I think the key problem here is that FuturesAsyncWriter will send a small part of the data to the underlying writer directly instead of being controlled by BufferedWriter. I created a PR to fix it #4100

@WenyXu
Copy link
Member Author

WenyXu commented Jun 3, 2024

The opendal's FuturesAsyncWriter doesn't work well with our BufferedWriter

Would you like to raise a bug for this? BufferedWriter should work just fine like before with FuturesAsyncWriter. What's buf size do you set for your BufferedWriter?

I think the key problem here is that FuturesAsyncWriter will send a small part of the data to the underlying writer directly instead of being controlled by BufferedWriter. I created a PR to fix it #4100

Oops, I reviewed the code; the chunk size will be set automatically via the service's capability attributes. There must be something else I don't know🥹

@Xuanwo
Copy link
Contributor

Xuanwo commented Jun 3, 2024

Oops, I reviewed the code; the chunk size will be set automatically via the service's capability attributes.

Only set while chunk has been set. Write will go directly if chunk is None.

@WenyXu
Copy link
Member Author

WenyXu commented Jun 3, 2024

Oops, I reviewed the code; the chunk size will be set automatically via the service's capability attributes.

Only set while chunk has been set. Write will go directly if chunk is None.

Oh, I see. So, the key problem is that we didn't set the chunk size.🥲

@Xuanwo
Copy link
Contributor

Xuanwo commented Jun 3, 2024

Oh, I see. So, the key problem is that we didn't set the chunk size.🥲

Yep, it's quite easy to be used in wrong.

I'm considering change the behavior here: we can make the min_write_size as the default chunk size.

If users has chunk set, we will respect users input and pick up max(chunk, min_write_size). All requests will be the same chunk size.

If users don't have chunk set, we will use min_write_size as chunk size. But we won't really perform the chunk operation: all write that larger than min_write_size will be sent directly.

Would you like to start a pr for this?

@killme2008
Copy link
Contributor

killme2008 commented Jun 3, 2024

Should we release version 0.8.2?
This is why I mentioned in Slack regarding Pull Request #4037 . Testing this fundamental library upgrade extensively in our production environment is crucial.

@tisonkun
Copy link
Contributor

tisonkun commented Jun 4, 2024

Testing this fundamental library upgrade extensively in our production environment is crucial.

Another topic is that we're currently mixing up bug fixes and feature development in patch releases as we are in the 0.x series.

Once we release 1.0, perhaps we should use multi branches and only backport bug fixes to patch releases. The feature releases (minor or major) would have a release train model so that developers know clear the timeline and when to include changes and when feature freeze and test.

cc @fengjiachun

@killme2008
Copy link
Contributor

Testing this fundamental library upgrade extensively in our production environment is crucial.

Another topic is that we're currently mixing up bug fixes and feature development in patch releases as we are in the 0.x series.

Once we release 1.0, perhaps we should use multi branches and only backport bug fixes to patch releases. The feature releases (minor or major) would have a release train model so that developers know clear the timeline and when to include changes and when feature freeze and test.

cc @fengjiachun

Agree, I think we must set up the release rules after 1.0

@WenyXu
Copy link
Member Author

WenyXu commented Jun 4, 2024

If users don't have chunk set, we will use min_write_size as chunk size. But we won't really perform the chunk operation: all write that larger than min_write_size will be sent directly.

It seems we can't reuse the ChunkedWriter directly.

@Xuanwo
Copy link
Contributor

Xuanwo commented Jun 4, 2024

It seems we can't reuse the ChunkedWriter directly.

Sorry for mixing the context. I'm talking about sending PRs to opendal for changing the behavior.

@WenyXu
Copy link
Member Author

WenyXu commented Jun 4, 2024

fixed #4100

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-not-required This change does not impact docs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants