chore: change load_file_metadata_expire_hours default from 24*7 to 12 hours #15514

Merged May 14, 2024 (3 commits)

Conversation

BohuTANG
Member

@BohuTANG BohuTANG commented May 14, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - default setting value change

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


what-the-diff bot commented May 14, 2024

PR Summary

  • Updated Settings Value for File Metadata Expiry
    The duration after which file metadata gets refreshed was previously set to a week (168 hours). This PR changes the expiration time to just 24 hours, meaning that file information will be updated more frequently.

@github-actions github-actions bot added the pr-chore this PR only has small changes that no need to record, like coding styles. label May 14, 2024
@BohuTANG BohuTANG requested review from zhang2014, flaneur2020 and dantengsky and removed request for zhang2014 May 14, 2024 02:11
@sundy-li
Member

docs need to be updated

COPY INTO ensures idempotence by automatically tracking and preventing the reloading of files for a default period of 7 days. This can be customized using the load_file_metadata_expire_hours setting to control the expiration time for file metadata.
This parameter defaults to False meaning COPY INTO will skip duplicate files when copying data. If True, duplicate files will not be skipped.
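The behavior described in the quoted docs can be sketched roughly as follows. This is only an illustrative model, not Databend's implementation: the `LoadedFiles` type, its in-memory map, the hour-granularity clock, and the exact boundary handling (a file counts as a duplicate while its age is strictly less than the expiry) are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the record of files a table has already loaded.
/// Times are plain hour counters to keep the example deterministic.
pub struct LoadedFiles {
    /// Plays the role of `load_file_metadata_expire_hours`.
    expire_hours: u64,
    /// File path -> hour at which the file was last loaded.
    loaded_at_hour: HashMap<String, u64>,
}

impl LoadedFiles {
    pub fn new(expire_hours: u64) -> Self {
        Self {
            expire_hours,
            loaded_at_hour: HashMap::new(),
        }
    }

    /// Returns true if the file should be copied at `now_hour`, and records
    /// the load. With `force == true`, duplicates are not skipped.
    pub fn should_load(&mut self, path: &str, now_hour: u64, force: bool) -> bool {
        // A file is a duplicate while its metadata has not yet expired
        // (assumed boundary: age strictly below the expiry).
        let is_duplicate = self
            .loaded_at_hour
            .get(path)
            .map_or(false, |&loaded| now_hour.saturating_sub(loaded) < self.expire_hours);

        if is_duplicate && !force {
            return false;
        }
        self.loaded_at_hour.insert(path.to_string(), now_hour);
        true
    }
}
```

Under this model, a shorter expiry only narrows the window in which a repeated COPY INTO of the same file is skipped; it does not change what happens when the force flag is set.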

@BohuTANG
Member Author

BohuTANG commented May 14, 2024

> docs need to be updated
>
> COPY INTO ensures idempotence by automatically tracking and preventing the reloading of files for a default period of 7 days. This can be customized using the load_file_metadata_expire_hours setting to control the expiration time for file metadata.
> This parameter defaults to False meaning COPY INTO will skip duplicate files when copying data. If True, duplicate files will not be skipped.

Will update after this PR merged.
PR: datafuselabs/databend-docs#788

@BohuTANG BohuTANG merged commit b96cb06 into datafuselabs:main May 14, 2024
72 checks passed
@TCeason
Collaborator

TCeason commented May 14, 2024

("load_file_metadata_expire_hours", DefaultSettingValue {
    value: UserSettingValue::UInt64(24),
    desc: "Sets the hours that the metadata of files you load data from with COPY INTO will expire in.",
    mode: SettingMode::Both,
    range: Some(SettingRange::Numeric(0..=u64::MAX)),
}),

The value is in hours, so using u64::MAX as the upper bound of the range may not be reasonable.
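To make the concern concrete, here is a small sketch of why `u64::MAX` hours is an awkward upper bound and what a tighter range check might look like. The helper names and the ten-year cap `MAX_EXPIRE_HOURS` are hypothetical choices for illustration only; nothing in this PR proposes a specific limit.

```rust
/// Hypothetical cap: ten years expressed in hours (not a value from the PR).
pub const MAX_EXPIRE_HOURS: u64 = 24 * 365 * 10;

/// Converts hours to seconds, returning None if the result overflows u64.
pub fn hours_to_secs(hours: u64) -> Option<u64> {
    hours.checked_mul(3600)
}

/// Accepts only values inside the assumed range 0..=MAX_EXPIRE_HOURS.
pub fn validate_expire_hours(hours: u64) -> Result<u64, String> {
    if hours <= MAX_EXPIRE_HOURS {
        Ok(hours)
    } else {
        Err(format!(
            "load_file_metadata_expire_hours must be in 0..={MAX_EXPIRE_HOURS}, got {hours}"
        ))
    }
}
```

With the current range, a value such as `u64::MAX` passes validation even though it cannot be converted to seconds without overflowing, so any code that does that conversion has to guard against it separately.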

Development

Successfully merging this pull request may close these issues.

Suggestion: Set the default value for the load_file_metadata_expire_hours setting to 24 hours
3 participants