@github-actions
Contributor

Cherry-picked from #57187

…perty (#57187)

### What problem does this PR solve?

Issue Number: #57228 

Related PR: #xxx

Problem Summary:

The current behavior has three problems:
1. When new data is inserted into an Iceberg table, Doris cannot retrieve it
until the table meta cache is refreshed.
2. When an Iceberg table's schema is altered, Doris cannot detect the schema
change until both the table meta cache and the snapshot meta cache are
refreshed. For example, if we add a new column and insert data into it using
Spark, Doris sees neither the column nor the new data until both caches are
refreshed.
3. Both the table meta cache and the snapshot meta cache have a TTL of 10 minutes,
which is too long.

To address this, we add two catalog properties:
**iceberg.table.meta.cache.ttl-second** and
**iceberg.snapshot.meta.cache.ttl-second**.
These properties control the expiration of the Iceberg table meta cache and the
snapshot meta cache, respectively.

1. Regarding the "iceberg.table.meta.cache.ttl-second" parameter, we can
create an Iceberg catalog like this:
CREATE CATALOG iceberg
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "hms",
    "hive.metastore.uris" = "thrift://localhost:9083",
    **"iceberg.table.meta.cache.ttl-second" = "0"**
);

When we set "iceberg.table.meta.cache.ttl-second" = "0", Doris
retrieves the latest data immediately without using the cache.
Alternatively, we can set "iceberg.table.meta.cache.ttl-second" to a value
smaller than the 10-minute default, which shortens the cache interval.
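
As a rough sketch of the "smaller value" option, the catalog below keeps the table meta cache but shortens its TTL. The catalog name `iceberg_short_ttl` and the value `60` are illustrative assumptions, not values from this PR; the unit is assumed to be seconds, as the property name suggests.

```sql
-- Hypothetical catalog name; "60" is an example TTL, assumed to be in seconds.
CREATE CATALOG iceberg_short_ttl
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "hms",
    "hive.metastore.uris" = "thrift://localhost:9083",
    -- Keep caching, but let table metadata expire well before the 10-minute default.
    "iceberg.table.meta.cache.ttl-second" = "60"
);
```

A non-zero TTL trades some freshness for fewer metadata lookups, so it may suit tables that are written to only occasionally.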


2. Regarding the "iceberg.snapshot.meta.cache.ttl-second" parameter, we
can create an Iceberg catalog like this:
CREATE CATALOG iceberg
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "hms",
    "hive.metastore.uris" = "thrift://localhost:9083",
    **"iceberg.table.meta.cache.ttl-second" = "0",**
    **"iceberg.snapshot.meta.cache.ttl-second" = "0"**
);

When we set both "iceberg.table.meta.cache.ttl-second" = "0" and
"iceberg.snapshot.meta.cache.ttl-second" = "0", Doris detects schema
changes immediately (for example, via DESC or SELECT ...). If we add a new
column and insert data into it using Spark, Doris sees the column and the
new data immediately.
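
The scenario above could be checked with something like the following sketch. The database, table, and column names (`demo_db`, `demo_tbl`, `id`, `name`, `new_col`) are hypothetical, and the table is assumed to have had only `id` and `name` before the change; the Doris catalog is the `iceberg` catalog created above with both TTLs set to "0".

```sql
-- Run in Spark SQL (hypothetical table with columns id INT, name STRING):
ALTER TABLE demo_db.demo_tbl ADD COLUMNS (new_col STRING);
INSERT INTO demo_db.demo_tbl VALUES (1, 'a', 'x');

-- Run in Doris against the catalog created with both TTLs set to "0":
DESC iceberg.demo_db.demo_tbl;                       -- new_col is expected to be listed
SELECT id, new_col FROM iceberg.demo_db.demo_tbl;    -- the new row is expected to be visible
```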

Co-authored-by: yaoxiao <yaoxiao@fosun.com>
@github-actions github-actions bot requested a review from morrySnow as a code owner December 19, 2025 02:39
@Thearas
Contributor

Thearas commented Dec 19, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Dec 19, 2025
@Thearas
Contributor

Thearas commented Dec 19, 2025

run buildall

@morrySnow
Contributor

compile failed

@morrySnow morrySnow closed this Dec 19, 2025