Flink: Support Flink streaming reading #1383
Comments
@JingsongLi we should use the new FLIP-27 source interface, right?
We probably don't want the enumerator to statically assign all discovered splits up front. Dynamic assignment is better for load balancing with straggler/outlier reader nodes.
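To illustrate the dynamic-assignment point, here is a minimal, hypothetical sketch against the FLIP-27 `SplitEnumerator` interface. The `IcebergSourceSplit` type and class names are made up for illustration, and the signatures follow the early (Flink 1.11-era) interface, which varies slightly across Flink versions:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

import org.apache.flink.api.connector.source.SourceSplit;
import org.apache.flink.api.connector.source.SplitEnumerator;
import org.apache.flink.api.connector.source.SplitEnumeratorContext;

/** Hypothetical split type; a real one would carry the Iceberg scan task. */
class IcebergSourceSplit implements SourceSplit {
  private final String id;

  IcebergSourceSplit(String id) {
    this.id = id;
  }

  @Override
  public String splitId() {
    return id;
  }
}

/** Pull-based enumerator: a split is handed out only when a reader asks. */
class DynamicIcebergEnumerator
    implements SplitEnumerator<IcebergSourceSplit, List<IcebergSourceSplit>> {

  private final SplitEnumeratorContext<IcebergSourceSplit> context;
  private final Queue<IcebergSourceSplit> pendingSplits = new ArrayDeque<>();

  DynamicIcebergEnumerator(SplitEnumeratorContext<IcebergSourceSplit> context) {
    this.context = context;
  }

  @Override
  public void start() {
    // A real enumerator would schedule periodic snapshot discovery here.
  }

  @Override
  public void handleSplitRequest(int subtaskId, String requesterHostname) {
    // Dynamic assignment: readers pull one split at a time, so a slow
    // reader simply requests less work instead of sitting on a backlog.
    IcebergSourceSplit next = pendingSplits.poll();
    if (next != null) {
      context.assignSplit(next, subtaskId);
    }
  }

  @Override
  public void addSplitsBack(List<IcebergSourceSplit> splits, int subtaskId) {
    // Splits come back on reader failure; re-queue them for reassignment.
    pendingSplits.addAll(splits);
  }

  @Override
  public void addReader(int subtaskId) {
    // No-op: assignment is driven purely by split requests.
  }

  @Override
  public List<IcebergSourceSplit> snapshotState() {
    // Only the not-yet-assigned splits need to go into the checkpoint.
    return new ArrayList<>(pendingSplits);
  }

  @Override
  public void close() {
  }
}
```

Because readers pull splits via `handleSplitRequest`, a straggler never accumulates a large statically assigned backlog.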
Hi @stevenzwu, yes, the advantage is that the assignment will be more dynamically balanced. It depends on the progress of FLIP-27. If the time is not urgent, we can wait for Flink 1.12.
@stevenzwu, we have implemented an internal version of the Flink streaming reader, which is not built on top of FLIP-27 yet. Here is the pull request https://github.com/generic-datalake/iceberg-poc/pull/3/files for our own branch. As Jingsong described, once FLIP-27 is ready, we'd be happy to switch the current implementation to FLIP-27.
@JingsongLi @openinx thx. We are currently implementing an Iceberg source based on the FLIP-27 interface. Our initial goal is backfill: it is bounded but with streaming behavior, meaning the app code stays with the DataStream API and just switches the source from Kafka to Iceberg. We are also very interested in the streaming/continuous read pattern. It is not urgent; we can probably collaborate. Would love to see the building blocks being pushed upstream gradually.
Regarding the above: I am thinking about two levels of enumeration to keep the enumerator memory footprint in check. If the job is keeping up with ingestion, we should only have one unconsumed snapshot.
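As a rough illustration of that two-level idea (all names here are hypothetical, not the actual implementation): level one tracks only unconsumed snapshot IDs, and splits are planned lazily for one snapshot at a time, so the pending-split state stays bounded by a single snapshot's file count.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class TwoLevelEnumeration {
  // Level 1: cheap bookkeeping, one long per unconsumed snapshot.
  private final Queue<Long> unconsumedSnapshots = new ArrayDeque<>();
  // Level 2: concrete splits, held for at most one snapshot at a time.
  private final Queue<String> pendingSplits = new ArrayDeque<>();

  void onSnapshotDiscovered(long snapshotId) {
    unconsumedSnapshots.add(snapshotId);
  }

  /** Returns the next split, planning a new snapshot only when the queue drains. */
  String nextSplit() {
    if (pendingSplits.isEmpty() && !unconsumedSnapshots.isEmpty()) {
      pendingSplits.addAll(planSplits(unconsumedSnapshots.poll()));
    }
    return pendingSplits.poll(); // null means nothing to hand out yet
  }

  private List<String> planSplits(long snapshotId) {
    // Placeholder: real code would run the table scan for this one snapshot.
    return Arrays.asList("split-" + snapshotId + "-0", "split-" + snapshotId + "-1");
  }
}
```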
@stevenzwu, what is the maximum size of a table in your production environment? I'm wondering whether it's worth implementing the two-phase enumerator in the first version. If we have 1 PB of data and each file is 128 MB, the table will have 8,388,608 files. If every …
Hi @stevenzwu, about …
I was mainly discussing this in the context of the FLIP-27 source. Regardless of how we implement the enumeration, there are two pieces of info the enumerator needs to track and checkpoint: the position in the table's snapshot history that has already been enumerated, and the splits that have been planned but not yet consumed.
I was mainly concerned about the state size for the latter. That is where I was referring to throttling the eagerness of split planning. I was thinking about using … Here are some additional benefits of enumerating splits snapshot by snapshot.
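Purely for illustration, the checkpointed enumerator state being discussed could be shaped roughly like this (field names are hypothetical):

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical checkpoint holding the two pieces of info mentioned above. */
class EnumeratorCheckpoint implements Serializable {
  // 1) How far into the table's snapshot history enumeration has progressed.
  final long lastEnumeratedSnapshotId;
  // 2) Splits planned but not yet consumed; the part whose size must be capped.
  final List<String> pendingSplits;

  EnumeratorCheckpoint(long lastEnumeratedSnapshotId, List<String> pendingSplits) {
    this.lastEnumeratedSnapshotId = lastEnumeratedSnapshotId;
    this.pendingSplits = new ArrayList<>(pendingSplits);
  }
}
```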
@openinx note that this is not keyed state, where state is distributed among parallel tasks. Here, 8 GB of operator state can be a problematic amount of enumerator state. I vaguely remember RocksDB can't handle a list larger than 1 GB; the bigger the list, the slower it gets. Also, if we do …
@JingsongLi yeah, the key thing is how the coordinator/enumerator controls how the splits are generated. I was saying that we may need some control/throttling there to avoid eagerly enumerating all pending snapshots, so that the checkpointed split list stays manageable/capped. I thought the idea …
NIT: I think we still need to use …
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.
Flink is famous for its streaming computation.
After #1346, it is easy to build Flink streaming reading on top of it.
Unlike Spark, the Flink streaming source continuously monitors the table for new files and directly sends the splits to downstream tasks. The source doesn't need to care about micro-batch size, because the downstream tasks store incoming splits in state and consume them one by one.
Monitor ----(Splits)-----> ReaderOperator
- Monitor (single task): monitors new snapshots and generates splits with `FlinkSplitGenerator` (actually using `TableScan.appendsBetween`); see the sketch below.
- ReaderOperator (multiple tasks): stores the received splits in state and reads them with `FlinkInputFormat`, consuming one split in a checkpoint cycle.
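A hedged sketch of the monitor side described above, built on Iceberg's `TableScan.appendsBetween` API (the `SnapshotMonitor` class, the `emit` hook, and the bookkeeping are simplified illustrations, not the actual operator code):

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.iceberg.CombinedScanTask;
import org.apache.iceberg.Snapshot;
import org.apache.iceberg.Table;
import org.apache.iceberg.TableScan;
import org.apache.iceberg.io.CloseableIterable;

class SnapshotMonitor {
  private final Table table;
  private long lastSnapshotId; // last snapshot whose appends were planned

  SnapshotMonitor(Table table, long startSnapshotId) {
    this.table = table;
    this.lastSnapshotId = startSnapshotId;
  }

  /** One monitoring pass: plan and emit splits appended since the last pass. */
  void discoverNewSplits() {
    table.refresh();
    Snapshot current = table.currentSnapshot();
    if (current == null || current.snapshotId() == lastSnapshotId) {
      return; // no new data since the last pass
    }
    TableScan scan = table.newScan().appendsBetween(lastSnapshotId, current.snapshotId());
    try (CloseableIterable<CombinedScanTask> tasks = scan.planTasks()) {
      for (CombinedScanTask task : tasks) {
        emit(task); // hand the split to the downstream ReaderOperator
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    lastSnapshotId = current.snapshotId();
  }

  private void emit(CombinedScanTask task) {
    // Placeholder: the real monitor would emit this into the Flink stream.
  }
}
```

In the real job, the emitted splits would flow to the ReaderOperator tasks, which checkpoint them in state and read them with `FlinkInputFormat` one at a time.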