
Integrate append flow API #50

Open
ravi-databricks opened this issue Apr 25, 2024 · 1 comment

@ravi-databricks
Contributor

Integrate the append_flow API for the following use cases:

  1. One-time backfill
  2. Multiple Kafka topics writing to the same target

API DOCS Ref
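
For reference, dlt.append_flow is a decorator that registers an additional flow writing into an existing streaming table. A minimal sketch of the multi-topic use case (table, topic, and server names are illustrative; spark is the ambient session inside a DLT pipeline):

    import dlt

    dlt.create_streaming_table("customer")

    # Two Kafka topics appending into the same bronze target
    @dlt.append_flow(target="customer", name="customer_topic_a_flow")
    def customer_topic_a():
        return (
            spark.readStream.format("kafka")
            .option("kafka.bootstrap.servers", "host1:9092")  # illustrative
            .option("subscribe", "customer_topic_a")
            .load()
        )

    @dlt.append_flow(target="customer", name="customer_topic_b_flow")
    def customer_topic_b():
        return (
            spark.readStream.format("kafka")
            .option("kafka.bootstrap.servers", "host1:9092")
            .option("subscribe", "customer_topic_b")
            .load()
        )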

@ravi-databricks ravi-databricks added the enhancement New feature or request label Apr 25, 2024
@ravi-databricks ravi-databricks self-assigned this Apr 25, 2024
@ravi-databricks
Contributor Author

  • Introduced bronze_append_flows and silver_append_flows inside the onboarding file, with the structure below:

  • e.g., if the main bronze table customer needs to ingest from several different datasets, DLT-META can launch multiple flows under bronze_append_flows:

 "bronze_append_flows": [
      {
            "name": "customer_bronze_flow",
            "create_streaming_table": false,
            "source_format": "cloudFiles",
            "source_details": {
               "source_path_it": "{dbfs_path}/integration_tests/resources/data/customers_af",
               "source_schema_path": "{dbfs_path}/integration_tests/resources/customers.ddl"
            },
            "reader_options": {
               "cloudFiles.format": "json",
               "cloudFiles.inferColumnTypes": "true",
               "cloudFiles.rescuedDataColumn": "_rescued_data"
            },
            "once": false
      }
   ]
  • With the above structure, when kafka is the source_format, an append flow's source_details and reader_options can cover multiple topics, as sketched below.
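
Under that reading, a Kafka append-flow entry might look roughly like the following. The subscribe, kafka.bootstrap.servers, and startingOffsets keys are standard Spark Kafka reader options, but their exact placement within the DLT-META onboarding schema here is an assumption, and all names are illustrative:

 "bronze_append_flows": [
    {
       "name": "customer_kafka_flow",
       "create_streaming_table": false,
       "source_format": "kafka",
       "source_details": {
          "source_schema_path": "{dbfs_path}/integration_tests/resources/customers.ddl",
          "subscribe": "customer_topic_a,customer_topic_b"
       },
       "reader_options": {
          "kafka.bootstrap.servers": "host1:9092",
          "startingOffsets": "earliest"
       },
       "once": false
    }
 ]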

  • As a result of the above change, the pipeline readers need to be restructured to carry state such as source_details, source_format, reader_options, and schema_json. This ensures dlt.append_flow can be handed the corresponding callables from PipelineReaders, such as read_dlt_cloud_files, read_dlt_delta, and read_kafka (a sketch follows below).
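
A minimal sketch of how such a stateful reader could pair with dlt.append_flow; the class and method names come from the bullet above, while the constructor shape and the source_details keys are assumptions:

    import dlt

    class PipelineReaders:
        """Holds per-source state so flow callables can run without arguments."""

        def __init__(self, spark, source_format, source_details, reader_options, schema_json=None):
            self.spark = spark
            self.source_format = source_format
            self.source_details = source_details
            self.reader_options = reader_options
            self.schema_json = schema_json

        def read_dlt_cloud_files(self):
            # Auto Loader read built entirely from the captured state
            return (
                self.spark.readStream.format("cloudFiles")
                .options(**self.reader_options)
                .load(self.source_details["source_path"])  # key name is illustrative
            )

    reader = PipelineReaders(
        spark,  # ambient Spark session inside a DLT pipeline
        source_format="cloudFiles",
        source_details={"source_path": "/landing/customers_af"},  # illustrative
        reader_options={"cloudFiles.format": "json"},
    )

    # dlt.append_flow receives the bound method as its callable
    dlt.append_flow(target="customer", name="customer_bronze_flow")(reader.read_dlt_cloud_files)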

  • Incorporated additional parameters for dlt.apply_changes:

            flow_name,
            once,
            ignore_null_updates_column_list,
            ignore_null_updates_except_column_list
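
A hedged sketch of how these four parameters could be passed alongside the usual dlt.apply_changes arguments; the table, key, and column names are illustrative:

    import dlt
    from pyspark.sql.functions import col

    dlt.create_streaming_table("customer_silver")

    dlt.apply_changes(
        target="customer_silver",
        source="customer_bronze",
        keys=["customer_id"],                       # illustrative key column
        sequence_by=col("event_ts"),                # illustrative ordering column
        flow_name="customer_silver_backfill_flow",  # names the CDC flow explicitly
        once=True,                                  # run this flow a single time (backfill)
        ignore_null_updates_column_list=["email"],  # nulls in these columns do not overwrite
    )

ignore_null_updates_except_column_list is presumably the complementary form, applying the null-ignoring behavior to every column except those listed.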

@ganeshchand @neil90 @howardwu-db

@ravi-databricks ravi-databricks added this to the v0.0.8 milestone Jul 6, 2024