feat(connectors): Delta Lake Sink Connector #2889
kriti-sc wants to merge 4 commits into apache:master
Conversation
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #2889 +/- ##
============================================
+ Coverage 68.36% 68.59% +0.22%
Complexity 739 739
============================================
Files 1053 1057 +4
Lines 84763 85441 +678
Branches 61297 61985 +688
============================================
+ Hits 57948 58605 +657
- Misses 24448 24462 +14
- Partials 2367 2374 +7
```rust
let endpoint_url = config
    .aws_s3_endpoint_url
    .as_ref()
    .ok_or(Error::InvalidConfig)?;
let allow_http = config.aws_s3_allow_http.ok_or(Error::InvalidConfig)?;
```
aws_s3_endpoint_url and aws_s3_allow_http are hard-required via .ok_or(Error::InvalidConfig)?, but both are optional in delta-rs/object_store. AWS_ENDPOINT_URL defaults to the standard AWS S3 regional endpoint when omitted, and AWS_ALLOW_HTTP defaults to false. This means users connecting to real AWS S3 (not MinIO/LocalStack) are forced to provide values that shouldn't be necessary. These two fields should be added to the options map only when present, not treated as required.
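A minimal sketch of the suggested fix, using a hypothetical config struct and option keys for illustration (the real connector's types and key names may differ): the optional fields are inserted into the options map only when the user supplied them, so plain AWS S3 users fall back to the library defaults.

```rust
use std::collections::HashMap;

// Hypothetical config shape mirroring the fields discussed above.
struct S3Config {
    aws_s3_endpoint_url: Option<String>,
    aws_s3_allow_http: Option<bool>,
}

// Build the object-store options map. Optional keys are added only when
// present, so omitting them keeps the defaults (regional AWS endpoint,
// allow_http = false) instead of failing with InvalidConfig.
fn build_s3_options(config: &S3Config) -> HashMap<String, String> {
    let mut options = HashMap::new();
    if let Some(url) = config.aws_s3_endpoint_url.as_ref() {
        options.insert("AWS_ENDPOINT_URL".to_string(), url.clone());
    }
    if let Some(allow_http) = config.aws_s3_allow_http {
        options.insert("AWS_ALLOW_HTTP".to_string(), allow_http.to_string());
    }
    options
}
```

With this shape, a MinIO/LocalStack user sets both fields, while a real-S3 user can leave both unset.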
```rust
let account_key = config
    .azure_storage_account_key
    .as_ref()
    .ok_or(Error::InvalidConfig)?;
let sas_token = config
    .azure_storage_sas_token
    .as_ref()
    .ok_or(Error::InvalidConfig)?;
```
Both azure_storage_account_key and azure_storage_sas_token are required simultaneously via .ok_or(Error::InvalidConfig)?. In Azure, these are alternative authentication methods: you use either an account key or a SAS token, not both. This blocks users who only have a SAS token (common in restricted-access scenarios). The code should require account_name and at least one of account_key or sas_token, and only insert whichever credential is provided.
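One way to express that rule, again with a hypothetical config struct and option keys standing in for the connector's real types: require the account name, reject the case where neither credential is present, and insert only the credential(s) actually supplied.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Error {
    InvalidConfig,
}

// Hypothetical config shape mirroring the fields discussed above.
struct AzureConfig {
    azure_storage_account_name: Option<String>,
    azure_storage_account_key: Option<String>,
    azure_storage_sas_token: Option<String>,
}

// Account name is mandatory; account key and SAS token are alternatives,
// so at least one (but not necessarily both) must be provided.
fn build_azure_options(config: &AzureConfig) -> Result<HashMap<String, String>, Error> {
    let account_name = config
        .azure_storage_account_name
        .as_ref()
        .ok_or(Error::InvalidConfig)?;

    if config.azure_storage_account_key.is_none() && config.azure_storage_sas_token.is_none() {
        // No credential at all: reject early.
        return Err(Error::InvalidConfig);
    }

    let mut options = HashMap::new();
    options.insert("AZURE_STORAGE_ACCOUNT_NAME".to_string(), account_name.clone());
    if let Some(key) = config.azure_storage_account_key.as_ref() {
        options.insert("AZURE_STORAGE_ACCOUNT_KEY".to_string(), key.clone());
    }
    if let Some(sas) = config.azure_storage_sas_token.as_ref() {
        options.insert("AZURE_STORAGE_SAS_TOKEN".to_string(), sas.clone());
    }
    Ok(options)
}
```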
```rust
fn apply_coercion(value: &mut Value, node: &CoercionNode) {
    match node {
        CoercionNode::Coercion(Coercion::ToString) => {
```
SQL semantics treat NULL as a special value that is never coerced to another type in most cases.
Coercing it here can lead to subtle, hard-to-debug bugs.
You are right. Thank you for catching this; I'll fix it in the next commit. I had misunderstood your point earlier.
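The NULL pass-through the reviewer asks for can be sketched as an early return before any coercion runs. This is a self-contained illustration with a stand-in Value enum and a single ToString coercion; the connector's real Value type and coercion variants will differ.

```rust
// Stand-in for a JSON-like value type; the real connector presumably
// has its own richer Value enum.
#[derive(Debug, PartialEq)]
enum Value {
    Null,
    Number(i64),
    String(String),
}

enum Coercion {
    ToString,
}

enum CoercionNode {
    Coercion(Coercion),
}

fn apply_coercion(value: &mut Value, node: &CoercionNode) {
    // SQL semantics: NULL is never coerced; it propagates unchanged.
    if matches!(*value, Value::Null) {
        return;
    }
    match node {
        CoercionNode::Coercion(Coercion::ToString) => {
            if let Value::Number(n) = value {
                let coerced = n.to_string();
                *value = Value::String(coerced);
            }
        }
    }
}
```

The guard keeps NULL handling in one place instead of requiring every coercion arm to special-case it.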
Which issue does this PR close?
Closes #1852
Rationale
Delta Lake is an open-source storage framework for lakehouse architectures, and it is very popular in modern streaming analytics stacks.
What changed?
Introduces a Delta Lake Sink Connector that enables writing data from Iggy to Delta Lake.
The Delta Lake writing logic is heavily inspired by the kafka-delta-ingest project, providing a proven starting point for writing to Delta Lake.
Local Execution
Tested locally with a sample data producer emitting records with the schema: user_id: String, user_type: u8, email: String, source: String, state: String, created_at: DateTime<Utc>, message: String.
AI Usage
If AI tools were used, please answer: