feat(protobuf): support any for protobuf message source #12291
Rest LGTM
RisingWave doesn't wrap messages in
I don't get the question.
Now that the any type is supported, the returned type would be a
@BugenZhao When will this PR be merged?
We are currently awaiting approval from the module owners for this PR. We appreciate your patience; you can subscribe to this PR's notifications to stay informed.
Codecov Report
@@ Coverage Diff @@
## main #12291 +/- ##
==========================================
- Coverage 68.26% 68.16% -0.11%
==========================================
Files 1505 1506 +1
Lines 254901 255327 +426
==========================================
+ Hits 174015 174034 +19
- Misses 80886 81293 +407
... and 20 files with indirect coverage changes
I think it is not expected.
There is a potential for naming conflicts. Say the user gives the int64 fields of B and C the same name, i.e.,
message A {
    any v1 = 1;
    int64 v2 = 2;
}
message B {
    any v3 = 1;
    int64 v4 = 2;
}
message C {
    int64 v4 = 1;
    string v5 = 2;
    bytes v6 = 3;
}
In the above case, there could potentially exist two kinds of ingested messages:
In addition, any idea about the gitguardian error?
Just ignore it; it is not related to this topic.
Yes, you are right, it can be ambiguous if the message names collide. But we should change the original message content as little as possible, and giving a prefix to each field runs counter to that idea.
I think it is not confusing.
Yes, the customer told us that they will ensure only one message type in the
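The collision concern raised in this thread can be illustrated with a minimal sketch. It assumes, purely for illustration, that the fields of an outer message and of the message packed in its any field were flattened into one map; the `flatten` helper and the string-typed values are hypothetical and are not part of this PR. Keeping the inner message nested under its own key, as the PR does, avoids the clash.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: merge the fields of an outer and an inner message
/// into one flat map and report which field names appeared more than once.
fn flatten(
    outer: &[(&str, &str)],
    inner: &[(&str, &str)],
) -> (BTreeMap<String, String>, Vec<String>) {
    let mut merged = BTreeMap::new();
    let mut collisions = Vec::new();
    for (name, value) in outer.iter().chain(inner.iter()) {
        // `insert` returns the previous value if the key already existed.
        if merged.insert(name.to_string(), value.to_string()).is_some() {
            collisions.push(name.to_string());
        }
    }
    (merged, collisions)
}

fn main() {
    // B.v4 and C.v4 share a name, as in the example messages.
    let b = [("v4", "1")];
    let c = [("v4", "2"), ("v5", "some string")];
    let (merged, collisions) = flatten(&b, &c);
    println!("merged = {:?}, collisions = {:?}", merged, collisions);
}
```

In the flat map, the later `v4` silently overwrites the earlier one, which is the ambiguity the nested layout sidesteps.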
It sounds like the conclusion will be this, right?
{
    "__type": B.full_name(),
    "v3": {
        "__type": C.full_name(),
        "v5": 2,
        "v6": "some string",
        "v7": NULL // <-- JSON cannot handle the bytes type; I think it is OK to leave NULL here and give a warning
    },
    "v4": 1,
}
LGTM 👍 A little suggestion: a single underscore (
The current message will be displayed like this. The example use case is as below. The schema of the messages is as below.
One thing to note is that we now support
match fields[1].clone() {
    Some(ScalarImpl::Jsonb(jv)) => {
        assert_eq!(
            jv,
            JsonbVal::from(json!({
                "_type": "test.AnyValue",
                "any_value_1": {
                    "_type": "test.StringValue",
                    "value": "114514",
                },
                "any_value_2": {
                    "_type": "test.Int32Value",
                    "value": 114514,
                }
            }))
        );
    }
    _ => panic!("Expected ScalarImpl::Jsonb"),
}
This PR could be merged after final reviews, cc @tabVersion @fuyufjh @Rossil2012.
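For readers who want to see how a `_type`-tagged nested layout like the one asserted above could be produced, here is a minimal, standard-library-only sketch. The `Message`, `Field`, and `Json` types and the `to_json` and `render` functions are hypothetical stand-ins, not RisingWave's actual parser or its `JsonbVal` type. Mapping bytes fields to null follows the suggestion made earlier in this thread and is an assumption about behavior, not a statement of what was merged.

```rust
/// A tiny JSON-like value, standing in for a real JSON library (hypothetical).
enum Json {
    Null,
    Int(i64),
    Str(String),
    Object(Vec<(String, Json)>),
}

/// A decoded protobuf message: its full name plus named fields (hypothetical).
struct Message {
    full_name: String,
    fields: Vec<(String, Field)>,
}

enum Field {
    Int(i64),
    Str(String),
    Bytes(Vec<u8>),
    Nested(Message),
}

/// Convert a message to a JSON object whose first key is `_type`.
fn to_json(msg: &Message) -> Json {
    let mut entries = vec![("_type".to_string(), Json::Str(msg.full_name.clone()))];
    for (name, field) in &msg.fields {
        let value = match field {
            Field::Int(i) => Json::Int(*i),
            Field::Str(s) => Json::Str(s.clone()),
            // Assumption: bytes are emitted as null, as suggested in the thread.
            Field::Bytes(_) => Json::Null,
            // Nested messages recurse and carry their own `_type` tag.
            Field::Nested(inner) => to_json(inner),
        };
        entries.push((name.clone(), value));
    }
    Json::Object(entries)
}

/// Render compact JSON text. Debug quoting is adequate for simple ASCII strings.
fn render(v: &Json) -> String {
    match v {
        Json::Null => "null".to_string(),
        Json::Int(i) => i.to_string(),
        Json::Str(s) => format!("{:?}", s),
        Json::Object(entries) => {
            let parts: Vec<String> = entries
                .iter()
                .map(|(k, v)| format!("{:?}:{}", k, render(v)))
                .collect();
            format!("{{{}}}", parts.join(","))
        }
    }
}

fn main() {
    let inner = Message {
        full_name: "test.StringValue".to_string(),
        fields: vec![("value".to_string(), Field::Str("114514".to_string()))],
    };
    let outer = Message {
        full_name: "test.AnyValue".to_string(),
        fields: vec![("any_value_1".to_string(), Field::Nested(inner))],
    };
    println!("{}", render(&to_json(&outer)));
}
```

Because each nested object carries its own `_type`, fields with the same name in different messages stay in separate objects.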
…exist from upstream sink
LGTM! Thanks for finishing the pr🥹
if key.is_empty() {
    key = "Int16".to_string();
}
I think we always get a struct here. When would we encounter the missing-key case?
When encountering an any type holding a "nested" struct, we cannot get the actual field names inside the struct.
E.g., for the proto below, we can only get the `type_url` and `value` for the `second` and `third` fields in `StringStringInt32Value`, rather than the actual field names in those structs. In this case, the current workaround is to use the anonymous type name as the key; that is when `full_name_vec` turns out to be `None`.
message TestAny {
    int32 id = 1;
    google.protobuf.Any any_value = 2;
}
message StringStringInt32Value {
    string first = 1;
    StringInt32Value second = 2;
    Float32StringValue third = 3;
}
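The workaround described above can be sketched roughly as follows. Both function names are hypothetical and are not taken from the PR. The sketch relies on the protobuf convention that the segment after the last `/` in an any value's `type_url` is the fully qualified message name, and it mirrors the review snippet in which an empty key falls back to a type name such as `"Int16"`.

```rust
/// Extract the message full name from an `Any` type URL, e.g.
/// "type.googleapis.com/test.StringValue" yields "test.StringValue".
/// If there is no '/', the whole string is returned unchanged.
fn full_name_from_type_url(type_url: &str) -> &str {
    match type_url.rfind('/') {
        Some(idx) => &type_url[idx + 1..],
        None => type_url,
    }
}

/// Hypothetical helper: choose the JSON key for a decoded value.
/// Use the field name when it is known and non-empty; otherwise
/// fall back to the (anonymous) type name.
fn key_for(field_name: Option<&str>, type_name: &str) -> String {
    match field_name {
        Some(name) if !name.is_empty() => name.to_string(),
        _ => type_name.to_string(),
    }
}

fn main() {
    let name = full_name_from_type_url("type.googleapis.com/test.StringValue");
    println!("full name: {}", name);
    // Field name unknown inside a nested struct: fall back to the type name.
    println!("key: {}", key_for(None, "Int16"));
}
```

Falling back to the type name keeps every value addressable, at the cost of keys that no longer match the names in the `.proto` file.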
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
resolve #12246
Checklist
`./risedev check` (or alias, `./risedev c`)
Documentation
Release note
We now support the any type for protobuf sources; the relevant documentation may need to be updated.