Bug 1624908 Add sendFailure subfields for health ping #565
Integration report for "Bug 1624908 Add sendFailure subfields for health ping": Report for upstream
3a8f146 to a4e091e
Integration report for "Add pingDiscardedForSize subfields": Report for upstream
"properties": {
"<unknown>": {
@acmiyaguchi From the bq schema diff, it looks like this field with name "<unknown>" just gets dropped and is absent from the bq schema. Does that sound like intended behavior?
I realize that generating a sane bq-compatible field name for "<unknown>" is a bit of a stretch, but I was expecting that this might be normalized to _unknown_.
No, there is currently no behavior for normalizing an empty string as a key into BigQuery. This is certainly something that may be added, perhaps mapping it to "_" or "__unknown__". However, this key would conflict with other JSON keys like ".", "_", or "__". I think the right behavior would be to panic or drop the fields in the case where the transpiler detects these name clashes.
Would this be handled deterministically in the decoder?
>> x = {}; x[""]=1; x["."]=2; x["_"]=3;
3
>> x
Object { "": 1, ".": 2, _: 3 }
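The clash concern above can be made concrete with a small sketch. It assumes a hypothetical normalization rule (not the transpiler's actual behavior) that replaces every character outside [A-Za-z0-9_] with an underscore and maps an empty key to "_". The names `normalize` and `find_clashes` are illustrative only.

```python
import re
from collections import defaultdict


def normalize(key: str) -> str:
    """Hypothetical rule: replace anything outside [A-Za-z0-9_] with '_'."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", key)
    # Assumption: an empty key maps to a single underscore.
    return name or "_"


def find_clashes(keys):
    """Group original keys by normalized name; keep only groups that clash."""
    groups = defaultdict(list)
    for key in keys:
        groups[normalize(key)].append(key)
    return {name: originals for name, originals in groups.items() if len(originals) > 1}


# The keys from the console example above, plus two from the schema.
clashes = find_clashes(["", ".", "_", "<unknown>", "foo"])
print(clashes)
```

Under this rule "", "." and "_" all collapse to "_", so a transpiler using it would have to either reject the schema or pick a winner, which is the panic-or-drop choice raised in the comment.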
I'm actually not sure why <unknown> is converted into ""; that's unintended behavior for sure.
Redash was converting "<unknown>" to "". The BQ console shows the value as "". So I would expect the schema generator to be seeing "<unknown>" as the input name and potentially normalizing to _unknown_.
Yeesh, having this discussion is difficult because literally typing <unknown> gets interpreted as an HTML tag and hidden.
I was able to reproduce this locally in a shell, outside of any rendering done by javascript/browser.
% echo '{"properties": {"payload": {"properties": {"<unknown>": {"type": "string"}, "foo": {"type": "string"}}}}}' | jq
{
  "properties": {
    "payload": {
      "properties": {
        "<unknown>": {
          "type": "string"
        },
        "foo": {
          "type": "string"
        }
      }
    }
  }
}
% echo '{"properties": {"payload": {"properties": {"<unknown>": {"type": "string"}, "foo": {"type": "string"}}}}}' | jsonschema-transpiler -t bigquery
[
  {
    "fields": [
      {
        "mode": "NULLABLE",
        "name": "foo",
        "type": "STRING"
      }
    ],
    "mode": "NULLABLE",
    "name": "payload",
    "type": "RECORD"
  }
]
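To spot keys that might be at risk before running the transpiler, one could walk the schema and flag property names that do not match the classic BigQuery column-name pattern (letters, digits, and underscores, not starting with a digit). This is only a sketch under that assumption: it does not reproduce jsonschema-transpiler's logic, and `risky_property_names` is a hypothetical helper.

```python
import json
import re

# Assumption: the classic BigQuery column-name rule.
VALID_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def risky_property_names(schema, path=()):
    """Yield dotted paths of properties whose names fail VALID_NAME."""
    for name, subschema in schema.get("properties", {}).items():
        here = path + (name,)
        if not VALID_NAME.match(name):
            yield ".".join(here)
        # Recurse into nested object schemas.
        if isinstance(subschema, dict):
            yield from risky_property_names(subschema, here)


# Same input as the shell reproduction above.
schema = json.loads(
    '{"properties": {"payload": {"properties": '
    '{"<unknown>": {"type": "string"}, "foo": {"type": "string"}}}}}'
)
flagged = list(risky_property_names(schema))
print(flagged)
```

A check like this could run in CI so that a silently dropped field shows up as a review comment rather than as a surprise in the bq schema diff.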
"properties": {
"abort": {
We include the list of observed keys as seen in query: https://sql.telemetry.mozilla.org/queries/72242/source
"properties": {
"<unknown>": {
We include the observed keys as seen in query: https://sql.telemetry.mozilla.org/queries/72243/source
We'll go this route for now rather than trying to add the fields directly as proposed in mozilla-services/mozilla-pipeline-schemas#565. This structure in the view is generally compatible with the schema change that would involve concrete fields, in case we still want to go that direction in the future.
Closing in favor of mozilla/bigquery-etl#1173
Checklist for reviewer:
.circleci/config.yml) will cause environment variables (particularly credentials) to be exposed in test logs
integration CI test by pushing this revision as discussed in the README and review the report posted in the comments.
For glean changes:
include/glean/CHANGELOG.md