diff --git a/docs/images/bigquery-json-flow-diagram.svg b/docs/images/bigquery-json-flow-diagram.svg
new file mode 100644
index 00000000..02401616
--- /dev/null
+++ b/docs/images/bigquery-json-flow-diagram.svg
@@ -0,0 +1 @@
+[SVG sequence diagram, markup not recoverable. Participants: BigquerySinkFactory, BigquerySink, JsonErrorHandler, Google Bigquery. Alt branches: [is success], [no such field error]. Messages, in order: Create table with default columns; initiate sink for json messages; write json messages; successfully written records to bigquery; given new json attributes not present in bq table schema; parse messages with no such fields errors; add new fields in json messages to bigquery table; successfully added new fields to bigquery table; retry writing json messages.]
\ No newline at end of file
diff --git a/docs/sinks/bigquery.md b/docs/sinks/bigquery.md
index 7400a307..63f13622 100644
--- a/docs/sinks/bigquery.md
+++ b/docs/sinks/bigquery.md
@@ -11,13 +11,17 @@ Bigquery Sink has several responsibilities, first creation of bigquery table and
 Currently we support dynamic schema by inferring from incoming json data; so the bigquery schema is updated by taking a diff of fields in json data and actual table fields. Currently we only support string data type for fields, so all incoming json data values are converted to string type, Except for metadata columns and partion key.
-
 ## Bigquery Table Schema Update
 ### Protobuf
 Bigquery Sink update the bigquery table schema on separate table update operation. Bigquery utilise [Stencil](https://github.com/odpf/stencil) to parse protobuf messages generate schema and update bigquery tables with the latest schema. The stencil client periodically reload the descriptor cache. Table schema update happened after the descriptor caches uploaded.
+### JSON
+Bigquery Sink creates the table with initial columns mentioned in the config. When new fields arrive in json data they are added to bigquery table.
+### Flow chart for data type json sink and schema update
+![](../images/bigquery-json-flow-diagram.svg)
+
 ## Protobuf - Bigquery Table Type Mapping
 Here are type conversion between protobuf type and bigquery type :