Add instrumentation and logging to BQ sink. #60

Closed
lavkesh opened this issue Jul 6, 2021 · 2 comments · Fixed by #78

@lavkesh
Member

lavkesh commented Jul 6, 2021

Acceptance criteria:

  • Analysis of metrics for BQ sink.
  • Implementation.

Discussion:

Insert time.
Counter for success/failures of insert messages.
Number of error messages (deserialisation and response errors from BigQuery) or messages sent to the DLQ.
Log the offset info when an error happens.
Table/dataset creation logging and metrics.
Stencil proto update logging and metrics (log exceptions etc.).
Think about completeness/freshness/deduplication.
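
A minimal sketch of the "log the offset info when the error happens" item, in Python for brevity. The `OffsetInfo` fields, the logger name, and the message layout are assumptions for illustration, not the sink's actual classes.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("bq_sink")  # hypothetical logger name


@dataclass(frozen=True)
class OffsetInfo:
    """Kafka coordinates of one message (field names are assumptions)."""
    topic: str
    partition: int
    offset: int


def log_insert_error(info: OffsetInfo, reason: str) -> str:
    """Log a failed insert with enough context to locate the message again."""
    line = (
        f"insert failed: topic={info.topic} partition={info.partition} "
        f"offset={info.offset} reason={reason}"
    )
    logger.error(line)
    return line
```

Returning the formatted line is only there to make the sketch easy to check; a real sink would just log it.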

@ravisuhag ravisuhag added this to Pending in Roadmap 2021 H2 Jul 6, 2021
@lavkesh lavkesh self-assigned this Jul 27, 2021
@lavkesh lavkesh moved this from Pending to Progress in Roadmap 2021 H2 Jul 27, 2021
@lavkesh
Member Author

lavkesh commented Aug 3, 2021

Core Metrics firehose analysis.

Option 1:

sink_messages_total(type)

  • Counter to indicate how many messages passed through the sink.
  • The values of type can be (total, success, failures).
  • sum(sink_messages_total(type=total)) = sum(sink_messages_total(type=success|failures))
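
The identity above can be sketched as follows. This is an illustration only: `counters` is an in-memory stand-in for a real metrics client, and `record_sink_batch` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical in-memory stand-in for a metrics client; keys are (metric, type).
counters = Counter()


def record_sink_batch(total: int, failed: int) -> None:
    """Count one sink push: each message is in total and in exactly one of success/failures."""
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and total")
    counters[("sink_messages_total", "total")] += total
    counters[("sink_messages_total", "success")] += total - failed
    counters[("sink_messages_total", "failures")] += failed


def sink_identity_holds() -> bool:
    """Check total == success + failures for sink_messages_total."""
    name = "sink_messages_total"
    return counters[(name, "total")] == counters[(name, "success")] + counters[(name, "failures")]
```

Because all three values are incremented in the same call, the identity holds after every batch rather than only eventually.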

global_messages_total(type)

  • Counter to indicate how many unique messages passed through Firehose.
  • type can be (consumer, sink, dlq, ignored, filtered)
  • sum(global_messages_total(type=consumer)) = sum(global_messages_total(type=sink|dlq|ignored))
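
One way to use this identity is as a reconciliation check over a snapshot of the counter values. The sketch below follows the formula exactly as written (consumer = sink + dlq + ignored); the function name and the snapshot shape are assumptions.

```python
from typing import Mapping


def unaccounted_messages(counts: Mapping[str, int]) -> int:
    """Return consumed messages that have not reached sink, dlq, or ignored.

    counts maps each global_messages_total type to its current value.
    A non-zero result means the identity from the proposal does not hold.
    """
    terminal = sum(counts.get(t, 0) for t in ("sink", "dlq", "ignored"))
    return counts.get("consumer", 0) - terminal
```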

retry_messages_total(type,error_type)

  • counter to indicate how many messages were processed by SinkWithRetryDecorator
  • type can be (total, success, failures)
  • error_type values are defined in the ErrorTypes class.

dlq_messages_total(type,error_type)

  • counter to indicate how many messages were processed by SinkWithDLQ
  • type can be (total, success, failures)

error_messages_total(errortype)

  • Counter to indicate how many messages have errors after being processed by the sink.
  • errortype values are defined in the ErrorTypes class.
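
A sketch of a counter labelled by error type. The two enum members below are placeholders chosen for illustration; the real values are whatever the ErrorTypes class defines.

```python
from collections import Counter
from enum import Enum
from typing import Iterable


class ErrorType(Enum):
    # Placeholder members; the actual list lives in the ErrorTypes class.
    DESERIALIZATION_ERROR = "deserialization_error"
    SINK_RESPONSE_ERROR = "sink_response_error"


# In-memory stand-in for error_messages_total(errortype).
error_messages_total = Counter()


def record_errors(failed: Iterable[ErrorType]) -> None:
    """Increment the counter once per failed message, keyed by its error type."""
    error_messages_total.update(error.value for error in failed)
```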

Option 2:
input_message_total(scope)

  • counter to indicate input messages to different scopes.
  • scope= consumer,sink,retry,dlq

success_message_total(scope)

  • counter to indicate success messages for different scopes
  • scope= consumer,sink,retry,dlq

failed_message_total(scope)

  • counter to indicate failed messages for different scopes
  • scope= consumer,sink,retry,dlq
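
For comparison with Option 1, Option 2 could be sketched like this: three metric names, each labelled by scope. The `record` helper and the in-memory `Counter` are assumptions standing in for a real metrics client.

```python
from collections import Counter

SCOPES = ("consumer", "sink", "retry", "dlq")

# In-memory stand-in for a metrics client; keys are (metric, scope).
scoped = Counter()


def record(scope: str, succeeded: int, failed: int) -> None:
    """Record one unit of work for a scope across the three Option 2 counters."""
    if scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope}")
    scoped[("input_message_total", scope)] += succeeded + failed
    scoped[("success_message_total", scope)] += succeeded
    scoped[("failed_message_total", scope)] += failed
```

With this shape the label set is the same on every metric, so a dashboard can compare scopes by changing a single label value.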

@lavkesh
Member Author

lavkesh commented Aug 12, 2021

Insert time -> BQ insert time. [Sink metrics]
Counter for success/failures of insert messages. [Sink metrics]
Number of error messages (deserialisation and response errors from BigQuery) or messages sent to the DLQ. [Sink metrics]
Log the offset info when an error happens. [Log BigQuery errors while parsing, and print topic, offset, partition in abstract sink]
Table/dataset creation logging and metrics. [BigqueryOperationTotal, api_name=, table=, dataset=] [time_taken_operation]
Stencil proto update logging (log exceptions etc.).
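
The table/dataset operation metrics could be captured with a small wrapper like the one below. It is a sketch under assumptions: the two dictionaries stand in for BigqueryOperationTotal and the time-taken metric, and the `api_name` value in the usage note is made up.

```python
import time
from collections import Counter, defaultdict
from contextlib import contextmanager

operation_total = Counter()            # stands in for BigqueryOperationTotal
operation_seconds = defaultdict(list)  # stands in for the time-taken metric


@contextmanager
def bq_operation(api_name: str, table: str, dataset: str):
    """Count and time one BigQuery admin call, labelled by api_name, table, dataset."""
    labels = (api_name, table, dataset)
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even when the wrapped call raises, so failures are timed too.
        operation_total[labels] += 1
        operation_seconds[labels].append(time.perf_counter() - start)
```

Usage would look like `with bq_operation("table_create", "bookings", "raw"): ...` around the client call that creates or updates the table.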

Think about completeness/freshness/deduplication.

@lavkesh lavkesh linked a pull request Aug 12, 2021 that will close this issue
@lavkesh lavkesh moved this from Progress to Review in Roadmap 2021 H2 Aug 12, 2021
@lavkesh lavkesh moved this from Review to Done in Roadmap 2021 H2 Aug 23, 2021
2 participants