
Monitoring

Daria Kharlan edited this page May 26, 2022 · 7 revisions

When you create a destination, the agent starts pushing system metrics to Anodot. You can install this dashboard to see the metrics: agent monitoring dashboard

Available metrics

  • Agent version

Streamsets state:

  • streamsets_cpu
  • streamsets_heap_memory_used_bytes
  • streamsets_non_heap_memory_used_bytes

Pipelines state:

  • pipeline_incoming_records_total
  • pipeline_outgoing_records_total
  • pipeline_error_records
  • pipeline_destination_latency_seconds - time it takes Anodot to process one batch
  • pipeline_source_latency_seconds - time it takes to execute a query against the source
  • pipeline_status
  • pipeline_avg_lag_seconds - the average difference, in seconds, between the last set of datapoints and the current time. If this value keeps increasing, the pipeline probably can't process all the data in time

Other:

  • kafka_consumer_lag for each topic
  • scheduled_scripts_errors (system cron jobs)

Total number of metrics generated by monitoring: 2 + number of StreamSets instances * 3 + number of pipelines * 15
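As a sketch, the formula above can be written as a small helper (the instance and pipeline counts in the example are hypothetical):

```python
def total_monitoring_metrics(num_streamsets: int, num_pipelines: int) -> int:
    """Total metrics exposed by monitoring: 2 fixed metrics,
    plus 3 per StreamSets instance and 15 per pipeline."""
    return 2 + num_streamsets * 3 + num_pipelines * 15

# Example: 1 StreamSets instance and 4 pipelines
print(total_monitoring_metrics(1, 4))  # 2 + 3 + 60 = 65
```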

Suggested alerts:

  • pipeline_outgoing_records_total drops or has no data

How to disable sending metrics

Set the environment variable MONITORING_SEND_TO_CLIENT=false for the agent container
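For example, if the agent runs under Docker Compose, the variable could be set in the service definition (the service and image names here are assumptions, not the project's actual names):

```yaml
services:
  agent:
    image: anodot/agent        # assumed image name
    environment:
      MONITORING_SEND_TO_CLIENT: "false"
```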

Monitoring the agent with Prometheus

You can add the agent as a scrape target for Prometheus. Metrics are available at http://<agent-host>:8080/metrics
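A minimal Prometheus scrape configuration for this might look as follows (the job name and host are placeholders; the default metrics_path of /metrics matches the endpoint above):

```yaml
scrape_configs:
  - job_name: anodot-agent
    static_configs:
      - targets: ['agent-host:8080']   # the agent exposes /metrics on port 8080
```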

Recommended alerts to set:

  • Non-negative delta of pipeline_outgoing_records_total equals 0, or no data. This lets you detect when a pipeline stops sending data
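One way to express that alert as a Prometheus alerting rule, sketched with assumed thresholds, group names, and label names (increase() already handles counter resets, so the delta is non-negative):

```yaml
groups:
  - name: agent-pipelines
    rules:
      - alert: PipelineStoppedSendingData
        # Fires when no records were sent in the last 15 minutes,
        # or the metric is absent entirely (no data)
        expr: increase(pipeline_outgoing_records_total[15m]) == 0 or absent(pipeline_outgoing_records_total)
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "A pipeline has stopped sending data"
```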