
Monitoring

Daria Kharlan edited this page May 26, 2022 · 7 revisions

When you create a destination, the agent starts pushing system metrics to Anodot. You can install this dashboard to see the metrics: agent monitoring dashboard

Available metrics

  • Agent version

Streamsets state:

  • streamsets_cpu
  • streamsets_heap_memory_used_bytes
  • streamsets_non_heap_memory_used_bytes

Pipelines state:

  • pipeline_incoming_records_total
  • pipeline_outgoing_records_total
  • pipeline_error_records
  • pipeline_destination_latency_seconds - time it takes Anodot to process one batch
  • pipeline_source_latency_seconds - time it takes to execute a query against the source
  • pipeline_status
  • pipeline_avg_lag_seconds - the average difference, in seconds, between the last set of datapoints and the current time. If this value keeps increasing, the pipeline probably can't process all the data in time

Other:

  • kafka_consumer_lag for each topic
  • scheduled_scripts_errors (system cron jobs)

Total number of metrics generated by monitoring: 2 + number of StreamSets instances * 3 + number of pipelines * 15
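As a sketch, the formula above can be written as a small helper (the instance and pipeline counts in the example are hypothetical):

```python
def total_monitoring_metrics(num_streamsets: int, num_pipelines: int) -> int:
    """Total metrics exposed by monitoring: 2 fixed metrics,
    plus 3 per StreamSets instance and 15 per pipeline."""
    return 2 + num_streamsets * 3 + num_pipelines * 15

# Example: 1 StreamSets instance and 4 pipelines
print(total_monitoring_metrics(1, 4))  # 2 + 3 + 60 = 65
```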

Suggested alerts:

  • pipeline_outgoing_records_total drops or has no data

How to disable sending metrics

Set the environment variable MONITORING_SEND_TO_CLIENT=false for the agent container
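For example, if the agent runs under Docker Compose, the variable could be set in the service definition (the service and image names here are assumptions, not the project's actual names):

```yaml
services:
  agent:
    image: anodot/agent        # assumed image name
    environment:
      MONITORING_SEND_TO_CLIENT: "false"
```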

Monitoring the agent with Prometheus

You can add the agent as a scrape target for Prometheus. Metrics are available at http://<agent-host>:8080/metrics
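A minimal Prometheus scrape configuration for this might look as follows (the job name and host are placeholders; the default metrics_path of /metrics matches the endpoint above):

```yaml
scrape_configs:
  - job_name: anodot-agent
    static_configs:
      - targets: ['agent-host:8080']   # the agent exposes /metrics on port 8080
```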

Recommended alerts to set:

  • Non-negative delta of pipeline_outgoing_records_total equals 0, or no data. This lets you detect when a pipeline stops sending data
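One way to express that alert as a Prometheus alerting rule, sketched with assumed thresholds, group names, and label names (increase() already handles counter resets, so the delta is non-negative):

```yaml
groups:
  - name: agent-pipelines
    rules:
      - alert: PipelineStoppedSendingData
        # Fires when no records were sent in the last 15 minutes,
        # or the metric is absent entirely (no data)
        expr: increase(pipeline_outgoing_records_total[15m]) == 0 or absent(pipeline_outgoing_records_total)
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "A pipeline has stopped sending data"
```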