Conversation
import org.influxdb.InfluxDBFactory; | ||
import org.influxdb.dto.BatchPoints; | ||
import org.influxdb.dto.Point; | ||
import org.influxdb.BatchOptions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort
serverHost = (String) conf.get(SERVER_HOST_KEY); | ||
|
||
// The InfluxDB client expects the host connection string to begin with http or https | ||
if(!serverHost.contains("http://") & !serverHost.contains("https://")){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
startsWith() and && might be better?
And please add a space after "if" and before "{"
* @param context context object containing information on the Topology and Heron system this sink | ||
* is registered with. | ||
*/ | ||
public void init(Map<String, Object> conf, SinkContext context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is a bit long. suggest to refactor some blocks into smaller ones.
*/ | ||
public void flush() { | ||
if(batchEnabled) { | ||
LOG.info("Flushing buffered metrics to InfluxDB"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bad indent
@nwangtw Hope those changes address your comments. Thanks for looking at this for me. |
Out of curiosity, is there a pull based alternative to this which doesn't require changes? I was inclined to look as this introduces a lot of dependencies. Here's a post from May 2017 that made me think it might be an option:
|
I agree the added dependencies are excessive. I could rewrite the client to do raw HTTP requests but I would simply be replicating the work of Influx client. To cut down on dependencies, could I package the influx sink on maven central and have it as a single external dependency? Or is that overkill? The Prometheus/Kapacitor option is a great solution (wish I had found it before I wrote the Influx sink). However, it does require Kapacitor (a stream processing engine) to be setup, so we have a stream processing engine to monitor a stream processing engine ("who watches the watchmen", "it's turtles all the way down", etc etc). I think this highlights an issue with the centralised metrics sink design in Heron. In Storm the sink is part of the topology and is included in the deployed fat JAR, so is just pulled in as needed by the topology developers. I am wondering if there may be a way to do an "on demand" version of sinks for Heron? Probably not as they are part of the metrics manager. |
"I could rewrite the client to do raw HTTP requests but I would simply be replicating the work of Influx client." @Code0x58 made a good point. For a metrics sink, it is quite strange to me that so many dependencies are required. In theory, a sink client should be just a simple wrapper for a few HTTP requests, so it may only need a HTTP lib (and/or thrift requests but still). I am wondering how these dependencies are used and if there is a simpler solution. |
Well most of the dependencies come from the http lib that the influxDB java client is using (com_squareup_okhttp3). It looks excessive as I have to list every sub dependency in the WORKSPACE file. Is there a way for bazel to resolve the java dependencies itself? Then I would just need to include the InfluxDB client. If adding these deps really is a hard no, then I can rewrite the sink to just do HTTP requests on the InfluxDB line protocol using the Apache Common client we are already including. |
I see. It is not a hard no, just makes maintenance a bit harder. Personally I am ok (it seems this is the official influxDB client), if they are necessary. |
Altered the influx DB sink so that each topology has its own database on the influx server. Updated configs and documentation to match this change.
Changed the source tag so that it is parsed, as it is in the Tmaster sink, into seperate tags for HostPort, component and instance ID tags.
This PR adds a new metrics sink for InfluxDB. It has options for batch sending (at the flush frequency) or sending as soon as metrics are received. I have updated the website source with documentation and added config stubs to all the
metric-sinks.yaml
files I could find.This has been tested with the Local scheduler, but not any others.