Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(new sink): New sematext_metrics sink #3501

Merged
merged 31 commits into from
Oct 11, 2020

Conversation

StephenWakely
Copy link
Contributor

@StephenWakely StephenWakely commented Aug 19, 2020

Closes #1845

This is very much a work in progress.

The sink is working. I can set it up and it can send metrics into Sematext.

I have an issue with the smoke test. I am attempting to send 10 metrics events into the sink (Here : https://github.com/FungusHumungus/vector/blob/sematext_metrics/src/sinks/sematext_metrics.rs#L343), but no matter what I do I cannot get it to process more than 1 event (as witnessed by the println here - https://github.com/FungusHumungus/vector/blob/sematext_metrics/src/sinks/sematext_metrics.rs#L161.

I'm not sure if this is because I have misunderstood precisely how the sink should work, or if I have something wrong in the code, or indeed something wrong in the test..

Any ideas?

@StephenWakely
Copy link
Contributor Author

Another question. As far as I can tell (I still need to confirm this with Sematext support), the only useful metric types are single values - Counters and Guages.

What would be the best way to handle the situation should the sink receives any other type of metric?

@StephenWakely
Copy link
Contributor Author

I'm not sure if this is because I have misunderstood precisely how the sink should work, or if I have something wrong in the code, or indeed something wrong in the test..

Any ideas?

I have worked it out. It is to do with the batch settings. Without setting the batch.max_events it defaults to 20. So presumably, just sending 10 events to the sink means it is going to batch the events up and wait until it can get to 20. Setting max_events to 1 allows the test to recieve all 10 events.

It all makes sense now!

@binarylogic binarylogic requested review from JeanMertz, binarylogic and bruceg and removed request for lukesteensen and JeanMertz August 25, 2020 01:08
@StephenWakely
Copy link
Contributor Author

I think this is good to go now.

One thing I'm not sure I have done the best way is in sending the version number of Vector to Sematext. The version string returned by get_version in lib.rs is fairly long and would need to be url-encoded - the version string would then be about half of the entire request . To avoid this I am instead making my own simplified version string that mostly just contains 'vector' with the version number.
(https://github.com/FungusHumungus/vector/blob/sematext_metrics/src/sinks/sematext/metrics.rs#L46)

Not sure if that is the best way. If it is, maybe we would want to move this function into lib.rs so that other code can access it if needed?

Another possible improvement that may be worth considering, currently the sink is sending one request per metric to Sematext. It could be possible to group the events so every metric with the same namespace and timestamp in a batch would be sent in a single request. This could save network traffic if there are a lot of metrics sent through Vector in one go.

Copy link
Member

@bruceg bruceg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks mostly good, but a couple of cleanups needed and an internal event for the dropped metric warning/error.

src/sinks/influxdb/mod.rs Outdated Show resolved Hide resolved
src/sinks/sematext/mod.rs Outdated Show resolved Hide resolved
src/sinks/sematext/mod.rs Show resolved Hide resolved
src/sinks/sematext/metrics.rs Outdated Show resolved Hide resolved
src/sinks/sematext/metrics.rs Outdated Show resolved Hide resolved
src/sinks/sematext/metrics.rs Outdated Show resolved Hide resolved
src/sinks/sematext/metrics.rs Outdated Show resolved Hide resolved
src/sinks/sematext/metrics.rs Outdated Show resolved Hide resolved
src/sinks/sematext/metrics.rs Outdated Show resolved Hide resolved
src/internal_events/sematext_metrics.rs Outdated Show resolved Hide resolved
src/internal_events/sematext_metrics.rs Outdated Show resolved Hide resolved
@bruceg bruceg self-assigned this Aug 26, 2020
@bruceg bruceg added domain: sinks Anything related to the Vector's sinks provider: sematext Anything `sematext` service provider related type: feature A value-adding code addition that introduce new functionality. labels Aug 26, 2020
@StephenWakely
Copy link
Contributor Author

Aaarrg.. darn it. I think I have messed this pull request up. I rebased the branch with master, and it seems to have pulled in a ton of stuff that I don't need.

Really sorry, any idea what I need to do to fix this?

@MOZGIII
Copy link
Contributor

MOZGIII commented Aug 31, 2020

The easiest way is to do an interactive rebase (git rebase -i master) and drop the commits that it shows in the list that are coming from master. They don't have to exist in your branch. so rebase shouldn't show them in the final git rebase -i master.

@StephenWakely
Copy link
Contributor Author

@MOZGIII Awesome! That seems to have tidied it up, although it has still kept you on as a reviewer. I presume you can remove yourself if needed?

Relief! Many thanks.

@MOZGIII MOZGIII removed their request for review September 1, 2020 00:08
Copy link
Member

@bruceg bruceg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a possible pub change and a question for @binarylogic about the docs.

src/sinks/influxdb/mod.rs Outdated Show resolved Hide resolved
.meta/sinks/sematext_metrics.toml.erb Outdated Show resolved Hide resolved
@binarylogic
Copy link
Contributor

My biggest question here is the data model, which should align with #3552. That, unfortunately, is something that will likely require an RFC :/.

We could merge this and follow up with any changes, but that'll be a breaking change.

@StephenWakely
Copy link
Contributor Author

Have made the suggested changes.

I'm not sure I fully understand how this is not aligned with the data model from #3552?

@binarylogic
Copy link
Contributor

@FungusHumungus that's good to hear. It's more about consensus. In #3552 (comment), @drunkirishcoder proposes a public contract for the schema, which I agree with. We'll try to expedite our discussions there so we don't block this any longer than needed.

@binarylogic
Copy link
Contributor

I've opened #3779 for this.

@binarylogic
Copy link
Contributor

@FungusHumungus this should be good to go. A few things:

  1. Is the concern in the original PR description still valid?
  2. Did you manually test this to ensure it works?
  3. Need to resolve the conflicts.

Thanks.

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
@StephenWakely
Copy link
Contributor Author

@binarylogic Would you be able to have a look over the docs to check I have done them correctly?

In particular, I'm not 100% sure about the is the How it works section.

https://github.com/timberio/vector/pull/3501/files#diff-2c39791f3299d4d31f24a395ece3881f


components: sinks: sematext_metrics: {
title: "Sematext Metrics"
short_description: ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a one sentence description here.

StephenWakely and others added 3 commits October 11, 2020 00:41
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>
Copy link
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FungusHumungus docs looks great, nice work!

@StephenWakely StephenWakely merged commit 36dea21 into vectordotdev:master Oct 11, 2020
@StephenWakely StephenWakely deleted the sematext_metrics branch October 11, 2020 09:06
mengesb pushed a commit to jacobbraaten/vector that referenced this pull request Dec 9, 2020
* Initial wip for the sematext_metrics sink

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Moved Sematext modules into a single module

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Added documentation for the Sematext metrics sink

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Send the Vector version with the Sematext request

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Fixing issues following comments from bruceg

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Moved Sematext modules into a single module

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Fixing issues following comments from bruceg

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Updated sematext events in line with comments from BruceG.

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Changes following a merge with master

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Fixed input type in documentation

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Added relevant_when fields to docs for host and region fields.

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Restricted the visibility of a number of functions

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Added noun to sematext metrics docs

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Fixed clippy warnings

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Fixes following a merge with master

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Added a better healthcheck url

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* sink.run no longer takes Result

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Renamed host parameter to endpoint

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Renamed events following similar changes to statsd events

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Changes following a review from @JeanMertz

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Changes following a merge with master

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Add docs cue file

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Further changes following merge with master

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Fix cue formatting

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Further attempt to fix cue formatting.

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Use tabs not spaces

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Fixed docs cue errors

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Fixed cue doc following merge with master

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Added a short description to the docs

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

* Fixed cue formatting

Signed-off-by: Stephen Wakely <fungus.humungus@gmail.com>

Co-authored-by: binarylogic <bjohnson@binarylogic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: sinks Anything related to the Vector's sinks provider: sematext Anything `sematext` service provider related type: feature A value-adding code addition that introduce new functionality.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add sematext_metrics sink (wrap Influx API?)
7 participants