New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide global traffic counters #4357

Merged
merged 19 commits into from Nov 30, 2017

Conversation

Projects
None yet
2 participants
@kroepke
Member

kroepke commented Nov 16, 2017

This change adds a section to the System overview page, displaying a histogram of bytes Graylog has accepted and sent to Elasticsearch.

  • Expose input and output traffic as metrics
  • Once per minute update the hourly bucket for each node
  • Show daily histogram graphs on Overview page
  • Expose hourly buckets in REST API (in order to accomodate other timezones more accurately than by providing pre-aggregated daily buckets in UTC)
  • Properly size graphs according to container width
  • Refresh traffic graph while page is visible (the backend only updates once a minute, though)
  • Do not count traffic coming from archive restoring (potentially track a third traffic value for it)

@kroepke kroepke added the in progress label Nov 16, 2017

@kroepke kroepke added this to the 3.0.0 milestone Nov 16, 2017

@kroepke kroepke self-assigned this Nov 16, 2017

@kroepke kroepke force-pushed the traffic-counter branch from e41bdd6 to ca9ca04 Nov 16, 2017

kroepke added some commits Nov 16, 2017

adding a rudimentary hourly output traffic counter
display a histogram on the Overview page, currently only showing daily traffic
track global MessageInput raw message length
the global counter makes it much easier to keep track of traffic, as we don't have to filter for various meters
turn the output counter into a global metric name as well
extend traffic data objects for input values, and display a second graph on the overview page
allow background traffic to bypass regular traffic counter
instead this type of traffic is counted towards "system traffic"
resize traffic graphs as window resizes
still ugly at very small widths, but I'll live with that for now

@kroepke kroepke force-pushed the traffic-counter branch from 402f23c to 5b178a5 Nov 24, 2017

@bernd bernd assigned bernd and unassigned kroepke Nov 27, 2017

final DateTime now = Tools.nowUTC();
final int secondOfMinute = now.getSecondOfMinute();
// on the top of every minute, we flush the current throughput
if (secondOfMinute == 0) {

This comment has been minimized.

@bernd

bernd Nov 27, 2017

Member

What happens if this doesn't run at second 0? (e.g. when the load is high) Is that a problem?

This comment has been minimized.

@kroepke

kroepke Nov 28, 2017

Member

The periodical gets called every second, and only adds to the total, so I think it's safe.
Worst case we'll not update the traffic for another minute, but I doubt that any traffic happens if a periodical has to wait for over a second to run.

msgSize += FIELD_MESSAGE.length() + message.length();
obj.put(FIELD_MESSAGE, message);
final String source = getSource();
msgSize += FIELD_SOURCE.length() + message.length();

This comment has been minimized.

@bernd

bernd Nov 27, 2017

Member

I guess this should be source.length(). 😄

This comment has been minimized.

@kroepke

kroepke Nov 28, 2017

Member

Ahem. Yes. I'll add a test.

@Produces(MediaType.APPLICATION_JSON)
public class TrafficResource extends RestResource {

public static final ImmutableSet<String> RESOLUTION = ImmutableSet.of("days", "hours");

This comment has been minimized.

@bernd

bernd Nov 27, 2017

Member

This constant is unused. Is it a leftover or supposed to be in use?

This comment has been minimized.

@kroepke

kroepke Nov 28, 2017

Member

Left over from when the resource could preaggregate to days. I'll remove

propTypes: {
from: PropTypes.string.isRequired,
to: PropTypes.string.isRequired,
traffic: PropTypes.object.isRequired,

This comment has been minimized.

@bernd

bernd Nov 27, 2017

Member

Can you use PropType.shape() here to make it easier to see which structure is required?

This comment has been minimized.

@kroepke

kroepke Nov 28, 2017

Member

I think that's not possible, because traffic simply has key/value pairs and not a certain shape.
I'll add a comment instead.

classSizes.put(boolean.class, 4);
classSizes.put(int.class, 4);
classSizes.put(float.class, 4);
classSizes.put(long.class, 8);

This comment has been minimized.

@bernd

bernd Nov 27, 2017

Member

Does this lookup also work for the autoboxed types? (Long.class, ...)

This comment has been minimized.

@kroepke

kroepke Nov 28, 2017

Member

No it does not. Good catch, I'll add a test.

kroepke added some commits Nov 28, 2017

fix byte counting in message fields
add test for expected sizes
don't count message field twice
don't count stream field name if no streams are present

added tests

kroepke added some commits Nov 28, 2017

refactor message size estimation
instead of counting the message size during output serialization, perform
a rolling accounting of field sizes when mutating the message

this has the added benefit to be able to cheaply get the current size for
 various purposes (such as estimating the size difference for enrichment)

also track global metric for message sizes right after decoding them from
the journal but before running any processing

DateTime, Date, ZonedDateTime and streams now count for 8 bytes, instead
of their string representation
}
long newValueSize = 0;
long oldValueSize = 0;
final long oldSize = sizeCounter.getCount();

This comment has been minimized.

@bernd

bernd Nov 29, 2017

Member

The oldSize variable is only used in the TRACE log message and could be moved into the isTraceEnabled block to avoid the variable allocation on every call.

This comment has been minimized.

@kroepke

kroepke Nov 30, 2017

Member

But then we need to delay updating the sizeCounter and I'm not sure it's really worth the effort for a stack variable.

@@ -429,6 +525,10 @@ public void setStreams(final List<Stream> streams) {
public void addStream(Stream stream) {
indexSets.add(stream.getIndexSet());
streams.add(stream);
sizeCounter.inc(8);

This comment has been minimized.

@bernd

bernd Nov 29, 2017

Member

We should only increment the size counter if the stream hasn't been in the set yet. (check stream.add(stream) return value)

@@ -448,6 +548,10 @@ public void addStreams(Iterable<Stream> newStreams) {
*/
public boolean removeStream(Stream stream) {
final boolean removed = streams.remove(stream);
sizeCounter.dec(8);

This comment has been minimized.

@bernd

bernd Nov 29, 2017

Member

We should only decrement the size counter if the stream has actually been removed from the set. (check stream.remove(stream) return value. (check already done in line 556)

}
}

private void updateSize(String fieldName, Object newValue, Object previousValue) {

This comment has been minimized.

@bernd

bernd Nov 29, 2017

Member

The setSource() method sets the new source value directly in the fields map and thus doesn't do the size counting via updateSize().

@bernd

bernd approved these changes Nov 30, 2017

LGTM for a first cut now 👍

@bernd bernd merged commit 8a6d5fc into master Nov 30, 2017

4 of 5 checks passed

graylog-project/pr Jenkins build graylog-project-pr-snapshot 754 has failed
Details
ci-web-linter Jenkins build graylog-pr-linter-check 2099 has succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
license/cla Contributor License Agreement is signed.
Details

@wafflebot wafflebot bot removed the in progress label Nov 30, 2017

@bernd bernd deleted the traffic-counter branch Nov 30, 2017

bernd added a commit that referenced this pull request Nov 30, 2017

Provide global traffic counters (#4357)
* adding a rudimentary hourly output traffic counter

display a histogram on the Overview page, currently only showing daily traffic

* track global MessageInput raw message length

the global counter makes it much easier to keep track of traffic, as we don't have to filter for various meters
turn the output counter into a global metric name as well
extend traffic data objects for input values, and display a second graph on the overview page

* remove eslint warning

* fix license format (sigh)

* add option to get pre-calculated daily traffic buckets

* include sliding 30 day traffic sum in text

* allow background traffic to bypass regular traffic counter

instead this type of traffic is counted towards "system traffic"

* resize traffic graphs as window resizes

still ugly at very small widths, but I'll live with that for now

* adjust log levels for production

* fix byte counting in message fields

add test for expected sizes

* remove obsolete constant

* don't count message field twice

don't count stream field name if no streams are present

added tests

* add comment explaining traffic prop shape

* get rid of some obvious warnings

* add some deprecation

* refactor message size estimation

instead of counting the message size during output serialization, perform
a rolling accounting of field sizes when mutating the message

this has the added benefit to be able to cheaply get the current size for
 various purposes (such as estimating the size difference for enrichment)

also track global metric for message sizes right after decoding them from
the journal but before running any processing

DateTime, Date, ZonedDateTime and streams now count for 8 bytes, instead
of their string representation

* add traffic visualization for message sizes after decoding them from the journal

* remove confusing comment

* fix sizing issues

(cherry picked from commit 8a6d5fc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment