Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Loki Chunks Dashboard #1126

Merged
merged 13 commits into from
Oct 8, 2019
Merged

Conversation

joe-elliott
Copy link
Member

@joe-elliott joe-elliott commented Oct 7, 2019

What this PR does / why we need it:
This PR cleans up the Loki Chunks dashboard either by renaming incorrectly named series or adding needed metrics.

  • Added new metrics
    • loki_ingester_memory_chunks
    • loki_ingester_memory_streams (added instead of loki_ingester_memory_series)
    • loki_ingester_chunk_utilization
    • loki_ingester_chunk_compression_ratio
  • The following metrics were already being published with a different name. Fixed in the dashboard:
    • cortex_chunk_store_index_entries_per_chunk
    • cortex_ingester_flush_queue_length
    • loki_ingester_chunk_entries

Which issue(s) this PR fixes:
Fixes #1045
Fixes #1105

Special notes for your reviewer:
I'm concerned about the method used to calculate loki_ingester_chunk_utilization and would like a close review. In gzip.go I am calculating the total percentage of blocks used in the chunk and returning that as utilization. I think a more interesting metric would be %age of block size used, but this would require uncompressing the log entries or storing on the block the uncompressed size when it was cut.

Checklist

  • Documentation added
  • Tests updated

@joe-elliott joe-elliott changed the title Metrics cleanup Fix Loki Chunks Dashboard Oct 7, 2019
Copy link
Member

@rfratto rfratto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update the docs with the new metrics you added? docs/operations/observability.md

Overall, this is looking good. I think the way you find the utilization of chunks is fine, and we definitely don't want to decompress any cut chunks. I'd like a second pair of eyes to double-check how you compute it though.

/cc @cyriltovena @slim-bean

pkg/ingester/instance.go Outdated Show resolved Hide resolved
Copy link
Member

@rfratto rfratto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Let me know if you want a second pair of eyes on this before merging.

pkg/chunkenc/gzip.go Outdated Show resolved Hide resolved
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
…mpressed bytes

Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
@joe-elliott
Copy link
Member Author

@rfratto @slim-bean

This should be ready to go. Utilization changes complete. Compression ratio metric added.

Copy link
Member

@rfratto rfratto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM! @slim-bean I'll leave the final review and merge up to you.

Signed-off-by: Joe Elliott <number101010@gmail.com>
@joe-elliott
Copy link
Member Author

Since fixing #1105 was trivial I went ahead and did that as well. See the comment on the issue, but I basically just removed the only unused metric.

Co-Authored-By: Ed Welch <ed@oqqer.com>
@slim-bean
Copy link
Collaborator

Last question:

Do we want to also observe the uncompressed size per tenant?

I'm not sure if we would really care about this much? It might be interesting to know on a per-tennant basis what the encryption looks like but i'm not sure this is interesting enough to warrant a growing per-tennant metric.

@joe-elliott
Copy link
Member Author

joe-elliott commented Oct 8, 2019

Uncompressed size per tenant should basically be captured in loki_distributor_bytes_received_total. Do we feel like there is something to be gained from metricing it when flushing to blocks? I suppose differences in the values could reveal issues?

Compression ratio per tenant seems like a bad idea. It's a histogram and the cardinality would be very high.

@slim-bean
Copy link
Collaborator

Yeah I agree we wouldn't want to do a per tenant histogram, and hadn't considered using the received_bytes_total metric.

Ok, lets just leave things the way they are now and get this merged!

@rfratto rfratto merged commit b835e20 into grafana:master Oct 8, 2019
@CyberoxiDevops
Copy link

Hi, any updates about loki_ingester_chunk_utilization?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Loki defines metrics that it never updates "Loki / Chunks" dashboard uses undefined metrics
4 participants