Sampling connection-oriented weird-types that happen only once per connection #623

Open
ckreibich opened this issue Oct 4, 2019 · 2 comments

@ckreibich
Contributor

commented Oct 4, 2019

Sampling of connection-oriented weird-types that happen only once per connection (such as data_before_established) doesn't work well right now: you either get each of those weirds, or none of them. The logic to tweak here is in Reporter::PermitFlowWeird().

I believe that's a known problem, but figured I'd create a ticket in case it's not tracked. I also don't know whether 3.0 has changed anything in this regard, but if so, that'd be great to know.
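
To make the all-or-nothing behavior concrete, here is a minimal sketch, not the actual code in Reporter::PermitFlowWeird(). It assumes a threshold-then-every-Nth sampling rule with counters kept per connection; the names `PerConnectionSampler`, `threshold`, and `rate` are hypothetical and only for illustration.

```python
from collections import defaultdict


class PerConnectionSampler:
    """Hypothetical model: sampling counters are kept per (connection, weird name)."""

    def __init__(self, threshold, rate):
        self.threshold = threshold  # assumed: the first `threshold` occurrences always pass
        self.rate = rate            # assumed: afterwards every `rate`-th passes (0 = none)
        self.counts = defaultdict(int)

    def permit_weird(self, conn_id, name):
        key = (conn_id, name)
        self.counts[key] += 1
        count = self.counts[key]
        if count <= self.threshold:
            return True
        if self.rate == 0:
            return False
        return (count - self.threshold) % self.rate == 0


def logged_count(sampler, n_conns, name="data_before_established"):
    """Raise `name` exactly once on each of `n_conns` distinct connections."""
    return sum(sampler.permit_weird(conn_id, name) for conn_id in range(n_conns))


# Each connection's counter only ever reaches 1, so the outcome is the
# same for every connection: all are logged, or none are.
all_logged = logged_count(PerConnectionSampler(threshold=25, rate=1000), 10000)   # 10000
none_logged = logged_count(PerConnectionSampler(threshold=0, rate=1000), 10000)   # 0
```

Under these assumptions, no choice of `threshold` and `rate` thins out a weird that fires once per connection, because the counter is never shared across connections.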

@jsiwek

Member

commented Oct 4, 2019

The sampling state for connection-oriented weirds is stored per connection. Is what you expect/want for the sampling state to be tracked globally, per weird name?

If so, I think it's not hard to add that as an option, but the current default behavior was mostly about being conservative: "people are less likely to miss redundant data we automatically throw out (or offer 'compressed' stats for), and more likely to expect that we report all novel information by default".
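
As a rough sketch of what such an option could look like (an assumption for illustration, not an existing setting or the eventual implementation), the counter can be keyed by weird name alone so that it accumulates across connections. `GlobalNameSampler`, `threshold`, and `rate` are hypothetical names.

```python
from collections import defaultdict


class GlobalNameSampler:
    """Hypothetical model: one sampling counter per weird name, shared by all connections."""

    def __init__(self, threshold, rate):
        self.threshold = threshold  # assumed: the first `threshold` occurrences always pass
        self.rate = rate            # assumed: afterwards every `rate`-th passes (0 = none)
        self.counts = defaultdict(int)

    def permit_weird(self, conn_id, name):
        # conn_id is accepted for interface parity but deliberately ignored:
        # the state is global per weird name.
        self.counts[name] += 1
        count = self.counts[name]
        if count <= self.threshold:
            return True
        if self.rate == 0:
            return False
        return (count - self.threshold) % self.rate == 0


sampler = GlobalNameSampler(threshold=25, rate=1000)
# The weird fires once on each of 10000 distinct connections.
logged = sum(
    sampler.permit_weird(conn_id, "data_before_established")
    for conn_id in range(10000)
)
# logged == 34: the first 25, plus occurrences 1025, 2025, ..., 9025
```

The trade-off is the one noted above: with global state, a weird on a new connection can be dropped even though it is novel for that connection, which is why this would be an opt-in rather than the default.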

@jsiwek jsiwek added this to Unassigned / Todo in Release 3.1.0 via automation Oct 4, 2019
@jsiwek jsiwek added this to the 3.1.0 milestone Oct 4, 2019
@ckreibich

Contributor Author

commented Oct 4, 2019

Exactly, some level of global control would be ideal. I believe I'm missing some history here from the updates to earlier versions of the sampling code (2.5, iirc), so others may have additional context.

The problem with the current approach is that in environments where data_before_established comes up a lot, it effectively bypasses sampling and blows up the weird log.
