New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: introduce stats client #992
Conversation
7202a8d
to
37370d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I like this approach! One thing I'm wondering about is if we can make elements
more intuitive/user friendly in the public interface. Would it make sense to have the platform expose a strongly typed interface/type for that rather than having the user provide an array of strings?
Documentation on the expected usage would also be great.
@rebello95 I like the idea of having Element be a strong type. There are actually invalid character strings that one could pass, and we can prevent that by having Element have a conditional initializer. This would allow error discovery to happen more or less statically. |
Signed-off-by: Jingwei Hao <jingweih@lyft.com>
Signed-off-by: Jingwei Hao <jingweih@lyft.com>
Signed-off-by: Jingwei Hao <jingweih@lyft.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Jingwei Hao <jingweih@lyft.com>
Signed-off-by: Jingwei Hao <jingweih@lyft.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Jingwei Hao <jingweih@lyft.com>
c0a17d6
to
3ded59b
Compare
Signed-off-by: Jingwei Hao <jingweih@lyft.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
) : StatsClient { | ||
|
||
override fun getCounter(vararg elements: Element): Counter { | ||
return Counter(WeakReference(engine), elements.asList()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iOS has a strong ref below, we should keep them consistent (I think a strong ref is what you want here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the weak reference is intended. We would not want to have stats spread through the application to necessarily hold the engine alive, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, this is passing a weak reference to the Counter, whereas on iOS the Counter itself makes the reference weak. Would it make sense to move WeakReference(engine)
into the counter file's constructor to make it clear that it doesn't retain the engine?
Separately, I'm wondering if it makes sense to allow counters to retain the engine. Do we want the engine to stick around as long as a consumer is holding on to a counter ref?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to move WeakReference(engine) into the counter file's constructor to make it clear that it doesn't retain the engine?
yea
Do we want the engine to stick around as long as a consumer is holding on to a counter ref?
probably not, counters could be referenced by feature modules of the platform(iOS/android) code, and we probably don't want the feature modules to (indirectly) hold a strong ref to engine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intuition right now is that it seems natural for engine clients to strongly retain the engine, but perhaps less so for the artifacts they produce. This has more to do with users of the library being able to implicitly manage the lifecycle of the engine than anything else. I should say that I'm not strongly convinced this is right, but it seems reasonable for now for experimentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a resolution here? Looks like the weak ref is still being passed from this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments, and a big question.
@@ -204,6 +204,9 @@ stats_flush_interval: {{ stats_flush_interval_seconds }}s | |||
- safe_regex: | |||
google_re2: {} | |||
regex: '^http.dispatcher.*' | |||
- safe_regex: | |||
google_re2: {} | |||
regex: '^client.*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer having a more restrictive rule so that people have to think before their application level stats get emitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what rules do you have in mind?
right now all the elements used for the stats are checked against the regex: ^[A-Za-z_]+\$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked with @junr03, long-term it might make sense to make people specify this in configuration, so that some thought goes into the stats they emit. For now, as a new, experimental interface, this is probably good enough to move forward.
) : StatsClient { | ||
|
||
override fun getCounter(vararg elements: Element): Counter { | ||
return Counter(WeakReference(engine), elements.asList()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the weak reference is intended. We would not want to have stats spread through the application to necessarily hold the engine alive, right?
@@ -93,6 +93,19 @@ Engine::~Engine() { | |||
main_thread_.join(); | |||
} | |||
|
|||
void Engine::recordCounter(std::string elements, uint64_t count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we cover this in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a ticket to cover testing generally at this layer, will link here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@goaway don't forget to add this ticket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created (but it's internal).
Signed-off-by: Jingwei Hao <jingweih@lyft.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My big question about performance was answered, and a disclaimer was added. @goaway and @jingwei99 will follow up with @jmarantz to discuss the dynamic vs static stats path.
1 missing issue. Otherwise, LGTM.
@@ -93,6 +93,19 @@ Engine::~Engine() { | |||
main_thread_.join(); | |||
} | |||
|
|||
void Engine::recordCounter(std::string elements, uint64_t count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@goaway don't forget to add this ticket
void Engine::recordCounter(std::string elements, uint64_t count) { | ||
if (server_) { | ||
server_->dispatcher().post([this, elements, count]() -> void { | ||
static const std::string client = "client"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No non-pod statics (static init fiasco). Moreover what you really want here since this string is known at compile-time is to save a StatName for "client" using either a StatNamePool or StatNameManagedStorage. You want to do that in a context object that is created once at process startup and then re-used. Then you have no thread contention issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, but some comments.
- We seem to be missing Kotlin stats client tests
- Can we also update documentation / tutorials which are now invalid due to the engine interface changes?
- Would also be good to add an issue for documenting the stats interface and linking that in the PR description
"StatsClient.kt", | ||
"StatsClientImpl.kt", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't these be in stats/
which is captured below with globbing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these ideally live here, basically in the same level as StreamClient
, as the interface Engine.kt
exposes a StatsClient
and a StreamClient
.
) : StatsClient { | ||
|
||
override fun getCounter(vararg elements: Element): Counter { | ||
return Counter(WeakReference(engine), elements.asList()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a resolution here? Looks like the weak ref is still being passed from this line
* Element values must conform to the regex /^[A-Za-z_]+$/. | ||
*/ | ||
class Element(val element: String) { | ||
init { | ||
if (!Pattern.compile("^[A-Za-z_]+\$").matcher(element).matches()) { | ||
throw IllegalArgumentException( | ||
"Element values must conform to the regex /^[A-Za-z_]+$/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regex is hardcoded 3 places here. Can we use a constant instead so we don't accidentally fail to change one of the usages in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair
"StatsClient.swift", | ||
"StatsClientImpl.swift", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here regarding living in stats/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, StatsClient
and its implementation ideally live at the same level as StreamClient
given the Engine.swift
interface.
library/swift/src/StatsClient.swift
Outdated
/// - parameter elements: Elements to identify a counter | ||
/// - returns: A Counter based on the joined elements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// - parameter elements: Elements to identify a counter | |
/// - returns: A Counter based on the joined elements. | |
/// - parameter elements: Elements to identify a counter | |
/// | |
/// - returns: A Counter based on the joined elements. |
MockEnvoyEngine.onRecordCounter = nil | ||
} | ||
|
||
func testCounterDelegatesToEngine() throws { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func testCounterDelegatesToEngine() throws { | |
func testCounterDelegatesToEngine() { |
XCTAssertEqual(actualCount, 1) | ||
} | ||
|
||
func testCounterDelegatesToEngineWithCount() throws { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func testCounterDelegatesToEngineWithCount() throws { | |
func testCounterDelegatesToEngineWithCount() { |
XCTAssertEqual(actualCount, 5) | ||
} | ||
|
||
func testCounterWeaklyHoldsEngine() throws { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func testCounterWeaklyHoldsEngine() throws { | |
func testCounterWeaklyHoldsEngine() { |
import Foundation | ||
import XCTest | ||
|
||
final class StatsClientImplTests: XCTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add these tests for Kotlin too?
library/swift/test/BUILD
Outdated
name = "engine_builder_tests", | ||
srcs = [ | ||
"StreamClientBuilderTests.swift", | ||
"EngineBuilderTests.swift", | ||
], | ||
deps = [ | ||
"//library/objective-c:envoy_engine_objc_lib", | ||
], | ||
) | ||
|
||
envoy_mobile_swift_test( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we alphabetize these test cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking out loud: prolly worth adding a lint/pre-commit check for this in the long run
@rebello95 : for this comment, yea i missed it in my previous commit, and have a fix locally, planned to push along with other suggested changes |
assertThat(elementsCaptor.getValue()).isEqualTo("test.stat") | ||
assertThat(countCaptor.getValue()).isEqualTo(5) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tried to emulate StatsClientImplTests.testCounterWeaklyHoldsEngine
, but I wasn't able to get the EnvoyEngine
instance garbage collected - as calling System.gc() doesn't guarantee gc.
@jingwei99 @goaway can we update the docs to reflect the interface changes from this PR? |
yup, i'll work on the docs. |
Thanks! |
Description: Introduces a StatsClient capable of giving out Counters that, when incremented, emit stats from Envoy's internal stats engine. Also refactors the top-level platform Engine interface to make both the StreamClient and the StatsClient available.
Risk Level: Low
Testing: Pending
Co-authored-by: Mike Schore mike.schore@gmail.com