-
Notifications
You must be signed in to change notification settings - Fork 999
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Add Labels.Unique() API, remove label encoder from ungrouped batcher #641
Conversation
@@ -329,42 +313,10 @@ type Record struct { | |||
// presented in sorted order by the SDK. | |||
type Labels interface { | |||
Iter() LabelIterator | |||
Unique() interface{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be changed to return a string
? Or, another concrete type?
The interface{}
type when used in this way circumvents type checking that helps prevent runtime errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't be changed because the underlying type used by the real SDK is a variable-sized array, the same one used as a map key for variable-size label sets. The "Simple" implementation uses a string, and the constraint is that you can't use both at the same time--that's why I moved "Simple" into the test package.
I've added comments on the interface methods here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has a bad smell to it and is making me think there is a mistake in the design.
the two implementations do not return a consistent value, uniqueness is only guaranteed for a single implementation.
This seems like the antithesis of what a Go interface is used for. Instead of the Labels
interface acting as a contract between the implementer and the caller of the declared methods it becomes a direct circumvention of the typing system. Instead of the compiler asserting the passed type is compatible, it is now the the callers responsibility to assert the passed interface is from a single implementation and thus compatible.
may be used as a map key
Additionally, it is only in documentation that the type returned needs to be a comparable value. This again is moving additional responsibility of the compiler to the user.
I get that this can work, but I feel it is building something users are going to have safety issues and frustration with down the road.
I'm wondering if we have over-optimized the Labels implementation in the default SDK and it has now put us into a corner because of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I can move past this concern if we think this is need to be merged to unblock stuff. But I think it is an issue and we should give it some serious thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a prior PR about this: #523
It was an optimization in the *sdk/metric.SDK
, and there is only one SDK in use at a time. The SDK goes to great lengths to encode a unique interface{}
value for use in the sync.Map
of current records, and I think it's reasonable for the integrator to use this. If an exporter is receiving data from multiple SDKs, which I think of as an irregular configuration, then it would have to be defensive about this kind of thing.
When testing, it is appropriate to use the simple implementation of the export metric.Labels
, and still correct. It would only be incorrect to mix metric.Labels
value from a test and a real SDK in the same map, then you'd have two entries, unexpectedly with the same identity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny story, opening up that prior PR, GitHub showed me the pending review I had started but didn't finish because the PR was merged before I could. It contained similar concerns around the premature optimization and complexity added by using the reflect package to dynamically declare variable-length arrays.
I think this approach can be tractable if you maintain all uses of the code, like inside the SDK. However, when we are exporting this implementation in this way we lose all guarantees on use and conformity. We cannot ensure users will understand and appropriately follow the restrictions and guidelines outlined here. In fact, I think it should be expected that users are going to use these things in ways that we never anticipated much less the ways we anticipate to be wrong.
You consistently seem to have a better grasp on the project structure direction and specifically the end goals of Labels than I do. I'm wondering if I'm missing a big-picture thing here that is driving us in this direction. From my perspective, it seems like we could build a better user experience by going about this in a different way. Maybe we can talk more about this in the next SIG meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question: are the interfaces and types in sdk/export/metric
supposed to be reused/implemented by other SDKs? If not, then having an exporter gathering data from multiple SDKs would be possible. Unless we are talking about multiple instances of the same SDK type.
If we would want to differentiate the unique representations across SDKs, then we could have something like this:
// in sdk/export/metric/metric.go
var sdkIDCounter int64 = 0
// NewIDForSDK should be called by each SDK at the creation time, blah blah
func NewIDForSDK() int64 {
return atomic.AddInt64(&labelEncoderIDCounter, 1)
}
// UniqueRepr is a unique representation of the labels,
// can be used as a map key, and so on and so forth
type UniqueRepr struct {
id int64
raw interface{}
}
// NewUniqueRepr yadda yadda, id should come from NewIDForSDK()
func NewUniqueRepr(id int64, data interface{}) UniqueRepr {
return UniqueRepr{
id: id,
data: data,
}
}
type Labels interface {
…
Unique() UniqueRepr
}
// in sdk/metric/sdk.go
func (ls *labels) Unique() export.UniqueRepr {
return export.NewUniqueRepr(ls.meter.id, ls.ordered)
}
func New(batcher export.Batcher, opts ...Option) *SDK {
…
m := &SDK{
…,
id: export.NewIDForSDK(),
}
…
}
I think that even if we abandon the idea of SDK ID, hiding the interface{}
inside a struct as an unexported member still would be better than returning interface{}
. As such, it wouldn't give the exporters any idea about guessing the real type behind interface{}
. Maybe even we could rename UniqueRepr
to MapKey
or something to give a clear message about the purpose of the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your suggestion to wrap the interface{} is a great help.
As noted below, I think we can drop the idea of "SDK ID" and instead move the "label set" type into the API. Then the batchers and exporters can be guaranteed that the unique value is really unqiue.
I'll hold off on this PR for now.
I thought over the objection to this PR, looked at the benchmark changes from #523, and I'm still in favor of the basioc approach. The argument to me is that the benchmarks are significantly better, and the operation being optimized is in the path of every directly-accessed instrument. Assuming that the SDK's I like @krnowak's suggestion that wrapping the I propose to shelve this PR for now, while refactoring label sets as described above in a replacement for #640. After the refactoring, this PR can be re-applied (using @krnowak's suggestion) without a uniqueness problem. There are two separate matters that this brings up, which I'd like to address: |
See #651 |
This is an optimization for export pipelines that do not encode labels as a string. My real goal was simply to remove the label encoder dependency from the integrator (still called batcher), to reduce cycles and dependencies between components.