Switch Collector from struct to interface #680
Comments
See progress made by @diamonwiggins in #670 |
Just a note: I modified the final bullet of the proposal to clarify that we only need to implement a single collector's I also edited the Definition of Done to clarify the desire for one collector to implement Merge, with the rest just implementing the standard no-merge interface. |
#670 has been updated with my latest changes. I'd still consider it to be a draft at this point. Ended up being more to work through than I expected. A few things to get the discussion going:
spec A
spec B
Here we want the final spec to be
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: example
spec:
  collectors:
    - clusterResources: {}
    - clusterResources:
        namespace: kube-system
    - secret:
        namespace: default
        name: secret1
    - secret:
        namespace: default
        name: secret2
|
Still working through the PR to fully understand the questions. I'll start with the one at the top; you have a lot of questions in a single comment, so I'm not going to try to address them all at once.
Yes, I think the intention of Merge has been to include any operations beyond a direct append of the collector, so it would cover error handling, deduplication, or any other shade of gray ranging from a simple append to full-on item-by-item merging.
First, I don't think I agree with the statement that all use cases are just dedup. Even if all of today's cases are just dedup, it doesn't follow that all possible future cases are just dedup. The distinction I'm making here is that we want to allow more than dedup even if current collectors and file paths don't make that reasonable today. So the question isn't what works right now, but what we want to have available in the interface. Are you intending to say there is no use case for merging at all as a concept on Collectors?
Next, you propose that dedup shouldn't be repeated in each collector, which is DRY, and I appreciate that. My question, similar to above, is: do you believe dedup can be written completely agnostic to the collector and its configuration? Not for one specific collector, but for all possible collectors. I believe you need context about the collector to provide dedup; is that not the case? I'll revisit with the example below.
First, to say that every collector needs some dedup: I don't know that to be true. We could, for example, merge by doing all appends, and most collectors would work fine. Dedup then becomes an optimization, and optimization is not required for every collector, even if it would be best if they all implemented the most optimized possible collection.
The example you gave does clearly present a use case for dedup. It relies on knowing the collector and what it does. I'm not so sure you can globally 'dedup' all collectors the same way. Each collector defines what its inputs mean and what it collects. In the example, you know how ClusterResources operates and you are using that to decide what the output of merge should be. I was under the impression only name and exclude were defined on the common scheme, although it looks like namespace is actually there too. My point here is that a global dedup would only be able to operate on the shared scheme because it has no guarantees about anything else. This means a shared dedup, if even possible, would be highly restricted.
To close the example out: what you've shown is the generic rule "If namespace isn't defined, that overrides a defined namespace and the two specs can be deduplicated". That isn't true of the sysctl collector, which still needs a namespace and uses a default if none is provided. Two of these collectors cannot be merged with the same logic you used above, so the decision would have to be made at the collector level instead, where it knows what the defaults are. While that isn't DRY, only the collector, in the context of what it does with its inputs, can implement deduplication (which is just a subset of Merge). |
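To make the point about defaults concrete, here is a minimal Go sketch. It assumes a clusterResources-like spec where an empty namespace means "all namespaces" and a sysctl-like spec where an empty namespace falls back to a default. The type names, the function names, and the default value "default" are all hypothetical and chosen only for illustration; they are not taken from the Troubleshoot code.

```go
package main

import "fmt"

// clusterResourcesSpec is hypothetical. An empty Namespace is assumed to mean
// "collect from all namespaces".
type clusterResourcesSpec struct{ Namespace string }

// sysctlSpec is hypothetical. An empty Namespace is assumed to fall back to
// defaultSysctlNamespace rather than meaning "all namespaces".
type sysctlSpec struct{ Namespace string }

// defaultSysctlNamespace is an assumed value, for illustration only.
const defaultSysctlNamespace = "default"

// mergeClusterResources applies the rule from the example above: a spec with
// no namespace subsumes every namespaced spec of the same collector.
func mergeClusterResources(specs []clusterResourcesSpec) []clusterResourcesSpec {
	seen := map[string]bool{}
	var out []clusterResourcesSpec
	for _, s := range specs {
		if s.Namespace == "" {
			return []clusterResourcesSpec{{}}
		}
		if !seen[s.Namespace] {
			seen[s.Namespace] = true
			out = append(out, s)
		}
	}
	return out
}

// mergeSysctl cannot reuse that rule: it first resolves the default, then
// drops only specs that resolve to the same namespace.
func mergeSysctl(specs []sysctlSpec) []sysctlSpec {
	seen := map[string]bool{}
	var out []sysctlSpec
	for _, s := range specs {
		if s.Namespace == "" {
			s.Namespace = defaultSysctlNamespace
		}
		if !seen[s.Namespace] {
			seen[s.Namespace] = true
			out = append(out, s)
		}
	}
	return out
}

func main() {
	cr := mergeClusterResources([]clusterResourcesSpec{{}, {Namespace: "kube-system"}})
	sc := mergeSysctl([]sysctlSpec{{}, {Namespace: "kube-system"}})
	fmt.Println(len(cr), len(sc))
}
```

The same pair of inputs collapses to one spec for the first collector and stays as two for the second, which is why the rule has to live with the collector that knows its own defaults.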
At this time this is the way that I see it, but certainly could be convinced otherwise. You mention:
I think the operations of direct append, error handling, and deduplication are something all the collectors would make use of, so the separate interface in my opinion would only make sense if we have actual use cases for
My assumption was that every collector on some level would "dedup" in the sense that if the same exact collector configuration is provided twice, you would discard the duplicates. Most collectors would also need handling for multiple collectors using the same output path in the bundle, since the collectorName could technically be the same.
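The "same exact configuration provided twice" case is the one part of dedup that can be written without knowing anything about the collector. A minimal Go sketch, assuming specs are plain structs that can be compared with reflect.DeepEqual; the secretSpec type and the dedupExact function are hypothetical names, not part of Troubleshoot:

```go
package main

import (
	"fmt"
	"reflect"
)

// secretSpec is a hypothetical stand-in for one collector's configuration.
type secretSpec struct {
	CollectorName string
	Namespace     string
	Name          string
}

// dedupExact drops any spec that is deeply equal to one already kept.
// It is collector-agnostic: it never interprets fields, so it cannot know
// about defaults or about one spec subsuming another.
func dedupExact(specs []interface{}) []interface{} {
	var out []interface{}
	for _, s := range specs {
		duplicate := false
		for _, kept := range out {
			if reflect.DeepEqual(s, kept) {
				duplicate = true
				break
			}
		}
		if !duplicate {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	specs := []interface{}{
		secretSpec{Namespace: "default", Name: "secret1"},
		secretSpec{Namespace: "default", Name: "secret1"}, // exact duplicate
		secretSpec{Namespace: "default", Name: "secret2"},
	}
	for _, s := range dedupExact(specs) {
		fmt.Printf("%+v\n", s)
	}
}
```

This only covers byte-for-byte identical configuration. Output path collisions between different specs that share a collectorName would still need separate handling.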
This is true, but I guess the only point I'm making is similar to the spec A
spec B
result
If we had a use case like the above, then in my mind that would be a slam dunk example of something that fits into a separate A couple points to wrap here:
|
I think that's a great use case we should strive to allow. If current log collector doesn't allow it, setting up the interface so that we can support this use case is a good start! I'll review the concepts in number 4 next and see if I understand it enough to have an opinion on it. |
I've pushed up my latest changes. An update on where all this stands now:
I think this will simplify the merge logic if it's able to look at all of its spec items and make a decision that returns
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: example
spec:
  collectors:
    - clusterResources:
        namespaces:
          - default
          - myapp-namespace
    - clusterInfo: {}
    ....
    - clusterResources: {}
With the way the code is written now, the first
I'm not sure I understand this item from the Definition of Done. Could use some clarification.
We have limited use cases for merge at the moment. What do we expect to be tested here? Just that a function returns output, or some sort of collector specific merge logic that gets written? |
Outline
With #665 we have Troubleshoot appending multiple specs provided on the CLI (or call) into one long spec. Some collectors which may be specified more than once might have specific de-duplication rules which need to be taken into account, to avoid collecting the same thing multiple times.
This task is to enable a means to run a collector specific merge which could de-duplicate, or aggregate, based on the collector type.
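One way the collector-specific merge could be wired up is sketched below in Go. This is an illustration under assumptions, not the design from #670: the Collector and Merger interfaces, the mergeAll function, and both example collector types are hypothetical names. Collectors that do not implement Merger are appended unchanged, which matches the "standard no-merge" behaviour discussed in the comments.

```go
package main

import "fmt"

// Collector is a hypothetical interface; the real method set may differ.
type Collector interface {
	Title() string
	Collect() (map[string][]byte, error)
}

// Merger is a hypothetical optional interface. It receives every collector of
// the same concrete type and returns the set that should actually run.
type Merger interface {
	Merge(group []Collector) ([]Collector, error)
}

type secretCollector struct{ Namespace, Name string }

func (s secretCollector) Title() string { return "secret/" + s.Namespace + "/" + s.Name }
func (s secretCollector) Collect() (map[string][]byte, error) {
	return map[string][]byte{}, nil // placeholder, collects nothing
}

// clusterInfoCollector takes no configuration, so any duplicates are identical.
type clusterInfoCollector struct{}

func (clusterInfoCollector) Title() string { return "cluster-info" }
func (clusterInfoCollector) Collect() (map[string][]byte, error) {
	return map[string][]byte{}, nil // placeholder, collects nothing
}
func (c clusterInfoCollector) Merge(group []Collector) ([]Collector, error) {
	return []Collector{c}, nil // keep a single instance
}

// mergeAll groups collectors by concrete type, in order of first appearance,
// and calls Merge once per group when the type implements Merger.
func mergeAll(in []Collector) ([]Collector, error) {
	groups := map[string][]Collector{}
	var order []string
	for _, c := range in {
		key := fmt.Sprintf("%T", c)
		if _, seen := groups[key]; !seen {
			order = append(order, key)
		}
		groups[key] = append(groups[key], c)
	}
	var out []Collector
	for _, key := range order {
		group := groups[key]
		m, ok := group[0].(Merger)
		if !ok {
			out = append(out, group...) // no merge logic: plain append
			continue
		}
		merged, err := m.Merge(group)
		if err != nil {
			return nil, err
		}
		out = append(out, merged...)
	}
	return out, nil
}

func main() {
	merged, err := mergeAll([]Collector{
		clusterInfoCollector{},
		secretCollector{Namespace: "default", Name: "secret1"},
		clusterInfoCollector{},
		secretCollector{Namespace: "default", Name: "secret2"},
	})
	if err != nil {
		fmt.Println("merge failed:", err)
		return
	}
	for _, c := range merged {
		fmt.Println(c.Title())
	}
}
```

Note that grouping by type changes the relative order of collectors of different types. Whether that is acceptable depends on whether collection order matters, which this sketch does not address.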
In scope:
Out of scope:
Proposal
Collector type to an interface with methods that include Collect()
collector.Collect() to actually get the data
Assumptions
SupportBundle, no other changes are needed to receive multiple inputs.
Definition of done
Followup tasks