New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Audit default resource gathering #492
Comments
This also includes updating the default list, I'm currently making a patch for v0.11.4 that isn't very clean that requires cleaning. |
Hey @timothysc could you add some information about this? Not sure the work items. |
So I actually have a poc working locally (very WIP) that updates the namespace queries to use the dynamic client. There seem to be some things that aren't walked so easily in the dynamic client, like pod logs, server groups, etc. Those things still seem to need to be a one-off for each of them which, frustratingly, removes some of the value of moving to the dynamic client. However, a question for @timothysc and the larger community if anyone wants to chime in: in my POC it gathers everything it can via the dynamic client and that ends up also including secrets. Do we consider that a problem? It is no secret (pun intended) that k8s secrets aren't by default secure, but I dont know if we'll have people yelling about adding them now. IMO it should just be a wakeup call that they shouldn't be using them perhaps. In my POC it still allows users to choose explicitly the items to collect so they can avoid those if they want, but it is an extra step. |
POC is actually working now for all the items; leveraging the existing code for:
Other than that all the NS and cluster resources get listed using the dynamic client. The only drawback to this is that it is, in the POC form, much more time consuming. Queries take about 45s right now because we are running ~250 queries and each (with the dynamic client) takes around 200ms. This is about 3-4x slower than using the typed clients. The dynamic client could make it much easier though to query things more in bulk and reduce the number of queries significantly, i.e. I could query all the jobs and then split them up when saving them. That could reduce the time by 7x unless the list time goes up linearly with the number of objects. |
Considering how much stuff we could be leaving on the table by not querying those items, I think it is worth it to add them and use the dynamic client. As long as we give the users a way to cut them out/down I think its fine. |
Random thought on this but even though it is fun to remove code (we could remove lots of the typed query stuff), it may be worth having a more complicated code if the performance is 3-4x faster. We could just combine the two approaches to use the typed client on the existing/more stable API objects, and just use the dynamic for some of the less common/newer items. At some point, I'll have to check what the overall performance is in that situation. |
Was so much slower (~200ms per query, even when there wasn't any responses) but that may be a result of some rate limiting on the client side. Need to look into the field If we could use the dynamic client for everything then it'd get all the benefit and not the slowdown. If we need to use both then we can get the benefits but pay in the complexity of using both types of queries. |
This change is fairly substantial so I wanted to specify a few details: - All queries are now executed with the discovery/dynamic clients walking the API server - QPS and Burst can be set on the config.json so that you can speed up/slow down your queries based on speed and throttling. - The filenames containing the the resource query results are updated to list group_version_resource.json as opposed to resource.json. This makes it explicitly clear where the resources came from and avoids a problem where different groups with the same resource name could collide. - To avoid collision in general though, the queries only grab the first resource with a given name that it encounters. This is the behavior kubectl uses as well. - Resource filtering via the config.json field Resource is still done, but explicitly compared against the resource.Name field given by the server. Most visibly, the casing of the values is different than it used to be. - If an nil Resource value is set in the config.json, all values will be queried with the exception of secrets. - If an empty array of Resources is set in the config.json, no values will be queried, same as today. - The default resource list exists but has a few extra values added to it. In the future we may remove it entirely and just default to query all. This was kept so make the casing changes more clear to users of sonobuoy gen. Fixes #492 Fixes #630 Signed-off-by: John Schnake <jschnake@vmware.com>
Describe the solution you'd like
There are additional api-resources that could be gathered
Environment:
The text was updated successfully, but these errors were encountered: