Investigate caching layer for kubeapps plug-ins #3032

Open · gfichtenholt opened this issue Jun 23, 2021 · 28 comments · Fixed by #3044 or #3151
@gfichtenholt (Contributor)

Initially the caching layer will apply specifically to the flux plugin, but the end goal is to generalise it to provide caching to all plugins as needed.

Here is the plan:

  1. install redis as part of the kubeapps helm chart
  2. write code for a new deployment/operator that watches for changes to the flux HelmRepository CRs and updates the contents of the redis cache, e.g. loads the index.yaml (see the sketch after this list)
  3. modify the flux plug-in code to use said cache as needed
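A minimal sketch of what step 2 could look like, assuming a dynamic Kubernetes client and go-redis; the GVR, the key naming, and the hypothetical indexFromRepo helper (which would download and parse the repo's index.yaml) are illustrative only, not the actual kubeapps code:

```go
package repocache

import (
	"context"

	"github.com/go-redis/redis/v8"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/dynamic"
)

// helmRepositoriesGVR identifies the flux HelmRepository CRs to watch.
var helmRepositoriesGVR = schema.GroupVersionResource{
	Group:    "source.toolkit.fluxcd.io",
	Version:  "v1beta1",
	Resource: "helmrepositories",
}

// watchAndCache watches flux HelmRepository CRs and refreshes the Redis entry
// for a repo whenever the CR is added or modified, removing it on delete.
func watchAndCache(ctx context.Context, dyn dynamic.Interface, rdb *redis.Client,
	indexFromRepo func(*unstructured.Unstructured) ([]byte, error)) error {

	w, err := dyn.Resource(helmRepositoriesGVR).Watch(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	defer w.Stop()

	for ev := range w.ResultChan() {
		repo, ok := ev.Object.(*unstructured.Unstructured)
		if !ok {
			continue
		}
		key := "helmrepositories:" + repo.GetNamespace() + ":" + repo.GetName()
		switch ev.Type {
		case watch.Added, watch.Modified:
			value, err := indexFromRepo(repo) // download + parse index.yaml
			if err != nil {
				continue // a real implementation would log and retry
			}
			rdb.Set(ctx, key, value, 0) // 0 = never expires
		case watch.Deleted:
			rdb.Del(ctx, key)
		}
	}
	return nil
}
```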
@gfichtenholt gfichtenholt self-assigned this Jun 23, 2021
@gfichtenholt gfichtenholt added this to Inbox in Kubeapps via automation Jun 23, 2021
@absoludity absoludity moved this from Inbox to Committed in Kubeapps Jun 23, 2021
@absoludity absoludity added component/apis-server (Issue related to kubeapps api-server), kind/feature (An issue that reports a feature (approved) to be implemented), priority/high labels Jun 23, 2021
@absoludity (Contributor)

> Initially the caching layer will apply specifically to the flux plugin, but the end goal is to generalise it to provide caching to all plugins as needed.
>
> Here is the plan
>
> 1. install redis as part of the kubeapps helm chart
>
> 2. write code for a new deployment/operator that watches for changes to the flux HelmRepository CRs and updates the contents of the redis cache, e.g. loads the index.yaml
Just a thought: I don't know that we need to go that far, and even if we did, it doesn't solve the fact that such a CRD won't necessarily change when the corresponding index.yaml is updated. With flux it will, as I understand it: flux itself keeps the cached index.yaml up to date at some set frequency, and I guess you're planning to watch that resource for changes and update the extracted content in the redis cache when that happens.

But I wonder if a more general approach, which will work for any requirements, would be for a plugin to be able to provide a function that gets called on some condition, whether that's a time or a watch.

Note also that I don't think a separate controller/deployment is required... any goroutine could run on startup as part of the existing kubeappsapis service to watch resources and/or do things at certain times... might be worth starting as a goroutine there, but see what you think.

I guess one other thing that may become essential here (if it's a general service provided to all plugins) is that the process that watches is going to need explicit RBAC to whatever it's watching (unlike our API endpoints which can run synchronously as the user). If it's a goroutine running in the kubeappsapis service, then I'm OK with the burden of responsibility being on the person configuring (or the chart) to bind the read-only RBAC for the configured plugins to the kubeappsapis service account.

Looks like an interesting problem!

> 3. modify the flux plug-in code to use said cache as needed

@ppbaena ppbaena moved this from Committed to In progress in Kubeapps Jun 30, 2021
Kubeapps automation moved this from In progress to Done Jul 8, 2021
gfichtenholt added a commit that referenced this issue Jul 8, 2021
* bump fluxv2 version for local dev env
* step 2
* step 3
* step 4
* step 5
* step 6
* step 7
* step 8
* step 9
* step 10
* step 11
* step 12
* step 13
* step 14
* undo un-intended changes from a messed up merge
* step 15
* step 15
* test cleanup
* Michael's feedback
* Michael's comments #2
* step 16
* step 17
* step 18
* revert unintended change
* step 19
* Michael's feedback #3
* Michael's feedback #4
* small change to force CI test run
* small change to add a name to AvailablePackageDetail response
* Michael's feedback #5
* add a couple of debug statements to help diagnose CI failures
* Add --set redis.auth.password=password in CI
* Fix chart deps version

Co-authored-by: Antonio Gamez Diaz <agamez@vmware.com>
@gfichtenholt (Contributor, Author)

The first step was pushed. My next item is to improve the efficiency and robustness of the watch-based cache. At the moment, the code relies entirely on k8s events firing and notifying us of any changes to keep the cache up to date. Watches have known limitations (e.g. limited history) and in my experience are also sometimes flaky, i.e. dying and/or missing events. I plan to follow https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes and use the package https://pkg.go.dev/k8s.io/client-go/tools/cache to improve the efficiency and robustness of the cache.
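A rough sketch of the client-go informer approach being considered here, assuming a dynamic client: a SharedInformer does an initial LIST, then WATCHes, and re-delivers the full set every resync period, so the cache can recover from dropped or missed watch events. The resync period and handler wiring are illustrative assumptions, not the actual kubeapps implementation:

```go
package repocache

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"
)

// startRepoInformer wires a dynamic shared informer for flux HelmRepository CRs
// and invokes onChange (which would re-populate the redis entry) for every
// add/update/delete, as well as on each periodic resync.
func startRepoInformer(dyn dynamic.Interface, onChange func(obj interface{}), stopCh <-chan struct{}) {
	gvr := schema.GroupVersionResource{
		Group:    "source.toolkit.fluxcd.io",
		Version:  "v1beta1",
		Resource: "helmrepositories",
	}

	factory := dynamicinformer.NewFilteredDynamicSharedInformerFactory(
		dyn, 10*time.Minute /* resync: periodic full re-delivery */, metav1.NamespaceAll, nil)

	informer := factory.ForResource(gvr).Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    onChange,
		UpdateFunc: func(_, newObj interface{}) { onChange(newObj) },
		DeleteFunc: onChange,
	})

	factory.Start(stopCh)
	// Block until the initial LIST is reflected before serving from the cache.
	cache.WaitForCacheSync(stopCh, informer.HasSynced)
}
```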

@gfichtenholt gfichtenholt reopened this Jul 8, 2021
Kubeapps automation moved this from Done to In progress Jul 8, 2021
@project-bot project-bot bot moved this from In progress to Inbox in Kubeapps Jul 8, 2021
@gfichtenholt (Contributor, Author)

Possibly related to the above note: at the moment I always set the expiration (time-to-live) to 0 for objects in the cache, which means objects never expire. I am wondering if I should give the caller (a plug-in using the cache) the option to set the TTL to something other than 0 and, if so, give them some means to either periodically refresh the cache contents or do it on a cache miss.
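For illustration, a minimal sketch (assuming go-redis) of what a caller-supplied TTL plus a read-through on cache miss might look like; the key and the hypothetical fetch callback are placeholders:

```go
package repocache

import (
	"context"
	"time"

	"github.com/go-redis/redis/v8"
)

// getWithTTL returns the cached value for key, recomputing and re-caching it
// with the caller's TTL on a miss. A ttl of 0 keeps today's behaviour: the
// entry never expires.
func getWithTTL(ctx context.Context, rdb *redis.Client, key string, ttl time.Duration,
	fetch func() ([]byte, error)) ([]byte, error) {

	val, err := rdb.Get(ctx, key).Bytes()
	if err == nil {
		return val, nil // cache hit
	}
	if err != redis.Nil {
		return nil, err // a real error, not just a miss
	}
	// Cache miss (or expired entry): recompute and store with the caller's TTL.
	val, err = fetch()
	if err != nil {
		return nil, err
	}
	if err := rdb.Set(ctx, key, val, ttl).Err(); err != nil {
		return nil, err
	}
	return val, nil
}
```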

@absoludity (Contributor)

> Possibly related to the above note: at the moment I always set the expiration (time-to-live) to 0 for objects in the cache, which means objects never expire. I am wondering if I should give the caller (a plug-in using the cache) the option to set the TTL to something other than 0 and, if so, give them some means to either periodically refresh the cache contents or do it on a cache miss.

That may help with another aspect I mentioned above: some plugins will need items to be updated regardless of whether a related k8s resource changes. For example, AppRepository CRs in our existing helm support just point to a URL and each currently has a cron-job to ensure it's checked periodically.

Also, as per our discussion on your PR, I'd be keen that we verify that using Redis as the backend for cache data will enable plugins to filter/query cached data (e.g. filtering available packages by various fields, paginating results without duplicates, etc.), since if the query possibilities are not strong enough, we may need to use postgresql (with json fields) as we do for the current helm support. Interested to see what you find.

@absoludity absoludity moved this from Inbox to In progress in Kubeapps Jul 8, 2021
@gfichtenholt (Contributor, Author)

> Possibly related to the above note: at the moment I always set the expiration (time-to-live) to 0 for objects in the cache, which means objects never expire. I am wondering if I should give the caller (a plug-in using the cache) the option to set the TTL to something other than 0 and, if so, give them some means to either periodically refresh the cache contents or do it on a cache miss.
>
> That may help with another aspect I mentioned above: some plugins will need items to be updated regardless of whether a related k8s resource changes. For example, AppRepository CRs in our existing helm support just point to a URL and each currently has a cron-job to ensure it's checked periodically.
>
> Also, as per our discussion on your PR, I'd be keen that we verify that using Redis as the backend for cache data will enable plugins to filter/query cached data (e.g. filtering available packages by various fields, paginating results without duplicates, etc.), since if the query possibilities are not strong enough, we may need to use postgresql (with json fields) as we do for the current helm support. Interested to see what you find.

That sounds good. I can look at that real soon. It would help to see some unit tests in the helm plug-in that exercise the filter/query options.

@gfichtenholt (Contributor, Author)

> The first step was pushed. My next item is to improve the efficiency and robustness of the watch-based cache. At the moment, the code relies entirely on k8s events firing and notifying us of any changes to keep the cache up to date. Watches have known limitations (e.g. limited history) and in my experience are also sometimes flaky, i.e. dying and/or missing events. I plan to follow https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes and use the package https://pkg.go.dev/k8s.io/client-go/tools/cache to improve the efficiency and robustness of the cache.

On second thought, using k8s.io/client-go/tools/cache may not be relevant for us. Upon closer inspection, k8s.io/client-go/tools/cache is "a client-side caching mechanism", i.e. something you might use in the absence of server-side caching. We have server-side caching with redis.

@ppbaena ppbaena added this to the 2021-Q2 milestone Jul 9, 2021
@gfichtenholt (Contributor, Author) commented Jul 11, 2021

Copied from slack thread:

Here is the latest on this subject. First of all, it is important to understand that Redis at the core was written as a "glorified" KV store (https://redis.io/topics/data-types-intro), meaning it has some very limited capabilities when it comes to querying. Most operations are like 'set key 'foo' to value 'bar'', or maybe the slightly more advanced 'add value 'bar' to the list of values for key 'foo''. These are all operations that perform in O(1) (constant) time, and they're very proud of that. It works very well for certain use cases they quote, like adding the latest tweet to the list of tweets by a user.

As far as filtering goes, you can ask for all keys, or for keys matching a very limited set of patterns. You can't even ask a question like "give me all keys that look like '*a' or '*b'" - redis/redis#3627 (basically, the OR operand is not supported in queries). No filtering by anything in the value exists in the core redis server.

As a side note, none of the KV stores I have played with in the past (etcd, consul and now redis) offer the kind of query capabilities you get with a SQL DB, like "give me a list of users whose country_code is 'UK' or 'FR'". So that part is pretty consistent. I think it's a trade-off: redis is faster/more efficient when it comes to doing very basic things compared to a SQL DB, but going beyond basic is difficult, to say the least.

My guess is that Redis Labs have realized they are losing some business to SQL vendors due to these limitations and have tried to address them. More on that below.
Also, it is important to note that as of today, for each registered repo I am creating one key, which is the repo name, and storing 'the value'. The value itself is basically the parsed repo index that contains all the chart versions with their names, keywords, categories, etc. This decision is of course not cast in stone, just my first attempt at a caching layer. We can have a discussion about what to store for each repo, and change that.

Because I have chosen the repo name as the key, I can implement a filter based on repo names only. But we have other filter options, such as by categories, app/pkg versions, etc. There is also the option of filtering by query text, which I understand has somewhat fuzzy semantics, where I think (some part of) the chart name plays into the match, as well as the chart keywords.
Anyway, to continue: the way I see it, filtering can theoretically be implemented in 3 places, starting at the top of the stack:

  1. UX - very undesirable, I understand the reasons Antonio mentioned previously
  2. a) core (aggregation) server AND/OR
    b) plug-in
    Since 2a and 2b are co-located in the kubeapps-apis pod, performance-wise there shouldn't be much of a difference
  3. the real back-end, meaning the redis server itself

Each of (1), (2) and (3) is separated by a network hop, so the farther down the stack you do the filtering, the less data travels across the network and up the stack; it is therefore most desirable to filter as far down the stack as you can.
Having said that, if we really need to do the filtering at (3), given the different filtering options we want to support, I see one of two possibilities:

  1. instead of creating one key per repo, create LOTS of keys, basically one for each chart, pkgVersion, category, etc. - everything that we'd want to have in our query criteria. Easy to say, but probably a nightmare to keep up-to-date as far as the cache is concerned, OR
  2. keep storing one key per repo, and store the indexed result as a JSON document that can be queried via JSONPath-like expressions (see the sketch after this list). Ah, but how, you ask? To be able to do that, we need a couple of "special" redis modules, called RediSearch and RedisJSON (https://redislabs.com/blog/index-and-query-json-docs-with-redis/). Redis was written with the philosophy of keeping the core very small but with the ability to add/load additional modules to extend functionality, which is what we'd be doing here. Apparently this latest release was announced just 3 days ago and is available in 'private preview'. I don't really know what that term means practically, but apparently the docker image is freely available on docker hub, or we can compile the bits ourselves as it is open source. I'd be happy to go down that road to see where it leads, but it may be a bit of a detour; I need to make sure I have support from the powers that be.
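For illustration only, here is roughly what option 2 could look like if the RediSearch and RedisJSON modules were loaded; the index name, key naming and schema fields are invented for this sketch, and go-redis's generic Do() is used since these are module commands:

```go
package repocache

import (
	"context"

	"github.com/go-redis/redis/v8"
)

// indexAndQueryCharts stores one JSON document per chart, creates a search
// index over them, and then queries by category - something core redis cannot
// do on values.
func indexAndQueryCharts(ctx context.Context, rdb *redis.Client) error {
	// One JSON document per chart, under a common key prefix.
	if err := rdb.Do(ctx, "JSON.SET", "chart:bitnami/mariadb", "$",
		`{"name":"mariadb","category":"Database","version":"9.3.1"}`).Err(); err != nil {
		return err
	}
	// Full-text/tag index over all chart:* documents (ignoring "index exists" errors here).
	if err := rdb.Do(ctx, "FT.CREATE", "chartIdx", "ON", "JSON", "PREFIX", "1", "chart:",
		"SCHEMA", "$.name", "AS", "name", "TEXT", "$.category", "AS", "category", "TAG").Err(); err != nil {
		return err
	}
	// Query by category.
	res, err := rdb.Do(ctx, "FT.SEARCH", "chartIdx", "@category:{Database}").Result()
	if err != nil {
		return err
	}
	_ = res // the reply is a nested array of matching keys and documents
	return nil
}
```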

@absoludity (Contributor)

> As a side note, none of the KV stores I have played with in the past (etcd, consul and now redis) offer the kind of query capabilities you get with a SQL DB, like "give me a list of users whose country_code is 'UK' or 'FR'". So that part is pretty consistent. I think it's a trade-off: redis is faster/more efficient when it comes to doing very basic things compared to a SQL DB, but going beyond basic is difficult, to say the least.

Yep, as you said, they're designed and optimized to store a value (only), not a structured document with its own fields and subfields. Both SQL DBs and document DBs are optimized for storing and querying structured data (postgresql just happens to do both normalized DB tables and document-db functionality).

> Also, it is important to note that as of today, for each registered repo I am creating one key, which is the repo name, and storing 'the value'. The value itself is basically the parsed repo index that contains all the chart versions with their names, keywords, categories, etc. This decision is of course not cast in stone, just my first attempt at a caching layer. We can have a discussion about what to store for each repo, and change that.

Yeah, I wondered about this - it makes perfect sense if you want to cache a simple value (the content of an index.yaml, as you're doing), but it may mean more work to query, or to use finer granularity, at the cache layer. Worth thinking about whether we can store something general but shared, such as the JSON for an AvailablePackage, for example (though no idea if that would work with Redis) - i.e. whether plugins can cache and query AvailablePackages, though perhaps there are other things that would be needed... look forward to hearing more of your thoughts/investigations there.

> My guess is that Redis Labs have realized they are losing some business to SQL vendors due to these limitations and have tried to address them. More on that below.

Interesting!

Regarding your options, I'd personally think that (3) is the best option if the backend can support it (this is what we're doing with the current helm support), while (2b) is less ideal but OK if the backend does not support it. I don't personally think (2a) is an option - the aggregate API should be aggregating results from the plugin API calls of the same signature (I'll post a separate comment about the pagination point that came up on slack).

Regarding option (3) with Redis, I'd definitely avoid creating our own indexing system (your 3.1, which you also identify as a nightmare :) ), but using the existing (or nascent) RediSearch indexing functionality you've pointed to could be useful?

> 2. keep storing one key per repo, and store the indexed result as a JSON document that can be queried via JSONPath-like expressions.

It's not clear to me from their docs yet whether a collection of data should be stored as one document (as in their books example: JSON.SET myDoc $ '{"books": [{"title": "Peter Pan", "price": 8.95}, {"title": "Moby Dick", "price": 12.99}]}'), or as separate documents (as in their indexing, querying and FT search example: JSON.SET myDoc $ '{"title": "foo", "content": "bar"}', which is followed by an index on $.title across all docs). It looks like the latter, if you want to be able to create FT indexes to return different docs in the set (though this also looks possible on a single doc with a collection, but perhaps not ideal).

I wonder if it's worth investigating creating a doc per available package detail, then indexing as needed. This would mean we could provide a trivial API for each plugin to cache and query packages. Not sure though, keen to see what you find out.

But to the larger question, yes, if you do go with Redis, I think we'll definitely be depending on these new RedisJSON and RediSearch capabilities. The only non-OSS part seems to be the Active:Active support, which isn't something we'd need anyway?

@absoludity (Contributor)

Dimitri said:

> one option that is worth exploring is to do a hybrid implementation.
> by that I mean:
>
>   • that plugins return all the packages if we can make it fast enough. plugins could apply some of the filtering, like the one that filters by repository name
>   • then have the core server do the aggregation and all the filtering/sorting/pagination.
>
> one reason this might be useful is that plugins can only deal with filtering. they cannot deal with sorting/pagination anyway, as this needs to be applied to the whole list and only the core server can do it.

I don't think this is true. If individual plugins provide pagination then we can aggregate them easily enough, I think. (Note that the page token used in the API is opaque and so can easily include offsets for each plugin. The only sticking point is that offset needs to be to a certain item in the result, rather than a page of the result, which would be ambiguous as we may have included only half of a page in the aggregated results).

> so might as well put filtering at the core layer as well.
> any thoughts?

I'm very keen for the aggregate API to support exactly the same signatures as the individual plugins (ie. filtering, pagination etc.)
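Purely as an illustration of the opaque-token idea above: the aggregated next-page token could just be a JSON map of per-plugin item offsets, base64-encoded so clients treat it as opaque. The structure and field names here are hypothetical:

```go
package pagination

import (
	"encoding/base64"
	"encoding/json"
)

// aggregatedPageToken records, per plugin, the offset of the next *item* (not
// the next page) to return, e.g. {"helm.packages": 120, "fluxv2.packages": 87}.
type aggregatedPageToken struct {
	Offsets map[string]int `json:"offsets"`
}

// encodeToken serializes the per-plugin offsets into an opaque page token.
func encodeToken(t aggregatedPageToken) (string, error) {
	b, err := json.Marshal(t)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(b), nil
}

// decodeToken recovers the per-plugin offsets from a token received from a client.
func decodeToken(s string) (aggregatedPageToken, error) {
	var t aggregatedPageToken
	b, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return t, err
	}
	err = json.Unmarshal(b, &t)
	return t, err
}
```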

@gfichtenholt (Contributor, Author) commented Jul 12, 2021

FYI, we had a team meeting today where I described the full context of the current situation with the caching layer to Pepe. He mentioned that there is a strong goal to have a kubeapps release mid-September which should include the 'direct helm' and 'flux' plug-ins: something working that we can demo. Given that, it may be more important at this time to focus on bringing the flux plug-in up to date to have feature parity with the helm plug-in, and not invest 2-3 weeks right now into a promising but not yet proven technology (RediSearch/RedisJSON), particularly given that they are in private preview mode at this time. What may look good in a blog post MAY turn out to be less than perfect. I've been around long enough to know that :-).

I told Pepe this would be fine, and the best option to accomplish this seems like option 2b I described above - filtering done on the flux plug-in side. That way, I can keep what I already have working (some very basic caching with redis, where I can use redis to filter by keys (repo names)), and implement the other filtering options in the flux plug-in itself, which shouldn't be too hard in the short term. I should have it done in 3 days or so. This would be a more iterative approach. Longer term, we will go back and revisit the RediSearch and RedisJSON modules and see if we can make them do the hard work. The changes would be contained in one plug-in only and wouldn't affect other layers of the code. The direct helm plugin would, for the time being, continue to use the postgres DB as the cache, as the caching layer I am working on isn't really finished. Antonio also suggested it may be worthwhile to look into a graph DB like Neo4j if redis doesn't deliver what we need, once we get enough time to look into what they offer with these new modules. This approach is fine with me. Pepe promised to discuss with Michael at the upcoming stand-up and let me know.

To be absolutely clear, I still believe that a lightweight, in-memory DB like redis is a better answer for caching purposes than postgres. We'll just table the unknowns for a little while to make progress with other features and get back to it when we have time.

@absoludity (Contributor)

Yep, happy to move forward with your 2b option if that means 3 days vs 3 weeks... especially given how new RediSearch/RedisJSON are. We can postpone the other plugins using a shared caching layer for now.

> To be absolutely clear, I still believe that a lightweight, in-memory DB like redis is a better answer for caching purposes than postgres.

It may turn out to be, with RedisJSON and RediSearch. Let's see (in the future).

> We'll just table the unknowns for a little while to make progress with other features and get back to it when we have time.

Yep, fine with me.

@ppbaena (Collaborator) commented Oct 13, 2021

As the usage of RedisJSON and RediSearch is not allowed due to license restrictions, and the kubeapps plugins are showing standard response times in our testing, IMO we should close this issue and reopen it only if we really need to add a caching layer for plugins, after testing in more demanding scenarios.

@ppbaena ppbaena moved this from Backlog to Waiting For Review in Kubeapps Oct 13, 2021
@absoludity (Contributor)

Sounds good. I've not done in-real-life tests of the flux plugin (i.e. via the UI), but I think Greg has done lots of testing (probably without the UI, but it should be the same). If we can add the bitnami repo and install charts without issue, then we don't need any more investigation here for now.

Let's make that the review task to move this to done. I'm keen to try it out anyway (probably next week). Thanks!

@gfichtenholt (Contributor, Author) commented Oct 14, 2021

Here is some data freshly collected just now, comparing the helm plug-in vs the fluxv2 plug-in. Querying the bitnami repo (the largest dataset we have, i.e. the worst-case scenario) in both cases for available packages with Category == "Database" via the REST API with Postman:
[Screenshots: Screen Shot 2021-10-13 at 5 49 13 PM, Screen Shot 2021-10-13 at 5 50 27 PM]

Total elapsed time: around 250 ms for direct helm plug-in, around 600 ms for fluxv2 plug-in

Another example, filter by query text 'maria':

[Screenshots: Screen Shot 2021-10-13 at 6 03 13 PM, Screen Shot 2021-10-13 at 6 05 09 PM]

Direct helm: 186 ms, fluxv2: 561 ms.
Same order of magnitude, but clearly slower. A bit disappointing to me, actually; I was hoping flux would come out ahead.
I am going to spend a little bit of time profiling to see what the majority of the time is taken up by, e.g. whether client- vs server-side filtering might be the culprit.

@absoludity (Contributor)

Great, thanks Greg. That's good to know. And yes, we can work to improve them as we move forward; we just want to ensure it's comparable, which it is. I'll still aim to check out where we're at using this via the UI on Monday.

@gfichtenholt (Contributor, Author) commented Oct 14, 2021

Ah, got to the bottom of it. The culprit had nothing to do with filtering. As a matter of fact, filtering took 0 time :-). The culprit was unmarshalling bytes from the redis cache into a var charts []models.Chart object via a call to json.Unmarshal(bytes, &charts). That takes 99.9% of the whole cost of the call.
Storing the data as bytes in the cache was an arbitrary decision I made back when I started because it was easiest to do and move on to other things. I bet I can improve on that now and beat the helm plug-in numbers.

@absoludity (Contributor)

Excellent, nice work Greg.

> I bet I can improve on that now and beat the helm plug-in numbers.

I bet you can too! The question is whether you can then improve the helm one afterwards too :P

@gfichtenholt (Contributor, Author) commented Oct 14, 2021

Yep, a simple switch from json to gob encoding (a few lines of code), and look at the difference :-)
[Screenshot: Screen Shot 2021-10-13 at 8 48 35 PM]
flux plugin (103ms) is faster than helm (250ms). Hehe. I will include this change with my next PR.
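For illustration, a sketch of the json-to-gob switch described above; chartCacheEntry is a hypothetical stand-in for the real models.Chart type, and the two functions mirror what the cache would do when writing to and reading from redis:

```go
package repocache

import (
	"bytes"
	"encoding/gob"
)

// chartCacheEntry is a hypothetical stand-in for models.Chart.
type chartCacheEntry struct {
	Name     string
	Versions []string
	Category string
}

// encodeForCache serializes the charts with gob before they are written to redis.
func encodeForCache(charts []chartCacheEntry) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(charts); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeFromCache replaces the json.Unmarshal call that dominated the request time.
func decodeFromCache(b []byte) ([]chartCacheEntry, error) {
	var charts []chartCacheEntry
	err := gob.NewDecoder(bytes.NewReader(b)).Decode(&charts)
	return charts, err
}
```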

The only other thing I wanted to say about this issue is this: at some point back there was talk of writing a generic caching module that all plug-ins could use. I tried my best to do that with this cache I wrote. As things stand today, someone who wants both the helm plug-in and the flux plug-in will have a postgresql installation as well as a redis installation in their cluster. Kind of a heavy footprint. As I stated before, I strongly feel redis (a lightweight, in-memory cache) is a better fit for our needs than postgresql (a full-blown relational database that pulls data off disk as needed, granted with many more features than a KV store). Anyway, part of me was hoping that at some point the helm plug-in might be refactored to use redis as a cache as well. It doesn't have to be now and it doesn't have to be part of this issue, I just wanted to mention it so we don't forget about it.

Having said all that, feel free to close this issue as you see fit
Thanks

@absoludity (Contributor)

> flux plugin (103ms) is faster than helm (250ms). Hehe. I will include this change with my next PR.

Awesome Greg!

> Anyway, part of me was hoping that at some point the helm plug-in might be refactored to use redis as a cache as well. It doesn't have to be now and it doesn't have to be part of this issue, I just wanted to mention it so we don't forget about it.

Yes, I think last time we discussed that, we were saying that as long as we can have the same functionality (querying and searching via json queries that both postgres and mongo have, or something that otherwise enables the same functionality when indexing) then we'd be keen. I don't remember exactly, but I thought that led to those extra modules for redis, which we can't include.

Either way, it'd be excellent if we can do that eventually.

@gfichtenholt (Contributor, Author) commented Oct 14, 2021

Well, clear communication is very important, we've said that time and again. In that spirit, I'd like to lay all the cards (or more like concerns) on the table. I'm not 100% sure that what I am about to say below belongs in a public forum; let me know if not, and I'll be happy to move it to a private forum.

Regarding

> "as long as we can have the same functionality (querying and searching via json queries that both postgres and mongo have, or something that otherwise enables the same functionality when indexing) then we'd be keen"

  • yes, I'd love it as well, but our legal team told us it's not going to happen with redis. Possibly a mistake on their part, but kind of difficult to fight. Still, what I am saying today is I can deliver what you want without this feature - client side (i.e. plug-in) filtering lets you be as flexible as you want, and I have just verified that it costs almost nothing. As in zero. As in better than fancy server-side filtering with postgresql that you're so fond of.

Let's switch gears for a second. Am I right in thinking that the scenario where a customer has two plug-ins (helm and flux) enabled, and therefore both postgresql and redis deployed, is unacceptable even for MVP? If I were a customer, I would certainly ask why two different caching solutions are used and why my maintenance/administration costs have doubled. If the answer is "not acceptable", this is an existential threat, an underground mine that's just waiting to explode at some future point. We shouldn't just ignore the issue; it'll come up and will be much more painful to deal with later. If the answer is "yes, it's acceptable for the MVP", then OK, we can proceed and drop this topic for a while.

@dlaloue-vmware (Collaborator)

catching up on this thread.

I agree with Greg that a decision must be made to switch caching over to redis or stay on postgres. We cannot have both solutions (except during a transition phase).

In general, redis should be better suited to caching than postgres. Greg has shown that there is no performance issue from performing searching/filtering in-code. We should be able to upgrade to using redis, and we can always use RedisJSON/RediSearch if/when it becomes possible.

@dlaloue-vmware (Collaborator)

There was an earlier comment from Michael about the TTL for CRDs that do not have update events:

> some plugins will need items to be updated regardless of whether a related k8s resource changes. For example, AppRepository CRs in our existing helm support just point to a URL and each currently has a cron-job to ensure it's checked periodically.

While it may be true that we can support this, in the case of AppRepository I think we should instead fix the AppRepository itself. Currently, the AppRepository does not have a status that provides conditions about its state, as is common in k8s. In particular, the AppRepository should have conditions regarding the last time the sync was executed, whether it failed, and also whether the index.yaml changed.

There is another benefit to having status conditions: they make it easier to provide status feedback in the UI. I believe today we do not show any status, and the only way to find out about issues is to look at the cron job (which a user may not have access to).
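To illustrate the suggestion (the field and condition names are hypothetical, not the actual kubeapps types), an AppRepository status with standard conditions might look something like this:

```go
package apprepo

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// AppRepositoryStatus is a sketch of the status sub-resource being proposed:
// conditions such as "Synced" would record when the last sync ran, whether it
// failed, and whether the fetched index.yaml actually changed.
type AppRepositoryStatus struct {
	// Standard k8s-style conditions (type, status, reason, message,
	// lastTransitionTime), surfaced in the UI and watchable by a cache.
	Conditions []metav1.Condition `json:"conditions,omitempty"`

	// Checksum of the last fetched index.yaml, so watchers can tell whether
	// the repo content changed since the previous sync.
	LastIndexChecksum string `json:"lastIndexChecksum,omitempty"`
}
```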

@gfichtenholt (Contributor, Author) commented Oct 15, 2021

agree 100% on both points above

(1) decide on ONE caching base technology (redis or postgresql)
(2) Somewhat independently of the decision on (1), the caching implementation I've got now is a "resource-watcher"-based cache. It is currently written with redis support, but could be re-written with postgresql support. The way it works is that it reacts to changes to a given resource, delivered as events. The current helm plug-in with the apprepository controller does not work this way today, i.e. when the contents of index.yaml are updated, the AppRepository CR does not change. Which is arguably a bug, or at least inconsistent with the way k8s resources are meant to work. So, if we were to change the caching used by the direct helm plug-in, we may need to either (a) write a new kind of caching layer on top of (1) or else (b) fix AppRepository to work differently. (b) seems like the more correct option.

Finally, as I've already stated, the filtering/searching limitations of core redis are a non-issue for the datasets we use in practice.

@absoludity (Contributor)

Sheesh, quite a thread to come back to after the weekend. Just want to re-iterate: I'm trying to say "Awesome [work] Greg", and on the following discussion to agree that, yes, as long as we can show that we have something that provides the same functionality to the user in the browser (whether that's by having access to the same json document search or something else that enables the same functionality to be exposed when indexing repos), I'm happy for us to consider updating the helm plugin to use the same at some point. The response reads as though something else was understood.

Replying in-line:

Regarding

> "as long as we can have the same functionality (querying and searching via json queries that both postgres and mongo have, or something that otherwise enables the same functionality when indexing) then we'd be keen"

> * yes, I'd love it as well, but our legal team told us it's not going to happen with redis. Possibly a mistake on their part, but kind of difficult to fight.

Right, I'm not involved there.

> Still, what I am saying today is I can deliver what you want without this feature - client side (i.e. plug-in) filtering lets you be as flexible as you want,

I understand you to mean here that the plugin itself does the filtering (still server-side, but yes, client-side relative to redis/postgres/whatever) rather than getting the backend technology (redis, postgres, mongo, whatever) to do the filtering. That's great, and I have no problem with that given how quick you've shown it to be. I think you've mentioned that for a paginated request at the end of a large index, this means scanning the whole index to return the last page, but that this is still much quicker and won't be an issue for memory. Great, let's start testing that end-to-end as a user of Kubeapps. I plan to set up Kubeapps with the plugin and give it a whirl this week.

> and I have just verified that it costs almost nothing. As in zero. As in better than fancy server-side filtering with postgresql that you're so fond of.

I'm a little sad about the tone being used here. I think we've chatted about this numerous times now and each time I've explained the history of how we ended up using postgresql's documentDB support here. It's not due to any fondness of mine.

> Let's switch gears for a second. Am I right in thinking that the scenario where a customer has two plug-ins (helm and flux) enabled, and therefore both postgresql and redis deployed, is unacceptable even for MVP?

It's not unacceptable for trying out the flux functionality in Kubeapps, as I'm going to do this week, but for an MVP I'm not sure that we can recommend enabling both plugins (Helm+Flux) at the same time, given that the Helm plugin will be picking up any helm installation, whether created by the Helm CLI, Kubeapps's helm plugin, or flux, rather than leaving the ones created by flux for the flux plugin to handle. We'd already talked about not currently supporting having both the Helm and Flux plugins enabled at the same time. It'd make more sense to have Helm+Carvel or Flux+Carvel enabled (when available), but only ever one Helm-based plugin. If we're not supporting a scenario where both Helm and Flux are enabled, we can similarly update the chart so that postgres isn't deployed if the helm plugin is not enabled.

If we later find that people do have a need to have both plugins enabled (Helm+Flux) we could do something about that, but I'm not sure we'll need/want to.

Longer-term, if we do end up continuing to support both plugins, I'd still be keen to invest in unifying them, for sure, but it could be (for example) that we retire the helm-direct plugin at some point in the future. On the other hand, if updating the helm plugin to use redis is something really important to you right now, I really don't mind. No one wants to stop that from happening.

Dimitri wrote:

> catching up on this thread.
>
> I agree with Greg that a decision must be made to switch caching over to redis or stay on postgres. We cannot have both solutions (except during a transition phase).

Hi Dimitri. See my reply above and let me know what you think.

Regarding the details of the AppRepository controller, can we please move that to a more relevant issue (if one exists already, or create one if not).

Just re-reading my earlier comment before submitting this one, I see that the words I used, "or something that otherwise enables the same functionality when indexing", could be read to mean that the technology itself had to enable json search capabilities for me to consider it for some reason, rather than meaning that the choice of technology just needs to otherwise enable the same functionality for the user (the ability to filter the data). I'm sorry that wasn't clearer. Just for the future, please know that I have no interest in blocking solutions that work well.

@gfichtenholt (Contributor, Author)

> Just re-reading my earlier comment before submitting this one, I see that the words I used, "or something that otherwise enables the same functionality when indexing", could be read to mean that I was saying that the technology itself had to enable json search capabilities for me to consider it for some reason, rather than meaning that the choice of technology just needs to otherwise enable the same functionality for the user (the ability to filter the data). I'm sorry that wasn't clearer. Just for the future, please know that I have no interest in blocking solutions that work well.

Yes, that's how I read it, hence my comments. Apologies for the tone; please don't take it to heart, I mean well. Not supporting helm+flux enabled at the same time, ever? That's a new one; it's the first time I am hearing anything like that. That does change the picture a bit.

@ppbaena ppbaena moved this from Waiting For Review to Done in Kubeapps Oct 25, 2021
@ppbaena ppbaena moved this from Done to Next iteration discussion in Kubeapps Oct 25, 2021
@ppbaena ppbaena moved this from Next iteration discussion to Backlog in Kubeapps Oct 25, 2021
@ppbaena ppbaena added kind/enhancement (An issue that reports an enhancement for an implemented feature) and removed kind/feature (An issue that reports a feature (approved) to be implemented) labels Sep 5, 2022
Labels: component/apis-server (Issue related to kubeapps api-server), kind/enhancement (An issue that reports an enhancement for an implemented feature)
Projects: Kubeapps - Status: 🗂 Backlog
5 participants: @absoludity @ppbaena @dlaloue-vmware @gfichtenholt and others