Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle namespace in GetAvailablePackages and GetPackageRepositories(), if applicable #2905

Merged
merged 26 commits into from
May 31, 2021

Conversation

gfichtenholt
Copy link
Contributor

No description provided.

@gfichtenholt gfichtenholt self-assigned this May 29, 2021
@gfichtenholt gfichtenholt added this to In progress in Kubeapps via automation May 29, 2021
Copy link
Contributor

@absoludity absoludity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to figure out why your PRs have all the former commits in the history - but it's only aesthetic, given that we're squashing commits on merge.

On a semi-related note, I noticed while testing the generic endpoint on friday that getting the list of packages is taking around 7s or more? I assume this is because it's pulling the whole index.yaml each time? I tihnk you mentioned that it does cache it internally but I guess it's still a decent size to unpack etc. Do you have thoughts of how we can reduce that to be usable in a UX? As you found, the index.yaml contains all versions ever published (or close to it), so I wonder if we will need to cache something like the latest versions for each chart for use when searching the catalog?

var resourceIfc dynamic.ResourceInterface = resource
if namespace != "" {
resourceIfc = resource.Namespace(namespace)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the little I've played with fluxv2, it looks as though when you use the flux cli (flux create source helm bitnami ... as per the dev setup), it creates a repo in the flux-system namespace?

So it seems this namespace is special? Do repos in this namespace make charts available in all namespaces? If so, I wonder what the default behaviour here should be. ie. when no namespace is included in the request, should we return all namespaces or the flux-system ones? I think it's probably correct and best as is (as you've got it) so that it can be consistent across packaging formats. And I assume you can install fluxv2 in other namespaces for a custom setup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to figure out why your PRs have all the former commits in the history - but it's only aesthetic, given that we're squashing commits on merge.

On a semi-related note, I noticed while testing the generic endpoint on friday that getting the list of packages is taking around 7s or more? I assume this is because it's pulling the whole index.yaml each time? I tihnk you mentioned that it does cache it internally but I guess it's still a decent size to unpack etc. Do you have thoughts of how we can reduce that to be usable in a UX? As you found, the index.yaml contains all versions ever published (or close to it), so I wonder if we will need to cache something like the latest versions for each chart for use when searching the catalog?

yes, on PR having prior commits in history. I was asking the same thing a couple of weeks back and would love to resolve it. If you have suggestions how, I will be happy to try.

regarding index.yaml. No, it is not pulling index.yaml each time. index.yaml is indeed cached, once pulled. I instrumented the code and it is the unmarshalling/parsing of the (huge) index.yaml into helmrepo.IndexFile struct that takes all the time. Which is a shame in this case, cuz we discard all but the latest chart versions, which is like 95% for bitnami repo. Signal-to-noise ratio is horrible, I agree. I don't see any options to tell flux to only index latest charts https://fluxcd.io/docs/components/source/helmrepositories/. But I am going to be on the lookout for this. Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be even more specific it is this line of code in getHelmIndexFileFromURL() that takes up most of the time:
err = yaml.Unmarshal(contents, &index)
looking at what it does seems the cost is converting YAML to JSON. if somehow flux would cache the JSON version of the repo, we would be good. Or I can try to write my own parser without converting to JSON to see if I can do better :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, index.yaml for bitnami repo is > 8MB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and it contains > 13K chart versions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think bitnami repo is one of, if not, THE WORST CASE scenario. Just open the attached index.yaml in an editor and take a look at what you're dealing with :-) I guess what I am saying I don't know if I would use bitnami repo as a measuring stick

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example this one
apiVersion: source.toolkit.fluxcd.io/v1beta1 kind: HelmRepository metadata: name: jetstack namespace: default spec: url: https://charts.jetstack.io interval: 1m
it takes only 30ms to get packages from. Pretty acceptable UX experience

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the bitnami index is crazily big - we've always had a similar issue in Kubeapps when we sync it, but it happens in the background.

The Bitnami catalog is an outlier because it is a catalog of charts (and contains all historical versions, which is fine for the helm cli). I don't know whether there are other supported catalogs like Bitnami's.

I don't know that it would help that much to have a yaml of just the latest versions, as we'd still need to parse the full version when a user views a particular chart (if we want to present the available options), I think?

I do wonder whether we might need to provide a caching layer to the kubeappsapis service (redis or something) which plugins can utilise. Not sure yet, let's see what you find. A custom parser that just pulls out the bits needed without marshalling the whole thing into memory might be interesting :) Excellent if we don't need a caching layer there.

@gfichtenholt gfichtenholt merged commit 9bcf615 into vmware-tanzu:master May 31, 2021
Kubeapps automation moved this from In progress to Done May 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Kubeapps
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants