Add agent for request to artifacthub.io/api/chartsvc #66
Thanks |
Hi @sstarcher @tuananhnguyen-ct! This is Sergio, from Artifact Hub 👋 I'm so happy I stumbled across this PR 😄

A bit of context first

When the Helm Hub migrated to Artifact Hub, we started receiving a lot of search requests using the legacy search API endpoint. We realized that many of them were constantly (many times per second) searching for the same term, and in many cases for terms that were not even related to the kind of content available in the Hub (like random domains and other stuff). Requests were coming from a lot of different sources, from multiple cloud providers. We thought some software was doing those requests in an automated fashion, and we did our best to protect the service by imposing some rate limits and blocking most of them. Blocking some user-agents in that endpoint was one of the measures taken, which affected some of your users. After the fix in this PR was released, the number of requests that made it to our backend servers started growing again. To give you an idea of the current situation, we are receiving more than 108 million search requests per day in that endpoint, which is more than 1.25 thousand per second 😅. The vast majority of those requests are being blocked, so the impact on

How can we solve this

I realize many of your users may not even be aware that they are running into some issues, because even though the requests are being blocked we continue receiving them. It'd be great if we could work together on optimizing how I'm not very familiar with the
We'd be more than happy to provide a specific endpoint for your use case if that helps. We did something similar for Harbor replication some time ago for similar reasons. Maybe we could generate a list of all available packages and the latest version for each of them, as it looks like the end goal of your interaction with the Artifact Hub API is to get the latest version of a given chart. When I understand it's not your responsibility how users use Thank you very much in advance for your time and help! |
@tegioz that all makes sense to me. I'm very sorry I have caused you some headache. I no longer actively work on this project, but due to this causing you some headache I would be happy to take some time out and assist. Let me know if you think it's reasonable for you to develop a separate endpoint. |
Thank you for getting back to me so quickly @sstarcher! No worries, I'm happy we may have found the possible cause and I really appreciate your offer to help 🙂 Adding the endpoint suggested wouldn't be a problem at all. Let's summarize to check it all makes sense and we'll start working on our part as soon as possible:
Some questions:
[
{
"name": "chart name",
"latest_version": "1.0.0",
"repo": {
"name": "repo name",
"url": "repo url"
}
}
]
Thanks again! |
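To make the proposed payload concrete, here is a minimal Go sketch that decodes a response of the shape shown above and indexes it so the latest version of a chart can be looked up by repository URL and chart name. It assumes the field names from the JSON in this comment; the type names, `chartKey`, and `buildIndex` are hypothetical and not taken from helm-exporter or Artifact Hub.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Repo mirrors the "repo" object in the proposed response.
type Repo struct {
	Name string `json:"name"`
	URL  string `json:"url"`
}

// Chart mirrors one entry of the proposed response array.
type Chart struct {
	Name          string `json:"name"`
	LatestVersion string `json:"latest_version"`
	Repo          Repo   `json:"repo"`
}

// chartKey is a hypothetical lookup key: repository URL plus chart name.
type chartKey struct {
	RepoURL string
	Name    string
}

// buildIndex decodes the JSON dump and returns the latest version of each
// chart, keyed by repository URL and chart name.
func buildIndex(data []byte) (map[chartKey]string, error) {
	var charts []Chart
	if err := json.Unmarshal(data, &charts); err != nil {
		return nil, fmt.Errorf("decoding chart dump: %w", err)
	}
	index := make(map[chartKey]string, len(charts))
	for _, c := range charts {
		index[chartKey{RepoURL: c.Repo.URL, Name: c.Name}] = c.LatestVersion
	}
	return index, nil
}
```

Keying on both the repository URL and the chart name avoids collisions between charts that share a name across repositories.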
Have you done an analysis on the user agent? Anyone running this should have a golang user agent. That would give you more of an idea of this project and other golang projects being the cause. We are using One thing we should probably do is have this project set its own user agent so it can be identified more easily in the future. I can't speak for all users, but I think a 6 hour rate would be reasonable. I do agree that having both the chart name and the repo URL would be very helpful. Do you know the overall size of the data structure if we were to have it in JSON with the above info? I don't want to put a ton of burden on the client end, but if it's something we can easily store in a few MB of memory that would make searching on the client easy. |
Most of the requests have that user-agent, that's actually one of the filters we have in place. It was set to
That sounds like a great idea! We need to keep in mind that there will probably be older versions out there for a while.
Awesome, we'll do that then 👍
I'd say it should be around 1MB as of today, but I can't tell with more precision right now. I think we have enough to start implementing the new endpoint. If you can think of something else please don't hesitate to ping me 🙂 |
Related to: sstarcher/helm-exporter#66 Signed-off-by: Sergio Castaño Arteaga <tegioz@icloud.com> Signed-off-by: Cintia Sanchez Garcia <cynthiasg@icloud.com> Co-authored-by: Sergio Castaño Arteaga <tegioz@icloud.com> Co-authored-by: Cintia Sanchez Garcia <cynthiasg@icloud.com>
New endpoint is ready @sstarcher 🙂 https://artifacthub.io/api/v1/helm-exporter Response size is 715KB as of today. It'll be cached for one hour, so Thanks! |
Reference documentation: https://artifacthub.io/docs/api/#/Integrations/getHelmExporterDump |
@tegioz awesome, thanks. I can likely make this update this upcoming weekend. |
Awesome, thanks! |
Thanks for taking care of this @sstarcher! I will take a look at it shortly 👍 |
Change the URL from hub.helm.sh to artifacthub.io so we can skip the redirection
Add the agent to fix #64
It seems only this endpoint requires a user agent.