
XMLRPC statistics on "abusive" requests #9136

Open
abitrolly opened this issue Feb 25, 2021 · 6 comments
Labels: needs discussion (a product management/policy issue maintainers and users should discuss)

Comments

@abitrolly
Contributor

What's the problem this feature will solve?

An ongoing two-month outage of XMLRPC search, reported at https://status.python.org/incidents/grk0k7sz6zkp, could be solved by optimizing or caching popular queries (a toy caching sketch follows).
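
For illustration only, caching popular queries might look roughly like this; the TTL, the query shape, and `backend` are assumptions, not Warehouse code:

```python
# Toy TTL cache for popular search queries, keyed on a normalized form of
# the query dict. Purely illustrative, not Warehouse code.
import time

_CACHE = {}
TTL_SECONDS = 300

def _key(query):
    # Normalize so {"name": ["requests"]} and insertion order don't matter.
    return tuple(
        (field, tuple(value) if isinstance(value, list) else value)
        for field, value in sorted(query.items())
    )

def cached_search(query, backend):
    key = _key(query)
    entry = _CACHE.get(key)
    if entry and time.monotonic() - entry[0] < TTL_SECONDS:
        return entry[1]  # cache hit: skip the expensive backend call
    result = backend(query)
    _CACHE[key] = (time.monotonic(), result)
    return result
```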

Describe the solution you'd like

I'd like to see the volume and contents of the following (a collection sketch is below the list):

  • the most popular API requests
  • the longest-running API requests
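
For instance, such stats could be aggregated from request logs along these lines; the JSON-lines log format and field names are assumptions, not Warehouse's actual logs:

```python
# Sketch: surface the most popular and the slowest XMLRPC requests from a
# JSON-lines log of {"method": ..., "query": ..., "ms": ...} records.
# The log format is an assumption.
import json
from collections import Counter

def summarize(log_path, top=10):
    popular = Counter()
    slowest = []
    with open(log_path) as fh:
        for line in fh:
            rec = json.loads(line)
            key = (rec["method"], rec["query"])
            popular[key] += 1
            slowest.append((rec["ms"], key))
    slowest.sort(reverse=True)
    return popular.most_common(top), slowest[:top]
```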

Additional context

Depending on the statistics, it may be possible to provision additional index servers to offload API requests, or to provide a way for organizations to incrementally sync the database. Sync could be done either through global event notifications, similar to Fedora Messaging, or through the standard P2P Merkle-tree lookup mechanism employed by blockchains; a toy sketch of the latter follows.
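
To make the Merkle-tree idea concrete, here is one level of such a tree as a toy sketch; bucketing by first letter and the `{name: serial}` shape are assumptions, not an existing PyPI mechanism:

```python
# Toy Merkle-style diff: hash buckets of (name, serial) pairs; a mirror
# compares its bucket hashes against the server's and re-fetches only the
# buckets whose hashes differ.
import hashlib
from collections import defaultdict

def bucket_hashes(projects):
    """projects: {name: last_serial}. Returns {bucket_key: sha256 hex}."""
    buckets = defaultdict(list)
    for name, serial in sorted(projects.items()):
        buckets[name[0]].append(f"{name}:{serial}")
    return {
        key: hashlib.sha256("\n".join(rows).encode()).hexdigest()
        for key, rows in buckets.items()
    }

def changed_buckets(local, remote):
    lh, rh = bucket_hashes(local), bucket_hashes(remote)
    return {key for key in set(lh) | set(rh) if lh.get(key) != rh.get(key)}
```

A real tree would nest this, so a single root hash answers "did anything change?" before any buckets are compared.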

@ewdurbin
Member

ewdurbin commented Mar 4, 2021

[Graph: XMLRPC call rate over time]

Our current attempted call rate for the disabled search endpoint is roughly 100 rps (yellow trace). All of these requests receive either a rate-limit response (brown trace) or a disabled response (red trace). The call rate has not changed since we implemented rate limiting or disabled search.

The issue isn't solely one of provisioning resources to sustain the search volume; it is that we don't have any viable mechanism to communicate with the users of the very expensive XMLRPC API who abuse the endpoint. Architecturally, XMLRPC being based on POST requests, combined with the high cardinality of results (search queries are arbitrary), makes caching this at the CDN edge, or otherwise reducing the load imposed on our backends, untenable in the long run. For example:
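
```python
# Every XMLRPC call is "POST /pypi" with the query inside the XML body,
# so an edge cache keyed on method + URL sees identical requests for
# completely different queries. (search now returns a disabled fault.)
import xmlrpc.client

client = xmlrpc.client.ServerProxy("https://pypi.org/pypi")
client.search({"name": "requests"})   # POST /pypi, query only in the body
client.search({"name": "flask"})      # same method, same URL, different body
```

A GET-based search with the query in the URL would be trivially cacheable; the POST body is what defeats the edge.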

Our current search is based on Elasticsearch, which I'm not familiar enough with to determine whether such incremental syncs are viable.
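
For illustration only, an incremental sync might look roughly like this, assuming a change feed (the hypothetical `fetch_changed()` below) that does not exist today:

```python
# Hypothetical incremental sync: pull projects changed since a serial and
# bulk-upsert them into a local Elasticsearch index.
from elasticsearch import Elasticsearch, helpers

def fetch_changed(since):
    """Placeholder: yield {"name": ..., "summary": ...} docs changed
    after `since`; providing such a feed is the open question here."""
    return iter(())

def sync_changes(es, last_serial):
    actions = (
        {"_index": "projects", "_id": doc["name"], "_source": doc}
        for doc in fetch_changed(last_serial)
    )
    ok, _errors = helpers.bulk(es, actions, raise_on_error=False)
    return ok  # number of documents successfully indexed
```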

@abitrolly
Contributor Author

@ewdurbin is it possible to publish popularity stats for these ~100 rps without actually serving the requests? Without that, we can only state that optimization in the general sense is impossible.

@ewdurbin
Member

ewdurbin commented Mar 4, 2021

Popularity in what sense?

@abitrolly
Contributor Author

The structure of the requests: which queries are sent, and how popular each kind is. Then it will be possible to determine the overhead of particular query structures, set selective filters to cut the expensive requests, and optimize the most popular ones further; see the sketch below.
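
Something like this (illustrative; the query shape and `handler` are assumptions):

```python
# Sketch: bucket search calls by query "shape" (which fields are queried,
# not their values) so per-structure cost and popularity can be compared.
import time
from collections import Counter, defaultdict

shape_counts = Counter()
shape_seconds = defaultdict(float)

def record(query, handler):
    shape = tuple(sorted(query))  # e.g. ("name",) or ("name", "summary")
    start = time.monotonic()
    result = handler(query)
    shape_counts[shape] += 1
    shape_seconds[shape] += time.monotonic() - start
    return result
```

Expensive shapes with low counts are candidates for filtering; cheap, popular shapes are candidates for caching.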

@di
Member

di commented Mar 11, 2021

How do you propose to "set selective filters to cut expensive requests" and how would that be less expensive than the current response?

@di added the needs discussion label Mar 11, 2021
@abitrolly
Contributor Author

Filters can be set at the load balancer, at the web server, at the middleware, or at the Django level. It might even be possible to set them at the SQL level, if the database's EXPLAIN output shows that a query is too expensive to run. Whatever method is chosen, it depends on metrics; the best way is to add OpenTracing, of course. Maybe the "abusive" requests are just malformed XML that makes the parser choke. A middleware-level sketch is below.
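
A minimal WSGI-level sketch of such a filter (framework-agnostic, so it could sit in front of Django or anything else; the /pypi path, the method list, and the fault text are assumptions, not Warehouse's actual code):

```python
# Peek at the XMLRPC method name and short-circuit expensive calls before
# they reach the application. Illustrative only.
import io
import xmlrpc.client

class BlockExpensiveCalls:
    BLOCKED = {"search"}

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ["REQUEST_METHOD"] == "POST" and environ.get("PATH_INFO") == "/pypi":
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length)
            try:
                _params, method = xmlrpc.client.loads(body)
            except Exception:
                method = None  # malformed XML: worth counting separately too
            if method is None or method in self.BLOCKED:
                fault = xmlrpc.client.dumps(
                    xmlrpc.client.Fault(-32500, "search is disabled"),
                    methodresponse=True,
                )
                start_response("200 OK", [("Content-Type", "text/xml")])
                return [fault.encode("utf-8")]
            environ["wsgi.input"] = io.BytesIO(body)  # replay body downstream
        return self.app(environ, start_response)
```

Counting how often the malformed branch fires would also answer whether the "abusive" traffic is broken clients rather than deliberate load.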
