"too many requests by the client" when executing two XML-RPC methods in a row #8753

Closed
evgeni opened this issue Oct 28, 2020 · 11 comments · Fixed by #8756

@evgeni

evgeni commented Oct 28, 2020

Describe the bug
The warehouse XML-RPC documentation has a nice example of how to obtain information about a package:

import xmlrpc.client
import pprint
client = xmlrpc.client.ServerProxy('https://pypi.org/pypi')
client.package_releases('roundup')
pprint.pprint(client.release_urls('roundup', '1.6.0'))

However, due to recent rate-limiting changes, the example doesn't work anymore and only raises:

xmlrpc.client.Fault: <Fault -32500: 'HTTPTooManyRequests: The action could not be performed because there were too many requests by the client. Limit may reset in 0 seconds.'>

Expected behavior
Example from the documentation (or any other XML-RPC based client) works.

To Reproduce

Save the above as /tmp/test.py and run it with Python 3:

$ python3 /tmp/test.py
Traceback (most recent call last):
  File "/tmp/test.py", line 5, in <module>
    pprint.pprint(client.release_urls('roundup', '1.6.0'))
  File "/usr/lib64/python3.8/xmlrpc/client.py", line 1109, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python3.8/xmlrpc/client.py", line 1450, in __request
    response = self.__transport.request(
  File "/usr/lib64/python3.8/xmlrpc/client.py", line 1153, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python3.8/xmlrpc/client.py", line 1169, in single_request
    return self.parse_response(resp)
  File "/usr/lib64/python3.8/xmlrpc/client.py", line 1341, in parse_response
    return u.close()
  File "/usr/lib64/python3.8/xmlrpc/client.py", line 655, in close
    raise Fault(**self._stack[0])
xmlrpc.client.Fault: <Fault -32500: 'HTTPTooManyRequests: The action could not be performed because there were too many requests by the client. Limit may reset in 0 seconds.'>

My Platform
Python 3.8 on Fedora 32, but that shouldn't matter, really.

Additional context

@evgeni
Author

evgeni commented Oct 28, 2020

@di I don't think that adding a sleep to the example is the right way to fix this.

  1. The error raised by the API does not tell anything about that (even worse, it says "0 seconds", which is clearly wrong, you need to sleep 1 second)
  2. I don't think it is feasible to add appropriate sleeps to existing consumers, as they will have more than just a few calls, probably shuffled all over the place.

One such consumer is https://github.com/fedora-python/pyp2rpm

@di
Member

di commented Oct 28, 2020

The error raised by the API does not tell anything about that (even worse, it says "0 seconds", which is clearly wrong, you need to sleep 1 second)

The message is technically correct (in reality it's a fraction of a second that is being rounded down) but I'll agree it's confusing. I've made #8757 to address this.
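
To illustrate the rounding effect described here, a minimal sketch follows. It is not warehouse's code and not necessarily what #8757 does; the function names and the 0.4 second value are made up for the illustration.

```python
import math


def reset_hint_floor(seconds_remaining: float) -> int:
    """Round down: a fraction of a second is reported as 0."""
    return int(seconds_remaining)


def reset_hint_ceil(seconds_remaining: float) -> int:
    """Round up: a fraction of a second is reported as 1."""
    return math.ceil(seconds_remaining)


remaining = 0.4  # hypothetical time left until the limit resets
print(f"Limit may reset in {reset_hint_floor(remaining)} seconds.")  # 0 seconds
print(f"Limit may reset in {reset_hint_ceil(remaining)} seconds.")   # 1 seconds
```

Rounding up gives a client a wait time that is never shorter than the real one, which is the safer direction for a retry hint.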

I don't think it is feasible to add appropriate sleeps to existing consumers, as they will have more than just a few calls, probably shuffled all over the place.

For the purposes of the original issue (the documentation example), it is sufficient. Clients consuming this API will need to adhere to the rate limit. We're not planning on relaxing or changing this rate limit in the future.

@evgeni
Author

evgeni commented Oct 29, 2020

While I still think that 1 request/second is way too harsh a limit to make the API really usable, I came up with the following to make my life a tad easier in regard to adhering to the limit:

import time
import xmlrpc.client

class RateLimitedServerProxy(xmlrpc.client.ServerProxy):

    def __getattr__(self, name):
        time.sleep(1)
        return super(RateLimitedServerProxy, self).__getattr__(name)

client = RateLimitedServerProxy('https://pypi.org/pypi')

Feel free to add that to the examples.
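
A possible variation on the workaround above, as a sketch only: it throttles at the transport level, so the wait happens when a request is actually sent rather than on attribute lookup, and it sleeps only for whatever is left of the interval. `Throttle` and `ThrottledTransport` are hypothetical names, and the 1 second default is taken from this thread, not from any guarantee by PyPI.

```python
import time
import xmlrpc.client


class Throttle:
    """Block until at least `min_interval` seconds have passed since the last call."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


class ThrottledTransport(xmlrpc.client.SafeTransport):
    """HTTPS transport that spaces out requests using a Throttle."""

    def __init__(self, *args, throttle=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.throttle = Throttle() if throttle is None else throttle

    def request(self, host, handler, request_body, verbose=False):
        self.throttle.wait()  # wait right before the HTTP request goes out
        return super().request(host, handler, request_body, verbose)


# Constructing the proxy does not contact the server.
client = xmlrpc.client.ServerProxy(
    "https://pypi.org/pypi", transport=ThrottledTransport()
)
```

The first call goes out immediately; later calls wait only as long as needed to keep the configured spacing.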

evgeni added a commit to evgeni/pyp2rpm that referenced this issue Oct 29, 2020
pypi.org has rate-limited their XML-RPC API to 1 request/second, see
pypi/warehouse#8753 for details.

Let's adhere to that limit until we migrate to the new JSON API.
@gordonmessmer

pyp2rpm is affected by this, and I can add sleep to my own code to address the rate limit, but there's another problem.

pyp2rpm calls virtualenv (which calls pip), which appears to also use the XMLRPC interface with a simple retry on failure. Often, this appears to create a loop that never exits, though some of the time it does work and some of the time it does fail. Since all of this is happening in an external process, I don't have much control over it.

I'm not getting any of the output from the python 2 pip process, so I suppose it's possible that isn't rate-limit related, but that seems like the most likely explanation.

At this point, I don't see any solution other than removing all support for Python 2.

@pradyunsg
Contributor

pradyunsg commented Oct 30, 2020

pyp2rpm calls virtualenv (which calls pip), which appears to also use the XMLRPC interface with a simple retry on failure.

Which version of virtualenv+pip is this, and what pip/virtualenv command are you running?

@gordonmessmer
Copy link

Python 2.7, virtualenv 20.0.25, pip 20.1.1

 104297 pts/0    Sl+    0:00  |       |       \_ /home/gordon/git/pyp2rpm/.tox/py27/bin/python /home/gordon/git/pyp2rpm/.tox/py27/bin/virtualenv -p python2 venv
 104302 pts/0    S+     0:00  |       |           \_ /home/gordon/git/pyp2rpm/.tox/py27/bin/python -c from virtualenv.seed.wheels.periodic_update import do_update;do_update(u'setuptools', '2.7', '/home/gordon/git/pyp2rpm/.tox/py27/lib/python2.7/site-packages/virtualenv/seed/wheels/embed/setuptools-44.1.1-py2.py3-none-any.whl', '/root/.local/share/virtualenv', [], True)
 104366 pts/0    S+     0:01  |       |           |   \_ /home/gordon/git/pyp2rpm/.tox/py27/bin/python -m pip download --disable-pip-version-check --only-binary=:all: --no-deps --python-version 2.7 -d /root/.local/share/virtualenv/wheel/house setuptools<44.0.0
 104303 pts/0    S+     0:00  |       |           \_ /home/gordon/git/pyp2rpm/.tox/py27/bin/python -c from virtualenv.seed.wheels.periodic_update import do_update;do_update(u'wheel', '2.7', '/home/gordon/git/pyp2rpm/.tox/py27/lib/python2.7/site-packages/virtualenv/seed/wheels/embed/wheel-0.34.2-py2.py3-none-any.whl', '/root/.local/share/virtualenv', [], True)
 104375 pts/0    R+     0:00  |       |           |   \_ /home/gordon/git/pyp2rpm/.tox/py27/bin/python -m pip download --disable-pip-version-check --only-binary=:all: --no-deps --python-version 2.7 -d /root/.local/share/virtualenv/wheel/house wheel<0.34.0
 104304 pts/0    S+     0:00  |       |           \_ /home/gordon/git/pyp2rpm/.tox/py27/bin/python -c from virtualenv.seed.wheels.periodic_update import do_update;do_update(u'pip', '2.7', '/home/gordon/git/pyp2rpm/.tox/py27/lib/python2.7/site-packages/virtualenv/seed/wheels/embed/pip-20.1.1-py2.py3-none-any.whl', '/root/.local/share/virtualenv', [], True)
 104361 pts/0    S+     0:00  |       |               \_ /home/gordon/git/pyp2rpm/.tox/py27/bin/python -m pip download --disable-pip-version-check --only-binary=:all: --no-deps --python-version 2.7 -d /root/.local/share/virtualenv/wheel/house pip<20.2.2

@gordonmessmer

I also don't think the rate limiting at PyPI is working correctly. I've updated pyp2rpm to sleep 2 seconds after a failed request, and I still see repeated failures. On Fault, the application logs the error, then sleeps 2 seconds, then retries:

2020-10-31 00:01:02,573::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
2020-10-31 00:01:04,688::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
2020-10-31 00:01:06,805::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
2020-10-31 00:01:08,907::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
2020-10-31 00:01:11,000::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
2020-10-31 00:01:13,092::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
2020-10-31 00:01:15,204::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
2020-10-31 00:01:17,316::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
2020-10-31 00:01:19,455::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
2020-10-31 00:01:21,719::pyp2rpm.convertor::INFO::sleeping due to xmlrpc fault for release_urls
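
For comparison with the fixed 2 second sleep logged above, here is a sketch of a retry helper that doubles the delay after each rate-limit fault. It is not pyp2rpm's code; `call_with_backoff` and its parameters are hypothetical, and whether growing delays are enough depends on how the server-side limit actually behaves.

```python
import time
import xmlrpc.client


def call_with_backoff(func, *args, retries=5, base_delay=2.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call func(*args); on a rate-limit Fault, wait and retry with growing delays."""
    for attempt in range(retries + 1):
        try:
            return func(*args)
        except xmlrpc.client.Fault as fault:
            # Re-raise unrelated faults, and give up after the last attempt.
            if "HTTPTooManyRequests" not in fault.faultString or attempt == retries:
                raise
            # 2 s, 4 s, 8 s, ... capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

With a client like the ones earlier in this thread, usage would look like `call_with_backoff(client.release_urls, 'roundup', '1.6.0')`.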

@gordonmessmer

Now I see...

xmlrpc.client.Fault: <Fault -32500: 'HTTPTooManyRequests: The action could not be performed because there were too many requests by the client. Limit may reset in 52 seconds.'>

The documented requirement of sleeping one second is not adequate to deal with the actual rate limit.
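
Since the fault text carries a reset hint, one option is to read the number out of it instead of sleeping a fixed time. This is a sketch under the assumption that the message keeps the wording quoted in this thread; that wording is not a documented contract and may change. `reset_delay` is a hypothetical helper name.

```python
import re
import xmlrpc.client

# Pattern based on the fault messages quoted in this thread.
_RESET_RE = re.compile(r"Limit may reset in (\d+) seconds")


def reset_delay(fault, minimum=1.0):
    """Return how long to wait according to the fault text, or None if no hint."""
    match = _RESET_RE.search(fault.faultString)
    if match is None:
        return None
    # A reported "0 seconds" may be a rounded-down fraction, so wait at least `minimum`.
    return max(minimum, float(match.group(1)))
```

A caller could catch `xmlrpc.client.Fault`, sleep for `reset_delay(fault)` when it is not `None`, and then retry.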

Furthermore, the documentation states:

Users of this API are strongly encouraged to subscribe to the pypi-announce mailing list for notices as we begin the process of removing XML-RPC from PyPI.

... but I've looked at the list archives for the last few months, and I don't see any notices regarding this stage of the deprecation.

Would you consider removing the rate limit until its actual behavior has been announced, and the documentation describes how to adjust legacy clients while they're ported to the JSON API?

@pfmoore
Contributor

pfmoore commented Nov 19, 2020

I have just been hit by this. I'm fine with the idea that people should move to another API, but what is the replacement for the mirroring API (changelog_last_serial and changelog_since_serial)?

@pfmoore
Contributor

pfmoore commented Nov 19, 2020

Also, there have been absolutely no messages about this on pypi-announce, and the "RSS API" mentioned has no documentation, and the JSON API has no equivalents for the parts of the XMLRPC API that are important for those of us trying to incrementally maintain an offline mirror of bulk PyPI data. (By "incrementally" I hope it's clear that I mean "in a server-friendly manner" 🙁)
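
For readers unfamiliar with the mirroring calls mentioned here, a rough sketch of the incremental pattern follows. It assumes `changelog_since_serial(serial)` returns rows of `(name, version, timestamp, action, serial)`, which is my understanding of the XML-RPC API and should be checked against the current documentation; `changed_projects` is a hypothetical helper.

```python
def changed_projects(client, last_seen_serial):
    """Return (sorted project names changed since last_seen_serial, newest serial seen)."""
    # One XML-RPC call per sync, which stays well within a 1 request/second limit.
    events = client.changelog_since_serial(last_seen_serial)
    names = sorted({event[0] for event in events})
    newest = max((event[4] for event in events), default=last_seen_serial)
    return names, newest
```

A mirror would store the returned serial and pass it in on the next run, so each sync only asks for what changed since the previous one.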

@lsaffre

lsaffre commented Dec 7, 2020

While I still think that 1 request/second is way too harsh a limit to make the API really usable, I came up with the following to make my life a tad easier in regard to adhering to the limit:

import time
import xmlrpc.client

class RateLimitedServerProxy(xmlrpc.client.ServerProxy):

    def __getattr__(self, name):
        time.sleep(1)
        return super(RateLimitedServerProxy, self).__getattr__(name)

client = RateLimitedServerProxy('https://pypi.org/pypi')

Feel free to add that to the examples.

Thanks @evgeni, I had the same problem because I request the status of my project before releasing, and your workaround helped me. Summary of my release script:

client = RateLimitedServerProxy('https://pypi.python.org/pypi')
released_versions = client.package_releases(name)
urls = client.release_urls(name, released_versions[-1])

6 participants