Document API policies #3333

Merged
merged 5 commits into from
Mar 21, 2018
Member

@di di commented Mar 21, 2018

Fixes #2931.

@di di requested review from brainwane and ewdurbin March 21, 2018 18:30
Member

@ewdurbin ewdurbin left a comment

Add some flavor about XMLRPC

For periodically checking for new packages or updates to existing packages,
please use our RSS feeds.

If at all possible, it is recommended to use the JSON/RSS/Legacy APIs over

Member

Use stronger language; it may be worth stating that XMLRPC will go away as soon as we find a way to remove it.

No new integrations should use the XMLRPC interfaces that PyPI provides; they are planned for deprecation.

~~~~~~~~~~~~~

Due to the heavy caching and CDN use, there is currently no rate limiting of
PyPI APIs.

Member

*At the edge. Requests that hit our backends (uploads, XMLRPC) may be rate limited (automatically or manually) if they are causing degradation of service.

Member

Additionally, we should reserve the right to temporarily prohibit a client based on activity.

All API requests also provide an ``ETag`` header. If you're making a lot of
repeated requests, please ensure your API consumer will respect this header to
determine whether to actually repeat a request or not.

Member

We need an exception for XMLRPC here. At this point it provides no information about whether the response was cached internally.
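
For readers unfamiliar with the ``ETag`` guidance in the hunk above, here is a minimal sketch of a consumer that respects it, using only the Python standard library. The in-memory cache, the function names, and the `opener` parameter are assumptions made for illustration and are not part of this PR or of PyPI's documentation. Per the comment above, this would apply to the JSON, RSS, and Legacy APIs rather than XMLRPC.

```python
import urllib.error
import urllib.request

# Hypothetical in-memory cache mapping URL -> (etag, body).
_cache = {}


def build_conditional_request(url, cache=_cache):
    """Return a Request that carries If-None-Match when an ETag is cached."""
    request = urllib.request.Request(url)
    cached = cache.get(url)
    if cached is not None:
        etag, _body = cached
        request.add_header("If-None-Match", etag)
    return request


def fetch(url, cache=_cache, opener=urllib.request.urlopen):
    """Fetch url, reusing the stored body when the server answers 304."""
    request = build_conditional_request(url, cache)
    try:
        with opener(request) as response:
            body = response.read()
            etag = response.headers.get("ETag")
            if etag:
                cache[url] = (etag, body)  # remember the validator for next time
            return body
    except urllib.error.HTTPError as exc:
        # urllib reports 304 Not Modified as an HTTPError.
        if exc.code == 304 and url in cache:
            return cache[url][1]
        raise
```

A call such as `fetch("https://pypi.org/pypi/sampleproject/json")` would then send the stored validator on repeat requests, so an unchanged resource costs only a ``304`` response instead of a full body.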


All API requests are cached. Requests to the JSON, RSS or Legacy APIs are
cached by our CDN provider. You can determine if you've hit the cache based on
the ``X-Cache`` and ``X-Cache-Hist`` headers in the response.

Member

*Hits
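
As a rough illustration of how a consumer might read those two headers (using the corrected name ``X-Cache-Hits``), here is a small sketch. It assumes the values look like ``HIT`` or ``MISS`` and may be comma-separated when more than one cache layer reports; that format and both function names are assumptions, not something this PR specifies.

```python
def was_cache_hit(headers):
    """Return True if any layer listed in X-Cache reports HIT."""
    value = headers.get("X-Cache", "")
    return any(part.strip().upper() == "HIT" for part in value.split(","))


def cache_hit_counts(headers):
    """Parse X-Cache-Hits into a list of integers, skipping malformed parts."""
    counts = []
    for part in headers.get("X-Cache-Hits", "").split(","):
        part = part.strip()
        if part.isdigit():
            counts.append(int(part))
    return counts
```

Both helpers accept any mapping with a `get` method. A plain `dict` is case-sensitive, whereas the `headers` object on a `urllib` response matches header names case-insensitively.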

Member

@ewdurbin ewdurbin left a comment

Works for me, though we may want to be a bit more explicit about our ability to prohibit a client at any layer.

We do have the ability to prohibit a client or IP at the edge, at the nginx layer, or internally via the Pyramid app, and have used this approach for mitigating accidental DoS in the past.

Member Author

di commented Mar 21, 2018

@ewdurbin Might it be preferable to be less explicit about how we can prohibit malicious clients? I think for the average user, it's enough to just know "we can drop the 🔨 on you".

Contributor

@brainwane brainwane left a comment

I like having this on the API reference index page, and this feels solid overall.

Request: add a note to the top of xml-rpc.rst saying we're working on deprecating this API and pointing to our APIs/feeds tag or similar.

* Set your consumer's ``User-Agent`` header to uniquely identify your requests
* Try not to make a lot of requests (thousands) in a short amount of time
  (minutes). Generally PyPI can handle it, but it's preferred to make requests
  in serial over a longer amount of time if possible.

Contributor

I suggest we add "consider running your own organizational mirror instead/in addition" & point to bandersnatch.

Member Author

I'm not sure that's good advice for folks who need to consume our APIs. Does bandersnatch provide all the exact same APIs as PyPI?

Generally I think "use bandersnatch" is good advice for users that either can't use pypi.org or are concerned about its availability. I'm not sure that using it could be considered a "best practice" for consuming our API.

(With the exception of using a mirror for the Simple/download API, which could be helpful in reducing load on our CDN, but I think we don't really have the assumption that this is necessary or recommended for big consumers -- we just handle the traffic.)

Contributor

This document tells API users how to be good neighbors, so I suggest we add a line advising orgs that it would be friendly for them to use internal mirrors for Simple/downloads. But my opinion is mild here.

Member Author

I found a happy medium 🙂

degradation of service.

In addition, PyPI reserves the right to temporarily or permanently prohibit a
consumer based on irresponsible activity.

Contributor

At a past org we suggested: use a descriptive User-Agent header with your contact info in it, so we can at least potentially ping you to suggest alternate approaches if you're causing a ruckus.

Member Author

Great point.
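
A minimal sketch of what the two suggestions in this thread could look like together: a ``User-Agent`` that identifies the consumer and carries contact information, and requests issued serially with a pause between them. The agent string, the one-second default delay, and the function names are all hypothetical; the PR does not prescribe a format or a specific request rate.

```python
import time
import urllib.request

# Hypothetical identity: replace with your own tool name and contact address.
USER_AGENT = "example-indexer/1.0 (+mailto:ops@example.org)"


def build_request(url, user_agent=USER_AGENT):
    """Build a request whose User-Agent identifies the consumer."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_serially(urls, delay_seconds=1.0,
                   opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between consecutive requests."""
    bodies = []
    for index, url in enumerate(urls):
        if index:
            sleep(delay_seconds)  # spread requests out instead of bursting
        with opener(build_request(url)) as response:
            bodies.append(response.read())
    return bodies
```

The `opener` and `sleep` parameters exist only so the behaviour can be exercised without network access or real waiting.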

Contributor

@brainwane brainwane left a comment

LGTM

@di di merged commit b321bc1 into pypi:master Mar 21, 2018
@di di deleted the document-api-policies branch March 21, 2018 20:50

3 participants