Remove the pip search command #5216

Closed
dstufft opened this issue Apr 13, 2018 · 146 comments
Labels
C: search 'pip search' state: awaiting PR Feature discussed, PR is needed type: deprecation Related to deprecation / removal. type: maintenance Related to Development and Maintenance Processes

Comments

@dstufft
Member

dstufft commented Apr 13, 2018

As of today (Oct 7, 2022), pip search does not return results with PyPI (pypi.org). Please use pypi.org/search instead.

Relevant context on this that GitHub hides by default:

#5216 (comment)

PyPI is shutting down the API that pip search used, because the design of that API was such that it made it easy for people to overwhelm and take down PyPI as a whole using that API. The pip search command doesn't work without that API, and we don't have a way to restore 1:1 functionality without it.

#5216 (comment)

Here's the inevitable truth. The XMLRPC based search API either needs major surgery to somehow move the load off of the PyPI backends so we can just ignore if it breaks, massive clusters need to stop hammering it, or we need to deprecate it. [snip graph + few sentences] The fact that we've managed to keep it working to this point is frankly something that should provide us an immense amount of pride in engineering and shame in foresight.

The status.python.org incident that (eventually) led to the XMLRPC search (which powers pip search) being disabled on pypi.org


(original description below)

Currently pip allows you to search a repository by running pip search, which will then print out a bunch of packages that match, see for example:

$ pip search requests
negotiator-3k (1.0.0)                                                  - Proper Content Negotiation for Python
    
    The Negotiator is a library for decision making over Content Negotiation requests.
    It takes the standard HTTP Accept headers (Accept, Accept-Language, Accept-Charset,
    Accept-Encoding) and rationalises them against the parameters acceptable by the
    server; it then makes a recommendation as to the appropriate response format.
    
    This version of the Negotiator also supports the SWORDv2 extensions to HTTP Accept
    in the form of Accept-Packaging.
odoo10-addon-sql-request-abstract (10.0.1.0.0.99.dev1)                 - Abstract Model to manage SQL Requests
odoo9-addon-sql-request-abstract (9.0.1.0.0.99.dev6)                   - Abstract Model to manage SQL Requests
odoo8-addon-sql-request-abstract (8.0.1.0.0.99.dev7)                   - Abstract Model to manage SQL Requests
zenodo-accessrequests (1.0.0a2)                                        - Zenodo module for providing access request feature.
requests-wsgi-adapter (0.4.0)                                          - WSGI Transport Adapter for Requests
requests-celery-adapters (2.0.9)                                       - Requests lib adapters to send Celery messages (tasks)
odoo9-addon-hr-holiday-notify-employee-manager (9.0.1.0.0.99.dev1)     - Notify employee's manager by mail on Leave Requests creation.
odoo9-addon-purchase-request-operating-unit (9.0.1.0.0)                - Operating Unit in Purchase Requests
odoo10-addon-hr-holidays-notify-employee-manager (10.0.1.0.0.99.dev4)  - Notify employee's manager by mail on Leave Requests creation.
odoo9-addon-sql-export (9.0.1.0.0.99.dev12)                            - Export data in csv file with SQL requests
odoo8-addon-sql-export (8.0.1.0.0.99.dev9)                             - Export data in csv file with SQL requests

...

The output here goes on for ~900 lines and the results are just complete trash. This is better on Warehouse:

$ pip search --index https://pypi.org/pypi requests
requests (2.18.4)                         - Python HTTP for Humans.
  INSTALLED: 2.18.4 (latest)
aiohttp-requests (0.1.0)                  - A thin wrapper for aiohttp client with Requests simplicity
anonymous-requests (0.2)                  - 
apiclient-requests (0.1.2)                - A simple python base package for building good api clients on
careful-requests (0.1.4)                  - Requests for header-sensitive servers (like Accept-Encoding)
crawl-requests (2.2.8)                    - crawl_requests(like requests) can update ua and proxy automatically.
gcloud-requests (1.1.9)                   - Thread-safe client functionality for gcloud-python via requests.
jsonapi-requests (0.6.0)                  - Python client implementation for json api. http://jsonapi.org/
jsonrpc-requests (0.4.0)                  - A JSON-RPC client library, backed by requests
nav-requests (1.1.4)                      - Renamed to `nav`
parse-requests (1.0.7)                    - parse-rest-python - A fast and simple Python library to interact with Parse.com REST API
play-requests (0.0.3)                     - pytest-play plugin driving the famous Python requests library for making HTTP calls
PyGithub-requests (1.26.0)                - Use the full Github API v3
Randomized-Requests (1.0.2)               - Python package that makes post and get request with random proxy and user agent
requests-aeaweb (0.0.1)                   - Requests wrapper to log onto AEAweb.org.
requests-aliyun (0.3.1)                   - authentication for aliyun service
requests-auth (1.0.2)                     - Easy Authentication for Requests
requests-aws (0.1.8)                      - AWS authentication for Amazon S3 for the python requests module
requests-aws4auth (0.9)                   - AWS4 authentication for Requests
requests-bce (0.0.5)                      - authentication for bce service

...

Which gives us ~111 lines of output, and which actually returns some meaningful output.

I believe that this command is a fairly regular source of confusion for users, primarily because it uses a different source of truth than pip install does, which means they need to configure the location to search at differently than they need to configure the location to install from (and the search API is not standardized and, to the best of my knowledge, very few alternative implementations support it).

There has been a long standing idea of switching search to use the PackageFinder() class to try and resolve these issues, but I don't think that is going to work reasonably either. The problem is that while that would reconcile the differences between the two commands, PEP 503 doesn't provide any mechanism to pass information like the summary that we print alongside each result above. Speaking with my PyPI hat on, I would be very opposed to adding such information to the PEP 503 repository API, because it would bloat the responses and have them take up more bandwidth for a very minor edge case. The other problem is that the PackageFinder() API itself doesn't fall back to /simple/ anymore. That's resolvable, but the larger issue is that /simple/ is 7MB as of right now, and that is likely to continue to grow; having pip search issue a 7MB HTTP request is a pretty crummy experience.

So that leaves us in a bit of a sticky situation. The current implementation is confusing and practically speaking only searches PyPI and not anywhere else, but our best path forward for resolving that is a non-starter due to other concerns.

So I think we should just rip the bandaid off and deprecate and eventually remove the pip search command. The only other alternative I can really think of that would actually resolve this, is to switch to using /simple/, but that would then mean getting hit with a 7MB download just to try and search.

Thoughts @pypa/pip-committers?

@pfmoore
Member

pfmoore commented Apr 13, 2018

I've rarely used pip search, and never really got useful data from it. But then again, I've never got much use out of PyPI's search facilities anyway.

I'd like to see a good search facility on PyPI, and if there were one, I'd want to be able to use it on the command line as pip search. I'd be happy for pip search to only work on indexes that support a (yet to be defined in a PEP 😄) standard search API, assuming that PyPI is one such index.

So I guess I'm -1 on removing pip search, but happy to acknowledge that it's currently useless - possibly even to the extent of having it simply report "Search is currently disabled because there is no search API defined for package indexes - please use the index search page directly". Long term I do think it's worth having though.

(Another option is that we could develop a plugin API and delegate producing a good pip search to 3rd party contributors. But that's a whole other debate ;-))

@dstufft
Member Author

dstufft commented Apr 13, 2018

@pfmoore Is there a functional difference between keeping the search command and deprecating/removing it now, then adding it back at a later point if a standardized search API gets designed and implemented? Having the command there but always returning an error seems the worst of all possible outcomes.

@pradyunsg pradyunsg added the type: maintenance Related to Development and Maintenance Processes label Apr 13, 2018
@pfmoore
Member

pfmoore commented Apr 13, 2018

Two, in my view. Both minor.

  1. It signals our intent to have a search command, not simply to dump it. (Assuming that is our intent...)
  2. If there are people using it, even in its current bad state, leaving it alone avoids harming them, and costs us little. Obviously going as far as a "Currently disabled" message removes this benefit (so this one only applies if we stop at the "acknowledge it's useless" level 😄).

But honestly, I don't care enough to fight for its retention. I do think it deserves a deprecation cycle, but you included that in your proposal anyway, so that's fine.

@cjerdonek
Member

I tried a couple examples (pip search websockets and pip search trio), and the results seem reasonable / useful.

For the "requests" example, it looks like the reason it's picking up ~900 instead of ~100 is that the first implementation is also searching the summary string instead of just the project name, and "requests" is a common English word. For example, it picks up this:

odoo8-addon-sql-export (8.0.1.0.0.99.dev9)         - Export data in csv file with SQL requests
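
To illustrate the difference described above, here is a minimal sketch that compares matching on the project name only with matching on the name or the summary. The `Project` type, the two helper functions, and the `SAMPLE` list are hypothetical and exist only for this illustration; they are not how either index implements search.

```python
from typing import NamedTuple


class Project(NamedTuple):
    name: str
    summary: str


def search_name_only(projects, query):
    """Return projects whose name contains the query (case-insensitive)."""
    q = query.lower()
    return [p for p in projects if q in p.name.lower()]


def search_name_or_summary(projects, query):
    """Return projects whose name OR summary contains the query."""
    q = query.lower()
    return [p for p in projects if q in p.name.lower() or q in p.summary.lower()]


# Hypothetical sample data, loosely based on the listings quoted in this thread.
SAMPLE = [
    Project("requests", "Python HTTP for Humans."),
    Project("requests-auth", "Easy Authentication for Requests"),
    Project("odoo8-addon-sql-export", "Export data in csv file with SQL requests"),
    Project("negotiator-3k", "Proper Content Negotiation for Python"),
]

if __name__ == "__main__":
    print([p.name for p in search_name_only(SAMPLE, "requests")])
    print([p.name for p in search_name_or_summary(SAMPLE, "requests")])
```

With a common English word such as "requests", the summary match pulls in projects like `odoo8-addon-sql-export` whose name has nothing to do with the query.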

@pradyunsg
Member

Do we expect the XML-RPC API to be removed before we manage to make a search API?

If that's the case, we should go ahead and pull out the band aid. Otherwise, what Paul suggests (making it loudly known that we expect pip search to "lie" and be not-so-useful) is also fine.

@pradyunsg pradyunsg added the C: search 'pip search' label Apr 15, 2018
@pradyunsg
Member

Any interest in this? I don't mind deprecating it for removal.

@xnuinside

If pip search is deprecated, which tool should we use to search for packages on PyPI (not in warehouse, in public)? I ask because we have several internal cases where we use "pip search". Thanks in advance for the answer!

@di
Sponsor Member

di commented Sep 5, 2018

@xnuinside I'm not sure I understand your question, specifically the "(not in warehouse, in public)" part, can you elaborate?

The pip search command uses PyPI's XML-RPC API that is available for anyone to use: https://warehouse.readthedocs.io/api-reference/xml-rpc/
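
For readers who want to see what such a call looks like, here is a minimal sketch using the standard library's `xmlrpc.client`. It assumes the index exposes a `search(spec, operator)` method and returns hits as dicts with `name`, `version`, and `summary` keys; that shape is an assumption based on how the PyPI XML-RPC search has been described, and the helper names are hypothetical. As noted at the top of this issue, pypi.org now answers this call with a fault, so the sketch is only useful against an index that still implements the method.

```python
import xmlrpc.client


def build_search_spec(query):
    """Ask the index to match the query against both name and summary."""
    return {"name": query, "summary": query}


def describe_fault(fault):
    """Render an XML-RPC fault in the same style as the errors quoted in this thread."""
    return f"XMLRPC request failed [code: {fault.faultCode}]\n{fault.faultString}"


def format_hit(hit):
    """Format one search hit; the keys used here are assumed, not guaranteed."""
    return f"{hit.get('name')} ({hit.get('version')}) - {hit.get('summary')}"


def xmlrpc_search(index_url, query):
    """Call the (assumed) `search` method and return (hits, error_message)."""
    client = xmlrpc.client.ServerProxy(index_url)
    try:
        hits = client.search(build_search_spec(query), "or")
    except xmlrpc.client.Fault as fault:
        return [], describe_fault(fault)
    return hits, None


if __name__ == "__main__":
    # The index URL is an example; substitute an index that supports XML-RPC search.
    hits, error = xmlrpc_search("https://pypi.org/pypi", "requests")
    print(error if error else "\n".join(format_hit(h) for h in hits))
```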

@xnuinside

xnuinside commented Sep 5, 2018

@di, Dustin, PyPI's API is what I need) thx

@pradyunsg
Member

No one is opposed to actually deprecating the pip search command and, as far as I can tell, there's no reason to not do this soon. I've gone ahead and added this to the pip 20.2 release, since we've missed the window for pip 20.1.

@di
Sponsor Member

di commented Apr 22, 2020

FWIW, a while back I asked what people used pip search for and got some responses: https://twitter.com/di_codes/status/1131243583078588418

@pradyunsg
Member

pradyunsg commented Apr 22, 2020

@di Thanks, I was aware of that and if that's representative of our userbase (it's not, but let's roll with the assumption), Ee's response looks like the main reason that endpoint is hit so often.

Nonetheless, deprecation is probably the best tool we have to surface any concerns users might have with removal/replacement of functionality. That should tell us if pip search is the reason for PyPI's XML-RPC search endpoint numbers. :)

@chrahunt
Member

chrahunt commented Jul 8, 2020

Just confirming, the intent is to add the deprecation warning by 20.2? The way we're talking it sounds like we might drop it in 20.2, but there's no deprecation warning being emitted to warn people!

@pradyunsg
Member

The plan is definitely to deprecate in 20.2.

@pradyunsg
Member

Quoting @nlhkabu's excellent suggestion from Zulip:

I don't think it is a good look to mark something as "for deprecation" and then backtrack on that decision. Maybe instead we can add a warning asking for user feedback. E.g.
"The pip team is considering refactoring or removing this command. Please let us know how you use it here: url"

I think I'm going to go ahead and file a new issue, and use that as a URL for this message. If anyone has inputs, please let me know. I'd really like to slip this into 20.2 if possible.

@nlhkabu nlhkabu added the UX User experience related label Jul 14, 2020
@domdfcoding
Contributor

I might have missed it, but was there any follow up to the survey on how people use the search command?

@bitplane

bitplane commented Sep 21, 2022

Why not just download a list of packages from an URL and grep it locally? We expect a package repository to have a search command, the default one is broken, so just y'know, make it search locally like apt does.

Removing it or keeping it with error messages seems like an admission of failure. Downloading a cacheable, zipped list and grepping it isn't much of a technical challenge. Is the problem political?

Edit:

pypi/warehouse#12242

@domdfcoding
Contributor

There was discussion further up that pip search doesn't just search the package names – it also searches the summary description, and possibly other metadata.

As for client-side searches of metadata, I made https://github.com/domdfcoding/pypi_search/ last week which is periodically updated with package names and summaries.

But if all you want to search are package names you can download the file at https://pypi.org/simple/ and parse it – even easier with the JSON version of that page now.
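
A minimal sketch of that name-only approach, assuming the index serves the JSON form of the simple API when asked for `application/vnd.pypi.simple.v1+json` and that the response carries a `projects` list of `{"name": ...}` objects. The function names are hypothetical. Note that this downloads the full project list on every call, so a real tool would want to cache it.

```python
import json
import urllib.request

# Media type for the JSON form of the simple index (assumed to be supported by the index).
SIMPLE_JSON = "application/vnd.pypi.simple.v1+json"


def project_names(index):
    """Extract project names from a parsed JSON simple-index document."""
    return [project["name"] for project in index["projects"]]


def grep_names(names, query):
    """Case-insensitive substring match over project names, sorted."""
    q = query.lower()
    return sorted(name for name in names if q in name.lower())


def fetch_project_names(index_url="https://pypi.org/simple/"):
    """Download the project list once; this is a multi-megabyte response."""
    request = urllib.request.Request(index_url, headers={"Accept": SIMPLE_JSON})
    with urllib.request.urlopen(request) as response:
        return project_names(json.load(response))


if __name__ == "__main__":
    for name in grep_names(fetch_project_names(), "requests")[:20]:
        print(name)
```

This only searches names. Summaries are not part of the simple index, which is the limitation raised in the original description.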

@eabase

eabase commented Oct 6, 2022

@bitplane

Why not just download a list of packages from an URL and grep it locally?

I don't think you read the entire thread. That's exactly what pip-search, as part of the pip-date package, does.

From this comment.

@pradyunsg
Member

pradyunsg commented Oct 7, 2022

I might have missed it, but was there any follow up to the survey on how people use the search command?

No, that hasn't happened yet.

I was wrong -- https://docs.google.com/forms/d/1-4kiVV8NnlkBrCr6x7eb8SobnS1RVRAR2xEl8iUeu24/viewform was a survey we did. One of the maintainers will need to check what the results of that were.

rarylson added a commit to rarylson/update-conf.py that referenced this issue Oct 21, 2022
Fixes are:

- Update badges in `README.md`
- Fix `pip search` as it's deprecated
    - See: pypa/pip#5216
@Ledenel

Ledenel commented Nov 12, 2022

I think PyPI updating the deprecation message to be clearer will help reduce user confusion. Currently, we're getting:

❯ pip search pip
ERROR: XMLRPC request failed [code: -32500]
RuntimeError: PyPI's XMLRPC API is currently disabled due to unmanageable load and will be deprecated in the near future. See https://status.python.org/ for more information.

Replacing that with something like the following will help:

❯ pip search pip
ERROR: XMLRPC request failed [code: -32500]
RuntimeError: PyPI does not support 'pip search' anymore. Please use the pypi.org search via a browser instead.

I'll see if I can make a PR to Warehouse for this.

I'm unsure how I feel about removing the command entirely at this point -- I've seen that it works with non-PyPI package indexes that implement the API (devpi and Artifactory, for example) which don't really have other "easy to use" package search APIs today.

Maybe we can switch to an alternative implementation for pip search like shubhodeep9/pipsearch#1 ?

@pradyunsg
Member

pradyunsg commented Nov 12, 2022

FWIW, I'm opposed to trying to parse HTML out of pypi.org/search results -- that's inherently fragile and extremely coupled to what PyPI implements today (making that exceedingly difficult for alternative indexes to implement or for PyPI to update its HTML markup used to present that data).

If PyPI added an alternative search API that had better cacheability, providing better DoS protection (which has been requested already), we could switch over pip search to use that if XMLRPC doesn't work (or vice-versa).

@diimpp

diimpp commented Nov 30, 2022

Why have I just spent 10 minutes reading drama, while I just wanted to pip search? :D What's the point of shipping deprecated command with cryptic message?

@notatallshaw
Contributor

Why have I just spent 10 minutes reading drama, while I just wanted to pip search? :D What's the point of shipping deprecated command with cryptic message?

Because the command isn't deprecated; the API on PyPI is, though. You can point it to a private repo that supports XML-RPC search and it works fine.

@pradyunsg
Member

pradyunsg commented Nov 30, 2022

What's the point of shipping deprecated command with cryptic message?

❯ pip search foo
ERROR: XMLRPC request failed [code: -32500]
RuntimeError: PyPI no longer supports 'pip search' (or XML-RPC search). Please use https://pypi.org/search (via a browser) instead. See https://warehouse.pypa.io/api-reference/xml-rpc.html#deprecated-methods for more information.

It's not exactly cryptic.

@ghost

ghost commented Feb 4, 2023

As of 2023, the documentation for the command does not specify that it uses the XML-RPC API and may or may not be deprecated. It gives minimal information, with an example that clearly does not work and eventually leads you to this page.

So, after years of discussion, is there a decision? :)

@paranic

paranic commented Feb 10, 2023

why don't you just release a pip version with a workaround? this is so bad for the reputation of python in general.

@pradyunsg
Member

So, after years of discussion, is there a decision? :)

No, not yet. We’ve not changed anything in pip’s core implementation other than improving the error messaging since pypi.org unconditionally fails now. Using it with alternative indexes still works, however.

why don't you just […]

If this were just that simple, we’d have done it by now. It’s not.

@pradyunsg
Member

pradyunsg commented Feb 11, 2023

Ok, I’m gonna go ahead and say that…

We’re not going to be changing the pip search command until there’s a new search API designed for PyPI. Until then, we’re going to leave the pip search implementation largely untouched on pip’s end since there’s little cost to having that logic in pip.

With that rationale, I’m gonna close this out as “no, let’s keep this even if it’s non-functional with pypi.org today”.

update: I assume people aren't reading the conversation above -- I've seen that the XML-RPC API works with non-PyPI package indexes that implement it (devpi and Artifactory, for example) which don't really have other "easy to use" package search APIs today outside of this.

@DiddiLeija
Member

Thanks for tracking this @pradyunsg! I agree with closing this, since there's nothing we can do by now.

@paranic

paranic commented Feb 11, 2023

How about contacting the PyPI maintainers?

@merwok

merwok commented Feb 11, 2023

You should read the original message here, and the three first links in it.
This is not easy to fix given current constraints. Refusing to read the explanations and adding unhelpful messages is just rude.

@pypa pypa locked as resolved and limited conversation to collaborators Feb 11, 2023
@pradyunsg
Member

pradyunsg commented Feb 11, 2023

Locking this since this topic has run its course and I don’t expect to see any new bits of information coming out of the sort of discussion that’s been taking place here over the last few months.
