Is there any desire for a "pretty print" output of "pip list"? #3651

Closed
dougthor42 opened this issue May 4, 2016 · 10 comments

@dougthor42
Contributor

dougthor42 commented May 4, 2016

  • Pip version: 8.1.1
  • Python version: Python 3.4.3 x64
  • Operating System: Windows 7

Description:

Is there any desire for a "pretty print" output of pip list?

I'm willing and able to do the work for this, but I'd rather not bother if it's not something people are interested in.

What I've run:

N/A

Comments:

When running pip list, the output looks like so:

$ pip list
Sphinx (1.2)
spyder (2.3.8)
SQLAlchemy (0.9.0)
statsmodels (0.5.0)
stdlib-list (0.2.1)

$ pip list -o
Sphinx (1.2) - Latest: 1.4.1 [wheel]
spyder (2.3.8) - Latest: 2.3.9 [sdist]
SQLAlchemy (0.9.0) - Latest: 1.0.12 [sdist]
statsmodels (0.5.0) - Latest: 0.6.1 [sdist]

I thought that it might be nice to have the option to arrange the items into columns:

$ pip list --pretty
Package           Current
Sphinx            1.2
spyder            2.3.8
SQLAlchemy        0.9.0
statsmodels       0.5.0
stdlib-list       0.2.1

$ pip list -o --pretty
Package           Current      Latest     Type
Sphinx            1.2          1.4.1      [wheel]
spyder            2.3.8        2.3.9      [sdist]
SQLAlchemy        0.9.0        1.0.12     [sdist]
statsmodels       0.5.0        0.6.1      [sdist]
stdlib-list       0.2.1        0.3.3      [wheel]

Implementation

In order to implement this, I would modify list.py, specifically the output_package and run_outdated methods of ListCommand. The modifications would be to add format specifiers to the return values (tentative):

  • output_package: '%-40s %-15s %7s'
  • run_outdated: '%-35s %-15s %s'

I would of course add the --pretty option to ListCommand.__init__() and update the documentation to reflect the change. To maintain backwards compatibility, the --pretty option would default to false, so there would be no change in behavior for default commands.
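
As a rough illustration of the fixed-width approach (a sketch with made-up sample rows, not pip's actual list.py code), the tentative run_outdated format string would behave like this:

```python
# Tentative format string quoted in the post; the sample rows are hypothetical.
ROW_FORMAT = "%-35s %-15s %s"

packages = [
    ("Sphinx", "1.2", "1.4.1"),
    ("spyder", "2.3.8", "2.3.9"),
]


def format_rows(rows, fmt=ROW_FORMAT):
    """Left-justify each field in a fixed-width column using %-style formatting."""
    return [fmt % row for row in rows]


for line in format_rows([("Package", "Current", "Latest")] + packages):
    print(line)
```

Note that a width such as %-35s is a minimum, not a limit: shorter values are padded, while longer values are printed in full and push the following columns to the right.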

Potential issues

  • Given that I'll continue to use %-style formatting, and that the option will default to off, I don't foresee any major issues.
  • There may be some truncation of package names, versions, or paths.
    • It's hard to guess how long a package name or version will be, so I plan on maxing out the terminal width (79 chars) and giving more space for the version (as I think that's more important than the name).

My original plan was to use .format(), but that would have resulted in the following issues:

  • Adding this would be backwards-incompatible with Python versions prior to 2.6, as .format() was not introduced until then.
  • The .format function is slightly slower than %-style formatting. It also does not have the nifty logging lazy-eval that %-formatting does.
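
The lazy evaluation mentioned in the last point can be seen in a small sketch. The Expensive class and the logger name are hypothetical and exist only to count how often the argument is converted to a string:

```python
import logging


class Expensive:
    """Hypothetical object that counts how often it is rendered as text."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.WARNING)  # DEBUG messages are disabled
logger.addHandler(logging.NullHandler())

lazy = Expensive()
# %-style arguments are only interpolated if the record is actually emitted.
logger.debug("value: %s", lazy)

eager = Expensive()
# .format() runs before logger.debug() is even called.
logger.debug("value: {}".format(eager))

print(lazy.calls, eager.calls)
```

With DEBUG disabled, the %-style call never formats its argument, while the .format() call always does.
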
@pfmoore
Member

pfmoore commented May 4, 2016

Some comments:

  1. The output without --pretty should remain as it is, because scripts probably rely on it (your first example with headings used the command pip list, I suspect you meant pip list --pretty).
  2. I'm not massively in favour of the headings, as it makes it harder to process the output with a script (and the new column format is easier than the current format to process with a script). But they are nice for a "human readable" layout.
  3. There's no need for [...] round the type in the column format.
  4. I'm 100% against truncating anything. Ideally, you should have your format adapt to the longest entry, but if you don't want to go to that much effort you should at least let too-long entries fully display at the cost of misaligning the columns on that line. Also, there's no need to take the terminal width into account - just make the lines as long as they need to be and they'll wrap if the terminal is too narrow - the same "never lose information" principle applies here as with column widths.
  5. If you don't do adaptive column widths, you'll be likely to have an ugly display as you will either leave too much space for potentially long values, or be prone to long values breaking the column layout.

As for the option name, maybe add --columns and --no-headers options. The first selects the columnar layout, and the second suppresses column headers (for script use).

BTW, a simple adaptive tabulation function isn't that hard:

def tabulate(vals):
    assert len(vals) > 0

    sizes = [0] * len(vals[0])
    for row in vals:
        sizes = [max(s, len(str(c))) for s, c in zip(sizes, row)]

    result = []
    for row in vals:
        display = " ".join([str(c).ljust(s) for s, c in zip(sizes, row)])
        result.append(display)

    return result

@dougthor42
Contributor Author

dougthor42 commented May 4, 2016

  1. Yes, that was a typo. Any commands without --pretty (or --columns, --no-headers) will remain 100% the same. I've fixed the original post to reflect this.
  2. Agreed.
  3. Also agreed. If you look at the format specifier, I've actually left them out. I just forgot to remove them when making the example.
  4. I agree that truncating version strings is 100% not acceptable. I'm a little less concerned about truncating the package name, since there's likely not much useful information at the very end of it. That said, if people are using --pretty for parsing, it's an issue (why would they do such a thing??). But you want no truncation, so you'll have no truncation 😃.
    • Adaptive Width has pros and cons:
      • Pro: looks better. Much better
      • Major Pro: no truncation
      • Con: Much harder to parse. But again, why use --pretty for parsing?
      • Minor Con: The current method of displaying items uses a single loop. With adaptive width, it would have to be two loops - one for data pull, the 2nd for formatting. Not a big deal since we're not talking about lists with millions, or even thousands, of elements, but still technically a regression in terms of speed/efficiency.
  5. I was planning on going through PyPI and analyzing all of the version strings and package names for length and setting the widths appropriately. But since I'm just going to add adaptive width, I guess I don't need to do that anymore.

So I'll go ahead and start working on it with the following changes:

  • Change options to --columns and --no-header
  • use adaptive width

Question:
How do you feel about having a --fixed-width option with the note that it could result in data loss? This could be useful for those who want to parse the output of --columns.

@pfmoore
Member

pfmoore commented May 4, 2016

Personally, I see the adaptive width version as just as easy to parse (in Python, use split(), in shell use awk). And much easier to parse than the current output, to the extent that if it weren't for backward compatibility, I'd personally prefer the tabular output as the default.

With a --fixed-width option we end up with 3 options controlling the "pretty" output, which seems excessive.

@dougthor42
Contributor Author

Now that I think about it more, you're right: column formatting will actually be easier to parse via split() because the only spaces that exist are ones that separate the columns (except for ones in file paths...).

And yeah, 3 options just for pretty-printing is quite a lot. I'll leave out --fixed-width.
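
A minimal sketch of the split()-based parsing discussed above, assuming a one-line header and a location column that comes last. The function name parse_columns and the sample lines are hypothetical:

```python
def parse_columns(lines, header_lines=1):
    """Parse columnar output into (name, version, location) tuples."""
    parsed = []
    for line in lines[header_lines:]:
        if not line.strip():
            continue  # ignore blank lines
        # maxsplit=2 keeps spaces inside the trailing location column intact.
        fields = line.split(None, 2)
        name, version = fields[0], fields[1]
        location = fields[2].rstrip() if len(fields) > 2 else ""
        parsed.append((name, version, location))
    return parsed


# Hypothetical sample output, including a path that contains spaces.
sample = [
    "Package    Version Location",
    "jedi       0.9.0",
    "tqdm       4.5.0   c:\\temp\\path with space\\tqdm",
]

for row in parse_columns(sample):
    print(row)
```

Limiting the number of splits only helps because the path is the final column; a space-containing value in an earlier column would still need a different approach.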

@dstufft
Member

dstufft commented May 5, 2016

I think the proposed format is superior to the old format in basically every way, except that the old format currently exists, so I'd like to push back on @pfmoore's first point for a minute. About the only time I can imagine the old format being preferable is if there is already a script parsing that output. So what if, instead of just leaving the old format as the default, we phased in the new format and deprecated the old?

So we'd first add --columns and --no-columns, defaulting to --no-columns. Then, as we move along the deprecation path, we'd switch the default to --columns, while scripts could still opt out of the new format with --no-columns for the time being. Further along the deprecation path, we'd eventually remove --no-columns and expect everyone to switch to the new format.
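
The paired flags could be wired up roughly as follows. This is only a sketch: it uses argparse from the standard library as a stand-in for pip's own option handling, and build_parser, the prog name, and columns_default are hypothetical names introduced for illustration:

```python
import argparse


def build_parser(columns_default=False):
    """Build a parser where --columns and --no-columns share one destination."""
    parser = argparse.ArgumentParser(prog="list-demo")
    parser.add_argument("--columns", dest="columns", action="store_true",
                        help="use the columnar layout")
    parser.add_argument("--no-columns", dest="columns", action="store_false",
                        help="use the legacy layout")
    # Flipping this single default is what switching phases would amount to.
    parser.set_defaults(columns=columns_default)
    return parser


phase_one = build_parser(columns_default=False)  # legacy format is the default
phase_two = build_parser(columns_default=True)   # columns become the default

print(phase_one.parse_args([]).columns)
print(phase_two.parse_args(["--no-columns"]).columns)
```

Because both flags write to the same destination, moving from one phase to the next only changes the default, and existing invocations that pass either flag explicitly keep working.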

@dstufft
Member

dstufft commented May 5, 2016

Another thing I wonder: do we really need the --no-header flag? It's trivial in Python (and, I imagine, any programming language) to skip the first line of output from a subprocess call, and in your typical *nix environment you can do it by just piping the output through another command like tail (e.g. pip list --columns | tail -n +2). I'm not familiar with Windows, but it appears you can do the same thing in PowerShell with something like pip list | select -Skip 1.
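
In Python, dropping the header from captured output could look like the sketch below. The helper name skip_header and the sample text are hypothetical; in real use the text would come from a subprocess call rather than a literal:

```python
def skip_header(output, header_lines=1):
    """Return the data lines of `output` without the first `header_lines` lines."""
    return output.splitlines()[header_lines:]


# Hypothetical captured output standing in for the result of a subprocess call.
captured = "Package Version\njedi    0.9.0\ntox     2.3.1\n"

for line in skip_header(captured):
    print(line)
```

The header_lines parameter covers the case where the header spans more than one line.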

@dougthor42
Contributor Author

Well I'm happy to whip up whatever gets decided.

Deprecation
I have no issue with deprecating the current formatting, as long as someone else manages it 😅. My initial thought is to never remove the --no-columns option, but I don't really have any reason for thinking that.

Do you have any proposal for a timeline?

--no-header Flag
That's a valid point. Currently the header contains two lines rather than one, but the trivialness of removing them is the same.

Removing this option will also simplify the code a little bit.

Here's an example of how it currently looks:

C:\> pip list --columns
Package        Version      Location
-------------- ------------ -----------------------------------
jedi           0.9.0
pluggy         0.3.1
prompt-toolkit 0.60
tox            2.3.1
tqdm           4.5.0        c:\gitlab\temp\path with space\tqdm
virtualenv     15.0.1
winpython      1.5.20160402
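
For illustration, a layout like the one above, with a dashed rule under the header, could be produced along these lines. This sketch extends the adaptive-width idea from earlier in the thread; the function name tabulate_with_header and the sample rows are made up here, every row is assumed to have as many cells as the header, and this is not the code from the pull request:

```python
def tabulate_with_header(header, rows):
    """Pad each column to its widest cell and add a dashed rule under the header."""
    table = [tuple(header)] + [tuple(row) for row in rows]
    # Width of each column is the length of its longest cell, header included.
    sizes = [max(len(str(cell)) for cell in column) for column in zip(*table)]

    def fmt(row):
        padded = " ".join(str(cell).ljust(size) for cell, size in zip(row, sizes))
        return padded.rstrip()  # no trailing spaces when the last cell is empty

    rule = " ".join("-" * size for size in sizes)
    return [fmt(header), rule] + [fmt(row) for row in rows]


rows = [
    ("jedi", "0.9.0", ""),
    ("tqdm", "4.5.0", "c:\\temp\\tqdm"),
]
for line in tabulate_with_header(("Package", "Version", "Location"), rows):
    print(line)
```

Nothing is truncated: a longer cell simply widens its whole column.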

@pfmoore
Member

pfmoore commented May 5, 2016

I agree with pretty much everything @dstufft said. The --no-headers idea was mine, and on reflection you're right it's pretty easy to skip the header if needed. I like the idea of deprecating the current format, just wasn't sure how feasible it would be - but yeah, let's go for it.

@dougthor42 That example looks fantastic.

@dstufft
Member

dstufft commented May 5, 2016

@dougthor42 We have a deprecation process and utilities already, you'll just need to do something like:

import warnings
from pip.utils.deprecation import RemovedInPip10Warning

if options.no_columns:
    # warnings.warn() takes the message first and the warning category second.
    warnings.warn(
        "Some message about how this is going to be removed",
        RemovedInPip10Warning,
    )

From there we'll manage walking it along the deprecation process.
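
The same pattern can be tried outside of pip with a stand-in warning class. In this sketch, LegacyFormatWarning and warn_if_legacy are hypothetical names, and the message text is only an example:

```python
import warnings


class LegacyFormatWarning(DeprecationWarning):
    """Hypothetical stand-in for pip's own deprecation warning classes."""


def warn_if_legacy(no_columns):
    """Emit a deprecation warning when the legacy format is requested."""
    if no_columns:
        # Message first, category second; stacklevel=2 points at the caller.
        warnings.warn(
            "The legacy list format is deprecated; use the columnar format.",
            LegacyFormatWarning,
            stacklevel=2,
        )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    warn_if_legacy(True)
    warn_if_legacy(False)

print(len(caught), caught[0].category.__name__)
```

Only the call with no_columns=True records a warning.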

@dougthor42
Contributor Author

Just FYI, PR #3654 is ready for review.

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 4, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jun 4, 2019