Better replicate ls output #284

Closed
bfirsh opened this issue Nov 8, 2020 · 3 comments
Labels
type/roadmap High-level goals. https://github.com/replicate/replicate/projects/1

@bfirsh
Member

bfirsh commented Nov 8, 2020

Why

The usual 2 column terminal table does not work so well for experiment data, because it quickly becomes too wide and doesn't fit.

Specific issues:

  • It gets very wide
  • There's lots of blank horizontal space if you add new params
  • It's not obvious at a glance what has changed in an experiment

How

Some ideas:

  • Make the output responsive, so if you're on a narrower terminal it displays less information, and on a wider terminal it displays more information.
  • Add more dimensions to give us more freedom to design a better interface, so each line doesn't necessarily correspond to one experiment
  • Add formatting, which tabwriter does not let us do (see "Make tabwriter work with formatting", #67).
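
A minimal sketch of the "responsive" idea above, in Python for brevity rather than as the project's actual implementation. The column names are taken from the strawman below; the widths, the priority order, and the helper names `columns_for_width` and `current_columns` are assumptions made for illustration.

```python
import shutil

# Hypothetical columns as (name, width), ordered from most to least important.
COLUMNS = [
    ("EXPERIMENT", 12),
    ("STARTED", 12),
    ("STATUS", 9),
    ("PARAMS", 23),
    ("BEST CHECKPOINT", 20),
    ("LATEST CHECKPOINT", 20),
]


def columns_for_width(width, columns=COLUMNS):
    """Return the names of the leading columns that fit within `width` characters."""
    chosen, used = [], 0
    for name, col_width in columns:
        if used + col_width > width:
            break
        chosen.append(name)
        used += col_width
    return chosen


def current_columns():
    # shutil.get_terminal_size uses the fallback when the size cannot be detected.
    return columns_for_width(shutil.get_terminal_size(fallback=(80, 24)).columns)
```

With these assumed widths, a 40-character terminal would show only the first three columns, while anything 96 characters or wider would show all six.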

Strawman

EXPERIMENT  STARTED     STATUS   PARAMS                 BEST CHECKPOINT     LATEST CHECKPOINT
1f04f97     2020-11-04  stopped  learning_rate=0.001    6059d7b (step 17)   83cceda (step 20)
                                 num_epochs=20          accuracy=0.7366     accuracy=0.7336
                                                      
b1de1ab     2020-11-04  stopped  learning_rate=0.001    db3b439 (step 10)   ce1358f (step 20)
                                 num_epochs=20          accuracy=0.843      accuracy=0.7492

12c306a     2020-11-04  stopped  learning_rate=0.001    601ae0b (step 75)   ea78914 (step 85)   
                                 num_epochs=150         accuracy=0.8153     accuracy=0.8063

3432557     2020-11-07  stopped  learning_rate=0.001    30b5116 (step 134)  2b003cc (step 150)  
                                 num_epochs=150         accuracy=0.8396     accuracy=0.839  
                                 discrimitive_rate=0.5  

20ae2b5     2020-11-08  stopped  learning_rate=0.001    29c95b1 (step 101)  7f7a4d0 (step 150)  
                                 num_epochs=150         accuracy=0.8523     accuracy=0.8425
                                 widen_factor=10
                                 discrimitive_rate=1

e6b67aa     2020-11-08  stopped  learning_rate=0.01     fae24ed (step 5)    fae24ed (step 5)
                                 num_epochs=150         accuracy=0.5643     accuracy=0.5643
                                 widen_factor=2
                                 discrimitive_rate=1

c413e7c     2020-11-10  running  learning_rate=0.01     c19ab4e (step 112)  c19ab4e (step 112)
                                 num_epochs=150         accuracy=0.8734     accuracy=0.8734
                                 widen_factor=2
                                 discrimitive_rate=1

Notes:

  • A pure tabular view does not scale well as you add params, so params and metrics are each stacked in a single column, which stops the output getting too wide. The downside is that a pure tabular view is easier to scan. If you want that, export to CSV (or wait for a GUI!).
  • Params are only displayed if non-blank, which solves the problem of lots of blank space when you add more params.
  • Params and metrics would be truncated to fit into the terminal width.
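
The layout described in these notes could be sketched roughly as follows. This is an illustrative Python sketch, not the code behind `replicate ls`; the function names, the column widths, and the use of `None` to mean "blank param" are all assumptions.

```python
from itertools import zip_longest


def truncate(text, width):
    """Cut `text` to `width` characters, marking the cut with '...'."""
    if len(text) <= width:
        return text
    if width <= 3:
        return text[:width]
    return text[: width - 3] + "..."


def experiment_lines(exp_id, started, status, params, best, latest, cell_width=22):
    """Render one experiment as several lines, stacking params and checkpoint info."""
    # Skip params that are unset (None here, by assumption) so they take up no space.
    param_cells = [f"{k}={v}" for k, v in params.items() if v is not None]
    rows = zip_longest([exp_id], [started], [status], param_cells, best, latest, fillvalue="")
    lines = []
    for exp, start, stat, param, best_cell, latest_cell in rows:
        cells = [exp.ljust(12), start.ljust(12), stat.ljust(9)]
        cells += [truncate(c, cell_width).ljust(cell_width + 1) for c in (param, best_cell, latest_cell)]
        lines.append("".join(cells).rstrip())
    return lines
```

Each experiment takes as many lines as its longest stacked column, so an experiment with three non-blank params takes three lines however many params exist across the project.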

Additional improvements:

  • Perhaps --plain (as per https://clig.dev/#output ) or --tabular could revert to the current behavior, giving a tabular view that can be piped to line-based tools. If stdout is a TTY, perhaps this could output in a pager with the correct options such that it doesn't wrap and you can scroll right.
  • Perhaps there is useful formatting we can add -- e.g. to highlight what has changed in an experiment, or add visual structure. This is a tabwriter that works with formatting: https://github.com/juju/ansiterm (see "Make tabwriter work with formatting", #67).
  • replicate ls gets more verbose in a compounding way, because it displays every param that has ever been used. Perhaps a more sensible default would be to display the last 25 experiments, or whatever, and replicate ls -a displays all experiments.
    • Showing latest 25 experiments, 1535 in total. Run replicate ls -a to display all experiments.
  • Perhaps we can add more metrics? Which ones should be added?
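
The "last 25 experiments" default could look something like the sketch below. It is a Python illustration only; `limit_experiments` is a hypothetical helper, and it assumes the experiment list is ordered oldest first. The footer wording follows the example message in the list above.

```python
def limit_experiments(experiments, show_all=False, limit=25):
    """Return (experiments to display, footer message or None)."""
    total = len(experiments)
    if show_all or total <= limit:
        return experiments, None
    # Assumes `experiments` is ordered oldest first, so the latest are at the end.
    shown = experiments[-limit:]
    footer = (
        f"Showing latest {limit} experiments, {total} in total. "
        "Run replicate ls -a to display all experiments."
    )
    return shown, footer
```

Returning the footer separately from the rows would let the caller print it only in the human-readable view and leave it out of any plain or piped output.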

Related

@bfirsh bfirsh added the type/roadmap High-level goals. https://github.com/replicate/replicate/projects/1 label Nov 8, 2020
@andreasjansson
Member

This is a really neat solution! A few thoughts:

  • I like the idea of an option to revert to plain single row per experiment if you need greppable output. -w/--wide is probably a better option, since it's used by ab, ps, symbols, etc.
  • It'd be good to keep the existing behavior of showing changing params and metrics only by default, and -a showing all params and metrics.
  • I'm not that keen on only showing the first n experiments, I'd imagine most users of the Replicate CLI know how to pipe things through | head or | tail.
  • Instead maybe we could add a --only argument that lets you specify certain params and metrics you care about.
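
One way the suggested --only argument might behave, sketched in Python. The comment does not specify a syntax, so the comma-separated form and the helper names `parse_only` and `select_only` are assumptions for illustration.

```python
def parse_only(arg):
    """Split a comma-separated --only value into a list of names (assumed syntax)."""
    return [name.strip() for name in arg.split(",") if name.strip()]


def select_only(values, only):
    """Keep the requested names from `values`, in the order they were requested."""
    return {name: values[name] for name in only if name in values}
```

For example, `select_only(row, parse_only("learning_rate, accuracy"))` would keep just those two entries of a row and silently skip any requested name the row does not have.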

@bfirsh
Member Author

bfirsh commented Dec 28, 2020

> I'm not that keen on only showing the first n experiments, I'd imagine most users of the Replicate CLI know how to pipe things through | head or | tail.

In case it wasn't clear: the reason this is useful is it reduces the number of params that are shown, and makes it clear what you are actually doing in your experiments. Over time params will build up and up, and quite quickly you might have dozens of different params you've changed, which will make the last experiment dozens of rows long.

The current algorithm shows params that are different in the displayed list. So, if Replicate showed only the most recent experiments, it would only show the params you've actually changed recently. These params are probably more relevant to your work, and will make it easier to eyeball what you have done. This isn't possible with head or tail, because they trim the output after the params to show have already been chosen from the full list.
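
To make the point concrete, here is a rough Python sketch of a "show only params that differ within the displayed list" rule. It is an approximation for illustration, not the algorithm as implemented in Replicate; `changed_params` is a hypothetical name, and the sample values are loosely based on the strawman above.

```python
def changed_params(experiments):
    """Return the param names whose values are not identical across `experiments`."""
    names = []
    for params in experiments:
        for name in params:
            if name not in names:
                names.append(name)
    missing = object()  # stands in for "param not set on this experiment"
    return [
        name
        for name in names
        if len({params.get(name, missing) for params in experiments}) > 1
    ]


history = [
    {"learning_rate": 0.001, "num_epochs": 20},
    {"learning_rate": 0.001, "num_epochs": 150},
    {"learning_rate": 0.001, "num_epochs": 150, "discrimitive_rate": 0.5},
    {"learning_rate": 0.01, "num_epochs": 150, "widen_factor": 2, "discrimitive_rate": 1},
    {"learning_rate": 0.01, "num_epochs": 150, "widen_factor": 2, "discrimitive_rate": 1},
]

all_time = changed_params(history)      # every param that has ever varied
recent = changed_params(history[-3:])   # only params that varied recently
```

Here `num_epochs` varies over the full history but not over the last three experiments, so restricting the displayed list before choosing params drops it, which piping the finished output through tail would not do.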

There are alternate solutions to this, I think. Maybe we need a different algorithm for determining what params are shown. Maybe this is related to experiment grouping #297, and won't be an issue if experiments are grouped.

Anyway -- this is probably a separate piece of work (why I filed it under "additional improvements"!) but just writing this to be clear.

@bfirsh
Member Author

bfirsh commented Jan 7, 2021

This is now mostly implemented in #442

Additional work in #462, #463.
