Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds __str__ which generates a text rendering #262

Merged
merged 4 commits into from Nov 30, 2019
Merged

Adds __str__ which generates a text rendering #262

merged 4 commits into from Nov 30, 2019

Conversation

HDembinski
Copy link
Member

@HDembinski HDembinski commented Nov 28, 2019

The original idea was to add .show() to generate the text rendering, with possibly some options on how to render. Then .__str__(self) would call .show(). However, this does not work well in the REPL, unless .show() returns a special string class that has a modified __repr__(self):

In [8]: h.show()
Out[8]: 'histogram(regular(10, 0, 1, metadata="None", options=underflow | overflow))\n              +--------------------------------------------------------------+\n[-inf,   0) 0 |                                                              |\n[   0, 0.1) 0 |                                                              |\n[ 0.1, 0.2) 0 |                                                              |\n[ 0.2, 0.3) 0 |                                                              |\n[ 0.3, 0.4) 0 |                                                              |\n[ 0.4, 0.5) 0 |                                                              |\n[ 0.5, 0.6) 0 |                                                              |\n[ 0.6, 0.7) 0 |                                                              |\n[ 0.7, 0.8) 0 |                                                              |\n[ 0.8, 0.9) 0 |                                                              |\n[ 0.9,   1) 0 |                                                              |\n[   1, inf) 4 |============================================================= |\n              +--------------------------------------------------------------+\n'

All that can be done, but I think this is overengineering. It is ok if the lowest layer just does something simple. I don't think we should even make it configurable. Upper layers will most likely override the display anyway with something custom.

@HDembinski
Copy link
Member Author

I think this should be included in 0.6. It does not add a new interface that we have to define and we are free to change what __str__ returns at any time.

@HDembinski HDembinski marked this pull request as ready for review November 28, 2019 11:44
@HDembinski
Copy link
Member Author

I also took the liberty to use the same repr and str for the cpp skin and the main skin.

@henryiii
Copy link
Member

I don't think we should expose things like:

  • metadata="None" (not a string!)
  • options=underflow | overflow (not how options are built)
  • histogram (lower cased)

In the regular Histogram. I'd rather wait and try to do this correctly. It would be a fine feature for 0.7.0.

Also, an object's str should not change depending on the terminal width...

@henryiii
Copy link
Member

henryiii commented Nov 28, 2019

.show() would print directly to the terminal, and therefore would know the terminal width and would not come up as a string. There would be an option to return a string instead for use in __str__.

@henryiii
Copy link
Member

I'm not horribly against this if you really want it in, by the way.

@HDembinski
Copy link
Member Author

If .show() prints to the terminal, it cannot be called from __str__

@HDembinski
Copy link
Member Author

I see no reason to delay this, as __str__ can be refined later. I will try to remove the first line generated by the cpp code, then the things you criticise "I don't think we should expose things like:" go away.

@HDembinski
Copy link
Member Author

And I don't want __str__ to rebin. :)

@HDembinski
Copy link
Member Author

Also, an object's str should not change depending on the terminal width...

It would feel weird for me for repr, agreed, but not for __str__.

@HDembinski
Copy link
Member Author

I am really open to change this, if this solution turns out to be bad, but I really think that fancy printing is better placed in an upper level library like hist.

@HDembinski
Copy link
Member Author

Having __str__ now is good to get some experience with this solution and we don't break anyone's code if we change it later.

@henryiii
Copy link
Member

henryiii commented Nov 28, 2019

Things that produce strings should not check the terminal width. If you do:

with open(output, 'w') as f:
    print(histogram, file=f)

that should not depend on your current terminal width!

histogram.show() should print to a configurable stream, sys.stdout by default. Then if you do histogram.show(out=f), it detects this is a file with no width and does the correct thing, and __str__ can use a StringIO instance, which also correctly detects that there is no width.

Having __str__ now is good to get some experience with this solution and we don't break anyone's code if we change it later.

That's why I'm okay with this if you want it in. I like not having the histogram printout in the C++ repr, since it is sometimes seen when you get an error, and having a huge repr is hard to manage. I do think it should be __str__ on the cpp histogram.

And I don't want __str__ to rebin.

I'd like it to rebin (eventually) for some really large number, like 1000, just so we don't kill terminals with printouts if someone tries to print it out (and could be a global configurable, like Numpy). This way it only affects users who would get a massive printout accidentally (and show would be fine not rebinning at all by default). Numpy used to do this a long time ago, and I had an 18,000,000 element array that used to kill it if accidentally displayed.

@henryiii henryiii mentioned this pull request Nov 28, 2019
@henryiii
Copy link
Member

henryiii commented Nov 28, 2019

Actually, something like the example above or:

with open(output, 'w') as f:
    f.write(histogram)

Should not easily truncate/shink/rebin, so I think I just talked myself (or you did, not sure) out of str truncating. I still would be 100% okay with an emergency truncation roughly equivalent to Numpy's default (we could even use Numpy's setting as it's large at 1000, and f.write(h.view()) would truncate at the same time as f.write(h) would rebin).

@HDembinski
Copy link
Member Author

The good thing about __str__() is that we change this later, without breaking people's code. My personal preference is to see everything by default, because I see print(hist) as something alike a debugging statement. If you want a proper display, you would not use a text rendering. I assume the text rendering is primarily used to do quick checks. I would prefer to not make __str__() configurable in boost-histogram. I think it is perfectly fine if libraries build on top of boost-histogram override __str__(). If in practice it turns out that that one of those solutions is better, then I am of course open to adapt that solution.

@HDembinski HDembinski merged commit 069abf7 into develop Nov 30, 2019
@HDembinski HDembinski deleted the add_show branch November 30, 2019 09:50
HDembinski added a commit that referenced this pull request Dec 1, 2019
* unified repr and str for cpp and main interface, added tests
* only show pure histogram rendering (we have repr for the type), fall back to repr if rank > 1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants