[hail] teach tables how to HTML #5666

Merged
merged 14 commits into from Mar 28, 2019

Collaborator

@danking danking commented Mar 21, 2019

Take a look at the docs for IPython.display.display.

I preserve the user's ability to specify a custom handler. The handler is no longer given a string but an object that has a sensible __str__ and __repr__. Moreover, this object has a _repr_html_ which Jupyter uses to display an HTML table. Detecting what frontend is being run is done by IPython.display.display.

I use the _Show shim class to avoid having tables themselves print as HTML.
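For readers unfamiliar with the pattern, here is a minimal sketch of a shim along these lines, assuming the table can already render itself as plain text and as HTML. The names `ShowShim`, `render_text`, `render_html` and `default_handler` are hypothetical and not taken from the Hail code base; only the `__str__`, `__repr__` and `_repr_html_` hooks come from the description above.

```python
class ShowShim:
    """Hypothetical stand-in for a shim like _Show (not the Hail implementation)."""

    def __init__(self, render_text, render_html):
        # Both arguments are zero-argument callables, so rendering is deferred
        # until a frontend actually asks for one of the representations.
        self._render_text = render_text
        self._render_html = render_html

    def __str__(self):
        return self._render_text()

    def __repr__(self):
        return self.__str__()

    def _repr_html_(self):
        # Rich frontends such as Jupyter look this method up when displaying an object.
        return self._render_html()


def default_handler(shown):
    # A plain-text handler: print() calls __str__ on the shim.
    print(shown)


shim = ShowShim(
    lambda: "+---+\n| x |\n+---+",
    lambda: "<table><tr><th>x</th></tr></table>",
)
default_handler(shim)
```

Because the HTML hook lives on the shim and not on the table, evaluating a table at the prompt would still use the table's own repr, while a handler such as `IPython.display.display` can pick the HTML form when the frontend supports it.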

I also check for terminal size and use that to pick n and width.
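A rough sketch of that defaulting step, assuming `n` and `width` are simply taken from the reported terminal size when the caller leaves them unset. The helper name `pick_display_size` is hypothetical, and any adjustment for header or footer lines is deliberately left out because the description does not specify it.

```python
import shutil


def pick_display_size(n=None, width=None, fallback=(80, 10)):
    """Fill in a missing n or width from the terminal size (hypothetical helper)."""
    if n is None or width is None:
        # get_terminal_size returns (columns, lines) and uses the fallback
        # when the size cannot be determined, e.g. when output is piped.
        columns, lines = shutil.get_terminal_size(fallback)
        if width is None:
            width = columns
        if n is None:
            n = lines
    return n, width


print(pick_display_size())           # both values come from the terminal
print(pick_display_size(n=5))        # explicit n is kept, width is detected
```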

Should _hl_repr live here?

Resolves #5663, #2847

@@ -279,7 +279,7 @@ task testPython(type: Exec, dependsOn: shadowJar) {
 '--color=no',
 '-r a',
 '--html=build/reports/pytest.html',
-'--self-contained-html',
+'--self-contained-html', '-vv', '--maxfail=1',
Collaborator

@tpoterba tpoterba Mar 21, 2019

disable?

t = t.flatten()
fields = list(t.row)

formatted_t = t.select(**{k: hl_format(v) for (k, v) in t.row.items()})
Collaborator

@tpoterba tpoterba Mar 21, 2019

This won't work - it'll try to call hl_format on expressions. hl_format needs to be called on strings after the take localizes them to python

Collaborator Author

@danking danking Mar 22, 2019

This isn't hl.format, it's hl_format, which is defined above to escape after calling Table._hl_repr. This is basically the same code that show previously used, except that I removed truncation and added cgi.escape.

Collaborator

@tpoterba tpoterba Mar 22, 2019

wait, but isn't this calling cgi.escape inside of a TableMapRows (select) above? How does that work?

Collaborator Author

@danking danking Mar 22, 2019

import html
import hail as hl
print(html.escape(hl.str("tlocus<GRCh37>"))._ir)
(Apply replace (Apply replace (Apply replace (Apply replace (Apply replace (Str "tlocus<GRCh37>") (Str "&") (Str "&amp;")) (Str "<") (Str "&lt;")) (Str ">") (Str "&gt;")) (Str "\"") (Str "&quot;")) (Str "'") (Str "&#x27;"))
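The IR above suggests why this works: in CPython, `html.escape` is written as a chain of `.replace(...)` calls, so any object with a compatible `replace` method passes through it. A minimal sketch with a hypothetical recording class (`FakeStr` is not a Hail type) shows the same nesting without needing Hail:

```python
import html


class FakeStr:
    """Hypothetical string-like object that records each replace call."""

    def __init__(self, text):
        self.text = text

    def replace(self, old, new):
        # Build a new node that wraps the previous one instead of replacing text.
        return FakeStr(f"(replace {self.text} {old!r} {new!r})")


# html.escape never checks the argument type; it only calls .replace on it.
expr = html.escape(FakeStr("(Str 'tlocus<GRCh37>')"))
print(expr.text)
```

This relies on how the standard library happens to be implemented, not on a documented contract, which is the concern raised in the following comment.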

Collaborator

@tpoterba tpoterba Mar 22, 2019

I don't really like this -- I think there's nothing that prevents the cgi or html implementation from changing and breaking this, and Hail string manipulation is also extremely slow right now.

Can we just call hl_format on each string coming out in line 1367?
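A minimal sketch of that alternative, assuming the rows have already been localized to plain Python strings (for example by `take`). The function name `rows_to_html` and the sample data are hypothetical; only `html.escape` is a real standard-library call.

```python
import html


def rows_to_html(fields, rows):
    """Build an HTML table from local Python strings, escaping every cell."""
    header = "".join(f"<th>{html.escape(f)}</th>" for f in fields)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><thead><tr>{header}</tr></thead><tbody>{body}</tbody></table>"


# Escaping happens in Python after localization, not inside a Hail expression.
table = rows_to_html(["locus"], [["tlocus<GRCh37>"]])
print(table)
```

Escaping on the Python side keeps the Hail query free of string manipulation and does not depend on `html.escape` accepting non-string objects.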


if has_more:
n_rows = len(rows)
s += f"<p>showing top { n_rows } { 'row' if n_rows == 1 else 'rows' }</p>\n"
Collaborator

@tpoterba tpoterba Mar 21, 2019

we have a plural utility function in hail.utils.misc

Collaborator Author

@danking danking Mar 22, 2019

fixed

handler(self._show(n, width, truncate, types))
if n is None or width is None:
import shutil
(columns, lines) = shutil.get_terminal_size((80, 10))
Collaborator

@tpoterba tpoterba Mar 21, 2019

very nice.

Collaborator Author

@danking danking commented Mar 22, 2019

e.g. Jupyter. It looks like Jupyter also reports a terminal size. show picks 40 rows in this case, which means Jupyter told us 50 lines, which seems a bit much. In reality you can fit about 35 HTML table lines in that browser window, including the Jupyter header (which eats about 5 lines).

[Screenshot: Screen Shot 2019-03-22 at 12 04 39 PM]

Collaborator Author

@danking danking commented Mar 22, 2019

IPython
[Screenshot: Screen Shot 2019-03-22 at 12 09 18 PM]

Collaborator

@tpoterba tpoterba left a comment

move hl_format

Collaborator

@tpoterba tpoterba commented Mar 22, 2019

doctest failure from show() changing the width, it seems

s = "{" + hl.delimit(hl.map(lambda x: Table._hl_repr(x[0]) + ":" + Table._hl_repr(x[1]), hl.array(v)), ",") + "}"
elif v.dtype == hl.tstr:
s = hl.str('"') + hl.expr.functions._escape_string(v) + '"'
elif isinstance(v.dtype, (hl.tstruct, hl.tarray)):
Collaborator

@tpoterba tpoterba Mar 22, 2019

you didn't change this code, but this should be ttuple not tarray

Collaborator

@tpoterba tpoterba left a comment

So awesome.

elif v.dtype == hl.tstr:
s = hl.str('"') + hl.expr.functions._escape_string(v) + '"'
elif isinstance(v.dtype, (hl.tstruct, hl.ttuple)):
s = "(" + hl.delimit([Table._hl_repr(v[i]) for i in range(len(v))], ",") + ")"
Collaborator

@tpoterba tpoterba Mar 25, 2019

needs to check for empty array - that can't be type-imputed

Collaborator

@tpoterba tpoterba Mar 25, 2019

(this is your test failure)

Collaborator Author

@danking danking Mar 27, 2019

le sigh. fixed.

Collaborator

@tpoterba tpoterba left a comment

boom!

@danking danking merged commit c1d08a7 into hail-is:master Mar 28, 2019
1 check passed
@danking danking deleted the table-html branch Dec 18, 2019
2 participants