Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download component #863

Closed
wants to merge 14 commits into from
Closed

Download component #863

wants to merge 14 commits into from

Conversation

emilhe
Copy link
Contributor

@emilhe emilhe commented Sep 13, 2020

As per the discussion #216, this pull request is an attempt to merge my unofficial Download component (currently residing in the dash-extensions library) into dash-core-components. The simplest use case is download of raw data,

import dash
import dash_html_components as html

from dash.dependencies import Output, Input
from dash_core_components import Download

app = dash.Dash(prevent_initial_callbacks=True)
app.layout = html.Div([html.Button("Download", id="btn"), Download(id="download")])


@app.callback(Output("download", "data"), [Input("btn", "n_clicks")])
def func(n_clicks):
    return dict(content="Hello world!", filename="hello.txt")


if __name__ == "__main__":
    app.run_server()

In addition to the Download component itself, a few utility functions are included that makes it easy to download files,

import dash
import dash_html_components as html
import dash_core_components.express as dcx

from dash.dependencies import Output, Input
from dash_core_components import Download

app = dash.Dash(prevent_initial_callbacks=True)
app.layout = html.Div([html.Button("Download", id="btn"), Download(id="download")])


@app.callback(Output("download", "data"), [Input("btn", "n_clicks")])
def func(n_clicks):
    return dcx.send_file("./tests/integration/download/download-assets/Lenna.jpeg")


if __name__ == "__main__":
    app.run_server()

and data frames,

import dash
import pandas as pd
import dash_html_components as html
import dash_core_components.express as dcx

from dash.dependencies import Output, Input
from dash_core_components import Download

# Example data.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [2, 1, 5, 6], "c": ["x", "x", "y", "y"]})
# Create app.
app = dash.Dash(prevent_initial_callbacks=True)
app.layout = html.Div([html.Button("Download", id="btn"), Download(id="download")])


@app.callback(Output("download", "data"), [Input("btn", "n_clicks")])
def func(n_clicks):
    return dcx.send_data_frame(df.to_csv, "mydf.csv")


if __name__ == "__main__":
    app.run_server()

I was not really sure where to put these util functions, so for now i have put them in a new express module in dash_core_components. I guess there are (at least) a few points to discuss,

  • Could the syntax be better?
  • Where should the util functions reside?

I have added tests for all three cases (raw content, file, data frame), but i haven't added an example in the Demo, as i haven't been able to get it running (see #805 ).

EDIT: I am also having some issues with lint; running black locally, my tests .py files are OK. But on the build server they are only OK for python 2.7, not for python 3.6 and 3.7 (locally i am running 3.7).

@alexcjohnson
Copy link
Contributor

@emilhe Thanks so much for this PR, and my sincere apologies for not reviewing it earlier. Looks very promising!

A few general thoughts before I get into specific code comments:

  • Don't put anything into dash_core_components/ manually - this whole dir is expected to be generated by the build step. Fortunately you can just move the file into dash_core_components_base/ and the build step will copy it over 🎉
  • Re: where should the util functions reside: Curious to hear @chriddyp or @Marc-Andre-Rivet chime in, but my gut reaction would be to call the module something like utils.py but also import the user-facing functions into __init__.py (again, in dash_core_components_base/) and document their usage from the top level, ie dcc.send_file etc.
  • re: Black - dcc gets black from dash[dev] which currently pins it to 19.10b0 - do you have a different version? At some point we should update that, but this is likely going to require changes throughout the dash core so for now it's probably easiest to just work with that version.
  • Re: the demo - don't worry about that, we're not maintaining that anymore and should probably remove it. Once this PR is about ready to merge we'll want to add it to https://github.com/plotly/dash-docs

"to_parquet": True,
"to_msgpack": True,
"to_stata": True,
"to_pickle": True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This True/False looks funny - as if False means "not a known writer"
Perhaps the values could just be send_bytes/send_string?

That said, are the functions send_bytes and send_string useful on their own? With writer as the first arg it feels pretty specific to pandas, and it doesn't look like a helper (or b64 encoding) is necessary for strings anyway - you just use dict(content=the_string, filename=the_filename) right? Does the same apply to byte strings? If that's right and these are only useful internally I'd give them underscored names like _send_pd_string - or even inline them in send_data_frame and call this dict _pandas_writes_bytes - that would be DRYer since the only differences between the two functions are BytesIO vs StringIO and whether you .encode() after .getvalue().

Copy link
Contributor Author

@emilhe emilhe Oct 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, i guess that would actually make more sense :)

Yes, I consider the send_bytes and send_string functions to be usefull on their own. The writer can be any object, not just pandas writers. The most common use case is probably sending excel files, but it could be anything. It's not more than a few weeks ago since i got the latest request on how to send a particular file type, in this case pdf files created by pdfrw. Since pdfrw creates binary files, the solution is

send_bytes(lambda buf: pdfrw.PdfWriter().write(buf, my_pdf_data), filename="my_pdf.pdf")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can confirm send_bytes is useful (I was the one who asked about pdfrw). It's working great within my dash app currently

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that makes sense. My main concern was that you have to pass a function in, and I imagine in a lot of these situations you'll already have a string or bytestring so it would feel funny to have to wrap that in a lambda. What if we allowed that first parameter to be either a writer function or a string / bytestring? For character strings that's only a slight simplification vs constructing the dict yourself, but still may be easier for users to learn. For bytestrings the argument feels more solid, as send_bytes ensures we get the encoding right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you point. That being said, i haven't yet encountered a case where I have a bytes object, passing around a writer seems to be more common (at least in my experience). I have now refactored the code and made some minor changes so that,

  • send_bytes now accepts a bytes object or a writer compatibale with BytesIO
  • send_string now accepts a str object or a writer compatiable with StringIO

While send_string doesn't simplify much as compared to constructing the dictionary manually, i actually like that there are util methods for all common use cases. This way, the data structure (a dict with certain properties) is hidden from the user; it essentially becomes an implementation detail.

emilhe and others added 2 commits October 20, 2020 07:16
Co-authored-by: Alex Johnson <johnson.alex.c@gmail.com>
* Reverting test files from black 20.8b1 to 19.10b0 (for compatibility)
* Moving express.py to dash_core_components_base + added import in __import__ (+ updated tests accordingly)
@emilhe
Copy link
Contributor Author

emilhe commented Oct 20, 2020

  • I have moved the express module to dash_core_components_base and added it to the __init__.py file. It indeed works (post build), and I really like that you now have one less import statement :)

  • I'll await further comments before proceeding with renaming, i.e. I have kept the express name for now.

  • Ah yes, I had version 20.8b1. I have now downgraded to the appropriate version, and it seems to have solved all of the ci lint issues!

  • Great. I figured documentation should be added at some point.

@alexcjohnson
Copy link
Contributor

@emilhe Sorry about the long delay here - I think I've fixed a couple of remaining minor issues, could you give me (ie maintainers) edit permissions on this PR to push my changes?

@alexcjohnson alexcjohnson mentioned this pull request Feb 6, 2021
@alexcjohnson
Copy link
Contributor

Closing in favor of #926 (which is just the same branch plus some updates)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants