Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better display of binary data on arbitrary query results page #1051

Closed
simonw opened this issue Oct 25, 2020 · 6 comments · Fixed by #1061
Closed

Better display of binary data on arbitrary query results page #1051

simonw opened this issue Oct 25, 2020 · 6 comments · Fixed by #1061

Comments

@simonw
Copy link
Owner

simonw commented Oct 25, 2020

https://latest.datasette.io/fixtures?sql=select+rowid%2C+data+from+binary_data+order+by+rowid+limit+101

fixtures__select_rowid__data_from_binary_data_order_by_rowid_limit_101_and_Switch_to__blob_render_extension_for_BLOB_downloads_·_Issue__1050_·_simonw_datasette

Problem: if these were larger fields that HTML page could have multiple megabytes of Python binary string representations on it.

It should behave more like the regular table view does:

fixtures__binary_data__2_rows

@simonw
Copy link
Owner Author

simonw commented Oct 25, 2020

Providing a binary download link here is actually extremely difficult.

The problem is that the SQL query itself represents data that can change from one moment to the next. It's no good showing a "Binary: 55 bytes" message that links to that same SQL query but with a .blob extension and arguments to select the particular result, because the data may change in a way that causes that query to return a different row - at which point the download link will give you the wrong data, not the 55 bytes you asked for.

So providing a download link risks being misleading.

@simonw
Copy link
Owner Author

simonw commented Oct 25, 2020

This is bad though, because if I want to provide binary data in CSV as requested in #1034 I need some way of providing that data.

Which suggests to me that the base64 option is the only one that can make sense for arbitrary SQL queries represented as CSV. Download links won't work.

@simonw
Copy link
Owner Author

simonw commented Oct 26, 2020

Crazy idea: generate a signed URL containing a base64 of the gzip of the binary content (to try and reduce size).

No: this will blow through URL limits in various hosting providers and possibly even browsers. It could be made to work a little bit more reliably with some extra JavaScript that turns it into a download on the browser-side, but that would be hideously complicated.

Also the signed bit doesn't prevent people from generating SQL queries that generate nasty binary blobs for download.

I'm beginning to think that restricting this feature to just table view, not query view, is a better idea. Query view can still get at the binary using JSON and base64.

@simonw
Copy link
Owner Author

simonw commented Oct 26, 2020

I still need to improve the current binary display on the query page though, where it outputs a Python b'...' literal.

@simonw
Copy link
Owner Author

simonw commented Oct 29, 2020

OK, alternative idea. The .blob output renderer from #1050 gets to see multiple rows at once.

For an arbitrary SQL query, how about if I link to this?

/db.blob?sql=...&_blob_column=data&_blob_hash=bc4c24181ed3ce666

Then the output renderer loops through all of the data results that are available to it and, if one of them hashes to that value, serves up that data?

If no matches are found it can show an error message telling you that the link has expired (presumably because the underlying database has changed since the link was generated).

I think this might be the best solution to the problem.

@simonw simonw mentioned this issue Oct 29, 2020
2 tasks
simonw added a commit that referenced this issue Oct 29, 2020
* _blob_hash= checking plus refactored to use new BadRequest class, refs #1050
* Replace BlobView with new .blob renderer, closes #1050
* .blob downloads on arbitrary queries, closes #1051
@simonw
Copy link
Owner Author

simonw commented Oct 29, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant