-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve OMERO.table download usability #300
Conversation
omeroweb/webgateway/views.py
Outdated
@@ -2963,6 +2963,12 @@ def _table_query(request, fileid, conn=None, query=None, lazy=False, **kwargs): | |||
if request.GET.get("limit") is not None | |||
else rows | |||
) | |||
if limit > settings.MAX_TABLE_DOWNLOAD_ROWS: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we've repeated the pattern of returning HTTP 200 with error
in a number of places but is that something we actually want to repeat here? Perhaps we use HTTP 429 [1]? Might be outside the scope here though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another potential candidate might be HTTP 413 https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/413?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also an option, @sbesson. Again possibly outside the scope of this PR but having a consistent and easily identifiable way of signaling to users that they are asking too much of the server at any one time is probably a very good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since _table_query()
is used directly as a view method table_query = login_required()(jsonp(_table_query))
(although we don't think we use that URL directly), it should return a custom HttpResponse
if we want any status other than a 200
.
However, since it is also used by other view methods such as object_table_query()
, tableData = _table_query(request, annotation["file"], conn, **kwargs)
we'd have to check here for:
if isinstance(tableData, HttpResponse):
return tableData
The alternative is to have @jsonp
decorator handle a dict with error
and status
keys and and update the HttpResponse
it returns accordingly. This could lead to unexpected results if you really wanted to return a JsonResponse
with error
and status
keys and a different actual status code (unlikely)? It keeps the views methods a bit cleaner, but maybe it's too much 'magic' behaviour of @jsonp
and we might want to get rid of json-padding
callback behaviour at some point.
Strong feelings either way?
But probably better to handle in a different PR since it's orthogonal to the changes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened a separate PR for discussion there: #301
@chris-allan Other than the response status discussion above, do you (or others at GS) have any other feedback on this and/or are able to test? (see description) |
Currently on our testing infrastructure with tables containing ~500K fairly sizeable rows. @erindiel will provide some feedback @will-moore. |
Thanks for these changes @will-moore. We have a few points of feedback:
|
I've added a Cancel button and fixed the other 2 minor points. |
In the last commit, I've handled failed requests by simply re-trying them. I've not spent too much time on making this UI look slick. Hopefully the Cancel button is clear enough: |
The errors @erindiel was seeing on our testing system were a direct result of the race condition seen by @kkoz outlined in ome/omero-py#292. Traceback for reference:
Once we fix that with ome/omero-py#292 and make a release I think errors will be a rare event. Right now they're a given on almost every download on a production like system setup. /cc @joshmoore |
@erindiel @chris-allan Let me know if you need any other fixes/updates to this: I'm around till Thursday. |
Thanks @will-moore. We confirmed that we can cancel the download, and the UI for this is clear. We can create the race condition by downloading the same table in two separate windows, and confirmed that the retries fix the issue of missed rows. What do you think about retrying ~5 times, then failing the entire download? Multiple failed requests on the same batch likely reflect a more fundamental issue with OMERO.tables or the particular file (although you'd be unlikely to reach this point if that was the case). Overall, a great improvement in user experience, so thanks again! |
@erindiel I just show a simple |
How's this PR looking / working for you @erindiel? |
Everything here is good from our perspective, thanks @will-moore. |
@@ -687,6 +687,13 @@ def leave_none_unset_int(s): | |||
"Prevent multiple files with total aggregate size greater than this " | |||
"value in bytes from being downloaded as a zip archive.", | |||
], | |||
"omero.web.max_table_download_rows": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when release, a release of the documentation will be required so this new parameter is available in the doc
const rowCount = filter ? parseInt("{{ meta.totalCount }}") : parseInt("{{ meta.rowCount }}"); | ||
const tableName = "{{ data.name }}.csv"; | ||
// Use 10 batches, or max 3000 rows per batch | ||
const MAX_BATCH_ROWS = 3000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want those values to be configured too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we need to make it configurable unless we know it's useful. Adding code that will likely never be used (and testing it etc) seems like poor use of time.
Like any feature, if we find there's a user who wants/needs it then we can add it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably also dangerous. Using any more than 3000 rows is likely to cause server problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you try with other values to see a change in performance? Or is this the "optimal" value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @erindiel tested various values and reported 3000 as best at #300 (comment)
Previously it was 1000 (see a3c1c48).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks
Thanks all. Merging. |
This improves the experience and options when downloading OMERO.tables as CSV from
/webclient/omero_table/FILEID/
.?query=colname>value
) then we offer another option of download the Whole table using the current filter (which wasn't possible before), and showing progress.To test UI:
Whole Table
downloads the whole table with progress shown and that theCurrent View
downloads just the current page. Check the CSV files with Excell etc.Whole Table
button should behave as before. TheCurrent View
button should download the current page of filtered results and theWith Filter
button should download ALL the filtered results (with progress bar). Check the CSV files with Excell etc that they have all the expected rows.?limit=11000
and this should return an error page with appropriate message. E.g. https://merge-ci.openmicroscopy.org/web/webclient/omero_table/406232/?limit=11000 (user-3)?limit=5000
To test other URLs:
&limit=1000
OR a query with fewer than 10k results should be fine: e.g. https://merge-ci.openmicroscopy.org/web/webgateway/table/Project.datasetLinks.child.imageLinks.child/105665/query/?query=Y>52000&limit=1000
OR https://merge-ci.openmicroscopy.org/web/webgateway/table/406232/query/?query=Y>52000 should be finecc @chris-allan