Support Large Guestbooks #3609
Comments
May require/benefit from further optimizations though. [#3609]
I'd like to quickly review (with @scolapasta?) the plan for the guestbook-responses page.
…liminating the extra lookup for custom questions and answers, that was adding one extra query per every guestbookresponse in the search results. (#3609)
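For readers unfamiliar with the "one extra query per response" pattern, here is a minimal sketch of the general idea. The table and column names (`response`, `custom_answer`) are hypothetical and are not Dataverse's actual schema; the real fix lives in the Java code of the pull request and may look quite different.

```python
import sqlite3

# Hypothetical schema for illustration only; not Dataverse's real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE response (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE custom_answer (response_id INTEGER, question TEXT, answer TEXT);
    INSERT INTO response VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO custom_answer VALUES
        (1, 'Affiliation?', 'Lab A'),
        (1, 'Purpose?', 'Teaching'),
        (3, 'Affiliation?', 'Lab B');
""")


def answers_per_response(conn):
    """Slow pattern: one extra lookup for every response row."""
    result = {}
    queries = 1  # the initial query that lists the responses
    ids = conn.execute("SELECT id FROM response ORDER BY id").fetchall()
    for (rid,) in ids:
        rows = conn.execute(
            "SELECT question, answer FROM custom_answer "
            "WHERE response_id = ? ORDER BY question",
            (rid,),
        ).fetchall()
        queries += 1
        result[rid] = rows
    return result, queries


def answers_batched(conn):
    """Batched pattern: two queries in total, however many responses exist."""
    ids = conn.execute("SELECT id FROM response ORDER BY id").fetchall()
    result = {rid: [] for (rid,) in ids}
    queries = 1
    all_answers = conn.execute(
        "SELECT response_id, question, answer FROM custom_answer "
        "ORDER BY response_id, question"
    ).fetchall()
    queries += 1
    for rid, question, answer in all_answers:
        result.setdefault(rid, []).append((question, answer))
    return result, queries
```

Both functions return the same mapping, but the first issues a query per response, which is what becomes painful at 150K rows.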
… guestbook data. Lots of optimizations and fixes. Will add more info in #3609 explaining what's been done and how things are supposed to work now.
Going to make a PR and move this into code review. Will add a few more lines here explaining what's been done, how it's supposed to be working now and how to test stuff.
I assigned myself to do some code review of pull request #4057 and my first questions are:
Judging from https://help.hmdc.harvard.edu/Ticket/Display.html?id=246137 a large guestbook is 1288 rows ( For testing, it'll be easiest to just use a production database.
As of b05a026 I noticed some "???" next to "Collected Data" when you preview a guestbook:
@pdurbin -- good catch, I missed that popup on the dataset page. I will get a fix in ASAP.
I wrote a little script in e869900 to help me create two thousand guestbook entries, which I was able to download just fine as of b05a026 from the pull request. Then I went back and tested dd55c08 on the develop branch and I was able to download them there too. I haven't noticed anything objectionable in the pull request apart from a place where logging could be reduced. It sounds like @mheppler is going to fix the "???" I found above. @landreev if there's anything specific you want a code reviewer to look for, please let us know. Thanks.
To do list:
@pdurbin the one with 150K responses may be the largest we have. The "Download All" button on that page will try to download the guestbook entries for the entire dataverse, meaning about 180K responses. You cannot do that in production currently (you will get a 500 error), and you cannot download the results for either of these guestbooks. I tested my patch with the production database on vm5; you can now download the results for these largest guestbooks in a fairly reasonable time. As for the 1288 rows, note that that was the result of the query that Kevin ran for them on one specific dataset. The manage-guestbooks and guestbook-results pages operate on entire dataverses.
…to change the display limit on the number of guestbook entries. (#3609)
Added a documentation section on the display limit.
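To make the idea of a display limit concrete, here is a rough sketch of how a page could cap what it renders while still reporting the full count. The function name `page_view` and the default of 5000 are assumptions made up for this example; they are not taken from the pull request or from the Dataverse documentation.

```python
def page_view(responses, display_limit=5000):
    """Return the rows to render, plus a note if the list was truncated.

    `display_limit` is a hypothetical default chosen for illustration.
    """
    total = len(responses)
    shown = responses[:display_limit]
    note = None
    if total > display_limit:
        note = (
            f"Showing {len(shown)} of {total} responses; "
            "download the CSV for the full list."
        )
    return shown, note
```

For example, `page_view(list(range(10)), display_limit=3)` returns the first three items together with a note pointing the user at the CSV download.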
Summary of the changes, for QA:

The download-as-CSV functionality has been optimized for both "download all (responses for the dataverse)" on the manage-guestbooks page and "download the results for the given dataverse and guestbook" on the guestbook-results page.

Added help tip text to both pages explaining that the downloaded results will be in CSV format and can be imported into Excel or Google Sheets, and encouraging users to use this method if they need to further reorganize the results and/or select the results for specific datasets, files, etc.

Fixed the filename for the download function (it was getting chopped at the first space in the name of the dataverse, losing the ".csv" extension in the process).

Also, for very large guestbooks (for example: https://dataverse.harvard.edu/guestbook-responses.xhtml?dataverseId=99&guestbookId=9), with the current implementation you cannot even get to the download button. The page takes a long time to load, then finally fails with a 500. That is because the current implementation tries to display all the guestbook entries on the page as well. Part of the failure is because the retrieval was not very efficient. But even with that optimized, loading 150K entries on the page is still not a good idea: it will take a long time for the browser to render, no matter what you do, and it's probably not very useful to a user either.
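On the filename fix: the thread does not say how it was implemented, but a common cause of a name being cut at the first space is an unquoted `filename` value in the `Content-Disposition` header. The sketch below shows the quoting approach under that assumption; the helper name and the example filename are hypothetical.

```python
def content_disposition(filename):
    """Build a Content-Disposition value that survives spaces in the name.

    Backslashes and double quotes are escaped, then the whole name is
    wrapped in double quotes so the browser does not stop at a space.
    """
    safe = filename.replace("\\", "\\\\").replace('"', '\\"')
    return f'attachment; filename="{safe}"'


# Hypothetical dataverse name containing a space.
header = content_disposition("My Dataverse_GuestbookResponses.csv")
```

Without the quotes, a browser may treat everything after the first space as a separate token, which would also drop the ".csv" extension as described above.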
This comes up from time to time when a user with a very large guestbook wants to download it but can't. See RT 246137 for an example of a repeating request. So for now this requires us to run a database query and send the results to the user whenever they need updated info.
Please note this is separate from the bug where some guestbooks can be downloaded in FF and Safari but not Chrome (#3581). At first glance they seem the same but they are not.