Custom home page for Harvard Dataverse #5053

Closed
TaniaSchlatter opened this issue Sep 13, 2018 · 29 comments

Comments

@TaniaSchlatter
Member

TaniaSchlatter commented Sep 13, 2018

Harvard Dataverse stakeholders would like a custom homepage for the repository. The goals are to:

  • promote Harvard Dataverse
  • increase use of Harvard Dataverse
  • demonstrate the depth and breadth of Harvard Dataverse
  • present Harvard Dataverse as part of Harvard and the Harvard libraries

Two phases are planned: the version shown in the image below, and a revised version that adds a visualization and links to an About Harvard Dataverse site. See the Design project doc, which links to related files, for details.

[Image: harvard_dataverse_beta_mock6_3]

@pdurbin
Member

pdurbin commented Oct 1, 2018

@matthew-a-dunlap here's the doc we were just looking at that has notes from tech hours and later meetings with design: https://docs.google.com/document/d/1AtvODFY8aNQSE9BleeQEf5GkFsDWrziUB-IX-1xA0Y8/edit?usp=sharing

@mheppler
Contributor

mheppler commented Oct 1, 2018

Still waiting on new mockups from @TaniaSchlatter for the Recent Activity section, but here are some of the decisions we discussed in the design meeting.

  • Instead of general published datasets (which might show harvested results) and file downloads (which would require a new API plugged into the guestbook), make all three columns show published datasets by dataverse category (as the third "journal" column does in the mockup)
  • Based on the metrics on dataverse.org, the three category columns discussed were journals (2%), researchers (30%), and research projects (43%), but more discussion may be needed on these if the queries don't produce the recent activity results expected
  • The hope was that the activity APIs for datasets by dataverse category would also provide value to other installations that want to use this custom homepage template

@TaniaSchlatter
Member Author

There is a new mockup in the doc, along with descriptions of the dynamic content, including our development discussions and discussions with library folks: https://docs.google.com/document/d/1AtvODFY8aNQSE9BleeQEf5GkFsDWrziUB-IX-1xA0Y8/edit

@matthew-a-dunlap
Contributor

matthew-a-dunlap commented Oct 4, 2018

This issue is likely blocked by #3616, in which the image URLs provided by our Search API are unusable. I say "likely" because I am somewhat skeptical about using the Search API for this page at all. I created a prototype JavaScript block for testing the "most recent datasets by category" functionality, and it seems pretty slow, but there are a number of tricks left to improve it (loading the content asynchronously, combining the two search calls into one endpoint, Solr caching, etc.).

@landreev landreev self-assigned this Nov 2, 2018
@pdurbin
Member

pdurbin commented Nov 2, 2018

As I mentioned at standup the other day, the scope of this issue has been increased slightly to add new "naked" endpoints such as...

  • /api/info/metrics/dataverses
  • /api/info/metrics/datasets
  • /api/info/metrics/files
  • /api/info/metrics/downloads

These return counts, and if we're rethinking the namespace of the API, perhaps we should add the word "count" in there, as in either of these...

  • /api/info/metrics/counts/dataverses
  • /api/info/metrics/dataverses/counts

... without breaking backward compatibility with what's in production already.
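
For anyone who wants to try the "naked" endpoints listed above, here is a minimal Python sketch of a client. It is only an illustration, not how the homepage itself calls them. The helper names are made up for this example, and the response shape `{"status": "OK", "data": {"count": N}}` is an assumption that should be checked against a real installation.

```python
import json
from urllib.request import urlopen

# The four metric types named in the endpoint list above.
METRIC_TYPES = ("dataverses", "datasets", "files", "downloads")


def metrics_count_url(server: str, metric_type: str) -> str:
    """Build the URL of a "naked" all-time count endpoint (hypothetical helper)."""
    if metric_type not in METRIC_TYPES:
        raise ValueError(f"unknown metric type: {metric_type!r}")
    return f"{server.rstrip('/')}/api/info/metrics/{metric_type}"


def parse_count(body: str) -> int:
    """Extract the count from a response body.

    ASSUMPTION: the body looks like {"status": "OK", "data": {"count": N}}.
    """
    payload = json.loads(body)
    if payload.get("status") != "OK":
        raise RuntimeError(f"metrics call failed: {payload}")
    return int(payload["data"]["count"])


def fetch_count(server: str, metric_type: str) -> int:
    """Fetch one count over HTTP (needs network access to the installation)."""
    with urlopen(metrics_count_url(server, metric_type)) as resp:
        return parse_count(resp.read().decode("utf-8"))
```

Keeping URL construction and parsing separate from the HTTP call means that a later namespace change (for example, adding "counts" to the path) would only touch `metrics_count_url`.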

I loaded up b1e685e on https://dev1.dataverse.org and learned that MyData is broken, throwing the following exception, so I'm moving this back into dev.

java.lang.NullPointerException at edu.harvard.iq.dataverse.search.SearchServiceBean.search(SearchServiceBean.java:246)

If we write some tests for MyData for this issue we will be ahead of the game for #5042.

@landreev
Contributor

landreev commented Nov 2, 2018

@pdurbin if it's back in dev., who's working on it?
I will remove my name from it then.

@matthew-a-dunlap
Contributor

As @pdurbin mentioned, we've updated the Metrics API with new endpoints. We feel there may still be room for improvement in the API namespaces as well as in the documentation for them.

The main open question has been how to organize the calls. In the doc they are currently listed as "counts" and "other", but that could be changed to "by object" and "by facet". Regardless of the naming, we have gone back and forth over whether the paths to the calls should reflect these categories or stay as they are, with the shorter structure.

Below is the current doc page.
[Screenshot: screen shot 2018-11-02 at 12 18 21 pm]

@matthew-a-dunlap
Contributor

matthew-a-dunlap commented Nov 7, 2018

Here are up-to-date release / testing notes on this issue:

  • Setup steps:
  • Functionality added:
    • Metrics
      • New endpoint for all-time total GET https://$SERVER/api/info/metrics/$type
        • Old endpoint GET https://$SERVER/api/info/metrics/$type/toMonth should still work
      • Past x days endpoint GET https://$SERVER/api/info/metrics/$type/pastDays/$days
        • Note that the caching on this metric differs from all the others (which are set via JVM option): it is cached every day regardless of that option
      • Dataverses by subject GET https://$SERVER/api/info/metrics/dataverses/bySubject
    • Search inside multiple dataverses (e.g. subtrees). For example: https://demo.dataverse.org/api/search?q=data&subtree=birds&subtree=cats
      • We ended up not using this for the homepage, but the functionality is still there. It involved touching the permission checks that decide which objects show up in search calls (e.g. whether unpublished or restricted files should appear in a search).
    • Search with or without database queries on the entities for additional facet data. This functionality already existed but was not configurable via API calls; it is now available via the API and turned off by default.
    • Change to how thumbnails are queried via the API. If there is no thumbnail, Dataverse returns an error. This should not affect anything inside Dataverse itself; only the API endpoint was touched.
    • New values indexed:
      • categoryOfDataverse / identifierOfDataverse added to the schema and indexed. dvName is also indexed for datasets (no schema addition needed for this).
        • These all update if the dataset is moved. They reflect the immediate parent dataverse.
        • Note that categoryOfDataverse / identifierOfDataverse were named differently to ensure that datasets don't show up in searches on the dataverse-specific fields.
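
For testers, the calls in the notes above can be assembled with a few small helpers. This is a rough Python sketch, not part of the branch. The function names are hypothetical, the `fq` filter value `"Journal"` is an assumed example category, and the `toMonth` form is left out because the notes do not spell out its full path.

```python
from urllib.parse import urlencode


def metrics_url(server, metric_type, past_days=None):
    """All-time total, or the pastDays variant when past_days is given."""
    base = f"{server.rstrip('/')}/api/info/metrics/{metric_type}"
    if past_days is None:
        return base
    if past_days < 1:
        raise ValueError("past_days must be a positive integer")
    return f"{base}/pastDays/{past_days}"


def dataverses_by_subject_url(server):
    """The "dataverses by subject" endpoint from the notes."""
    return f"{server.rstrip('/')}/api/info/metrics/dataverses/bySubject"


def search_url(server, query, subtrees=(), filter_query=None):
    """Search URL with one repeated subtree parameter per dataverse alias.

    filter_query is passed through as the Search API's fq parameter.
    """
    params = [("q", query)]
    params += [("subtree", alias) for alias in subtrees]
    if filter_query:
        params.append(("fq", filter_query))
    return f"{server.rstrip('/')}/api/search?{urlencode(params)}"


if __name__ == "__main__":
    server = "https://demo.dataverse.org"
    print(metrics_url(server, "datasets", past_days=30))
    print(dataverses_by_subject_url(server))
    # Reproduces the multi-subtree example from the notes.
    print(search_url(server, "data", subtrees=["birds", "cats"]))
    # ASSUMPTION: "Journal" is a category value that exists on the installation.
    print(search_url(server, "*", filter_query='categoryOfDataverse:"Journal"'))
```

Passing the parameters as a list of pairs, rather than a dict, is what allows `subtree` to appear more than once in the query string.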

@kcondon
Contributor

kcondon commented Nov 8, 2018

Issues:
- The category links and the "all recent" links throw exceptions when the root dataverse is not named "root".

@mheppler
Contributor

mheppler commented Nov 8, 2018

@kcondon The hardcoded "root" dataverse name was known during development and accepted, since this is a custom homepage template: the dataverse name can be customized along with the other content, such as which categories to show recent datasets from, because other installations might not have "Journals" or "Research Groups".

If this isn't clear in the documentation, we should add that the root dataverse alias must be changed in the JavaScript and other links for them to work.

@kcondon
Contributor

kcondon commented Nov 8, 2018

@mheppler I was just concerned about deploying to production, where it would behave the same way. I was not aware of the cause of the problem, and there were no errors indicating what the problem was.

@kcondon
Contributor

kcondon commented Nov 8, 2018

@mheppler you mean this?

Note that the custom-homepage.html and custom-homepage-dynamic.html files provided have multiple elements that assume your root dataverse still has an alias of "root". While you were branding your root dataverse, you may have changed the alias to "harvard" or "librascholar" or whatever and you should adjust the custom homepage code as needed.

Kind of low on the usability scale IMHO.
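
One way an installation could make the adjustment described in that note less manual is to rewrite the alias when deploying the template. The Python sketch below is only an illustration. It assumes the template refers to the root dataverse through links of the form `/dataverse/root` and query parameters of the form `subtree=root`; the real files may reference the alias in other ways, so check them before relying on this. `set_root_alias` is a hypothetical helper, not something shipped with Dataverse.

```python
import re


def set_root_alias(template_html: str, alias: str) -> str:
    """Replace the hardcoded "root" alias with the installation's own alias.

    ASSUMPTION: the alias only appears as /dataverse/root or subtree=root.
    """
    # Restrict the alias to simple characters so it is safe in a URL and
    # safe to use as a regex replacement string.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", alias):
        raise ValueError(f"unexpected characters in alias: {alias!r}")
    # \b keeps longer aliases such as /dataverse/rootbeer untouched.
    html = re.sub(r"/dataverse/root\b", f"/dataverse/{alias}", template_html)
    html = re.sub(r"\bsubtree=root\b", f"subtree={alias}", html)
    return html


if __name__ == "__main__":
    # A made-up fragment standing in for the real template.
    sample = (
        '<a href="/dataverse/root?q=&types=dataverses">Browse</a>\n'
        'var url = "/api/search?q=*&subtree=root&sort=date";\n'
    )
    print(set_root_alias(sample, "harvard"))
```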
