The 'Advanced Search' doesn't seem to work for publication date. #4908

Closed
mdmADA opened this issue Jul 31, 2018 · 10 comments

@mdmADA
Contributor

mdmADA commented Jul 31, 2018

Dataverse v4.6.1 although it doesn't seem to work on dataverse.harvard.edu's version either

The 'Advanced Search' doesn't seem to work for publication date.

For example, if I go to dataverse.harvard.edu -> Advanced Search and enter '2018-07-30' in Dataset -> Publication Date, no results are returned (the message that there are no dataverses, datasets, etc. is rendered). The search box at the top left of the page shows dsPublicationDate:2018-07-30.

But there are several datasets with this publication date.

I get the same behaviour on dataverse.ada.edu.au

Is the Advanced Search not intended to be used this way?
Does the date have to be specified in a different format?
Does some other parameter have to be entered?
Does solr have to be reindexed?

The API does not give any results either:

https://dataverse.harvard.edu/api/search?q=*&type=dataset&fq=publicationDate:2018-07-30

{"status":"OK","data":{"q":"*","total_count":0,"start":0,"spelling_alternatives":{},"items":[],"count_in_response":0}}

Perhaps I am not specifying this API call correctly either...

Thanks for any insight.

Just FYI:
ADA wants to be able to send updates to stakeholders at the end of every week to tell them which datasets have been published that week. Using built-in functionality would be ideal, rather than querying the database directly.

@djbrooke
Contributor

djbrooke commented Aug 1, 2018

Thanks for the report @mdmADA. I was able to verify this as well.

@pameyer
Contributor

pameyer commented Aug 1, 2018

@djbrooke This may be related to an earlier indexing issue #4558

@pdurbin
Member

pdurbin commented Aug 7, 2018

Yes, very similar to #4558 in the sense that a field (dsPublicationDate) isn't being indexed. Here's a dump from scripts/search/query of what's in Solr for a dataset on http://phoenix.dataverse.org at the moment:

      {
        "id":"dataset_10",
        "entityId":10,
        "dataverseVersionIndexedBy_s":"4.9.1",
        "identifier":"doi:10.5072/FK2/ZVNFPS",
        "dsPersistentId":"doi:10.5072/FK2/ZVNFPS",
        "persistentUrl":"https://doi.org/10.5072/FK2/ZVNFPS",
        "dvObjectType":"datasets",
        "dateSort":"2018-08-07T00:14:29.222Z",
        "dateFriendly":"Aug 6, 2018",
        "publicationStatus":["Published"],
        "publicationDate":"2018",
        "dsPublicationDate":"2018",
        "isHarvested":false,
        "metadataSource":"Root",
        "datasetVersionId":1,
        "citation":"Spruce, Sabrina, 2018, \"Spruce Goose\", https://doi.org/10.5072/FK2/ZVNFPS, Root, V1",
        "citationHtml":"Spruce, Sabrina, 2018, \"Spruce Goose\", <a href=\"https://doi.org/10.5072/FK2/ZVNFPS\" target=\"_blank\">https://doi.org/10.5072/FK2/ZVNFPS</a>, Root, V1",
        "dsDescriptionValue":["What the Spruce Goose was really made of."],
        "authorName":["Spruce, Sabrina"],
        "authorName_ss":["Spruce, Sabrina"],
        "datasetContactName":["Sabrina Spruce"],
        "nameSort":"Spruce Goose",
        "title":"Spruce Goose",
        "depositor":"Spruce, Sabrina",
        "subtreePaths":["/7",
          "/7/8"],
        "parentId":"8",
        "parentName":"Spruce",
        "_version_":1608097070076395520}

Obviously, we should fix this regression (and #4558 as well). In the meantime, a workaround would be to search on dateSort instead, which I see above is still being indexed. Because dateSort is a date field rather than a string, you have to use the special syntax described in #2291.
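
For anyone who wants to try the dateSort workaround through the Search API, here is a minimal sketch. It assumes the fq parameter accepts a standard Solr range query on dateSort, which I have not confirmed against a live installation. The class and method names are made up for illustration. The mixed brackets make the start inclusive and the end exclusive, so consecutive weekly windows do not count a dataset twice.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper for illustration; not part of Dataverse.
final class DateSortQuery {

    // Solr range on dateSort: start inclusive "[", end exclusive "}".
    static String dateSortFilter(LocalDate from, LocalDate toExclusive) {
        String start = from.atStartOfDay(ZoneOffset.UTC).toInstant().toString();
        String end = toExclusive.atStartOfDay(ZoneOffset.UTC).toInstant().toString();
        return "dateSort:[" + start + " TO " + end + "}";
    }

    // Assumption: fq is handed to Solr the same way as in the API call
    // quoted earlier in this issue. URLEncoder with a Charset needs Java 10+.
    static String searchUrl(String baseUrl, LocalDate from, LocalDate toExclusive) {
        String fq = URLEncoder.encode(dateSortFilter(from, toExclusive), StandardCharsets.UTF_8);
        return baseUrl + "/api/search?q=*&type=dataset&fq=" + fq;
    }

    public static void main(String[] args) {
        LocalDate weekStart = LocalDate.of(2018, 7, 30);
        LocalDate nextWeekStart = weekStart.plusWeeks(1);
        System.out.println(dateSortFilter(weekStart, nextWeekStart));
        System.out.println(searchUrl("https://dataverse.harvard.edu", weekStart, nextWeekStart));
    }
}
```

The first line printed is the raw filter, which could also be pasted into the search box; the second is the URL-encoded API call.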

@djbrooke
Contributor

djbrooke commented Aug 7, 2018

Thanks for the research @pdurbin - I assigned this to you to give a brief overview in our next estimation session.

@pdurbin
Member

pdurbin commented Aug 7, 2018

> a field (dsPublicationDate) isn't being indexed

Duh. Yes it is. It's indexed as "dsPublicationDate":"2018" in the JSON example I gave above.

@mdmADA what this means is that dsPublicationDate does work fine, but your expectation of how it works doesn't match how it actually works. You want to enter a full date like "2018-07-30" but the field only accepts the year, like "2018", and the tool tip doesn't give you any help to know this:

screen shot 2018-08-07 at 3 56 41 pm

If you enter "2018" into "Publication Date", you get results, like this:

screen shot 2018-08-07 at 3 57 45 pm

The easiest fix would be to update the tool tip to mention that "YYYY" is the expected format. @mdmADA is that an acceptable solution? Should the display name for the field on the Advanced Search page be changed from "Publication Date" to "Publication Year"? Please let us know what your thoughts are on the "definition of done" for this issue. Thanks for pointing this out!

If it helps, most of the thinking behind the design of how dates work in Dataverse 4 is captured in the "Dataverse 4.0 Dates (Creation Date, Publication Date, etc.) and Sorting" doc at https://docs.google.com/document/d/1DWsEqT8KfheKZmMB3n_VhJpl9nIxiUjai_AIQPAjiyA/edit?usp=sharing from 2014. Here's a screenshot:

screen shot 2018-08-07 at 4 02 17 pm

From statements like 'The facet for "Publication Date" will be 2014' above, you can tell that YYYY was being used. Here's where the code lives:

murphy:dataverse pdurbin$ git diff src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java
diff --git a/src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java
index fb991f1..1d71aed 100644
--- a/src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java
+++ b/src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java
@@ -1129,6 +1129,7 @@ public class IndexServiceBean {
             Calendar calendar = Calendar.getInstance();
             calendar.setTimeInMillis(dataset.getPublicationDate().getTime());
             int YYYY = calendar.get(Calendar.YEAR);
+            logger.info("addDatasetReleaseDateToSolrDoc YYYY: " + YYYY);
             solrInputDocument.addField(SearchFields.PUBLICATION_DATE, YYYY);
             solrInputDocument.addField(SearchFields.DATASET_PUBLICATION_DATE, YYYY);
         }
murphy:dataverse pdurbin$ 
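
To make the mismatch concrete, here is a small standalone sketch of the same year extraction, outside of Dataverse. The class and method names are hypothetical, and the time zone is pinned to UTC so the result is reproducible (the code above uses the default Calendar instance).

```java
import java.time.Instant;
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical demo class; mirrors the Calendar.YEAR logic in the diff above.
final class PublicationYearDemo {

    // Returns only the year component of the publication timestamp.
    static int indexedYear(Date publicationDate) {
        // Assumption: UTC is used here for determinism; the original relies
        // on the JVM's default time zone.
        Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        calendar.setTimeInMillis(publicationDate.getTime());
        return calendar.get(Calendar.YEAR);
    }

    public static void main(String[] args) {
        Date published = Date.from(Instant.parse("2018-07-30T12:00:00Z"));
        String indexed = String.valueOf(indexedYear(published));
        // The indexed value is just the year, so a full date cannot match it.
        System.out.println("indexed value: " + indexed);
        System.out.println("matches 2018-07-30? " + indexed.equals("2018-07-30"));
        System.out.println("matches 2018? " + indexed.equals("2018"));
    }
}
```

Since only the year reaches Solr, a term query for a full date has nothing to match against.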

pdurbin added the "UX & UI: Design" label (This issue needs input on the design of the UI and from the product owner) Aug 7, 2018
@mdmADA
Contributor Author

mdmADA commented Aug 8, 2018

Thanks for the investigation and response, Phil.

I prefer a label that reflects what is expected, so that nobody has to hover over a tooltip to find out. That matters especially because it is not obvious from the DV interface that the labels have associated tooltips, so users may not know to hover over the label to find out more.

So my preference for "done" would be to have 'Publication Year' as the label with extra info in the tooltip if necessary.

In saying that, though, from a user-as-data-downloader view, is the publication year enough? Would they want month? Day? Example: Let's see what's been published in the past 6 months. Or since Aug. 1.

As a user-as-data-manager, ADA needs to be able to query at the year/month/day level. We have created a solution using metabase to query the database directly but it would be useful to be able to query for datasets published since a specific date, and between 2 dates, through the API if not the UI. We are using an external tool to do that but it would be nice to use Dataverse itself. So maybe that is a feature request...

Thanks again!

M.
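
The two queries mentioned above ("since Aug. 1" and "the past 6 months") could be sketched as Solr range filters on dateSort. This is only a sketch: it assumes dateSort reflects the publication time for published datasets and that the filter is accepted as-is by the search box or the API's fq parameter, neither of which is confirmed in this thread. The helper names are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper for illustration; not part of Dataverse.
final class DateSortRanges {

    // From the start of the given day (UTC) onward; "*" leaves the upper bound open.
    static String publishedSince(LocalDate since) {
        return "dateSort:[" + since.atStartOfDay(ZoneOffset.UTC).toInstant() + " TO *]";
    }

    // Relative window using Solr date math, which Solr evaluates at query time.
    static String publishedInLastMonths(int months) {
        if (months <= 0) {
            throw new IllegalArgumentException("months must be positive");
        }
        return "dateSort:[NOW-" + months + "MONTHS TO NOW]";
    }

    public static void main(String[] args) {
        System.out.println(publishedSince(LocalDate.of(2018, 8, 1)));
        System.out.println(publishedInLastMonths(6));
    }
}
```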

@pdurbin
Member

pdurbin commented Aug 8, 2018

@mdmADA you're welcome. For the tooltip problem, comments on #3925 are very welcome.

For the main issue, we're probably going to estimate it tomorrow so we're going to have to decide the scope of it. I'll try to record the decision here but I may call upon you to create follow up issues depending on how big of a change we decide to take on for this issue. I appreciate all of your thoughts on this!

@djbrooke
Contributor

djbrooke commented Aug 8, 2018

We'll update the tool tip and the field name to reflect that this is year only, not a specific day. If we want to provide the option to search for a specific day, that can be added as another issue in GitHub.

pdurbin self-assigned this Aug 10, 2018
pdurbin added a commit that referenced this issue Aug 10, 2018
Except for the Metadata tab on the dataset page, where we show YYYY-MM-DD.
@pdurbin
Member

pdurbin commented Aug 10, 2018

In fd22db6 as part of pull request #4942 I changed "Publication Date" to "Publication Year" and updated the tooltip. Please note that because YYYY-MM-DD format appears on the metadata tab of datasets, I left that as "Publication Date". Here are some screenshots:

screen shot 2018-08-10 at 9 52 26 am

screen shot 2018-08-10 at 9 52 34 am

screen shot 2018-08-10 at 9 52 47 am

I'm moving this to code review at https://waffle.io/IQSS/dataverse

@mdmADA please note that the scope of this change is pretty narrow. Please keep an eye on this issue and think about other features or changes you'd like in the future. Thanks again for reporting this usability issue!

kcondon self-assigned this Aug 13, 2018
kcondon added a commit that referenced this issue Aug 13, 2018
change Publication Year to Publication Date #4908
kcondon closed this as completed Aug 13, 2018
@pdurbin
Member

pdurbin commented Aug 13, 2018

@mdmADA pull request #4942 has been merged and you can play around with the new behavior at http://phoenix.dataverse.org or in your own build of the "develop" branch. Again, we kept the scope of this issue small so please create follow up issues as needed for other date-related issues that are on your mind. Thanks!
