Search API: document how the dateSort field support date range queries (i.e. datasets published in the last week) #2291

Closed
pdurbin opened this Issue Jun 26, 2015 · 2 comments

@pdurbin
Member

pdurbin commented Jun 26, 2015

@erinspace from https://osf.io/share/ asked "Is there a way we can query for all results within a certain date range? Ideally, at least at day granularity? For example, we’d love to be able to query for all results between 2015-05-01 and 2015-05-10."

A similar question came up at the Dataverse Community Meeting 2015. I'm pretty sure it came up in the context of the upcoming Archivematica integration as well.

Anyway, below is most of an email I sent to people at @CenterForOpenScience as an answer. It seems to be working for their use case, so I should probably document this as part of the Search API.

Related is @markwilkinson asking about which fields can be searched. This "dateSort" field is not driven by dynamic metadata. Rather, it's hard coded in https://github.com/IQSS/dataverse/blob/4.0/src/main/java/edu/harvard/iq/dataverse/search/SearchFields.java#L155 but I don't have any plans to change it. Data driven fields (from metadata blocks) are currently only exposed via an API that's blocked from the outside. Making that API public should probably be a separate ticket and is related to #1510.


The email to @CenterForOpenScience

I don't document the "dateSort" field at http://guides.dataverse.org/en/latest/api/search.html but it's what we use internally to mean "create date or published date" (we use published date when it's available). It's searchable, a real* date type (so we can do range queries on it), and what we use for sorting on the home page (we default to "newest first").

There's granularity down to seconds at least, from what I can tell. See also https://cwiki.apache.org/confluence/display/solr/Working+with+Dates
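
For a rolling window such as "published in the last week", the range can be computed instead of typed by hand. Here is a minimal sketch, assuming the UTC, second-precision timestamp format that the Solr page above describes. The helper names `date_sort_filter` and `last_week_filter` are made up for illustration and are not part of any Dataverse client. It also assumes colons do not need backslash escaping inside a Solr range expression, which is worth verifying against your installation.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout assumed for Solr date fields: UTC, second granularity.
SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def date_sort_filter(start, end):
    """Return an fq value restricting dateSort to the range [start, end].

    Both arguments should be timezone-aware datetimes; they are converted
    to UTC so that the trailing "Z" in the output is accurate.
    """
    def fmt(moment):
        return moment.astimezone(timezone.utc).strftime(SOLR_FORMAT)

    return "dateSort:[{} TO {}]".format(fmt(start), fmt(end))


def last_week_filter(now=None):
    """Return an fq value covering the seven days up to `now` (default: current time)."""
    now = now or datetime.now(timezone.utc)
    return date_sort_filter(now - timedelta(days=7), now)
```

The returned string still needs URL encoding before it goes into the `fq` parameter, for example with `urllib.parse.urlencode({"fq": last_week_filter()})`.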

Others have asked for something similar so we should probably make a ticket about this... make range queries over dates a supported and documented feature of the Search API.

I kind of assume only datasets are of interest (not files or dataverses) but you can include all types. Note that depending on the number of results, you might need to iterate with the "start" cursor per the API Guide:

curl -s --globoff "https://dataverse.harvard.edu/api/search?key=$API_TOKEN&type=dataset&sort=date&order=asc&q=*&fq=dateSort:[2015-05-01T00\:00\:00Z+TO+2015-05-10T00\:00\:00Z]" | jq '.data.items[] | {name,published_at}' | head -12

{
  "name": "Replication Data for: A Seat in China’s Rubber Stamp Parliament: What is it Worth and Which Companies Receive One?",
  "published_at": "2015-05-01T02:38:54Z"
}
{
  "name": "Integrated Behavioural and Biological Assessment - III (Karnataka - FSWs)",
  "published_at": "2015-05-01T03:42:54Z"
}
{
  "name": "Integrated Behavioural and Biological Assessment -II (Karnataka - High Risk Groups)",
  "published_at": "2015-05-01T03:44:24Z"
}
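
Iterating with "start" could look roughly like the sketch below. It is not an official client: `fetch_page` is a hypothetical caller-supplied function that performs the HTTP request and returns the decoded "data" object, and the `per_page` parameter and `total_count` field are assumptions taken from the API Guide rather than from the example above.

```python
def iter_search_items(fetch_page, per_page=10):
    """Yield every search result by advancing the "start" offset.

    fetch_page(start, per_page) is a hypothetical callable that returns the
    "data" object of one Search API response as a dict with an "items" list.
    """
    start = 0
    while True:
        data = fetch_page(start, per_page)
        items = data.get("items", [])
        if not items:
            # An empty page means there is nothing left to read.
            break
        for item in items:
            yield item
        start += len(items)
        # total_count is assumed to be present; if it is missing, keep
        # going until an empty page is returned.
        if start >= data.get("total_count", start + 1):
            break
```

Keeping the HTTP call outside the loop means the same pager works with any HTTP library, and it can be exercised against canned responses without touching a live server.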

Related: #70 and #370

Contributor

posixeleni commented Jan 21, 2016

@pdurbin would it be possible for you to add this information into the Search API guide? I have a feeling more and more people will want to be able to search on this to only retrieve the freshest published data for their integrations with us.

@scolapasta scolapasta removed this from the Not Assigned to a Release milestone Jan 28, 2016

raprasad added a commit that referenced this issue Sep 27, 2016

#2291 troubleshooting where file save is called from ingestServiceBea…
…n -- e.g. why is it not going through DataFileServiceBean.save

raprasad added a commit that referenced this issue Oct 4, 2016

Member

pdurbin commented Jun 28, 2018

We seem to be surviving just fine without this documentation. Closing! 😄
