This repository has been archived by the owner on Jun 23, 2020. It is now read-only.

Provide mechanism to get info on total number of results #297

Closed
acka47 opened this issue Jan 13, 2014 · 17 comments

@acka47
Contributor

acka47 commented Jan 13, 2014

Currently, API users don't get any information about the total number of results a specific query yields. This is no problem if the number of hits is < 50, as all results show up in one document. If the number of hits is > 50, one would have to page through all results and add them up to learn the total number (which would add unnecessary API load).
Thus, the total number of hits should be delivered by default with every query response, or users should have the possibility to ask for the total number of hits.
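
To illustrate the cost of the workaround described above, here is a minimal sketch of counting hits by paging. `fetch_page` and `fake_fetch` are hypothetical stand-ins for an API call, not part of lobid; only the page size of 50 comes from this issue.

```python
from typing import Callable, List

PAGE_SIZE = 50  # page size mentioned in this issue


def count_by_paging(fetch_page: Callable[[int, int], List[dict]],
                    page_size: int = PAGE_SIZE) -> int:
    """Count all hits by requesting page after page until a short page arrives."""
    total = 0
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        total += len(page)
        if len(page) < page_size:  # a short (or empty) page marks the end
            return total
        offset += page_size


# Hypothetical stand-in for the API: 120 fake hits, one request per page.
hits = [{"id": i} for i in range(120)]
requests_made = []


def fake_fetch(offset: int, size: int) -> List[dict]:
    requests_made.append(offset)
    return hits[offset:offset + size]


total = count_by_paging(fake_fetch)
print(total, len(requests_made))
```

Every additional 50 hits costs one more request, which is the API load a total count in the response would avoid.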

@acka47
Contributor Author

acka47 commented Jan 15, 2014

As said in #230 (comment), I would look into using the Sindice Search Vocabulary for providing information on a query and its results, with the total number of results being one part of this information.

@literarymachine

Bump - when using the API to discover link targets for workflows such as @edoweb's, this is a really important feature.

@fsteeg
Contributor

fsteeg commented May 14, 2014

+1, we also need this for NWBib.

Technically this is no problem at all, the only issue is how to serve it as RDF.

How do we make statements about the query? Can we simply add a generic statement like:

<http://sindice.com/vocab/search#Query> <http://sindice.com/vocab/search#totalResults> 42 .

Or do we need to include the actual query in some way?

@literarymachine

I would argue that the query itself can be referenced by its URL, e.g. http://api.lobid.org/resource?name=Faust. So maybe this would be cleaner?

<http://api.lobid.org/resource?name=Faust> <http://sindice.com/vocab/search#totalResults> 42 .

Also see #230 (comment)
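
A minimal sketch of how such a statement could be generated with the query URL as subject. The function name is hypothetical, and typing the count as `xsd:integer` in N-Triples form is an assumption made for the example.

```python
TOTAL_RESULTS = "http://sindice.com/vocab/search#totalResults"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def total_results_statement(query_url: str, total: int) -> str:
    """Build one N-Triples line stating the total number of hits for a query URL."""
    # Assumption: the count is serialized as an xsd:integer typed literal.
    return f'<{query_url}> <{TOTAL_RESULTS}> "{total}"^^<{XSD_INTEGER}> .'


print(total_results_statement("http://api.lobid.org/resource?name=Faust", 42))
```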

@acka47
Contributor Author

acka47 commented May 15, 2014

The first thing we should do before implementing this is discontinuing support for other RDF serializations than JSON-LD (and RDFa) in query results (as @jschnasse already proposed some time ago).

The results of curl -H "Accept: text/turtle" http://lobid.org/resource?name=Faust are of no use anyway, because they don't deliver information on ranking. JSON-LD already is the default, see curl http://lobid.org/resource?name=Faust and it already provides ranking information through delivering an array. Conversely, this means that content negotiation for other RDF serializations (turtle, N-Triples) should only be supported for querying resource URIs like http://lobid.org/resource/HT014834500. (We could differentiate these two options (conneg for RDF vs. no conneg for RDF) by the URL schemes http://api.lobid.org vs. http://lobid.org.)

This said, here's my first take on how to provide the search results in JSON-LD:

{
    "@context": { 
        "search": "http://sindice.com/vocab/search#",
        "issued": {
            "@id": "http://purl.org/dc/terms/issued",
            "@type": "xsd:dateTime"
            }
    },
    "@type": "search:Query",
    "@id": "http://api.lobid.org/resource?name=Faust",
    "issued": "2014-05-15T09:34:51",
    "search:first": "http://api.lobid.org/resource?name=Faust",
    "search:last": "http://api.lobid.org/resource/$lastResultPage",
    "search:next": "http://api.lobid.org/resource/$secondResultPage",
    "search:itemsPerPage": "50",
    "search:totalResults": "4273",
    "search:result": [
        { "@id": "$FirstSearchResultInNestedJSON-LD" },
        { "@id": "$SecondSearchResultInNestedJSON-LD" },
        { "further": "results" }
    ]
}

(This example applies to the `?format=full` query. For `?format=id` etc. the result would be adjusted accordingly.)

@literarymachine

> The first thing we should do before implementing this is discontinuing support for other RDF serializations than JSON-LD (and RDFa) in query results (as @jschnasse already proposed some time ago).

The first thing you should do is ask if someone uses other RDF serializations. Second of all - if search results are valid JSON-LD, what stops you from providing other serializations?

<http://api.lobid.org/resource?name=Faust>
  a search:Query ;
  dc:issued "2014-05-15T09:34:51" ;
  search:first <http://api.lobid.org/resource?name=Faust> ;
  search:last <http://api.lobid.org/resource/$lastResultPage> ;
  search:next <http://api.lobid.org/resource/$secondResultPage> ;
  search:itemsPerPage "50" ;
  search:totalResults "4273" ;
  search:result (
    <http://lobid.org/resource/HT014009200>
    <http://lobid.org/resource/HT013815600>
    <http://lobid.org/resource/CT003041072>
    ...
  ) .

Yes, this uses RDF lists, which are not the prettiest thing in the world. But for the JSON-LD to deliver the proper result order explicitly, and not only implicitly (that is, based on JSON syntax), the results would have to be modeled as a list anyway.
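
For readers unfamiliar with RDF lists, here is a rough sketch of what the `( ... )` collection above stands for in triples. The helper name and the blank node labels are made up for the example; the `rdf:first`, `rdf:rest` and `rdf:nil` terms are the standard RDF collection vocabulary.

```python
from typing import List, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
Triple = Tuple[str, str, str]


def rdf_list_triples(items: List[str]) -> Tuple[str, List[Triple]]:
    """Return (head node, triples) for an RDF collection holding `items` in order."""
    if not items:
        return f"<{RDF}nil>", []  # the empty list is rdf:nil itself
    triples: List[Triple] = []
    for i, item in enumerate(items):
        node = f"_:b{i}"  # one blank node per list cell
        rest = f"_:b{i + 1}" if i + 1 < len(items) else f"<{RDF}nil>"
        triples.append((node, f"<{RDF}first>", f"<{item}>"))
        triples.append((node, f"<{RDF}rest>", rest))
    return "_:b0", triples


head, triples = rdf_list_triples([
    "http://lobid.org/resource/HT014009200",
    "http://lobid.org/resource/HT013815600",
    "http://lobid.org/resource/CT003041072",
])
for s, p, o in triples:
    print(s, p, o, ".")
```

The order survives because each cell points to the next one, which is why it costs two triples per result.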

@acka47
Contributor Author

acka47 commented May 15, 2014

@literarymachine Do you use the RDF serializations for lobid search results? :)

@literarymachine

Yes, I currently use ntriples.

@fsteeg
Contributor

fsteeg commented May 15, 2014

Implementing the changes suggested by @acka47 would imply an incompatible 2.0 API release, since the basic response structure would be completely different: we'd be returning a JSON object as the top level element, instead of an array.

The approach that I started implementing yesterday is to add a new entity for the query itself as the first entity in the response. In JSON-LD, this would look like this:

[{
  "@id": "http://api.lobid.org/resource?name=Faust&format=full",
  "http://sindice.com/vocab/search#totalResults": 4218
},{
  "@graph":
  ...
}]

Removing the RDF search results has the same issue: we'd break the API, so we can only do this in a 2.0 release. The approach above would, for example, yield this if N-Triples are requested:

curl -L --header "Accept: text/plain" "http://localhost:9000/resource?name=Faust"

<http://api.lobid.org/resource?name=Faust> <http://sindice.com/vocab/search#totalResults> "4218"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://d-nb.info/gnd/118524461> <http://d-nb.info/standards/elementset/gnd#preferredNameForThePerson> "Delacroix, Eug\u00E8ne"^^<http://www.w3.org/2001/XMLSchema#string> .
...
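
On the client side, reading the total from the JSON-LD array could look roughly like this. It is a sketch only: the sample payload is made up to mirror the structure shown above, and the function name is hypothetical.

```python
import json
from typing import Optional

TOTAL_RESULTS = "http://sindice.com/vocab/search#totalResults"


def total_results(response_text: str) -> Optional[int]:
    """Return the total hit count from a JSON-LD array response, or None if absent."""
    for entity in json.loads(response_text):
        # The query entity is the one carrying the totalResults property.
        if isinstance(entity, dict) and TOTAL_RESULTS in entity:
            return int(entity[TOTAL_RESULTS])
    return None


# Made-up sample mirroring the response structure described above.
sample = json.dumps([
    {"@id": "http://api.lobid.org/resource?name=Faust&format=full",
     TOTAL_RESULTS: 4218},
    {"@graph": []},
])
print(total_results(sample))
```

Looking the entity up by its property rather than by position keeps the client working even if the query entity is not the first element.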

@acka47
Contributor Author

acka47 commented May 15, 2014

As interim solution the proposal by @fsteeg would be ok with me. Would this also be ok for you, @literarymachine, for the time being?

@literarymachine

> As interim solution the proposal by @fsteeg would be ok with me. Would this also be ok for you, @literarymachine, for the time being?

+1 (see #297 (comment))

We should probably discuss any further (and incompatible) modifications of the result format somewhere else. Just a short addition to #297 (comment): marking the search:result in the JSON-LD as an ordered list can be done from within the context:

"@context": {
    "search": "http://sindice.com/vocab/search#",
    "issued": {
        "@id": "http://purl.org/dc/terms/issued",
        "@type": "xsd:dateTime"
    },
    "search:result": {
        "@container": "@list"
    }
}

@fsteeg
Contributor

fsteeg commented May 15, 2014

Deployed for testing:

http://test.lobid.org/resource?name=Typee&format=full&size=1

curl -L --header "Accept: text/plain" "http://test.lobid.org/resource?name=Typee&size=1"

@literarymachine

+1

@fsteeg
Contributor

fsteeg commented May 19, 2014

Deployed to production, closing:

http://lobid.org/resource?name=Typee&format=full&size=1

curl -L --header "Accept: text/plain" "http://lobid.org/resource?name=Typee&size=1"

@fsteeg fsteeg closed this as completed May 19, 2014
@fsteeg
Contributor

fsteeg commented May 19, 2014

Reopening, does not work correctly for RDF+XML:

curl -L -H "Accept: application/rdf+xml" "http://test.lobid.org/resource/HT009442672"

Has two root elements, should have only one. See possibly similar issue in hbz/oerworldmap.

@fsteeg
Contributor

fsteeg commented May 19, 2014

Deployed fix to staging (don't add query info on path-style requests for single resources):

http://test.lobid.org/resource?name=Typee&format=full&size=1
curl -L -H "Accept: text/plain" "http://test.lobid.org/resource?name=Typee&size=1"
curl -L -H "Accept: application/rdf+xml" "http://test.lobid.org/resource/HT009442672"
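
The distinction behind this fix can be sketched as follows. This is not the deployed code; it assumes that search requests are recognized by a non-empty query string, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def wants_query_info(url: str) -> bool:
    """True for search requests (non-empty query string), False for path-style resource requests."""
    # Assumption: a query string marks a search; a bare path names a single resource.
    return bool(urlsplit(url).query)


print(wants_query_info("http://test.lobid.org/resource?name=Typee&size=1"))
print(wants_query_info("http://test.lobid.org/resource/HT009442672"))
```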

@fsteeg
Contributor

fsteeg commented May 20, 2014

Deployed to production, closing.

Opened new issue for RDF+XML results for multiple hits with one root: #463 (independent of the topic of this issue; multiple root elements were already returned before the total number of results was added).
