Wrong count previews in owner facet #207

fsteeg · 2017-02-03T10:30:07Z

Since the owner facet is built from exemplar aggregations, and aggregation requests have a limited size, the owner counts are wrong: they cover only the owners of the X most frequent exemplars, each of which actually has a count of 1. To fix this, we have to make the aggregation processing efficient enough to allow an aggregation request with unlimited size for exemplars.
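
To illustrate the effect described above, here is a minimal Python sketch that simulates a size-limited aggregation over item IDs and derives owner counts from the returned buckets. It is not the lobid implementation: the helper names `owner_of` and `owner_counts_from_item_buckets` are hypothetical, the sample IDs are made up, and the ID layout `<resource>:<ISIL>:<signature>` is an assumption modelled on the item ID quoted further down in this thread.

```python
from collections import Counter


def owner_of(item_id: str) -> str:
    """Extract the owner ISIL from an item ID.

    Assumes the layout '<resource>:<ISIL>:<signature>' in the last path segment.
    """
    return item_id.split("/")[-1].split(":")[1]


def owner_counts_from_item_buckets(item_ids, size):
    """Simulate a terms aggregation over item IDs that returns only `size` buckets,
    then sum the bucket counts per owner."""
    item_buckets = Counter(item_ids).most_common(size)
    owners = Counter()
    for item_id, count in item_buckets:
        owners[owner_of(item_id)] += count
    return owners


# Hypothetical sample data: every item ID is unique, so every bucket has count 1.
items = [f"http://lobid.org/items/R{i}:DE-6:S{i}" for i in range(5)]
items += [f"http://lobid.org/items/R{i}:DE-38:S{i}" for i in range(5, 8)]

truncated = owner_counts_from_item_buckets(items, size=4)
full = owner_counts_from_item_buckets(items, size=len(items))

print("limited size:", dict(truncated))
print("unlimited size:", dict(full))
```

With a limited size, the owner counts can never add up to more than the number of buckets requested, no matter how many items match the query.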

fsteeg · 2017-02-03T10:31:21Z

This can be reproduced with any query returning a high result count, e.g. the owner facet for:
http://lobid.org/resources/search?q=k%C3%B6ln

fsteeg · 2017-02-06T10:12:28Z

The basic problem here is that we are faceting over a field (the item owner) that's not in our data. This approach won't work for the entire catalog: if we query everything, we'd have to get all items, and create the owner facet from that.

Instead, I suggest we add an exemplar.owner field, so for example in http://lobid.org/resources/HT012213725?format=json we'd have:

"exemplar": [{
  "id": "http://lobid.org/items/HT012213725:DE-6:ZD%207381#!",
  "owner": "http://lobid.org/organisations/DE-6",
  "label": "lobid Bestandsressource"
}],

That way, we could simply facet over exemplar.owner directly, which would give us all owners (not all items, as with the current facet, which is based on exemplar.id).
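
A facet over the proposed field could be requested roughly as in the following Python sketch, which only builds the Elasticsearch request body. The function name `owner_facet_request`, the `query_string` query, and the default bucket size are assumptions for illustration; `exemplar.owner` is the field proposed in this comment and does not exist in the data yet.

```python
import json


def owner_facet_request(query: str, size: int = 100) -> dict:
    """Build a search body that returns no hits, only owner buckets.

    Assumes 'exemplar.owner' is indexed as an exact (non-analyzed) value.
    """
    return {
        "size": 0,  # no search hits needed, only the aggregation
        "query": {"query_string": {"query": query}},
        "aggs": {
            "owner": {
                # one bucket per distinct owner URI
                "terms": {"field": "exemplar.owner", "size": size}
            }
        },
    }


body = owner_facet_request("köln")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Because the number of distinct owners is far smaller than the number of items, a moderate bucket size would be enough to cover all of them, which is the point of aggregating on the owner rather than on exemplar.id.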

What do you think @dr0i @acka47? If it makes no sense to expose the owner in the data (but I do think it's useful for API usage), we could also create an internal Elasticsearch field or a custom aggregation. If we do want to expose it, we should add it on the Metafacture level.

acka47 · 2017-02-06T10:58:43Z

+1 from me. I already proposed embedding item information in the instance data, see #140. We might just reopen that issue.

dr0i · 2017-02-07T09:50:00Z

A child aggregation on our data for the query "köln" seems to yield a plausible result:

"hits" : {
  "total" : 569808,
  ...
"aggregations" : {
  "items" : {
    "doc_count" : 1686515,
    "top-isil" : {
      ...
      "buckets" : [ {
        "key" : "http://lobid.org/organisations/DE-38",
        "doc_count" : 172288
      } ...

I can imagine that the factor of 3 in the ratio of items to resources results from libraries holding more than one item. Is this acceptable, or do you really want a ratio of 1? However, I doubt that moving the data from the child into the parent would give that 1:1 ratio: a resource would then carry e.g. three identical exemplar.owner.id values (reflecting multiple holdings of a manifestation, aka "resource"), and an aggregation over them would not yield it without tinkering with a filter or something similar.
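
The difference between the two ways of counting can be sketched in plain Python. This says nothing about how Elasticsearch itself treats repeated values; the sample records and the function names `items_per_owner` and `resources_per_owner` are hypothetical.

```python
from collections import Counter

# Hypothetical resources: R1 has two items at DE-38 and one at DE-6.
resources = [
    {"id": "R1", "exemplar": [{"owner": "DE-38"}, {"owner": "DE-38"}, {"owner": "DE-6"}]},
    {"id": "R2", "exemplar": [{"owner": "DE-38"}]},
]


def items_per_owner(resources):
    """Count every holding, so multiple items of one resource all count."""
    return Counter(e["owner"] for r in resources for e in r["exemplar"])


def resources_per_owner(resources):
    """Count each resource at most once per owner by deduplicating owners first."""
    return Counter(
        owner
        for r in resources
        for owner in {e["owner"] for e in r["exemplar"]}
    )


print("items:", dict(items_per_owner(resources)))
print("resources:", dict(resources_per_owner(resources)))
```

Only the second count is comparable to the number of hits a user gets when filtering the result list by that owner.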

fsteeg · 2017-03-01T15:57:35Z

Reopening, see discussion starting in #278 (comment).

acka47 · 2021-03-25T09:16:13Z

This came up again, see #1169, where @hagbeck wrote:

From the Aleph-based index we're getting 1,334,514 records [1]
The facet "Bestand in Bibliotheken" in the Aleph-based index shows 1,471,170 records.

[1] http://lobid.org/resources/search?owner=http%3A%2F%2Flobid.org%2Forganisations%2FDE-290%23%21&aggregations=owner

I pointed out this problem in #278 (comment):

Isn't the underlying mechanism that the facet gives the number of items while the query result lists the FRBR manifestations (or in bibframe-speak: instances)?

TobiasNx · 2023-01-16T14:04:44Z

This came up again in the context of the comparison of ALMA and ALEPH resources of UB Münster. Ideally this should be fixed before ALMA Fix replaces ALEPH-Morph. #1601

acka47 · 2023-03-10T13:34:34Z

@blackwinter will check whether this should be added to the DigiBib milestone.

blackwinter · 2023-03-14T14:22:18Z

We would not be affected by this issue.

fsteeg added bug ready labels Feb 3, 2017

fsteeg self-assigned this Feb 3, 2017

fsteeg added working and removed ready labels Feb 6, 2017

fsteeg added a commit that referenced this issue Feb 6, 2017

Use Aggregations object, don't parse full JSON response (see #207)

0466619

fsteeg added a commit that referenced this issue Feb 6, 2017

Cache individual aggregation requests (see #207)

8e653a6

fsteeg assigned acka47 and dr0i and unassigned fsteeg Feb 6, 2017

fsteeg mentioned this issue Feb 6, 2017

Fix owner aggregation #214

Closed

fsteeg added review and removed working labels Feb 6, 2017

fsteeg unassigned acka47 Feb 6, 2017

dr0i added working and removed ready labels Feb 7, 2017

dr0i assigned fsteeg, acka47 and ChristophEwertowski and unassigned dr0i Feb 7, 2017

dr0i added review and removed working labels Feb 7, 2017

fsteeg added working and removed working labels Feb 13, 2017

fsteeg removed their assignment Feb 13, 2017

dr0i added deploy and removed review labels Feb 14, 2017

fsteeg closed this as completed in aa19036 Feb 14, 2017

fsteeg removed the deploy label Feb 14, 2017

fsteeg added a commit that referenced this issue Feb 17, 2017

Use Aggregations object, don't parse full JSON response (see #207)

c452a34

fsteeg added a commit that referenced this issue Feb 17, 2017

Cache individual aggregation requests (see #207)

8b0ac36

ChristophEwertowski mentioned this issue Mar 1, 2017

Wrong facets for multi word queries #278

Closed

fsteeg reopened this Mar 1, 2017

fsteeg added the ready label Mar 1, 2017

fsteeg self-assigned this Mar 1, 2017

fsteeg removed the ready label Jun 28, 2017

fsteeg added the ready label Jun 11, 2018

acka47 added this to Ready in lobid board Apr 8, 2019

acka47 removed the ready label Apr 9, 2019

acka47 moved this from Ready to Backlog in lobid board Dec 3, 2020

acka47 mentioned this issue Mar 25, 2021

facet value "Bestand in Bibliotheken" differs from factual results #1169

Closed

acka47 mentioned this issue Jan 16, 2023

Different aggregated results in Bestände-Facette and Results #1601

Closed

acka47 assigned blackwinter and unassigned fsteeg Mar 10, 2023

blackwinter assigned fsteeg and unassigned blackwinter Mar 14, 2023
