Investigate why some cells have no values #58

Robsteranium · 2021-04-14T14:19:55Z

With all geo codelists selected, some rows (2 and 5) have no cell values in the geo column.

Not sure how this could be. Maybe their dimension points to a codelist but they don't use any of its values...

Robsteranium · 2021-04-15T10:02:14Z

This is being caused by some codes being in more than one scheme, e.g.

{
  "@id": "http://data.europa.eu/nuts/code/UKC",
  "label": "NORTH EAST (ENGLAND)",
  "scheme": [
    "http://data.europa.eu/nuts/scheme/2010",
    "http://data.europa.eu/nuts/scheme/2016",
    "http://data.europa.eu/nuts/scheme/2013",
    "data/gss_data/trade/ons-international-trade-in-services-by-subnational-areas-of-the-uk#scheme/location",
    "data/gss_data/trade/ons-international-exports-of-services-from-subnational-areas-of-the-uk#scheme/service-origin-geography",
    "data/gss_data/trade/ons-quarterly-country-and-regional-gdp#scheme/reference-area"
  ]
}

This breaks the assumption that we can group codes by codelist in the cells.

We could of course still do this, but then the same UKC code would appear 6 times (once under each scheme). Indeed we could already see that the same code is used in other dataset-specific schemes in other rows - that's the very purpose of the table!
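To make the duplication concrete, here's a minimal Python sketch (not the app's code; `group_by_scheme` is a hypothetical helper, and the record is the UKC example from above). Grouping by scheme lists the code once under every scheme it belongs to:

```python
from collections import defaultdict

# The UKC record from the example above.
ukc = {
    "@id": "http://data.europa.eu/nuts/code/UKC",
    "label": "NORTH EAST (ENGLAND)",
    "scheme": [
        "http://data.europa.eu/nuts/scheme/2010",
        "http://data.europa.eu/nuts/scheme/2016",
        "http://data.europa.eu/nuts/scheme/2013",
        "data/gss_data/trade/ons-international-trade-in-services-by-subnational-areas-of-the-uk#scheme/location",
        "data/gss_data/trade/ons-international-exports-of-services-from-subnational-areas-of-the-uk#scheme/service-origin-geography",
        "data/gss_data/trade/ons-quarterly-country-and-regional-gdp#scheme/reference-area",
    ],
}


def group_by_scheme(codes):
    """Hypothetical helper: list each code id under every scheme it is in."""
    groups = defaultdict(list)
    for code in codes:
        for scheme in code["scheme"]:
            groups[scheme].append(code["@id"])
    return dict(groups)


groups = group_by_scheme([ukc])
# Count how many times the single UKC code shows up across all groups.
appearances = sum(ids.count(ukc["@id"]) for ids in groups.values())
print(len(groups), appearances)
```

One code, six groups, six appearances in the cell.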

We could try to filter the list of schemes to those that are relevant, e.g. by removing dataset-specific schemes that belong to other datasets. Even if we could easily determine this, we would still have the multiple harmonised schemes (here one per NUTS version). This might be useful information, but it's not particularly relevant to the dataset search/comparison, because the filters themselves already express all the user cares about regarding codelist versions (whether their code of interest is present).
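A rough Python sketch of that filtering idea, under a loud assumption: that a dataset-specific scheme can be recognised because its URI shares the part before `#` with the dimension's URI. That convention is only inferred from the example URIs here, and `dataset_of` and `relevant_schemes` are hypothetical names, not anything in the codebase.

```python
def dataset_of(uri):
    """Assumption: the text before '#' identifies the owning dataset.

    URIs without '#' (e.g. the NUTS schemes) are treated as shared.
    """
    return uri.split("#", 1)[0] if "#" in uri else None


def relevant_schemes(schemes, dimension_uri):
    """Keep shared schemes plus those owned by the dimension's dataset."""
    dataset = dataset_of(dimension_uri)
    kept = []
    for scheme in schemes:
        owner = dataset_of(scheme)
        if owner is None or owner == dataset:
            kept.append(scheme)
    return kept


schemes = [
    "http://data.europa.eu/nuts/scheme/2010",
    "http://data.europa.eu/nuts/scheme/2016",
    "http://data.europa.eu/nuts/scheme/2013",
    "data/gss_data/trade/ons-international-trade-in-services-by-subnational-areas-of-the-uk#scheme/location",
    "data/gss_data/trade/ons-international-exports-of-services-from-subnational-areas-of-the-uk#scheme/service-origin-geography",
    "data/gss_data/trade/ons-quarterly-country-and-regional-gdp#scheme/reference-area",
]
dimension = "data/gss_data/trade/ons-quarterly-country-and-regional-gdp#dimension/reference-area"

kept = relevant_schemes(schemes, dimension)
print(len(kept))
```

Even with the other datasets' schemes dropped, four schemes remain for UKC (three NUTS versions plus the dataset's own), so the code would still be repeated.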

We might just need to remove the codelist grouping altogether. This grouping is less important given that mixing schemes within datasets will be rarer than between them. We could still possibly provide this information (e.g. with a popover) but not use it to structure the layout. Instead we'd just show an ellipsised list of codes.

The facet match would then have the codelist level removed, looking instead like:

{:facets
  ({:name "Geography",
    :dimensions
    ({:ook/uri
      "data/gss_data/trade/ons-quarterly-country-and-regional-gdp#dimension/reference-area",
      :codes
      ({:ook/uri "http://data.europa.eu/nuts/code/UKC",
        :ook/type "skos:Concept",
        :priority ["2" "6"],
        :label "NORTH EAST (ENGLAND)",
        :narrower
        ["http://data.europa.eu/nuts/code/UKC1"
         "http://data.europa.eu/nuts/code/UKC2"],
        :broader
        ["http://data.europa.eu/nuts/code/UK"
         "data/gss_data/trade/international-trade-in-services-by-subnational-areas-of-the-uk#concept-scheme/location/nuts"],
        :notation "UKC",
        :scheme
        ["http://data.europa.eu/nuts/scheme/2010"
         "http://data.europa.eu/nuts/scheme/2016"
         "http://data.europa.eu/nuts/scheme/2013"
         "data/gss_data/trade/ons-international-trade-in-services-by-subnational-areas-of-the-uk#scheme/location"
         "data/gss_data/trade/ons-international-exports-of-services-from-subnational-areas-of-the-uk#scheme/service-origin-geography"
         "data/gss_data/trade/ons-quarterly-country-and-regional-gdp#scheme/reference-area"],
        :used "false"})})})}

In fact we might like to enrich this with codelist labels if we're going to show them in a popover.

Robsteranium · 2021-04-15T12:55:33Z

Ok, working this through... it gets confusing because you can mix schemes by facet even with 1:1 dimension:codelist because the facet combines dimensions. We can distinguish these using the dimensions as the grouping variable (rather than codelists as originally planned).
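As a sketch of what grouping by dimension buys us (illustrative Python only; `group_codes_by_dimension` and the shape of the `matches` records are assumptions, and the second dimension URI below is made up for the example): each code is listed once per dimension, however many schemes it belongs to.

```python
from collections import defaultdict


def group_codes_by_dimension(matches):
    """Hypothetical helper: list each matched code once under its dimension."""
    groups = defaultdict(list)
    for match in matches:
        codes = groups[match["dimension"]]
        if match["code"] not in codes:
            codes.append(match["code"])
    return dict(groups)


UKC = "http://data.europa.eu/nuts/code/UKC"
REF_AREA = "data/gss_data/trade/ons-quarterly-country-and-regional-gdp#dimension/reference-area"
OTHER_DIM = "example-dataset#dimension/location"  # made-up URI for illustration

# One facet ("Geography") combining two dimensions; UKC is matched via
# several schemes, but the scheme no longer affects the layout.
matches = [
    {"dimension": REF_AREA, "code": UKC, "scheme": "http://data.europa.eu/nuts/scheme/2010"},
    {"dimension": REF_AREA, "code": UKC, "scheme": "http://data.europa.eu/nuts/scheme/2016"},
    {"dimension": OTHER_DIM, "code": UKC, "scheme": "http://data.europa.eu/nuts/scheme/2013"},
]

by_dimension = group_codes_by_dimension(matches)
print(by_dimension)
```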

Robsteranium · 2021-04-22T10:14:37Z

We've now used dimension as a grouping variable and lifted the query size limits. This seems to fill most of the blanks but some remain.

e.g. this search for Germany doesn't seem to include an example code for the "ONS UK total trade" dataset. The count is correct (it filters observations for Germany) but the cell is blank and the link is wrong.

Robsteranium · 2021-04-22T15:29:02Z

This can sometimes be caused by sparsity, e.g. this search shows a dataset which does include "BOP Services" and "Exports", but the first-matched observation for "BOP Services: Net financial transactions" doesn't match "Flow: Exports".

There may sometimes be no single observation that does both or it might be that the collapse just doesn't happen to find one with both (which might be solved by #52).
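A toy Python illustration of the sparsity problem (this is not how the collapse is implemented; the observations, dimension keys and helper names are invented for the example). Picking the first observation that matches one facet says nothing about the other facet, and a sparse dataset may have no observation matching both:

```python
def first_match(observations, dimension, accepted):
    """Return the first observation whose value for `dimension` is accepted."""
    for obs in observations:
        if obs.get(dimension) in accepted:
            return obs
    return None


def joint_match(observations, filters):
    """Return the first observation satisfying every facet filter, if any."""
    for obs in observations:
        if all(obs.get(dim) in accepted for dim, accepted in filters.items()):
            return obs
    return None


# Invented observations: the dataset uses both codes, but never together.
observations = [
    {"bop_service": "Net financial transactions", "flow": "Imports"},
    {"bop_service": "Other services", "flow": "Exports"},
]
filters = {
    "bop_service": {"Net financial transactions"},
    "flow": {"Exports"},
}

picked = first_match(observations, "bop_service", filters["bop_service"])
flow_cell_filled = picked["flow"] in filters["flow"]
both = joint_match(observations, filters)
print(flow_cell_filled, both)
```

Here the first-matched observation leaves the "Flow" cell blank, and no single observation could fill both cells.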

Robsteranium · 2021-04-29T12:48:05Z

I've got a draft implementation for #52 which doesn't appear to solve either of the above two cases ☹️

Robsteranium · 2021-04-29T13:14:41Z

I've recreated the example from above with all geo codelists selected using the latest data from the beta environment. Now all the cells are populated.

Robsteranium · 2021-04-29T14:03:21Z

Redoing the above example for Germany with the new data confirms this is still a problem.

Robsteranium · 2021-04-30T09:22:13Z

Each of the previous examples is now solved on #68 (this mostly consists of increasing the default query size from 10).

One example was due to the child-dimension not being tied to the facet's parent dimension via rdfs:subPropertyOf.
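For illustration, a small Python sketch of the check that was failing in that case: a dimension only counts towards a facet if it, or an ancestor reached by following rdfs:subPropertyOf, is one of the facet's parent dimensions. The function names, the dict representation of the subPropertyOf links and the URIs are all hypothetical.

```python
def ancestors(dimension, sub_property_of):
    """Return the dimension plus everything reachable via rdfs:subPropertyOf."""
    seen = []
    stack = [dimension]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in the data
        seen.append(current)
        stack.extend(sub_property_of.get(current, []))
    return seen


def in_facet(dimension, facet_parents, sub_property_of):
    """True if the dimension is tied to one of the facet's parent dimensions."""
    return any(d in facet_parents for d in ancestors(dimension, sub_property_of))


# Made-up URIs: one child is linked to the facet's parent, the other is not.
facet_parents = {"example#dimension/geography"}
sub_property_of = {
    "example#dimension/reference-area": ["example#dimension/geography"],
}

linked = in_facet("example#dimension/reference-area", facet_parents, sub_property_of)
unlinked = in_facet("example#dimension/location", facet_parents, sub_property_of)
print(linked, unlinked)
```

A child dimension with no subPropertyOf link is never matched to the facet, which leaves its cell blank.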

Closing for now but we can re-open if new examples appear.

Robsteranium added the "question" label (Further information is requested) on Apr 14, 2021

Robsteranium self-assigned this on Apr 15, 2021

Robsteranium added the "bug" label (Something isn't working) on Apr 22, 2021

kirahowe added the "data" label (Related to some underlying/upstream data issue) on Apr 23, 2021

Robsteranium closed this as completed Apr 30, 2021
