data.format.property unexpectedly applies to source dataset #5034

Open
aspyct opened this issue Jun 4, 2019 · 4 comments
Labels: Area - Data & Transform; Blocked 🕐 (for issues that are blocked by other issues); Bug 🐛; P2 Important (issues that should be fixed soon)


aspyct commented Jun 4, 2019

I'm trying to split one dataset into two different data objects. Each data object should be a specific property nested deep inside the dataset's JSON.

The Vega-Lite source looks like this:

{
    "$schema": "https://vega.github.io/schema/vega-lite/v3.json",
    "datasets": {
      "es_response": {
        "aggregations": {
          "histogram": [
            {"a": 30,"b": 28}, {"a": 40,"b": 55}, {"a": 50,"b": 43},
            {"a": 60,"b": 91}, {"a": 70,"b": 81}, {"a": 80,"b": 53},
            {"a": 90,"b": 19}, {"a": 100,"b": 87}, {"a": 110,"b": 52}
          ],
          "percentiles": {
            "values": [
              {
                "key" : 50.0,
                "value" : 100
              }
            ]
          }
        }
      }
    },
    "layer" : [
      {
        "data": {
          "name": "es_response",
          "format": { "property": "aggregations.histogram" }
        },
        ...
      },
      {
        "data": {
          "name": "es_response",
          "format": { "property": "aggregations.percentiles.values" }
        },
        ...
      }
    ]
  }

What I wanted to express with this, in plain English, was:

Here's my dataset.
The data of the first layer is the property aggregations.histogram of the dataset.
The data of the second layer is the property aggregations.percentiles.values of the dataset.
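
The intent described above can be sketched in Python. This is only an illustration of the desired per-layer behavior, not how Vega or Vega-Lite implement `format.property`; the helper name `resolve_property` is hypothetical and the histogram is shortened.

```python
from functools import reduce

def resolve_property(data, path):
    """Follow a dotted path such as 'aggregations.histogram' into nested dicts."""
    return reduce(lambda node, key: node[key], path.split("."), data)

# Shortened copy of the es_response dataset from the spec above.
es_response = {
    "aggregations": {
        "histogram": [{"a": 30, "b": 28}, {"a": 40, "b": 55}, {"a": 50, "b": 43}],
        "percentiles": {"values": [{"key": 50.0, "value": 100}]},
    }
}

# Each layer resolves its own property against the same untouched dataset.
layer_1_rows = resolve_property(es_response, "aggregations.histogram")
layer_2_rows = resolve_property(es_response, "aggregations.percentiles.values")
```

In this sketch the source dataset stays intact and each layer gets its own view of it, which is the behavior the spec was meant to express.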

When compiled to Vega (non-lite), the spec applies data.format.property directly to the source dataset. As a result, data_1 is correct, but data_2 fails completely.

"data": [
    {
      "name": "es_response",
      "format": {"property": "aggregations.histogram"},
      "values": {
        "aggregations": {
          "histogram": [
            {"a": 30, "b": 28},
            {"a": 40, "b": 55},
            {"a": 50, "b": 43},
            {"a": 60, "b": 91},
            {"a": 70, "b": 81},
            {"a": 80, "b": 53},
            {"a": 90, "b": 19},
            {"a": 100, "b": 87},
            {"a": 110, "b": 52}
          ],
          "percentiles": {"values": [{"key": 50, "value": 100}]}
        }
      }
    },
    {
      "name": "data_1",
      "source": "es_response",
      "transform": [
        {
          "type": "filter",
          "expr": "datum[\"a\"] !== null && !isNaN(datum[\"a\"]) && datum[\"b\"] !== null && !isNaN(datum[\"b\"])"
        }
      ]
    },
    {
      "name": "data_2",
      "source": "es_response",
      "transform": [
        {
          "type": "filter",
          "expr": "datum[\"value\"] !== null && !isNaN(datum[\"value\"])"
        }
      ]
    }
  ],
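
A rough Python model of the compiled output above may help show why data_2 comes out wrong. It assumes that the single `format.property` on es_response means both derived datasets only ever see the histogram rows, and it approximates the JavaScript filter expression for numeric fields only; `passes_filter` is a hypothetical helper and the histogram is shortened.

```python
import math

# Because format.property sits on the source, every downstream dataset
# receives the histogram rows (shortened here), never the percentiles.
source_rows = [{"a": 30, "b": 28}, {"a": 40, "b": 55}, {"a": 50, "b": 43}]

def passes_filter(datum, field):
    """Approximate `datum[field] !== null && !isNaN(datum[field])`."""
    value = datum.get(field)  # a missing field stands in for JS `undefined`
    if value is None:
        return False
    return not math.isnan(float(value))

# data_1 filters on "a" and "b", which the histogram rows have.
data_1 = [d for d in source_rows if passes_filter(d, "a") and passes_filter(d, "b")]

# data_2 filters on "value", which no histogram row has.
data_2 = [d for d in source_rows if passes_filter(d, "value")]
```

Under these assumptions data_1 keeps every row, while data_2 ends up empty because the percentile values never reach it.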

I'm not sure how to do this split with Vega-Lite (or even with Vega, actually). Is there another way you could suggest?

See the full example on Vega editor
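
One possible way around the question above, assuming the response can be pre-processed before the spec is built, is to split it into two named datasets so that no layer needs `format.property`. The sketch below is a suggestion rather than a fix for the underlying behavior; the function `build_spec` and the dataset names `histogram` and `percentiles` are hypothetical, and mark and encoding are left out just as they are elided in the spec above.

```python
import json

def build_spec(es_response):
    """Build a Vega-Lite spec with one named dataset per layer."""
    aggs = es_response["aggregations"]
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v3.json",
        "datasets": {
            # Hypothetical dataset names, one per nested property.
            "histogram": aggs["histogram"],
            "percentiles": aggs["percentiles"]["values"],
        },
        "layer": [
            # Mark and encoding omitted, as in the original spec.
            {"data": {"name": "histogram"}},
            {"data": {"name": "percentiles"}},
        ],
    }

es_response = {
    "aggregations": {
        "histogram": [{"a": 30, "b": 28}, {"a": 40, "b": 55}],
        "percentiles": {"values": [{"key": 50.0, "value": 100}]},
    }
}

spec = build_spec(es_response)
spec_json = json.dumps(spec, indent=2)  # ready to paste into the Vega editor
```

The trade-off is that the split happens outside the spec, so this only helps when the data is inlined or otherwise under your control.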

aspyct added the Bug 🐛 label Jun 4, 2019
kanitw added this to the x.x Data & Transforms milestone Jun 4, 2019
kanitw added P2 Important Issues that should be fixed soon P3 Should be fixed at some point and removed P2 Important Issues that should be fixed soon P3 Should be fixed at some point labels Jun 4, 2019
Member

domoritz commented Jun 4, 2019

I believe that this is due to a limitation in Vega. We need a dataflow that starts with one source and then feeds two datasets with different format.property values, which Vega does not support. Unfortunately, I don't see an easy way to achieve this with Vega transforms either.

@jheer, do you think it makes sense to add a new transform for this purpose, or should any data source support format.property?

Contributor

chucklam commented

I came across almost exactly the same problem yesterday. I wanted to extend the standard choropleth example to show state boundaries. I thought I would reuse the same TopoJSON file since it already has those state boundaries, but as @aspyct mentioned, Vega-Lite doesn't like splitting a dataset into multiple data sources.

For my example, I'm referencing the dataset through a URL, so my hack was to add a ? to the end of the URL in the second instance, which tricks Vega/Vega-Lite into treating them as separate datasets.

Check out line 35 in this example in Vega editor. Just remove the ? and the problem will come back.

Vega-Lite does load the same data twice, so this is not ideal, but it got things working with minimal fuss.
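
As a toy model of why the hack works, the Python sketch below assumes that sources are merged whenever their URL strings match exactly and that the first format seen wins. That is a guess at the observed behavior, not a description of the actual Vega or Vega-Lite compiler; `register_sources`, the URL, and the feature names are illustrative.

```python
def register_sources(layers):
    """Toy registry: one source per distinct URL string, first format wins."""
    sources = {}
    for layer in layers:
        # setdefault keeps the first feature registered for a given URL.
        sources.setdefault(layer["url"], layer["feature"])
    return sources

# Illustrative URL and feature names, not copied from the linked example.
same_url = [
    {"url": "data/us-10m.json", "feature": "counties"},
    {"url": "data/us-10m.json", "feature": "states"},
]
with_hack = [
    {"url": "data/us-10m.json", "feature": "counties"},
    {"url": "data/us-10m.json?", "feature": "states"},
]

merged = register_sources(same_url)     # the second format is lost
separate = register_sources(with_hack)  # both formats survive
```

The trailing ? makes the two strings differ, so in this model each layer keeps its own format, at the cost of a second source entry and therefore a second load.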

If there won't be a clean solution to this for a while, perhaps we should document this hack somewhere? (Stack Overflow?)

Member

domoritz commented

This should be fixed in Vega and then we can resolve the issue here as well.
