Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: update to Vega-Lite 5 #2517

Closed
wants to merge 15 commits into from
Closed

WIP: update to Vega-Lite 5 #2517

wants to merge 15 commits into from

Conversation

ChristopherDavisUCI
Copy link
Contributor

(Draft version only!)

Surprisingly the biggest obstacle so far hasn't been params but a change to layer. I think in the newest Vega-Lite schema, charts in a layer are not allowed to specify height or width, which seems to break many Altair examples. Here is a minimal example that doesn't work:

no_data = pd.DataFrame()

c = alt.Chart(no_data).mark_circle().properties(
    width=100
)

c+c

I don't see a good way to deal with that. Do you have a suggestion?

I've read the list of "breaking changes" for the Vega-Lite 5.0.0 release and don't see anything that seems related to this, so it does make me wonder if maybe I misunderstand the cause of the problem.

Other things:

  • I usually try some tests by running things in Jupyter notebook, but since making the change to Vega-Lite 5 that hasn't worked for me. Instead I get the following message in red the first time I try to display a chart, and then subsequent times I just get a blank response: Error loading script: Script error for "vega-util", needed by: vega-lite http://requirejs.org/docs/errors.html#scripterror It does work in Jupyter Lab.
  • Some of the code changes have been experiments trying to learn how the old selection fits with the new parameter. It might be best to redo this code later now that I see more of the big picture.

@mattijn
Copy link
Contributor

mattijn commented Nov 10, 2021

width within a layered mark seems the culprit, observe:

# python 3
from urllib.request import urlopen
import json
import jsonschema  # 3.2.0

def validate(vl_spec, vl_schema="v5.1.0"):
    schema = json.load(urlopen(f'https://vega.github.io/schema/vega-lite/{vl_schema}.json'))
    spec = json.loads(vl_spec)
    jsonschema.validate(spec, schema)
    
vl_spec = """
{
  "layer": [{"mark": "circle", "width": 100}, {"mark": "circle"}],
  "data": {"values": []}
}
"""
validate(vl_spec)
ValidationError: Additional properties are not allowed ('mark' was unexpected)
On instance:
    {'mark': 'circle', 'width': 100}

The above validates still OK in v4.17.0 (validate(vl_spec, vl_schema='v4.17.0').

When placing the width outside the layered mark it validates without error in v5.1.0, like this:

vl_spec = """
{
  "width": 100,
  "layer": [{"mark": "circle"}, {"mark": "circle"}],
  "data": {"values": []}
}
"""
validate(vl_spec)

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 10, 2021

If we wanted, we could add logic to alt.layer that will move any width/height specification in child charts to the top level. I believe there is already some existing logic along these lines for other properties.

It's a minor thing, but it would be nice to remain backward-compatible on this.

@ChristopherDavisUCI
Copy link
Contributor Author

Thank you! Do you have an example of a property where we do something similar to this? (It doesn't have to be in LayerChart.) If I have a template to use, I should be able to do the same thing for width and height.

So far at least with selection it's seemed straightforward to keep everything backwards compatible.

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 10, 2021

What I had in mind was the _combine_subchart_data step in alt.LayerChart: https://github.com/altair-viz/altair/blob/e11ab10bf2239af7db4b56ed61877756c431395f/altair/vegalite/v4/api.py#L2333
Basically this looks for common datasets in the layers, and moves them up to the top level. This so you can do things like this:

base = alt.Chart(dataframe).encode(...)
chart = base.mark_point() + base.mark_line()

and only embed a single copy of the serialized dataframe in the output. We could do something similar for width and height: extract them from the underlying layers & apply them to the parent chart, raising an error if there are any mismatches.

@joelostblom
Copy link
Contributor

I've read the list of "breaking changes" for the Vega-Lite 5.0.0 release and don't see anything that seems related to this, so it does make me wonder if maybe I misunderstand the cause of the problem.

I looked around a bit and I think this was initially introduced in VL4.1.1, but then reverted and included with a deprecation warning because Jake reported that it broke Altair. That deprecated code was removed in 5.0 in this PR, which is probably why it is breaking now. It seems like in addition to width and height, the view property is also affected so maybe we need to include that in this PR as well.

@ChristopherDavisUCI
Copy link
Contributor Author

Thank you @joelostblom for tracking that down. Would you be able to give me an example where we use this view property in Altair? That would help with my testing. I tried searching through the documentation examples and didn't immediately find one.

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 10, 2021

One relevant piece of info here is that the Vega-lite renderer maintains backward compatibility for some things (such as passing width and height to within a layer) even though the Vega-Lite schema now forbids them. Because we validate based on the schema, Altair is actually a bit stricter than the Vega-Lite renderer.

We have a couple options for dealing with that:

  1. Accept that Altair is stricter than vega-lite, and pass on those errors to users (the "do nothing" approach)
  2. maintain a list of allowable departures from the schema.
  3. do some preprocessing at initialization for specific issues
  4. stop validating based on the schema

I would lean toward (3) here, as (1) would lead to poorer usability, and my sense is that (2) would add an undue maintenance burden, and (4) would be a large departure from Altair's current implementation

@joelostblom
Copy link
Contributor

@ChristopherDavisUCI I am not sure how much it is actually used, but alt.Chart does accept a view parameter, so I am guessing we should handle that the same as height and width (since it was deprecated in the same VL PR) and move it up to the top level of layered charts:

alt.Chart(alt.Data(values=[{}]), height=200, view=alt.ViewConfig()).mark_bar().to_dict()
{'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}},
 'data': {'values': [{}]},
 'mark': 'bar',
 'height': 200,
 'view': {},
 '$schema': 'https://vega.github.io/schema/vega-lite/v4.8.1.json'}

and using view instead of width in the validation example by @mattijn above gives the same ValidationError:

vl_spec = """
{
  "layer": [{"mark": "circle", "view": {}}, {"mark": "circle"}],
  "data": {"values": []}
}
"""

@ChristopherDavisUCI
Copy link
Contributor Author

ChristopherDavisUCI commented Nov 11, 2021

@joelostblom Thank you!

@jakevdp Thanks for the pointer to _combine_subchart_data. I think I was able to adapt that to get the layer examples working.

Edit: My current guess for the below example is we should just accept it needing to switch "selection" to "param". It seems reasonable to me that using an invalid property name in the dictionary is not going to have a nice fix.

Aside from rendering examples (literally zero of which are working), there's just one example that doesn't work that I know of, Scatter Plot with Minimap, because it explicitly specifies "selection": zoom.name. If we change it to "param": zoom.name then it works. Do you see a robust way to deal with that? I looked through the schema and I think "selection" as a property name only exists in "Config".

"""
Scatter Plot with Minimap
-------------------------
This example shows how to create a miniature version of a plot
such that creating a selection in the miniature version
adjusts the axis limits in another, more detailed view.
"""
# category: scatter plots

import altair as alt
from vega_datasets import data

source = data.seattle_weather()

zoom = alt.selection_interval(encodings=["x", "y"])

minimap = (
    alt.Chart(source)
    .mark_point()
    .add_selection(zoom)
    .encode(
        x="date:T",
        y="temp_max:Q",
        color=alt.condition(zoom, "weather", alt.value("lightgray")),
    )
    .properties(
        width=200,
        height=200,
        title="Minimap -- click and drag to zoom in the detail view",
    )
)

detail = (
    alt.Chart(source)
    .mark_point()
    .encode(
        x=alt.X(
            "date:T", scale=alt.Scale(domain={"selection": zoom.name, "encoding": "x"})
        ),
        y=alt.Y(
            "temp_max:Q",
            scale=alt.Scale(domain={"selection": zoom.name, "encoding": "y"}),
        ),
        color="weather",
    )
    .properties(width=600, height=400, title="Seattle weather -- detail view")
)

detail | minimap

@ChristopherDavisUCI
Copy link
Contributor Author

ChristopherDavisUCI commented Nov 11, 2021

Here's a first example of using VariableParameter. Is that basically how we want it to work? Should we try to add functionality so opacity=alt.ExprRef("sample/1000") can be replaced by maybe something like opacity=alt.expr("sample/1000") or opacity=alt.expr(f"{size_var}/1000")?

import altair as alt
from vega_datasets import data

cars = data.cars.url

year_slider = alt.binding_range(min=0, max=1000, step=5)
size_var = alt.variable(bind=year_slider, value=200, name="sample")

c = alt.Chart(cars).mark_circle(size=size_var, opacity=alt.ExprRef("sample/1000")).encode(
    x='Miles_per_Gallon:Q',
    y='Horsepower:Q',
).add_variable(size_var)

c

sample

The chart could also be defined this way:

c = alt.Chart(cars).mark_circle().encode(
    x='Miles_per_Gallon:Q',
    y='Horsepower:Q',
    size = alt.value(size_var),
    opacity = alt.value(alt.ExprRef("sample/1000"))
).add_variable(size_var)

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 11, 2021

That's nice! I think it would be cool if we could use the variables directly in the expression, i.e. something like opacity = size_var / 1000 which would basically output that same ExprRef.

We do a similar thing with alt.datum.value / 1000

@mattijn
Copy link
Contributor

mattijn commented Nov 11, 2021

I played a bit in ChristopherDavisUCI@0b34ec6 and now this works:

cars = data.cars.url

year_slider = alt.binding_range(min=0, max=1, step=0.05)
size_var = alt.variable(bind=year_slider, value=0.2, name="sample")

c = alt.Chart(cars).mark_circle(
    size=size_var*1000,
    opacity=size_var + 0.2
).encode(
    x='Miles_per_Gallon:Q',
    y='Horsepower:Q',
).add_variable(size_var)

c

but this won't, opacity=size_var + 0.2 + 0.1:
TypeError: unsupported operand type(s) for +: 'ExprRef' and 'float'

Because the __add__ function includes the ExprRef:

    def __add__(self, other):
        return ExprRef(expr.core.BinaryExpressionReference("+", self.name, other))

It get stuck in the second round of addition, since size_var + 0.2 has become an ExprRef and cannot add the 0.1 . Another solution is needed, but maybe its a step forward for someone else.

@mattijn
Copy link
Contributor

mattijn commented Nov 12, 2021

I can completely imagine this should be done very different, but now this also works:

import altair as alt
year_slider = alt.binding_range(min=0, max=1000, step=5)
size_var = alt.variable(bind=year_slider, value=200, name="sample")
expr = size_var ** 2 + 0.1
expr
ExprRef({
  expr: pow(sample,2) + 0.1
})

Where size_var ** 2 + 0.1 can be used eg. for the opacity channel in above example.

Changed code in this commit ChristopherDavisUCI@530a11a

@ChristopherDavisUCI
Copy link
Contributor Author

Thanks @mattijn! I glanced and didn't immediately follow, but I will look more slowly later.

Is it obvious to you why the charts don't display for me in Jupyter notebook? Do you think it's in the same category as the changes you made related to vegalite_mimebundle from last round?

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 12, 2021

If you're using the jupyterlab/mimebundle renderer, that will not work, because the vega-lite extension does not yet support vl 5 Switch to the default renderer.

Even with the default renderer, you can still run into issues if charts with previous vega-lite versions are displayed in the notebook, because once one version of vega-lite is loaded, new ones will not load even if the version does not match. Try clearing all outputs, saving the notebook, then reloading the page.

If you're using jupyterlab, the same is true of any notebook tab you have opened in the history of the session, because different notebooks' JS & CSS environments are not sandboxed from each other (don't even get me started...). So close all notebooks, clear outputs from ones you plan to open, click reload the browser, and then open the notebooks you want to use.

@mattijn
Copy link
Contributor

mattijn commented Nov 12, 2021

Basic idea is to add a number and comparison protocol to the Variable and ExprRef class. This includes methods for mathematical operations such as addition and instance comparison like equality checking. Addition (+) is represented as __add__(self, other) which carry out self + other.
So an expression x + y is interpreted as x.__add__(y).
In this case, within this __add__ method we use the defined name of the Variable (eg. sample) append the operand + and the definition of other and return this as an ExprRef. Given the following Variable:

import altair as alt
year_slider = alt.binding_range(min=0, max=1000, step=5)
size_var = alt.variable(bind=year_slider, value=200, name="sample")
size_var
Variable('sample', VariableParameter({
  bind: BindRange({
    input: 'range',
    max: 1000,
    min: 0,
    step: 5
  }),
  name: 'sample',
  value: 200
}))

And:

size_var.name
'sample'

Then size_var + 1 is interpreted as size_var.__add__(1). Where the __add__ method does size_var.name + 1 which becomes sample + 1. This result is returned as ExprRef("sample + 1").
Therefor

er = size_var + 1
er
ExprRef({
  expr: sample + 1
})

And then the same works if these mathematical operations are also defined as methods for ExprRef, where the defined expr is used instead of name (ExprRef.expr).
So:

er + 2

becomes er.__add__(2), which does er.expr (in this case this is sample + 1) append + 2, so it becomes sample + 1 + 2 and is again returned as ExprRef as such:

ExprRef({
  expr: sample + 1 + 2
})

The actual implementation is a different story though. I tried to integrate with the existing code in core.py, but had to duplicate code with only very little differences to get it to work. The same for the added code in api.py, this could probably be delegated or moved to somewhere more sensible. All in all, it's more like prototype code.

@ChristopherDavisUCI
Copy link
Contributor Author

In the following example (which doesn't work at the moment), size_var is of type Variable. What's the right way to define the Variable class so that the following code will work. (Notice that size=size_var is being specified inside of encode, not inside of mark_circle. The example does work if you move size=size_var into the mark_circle method.)

import altair as alt
from vega_datasets import data

cars = data.cars.url

year_slider = alt.binding_range(min=0, max=1, step=0.05)
size_var = alt.variable(bind=year_slider, value=0.2, name="sample")

c = alt.Chart(cars).mark_circle().encode(
    x='Miles_per_Gallon:Q',
    y='Horsepower:Q',
    size=size_var,
).add_variable(size_var)

c

I'd like to get something like this working and then add on the arithmetic functionality that @mattijn described above.

@mattijn
Copy link
Contributor

mattijn commented Nov 14, 2021

Hmm, per https://vega.github.io/vega-lite/docs/parameter.html, I don't think an expression reference is supported as input for an encoding channel. Within an encoding channel it can can only be included as predicate and for data extents (but then as param and not as expr).

@ChristopherDavisUCI
Copy link
Contributor Author

@mattijn Ah, thank you, very interesting!

Playing around on the Vega editor, I think something like this is allowed, where the "expr" has been wrapped as a "value":

{
  "params": [
    { "name": "barHeight", "value": 5}
  ],
  "data": {
    "values": [
      {"a": "A"}, {"a": "B"}, {"a": "C"}
    ]
  },
  "mark": {
    "type": "bar"
  },
  "encoding": {
    "x": {"field": "a", "type": "nominal"},
    "y": {"value": {"expr": "barHeight"}}
  }
}

I guess on the Altair side, we should allow something like alt.value(bar_variable). Do you think we should allow further abbreviations? Maybe not, since we don't allow y = 5, for example (only y = alt.value(5))?

@mattijn
Copy link
Contributor

mattijn commented Nov 15, 2021

Ah yes I see, yes that make sense. So then you can eventually do the same for datum and field. Something like this:

import altair as alt
from vega_datasets import data

cars = data.cars.url

mpg_slider = alt.binding_range(min=0, max=50, step=1)
rule_mpg = alt.variable(bind=mpg_slider, value=0.2, name="mpg_rule")

dots= alt.Chart(cars).mark_circle().encode(
    x='Miles_per_Gallon:Q',
    y='Horsepower:Q'
)

yrule = alt.Chart().mark_rule(strokeDash=[12, 6], size=2).encode(
    y=alt.datum(rule_mpg)  # <-- how to infer type? eg. alt.Y(alt.datum(rule_mpg), type='quantitative') ?
).add_variable(rule_mpg)

dots + yrule

Open the Chart in the Vega Editor

image

And:

import altair as alt
from vega_datasets import data

cars = data.cars.url

x_options = alt.binding_select(options=["Horsepower", "Acceleration"])
x_var = alt.variable(bind=x_options, value="Horsepower", name="var_x")

dots = alt.Chart(cars).mark_circle().encode(
    x=alt.field(x_var),  # <--- not existing yet on altair side and how to infer type?
    y='Horsepower:Q'
).add_variable(x_var)

Open the Chart in the Vega Editor (not possibe, as raised in vega/vega-lite#7365)

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 15, 2021

One piece of context: the reason we require y=alt.value(5) rather than y=5 is because values may be strings, and otherwise y='x' would cause ambiguity between 'x' as a value and 'x' as a field name.

@mattijn
Copy link
Contributor

mattijn commented Nov 22, 2021

Gentle ping @domoritz, we have a question regarding #2517 (comment), do you know if something happened or is missing related to vega-utils, vega-embed since VL5?
Edit: or any hints how to debug deeper?

@ChristopherDavisUCI
Copy link
Contributor Author

Thanks @mattijn, I also asked @domoritz on Slack and he suggested it should be something more basic than what I thought. Probably just a version needs to be updated. I will investigate tonight.

@domoritz
Copy link
Member

Could you generate a standalone HTML page where you get the error with vega-util? I can take a look.

@ChristopherDavisUCI
Copy link
Contributor Author

Thank you! Does this count as a standalone html page? link on Google drive

I produced it by making a Jupyter notebook with the error and then downloading that notebook as an html file.

I can definitely do more debugging myself also, now that I know it should be an out-of-date reference.

@mattijn
Copy link
Contributor

mattijn commented Nov 23, 2021

I used this piece of code to debug. It works OK in JupyterLab (to start do from cmd jupyter-lab), but not in Jupyter Notebook (to start do from cmd jupyter notebook):

It is based on the spec_to_html() Altair function.

from altair.utils.html import spec_to_html
from IPython.core.display import display, HTML
import json

dct = json.loads("""{
  "data": {
    "values": [
      {"a": "A", "b": 28},{"a": "B", "b": 55},{"a": "C", "b": 43},{"a": "D", "b": 91},
      {"a": "F", "b": 53},{"a": "G", "b": 19},{"a": "H", "b": 87},{"a": "I", "b": 52}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "nominal"},
    "y": {"field": "b", "type": "quantitative"}
  },
  "height": 200,
  "width": 200
}
"""
)
# dct  # this seems ok

h = spec_to_html(
    spec=dct,
    mode='vega-lite',
    vega_version='5.21.0', 
    vegaembed_version='6.20.2', 
    vegalite_version='5.1.1',
    requirejs=False,
    base_url='https://cdn.jsdelivr.net/npm',
    fullhtml=True,
    template='standard'
)

# print(h)  # seems valid HTML

display(HTML(h))  # this doesn't work

For Jupyter Notebook I get this error in the console log:

VM48:17 
        
       Uncaught ReferenceError: vegaEmbed is not defined
    at <anonymous>:17:8
    at b (main.min.js?v=24786799f8766c957f062e6b8f0ea89d00055170b051b7d318e97aedfb60dea047c35f2dfd92a435f058e984b092f71bc1fc7d28aacf0c330637e94e2ac72e5d:2)
    at Pe (main.min.js?v=24786799f8766c957f062e6b8f0ea89d00055170b051b7d318e97aedfb60dea047c35f2dfd92a435f058e984b092f71bc1fc7d28aacf0c330637e94e2ac72e5d:2)
    at S.fn.init.append (main.min.js?v=24786799f8766c957f062e6b8f0ea89d00055170b051b7d318e97aedfb60dea047c35f2dfd92a435f058e984b092f71bc1fc7d28aacf0c330637e94e2ac72e5d:2)
    at OutputArea._safe_append (main.min.js?v=24786799f8766c957f062e6b8f0ea89d00055170b051b7d318e97aedfb60dea047c35f2dfd92a435f058e984b092f71bc1fc7d28aacf0c330637e94e2ac72e5d:59205)
    at OutputArea.append_display_data (main.min.js?v=24786799f8766c957f062e6b8f0ea89d00055170b051b7d318e97aedfb60dea047c35f2dfd92a435f058e984b092f71bc1fc7d28aacf0c330637e94e2ac72e5d:59412)
    at OutputArea.append_output (main.min.js?v=24786799f8766c957f062e6b8f0ea89d00055170b051b7d318e97aedfb60dea047c35f2dfd92a435f058e984b092f71bc1fc7d28aacf0c330637e94e2ac72e5d:59092)
    at OutputArea.handle_output (main.min.js?v=24786799f8766c957f062e6b8f0ea89d00055170b051b7d318e97aedfb60dea047c35f2dfd92a435f058e984b092f71bc1fc7d28aacf0c330637e94e2ac72e5d:59003)
    at output (main.min.js?v=24786799f8766c957f062e6b8f0ea89d00055170b051b7d318e97aedfb60dea047c35f2dfd92a435f058e984b092f71bc1fc7d28aacf0c330637e94e2ac72e5d:60836)
    at Kernel._handle_output_message (main.min.js?v=24786799f8766c957f062e6b8f0ea89d00055170b051b7d318e97aedfb60dea047c35f2dfd92a435f058e984b092f71bc1fc7d28aacf0c330637e94e2ac72e5d:62597)

EDIT: Based on this line: https://github.com/altair-viz/altair/blob/8a8642b2e7eeee3b914850a8f7aacd53335302d9/altair/vega/v5/display.py#L73 I observe that the default parameter for template is universal. If I change the template parameter to universal, I get this error (only once per session, I've to restart Jupyter Notebook to see it again):

VM31:27 
        
       Uncaught Error loading script: Script error for "vega-util", needed by: vega-lite
http://requirejs.org/docs/errors.html#scripterror

Basically meaning I can reproduce it with the following HTML string:

from IPython.core.display import display, HTML
h = """<div id="vis"></div>
<script type="text/javascript">
  (function(spec, embedOpt){
    let outputDiv = document.currentScript.previousElementSibling;
    if (outputDiv.id !== "vis") {
      outputDiv = document.getElementById("vis");
    }
    const paths = {
      "vega": "https://cdn.jsdelivr.net/npm/vega@5.21.0?noext",
      "vega-lib": "https://cdn.jsdelivr.net/npm/vega-lib?noext",
      "vega-lite": "https://cdn.jsdelivr.net/npm/vega-lite@5.1.1?noext",
      "vega-embed": "https://cdn.jsdelivr.net/npm/vega-embed@6.20.2?noext",
    };

    function loadScript(lib) {
      return new Promise(function(resolve, reject) {
        var s = document.createElement('script');
        s.src = paths[lib];
        s.async = true;
        s.onload = () => resolve(paths[lib]);
        s.onerror = () => reject(`Error loading script: ${paths[lib]}`);
        document.getElementsByTagName("head")[0].appendChild(s);
      });
    }

    function showError(err) {
      outputDiv.innerHTML = `<div class="error" style="color:red;">${err}</div>`;
      throw err;
    }

    function displayChart(vegaEmbed) {
      vegaEmbed(outputDiv, spec, embedOpt)
        .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));
    }

    if(typeof define === "function" && define.amd) {
      requirejs.config({paths});
      require(["vega-embed"], displayChart, err => showError(`Error loading script: ${err.message}`));
    } else if (typeof vegaEmbed === "function") {
      displayChart(vegaEmbed);
    } else {
      loadScript("vega")
        .then(() => loadScript("vega-lite"))
        .then(() => loadScript("vega-embed"))
        .catch(showError)
        .then(() => displayChart(vegaEmbed));
    }
  })({"data": {"values": [{"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43}, {"a": "D", "b": 91}, {"a": "F", "b": 53}, {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}]}, "mark": "bar", "encoding": {"x": {"field": "a", "type": "nominal"}, "y": {"field": "b", "type": "quantitative"}}, "height": 200, "width": 200}, {"mode": "vega-lite"});
</script>

"""
display(HTML(h)) 

Only once per session, I've to restart Jupyter Notebook to see it again (not just restart the kernel)

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 23, 2021

The error is coming from requirejs. JupyterLab does not use requirejs, while Jupyter Notebook does, and the universal template is the only HTML output that attempts to support requirejs. I suspect the issue is in one of the vega javascript sources, the requirejs mode specifies a dependency on vega-util. I'm not sure whether that's intended.

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 23, 2021

You can see the requirement declared at the top of the vega-lite v5 source: https://cdn.jsdelivr.net/npm/vega-lite@5.1:

!function(e,t){"object"==typeof exports&&"undefined"!=typeof module?t(exports,require("vega-util"),require("vega")) ...

Again, this problem will only arise in an environment where requirejs is loaded. If it's intended on the Vega-Lite side, it means that an extra library is necessary when creating vega-lite charts in a requireJS context. We could add this to the universal renderer:

const paths = {
      "vega": "https://cdn.jsdelivr.net/npm/vega@5.21.0?noext",
      "vega-lib": "https://cdn.jsdelivr.net/npm/vega-lib?noext",
      "vega-lite": "https://cdn.jsdelivr.net/npm/vega-lite@5.1.1?noext",
      "vega-embed": "https://cdn.jsdelivr.net/npm/vega-embed@6.20.2?noext",
      "vega-util": "https://cdn.jsdelivr.net/npm/vega-util?noext",
    };

But since it's not mentioned anywhere in the vega-embed documentation, I suspect this is a bug in the vega-lite.js build process. @domoritz might have a more definitive answer.

@domoritz
Copy link
Member

Good catch. Since Vega exports all the utilities from Vega-Util, we try to replace all imports from Vega-Util with imports from Vega in the bundle. It should happen with https://github.com/vega/vega-lite/blob/12bee8b450c1176434b5166ca5f42515c41a78db/rollup.config.js#L36 but evidently does not here. I will fix this now.

@mattijn
Copy link
Contributor

mattijn commented Nov 23, 2021

No typescript user, but here it seems mentioned as an external dependency? https://github.com/vega/vega-lite/blob/12bee8b450c1176434b5166ca5f42515c41a78db/rollup.config.js#L81

@domoritz
Copy link
Member

I think I fixed the issue in vega/vega-lite#7823. It turns out rollup doesn't let us use both external and global. We have to alias the dependencies manually. The original goal was to reduce code duplication and import functionality from the Vega bundle instead of bundling it again.

@domoritz
Copy link
Member

Can you try Vega-Lite 5.2.0?

@ChristopherDavisUCI
Copy link
Contributor Author

I checked and it's working now in Jupyter Notebook! Thanks @domoritz for the very fast work!

@domoritz
Copy link
Member

Thank y'all for isolating the issue. It helped tremendously.

@ChristopherDavisUCI
Copy link
Contributor Author

@jakevdp Can I ask, how do you think selection/variable/parameter should exist in the Vega-Lite 5 version of Altair? Do you like the idea of all three existing, or should there only be selection and variable or maybe only parameter?

Currently as @mattijn pointed out above, there are .add_selection, .add_variable, and .add_parameter methods defined on Chart objects that all do the exact same thing. (The first two are aliases for .add_parameter.)

I have some ad hoc code right now to keep old examples from breaking. For example, if a selection has init specified, I change that to value. And if type='single' or type='multi' is specified, I change that to type='point'. And if a selection has empty specified, I have some code to move that into alt.condition or into transform_filter. Does that all sound okay, or is it a longterm headache to have this sort of code to fix one-off problems?

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 24, 2021

My feeling on this is we should stick to the language of the Vega-Lite schema; that is, use the name parameter, the method add_parameter, etc. For the 5.0 release, we could keep the previous selection functions around, but have them raise a DeprecationWarning mentioning the new syntax, and that the selection functionality will be removed in a future release. What do you think?

@ChristopherDavisUCI
Copy link
Contributor Author

That sounds good to me and it should be pretty easy to implement!

@ChristopherDavisUCI
Copy link
Contributor Author

@mattijn pointed out this old PR: #1629

Is this a good time to try out adding the code from there?

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 24, 2021

Is this a good time to try out adding the code from there?

Probably not... I'm not entirely sure about the approach in that PR. The fundamental issue is that currently for schema objects, attribute access returns the attribute. That is,

import altair as alt
x = alt.X(field='x')
print(x.field)
# x

In order to have the semantics proposed there, attribute access would have to return a function that sets the attribute; that is, we'd have to change the API of Altair objects to be something more like this:

x = alt.X().field('x')
print(x.field)
# <bound method X.field of <__main__.X object at 0x7fbf5f349d00>>
print(x['field'])
# x

That's certainly doable, but it's a fundamental departure from Altair's current object model. We could discuss whether the advantages of that outweight the disadvantages, but I think that's not something that should be done as part of this PR.

@ChristopherDavisUCI
Copy link
Contributor Author

Okay cool. I'll ask no more questions until after Thanksgiving!

@mattijn
Copy link
Contributor

mattijn commented Nov 25, 2021

I raised it primarily because of the parameter definition. When having a chart with a few sliders, you have to define them now as follow:

rad_slider = alt.binding_range(min=0, max=100, step=1)
rad_var = alt.variable(bind=rad_slider, value=0, name="radius")

rad2_slider = alt.binding_range(min=0, max=100, step=1)
rad_var2 = alt.variable(bind=rad_slider, value=50, name="radius2")

theta_slider = alt.binding_range(min=-2*np.pi, max=2*np.pi)
theta_var = alt.variable(bind=theta_slider, value=-0.73, name="theta_single_arc")

theta_slider2 = alt.binding_range(min=-2*np.pi, max=2*np.pi)
theta2_var = alt.variable(bind=theta_slider, value=0.73, name="theta2_single_arc")

corner_slider = alt.binding_range(min=0, max=50, step = 1)
corner_var = alt.variable(bind=corner_slider, value=0, name="cornerRadius")

pad_slider = alt.binding_range(min=0, max=np.pi/2)
pad_var = alt.variable(bind=pad_slider, value=0, name="padAngle")

I feel it is a bit verbose and wish there is a shorthand similar to #1629, so it can be done as an one-liner, eg.

pad_var = alt.param(value=0, name='padAngle').bind(alt.range(min=0, max=np.pi/2))

But maybe too much for now.

@ChristopherDavisUCI
Copy link
Contributor Author

If a parameter in Vega-Lite is defined by

{
  "name": "sel",
  "select": {"type": "point", "fields": ["Miles_per_Gallon"], "toggle": false}
}

what would be a good way to define that parameter in Altair? Is something like the following okay, or is it too verbose?

s = alt.selection_point(fields=["Miles_per_Gallon"], toggle=False)
p = alt.parameter(select=s)

As far as I can tell, there's no "Parameter" defined in the Vega-Lite schema. There is an

"anyOf": [
  {
    "$ref": "#/definitions/VariableParameter"
  },
  {
    "$ref": "#/definitions/SelectionParameter"
  }
]

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 27, 2021

Having to wrap alt.selection in alt.parameter seems a bit cumbersome. But I haven't thought deeply about this question

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 27, 2021

Maybe alt.selection_single by itself should return the full parameter definition?

@ChristopherDavisUCI
Copy link
Contributor Author

I started a new Pull Request #2528 so that I could clean up some of the commit history. It uses the most recent suggestion of mostly following the terminology of Vega-Lite schema, but allowing selection_point etc to define parameters.

I'll probably close this PR in a day or two unless someone says I should leave it open.

Thanks and see you at #2528!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants