Update to Vega-Lite 4.17.0 #2513

ChristopherDavisUCI · 2021-11-01T15:01:59Z

Hi again,

@mattijn and I have tried to update the Pull Request from last week, taking into account your suggestions and trying to get all the tests (that I know about) to work.

Here are some of the main changes from the current Altair release:

Changed the Vega-Lite schema version to 4.17.0.
Added definitions of DatumSchemaGenerator and DatumChannelMixin in generate_schema_wrapper.py.
Updated the loop in generate_vegalite_channel_wrappers to allow for the possibility of a datum.
Updated infer_encoding_types in altair/utils/core.py to recognize Datum class names.
Changed get_valid_identifier in tools/schemapi/utils.py to deal with symbols such as [] that appear in certain names in the Vega-Lite schema. (Simply removing those symbols led to some duplicated class names.)
Updated TopLevelMixin in altair/vegalite/v4/api.py to allow layer to be repeated, in addition to row and column.
Updated the Encoding Channel Options part of the docs. (I wasn't sure if there was a way to auto-generate these groupings of, for example, Row with Column and Facet.)
Added some examples to the example gallery, and some additional documentation about mark_arc and layering in repeat.
Applied a temporary fix, with a link to the relevant Vega-Lite issue, to get test_vegalite_to_vega_mimebundle to work.
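
To illustrate the get_valid_identifier item in the list above, here is a minimal sketch of the idea, not the actual code in tools/schemapi/utils.py. The function name to_identifier and the two definition names are hypothetical, chosen only to show how stripping [] outright would make two distinct schema names collapse into the same class name.

```python
import keyword
import re


def to_identifier(name: str) -> str:
    """Turn a schema definition name into a valid Python identifier (sketch)."""
    # Spell out array brackets first so "T[]" and "T" stay distinct.
    name = name.replace("[]", "Array")
    # Drop every remaining character that cannot appear in an identifier.
    ident = re.sub(r"\W", "", name, flags=re.ASCII)
    # Identifiers may not be empty or start with a digit.
    if not ident or ident[0].isdigit():
        ident = "_" + ident
    # Avoid clashing with reserved words such as "class".
    if keyword.iskeyword(ident):
        ident += "_"
    return ident


# Hypothetical definition names, chosen only to show the collision.
print(to_identifier("Foo<string>"))    # Foostring
print(to_identifier("Foo<string[]>"))  # FoostringArray
```

Without the replace step, both names would reduce to Foostring, which is the kind of duplicate the change is meant to avoid.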

Main outstanding issue that we know of:

Starting with v4.9, the format of TopLevelRepeatSpec changed in the Vega-Lite schema. We have some code to deal with this by editing the definition of TopLevelRepeatSpec in the downloaded JSON file. We've tried asking on the Vega-Lite Slack channel and as a Vega-Lite GitHub issue, but it seems like the issue is on the Altair side, not the Vega-Lite side. If the rest of the changes seem mostly in good shape, I can focus on finding a more adequate solution.

This is not part of the current Pull Request, but to get some of the tests to work, we made the following changes to altair_viewer:

Save https://unpkg.com/vega-lite@4.17.0/build/vega-lite.min.js as vega-lite-4.17.0.js in altair_viewer\altair_viewer\scripts
Add "4.17.0" in listing.json in altair_viewer\scripts

Thank you for any feedback!

Sample417

The way the schema is written for TopLevelRepeatSpec gives the current code trouble. We patch it in an ad hoc way, and should later find a more sustainable method.

jakevdp · 2021-11-01T16:07:06Z

This looks great! Thanks for taking a stab at it.

In terms of the RepeatSpec issue, I do think this is an Altair issue rather than a Vega-Lite issue. In the past, all top-level specs have been objects rather than unions, but I don't think deviating from this is indicative of a bug. It's just something the Altair wrappers need to be expanded to handle.

I think rather than patching the JSON, the best fix would be to define the expected API in this class: https://github.com/altair-viz/altair/blob/8a8642b2e7eeee3b914850a8f7aacd53335302d9/altair/vegalite/v3/api.py#L1824

That is, you can derive that from VegaLiteSchema and set the _schema element manually. That way there's no mucking with the Vega-Lite schema itself.
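
A rough sketch of what that suggestion could look like, using a stand-in base class so it runs without Altair installed. SchemaBase and RepeatSpecWrapper are hypothetical names, and the "$ref" path is an assumption about how the definition is addressed; the real class would derive from VegaLiteSchema in the generated core module.

```python
class SchemaBase:
    """Hypothetical stand-in for Altair's generated wrapper base class."""

    _schema = None  # JSON schema fragment the wrapper validates against

    def __init__(self, **kwds):
        self._kwds = kwds

    def to_dict(self):
        return dict(self._kwds)


class RepeatSpecWrapper(SchemaBase):
    """Hand-written wrapper: the schema reference is set manually."""

    # Point at the definition by name instead of editing the JSON file.
    # The exact path is an assumption made for this sketch.
    _schema = {"$ref": "#/definitions/TopLevelRepeatSpec"}

    def __init__(self, repeat=None, spec=None, **kwds):
        super().__init__(repeat=repeat, spec=spec, **kwds)


chart = RepeatSpecWrapper(repeat={"layer": ["a", "b"]}, spec={"mark": "line"})
print(chart._schema["$ref"])
print(chart.to_dict())
```

The point of the design is that the downloaded schema file stays untouched; only the Python-side wrapper decides which arguments it exposes.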

jakevdp · 2021-11-01T16:09:58Z

Regarding the testing: the way I've done it in the past is to add a branch to altair viewer with the necessary changes, and temporarily adjust the CI in this branch to point to that. Then when we're happy with everything, we cut a new altair-viewer release and then change this CI back to normal.

mattijn · 2021-11-02T06:45:25Z

From 3fc9a86 I see that you updated master of altair_viewer to include support for VL4.8.1. Once altair-viz/altair_viewer#43 is merged we can follow the same principle.

1eef3de and d594587 now point to master of altair_viewer

ChristopherDavisUCI · 2021-11-04T23:05:46Z

Hmm, this isn't an error I've seen before; is it obvious what it means? Trying it on my computer I got a similar error, and it seemed to point to line 278 here: https://github.com/ChristopherDavisUCI/altair/blob/25baa86dbd921a30761bcc5300b0c06397eae99d/altair/sphinxext/altairplot.py#L275-L282

jakevdp · 2021-11-04T23:23:55Z

Ah, yeah I think the issue there is that our default class wrapper thing is too aggressive now. alt.InlineDataset has a type signature that looks something like anyOf(List(float), List(string), List(boolean), string), which means that in from_dict(), any object that matches these types will be wrapped in an InlineDataset class.

We probably don't want that.

So we'll need a way to distinguish between anyOf schemas that we do want to wrap with the parent type vs. ones that we don't.

Perhaps we could explicitly register "simple" schemas (like {type: "string"}, {type: "array", "items": {type: "float"}}, etc.) that we don't want to be wrapped with schema classes?
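
One way such a check might look, as a sketch only: is_simple_schema and PRIMITIVE_TYPES are hypothetical names, not part of Altair, and the example schema follows the rough anyOf description given above rather than the real InlineDataset definition.

```python
PRIMITIVE_TYPES = {"string", "number", "integer", "boolean", "null"}


def is_simple_schema(schema: dict) -> bool:
    """Return True for schemas built only from primitives and arrays of them."""
    if "anyOf" in schema:
        options = schema["anyOf"]
        # Every alternative must itself be simple.
        return bool(options) and all(is_simple_schema(s) for s in options)
    kind = schema.get("type")
    if isinstance(kind, str) and kind in PRIMITIVE_TYPES:
        return True
    if kind == "array":
        items = schema.get("items")
        return isinstance(items, dict) and is_simple_schema(items)
    # Objects, $ref targets, and anything else are left to the class wrappers.
    return False


# Shaped after the anyOf description above (an approximation, not the real schema).
inline_dataset_like = {
    "anyOf": [
        {"type": "array", "items": {"type": "number"}},
        {"type": "array", "items": {"type": "string"}},
        {"type": "array", "items": {"type": "boolean"}},
        {"type": "string"},
    ]
}
object_like = {"type": "object", "properties": {"mark": {"type": "string"}}}

print(is_simple_schema(inline_dataset_like))  # True
print(is_simple_schema(object_like))          # False
```

A from_dict implementation could consult a predicate like this and skip wrapping whenever the matching schema is simple, which addresses the general case instead of special-casing InlineDataset.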

jakevdp · 2021-11-04T23:28:40Z

Another piece of context here: it's very tempting to introduce specific patches or checks to fix this kind of bug as it comes up... the problem with that is that as Vega-Lite evolves, those kinds of one-off changes become very difficult to manage and maintain. It's a mistake I made early on in writing Altair. That's why, for each of these issues, my inclination is to try to solve the most general version of the problem that's surfacing, because then maybe future upgrades will be just a bit easier.

ChristopherDavisUCI · 2021-11-05T02:31:45Z

I experimented a little with some of the examples and most of them still seem to work correctly. Here is the smallest example I could find that does not work:

d = {'data': {'values': [{}]},
     'mark': {'type': 'point'}}

alt.Chart.from_dict(d).to_json()

Here is the result of alt.Chart.from_dict(d).to_dict():

{'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}},
 'data': {'name': 'data-a21cafb4c405e6997671a02e578b9b1e'}, 
 'mark': {'type': 'point'}, 
 '$schema': 'https://vega.github.io/schema/vega-lite/v4.17.0.json', 
 'datasets': {'data-a21cafb4c405e6997671a02e578b9b1e': InlineDataset([{}])}}

jakevdp · 2021-11-05T02:35:11Z

Ah, that’s helpful. My guess is there’s a to_dict call missing somewhere in the dataset handler.

mattijn · 2021-11-05T08:16:21Z

I'm not sure if my commit resolves the effect or the cause. But before, this:

'data': {'values': [{}]}

resolved into alt.InlineData([{}]),
whereas it now resolves into alt.InlineData(alt.InlineDataset([{}])).

So I apply alt.InlineDataset([{}]).to_dict() within the data consolidation function.

jakevdp · 2021-11-05T13:37:21Z

altair/vegalite/v4/api.py

+            if isinstance(data.values, core.InlineDataset):
+                values = data.values.to_dict()
+            else:
+                values = data.values


Nice find! This seems like the right fix. Maybe make it a bit more robust and do

values = data.to_dict()['values']
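
A small sketch of why serializing the parent is the more robust option. The two classes below are hypothetical stand-ins, not Altair's real InlineData and InlineDataset, and extract_values is an illustrative helper name; the assumption is only that the parent's to_dict() unwraps whatever it contains.

```python
class InlineDataset:
    """Hypothetical stand-in for the wrapper around raw inline values."""

    def __init__(self, values):
        self.values = values

    def to_dict(self):
        return self.values


class InlineData:
    """Hypothetical stand-in for the parent data object."""

    def __init__(self, values):
        self.values = values  # either a raw list or an InlineDataset

    def to_dict(self):
        values = self.values
        # Unwrap a nested wrapper if there is one.
        if hasattr(values, "to_dict"):
            values = values.to_dict()
        return {"values": values}


def extract_values(data):
    """Serialize the parent once; nested wrappers are unwrapped along the way."""
    return data.to_dict()["values"]


raw = InlineData([{"a": 1}])
wrapped = InlineData(InlineDataset([{"a": 1}]))
print(extract_values(raw))
print(extract_values(wrapped))
```

Both calls go through the same code path, so no isinstance check on a specific wrapper class is needed at the call site.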

jakevdp · 2021-11-05T15:35:56Z

I think this is looking pretty good! What do you think? Anything missing at this point?

Before merging, I'll create a new altair_viewer release and we can change the CI requirements back to normal.

ChristopherDavisUCI · 2021-11-05T15:41:48Z

> I think this is looking pretty good! What do you think? Anything missing at this point?

Nothing missing that I'm aware of!

mattijn · 2021-11-05T15:55:16Z

Merge-ready! +1

joelostblom · 2021-11-05T15:55:45Z

This is awesome! Thank you all so much for working on this PR, the notifications from this thread have been my favorite morning read over the last few days ❤️

jakevdp · 2021-11-06T12:30:59Z

In altair-viz/altair_viewer#45 I added the most recent vega & vega-embed versions to altair viewer. I triggered a re-run of the CI here to make sure there are no issues.

jakevdp · 2021-11-06T14:20:03Z

Alright, altair-viewer 0.4.0 is released! We can remove the temporary reference to its github branch from this PR

mattijn · 2021-11-06T14:32:13Z

I checked if the tests use altair_viewer version 0.4.0, and this seems to be the case. All good.

jakevdp · 2021-11-08T13:56:32Z

Thanks all - I think we should merge this!

Given the number of commits & nonlinear history, I think I'll opt to squash and merge – I believe that will maintain correct authorship (@ChristopherDavisUCI will be in the author field, @mattijn will be listed in "Co-authored By" and I would be in the committer field). Is everyone happy with that?

ChristopherDavisUCI · 2021-11-08T14:03:31Z

Great! That definitely sounds good to me. (It will be especially nice to get rid of some of those commits going back and forth between different fixes for TopLevelRepeatSpec.)

mattijn · 2021-11-08T14:05:34Z

Perfect

jakevdp · 2021-11-08T14:06:58Z

Alright, it's done! Thanks all, this is really great work 🎉

ChristopherDavisUCI · 2021-11-08T16:32:53Z

Awesome! Will there be a corresponding "release"?

This has been a very fun (and fast-moving) process so I'd like to try playing around with vega-lite version 5. Is it better if I make a pull request early and then you can see what we're trying along the way, or should I wait until we're "finished" or "stuck" and then make the pull request? I think many changes will be necessary related to selections. In general I'm very open to suggestions, like "make fewer commits" or "make smaller self-contained pull requests" or whatever.

Thanks for all the attention!

jakevdp · 2021-11-08T16:53:39Z

Regarding a release... I'd like to iron out the "default data" issue discussed above. I also did a bit of maintenance stuff this morning (updating CI jobs, etc.) but yes, it would be great if we could get a release candidate out sometime this week.

Regarding VL5 - that would be great if you want to open a draft PR! The biggest issue I anticipate there is that VL5 replaced "selections" with a more general "params" specification, so I suspect a number of things will have to change to accommodate that.

ChristopherDavisUCI · 2021-11-09T05:56:09Z

I've experimented a little with Vega-Lite v5 on this branch, although I might want to redo some things now that I better understand what's going on.

Surprisingly the biggest obstacle so far hasn't been params but a change to layer. I think in the newest Vega-Lite schema, charts in a layer are not allowed to specify height or width, which seems to break many Altair examples. Here is a minimal example that doesn't work:

import altair as alt
import pandas as pd

no_data = pd.DataFrame()

c = alt.Chart(no_data).mark_circle().properties(
    width=100
)

c + c

I don't see a good way to deal with that. Do you have a suggestion?

I've read the list of "breaking changes" for the Vega-Lite 5.0.0 release and don't see anything that seems related to this, so it does make me wonder if maybe I misunderstand the cause of the problem.

mattijn · 2021-11-09T08:05:46Z

Hi Chris, if you move your last comment to #2425, or start a new pull request titled 'WIP: update to Vega-Lite 5' and add your observations there, then I'm more than happy to join the process (and maybe others as well!). It would be nice if we can get further with less help from Jake.

It's indeed super awesome to align a few brains and go forward, but be careful: it's addicting, and if it's your spare time then other things probably need attention too ;)

But if we can give ourselves a Christmas gift with a new Altair release, either based on VL4.17 or 5, then I think we are doing great.

ChristopherDavisUCI · 2021-11-09T13:55:12Z

Thanks @mattijn and yes, "addicting" is a good word for it! Thanks for the suggestion; I made a new draft PR ~~#2516~~ #2517

ChristopherDavisUCI and others added 25 commits October 23, 2021 14:48

Include datum in channel wrappers

cc2f9ec

Correct names like NameDatumValue

ede3c68

Update altair/utils/core.py with Datum

fcf005b

Update DatumChannelMixin

5485c23

Replace [] with Array in get_valid_identifier

5ae3399

Allow "week" as time unit

f45e75b

Update allowable time units

6411078

Update TopLevelRepeatSpec

2c78afa

include layer for repeat

0ba0c79

removed TopLevelRepeatSpec, its the schema

3251337

auto-generated files

aeea9f3

Merge pull request #1 from mattijn/sample417

8b41830

Sample417

Dealing with TopLevelRepeatSpec

8305ac3

The way the schema is written for TopLevelRepeatSpec gives the current code trouble. We patch it in an ad hoc way, and should later find a more sustainable method.

Formatting updates

105c924

Update "Encoding Channel Options" part of docs

0a15c5d

Examples of Datum and mark_arc

cec423e

Remove old ad hoc code for TopLevelRepeatSpec

ab9ad94

add linked vegalite issue

7b76665

remove None values in compiled vega so test pass

6fafc33

add linked vl-issue

dfcbc2f

add more documentation on new arc mark

9a365d4

Merge branch 'sample417' into patchtoplevelrepeat

07fe2c3

Update generate_schema_wrapper.py

2ed2814

Update to allow the correct spec options in TopLevelRepeatSpec

08d4b3d

Replace spaces with underscores in IMDB column names

771e036

mattijn added 2 commits November 2, 2021 07:48

Update build.yml

1eef3de

Update docbuild.yml

d594587

Update schemapi.py

25baa86

serialize values InlineDataset within InlineData

c10c57e

jakevdp reviewed Nov 5, 2021

improve robustness

93e8467

remove link to master of altair_viewer

464b203

jakevdp merged commit 55bb4b7 into vega:master Nov 8, 2021

ChristopherDavisUCI deleted the sample417 branch November 9, 2021 05:56

jakevdp mentioned this pull request Nov 9, 2021

Inject empty data at top level for easier datum encodings #2515

Merged
