Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: add method-based attribute setting to encoding classes #1629

Closed
wants to merge 1 commit into from

Conversation

jakevdp
Copy link
Collaborator

@jakevdp jakevdp commented Jul 26, 2019

Do not merge until Altair 4.0.

This is a new prototype of how to set encoding attributes via methods rather than arguments. So rather than

alt.Chart(data).mark_point().encode(
    alt.X('column:Q', axis=alt.Axis(None), scale=alt.Scale(zero=False), title='my title'),
    alt.Color('color:N', legend=alt.Legend(orient='bottom'), scale=alt.Scale(scheme='viridis'))
)

you could instead do

alt.Chart(data).mark_point().encode(
    alt.X('column:Q').axis(None).scale(zero=False).title('my title'),
    alt.Color('color:N').legend(orient='bottom').scale(scheme='viridis')
)

note that the old way still works as well.

The tradeoff is that attributes of encoding classes can no longer be accessed directly via getattr, but this was not a very common pattern anyway.

Still TODO:

  • add useful documentation and keyword argument completion for these functions.
  • tests ensuring equivalence of old and new syntax.

If you'd like to try this in its current state, you can run

pip install git+https://github.com/jakevdp/altair.git@attr-access

@domoritz
Copy link
Member

domoritz commented Jul 26, 2019

I think this is awesome! I wanted to raise a few questions/concerns to make sure I fully understand the proposal.

Do we support the method-style API for nested properties or just encodings?

One thing I am worried about is how people access nested properties. For example, if I want to customize the scheme.

This is how I set the Scale's scheme property.

alt.Chart(data).mark_point().encode(
    alt.Color('color:Q').scale(scheme='viridis')
)

If I now want to customize the scheme, I need to switch from the method style to the argument style. I hope it's not going to be too surprising for people.

alt.Chart(data).mark_point().encode(
    alt.Color('color:Q').scale(scheme=alt.Scheme(name='viridis', extent=[0, 0.9]))
)

I assume that Altair supports this method syntax for everything, not just encodings, right?

alt.Chart(data).mark_point().encode(
    alt.Color('color:Q').scale(alt.Scheme().name('viridis').extent([0, 0.9]))
)

Otherwise, I worry about making a special syntax for encodings.

Formatting long specs

The other thing I think about is formatting. Is this valid Python?

alt.Chart(data).mark_point().encode(
    alt.X('column:Q')
		.axis(None)
		.scale(zero=False)
		.title('my title'),
    alt.Color('color:N')
		.legend(orient='bottom')
		.scale(scheme='viridis')
)

In the current syntax, one can write this.

alt.Chart(data).mark_point().encode(
    alt.X(
		'column:Q',
		axis=alt.Axis(None),
		scale=alt.Scale(zero=False),
		title='my title'),
    alt.Color(
		'color:N',
		legend=alt.Legend(orient='bottom'),
		scale=alt.Scale(scheme='viridis')
	)
)

Help people use the new way

Another concern is that Altair now has two ways to express the same thing. I definitely think that we want to be backward compatible but I wonder how we can help people transition to the new API without too much confusion. How difficult would it be to write a converter? Kind of like 2to3 for Python 3?

@jakevdp
Copy link
Collaborator Author

jakevdp commented Jul 26, 2019

Thanks @domoritz – a few thoughts:

Do we support the method-style API for nested properties or just encodings?

Up until now we've supported limited method-style APIs for top-level objects (e.g. alt.Chart().mark_point().encode().configure_scale()...etc.) The proposal here is to extend that kind of syntax only to encoding channel objects. It's not clear to me that extending that to further nested objects would be useful. In your example of the scheme, how would you propose a syntax for nesting method-based setting? scale() returns the parent encoding object to allow chaining. So maybe you could do

alt.Color('column', 
    scale=alt.Scale().scheme('viridis', extent=[0, 0.9])
)

but it's not clear to me that's better than

alt.Color('column').scale(
    scheme=alt.Scheme('viridis', extent=[0, 0.9])
)

particularly since nested settings like this are so much less common than their unnested counterparts. A question: how does the JS API deal with nested settings like this?

Formatting long specs

Roughly, in Python whitespace doesn't matter if you're inside a parentheses block. So for example this is not valid Python

alt.Chart()
    .mark_point()
    .encode(x='foo')

but it's really nice, which is why sometimes you come across code written like this, which is valid Python:

(alt.Chart()
    .mark_point()
    .encode(x='foo'))

Because these encoding properties generally appear within parentheses, they can take advantage of this, and this is indeed valid Python:

alt.Chart(data).mark_point().encode(
    alt.X('column:Q')
		.axis(None)
		.scale(zero=False)
		.title('my title'),
    alt.Color('color:N')
		.legend(orient='bottom')
		.scale(scheme='viridis')
)

Helping people use the new way

The old way would of course still be valid, so most code will not break (the exception is, e.g., if someone tries to access chart.encoding.x.scale and expects it to be an alt.Scale object). But I'd change all examples and demos to use the new syntax, in order to encourage its adoption.

@jcmkk3
Copy link

jcmkk3 commented Jul 26, 2019

Would it make sense to consider using “magic underscore” style arguments for nested properties beyond the method’s depth?

The Plotly.py library recently implemented that syntax based on successful usage in the Julia version of Plotly. Read more about the Plotly.py implementation in this issue.

That would turn:

alt.Chart(data).mark_point().encode(
    alt.Color('color:Q').scale(scheme=alt.Scheme(name='viridis', extent=[0, 0.9]))
)

into:

alt.Chart(data).mark_point().encode(
    alt.Color('color:Q').scale(scheme_name='viridis', scheme_extent=[0, 0.9])
)

@iliatimofeev
Copy link
Contributor

Maybe:

alt.Chart(data).mark_point().encode(
    alt.Color('color:Q', alt.Scale(alt.Scheme(name='viridis', extent=[0, 0.9]))
)

@jakevdp
Copy link
Collaborator Author

jakevdp commented Jul 29, 2019

@iliatimofeev that sort of specification was available in Altair v1, but we abandoned it in later versions.

@iliatimofeev
Copy link
Contributor

@jakevdp for encoding channels it still works in your prototype :).
As I understand the goal is to avoid doubling semantics like scale = alt.Scale()
Chaining methods look very cool on the second layer but breaks on the third. But scale = alt.Scale( scheme=alt.Scheme()) is quite regular case.

The idea of “magic underscore” is cool too but it will break compatibility with vega-lite documentation.

I guess type based notation is more universal
alt.Color(alt.Scale( alt.Scheme('blues'),alt.Domain(-1,1), clip = False)

Do you remember the reasons why it is not supported in v2?

@jakevdp
Copy link
Collaborator Author

jakevdp commented Jul 29, 2019

It's always worked for arguments of chart.encode(), but only worked for encoding channel arguments in Altair v1.

The main reason we dropped it is because it became difficult to implement in a consistent way when we moved from a traitlet-based prescriptive type system to a jsonschema-based proscriptive type system.

@iliatimofeev
Copy link
Contributor

I know two cases where "types" style wouldn't work: tooltips channel and conditions.
For tooltips, we can add an extra virtual channel: atl.Tooltips("data1:N","data2:T", alt.Tooltip("data_t",timeUnit="day"))
And for conditions add extra parameter else to channels, like: alt.Color("data1:N", alt.Else(selection,"red"))

@ellisonbg
Copy link
Collaborator

ellisonbg commented Aug 26, 2019

tl;dr - I like method chaining, but let's not break the attribute API in the process.

The original idea was for the chained method API to provide a thin optional layer on top of the attribute based approach, in particular for "single expression" interactive usage. However, for deeper API usage where you are closing, reusing, mutating, etc. Chart objects across a library, the attribute based approach remains powerful and expressive.

Because of this, the main concern I have here is breaking the attribute based API. I think that will cause a very high level of pain for users and major API breakage. The reason the existing encode method doesn't interfere with the attribute based API is that the noun (encoding) is the attribute and the very (encode) is the method. While libraries regularly break this separate of noun-based attributes and verb-based attributes, it was one that we tried to follow closely in the Altair APIs. The reason this proposed API breaks the attribute based access is precisely because it overrides the noun-based attributes to be methods. Because of that I am not in favor of this API change as is, but would welcome more method chaining that preserves the attribute based access.

@jakevdp
Copy link
Collaborator Author

jakevdp commented Aug 26, 2019

Thanks for the input, @ellisonbg. Philosophically, I like the "verb/noun" construct we started with, but note that it breaks down almost immediately when applied to the Vega-Lite schema (transform/transform and bin/bin come to mind). And what is the verb form of a setting like scale? I suppose set_scale() makes sense, but I find explicit getters & setters like this to be a bit cluttery.

What this PR comes down to is this: I think I'm willing to forego philosophical purity of a rarely-used API (nested attribute access in charts) in favor of simplicity for an often-used API (chart construction via uncluttered method chaining).

I guess my concrete question for you would be: given the constraints you're concerned about, what API would you propose for this chart? Or are you OK keeping the status quo?

alt.Chart(data).mark_point().encode(
    alt.X('column:Q', axis=alt.Axis(None), scale=alt.Scale(zero=False), title='my title'),
    alt.Color('color:N', legend=alt.Legend(orient='bottom'), scale=alt.Scale(scheme='viridis'))
)

@ellisonbg
Copy link
Collaborator

ellisonbg commented Aug 26, 2019 via email

@jakevdp
Copy link
Collaborator Author

jakevdp commented Aug 26, 2019

If we go with the method chaining, what is the API for mutating an
existing spec. Let's say I want to take the spec above, and change the x
title.

Because getattr and setattr are different code paths, chart.encoding.x.title = new_title will do what you expect.

If we go with the method chaining, how do I ask a spec the value of an
attribute?

That's what changes here... title = chart.encoding.x.title is now a function rather than the title value. You'd have to do title = chart.encoding.x['title'] to get the value out. This is the core of the change here and the core of the tradeoff – I don't think I've ever seen anyone use code that depends on accessing deeply-nested attributes like this, but I very often see code constructing charts with large lists of prop = alt.Prop(...) arguments.

@jakevdp
Copy link
Collaborator Author

jakevdp commented Aug 26, 2019

One option, which is admittedly a bit weird, would be to initialize each value with a setter function, and have that function replace itself with the value once it's called.

Then this would work:

x = alt.X('foo').title('foo')
x.title == 'foo'  # True

@domoritz
Copy link
Member

But this wouldn't, right?

x = alt.X('foo').title('foo')
xx = x.title('bar')

@jakevdp
Copy link
Collaborator Author

jakevdp commented Aug 26, 2019

Right, that wouldn't work.

@jakevdp
Copy link
Collaborator Author

jakevdp commented Aug 26, 2019

I think that broadly, the problem is that Altair was originally built on top of traitlets, and we've been slowly moving away from a traitlets-like model because it's fundamentally incompatible with a proscriptive schema-based object validation.

Now we're trying to evolve the chart construction API, and we're running into conflicts with vestiges of the original traitlets-based approach that still exist, but for no good reason other than a modicum of backward compatibility between Altair v1 and v2. If it comes down to a choice between the two, I'd personally lean toward jettisoning the traitlets-inspired model altogether in favor of a better plotting API.

@ellisonbg
Copy link
Collaborator

ellisonbg commented Oct 30, 2019 via email

@jakevdp
Copy link
Collaborator Author

jakevdp commented Oct 31, 2019

Thanks Brian - one possiblity I thought of: we could make any undefined attributes evaluate as a callable setter function. But then I could imagine users getting pretty confused by that if they try to override an existing value...

@jakevdp
Copy link
Collaborator Author

jakevdp commented Oct 31, 2019

Another possibility is to have method-based setters only for channel classes, and to replace these setters with the attributes once they're used in the chart. So this would work correctly:

chart = alt.Chart(data).encode(
  x=alt.X('x').bin(maxbins=20)
)
assert chart.x.bin.maxbins == 20

@domoritz
Copy link
Member

domoritz commented Oct 31, 2019

In the last example, would this work?

x = alt.X('x').bin(maxbins=20)

chart1 = alt.Chart(data).encode(
  x=x
)
assert chart1.x.bin.maxbins == 20

xx = x.bin(maxbins=30) # does this work???

chart2 = alt.Chart(data).encode(
  x=xx
)
assert chart2.x.bin.maxbins == 30

@jakevdp
Copy link
Collaborator Author

jakevdp commented Oct 31, 2019

@domoritz - yes, it could be made so that would work.

@jakevdp jakevdp mentioned this pull request Nov 23, 2019
@jcmkk3
Copy link

jcmkk3 commented Jan 16, 2021

It seems like we're getting close to a new major vega-lite release again. I'm wondering if it is worth revisiting this change to the API. I for one, would love to see this change. I think that it would be a usability improvement and make it easier to move between Altair and vega-lite-api. I'd be happy to help translating docs, if you decide to go forward with it.

@joelostblom
Copy link
Contributor

This would be a super cool enhancement that I would love to see as well! (but I understand probably quite time consuming to implement) I would be happy to help out with testing, writing docs, or something of similar difficulty/skill level if needed.

@jakevdp
Copy link
Collaborator Author

jakevdp commented Jan 16, 2021

We're still stalled on wrapping Vega-Lite 4.9... I think it's going to be a while before we can get to the newer Vega-Lite features, and a change like this is even more major.

@mattijn mattijn mentioned this pull request Mar 2, 2021
@ChristopherDavisUCI
Copy link
Contributor

Would the next release of Altair be a natural time to add this? I would be very happy to experiment with this PR if there is interest!

joelostblom added a commit to joelostblom/altair that referenced this pull request Apr 16, 2022
@joelostblom
Copy link
Contributor

I would love to see this too and have been thinking that a 5.0 release would be an opportune moment to introduce a big addition like this! I started working on this a little bit myself a few weeks ago and I just uploaded my branch to #2592, feel free to build on it or just copy the files and start over, whatever is easier.

As far as I understand, the biggest outstanding item from the discussion above is whether the new method-based syntax warrants losing the ability of getting and setting values via the syntax chart.encoding.x.title and instead having to do chart.encoding.x['title']. If not, we would need a workaround, maybe one of the one's Jake suggested above? I also asked about this on SO recently and it is actually possible to have chart.encoding.x.title work for setting/getting and have chart.encoding.x.title() update the value, but that could be confusing so it is probably best to avoid it.

For me personally, the added value of using the syntax proposed here outweighs the inconvenience of having to use the dictionary-like syntax to get/set values, but I think it is up to @jakevdp and @ellisonbg to decide on which direction they prefer for Altair and what would be the most helpful way to contribute here.

@domoritz
Copy link
Member

I agree this api would be super nice and that breaking the access as you suggested makes sense. I guess there may be some benefit to maintaining backwards compatibility and we could do so with a warning.

mattijn added a commit that referenced this pull request Jan 3, 2023
* Add method based syntax from #1629 to v5

* Add correct function signatures and docstrings, and initial autocompletion parameter support

* Add general param info to docstrings and pass through adding the helper class docstring when not appropriate

* Deal with docstring missing "Attributes" tag

* Fix black formatting

* Fix black formatting

* Update relative imports syntax

* Only generate new schema for the latest version of vega lite to avoid backported changes that we do not test on older versions

* Type hints for tab completion

* Fix NoneType notation

* Replace Type[None] with None

* Basic documentation for attribute setter methods

* Apply suggestions from code review

Co-authored-by: Stefan Binder <binder_stefan@outlook.com>

* define test_examples for both syntax

* renaming iter_examples for methods syntax

* add test examples using methods syntax

* improve comments

* exclude method examples dir from flake8 and black

* include commit credits original prototype: #1629

Co-authored-by: Jake VanderPlas <781659+jakevdp@users.noreply.github.com>

Co-authored-by: Joel Ostblom <joel.ostblom@gmail.com>
Co-authored-by: Stefan Binder <binder_stefan@outlook.com>
Co-authored-by: mattijn <mattijn@gmail.com>
Co-authored-by: Jake VanderPlas <781659+jakevdp@users.noreply.github.com>
@joelostblom
Copy link
Contributor

Implemented in #2795

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants