Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor stream renaming and stream preprocessors #1224

Closed
jlstevens opened this issue Mar 23, 2017 · 23 comments
Closed

Refactor stream renaming and stream preprocessors #1224

jlstevens opened this issue Mar 23, 2017 · 23 comments
Assignees
Labels
Milestone

Comments

@jlstevens
Copy link
Contributor

jlstevens commented Mar 23, 2017

Currently Stream classes have a preprocessors argument that allow for renaming and other transformations of stream parameters. This was an initial concept we introduced, primarily because we realized that stream names might clash and wanted to introduce a mechanism that could resolve these clashes.

Now we have more experience using the streams system, it no longer seems the appropriate level at which to introduce preprocessors, which we do know can be a useful concept in practice e.g we've found one example needing a preprocessor relating to datashader.

As the callback that generates elements lives on DynamicMap, this now seems to be the place to apply such transformations. The proposal here is two-fold: Streams should handle clashes only and preprocessors on DynamicMap should handle transformations, once there are no name clashes.

Resolving name clashes

Streams should now lose the preprocessors argument and gain a rename method instead:

coords = PositionXY().rename(x='long', y='lat')

As this should be possible via the constructor, this should be equivalent to:

coords = PositionXY(rename={'x':'long', 'y':'lat'})

Note that I think rename should give you a new stream instance which is identical except for the renamed values in the streams contents property.

As there can be no clashes between stream parameters on a single stream, so you only really need to use the rename in the streams argument to DynamicMap:

DynamicMap(my_callback, streams=[Tap(), DoubleTap().rename(x='dtap_x', y='dtap_y')])

or equivalently

DynamicMap(my_callback, streams=[Tap(), DoubleTap(rename=dict(x='dtap_x', y='dtap_y'))])

This resolves the fact that both Tap and DoubleTap have x and y stream parameters by default.

Note that DynamicMap can check for clashes in the constructor - I suspect we do this already. I am also wondering if map might be better as it is shorter than rename but that might be confusing as we have map methods elsewhere. Note that remap is also shorter than rename...

Preprocessors

Now that stream parameters will be disambiguated, we can have a single preprocessor at the DynamicMap level that simply maps a dictionary of kwargs to a new dictionary of kwargs. We decided to have a single preprocessor instead of a list as:

  1. This is simpler.
  2. You can easily chain preprocessors yourself e.g
preprocessor=lambda kws: foo(bar(baz(kws)))
  1. Even though the above is a lambda which won't pickle, we don't expect DynamicMaps to pickle ( @philippjfr Isn't this a separate issue to address, shouldn't we safely skip DynamicMaps when pickling/unpickling instead of failing outright?)

Note that because the preprocessor receives the renamed parameters at the DynamicMap level, it would be recommended that you apply rename inside the streams list so everything is declared together in one place (making things easier to reason about).

There should be no issue applying the rename method in the streams list for re-used streams e.g:

tap = Tap()
dtap = DoubleTap()
DynamicMap(my_callback, streams=[tap, dtap(rename=dict(x='dtap_x', y='dtap_y'))])

# Generating an explicit event
dtap.update(dtap_x=0, dtap_y=0)

Call signatures

There is an issue about picking the appropriate call signatures in DynamicMap and we have explicitly decided that this is a separate issue from the concepts discussed in this issue.

Edit: There are a few things that are more restricted under this proposal. Here is the docstring for the existing Group preprocessor at the streams level:

class Group(Preprocessor):
    """
    A preprocessor that keeps the parameter dictionary together,
    supplying it as a value associated with the given key.
    """

As this was at the streams level, you would receive a single dictionary for each stream. With a preprocessor at the DynamicMap level, you would still need to rename the stream parameters then group.

Just mentioning this for completeness as I don't think this is an important use-case (afaik we've never used Group).

@jlstevens jlstevens added this to the v1.7.0 milestone Mar 23, 2017
@jlstevens jlstevens self-assigned this Mar 23, 2017
@jbednar
Copy link
Member

jbednar commented Mar 23, 2017

I'm very happy with this proposal, thanks!

shouldn't we safely skip DynamicMaps when pickling/unpickling instead of failing outright?

That does seem like a case for skipping with a warning, so that people can use a DynamicMap in one cell without causing problems if they try to pickle everything (with export).

dtap = DoubleTap()
DynamicMap(my_callback, streams=[tap, dtap(rename=dict(x='dtap_x', y='dtap_y'))])
dtap.update(dtap_x=0, dtap_y=0)

How would this work? How would dtap be aware of anything that happened in the DynamicMap call? I thought a new stream would be created, a proxy for the old one, that does the renaming?

@jlstevens
Copy link
Contributor Author

jlstevens commented Mar 23, 2017

@jbednar Thanks, I corrected what I meant below:

tap = Tap()
dtap = DoubleTap()
DynamicMap(my_callback, streams=[tap, dtap.rename(x='dtap_x', y='dtap_y')])

# Generating an explicit event
dtap.update(dtap_x=0, dtap_y=0)

Should be the rename method, not the constructor approach.

@philippjfr
Copy link
Member

Should be the rename method, not the constructor approach.

Even so, the rename method will still return a new stream no?

@jbednar
Copy link
Member

jbednar commented Mar 23, 2017

Unless you want to provide renaming as something all streams inherently know how to do, with a lookup table on every one, so that they know their arguments as well as all aliases available for their arguments. Not certain that's a good idea...

@jlstevens
Copy link
Contributor Author

jlstevens commented Mar 23, 2017

Even so, the rename method will still return a new stream no?

Right.

Unless you want to provide renaming as something all streams inherently know how to do, with a lookup table on every one, so that they know their arguments as well as all aliases available for their arguments. Not certain that's a good idea...

Yeah, I think a new instance is a better approach.

Edit: I've realized that maybe we do mean the same thing? Each stream instance will have a bit of state to remap its parameter names as necessary. Or do you mean some class attribute level approach?

@jbednar
Copy link
Member

jbednar commented Mar 23, 2017

In which case the code becomes:

tap = Tap()
dtap = DoubleTap().rename(x='dtap_x', y='dtap_y')
DynamicMap(my_callback, streams=[tap, dtap])

# Generating an explicit event
dtap.update(dtap_x=0, dtap_y=0)

@jlstevens
Copy link
Contributor Author

jlstevens commented Mar 23, 2017

Ah, I see the issue now. Your code is right, but to be honest the recommended thing here is probably:

tap = Tap()
dtap = DoubleTap()
dmap = DynamicMap(my_callback, streams=[tap, dtap.rename(x='dtap_x', y='dtap_y')])
# ... some other use of ``dtap`` here...

# Generating an explicit event
dmap.event(dtap_x=0, dtap_y=0)

I need a better example of reusing streams...

@jbednar
Copy link
Member

jbednar commented Mar 23, 2017

Ok, now that's a version that would work, assuming event() does what I'd assume it does. Sounds good!

@jbednar
Copy link
Member

jbednar commented Mar 23, 2017

Are you sure that's the recommended approach, though? In your version, if someone does:

dtap.update(dtap_x=0, dtap_y=0)

it's now an error (no such argument dtap_x), and if someone does:

dtap.update(x=0, y=0)

then there's no error but nothing will happen, because DynamicMap knows nothing about dtap (only the renamed version). In my suggested code there is no handle sitting around to the original, non-renamed version of dtap to confuse people.

@jlstevens
Copy link
Contributor Author

That is now my question. @philippjfr Is there any reason to re-use streams at all except to save typing?

In other words, I expect two streams instances of the same type with the same state (target, remapping etc) to behave as the same thing. Any reason the actual identity of the instance might matter?

@philippjfr
Copy link
Member

That is now my question. @philippjfr Is there any reason to re-use streams at all except to save typing?

I don't think so, as long as they both point to the same source I think they will even be hooked up to the same bokeh Callback.

@jlstevens
Copy link
Contributor Author

Ok great, all these things are mostly equivalent though I agree my version leaves a useless handle around. Combining suggestions here is what I would probably use in the end:

tap = Tap()
dtap = DoubleTap(rename=dict(x='dtap_x', y='dtap_y'))
dmap = DynamicMap(my_callback, streams=[tap, dtap])

# Generating an explicit event
dtap.update(dtap_x=0, dtap_y=0)
# Or
dmap.event(dtap_x=0, dtap_y=0)

@jlstevens
Copy link
Contributor Author

jlstevens commented Mar 23, 2017

Actually, maybe this is an argument that rename should just have side-effects and return self? Now I'm wondering if returning a new instance just complicates things.

Edit: I still prefer remap as a name by the way.

@philippjfr
Copy link
Member

Actually, maybe this is an argument that rename should just have side-effects and return self? Now I'm wondering if returning a new instance just complicates things.

Seems like a really bad idea, in case you did use the stream somewhere else previously.

@jlstevens
Copy link
Contributor Author

Good point. And how do you feel about calling it remap?

@philippjfr
Copy link
Member

Personally I think rename is clearer, but I suppose I could be persuaded.

@jlstevens
Copy link
Contributor Author

I think remap is better because:

  1. It is shorter
  2. A rename method sounds like the stream has a name and you are changing it e.g stream.rename('bob').
  3. Conversely remap suggests there can be more than one thing you are remapping i.e a dictionary (also known as a map).

@jbednar What do you prefer?

@jlstevens
Copy link
Contributor Author

jlstevens commented Mar 23, 2017

I also think it is a better name for what it does. The docstring could be something like:

"""
Remap the stream parameter names to avoid name clashes with other streams
"""

It also makes sense as a mathematical map i.e a one-to-one relationship as all parameters need some sort of name (even if left as the default).

@philippjfr
Copy link
Member

I'll let @jbednar decide.

@jbednar
Copy link
Member

jbednar commented Mar 23, 2017

I think either one is accurate, since the underlying operation is arguably remap_names. I think I prefer how "rename" includes the word "name" to imply that it's only about naming, not other things that could be remapped (e.g. values between different coordinate systems).

@philippjfr
Copy link
Member

I think I prefer how "rename" includes the word "name" to imply that it's only about naming, not other things that could be remapped (e.g. values between different coordinate systems).

That matches my feelings about it.

@jlstevens
Copy link
Contributor Author

Ok, rename it is then.

@jlstevens jlstevens mentioned this issue Mar 29, 2017
6 tasks
@jlstevens
Copy link
Contributor Author

I think everything discussed in this issue has been addressed by the PR above. The rename suggestion was implemented as suggested here but the preprocess suggestion changed into a transform method on streams (an approach I now think is a lot better).

Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants