Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Aggregation user input _index #2031

Merged
merged 3 commits into from Oct 2, 2017

Conversation

Projects
None yet
3 participants
@bpostlethwaite
Copy link
Member

commented Sep 22, 2017

fixes #2024
and adds _index to fullData aggregation.aggregations to make back referencing into user data possible.

cc @alexcjohnson

@bpostlethwaite bpostlethwaite requested a review from etpinard Sep 22, 2017

@bpostlethwaite bpostlethwaite force-pushed the aggregationInputIndicies branch from 9e3088d to 13dd5c6 Sep 22, 2017

add and test aggregation back _index.
This necessitated a change in plots.js to prevent the supply defaults
step running twice with the second run acting on fullData rather then
userData.

@bpostlethwaite bpostlethwaite force-pushed the aggregationInputIndicies branch from 13dd5c6 to 75e6e48 Sep 22, 2017

@etpinard

This comment has been minimized.

Copy link
Member

commented Sep 22, 2017

Very interesting fix. Transforms that do not expand traces shouldn't have to go through supplyTransformDefaults twice. At the moment, all transforms but groupby don't expand traces. So from a pref standpoint, this PR is 👌

cc @rreusser who in #1830 had to go-around a few transform design problems.

@@ -1039,9 +1039,21 @@ plots.supplyTransformDefaults = function(traceIn, traceOut, layout) {
_module = transformsRegistry[type],
transformOut;

/*
* Supply defaults may run twice. First pass runs all supply defaults steps

This comment has been minimized.

Copy link
@rreusser

rreusser Sep 28, 2017

Contributor

Questions:

  • I assume this refers to the way supplyTraceDefaults runs on traces and then again on expanded traces?
  • Which _module is added? I think that sometimes refers to the trace module and sometimes to the transform module (and every now and then to the base plot module, though usually it's called baseModule or something, I think)

This comment has been minimized.

Copy link
@rreusser

rreusser Sep 28, 2017

Contributor

Answer to second: the transform module.

This comment has been minimized.

Copy link
@bpostlethwaite

bpostlethwaite Sep 28, 2017

Author Member

Yep that is what it is referring to

* and adds the _module to any output transforms.
* If transforms exist another pass is run so that any generated traces also
* go through supply defaults. This has the effect of rerunning
* supplyTransforms. If the transform does not have a `transform` function

This comment has been minimized.

Copy link
@rreusser

rreusser Sep 28, 2017

Contributor

*supplyTransformDefaults?

* it could not have generated any new traces and the second stage is
* unnecessary. We detect this case with the following variables.
*/
var isSecondStage = transformIn._module && transformIn._module === _module,

This comment has been minimized.

Copy link
@rreusser

rreusser Sep 28, 2017

Contributor

Is a third stage possible if multiple transforms expand things multiple times? (e.g. groupby + groupby) I think this makes sense to me, but just making sure transform module identity equates to isSecondStage. (Or is it isNotFirstStage?) My understanding is just a little loose here.

This comment has been minimized.

Copy link
@bpostlethwaite

bpostlethwaite Sep 28, 2017

Author Member

right, more than 2 stages... need to think about that. Suggestions?

This comment has been minimized.

Copy link
@bpostlethwaite

bpostlethwaite Sep 28, 2017

Author Member

Right I think we are ok. I should call this variable isNotFirstStage though. Thanks

This comment has been minimized.

Copy link
@rreusser

rreusser Sep 28, 2017

Contributor

Things get fundamentally a bit shaky after one level. Like groupby + groupby. It does mostly what you'd expect, but it turns out styling is the big challenge. Think SQL. You massage the query until you have a nice query that spits out data. Then you insert it into a plot and you're happy. Plotly's transforms incrementally apply styles as a sequence of transforms is applied so that you'd have to track styles as well as the pathway a data point took through the query to really sort things out after the fact. Though I agree that I think we're okay.

This comment has been minimized.

Copy link
@rreusser

rreusser Sep 28, 2017

Contributor

Sorry, specifics: groupby adds a _group property to traces so that you can figure out which group split generated a trace. But it should really be tracking that as a transform index and a group value. This comes into play with transform vs. calcTransform since you start having to worry about when a trace was split/grouped/styled/etc.

This comment has been minimized.

Copy link
@rreusser

rreusser Sep 28, 2017

Contributor

Apologies if that's still a bit loose on the specifics. /cc @alexcjohnson since we've talked this through before in a fair bit of detail. I think our conclusion at the time was that the basic use cases are handled just fine and account for 97+% of realistic use-cases.

@bpostlethwaite

This comment has been minimized.

Copy link
Member Author

commented Sep 28, 2017

@rreusser some changes here: ec0a65f

Does that seem more clear? It seems to be more honest about the possibility of more than 2 stages.

});

expect(enabledAggregations[0].target).toEqual('x');
expect(enabledAggregations[0]._index).toEqual(0);

This comment has been minimized.

Copy link
@rreusser

rreusser Sep 28, 2017

Contributor

Is there any way to test this that doesn't rely upon an underscored property? Like could supplyTransformDefaults delete the first aggregation if it's overridden by another? Would that be adequate to sort things out in the workspace or does the workspace need to know indices?

This comment has been minimized.

Copy link
@bpostlethwaite

bpostlethwaite Sep 28, 2017

Author Member

The only reason for indicies is for Workspace. supplyTransformDefaults doesn't care about them.

This comment has been minimized.

Copy link
@bpostlethwaite

bpostlethwaite Sep 28, 2017

Author Member

I need a way in Workspace to take a fullData transform Aggregration and write back to the correct userData index. I need some way to correlate userData indicies with the fullData equivalents.

This comment has been minimized.

Copy link
@bpostlethwaite

bpostlethwaite Sep 28, 2017

Author Member

maybe it shouldn't be underscored? Something like .inputIndex. the only problem with that is there are precedents for things being added to fullData for consumption by users that have an underscore. _input for example.

This comment has been minimized.

Copy link
@rreusser

rreusser Sep 28, 2017

Contributor

So the workspace accesses indices directly? (That's okay, just making sure. plotly.js's 'private' properties are fairly publicly accessed, so it's nothing new. Just means they need to be modified with great care.)

This comment has been minimized.

Copy link
@bpostlethwaite

bpostlethwaite Sep 28, 2017

Author Member

yep, and the workspace is the only thing that needs them. Plotly.js doesn't care as it rebuilds new arrays each time.

@rreusser

This comment has been minimized.

Copy link
Contributor

commented Sep 28, 2017

Above corner-case comments notwithstanding, this all looks reasonable to me.

@etpinard

This comment has been minimized.

Copy link
Member

commented Oct 2, 2017

Ok. Looks like @rreusser approves this.

💃 for me @bpostlethwaite

This will be part of 1.31.0 set to be released no-later than October 5.

@bpostlethwaite bpostlethwaite merged commit f617aef into master Oct 2, 2017

1 check passed

ci/circleci Your tests passed on CircleCI!
Details

@bpostlethwaite bpostlethwaite deleted the aggregationInputIndicies branch Oct 2, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.