Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Aggregation user input _index #2031

Merged
merged 3 commits into from
Oct 2, 2017
Merged

Conversation

bpostlethwaite
Copy link
Member

fixes #2024
and adds _index to fullData aggregation.aggregations to make back referencing into user data possible.

cc @alexcjohnson

This necessitated a change in plots.js to prevent the supply defaults
step running twice with the second run acting on fullData rather then
userData.
@etpinard
Copy link
Contributor

etpinard commented Sep 22, 2017

Very interesting fix. Transforms that do not expand traces shouldn't have to go through supplyTransformDefaults twice. At the moment, all transforms but groupby don't expand traces. So from a pref standpoint, this PR is 👌

cc @rreusser who in #1830 had to go-around a few transform design problems.

@@ -1039,9 +1039,21 @@ plots.supplyTransformDefaults = function(traceIn, traceOut, layout) {
_module = transformsRegistry[type],
transformOut;

/*
* Supply defaults may run twice. First pass runs all supply defaults steps
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questions:

  • I assume this refers to the way supplyTraceDefaults runs on traces and then again on expanded traces?
  • Which _module is added? I think that sometimes refers to the trace module and sometimes to the transform module (and every now and then to the base plot module, though usually it's called baseModule or something, I think)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answer to second: the transform module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that is what it is referring to

* and adds the _module to any output transforms.
* If transforms exist another pass is run so that any generated traces also
* go through supply defaults. This has the effect of rerunning
* supplyTransforms. If the transform does not have a `transform` function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*supplyTransformDefaults?

* it could not have generated any new traces and the second stage is
* unnecessary. We detect this case with the following variables.
*/
var isSecondStage = transformIn._module && transformIn._module === _module,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a third stage possible if multiple transforms expand things multiple times? (e.g. groupby + groupby) I think this makes sense to me, but just making sure transform module identity equates to isSecondStage. (Or is it isNotFirstStage?) My understanding is just a little loose here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, more than 2 stages... need to think about that. Suggestions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I think we are ok. I should call this variable isNotFirstStage though. Thanks

Copy link
Contributor

@rreusser rreusser Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things get fundamentally a bit shaky after one level. Like groupby + groupby. It does mostly what you'd expect, but it turns out styling is the big challenge. Think SQL. You massage the query until you have a nice query that spits out data. Then you insert it into a plot and you're happy. Plotly's transforms incrementally apply styles as a sequence of transforms is applied so that you'd have to track styles as well as the pathway a data point took through the query to really sort things out after the fact. Though I agree that I think we're okay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, specifics: groupby adds a _group property to traces so that you can figure out which group split generated a trace. But it should really be tracking that as a transform index and a group value. This comes into play with transform vs. calcTransform since you start having to worry about when a trace was split/grouped/styled/etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies if that's still a bit loose on the specifics. /cc @alexcjohnson since we've talked this through before in a fair bit of detail. I think our conclusion at the time was that the basic use cases are handled just fine and account for 97+% of realistic use-cases.

@bpostlethwaite
Copy link
Member Author

@rreusser some changes here: ec0a65f

Does that seem more clear? It seems to be more honest about the possibility of more than 2 stages.

});

expect(enabledAggregations[0].target).toEqual('x');
expect(enabledAggregations[0]._index).toEqual(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to test this that doesn't rely upon an underscored property? Like could supplyTransformDefaults delete the first aggregation if it's overridden by another? Would that be adequate to sort things out in the workspace or does the workspace need to know indices?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason for indicies is for Workspace. supplyTransformDefaults doesn't care about them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need a way in Workspace to take a fullData transform Aggregration and write back to the correct userData index. I need some way to correlate userData indicies with the fullData equivalents.

Copy link
Member Author

@bpostlethwaite bpostlethwaite Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it shouldn't be underscored? Something like .inputIndex. the only problem with that is there are precedents for things being added to fullData for consumption by users that have an underscore. _input for example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the workspace accesses indices directly? (That's okay, just making sure. plotly.js's 'private' properties are fairly publicly accessed, so it's nothing new. Just means they need to be modified with great care.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, and the workspace is the only thing that needs them. Plotly.js doesn't care as it rebuilds new arrays each time.

@rreusser
Copy link
Contributor

Above corner-case comments notwithstanding, this all looks reasonable to me.

@etpinard
Copy link
Contributor

etpinard commented Oct 2, 2017

Ok. Looks like @rreusser approves this.

💃 for me @bpostlethwaite

This will be part of 1.31.0 set to be released no-later than October 5.

@bpostlethwaite bpostlethwaite merged commit f617aef into master Oct 2, 2017
@bpostlethwaite bpostlethwaite deleted the aggregationInputIndicies branch October 2, 2017 13:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregation transform throws when aggregations not defined
3 participants