Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Stack Transform #3771

Merged
merged 19 commits into from
May 22, 2018
Merged

Add Stack Transform #3771

merged 19 commits into from
May 22, 2018

Conversation

invokesus
Copy link
Member

@invokesus invokesus commented May 18, 2018

Fix #3311

To be done:

  • Documentation
  • Examples
    • Basic Example
    • Diverging Stacked Bar
    • Mosaic / Marimekko Chart
  • Test coverage
  • Let's add @hide to the description of StackTransform to hide it from the schema for now (and re-compile schema) so we can merge this after you address the unaddressed comments.
  • For examples you wanna add, please add _future suffix to the example files so they can be included even though stack transform wouldn't be in the schema yet.

@invokesus invokesus requested a review from kanitw May 18, 2018 19:13
} else if (isStack(t)) {
const stack = head = StackNode.makeFromTransform(head, t);

for (const field of keys(stack.producedFields())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is called repeatedly through this method. In a separate PR, we should refactor this part #3772.

Copy link
Member

@kanitw kanitw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start. Please see comments below.

I appreciate that you try to make atomic commits overall!

Some tips:

  1. Try to structure so that "refactor existing code" (like "Refactor to generate as property during make.") come earlier.
  2. Try to avoid make later commits overriding prior changes. (Otherwise these commits are no longer atomic, and then reviewers need to look at more diff than they need to.)

Note: Github currently has a stupid bug that "hide some of my comments". Please make sure to expand each of them and address them accordingly.

image

/**
* Stack measure's field. Used in makeFromTransform
*/
stack?: string;
Copy link
Member

@kanitw kanitw May 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this different from field? Isn't it exactly the same thing?

Note that the we should design the component such that it represents information that's equilvalent to what we need in the output Vega stack transform. (We should later decouple things that we need for impute to a separate ImputeNode)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which one should I keep stack or field?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also name it stackField.

(We later do rename it in the assemble method anyway):

const {facetby, field: stackField, dimensionFieldDef, impute, offset, sort, stackby} = this._stack;

out[this._stack.field + '_start'] = true;
out[this._stack.field + '_end'] = true;

const out = this._stack.as.reduce((result, item) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just return this without declaring?

*/
sort: VgSort;
sort: VgSort | SortField; // TODO Simplify the sort. make uses VgSort.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why union type here? Why not converting SortField to VgSort right away?

*/
field: string;
field?: string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this optional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will keep either stack or field and make that compulsory.

const dimensionField = dimensionFieldDef ? vgField(dimensionFieldDef, {binSuffix: 'mid'}): undefined;
// Detects whether assemble call is from Node created from make or MakeFromTransform
// TODO Create a stricter function to check that.
if (this._stack.stackby) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not add this additional nesting level. See the rationale in the comment on the "else" line.

src/transform.ts Outdated
*/
groupby: string[];
/**
* Mode for stacking marks. Default is 'zero'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use our standard default value format.

 Mode for stacking marks.

__Default value:__ `"zero"`

src/transform.ts Outdated
sort?: SortField ;
/**
* Output field names.
* y2 or x2 is optional if only one name is provided the end will be “<name>_end”
Copy link
Member

@kanitw kanitw May 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

People might not get what is x2 and y2. Maybe:

Output field names. This can be either a string or an array of strings with two elements denoting the name for the fields for stack start and stack end respectively.
If a single string is provided, the end field will be "_end".

Copy link
Member Author

@invokesus invokesus May 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a single string is provided, the end field will be "_end".

Right now if a single string(eg. "values") is provided, the end field is "values_end".

"color": {"field": "c", "type": "ordinal",}
}
});
describe('StackNode.make', () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makeForEncoding?

} else {
const {stack, groupby, offset, sort, as} = this._stack;
let normalizedAs: Array<string>;
if (isAsValidArray(as)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this logic for as appearing here too? I don't think this is necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just see that you removed this in a later commit.

@@ -23,7 +23,7 @@ function getStackByFields(model: UnitModel): string[] {

export interface StackComponent {
// TODO it would be cleaner to compose the two interfaces to create this one.
// TODO remove Used in *comments
// TODO Rename make to makeFromTransform
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename make to makeFromEncoding?

@vega vega deleted a comment from invokesus May 18, 2018
Copy link
Member

@kanitw kanitw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a few more comments :)

(For examples, you might want to separate examples to another branch after addressing these few comments as we're still debating about it on Slack.)

value: 0
});
}
if(facetby) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why adding this if(facetby) {?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this since I haven't read how facetby is generated, but the general logic behind the if is that there are 3 code segments in assemble:

  • Segment 1: when this._stack has been created by makeFromEncoding where impute: true
  • Segment 2: when this._stack has been created by makeFromEncoding
  • Segment 3. when this._stack has been created by makeFromTransform

The existing code structure led me to believe that Segment 1 and Segment 2 are not disjoint and both can be executed for a single input spec.

To maintain a flat structure, initially this._stack with impute:true is handled, then all the this._stack created by makeFromEncoding(including the ones operated on in the previous step), then finally the ones created by makeFromTransform which is disjoint to Segment 1 and 2.

facetby is undefined for any this._stack created by makeFromTransform . So those cases skip the if(facetby) { block.

src/transform.ts Outdated
* Output field names. This can be either a string or an array of strings with
* two elements denoting the name for the fields for stack start and stack end
* respectively.
* If a single string(eg."val") is provided, the end field will be "val_end".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string(eg."val")

string (e.g., "val")

/**
* Faceted field.
* Faceted field. Used in makeFromEncoding.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
// TODO Better Name for this function
function isAsValidArray(as: string[] | string): as is string[] {
return isArray(as) && as.every(s => isString(s)) && as.length ===2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you really wanna keep this method, check if array.length > 1. (The 3rd, 4th, ... elements can be ignored.)

src/transform.ts Outdated
/**
* Field that determines the order of leaves in the stacked charts.
*/
sort?: SortField ;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SortField;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually shouldn't it be SortField[] like window?

public static makeFromTransform(parent: DataFlowNode, stackTransform: StackTransform) {

const {stack, groupby, as, offset='zero'} = stackTransform;
const sort = stackTransform.sort || {'field': stack, 'order': 'ascending'};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why this default is needed?

@kanitw
Copy link
Member

kanitw commented May 20, 2018

Note that Github collapses a few of my comments again ><" -- make sure to look at them :)

@kanitw
Copy link
Member

kanitw commented May 21, 2018

  1. Let's add @hide to the description of StackTransform to hide it from the schema for now (and re-compile schema) so we can merge this after you address the unaddressed comments.
  2. For examples you wanna add, please add _future suffix to the example files so they can be included even though stack transform wouldn't be in the schema yet.


}

function isAsValidArray(as: string[] | string): as is string[] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isValidAsArray

groupby: stackby,
key: dimensionField,
method: 'value',
value: 0
});
}
if(facetby) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, you should look at your overall PR diff and think why you do certain things.
Here you add if (facetby) to wrap the main code for the transform assemble logic (as it creates the stack transform). Having a case that we can skip the main logic is quite questionable.

From your earlier rationale, you say that facetby is undefined in makeFromTransform. But why should that be undefined? (That's actually something you set yourself.)

FYI, previously facetby was required but somehow you make it optional in this PR. (Why?)
I actually mentioned earlier that the facetby should be an empty array [] in either case.
Thus, facetby shouldn't be optional and then you don't have to add the if here.

Just FYI, it's okay to forget some details I mentioned because this is quite a complicated code base. But feel free to ask again if you are unsure about something. :)

A good case to test whether this logic is now correct for facetby is to try to see if you can modify
https://github.com/vega/vega-lite/blob/master/examples/specs/normalized/trellis_stacked_bar_normalized.vl.json
such that you have the stack transform inside the inner spec.

{
  "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
  "data": {
    "url": "data/barley.json"
  },
  "facet": {
    "column": {
      "field": "year",
      "type": "ordinal"
    }
  },
  "spec": {
    "transform": [{aggregate: ...}, {stack: ...}],
    "mark": "bar",
    "encoding": {
      ...
    }
  }
}



Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, Ham. I think now I understand where I was wrong. I made facetby a required field, and refactored assemble.
I must have missed your comment about trying out stack inside a faceted spec. It's working now.
screen shot 2018-05-23 at 12 01 30 am

let normalizedAs: Array<string>;
if (isAsValidArray(as)) {
normalizedAs = as;
} else if(typeof as === 'string') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if(isString(as))

@@ -152,15 +207,14 @@ export class StackNode extends DataFlowNode {
}
return [vgField(dimensionFieldDef)];
}
return [];
return groupby || [];
Copy link
Member

@kanitw kanitw May 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with this for now.

Once you decouple Impute from stacking (#1514), we should make dimensionFieldDef no longer a part of StackComponent and just have StackNode.makeFromEncoding() create a groupby, using this method. At that point, we can basically remove facetBy and just have push new dimension in the facet to groupby instead.

(Basically, if you look at the code, the final groupby in the stack transform = groupBy that's from dimension fieldDef + facetBy)

@kanitw kanitw changed the base branch from master to feature/stack May 22, 2018 20:58
@kanitw
Copy link
Member

kanitw commented May 22, 2018

Since the code looks pretty stable at this point, I'll merge to a feature/stack branch for now. Please create another PR to merge to the feature branch.

@kanitw kanitw merged commit 797ed94 into vega:feature/stack May 22, 2018
kanitw added a commit that referenced this pull request May 24, 2018
* Add Stack Transform (#3771)

* Add Stack Transform Examples (#3798)

* Add Stacked population, Diverging Bar Chart example, Marimekko Chart of cars.json

* Add 3 Mosaic plots of cars.json

* Add compiled vega specs and svgs of the added examples

* Transpose axes of rect_mosaic_simple to maintain consistency
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants