Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Axis autotype update #3070

Merged
merged 2 commits into from Oct 3, 2018

Conversation

Projects
None yet
2 participants
@alexcjohnson
Copy link
Contributor

alexcjohnson commented Oct 3, 2018

Fixes #3039 - treat boolean as category data in autotype

Fixes #2473, fixes #1413, by counting only distinct values while determining date and category autotype, as discussed in #2473 (comment). This way we don't need to include any explicit "missing" values; data with up to 2 non-numeric values for every numeric value will still be interpreted as numbers, but even a single non-numeric string (including the previous special values 'None' and '') will be interpreted as a valid category if there are no numbers present.

Note that as part of ^^ I also converted the date determination to counting distinct values. Nobody had complained about this part, but if I didn't do that there could be some strange cases where you have only date strings and numbers but our result is 'category' 馃

There's probably a way to improve performance (and make it clearer that we can't generate such strange results) by combining moreDates and category into one loop... but I don't think that's a major cost anymore at least since we switched to testing max 1000 points.

cc @etpinard @antoinerg @archmoj

y: [0, 'None', true, 'None', 'None', 'None', 'None', 'None']
});
checkTypes('linear', 'linear');
});

This comment has been minimized.

Copy link
@etpinard

etpinard Oct 3, 2018

Member

Beautiful! This test case here is enough to convince me this PR is non-breaking.

@etpinard

This comment has been minimized.

Copy link
Member

etpinard commented Oct 3, 2018

There's probably a way to improve performance (and make it clearer that we can't generate such strange results) by combining moreDates and category into one loop... but I don't think that's a major cost anymore at least since we switched to testing max 1000 points.

Hmm. I would be nice to check with a large splom (e.g. https://codepen.io/etpinard/pen/wjmqmO). You're right, looping over <1000 points shouldn't be too slow, but looping over <1000 pts once per axis ~100 times could have an impact. Related: #2549

Note: I'll check this myself off my latest splom commits.

@etpinard

This comment has been minimized.

Copy link
Member

etpinard commented Oct 3, 2018

Note: I'll check this myself off my latest splom commits.

The results are in: these commits here don't make much of a difference in axis-autotype perf.

But I never realized how much time large splom traces spend in axis_autotype. In https://codepen.io/etpinard/pen/wjmqmO, we spend roughly 40ms in axis_autotype ! I'll add a comment about this in #2549. No need to handle this now of course.

馃拑

@alexcjohnson alexcjohnson merged commit 48209a0 into master Oct 3, 2018

6 checks passed

ci/circleci: build Your tests passed on CircleCI!
Details
ci/circleci: test-image Your tests passed on CircleCI!
Details
ci/circleci: test-image2 Your tests passed on CircleCI!
Details
ci/circleci: test-jasmine Your tests passed on CircleCI!
Details
ci/circleci: test-jasmine2 Your tests passed on CircleCI!
Details
ci/circleci: test-syntax Your tests passed on CircleCI!
Details

@alexcjohnson alexcjohnson deleted the bool-cats branch Oct 3, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can鈥檛 perform that action at this time.