Fix misleadingness in vignette #55

Closed
boxuancui opened this issue Feb 9, 2018 · 1 comment

boxuancui (Owner) commented Feb 9, 2018

As pointed out by @peSHIr here, the data needs to be cleaned before grouping.

The vignette aircraft examples are a bit misleading, as data needs a bit more cleanup, I think. Airbus is in the list with two different strings, McDonnell Douglas with at least three, and Canada with two. If those were first lumped together into one each, before lumping the long tail together into an "other" bin, this could make a big difference in further modeling, as Airbus would jump to largest group by far, not the third, with about half of the Airbus data being lumped into "other". #oops
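To make the ordering concern concrete, here is a minimal sketch in Python (not the package's own code) that lumps a long tail into an "other" bin with and without first merging variant spellings. The alias map, the helper names `canonical` and `lump_tail`, and the counts are all made up for illustration; they are not the vignette's data.

```python
from collections import Counter

# Hypothetical alias map: variant spellings -> one canonical name.
# These entries are illustrative assumptions, not taken from the vignette.
ALIASES = {
    "AIRBUS INDUSTRIE": "AIRBUS",
    "MCDONNELL DOUGLAS AIRCRAFT CO": "MCDONNELL DOUGLAS",
    "MCDONNELL DOUGLAS CORPORATION": "MCDONNELL DOUGLAS",
}


def canonical(name):
    """Normalize case and whitespace, then map known variants to one label."""
    key = " ".join(name.upper().split())
    return ALIASES.get(key, key)


def lump_tail(labels, keep):
    """Keep the `keep` most frequent labels and replace the rest with 'OTHER'."""
    counts = Counter(labels)
    top = {label for label, _ in counts.most_common(keep)}
    return [label if label in top else "OTHER" for label in labels]


# Made-up counts, chosen only to show the effect of the ordering.
raw = (
    ["BOEING"] * 5
    + ["AIRBUS INDUSTRIE"] * 3
    + ["AIRBUS"] * 3
    + ["EMBRAER"] * 4
    + ["CESSNA"] * 1
)

# Lumping first: both Airbus spellings fall into the tail.
naive = Counter(lump_tail(raw, keep=2))

# Cleaning first: the merged Airbus group becomes the largest category.
cleaned = Counter(lump_tail([canonical(n) for n in raw], keep=2))

print(naive)
print(cleaned)
```

With these toy counts, lumping first leaves no Airbus category at all, while cleaning first makes Airbus the largest group, which is the effect described in the quoted comment.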

boxuancui added the type: bug label Feb 9, 2018
boxuancui added this to the 0.6.0 milestone Feb 9, 2018
boxuancui self-assigned this Feb 9, 2018

peSHIr commented Feb 9, 2018

Never thought at the time to add this as an issue on GitHub myself. Thanks for taking the time to find me on here and link to me. Kudos.
