Letter Value Plots #661
Added letter value plot class and plotting function. In addition, added documentation and examples.
If you want to work out the pep8 errors locally you can install |
I think the function could be called |
    orient, color, palette, saturation,
    width, k_depth, linewidth, box_widths):

    if width is None:
These default values should be specified in the function signature for `lvplot`.
This looks good on a very brief first pass, but it will be very important to get tests on the numerical internals.
    ax.plot([y, y], [x - w / 2, x + w / 2], c='k', alpha=.45, **kws)

    ax.scatter(outliers, np.repeat(x, len(outliers)),
               marker=r"$\ast$", c=color)
Why not just a normal "o"?
I didn't really have a preference as far as marker for outliers. Matplotlib uses '+' markers, so I could try to be consistent with Matplotlib or go with 'o'. Thoughts?
I liked the circles you had in the example plots; I just think it will be faster to draw those with "o" than with a custom LaTeX marker, right?
Or you could just use "d" to be consistent with seaborn boxplots, I guess...
Changed it to 'd' for consistency.
Also before I forget this should get added as a plot kind in |
I find this kind of weird:

    df = pd.DataFrame({"normal": np.random.normal(3, 1, 10000),
                       "gamma": np.random.gamma(3, .5, 10000)})
    sns.lettervalueplot(data=df)
    sns.violinplot(data=df, scale="width")

I think I still don't quite understand how these things get drawn, but I would have expected the letter-value plots to look more like a discretized violinplot. Is that wrong?
Your intuition wouldn't be off in this case. The widths of the boxes are determined by which width function is chosen in
|
Ah, OK. I had played around with that keyword argument, but couldn't seem to get the effect I was looking for. I guess that explains why :)
Speaking of, I think the |
Any update on tests?
I've added some basic tests so far and plan on adding tests for the numerical internals, e.g. testing the letter value calculations. I'll push the latest edits I have today.
Cool. By the way, I would generally advise writing the tests in the other direction. I've found that it's best to start by directly testing the core numeric parts, then to pull back and test some of the plotting methods, and finally to pull back further and add some high-level smoke tests for the function itself (what I assume you mean by "basic tests"). The reason is that this way lets the code coverage report be a useful guide to how complete the tests are. If you start with tests that run the function and just ask "are there some plots on the axes", it will look like the internals are "covered", but the plots may very well be wrong. It might be helpful to rename the basic tests so nose doesn't run them as you develop, and to use the coverage report to make sure you're really finding all the corners in the numeric code. Anyway, just some thoughts.
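As a rough illustration of testing the numeric core first, here is a minimal sketch. It assumes a hypothetical `letter_values` helper built on `np.percentile`; that helper and its signature are made up for this example and are not the code in this PR. The tests check the numbers against data where the answer is known by hand, before any plotting is involved.

```python
import numpy as np


def letter_values(x, k):
    """Return (lower, upper) letter values for the first k levels.

    Level 1 is the pair of quartiles (fourths), level 2 the eighths,
    and so on. Hypothetical helper for illustration only; it is not
    the function added in this pull request.
    """
    x = np.asarray(x, dtype=float)
    # Tail areas 1/4, 1/8, 1/16, ... expressed as percentiles.
    tails = [100 * 0.5 ** (i + 1) for i in range(1, k + 1)]
    lower = np.percentile(x, tails)
    upper = np.percentile(x, [100 - t for t in tails])
    return lower, upper


def test_letter_values_on_known_data():
    # For 0..100 inclusive, the q-th percentile is exactly q
    # under NumPy's default linear interpolation.
    x = np.arange(101)
    lower, upper = letter_values(x, k=3)
    np.testing.assert_allclose(lower, [25, 12.5, 6.25])
    np.testing.assert_allclose(upper, [75, 87.5, 93.75])


def test_letter_values_are_nested():
    rng = np.random.RandomState(0)
    lower, upper = letter_values(rng.normal(size=1000), k=5)
    # Each deeper level must enclose the previous one.
    assert np.all(np.diff(lower) <= 0)
    assert np.all(np.diff(upper) >= 0)
```

Because the functions are named `test_*`, a runner such as nose or pytest would collect them, and the coverage report for the numeric code then reflects checks on actual values rather than "something was drawn".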
Also, can you add a TODO list in the original issue just so we can track the items that have come up in the comments?
…hes, not artists.
I've completed writing the tests (making them Python 3 forward compatible) and the other tasks on the checklist. I've checked the tests (locally with nosetests) and the only failures I can see are the ones connected with other modules, not |
The Python 3 errors are in |
Yeah I checked Travis. I didn't realize |
Unfortunately I can't duplicate the PEP8 error Travis is getting (and I made sure to install |
    def test_draw_missing_boxes(self):

    ax = cat.lvplot("g", "y", data=self.df,
    order=["a", "b", "c", "d"])
This line is over-indented
Build is passing on all versions, PEP8 is all done.
Any more comments, concerns, suggestions?
Not trying to be bothersome, but I wanted to check in and see if you had any further comments or suggestions for the PR. I completely understand if you're swamped with other stuff.
Yeah, sorry about letting this drop. This is a substantial enough addition that I've really wanted to make time to go through it in detail and make sure I understand how everything works. Unfortunately, I've been very busy the past few months (and will be for the foreseeable future) and just have not been able to devote a large enough block of time to get my head around everything here. But since you put so much effort into this, I really do feel bad about letting this hang in limbo. I'll merge it now so that it can start to get some real use; hopefully any residual issues will get identified and ironed out before the next release. If you could try to stay on top of anything that pops up, that would be very helpful.
Hi, could the documentation take Matplotlib 1.5 into account now?
|
@mwaskom Sure, I can stay on top of any issues that pop up. Thanks!
This is a change in pandas, not matplotlib, but yes, that example should be updated.
This pull request adds the `_LVPlotter` class, `lettervalueplot` function, documentation, and usage examples.

TODO
- `lvplot`
- `box_widths` to `scale`
- `p` to `outlier_prop`
- `outlier_prop` to take float, [0, 1].
- `d` for internal consistency