initial commit of dotplot #1294

Closed: kshedden wants to merge 19 commits into statsmodels:master from kshedden:dotplot

Contributor

@kshedden kshedden commented Jan 6, 2014

(This is a repeat of a PR I made a few hours ago, moving to a new feature branch per Josef's request)

This is a basic dotplot function that I think might be a useful addition to statsmodels/graphics.

The term "dotplot" has been used in various ways. These are dotplots in the style of Bill Cleveland's book, similar to what is implemented in this R package:

http://stat.ethz.ch/R-manual/R-patched/library/graphics/html/dotchart.html

Additional examples can be found in the appendix of this manuscript:

http://polisci.msu.edu/jacoby/research/dotplots/tpm/Jacoby,%20Dotplots,%20TPM%20Draft.pdf

The test_dotplot file (in statsmodels/graphics/tests) shows how it works with simulated data. These are the plots generated by the test file:

http://dept.stat.lsa.umich.edu/~kshedden/test_dotplot.pdf

Some examples with real data are here:

http://dept.stat.lsa.umich.edu/~kshedden/dotplot_gdp_example.pdf
http://dept.stat.lsa.umich.edu/~kshedden/dotplot_gas_prices.pdf

Scripts:

http://dept.stat.lsa.umich.edu/~kshedden/dotplot_gdp_example.py
http://dept.stat.lsa.umich.edu/~kshedden/dotplot_gas_prices.py

Comments and suggestions are welcome.

@@ -0,0 +1,329 @@
import numpy as np
import matplotlib.transforms as transforms
Member

this import should be inside the function
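
A minimal sketch of what moving the import inside the function can look like. How the PR actually uses `matplotlib.transforms` is not shown in this thread, so the helper name `offset_data_transform` and its arguments are assumptions made for illustration only.

```python
def offset_data_transform(ax, dx_points, dy_points=0.0):
    """Return ax.transData shifted by a fixed offset in points (hypothetical helper)."""
    # Deferred import: matplotlib is loaded only when this function is called,
    # so importing the enclosing module does not require matplotlib.
    import matplotlib.transforms as transforms

    # There are 72 points per inch; dpi_scale_trans maps inches to display units.
    offset = transforms.ScaledTranslation(
        dx_points / 72.0, dy_points / 72.0, ax.figure.dpi_scale_trans
    )
    return ax.transData + offset
```

With this layout, a module that defines plotting helpers can still be imported on a machine without matplotlib; the ImportError only surfaces when a plot is actually requested.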

@josef-pkt josef-pkt mentioned this pull request Jan 6, 2014
@kshedden
Contributor Author

kshedden commented Jan 8, 2014

This (pushed just now) is a major rewrite of the dotplot function. I think all the issues raised about the initial PR have now been addressed, or are no longer relevant. Thank you for the detailed review.

I redesigned the interface to make it more natural to use (no need to construct a complicated special data structure just to get the plot).

These "dotplots" are also known as "forest plots". Perhaps that name is better known? I'm open to changing the name.

The test_dotplot.py script isn't really a test script in the usual sense. I don't really know how to write nose-style tests for graphics. For myself, I need it to output a pdf because I work remotely over a terminal.

Other relevant files:

www.stat.lsa.umich.edu/~kshedden/test_dotplot.pdf
www.stat.lsa.umich.edu/~kshedden/dotplot_gdp_example.pdf
www.stat.lsa.umich.edu/~kshedden/dotplot_gas_prices.pdf

www.stat.lsa.umich.edu/~kshedden/dotplot_gdp_example.py
www.stat.lsa.umich.edu/~kshedden/dotplot_gas_prices.py

www.stat.lsa.umich.edu/~kshedden/wbdata.csv
www.stat.lsa.umich.edu/~kshedden/gas.csv

@coveralls

Coverage Status

Coverage remained the same when pulling 311844e on kshedden:dotplot into f46421a on statsmodels:master.

@josef-pkt
Member

In a google search "forest plot" looks more informative than "dot plot", and closer to the pictures that it produces.
I don't know either name.

@josef-pkt
Member

Given the google results, this looks like it could be useful for plotting multiple comparisons.
We have one plot method in statsmodels for that, which might be similar (but much less general).

@josef-pkt josef-pkt added the PR label Feb 19, 2014
@kshedden kshedden closed this Mar 26, 2014
@kshedden kshedden reopened this Mar 26, 2014
@jseabold
Member

jseabold commented Apr 3, 2014

Can you add a quick example to the release notes with a plot? Would you also consider adding an example to the docstring? See the example here [1] and the docs code here [2] for an example.

For the actual example, if you don't have a ready use case with real data, maybe something in here [3] will give you an idea. I think the statecrime dataset might be suited to this.

Might you also consider making an IPython notebook example [4], using either the statecrime data or your test dataset cases? This is less important for this PR, maybe a future TODO. It would be nice to have a notebook that shows off all that you can do with this though.

[1] https://github.com/statsmodels/statsmodels/blob/master/statsmodels/tsa/filters/bk_filter.py#L49
[2] https://github.com/statsmodels/statsmodels/blob/master/docs/source/plots/bkf_plot.py
[3] http://statsmodels.sourceforge.net/devel/examples/notebooks/generated/regression_plots.html
[4] http://statsmodels.sourceforge.net/devel/dev/examples.html#file-format

@kshedden
Contributor Author

kshedden commented Apr 5, 2014

I added two simple examples to the dotplot doc string.

Two notebooks are here:

http://nbviewer.ipython.org/urls/umich.box.com/shared/static/oxsz9tlg19clhzi422i4.ipynb

http://nbviewer.ipython.org/urls/umich.box.com/shared/static/oh717lkxczhseep71lao.ipynb

The notebooks are also on the wiki examples page.

@jseabold
Member

jseabold commented Apr 5, 2014

Thanks. How persistent should (can) we expect these data URLs to be for the examples?

Would it be worth including these datasets in the package?

@kshedden
Contributor Author

kshedden commented Apr 6, 2014

I don't plan to remove these files, but I do think that it would be better to host them somewhere other than my personal Box account.

try:
    import matplotlib.transforms as transforms
    if matplotlib.__version__ < '1':
        raise
Member

I don't think this is valid Python anymore with no exception. We (optionally) require matplotlib >= 1.2.1 now, so you don't need to check for less than v1.x.x.
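
For context, a bare `raise` with no active exception fails in Python 3 with a RuntimeError instead of signalling a meaningful error, and comparing version strings directly is fragile in general (as strings, '1.10.0' sorts before '1.2.1'). If a version check were still wanted, it could be sketched roughly as follows. The helper names `version_tuple` and `check_min_version` are hypothetical and are not part of this PR or of statsmodels; the default minimum is taken from the comment above.

```python
import re


def version_tuple(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev'
        parts.append(int(match.group()))
    return tuple(parts)


def check_min_version(found, minimum="1.2.1"):
    """Raise ImportError if `found` is older than `minimum` (hypothetical helper)."""
    if version_tuple(found) < version_tuple(minimum):
        # Raise an explicit exception rather than a bare `raise`.
        raise ImportError(
            "matplotlib >= %s is required, found %s" % (minimum, found)
        )
```

A caller would pass `matplotlib.__version__` as `found`. Comparing integer tuples avoids the lexicographic pitfall of comparing the raw strings.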

@josef-pkt
Member

I don't see anything in a quick browse. Looks good to me.

aside: I just opened a bug issue for tukeyhsd plot, it might be possible to use dot_plot for creating the plot.

@josef-pkt
Member

two items:

I had to look for ax in the long list of arguments. Would it be better last?
I didn't see any real break in the type of arguments.

dotplot or dot_plot? do we have a convention?
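
The `ax` question above relates to a common matplotlib convention: accept an optional `ax` keyword as the last argument, defaulting to `None`, and create a new figure only when the caller did not supply one. The sketch below illustrates that convention with a deliberately tiny Cleveland-style dot plot. The function `simple_dot_plot` and its signature are assumptions for illustration and are not the function proposed in this PR.

```python
def simple_dot_plot(points, labels, ax=None):
    """Hypothetical minimal Cleveland-style dot plot; not the function in this PR."""
    # Deferred import so that importing this module does not require matplotlib.
    import matplotlib.pyplot as plt

    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")

    # Create a new figure only when the caller did not supply an axes.
    if ax is None:
        fig, ax = plt.subplots()
    else:
        fig = ax.figure

    positions = list(range(len(points)))
    # Faint horizontal guide lines, one per labelled row.
    for y in positions:
        ax.axhline(y, color="lightgrey", linewidth=0.8, zorder=0)
    # One dot per row, placed at its value on the x axis.
    ax.plot(points, positions, "o", color="black")
    ax.set_yticks(positions)
    ax.set_yticklabels(labels)
    ax.set_ylim(-0.5, len(points) - 0.5)
    return fig
```

Calling `simple_dot_plot([1.0, 2.5, 0.7], ["a", "b", "c"])` creates its own figure, while passing `ax=some_axes` draws into an existing one, which is what lets a plot be embedded in a larger grid of subplots.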

@jseabold
Member

jseabold commented Apr 6, 2014

We don't really have a convention yet. Sometimes plot... sometimes plot_... sometimes ..._plot. I have to hunt sometimes. This needs some consistency at some point. I'm leaning towards ..._plot, though I don't think we need to enforce the underscore unless it's really tough to read. The functions are in the graphics namespace, so having plot first is uninformative, but it's still good to distinguish the plotting functions from other imports when we haven't properly defined __all__.

@josef-pkt
Member

In methods, plot should be a prefix (plot_xxx); for standalone functions in graphics, plot works better as a postfix.
I like consistency with underscores, because then I don't have to guess (as with some numpy functions where I need to type 3 versions to hit the right spelling)

@jseabold
Member

jseabold commented Apr 6, 2014

Sounds reasonable to me.

@josef-pkt
Member

I thought this was already merged.

Is there anything left to do?

I'd like to merge this, without reviewing all the matplotlib and plot option details. plots look good.

@kshedden
Contributor Author

Yes, it's ready to merge. I've been using it and it seems stable. I just made some minor doc and comment edits.

@josef-pkt
Member

Ok, I will rebase this to get it ready for merging.

@josef-pkt
Member

merged rebased version in PR #1681
Thanks Kerby

@josef-pkt josef-pkt closed this May 21, 2014
@josef-pkt josef-pkt added this to the 0.6 milestone May 22, 2014
@kshedden kshedden deleted the dotplot branch June 9, 2014 02:02