Doc cleanup #401

Merged
merged 11 commits into from
Jan 16, 2016

jbednar
Member

@jbednar jbednar commented Jan 8, 2016

Partial cleanup, ready for comments (and merging if appropriate) but not yet including edits to the main tutorials. It would be very good if Philipp or Jean-Luc could inspect my changes carefully, looking especially at:

  1. I changed the tagline in the main site to match the one on Github.
  2. Is the tone and level of detail right on Homepage.ipynb? I tried to make it more explicit about the difference in philosophy between HoloViews and other libraries, but it's always hard to explain that.
  3. I added links to the new Bokeh tutorials from the Tutorials index, and references to Bokeh throughout, but there are probably other places to mention it and guide people towards which backend is appropriate for their purposes.
  4. Is the Pandas tutorial out of date? It talks about converting Pandas dataframes, but I thought we were now supporting using them as-is, not copying or converting. Does this need to be clarified somewhere?
  5. Can we simplify the GitHub README.rst? Right now it has a redundant copy of features.rst, which I vote should just be replaced with a link, but in any case at least needs to be updated from features.rst.
  6. There were a few locations, e.g. in the FAQ, where I wanted to document something (e.g. the aliases system), but the only documentation I could find was in a pull request, so I put in a link to the pull request. We can either leave in such links as a pragmatic measure, or pull the information out of the pull request into something we can maintain over time.
  7. I added "bokeh" to the conda install command, so that the result would be more like what happens for matplotlib (things just work). But the right approach is presumably to make the conda package be completely independent of matplotlib (with the default backend falling back to bokeh when mpl is not installed), and then tell people to choose whichever command they wish:
    • conda install holoviews # no plotting backend
    • conda install holoviews matplotlib # default backend is mpl
    • conda install holoviews bokeh # default backend is bokeh
    • conda install holoviews matplotlib bokeh # default backend is mpl
  8. I've run the tutorials, but don't know how to build the web site locally (which is a big pain and really discourages updating the web site!), so I can't tell if everything's still formatted properly.
  9. It needs links to the SciPy paper in appropriate locations, because that has more philosophical discussion that makes good background reading.
  10. I still need to go through the other tutorials, but won't be able to do that today or tomorrow.
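For item 7, the fallback order implied by the four install commands could be sketched roughly as follows. This is a minimal illustration, not HoloViews code: `PREFERRED_BACKENDS`, `is_installed`, and `choose_default_backend` are hypothetical names, and the sketch assumes that a backend counts as available whenever its top-level module can be found.

```python
import importlib.util

# Preference order taken from the install commands in item 7:
# matplotlib wins when both are present, bokeh is the fallback.
# (Hypothetical constant, not part of HoloViews.)
PREFERRED_BACKENDS = ("matplotlib", "bokeh")


def is_installed(module_name):
    """Return True if the top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def choose_default_backend(available=is_installed):
    """Return the first available backend name, or None if neither is installed.

    `available` is injectable so the selection logic can be exercised
    without changing the real environment.
    """
    for name in PREFERRED_BACKENDS:
        if available(name):
            return name
    return None
```

With neither package installed the function returns `None`, which corresponds to the "no plotting backend" case in the first command.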

@philippjfr
Member

Thanks for going through it, looks good overall.

(2) Is the tone and level of detail right on Homepage.ipynb? I tried to make it more explicit about the difference in philosophy between HoloViews and other libraries, but it's always hard to explain that.

Looks fine to me.

(4) Is the Pandas tutorial out of date? It talks about converting Pandas dataframes, but I thought we were now supporting using them as-is, not copying or converting. Does this need to be clarified somewhere?

It is, yes. Large parts of it can be reused as material for either a Transforming Data or a Columns tutorial.

(5) Can we simplify the GitHub README.rst? Right now it has a redundant copy of features.rst, which I vote should just be replaced with a link, but in any case at least needs to be updated from features.rst.

Not sure; it's nice to have a concrete list of features on the GitHub landing page, especially since it actually receives more unique visitors than the website.

(6) There were a few locations, e.g. in the FAQ, where I wanted to document something (e.g. the aliases system), but the only documentation I could find was in a pull request, so I put in a link to the pull request. We can either leave in such links as a pragmatic measure, or pull the information out of the pull request into something we can maintain over time.

We should open an issue to collect items that should get some documentation so we can prioritize.

(7) I added "bokeh" to the conda install command, so that the result would be more like what happens for matplotlib (things just work). But the right approach is presumably to make the conda package be completely independent of matplotlib

It shouldn't be too hard to drop the matplotlib dependency for the bokeh backend and fallbacks shouldn't be too hard either. Of course any composite object will continue to simply warn about any unsupported Element types.

(8) I've run the tutorials, but don't know how to build the web site locally (which is a big pain and really discourages updating the web site!), so I can't tell if everything's still formatted properly.

This should do it:

cd doc
conda install sphinx runipy
make ipynb-doc

The problem is that the tests require very specific versions of freetype and matplotlib and will only work on Linux, so it'll just abort. We should add an option to build the docs without tests.

@jbednar
Member Author

jbednar commented Jan 8, 2016

Travis says https://travis-ci.org/ioam/holoviews/jobs/101181829 failed with display output mismatch, but I can't see what the output difference is...

@jbednar
Member Author

jbednar commented Jan 8, 2016

BTW, the Introduction tutorial shows that a Numpy array will be preserved inside a HoloViews object; can the same be shown for a Pandas dataframe or other input types? If so we should say that in that location, and show it somewhere (there if appropriate, or elsewhere).

@philippjfr
Member

BTW, the Introduction tutorial shows that a Numpy array will be preserved inside a HoloViews object; can the same be shown for a Pandas dataframe or other input types? If so we should say that in that location, and show it somewhere (there if appropriate, or elsewhere).

Should presumably go in the notebook about the Columns interface, which I'll adapt from the existing Pandas Conversion notebook.

@philippjfr
Member

Travis says https://travis-ci.org/ioam/holoviews/jobs/101181829 failed with display output mismatch, but I can't see what the output difference is...

The display tests for two cells in the Elements tutorial are still flaky; if you see a failure in cell 43 or 46, just restart the build.

@jbednar
Member Author

jbednar commented Jan 8, 2016

Ok, I managed to cover two more of the tutorials; still lots to go, but the intro ones are the most important and have the most changes...

@jbednar
Member Author

jbednar commented Jan 8, 2016

Oh, and I had some changes to About (and maybe others?) stashed away; any idea how to inspect the stash manually to cut and paste those out of there? If not I'll have to recreate them...

@philippjfr
Member

You can inspect the last stash with:

git stash show -p stash@{0}

I would seriously recommend getting magit for Emacs, though; it makes working with git so much easier.

@jlstevens
Contributor

Yes! I highly recommend magit!

Anyway, thank you for doing this - I'll catch up and make some comments.

@jlstevens
Contributor

It is really great to see a thorough update of the website!

Here are my replies to your queries:

(1) I changed the tagline in the main site to match the one on Github.

Great!

(2) Is the tone and level of detail right on Homepage.ipynb? I tried to make it more explicit about the difference in philosophy between HoloViews and other libraries, but it's always hard to explain that.

Yes, the tone is right and I like the updated text. It is a bit wordy, but I think that is unavoidable, as it is quite hard to convey what HoloViews offers with text only.

(5) Can we simplify the GitHub README.rst? Right now it has a redundant copy of features.rst, which I vote should just be replaced with a link, but in any case at least needs to be updated from features.rst.

I agree with Philipp. I use the README as the basis of the PyPI page (with minor tweaks) and the Anaconda Page. I also think it is important to have a nice list of features on the GitHub landing page.

(6) There were a few locations, e.g. in the FAQ, where I wanted to document something (e.g. the aliases system), but the only documentation I could find was in a pull request, so I put in a link to the pull request. We can either leave in such links as a pragmatic measure, or pull the information out of the pull request into something we can maintain over time.

Linking to PRs is fine, and it is tricky to find a place for the miscellaneous odds and ends. The wiki is still a good place to record various tips and tricks as we encounter them, but I do think we want a more visible place for newcomers to find them. I can imagine something in the style of the FAQ (which should be for higher-level questions) that is broken down into sections (rendering, styling, exporting, etc.). Whatever we do, it should be easily searchable.

(7) I added "bokeh" to the conda install command, so that the result would be more like what happens for matplotlib (things just work). But the right approach is presumably to make the conda package be completely independent of matplotlib.

Are you suggesting the conda package include bokeh as a dependency (already the case, I believe) but not matplotlib? I do agree the bokeh backend would ideally not require matplotlib, but I have not found a good way to do the equivalent of extras_require with conda. That is why the conda package includes everything (and conda can install more dependencies a lot quicker than pip anyway).

Anyway, I hope we will be able to shift our attention to improving the website building process now, as well as authoring new tutorials and improving the documentation in general. For the time being, I'll have a closer look at the rest of the updated text and make inline comments if I have anything to suggest...


(hv.Image(np.random.rand(10,10), group=al.Spectrum, label=al.Glucose) +
hv.Image(np.random.rand(10,10), group=al.Spectrum, label=al.Water))
```
Contributor

I think documenting this in the FAQ isn't too bad in the end. It is probably things like a collection of final_hooks examples that are hard to find a place for...

@jbednar
Member Author

jbednar commented Jan 9, 2016

(7) I added "bokeh" to the conda install command, so that the result would be more like what happens for matplotlib (things just work). But the right approach is presumably to make the conda package be completely independent of matplotlib.

Are you suggesting the conda package include bokeh as a dependency (already the case, I believe) but not matplotlib? I do agree the bokeh backend would ideally not require matplotlib, but I have not found a good way to do the equivalent of extras_require with conda. That is why the conda package includes everything (and conda can install more dependencies a lot quicker than pip anyway).

No, I'm suggesting that neither bokeh nor matplotlib should be dependencies of the HoloViews conda package, and that we ask users to specify whichever one or the other they wish to use, recommending that they install both but allowing them to install neither. conda install holoviews matplotlib is not unwieldy, and is explicitly about something the user cares about. (Whereas asking them to do conda install holoviews param would not make sense, as using Param is our own choice, not theirs.)

@jlstevens
Contributor

... that we ask users to specify whichever one or the other they wish to use, recommending that they install both but allowing them to install neither.

I am OK with this and it makes sense to me. It does mean the user needs to install the plotting backend themselves, but that makes sense if we are going to support more options (e.g. plotly) in future.

I can make this change to the conda package for the next minor release, i.e. 1.4.2 (which will probably happen in a week or two).

@jbednar
Member Author

jbednar commented Jan 9, 2016

Changing the conda package spec should be straightforward, and I don't think the change should affect any existing HoloViews users, because they will have already installed matplotlib previously. But of course we do need to make sure that the codebase can run in some meaningful way without either matplotlib or bokeh (presumably a test of just creating, saving, and restoring each Element and Container type, without plotting them?). And we also need to make the Bokeh plotting work when Matplotlib is not installed, which I believe requires making our own copies of a few Matplotlib objects that we use as defaults (e.g. colormaps).
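As a rough illustration of the "own copies of colormaps" idea, a dependency-free palette can be generated in a few lines. This is only a sketch under stated assumptions: `linear_palette` and `GRAYS` are hypothetical names rather than existing HoloViews objects, and the sketch assumes a list of hex colour strings is an acceptable colormap representation for the Bokeh side.

```python
def linear_palette(start_rgb, end_rgb, n):
    """Return n hex colours interpolated linearly from start_rgb to end_rgb.

    Colours are (r, g, b) tuples with integer components in 0..255.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    palette = []
    for i in range(n):
        t = i / (n - 1)  # position along the ramp, 0.0 to 1.0
        rgb = [round(s + (e - s) * t) for s, e in zip(start_rgb, end_rgb)]
        palette.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return palette


# A 256-step grayscale ramp that needs neither matplotlib nor bokeh.
GRAYS = linear_palette((0, 0, 0), (255, 255, 255), 256)
```

A default built this way would let the codebase import and run without either plotting library installed; perceptually tuned colormaps would still need their colour tables copied in as data.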

@jbednar
Member Author

jbednar commented Jan 9, 2016

BTW, is there already an issue to fix the table of contents from the Elements tutorials online, which are not formatting and linking correctly? They seem formatted fine in my own IPython session, but online are a mess: http://holoviews.org/dev/Tutorials/Elements

@philippjfr
Member

BTW, is there already an issue to fix the table of contents from the Elements tutorials online, which are not formatting and linking correctly? They seem formatted fine in my own IPython session, but online are a mess: http://holoviews.org/dev/Tutorials/Elements

Not sure what happened there, might be a recent version of nbconvert or Sphinx that broke it. May have to create a pure HTML index.

@jbednar
Member Author

jbednar commented Jan 11, 2016

What's the status of the Transforming Data tutorial currently linked from the sidebar? Should I just delete that? Or is there some less-than-perfect version we can slap up there? Anything is better than nothing, even if it's just some code with no text...

@philippjfr
Member

Yes, please delete it. It doesn't exist and has been subsumed by the Sampling Data and now the Columnar Data tutorial. We do still need a Tutorial about operations but that should probably be called something different.

@jlstevens
Contributor

Ok, we are agreed then! Good that we have decided on what is supposed to be happening and what exactly is a bug and not a feature. :-)

@jbednar
Member Author

jbednar commented Jan 12, 2016

Anyway, as you can probably see, I'm working on the Sampling Data tutorial, which I don't think I've ever previously edited, so it's going slowly. I think slicing is making sense mostly, barring the bug above, but I'm having trouble understanding what the different high-level categories of object really are and what is supported for each meaningful category. E.g. the Sampling Data tutorial has a section "Regularly sampled or binned data in 1D", but I can't see anything about that section that requires regular sampling. E.g. the first example in it works fine if I change the sampling to irregular:

[image: screenshot of the example with irregular sampling]

So I can't figure out what the real breakdown into important categories is -- slicing just works, and so isn't informative of category, discrete vs. continuous seems to be used inconsistently, regular sampling isn't needed for the cases it's said to be needed, etc. -- what's the real story? I'm trying to work it out...

@philippjfr
Member

So I can't figure out what the real breakdown into important categories is -- slicing just works, and so isn't informative of category, discrete vs. continuous seems to be used inconsistently, regular sampling isn't needed for the cases it's said to be needed, etc. -- what's the real story? I'm trying to work it out...

Histograms are different again since their data is defined in terms of bin boundaries. We ideally also want to switch those to be Columns types eventually. Their current indexing behavior should return the value for the current bin you're in, while slicing has to include the bin boundaries.

The only real distinction is that some types are treated as discrete samples of a continuous space and therefore snap. This is currently restricted to 1D Columns types (Curve, Scatter, ErrorBars) and Histograms. All other types are agnostic to their sampling.
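The Histogram behaviour described above (indexing returns the value of the bin you are in, slicing keeps the bin boundaries) can be sketched in plain Python. This is one possible reading, not the HoloViews implementation: `bin_value` and `slice_bins` are hypothetical helpers, bins are assumed to be half-open `[edge_i, edge_i+1)`, and slicing is assumed to keep only bins lying entirely inside the requested range.

```python
from bisect import bisect_right


def bin_value(edges, values, x):
    """Return the value of the bin containing x.

    `edges` is a sorted list of n + 1 bin boundaries and `values` holds
    the n bin values. Bins are treated as half-open [lower, upper).
    """
    if len(edges) != len(values) + 1:
        raise ValueError("need exactly one more edge than values")
    if not edges[0] <= x < edges[-1]:
        raise KeyError(x)
    # bisect_right counts the edges <= x, so subtracting 1 gives the bin index.
    return values[bisect_right(edges, x) - 1]


def slice_bins(edges, values, lo, hi):
    """Keep the bins lying entirely within [lo, hi], boundaries included.

    Returns (edges, values) so the result is again a valid histogram.
    """
    kept = [i for i in range(len(values))
            if edges[i] >= lo and edges[i + 1] <= hi]
    if not kept:
        return [], []
    first, last = kept[0], kept[-1]
    return edges[first:last + 2], values[first:last + 1]
```

For edges `[0, 1, 2, 3, 4]` and values `[10, 20, 30, 40]`, `bin_value(edges, values, 1.5)` returns `20`, and `slice_bins(edges, values, 1, 3)` returns `([1, 2, 3], [20, 30])`.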

@jbednar
Member Author

jbednar commented Jan 12, 2016

You're also right that it doesn't really have anything to do with regular sampling, nothing about Columns, Histogram and Path types assumes or enforces anything about the data being regularly sampled. So I guess you should remove references to that.

Ok, will do.

The only real distinction is that some types are treated as discrete samples of a continuous space and therefore snap. This is currently restricted to 1D Columns types (Curve, Scatter, ErrorBars) and Histograms. All other types are agnostic to their sampling.

But don't Image and other Raster types also support snapping? In that case the regular grid is important, because it allows you to compute the nearest neighbor in 2D with no search and a well-bounded uncertainty on the point position.
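The "no search" point can be made concrete with a small sketch: on a regular grid the cell index comes from arithmetic alone, and the snapped position is never more than half a cell away in each dimension. The helper names and the bounds layout `(x0, y0, x1, y1)` are assumptions made for illustration, and rows are counted upward from `y0`, which may not match how any particular Raster type orders its rows.

```python
def regular_index(x, lower, upper, n):
    """Index of the cell containing x on a grid of n equal cells over [lower, upper)."""
    if not lower <= x < upper:
        raise IndexError(x)
    step = (upper - lower) / n
    # The min() guards against floating-point rounding at the upper edge.
    return min(int((x - lower) / step), n - 1)


def snap_2d(x, y, bounds, shape):
    """Snap (x, y) to a cell of a regular 2D grid.

    Returns ((row, col), (centre_x, centre_y)). No search is involved, and
    the centre is at most half a cell from the query in each dimension.
    """
    x0, y0, x1, y1 = bounds
    rows, cols = shape
    col = regular_index(x, x0, x1, cols)
    row = regular_index(y, y0, y1, rows)
    centre_x = x0 + (col + 0.5) * (x1 - x0) / cols
    centre_y = y0 + (row + 0.5) * (y1 - y0) / rows
    return (row, col), (centre_x, centre_y)
```

For a 10x10 grid over `(0.0, 0.0, 10.0, 10.0)`, `snap_2d(3.7, 8.2, ...)` gives cell `(8, 3)` with centre `(3.5, 8.5)`. An irregular mesh would instead need a search along each axis, or a full nearest-neighbour search for unstructured points.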

@philippjfr
Member

But don't Image and other Raster types also support snapping? In that case the regular grid is important, because it allows you to compute the nearest neighbor in 2D with no search and a well-bounded uncertainty on the point position.

True, I was only referring to Columns types and the other types that could eventually become Column-like. Raster types are another example of Elements that represent discrete samples of a continuous space and therefore snap (this time in 2D). QuadMesh is again discrete but not necessarily regular.

Here's a breakdown of the way I see the different Elements:

0D Continuous: Distribution, Spikes*, BoxWhisker*
1D discrete with snapping: Scatter, Curve, Histogram, ErrorBars
1D general: Bars*
2D discrete with snapping: Raster, Image, Surface, QuadMesh
2D general: HeatMap
2D general: Points, Paths, Polygons
3D general: Scatter3D, Trisurface

Note that the dimensionality here refers to the dimensionality of the key dimensions; Columns types allow any number of value dimensions to be defined.

* - can also be multi-dimensional

@jbednar
Member Author

jbednar commented Jan 12, 2016

Thanks. Is it safe to say this?

All Element types support slicing using a syntax like `e[a:b]`, which will return another Element of the same type, with the data from the specified range [*a*,*b*).
Some Elements also support indexing, using a syntax like `e[a]`, which will return scalar values representing the `vdims` (`value_dimensions`) of the nearest data point. Specifically, the `Scatter`, `Curve`, `Histogram`, and `ErrorBars` 1D Element types, and all of the 2D `Raster`-based types (`Raster`, `Image`, `RGB`, `HSV`, `QuadMesh`) support indexing, because in each case finding the nearest data point to the requested coordinate is straightforward. Other types do *not* support indexing, because doing so would represent a poorly defined search in multidimensional space, rather than just computing bin location in 2D or nearest neighbors in 1D.
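The two behaviours in the proposed text can be illustrated with a small stand-in class. `Samples1D` is hypothetical and only mimics the semantics described (half-open slicing, nearest-sample indexing); it is not a HoloViews Element, and its tie-breaking rule (prefer the lower key) is an arbitrary choice made for the sketch.

```python
from bisect import bisect_left


class Samples1D:
    """Hypothetical stand-in for a 1D element: sorted keys, one value per key."""

    def __init__(self, keys, values):
        pairs = sorted(zip(keys, values))
        self.keys = [k for k, _ in pairs]
        self.values = [v for _, v in pairs]

    def __getitem__(self, key):
        if isinstance(key, slice):
            # e[a:b] keeps the samples with a <= key < b and returns the same type.
            lo = 0 if key.start is None else bisect_left(self.keys, key.start)
            hi = len(self.keys) if key.stop is None else bisect_left(self.keys, key.stop)
            return Samples1D(self.keys[lo:hi], self.values[lo:hi])
        # e[a] snaps to the nearest sample and returns its value.
        return self.values[self._nearest(key)]

    def _nearest(self, x):
        if not self.keys:
            raise KeyError(x)
        i = bisect_left(self.keys, x)
        if i == 0:
            return 0
        if i == len(self.keys):
            return i - 1
        before, after = self.keys[i - 1], self.keys[i]
        return i - 1 if x - before <= after - x else i


# The sampling is deliberately irregular: neither operation relies on regularity.
s = Samples1D([0.0, 0.1, 0.5, 2.0, 2.5], [1, 2, 3, 4, 5])
sliced = s[0.1:2.0]   # keeps the samples at 0.1 and 0.5
snapped = s[1.5]      # nearest sample is at 2.0
```

In 1D the nearest sample is found with a binary search over sorted keys, which is why irregular spacing is not a problem there.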

@philippjfr
Member

Thanks. Is it safe to say this?

Yes, looks good. Although indexing isn't actually disabled for the other types, you just won't be able to do it unless you hit the exact sample.

@jbednar
Member Author

jbednar commented Jan 12, 2016

I'm also confused -- shouldn't the Sampling Data tutorial also define what .select() does? select is used in only one location, without comment, but the conclusion of the tutorial says that we now know how to select data.

@philippjfr
Member

I'm also confused -- shouldn't the Sampling Data tutorial also define what .select() does? select is used in only one location, without comment, but the conclusion of the tutorial says that we now know how to select data.

Hmm, wonder if I had something and deleted it. The Exploring Data tutorial discusses it a little bit but we should probably present some simple examples here, showing that it's basically equivalent to indexing and slicing but can be applied to any datastructure however deeply nested it is and is convenient to select by one particular dimension.
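To show what "can be applied to any datastructure however deeply nested" might look like, here is a toy version written against plain dicts and lists. It is a sketch of the idea only: this `select` is not the HoloViews method, containers are assumed to be dicts, leaves are assumed to be lists of row dicts, a scalar criterion stands in for indexing, and a `(lo, hi)` tuple stands in for a half-open slice.

```python
def matches(value, criterion):
    """Scalar criterion: equality. (lo, hi) tuple: half-open range lo <= value < hi."""
    if isinstance(criterion, tuple):
        lo, hi = criterion
        return (lo is None or value >= lo) and (hi is None or value < hi)
    return value == criterion


def select(obj, **criteria):
    """Apply criteria by dimension name at whatever depth the rows live."""
    if isinstance(obj, dict):
        # Container level: recurse into every child with the same criteria.
        return {key: select(child, **criteria) for key, child in obj.items()}
    # Leaf level: keep the rows satisfying every criterion that applies to them.
    return [row for row in obj
            if all(matches(row[dim], c) for dim, c in criteria.items() if dim in row)]


data = {
    "A": {"run1": [{"x": 0, "y": 1}, {"x": 1, "y": 3}, {"x": 2, "y": 5}]},
    "B": {"run1": [{"x": 1, "y": 2}, {"x": 3, "y": 4}]},
}
ranged = select(data, x=(1, 3))   # slice-like: 1 <= x < 3 at every leaf
exact = select(data, x=1)         # index-like: x == 1 at every leaf
```

The convenience is that the caller names the dimension once and never writes out the nesting, whereas the equivalent `[]` expression would have to address every level explicitly.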

@jbednar
Member Author

jbednar commented Jan 12, 2016

Oh dear. No wonder I've never been able to keep indexing/sampling/selecting/slicing straight. Ok, is it safe to say this?

In addition to slicing using a syntax like `e[a:b]` (supported for all `Element` types) and indexing using a syntax like `e[a]` (supported for some `Element` types as described above), HoloViews elements also support a separate method `.select()`, which can do both slicing and indexing but is more verbose.

If .select is supported in some cases that regular [] slicing and indexing is not, I should say what those are. But there do seem to be cases where .select doesn't work where [] does, which implies that .select isn't more general. E.g. .select works for Curves, but doesn't seem to work for Histograms:

[images: two screenshots, .select applied to a Curve and to a Histogram]

So I'm not sure what I can say about .select.

@jbednar
Member Author

jbednar commented Jan 12, 2016

Also, the section on "The .table and .dframe methods" seems out of place in the middle of the Sampling Data tutorial; it doesn't seem to say anything about sampling data, just converting it. It's true that the subsequent section relies on it, but that doesn't mean it can go here. Will it fit into the Pandas or Columnar tutorials?

@philippjfr
Member

If .select is supported in some cases that regular [] slicing and indexing is not, I should say what those are. But there do seem to be cases where .select doesn't work where [] does, which implies that .select isn't more general. E.g. .select works for Curves, but doesn't seem to work for Histograms:

It isn't more general, just more convenient in some deeply nested cases, and Histogram should support select. Another bug I'll have to look into.

Also, the section on "The .table and .dframe methods" seems out of place in the middle of the Sampling Data tutorial; it doesn't seem to say anything about sampling data, just converting it. It's true that the subsequent section relies on it, but that doesn't mean it can go here. Will it fit into the Pandas or Columnar tutorials?

True, it's not really about sampling; moving them to Columnar Data might make sense, but we should make it clear that most Element types can be converted to a Table.

@philippjfr
Member

Just a quick side note: I was looking at some of the new docs for Bokeh and came across their new charts documentation, which has a good section about tall and wide data (see here). We use the same format for our Columns, so it might be worth providing a similar explanation.
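For readers unfamiliar with the terms, the difference between the two layouts can be shown without any library. The sketch below is generic: `wide_to_tall` is a hypothetical helper, the column names are made up, and nothing here is taken from the Bokeh or HoloViews documentation.

```python
def wide_to_tall(rows, id_column, variable_name="variable", value_name="value"):
    """Convert wide rows (one column per variable) to tall rows (one row per observation)."""
    tall = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            tall.append({id_column: row[id_column],
                         variable_name: column,
                         value_name: value})
    return tall


# Wide: each measured quantity has its own column.
wide = [
    {"time": 0, "glucose": 1.0, "water": 2.0},
    {"time": 1, "glucose": 1.5, "water": 2.5},
]

# Tall: one row per (time, variable) pair, with a single value column.
tall = wide_to_tall(wide, id_column="time")
```

Here the two wide rows become four tall rows, the first being `{"time": 0, "variable": "glucose", "value": 1.0}`.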

@philippjfr
Member

Should we merge soon? We wanted to get the website updated asap, so unless there are any half-finished changes I'd say merge now, I'll update the website and then you can open another PR to go through the rest of the Tutorials.

@jbednar
Member Author

jbednar commented Jan 14, 2016

I think it's fine to merge now. Please check my latest commits, and then merge. I'll still have more commits, but nothing else major is ready yet.

@philippjfr
Member

Something is going wrong in testing the Composing Data tutorial under Python 3; there are 8 of these errors:

======================================================================
ERROR: test_Composing_Data_data_000 (__main__.NBTester)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/ioam/holoviews/doc/nbpublisher/nbtest.py", line 437, in data_comparison
    ref_data,  ref_code =  pickle.load(ref_file)
ImportError: No module named 'UserString'

@jbednar
Member Author

jbednar commented Jan 14, 2016

That's odd; I don't know what UserString is. I did notice some print statements in your recently added tutorials that lacked () that I meant to fix for Python 3 usage, but I don't recall that being in Composing Data, and in any case I didn't change them...

@philippjfr
Member

Seeing the same errors on master so it's not your changes. Will have to look into it.

@philippjfr
Member

Traced down the bugs in the tests. Miniconda got upgraded from Python 3.4.3-2 to 3.4.4, which apparently breaks unpickling; it's pretty unsettling that they're breaking the pickling protocol in bug-fix releases. Will have to pin it to 3.4.3-2 for now.
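For background on the error message itself: Python 2 had a top-level `UserString` module, while in Python 3 the class lives in `collections`, so a pickle naming the old module can only load if that name is translated. An alternative to pinning, not what was done here, is to do the translation explicitly in an `Unpickler` subclass. The sketch below assumes the reference data is trusted (unpickling can execute arbitrary code); `MOVED_MODULES`, `RenamingUnpickler`, and `load_legacy` are illustrative names.

```python
import io
import pickle

# Python 2 module names mapped to their Python 3 homes. Only the entry
# relevant to the traceback above is listed; this is not an exhaustive table.
MOVED_MODULES = {"UserString": "collections"}


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that redirects references to modules that have moved."""

    def find_class(self, module, name):
        module = MOVED_MODULES.get(module, module)
        return super().find_class(module, name)


def load_legacy(data):
    """Load a pickle (given as bytes) that may reference Python 2 module names."""
    return RenamingUnpickler(io.BytesIO(data)).load()


# A hand-written protocol-0 pickle that references UserString.UserString,
# the way a Python 2 process would have recorded it.
legacy_payload = b"cUserString\nUserString\n(S'abc'\ntR."
restored = load_legacy(legacy_payload)
```

Overriding `find_class` is the hook the `pickle` documentation describes for controlling which globals are loaded, so the remapping does not depend on the interpreter's built-in compatibility table.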

@philippjfr
Member

Python 2 tests are passing here and Python 3 tests are now passing again on master so I'll go ahead and merge.

philippjfr added a commit that referenced this pull request Jan 16, 2016
@philippjfr philippjfr merged commit 9dbd827 into master Jan 16, 2016
@jbednar jbednar mentioned this pull request Jan 21, 2016

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 26, 2024