Doc cleanup #401

jbednar · 2016-01-08T22:19:59Z

Partial cleanup, ready for comments (and merging if appropriate) but not yet including edits to the main tutorials. It would be very good if Philipp or Jean-Luc could inspect my changes carefully, looking especially at:

1. I changed the tagline in the main site to match the one on GitHub.
2. Is the tone and level of detail right on Homepage.ipynb? I tried to make it more explicit about the difference in philosophy between HoloViews and other libraries, but it's always hard to explain that.
3. I added links to the new Bokeh tutorials from the Tutorials index, and references to Bokeh throughout, but there are probably other places to mention it and guide people towards which backend is appropriate for their purposes.
4. Is the Pandas tutorial out of date? It talks about converting Pandas dataframes, but I thought we were now supporting using them as-is, not copying or converting. Does this need to be clarified somewhere?
5. Can we simplify the GitHub README.rst? Right now it has a redundant copy of features.rst, which I vote should just be replaced with a link, but in any case at least needs to be updated from features.rst.
6. There were a few locations, e.g. in the FAQ, where I wanted to document something (e.g. the aliases system), but the only documentation I could find was in a pull request, so I put in a link to the pull request. We can either leave in such links as a pragmatic measure, or pull the information out of the pull request into something we can maintain over time.
7. I added "bokeh" to the conda install command, so that the result would be more like what happens for matplotlib (things just work). But the right approach is presumably to make the conda package be completely independent of matplotlib (with the default backend falling back to bokeh when mpl is not installed), and then tell people to choose whichever command they wish:
   - conda install holoviews # no plotting backend
   - conda install holoviews matplotlib # default backend is mpl
   - conda install holoviews bokeh # default backend is bokeh
   - conda install holoviews matplotlib bokeh # default backend is mpl
8. I've run the tutorials, but don't know how to build the web site locally (which is a big pain and really discourages updating the web site!), so I can't tell if everything's still formatted properly.
9. It needs links to the SciPy paper in appropriate locations, because that has more philosophical discussion that makes good background reading.
10. I still need to go through the other tutorials, but won't be able to do that today or tomorrow.

philippjfr · 2016-01-08T23:03:28Z

Thanks for going through it, looks good overall.

> (2) Is the tone and level of detail right on Homepage.ipynb? I tried to make it more explicit about the difference in philosophy between HoloViews and other libraries, but it's always hard to explain that.

Looks fine to me.

> (4) Is the Pandas tutorial out of date? It talks about converting Pandas dataframes, but I thought we were now supporting using them as-is, not copying or converting. Does this need to be clarified somewhere?

It is, yes; large parts of it can be reused as material for either a Transforming Data or a Columns tutorial.

> (5) Can we simplify the GitHub README.rst? Right now it has a redundant copy of features.rst, which I vote should just be replaced with a link, but in any case at least needs to be updated from features.rst.

Not sure; it's nice to have a concrete list of features on the GitHub landing page, especially since it actually receives more unique visitors than the website.

> (6) There were a few locations, e.g. in the FAQ, where I wanted to document something (e.g. the aliases system), but the only documentation I could find was in a pull request, so I put in a link to the pull request. We can either leave in such links as a pragmatic measure, or pull the information out of the pull request into something we can maintain over time.

We should open an issue to collect items that should get some documentation so we can prioritize.

> (7) I added "bokeh" to the conda install command, so that the result would be more like what happens for matplotlib (things just work). But the right approach is presumably to make the conda package be completely independent of matplotlib

It shouldn't be too hard to drop the matplotlib dependency for the bokeh backend and fallbacks shouldn't be too hard either. Of course any composite object will continue to simply warn about any unsupported Element types.
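
As a rough illustration of the fallback idea, here is a minimal sketch that picks a default backend from whatever is importable. It is not the HoloViews implementation: `pick_default_backend`, `PREFERRED_BACKENDS`, and the `is_installed` hook are hypothetical names, and the preference order (matplotlib first, then bokeh) is only an assumption taken from the install options listed in this thread.

```python
from importlib.util import find_spec

# Assumed preference order: matplotlib wins when both backends are installed.
PREFERRED_BACKENDS = ("matplotlib", "bokeh")


def pick_default_backend(preferred=PREFERRED_BACKENDS, is_installed=None):
    """Return the first installed backend name, or None if none is available.

    is_installed is a hypothetical hook (name -> bool) so the choice can be
    tested without actually installing or removing packages.
    """
    if is_installed is None:
        # find_spec returns None for a missing top-level package and does
        # not import the package itself.
        def is_installed(name):
            return find_spec(name) is not None
    for name in preferred:
        if is_installed(name):
            return name
    return None  # no plotting backend: objects can still be built, not shown


if __name__ == "__main__":
    print("default backend:", pick_default_backend())
```

Returning `None` rather than raising keeps the "no plotting backend" install usable for building and saving objects; a warning at display time would be the natural place to tell the user what to install.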

> (8) I've run the tutorials, but don't know how to build the web site locally (which is a big pain and really discourages updating the web site!), so I can't tell if everything's still formatted properly.

This should do it:

    cd doc
    conda install sphinx runipy
    make ipynb-doc

The problem is that the tests require very specific versions of freetype and matplotlib and will only work on Linux, so it'll just abort. We should add an option to build the docs without tests.

jbednar · 2016-01-08T23:10:42Z

Travis says https://travis-ci.org/ioam/holoviews/jobs/101181829 failed with display output mismatch, but I can't see what the output difference is...

jbednar · 2016-01-08T23:13:31Z

BTW, the Introduction tutorial shows that a Numpy array will be preserved inside a HoloViews object; can the same be shown for a Pandas dataframe or other input types? If so we should say that in that location, and show it somewhere (there if appropriate, or elsewhere).

philippjfr · 2016-01-08T23:15:36Z

> BTW, the Introduction tutorial shows that a Numpy array will be preserved inside a HoloViews object; can the same be shown for a Pandas dataframe or other input types? If so we should say that in that location, and show it somewhere (there if appropriate, or elsewhere).

Should presumably go in the notebook about the Columns interface, which I'll adapt from the existing Pandas Conversion notebook.

philippjfr · 2016-01-08T23:17:08Z

> Travis says https://travis-ci.org/ioam/holoviews/jobs/101181829 failed with display output mismatch, but I can't see what the output difference is...

The display tests for two cells in the Elements tutorial are still flaky; if you see a failure in cell 43 or 46, just restart the build.

jbednar · 2016-01-08T23:23:50Z

Ok, I managed to cover two more of the tutorials; still lots to go, but the intro ones are the most important and have the most changes...

jbednar · 2016-01-08T23:56:29Z

Oh, and I had some changes to About (and maybe others?) stashed away; any idea how to inspect the stash manually to cut and paste those out of there? If not I'll have to recreate them...

philippjfr · 2016-01-08T23:59:49Z

You can inspect the last stash with:

git stash show -p stash@{0}

I would seriously recommend getting magit for Emacs, though; it makes working with git so much easier.

jlstevens · 2016-01-09T01:32:26Z

Yes! I highly recommend magit!

Anyway, thank you for doing this - I'll catch up and make some comments.

jlstevens · 2016-01-09T01:53:57Z

It is really great to see a thorough update of the website!

Here are my replies to your queries:

> (1) I changed the tagline in the main site to match the one on Github.

Great!

> (2) Is the tone and level of detail right on Homepage.ipynb? I tried to make it more explicit about the difference in philosophy between HoloViews and other libraries, but it's always hard to explain that.

Yes, the tone is right and I like the updated text. It is a bit wordy, but I think that is unavoidable as it is quite hard to convey what HoloViews offers with text only.

> (5) Can we simplify the GitHub README.rst? Right now it has a redundant copy of features.rst, which I vote should just be replaced with a link, but in any case at least needs to be updated from features.rst.

I agree with Philipp. I use the README as the basis of the PyPI page (with minor tweaks) and the Anaconda Page. I also think it is important to have a nice list of features on the GitHub landing page.

> (6) There were a few locations, e.g. in the FAQ, where I wanted to document something (e.g. the aliases system), but the only documentation I could find was in a pull request, so I put in a link to the pull request. We can either leave in such links as a pragmatic measure, or pull the information out of the pull request into something we can maintain over time.

Linking to PRs is fine, and it is tricky to find a place for the miscellaneous odds and ends. The wiki is still a good place to record various tips and tricks as we encounter them, but I do think we want a more visible place for newcomers to find them. I can imagine something in the style of the FAQ (which should be for higher-level questions) that is broken down into sections (rendering, styling, exporting etc). Whatever we do, it should be easily searchable.

> (7) I added "bokeh" to the conda install command, so that the result would be more like what happens for matplotlib (things just work). But the right approach is presumably to make the conda package be completely independent of matplotlib.

Are you suggesting the conda package include bokeh as a dependency (already the case I believe) but not matplotlib? I do agree the bokeh backend would ideally not require matplotlib, but I have not found a good way to do the equivalent of the extras_require with conda. That is why the conda package includes everything (and conda can install more dependencies a lot quicker than pip anyway).

Anyway, I hope we will be able to shift our attention to improving the website building process now, as well as authoring new tutorials and improving the documentation in general. For the time being, I'll have a closer look at the rest of the updated text and make inline comments if I have anything to suggest...

jlstevens · 2016-01-09T02:26:44Z

doc/FAQ.rst

+
+(hv.Image(np.random.rand(10,10), group=al.Spectrum, label=al.Glucose) +
+ hv.Image(np.random.rand(10,10), group=al.Spectrum, label=al.Water))
+```


I think documenting this in the FAQ isn't too bad in the end. It is probably things like a collection of final_hooks examples that are hard to find a place for...

jbednar · 2016-01-09T14:28:34Z

> > (7) I added "bokeh" to the conda install command, so that the result would be more like what happens for matplotlib (things just work). But the right approach is presumably to make the conda package be completely independent of matplotlib.
>
> Are you suggesting the conda package include bokeh as a dependency (already the case I believe) but not matplotlib? I do agree the bokeh backend would ideally not require matplotlib, but I have not found a good way to do the equivalent of the extras_require with conda. That is why the conda package includes everything (and conda can install more dependencies a lot quicker than pip anyway).

No, I'm suggesting that neither bokeh nor matplotlib should be dependencies of the HoloViews conda package, and that we ask users to specify whichever one or the other they wish to use, recommending that they install both but allowing them to install neither. conda install holoviews matplotlib is not unwieldy, and is explicitly about something the user cares about. (Whereas asking them to do conda install holoviews param would not make sense, as using Param is our own choice, not theirs.)

jlstevens · 2016-01-09T14:36:15Z

> ... that we ask users to specify whichever one or the other they wish to use, recommending that they install both but allowing them to install neither.

I am ok with this and it makes sense to me. It does mean we need the user to install the plotting backend themselves, but that makes sense if we are going to support more options (e.g. plotly) in future.

I can make this change to the conda package for the next minor release, i.e. 1.4.2 (which will probably happen in a week or two).

jbednar · 2016-01-09T14:39:51Z

Changing the conda package spec should be straightforward, and I don't think the change should affect any existing HoloViews users, because they will have already installed matplotlib previously. But of course we do need to make sure that the codebase can run in some meaningful way without either matplotlib or bokeh (presumably a test of just creating, saving, and restoring each Element and Container type, without plotting them?). And we also need to make the Bokeh plotting work when Matplotlib is not installed, which I believe requires making our own copies of a few Matplotlib objects that we use as defaults (e.g. colormaps).

jbednar · 2016-01-09T18:50:11Z

BTW, is there already an issue to fix the table of contents from the Elements tutorials online, which are not formatting and linking correctly? They seem formatted fine in my own IPython session, but online are a mess: http://holoviews.org/dev/Tutorials/Elements

philippjfr · 2016-01-09T19:18:00Z

> BTW, is there already an issue to fix the table of contents from the Elements tutorials online, which are not formatting and linking correctly? They seem formatted fine in my own IPython session, but online are a mess: http://holoviews.org/dev/Tutorials/Elements

Not sure what happened there, might be a recent version of nbconvert or Sphinx that broke it. May have to create a pure HTML index.

jbednar · 2016-01-11T19:23:16Z

What's the status of the Transforming Data tutorial currently linked from the sidebar? Should I just delete that? Or is there some less-than-perfect version we can slap up there? Anything is better than nothing, even if it's just some code with no text...

philippjfr · 2016-01-11T19:25:03Z

Yes, please delete it. It doesn't exist and has been subsumed by the Sampling Data and now the Columnar Data tutorial. We do still need a Tutorial about operations but that should probably be called something different.

jlstevens · 2016-01-12T03:56:15Z

Ok, we are agreed then! Good that we have decided on what is supposed to be happening and what exactly is a bug and not a feature. :-)

jbednar · 2016-01-12T04:01:26Z

Anyway, as you can probably see, I'm working on the Sampling Data tutorial, which I don't think I've ever previously edited, so it's going slowly. I think slicing is making sense mostly, barring the bug above, but I'm having trouble understanding what the different high-level categories of object really are and what is supported for each meaningful category. E.g. the Sampling Data tutorial has a section "Regularly sampled or binned data in 1D", but I can't see anything about that section that requires regular sampling. E.g. the first example in it works fine if I change the sampling to irregular:

So I can't figure out what the real breakdown into important categories is -- slicing just works, and so isn't informative of category, discrete vs. continuous seems to be used inconsistently, regular sampling isn't needed for the cases it's said to be needed, etc. -- what's the real story? I'm trying to work it out...

philippjfr · 2016-01-12T04:11:11Z

> So I can't figure out what the real breakdown into important categories is -- slicing just works, and so isn't informative of category, discrete vs. continuous seems to be used inconsistently, regular sampling isn't needed for the cases it's said to be needed, etc. -- what's the real story? I'm trying to work it out...

Histograms are different again since their data is defined in terms of bin boundaries. We ideally also want to switch those to be Columns types eventually. Their current indexing behavior should return the value for the current bin you're in, while slicing has to include the bin boundaries.
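
To make the bin-based indexing concrete, here is a toy sketch of "return the value for the bin you're in". It is not the Histogram code: `bin_value` is a hypothetical helper, and treating bins as half-open [left, right) with the final bin also including its right edge is an assumption of this sketch. Slicing is deliberately left out.

```python
from bisect import bisect_right


def bin_value(edges, values, x):
    """Return the value of the bin that contains x.

    edges  -- sorted bin boundaries, one more entry than values
    values -- one value per bin
    Bins are half-open [left, right); the last bin also includes its
    right edge (an assumption made for this sketch).
    """
    if len(edges) != len(values) + 1:
        raise ValueError("expected exactly one more edge than values")
    if not edges[0] <= x <= edges[-1]:
        raise KeyError(x)
    # bisect_right counts the edges that are <= x; minus one is the bin index.
    i = bisect_right(edges, x) - 1
    # x == edges[-1] lands one past the end, so clamp it into the last bin.
    return values[min(i, len(values) - 1)]


# Irregular bin widths work just as well as regular ones.
edges = [0.0, 1.0, 2.0, 4.0]
counts = [5, 3, 7]

if __name__ == "__main__":
    for x in (0.5, 1.0, 3.9, 4.0):
        print(x, "->", bin_value(edges, counts, x))
```

Nothing here depends on the bins being equally wide, which matches the point made elsewhere in this thread that regular sampling is not required.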

The only real distinction is that some types are treated as discrete samples of a continuous space and therefore snap. This is currently restricted to 1D Columns types (Curve, Scatter, ErrorBars) and Histograms. All other types are agnostic to their sampling.

jbednar · 2016-01-12T04:15:49Z

> You're also right that it doesn't really have anything to do with regular sampling; nothing about Columns, Histogram and Path types assumes or enforces anything about the data being regularly sampled. So I guess you should remove references to that.

Ok, will do.

> The only real distinction is that some types are treated as discrete samples of a continuous space and therefore snap. This is currently restricted to 1D Columns types (Curve, Scatter, ErrorBars) and Histograms. All other types are agnostic to their sampling.

But don't Image and other Raster types also support snapping? In that case the regular grid is important, because it allows you to compute the nearest neighbor in 2D with no search and a well-bounded uncertainty on the point position.
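
A small sketch of what "no search" means on a regular grid: the cell containing a point follows from arithmetic on the bounds alone. This is an illustration, not the Image implementation; `snap_to_cell`, the `(left, bottom, right, top)` bounds tuple, and the convention that row 0 is the top row are all assumptions made for the example.

```python
def snap_to_cell(x, y, bounds, shape):
    """Map a continuous (x, y) to the (row, col) of the cell containing it.

    bounds -- (left, bottom, right, top) of the sampled region
    shape  -- (rows, cols) of the regular grid
    Row 0 is taken to be the top row, as in an image array (an assumption).
    """
    left, bottom, right, top = bounds
    rows, cols = shape
    if not (left <= x <= right and bottom <= y <= top):
        raise KeyError((x, y))
    cell_w = (right - left) / cols
    cell_h = (top - bottom) / rows
    # Constant-time lookup: divide the offset by the cell size. The min()
    # keeps points lying exactly on the right or bottom edge in the last cell.
    col = min(int((x - left) / cell_w), cols - 1)
    row = min(int((top - y) / cell_h), rows - 1)
    return row, col


if __name__ == "__main__":
    # A 2x4 grid of unit cells covering x in [0, 4] and y in [0, 2].
    print(snap_to_cell(0.2, 1.7, (0, 0, 4, 2), (2, 4)))
    print(snap_to_cell(3.9, 0.1, (0, 0, 4, 2), (2, 4)))
```

Because every cell has the same size, the requested point is never more than half a cell width (or height) away from the center of the cell it snaps to, which is the bounded uncertainty mentioned above.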

philippjfr · 2016-01-12T04:23:10Z

> But don't Image and other Raster types also support snapping? In that case the regular grid is important, because it allows you to compute the nearest neighbor in 2D with no search and a well-bounded uncertainty on the point position.

True, I was only referring to Columns types and the other types that could eventually become Columns-like. Raster types are another example of Elements that represent discrete samples of a continuous space and therefore snap (this time in 2D). QuadMesh is again discrete but not necessarily regular.

Here's a breakdown of the way I see the different Elements:

0D Continuous: Distribution, Spikes*, BoxWhisker*
1D discrete with snapping: Scatter, Curve, Histogram, ErrorBars
1D general: Bars*
2D discrete with snapping: Raster, Image, Surface, QuadMesh
2D general: HeatMap
2D general: Points, Paths, Polygons
3D general: Scatter3D, Trisurface

Note that the dimensionality here refers to the dimensionality of the key dimensions; Columns types allow any number of value dimensions to be defined.

* - can also be multi-dimensional

jbednar · 2016-01-12T04:31:22Z

Thanks. Is it safe to say this?

All Element types support slicing using a syntax like `e[a:b]`, which will return another Element of the same type, with the data from the specified range [*a*,*b*).
Some Elements also support indexing, using a syntax like `e[a]`, which will return scalar values representing the `vdims` (value_dimensions) of the nearest data point. Specifically, the `Scatter`, `Curve`, `Histogram`, and `ErrorBars` 1D Element types, and all of the 2D `Raster`-based types (`Raster`, `Image`, `RGB`, `HSV`, `QuadMesh`) support indexing, because in each case finding the nearest data point to the requested coordinate is straightforward. Other types do *not* support indexing, because doing so would represent a poorly defined search in multidimensional space, rather than just computing bin location in 2D or nearest neighbors in 1D.
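
The two behaviours described above can be mimicked with a toy 1D container. This is only a sketch of the semantics as worded in this comment (half-open slicing, snapping on scalar indexing); `Samples1D` is a hypothetical class and is not how any HoloViews Element is implemented.

```python
from bisect import bisect_left


class Samples1D:
    """Toy 1D element: x positions paired with y values, kept sorted by x."""

    def __init__(self, xs, ys):
        if len(xs) != len(ys):
            raise ValueError("xs and ys must have the same length")
        pairs = sorted(zip(xs, ys))
        self.xs = [x for x, _ in pairs]
        self.ys = [y for _, y in pairs]

    def __getitem__(self, key):
        if isinstance(key, slice):
            # e[a:b] keeps samples with a <= x < b and returns the same type.
            # Any slice step is ignored in this sketch.
            lo = 0 if key.start is None else bisect_left(self.xs, key.start)
            hi = len(self.xs) if key.stop is None else bisect_left(self.xs, key.stop)
            return Samples1D(self.xs[lo:hi], self.ys[lo:hi])
        if not self.xs:
            raise KeyError(key)
        # e[a] snaps to the nearest sample; only the two neighbours of the
        # insertion point can be nearest, so no full search is needed.
        i = bisect_left(self.xs, key)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self.xs)]
        nearest = min(candidates, key=lambda j: abs(self.xs[j] - key))
        return self.ys[nearest]


if __name__ == "__main__":
    e = Samples1D([0.0, 0.3, 1.0, 2.5], [10, 11, 12, 13])
    print(e[0.3:2.5].xs)  # samples in [0.3, 2.5)
    print(e[0.9])         # value at the sample nearest to x = 0.9
```

The samples are irregularly spaced on purpose: snapping in 1D only needs sorted positions, not a regular grid.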

philippjfr · 2016-01-12T04:32:59Z

> Thanks. Is it safe to say this?

Yes, looks good. Although indexing isn't actually disabled for the other types, you just won't be able to do it unless you hit the exact sample.

jbednar · 2016-01-12T04:33:44Z

I'm also confused -- shouldn't the Sampling Data tutorial also define what .select() does? select is used in only one location, without comment, but the conclusion of the tutorial says that we now know how to select data.

philippjfr · 2016-01-12T04:37:49Z

> I'm also confused -- shouldn't the Sampling Data tutorial also define what .select() does? select is used in only one location, without comment, but the conclusion of the tutorial says that we now know how to select data.

Hmm, I wonder if I had something and deleted it. The Exploring Data tutorial discusses it a little bit, but we should probably present some simple examples here, showing that it's basically equivalent to indexing and slicing but can be applied to any data structure, however deeply nested, and is a convenient way to select by one particular dimension.
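
As a rough picture of the "select by dimension name at any depth" idea, here is a toy free function over plain dicts and lists. It is not the HoloViews `.select()` method: the `select` function below, the row-dict layout, and the `(lo, hi)` tuple standing in for a slice are all assumptions made for the sketch.

```python
def select(obj, **criteria):
    """Filter rows by dimension name, however deeply the rows are nested.

    obj is either a list of row dicts (a leaf) or a dict of nested objects.
    Each criterion is a scalar (exact match) or a (lo, hi) tuple meaning
    lo <= value < hi, mirroring half-open slicing.
    """
    if isinstance(obj, dict):
        # Container: push the same selection down to every child.
        return {key: select(child, **criteria) for key, child in obj.items()}

    def matches(row):
        for dim, wanted in criteria.items():
            if dim not in row:
                continue  # this sketch ignores dimensions a row does not have
            value = row[dim]
            if isinstance(wanted, tuple):
                lo, hi = wanted
                if not lo <= value < hi:
                    return False
            elif value != wanted:
                return False
        return True

    return [row for row in obj if matches(row)]


nested = {
    "A": {"run1": [{"x": 0, "y": 1}, {"x": 1, "y": 4}, {"x": 2, "y": 9}]},
    "B": [{"x": 1, "y": -1}, {"x": 3, "y": -3}],
}

if __name__ == "__main__":
    print(select(nested, x=(1, 3)))  # slice-like selection on x
    print(select(nested, x=1))       # index-like selection on x
```

The caller names the dimension once and never has to know how deep the leaves are, which is the convenience being described; the result is no more general than slicing each leaf by hand.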

jbednar · 2016-01-12T05:05:32Z

Oh dear. No wonder I've never been able to keep indexing/sampling/selecting/slicing straight. Ok, is it safe to say this?

In addition to slicing using a syntax like `e[a:b]` (supported for all `Element` types) and indexing using a syntax like `e[a]` (supported for some `Element` types as described above), HoloViews elements also support a separate method `.select()`, which can do both slicing and indexing but is more verbose.

If .select is supported in some cases that regular [] slicing and indexing is not, I should say what those are. But there do seem to be cases where .select doesn't work where [] does, which implies that .select isn't more general. E.g. .select works for Curves, but doesn't seem to work for Histograms:

So I'm not sure what I can say about .select.

jbednar · 2016-01-12T05:10:53Z

Also, the section on "The .table and .dframe methods" seems out of place in the middle of the Sampling Data tutorial; it doesn't seem to say anything about sampling data, just converting it. It's true that the subsequent section relies on it, but that doesn't mean it can go here. Will it fit into the Pandas or Columnar tutorials?

philippjfr · 2016-01-12T13:22:03Z

> If .select is supported in some cases that regular [] slicing and indexing is not, I should say what those are. But there do seem to be cases where .select doesn't work where [] does, which implies that .select isn't more general. E.g. .select works for Curves, but doesn't seem to work for Histograms:

It isn't more general, just more convenient in some deeply nested cases, and Histogram should support select. Another bug I'll have to look into.

> Also, the section on "The .table and .dframe methods" seems out of place in the middle of the Sampling Data tutorial; it doesn't seem to say anything about sampling data, just converting it. It's true that the subsequent section relies on it, but that doesn't mean it can go here. Will it fit into the Pandas or Columnar tutorials?

True, it's not really about sampling; moving them to Columnar Data might make sense, but we should make it clear that most Element types can be converted to a Table.

philippjfr · 2016-01-12T13:26:31Z

Just a quick side note: I was looking at some of the new docs for Bokeh and came across their new charts documentation, which has a good section about tall and wide data (see here). We use the same format for our Columns, so it might be worth providing a similar explanation.
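
For readers unfamiliar with the terms, a minimal stdlib sketch of the two layouts: "wide" data has one column per variable, while "tall" data has one row per observation with the variable name stored in its own column. The `wide_to_tall` helper and the fruit data are made up for illustration and say nothing about how Columns stores data internally.

```python
def wide_to_tall(rows, id_column, var_name="variable", value_name="value"):
    """Reshape wide rows (one column per variable) into tall rows.

    Each non-id column of each input row becomes its own output row holding
    the id, the original column name, and the value.
    """
    tall = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            tall.append({id_column: row[id_column],
                         var_name: column,
                         value_name: value})
    return tall


# Wide: one row per year, one column per fruit.
wide = [
    {"year": 2014, "apples": 3, "pears": 5},
    {"year": 2015, "apples": 4, "pears": 2},
]

# Tall: one row per (year, fruit) observation.
tall = wide_to_tall(wide, "year", var_name="fruit", value_name="count")

if __name__ == "__main__":
    for row in tall:
        print(row)
```

The tall form grows in rows rather than columns when a new variable is added, which is why it tends to suit code that treats some columns as keys and others as values.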

philippjfr · 2016-01-14T15:01:40Z

Should we merge soon? We wanted to get the website updated asap, so unless there are any half-finished changes I'd say merge now, I'll update the website and then you can open another PR to go through the rest of the Tutorials.


jbednar · 2016-01-14T16:35:25Z

I think it's fine to merge now. Please check my latest commits, and then merge. I'll still have more commits, but nothing else major is ready yet.

philippjfr · 2016-01-14T16:44:59Z

Something is going wrong in testing the Composing Data tutorial under Python 3; there are 8 of these errors:

======================================================================
ERROR: test_Composing_Data_data_000 (__main__.NBTester)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/ioam/holoviews/doc/nbpublisher/nbtest.py", line 437, in data_comparison
    ref_data,  ref_code =  pickle.load(ref_file)
ImportError: No module named 'UserString'

jbednar · 2016-01-14T17:00:35Z

That's odd; I don't know what UserString is. I did notice some print statements in your recently added tutorials that lacked () that I meant to fix for Python 3 usage, but I don't recall that being in Composing Data, and in any case I didn't change them...

philippjfr · 2016-01-14T17:07:30Z

Seeing the same errors on master so it's not your changes. Will have to look into it.

philippjfr · 2016-01-16T13:43:25Z

Traced down the bugs in the tests. Miniconda got upgraded from Python 3.4.3-2 to 3.4.4, which apparently breaks unpickling. It's pretty unsettling that they're breaking the pickling protocol in bug-fix releases. Will have to pin it to 3.4.3-2 for now.

philippjfr · 2016-01-16T14:22:12Z

Python 2 tests are passing here and Python 3 tests are now passing again on master so I'll go ahead and merge.


github-actions · 2024-10-26T09:26:08Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

jbednar added 3 commits January 8, 2016 15:58

Improved docs for 1.4 with updates, fixes, and clarifications

e4285a8

Updated tagline

a38d884

Updated Showcase for 1.4

128e8a5

Updated Introduction for 1.4

b08443f

jlstevens reviewed Jan 9, 2016

Updated About for 1.4

5245f1d

Updated for 1.4

d7f443c

philippjfr mentioned this pull request Jan 9, 2016

Problem with select widgets #403

Closed

Tracking master

230d220

jbednar added 4 commits January 14, 2016 10:26

Added Bokeh data download if not already available

adbe7e0

Updated description of Points and Scatter to match current indexing behavior.

555e502

Replaced links for Transforming Data to Sampling Data

8f5f7ef

Minor clarification to docs

d95bd06

philippjfr added a commit that referenced this pull request Jan 16, 2016

Merge pull request #401 from ioam/doc-cleanup

9dbd827

Doc cleanup

philippjfr merged commit 9dbd827 into master Jan 16, 2016

jbednar mentioned this pull request Jan 21, 2016

Doc cleanup 2 #411

Merged

github-actions bot locked as resolved and limited conversation to collaborators Oct 26, 2024
