Skip to content

Template#1

Merged
lucyleeow merged 11 commits intosphinx-gallery:masterfrom
lucyleeow:template
Jan 15, 2020
Merged

Template#1
lucyleeow merged 11 commits intosphinx-gallery:masterfrom
lucyleeow:template

Conversation

@lucyleeow
Copy link
Copy Markdown
Collaborator

Basic template.

Should I add:

  • a simple module with a function or two and use it in the examples?
  • requirements file
  • example data and download the data in a template (suggested by @larsoner here)
  • pandas dataframe in example to show html being captured

Suggestions welcome.

@larsoner
Copy link
Copy Markdown
Contributor

@lucyleeow you'll want to set up CircleCI to do a build so we can see artifacts

@larsoner
Copy link
Copy Markdown
Contributor

(let's start with getting that to work so we can easily iterate and review, we'll want it as part of the template anyway)

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

@larsoner how do I get circleci to work on this PR....? I added a basic circleci config.yml file, merged this into master and it ran on master...?

@larsoner
Copy link
Copy Markdown
Contributor

You have to tell it to "build forked pull requests" in the settings on the circle site. Let me know if you can't find it and I can do it

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

Thanks @larsoner. The output is here.

@larsoner
Copy link
Copy Markdown
Contributor

Next steps it would be good to have:

  1. some sample_project module with a function and a class
  2. examples that use these
  3. backreferences support
  4. modified class.rst and function.rst showing how to add the backrefs
  5. the pandas dataframe repr stuff

@larsoner
Copy link
Copy Markdown
Contributor

Should we merge this, or wait until that stuff is in place? I'm fine either way

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

I can add to this PR and merge as one. Thanks!

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

@larsoner I am a little lost as to how why the function parameter and out parts is not formatted correctly. I think docstring is the same as functions in SG - is there a setting to set?

Also I don't know why an empty backreferences_dir examples file is made for the function but seems to work perfect for the class (at least locally).

@larsoner
Copy link
Copy Markdown
Contributor

Probably autosummary and/or napolean would help here. Also maybe autosummary_generate=True config values, etc. Basically I would copy bits of the SG config until I found what fixed things.

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

lucyleeow commented Dec 16, 2019

@larsoner The layout of the autodoc was fixed by adding 'sphinx.ext.napoleon' and 'sphinx.ext.intersphinx'. I did notice however, that pd.DataFrame was not linked via intersphinx (here). This lead me to check the SG example gallery, and it is not linked there either. Any idea why? Matplotlib plot links perfectly fine.

The backreferences file for the function in the SampleModule works if the function is specifically imported (i.e., from SampleName.module import fun) but an empty file is generated if just the package module is imported (import SampleName.module) and the function used in the example like this: package.module.fun()

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

@larsoner backreferences works also if I import the function in the __init__.py file

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

I'm not sure if this is a bug or if I am just doing it wrong. I think the case that doesn't work for me works in sklearn documentation.

@larsoner
Copy link
Copy Markdown
Contributor

@lucyleeow these sound like bugs/limitations in the parsing code, somewhere in these functions:

https://github.com/sphinx-gallery/sphinx-gallery/blob/master/sphinx_gallery/backreferences.py#L62-L190

These are often a pain to debug, I think last time I needed to do it I added some if <condition>: raise RuntimeError in some helpful place, modified tinybuild, ran python -i -msphinx ... call to actually build the doc, and then when it hit my RuntimeError I did import pdb; pdb.pm(). It's ugly but it worked. I don't know of a nicer way to get a debug prompt for a Sphinx build, though one probably exists. For example you can probably call sphinx's main from within a script, with some break points set in your IDE. But I never bothered checking.

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

Last thing I wanted to add was dealing with data files (discussed in sphinx-gallery/sphinx-gallery#565 (comment)).

Not sure how best to do this. My plan would be to download data from a url (with urllib2) and save to a SG_data folder - in the examples_dir? Do I need to give full path or relative path to the folder when doing this?

@larsoner
Copy link
Copy Markdown
Contributor

What mne-python does (and I think it's modeled after nilearn?) is to download data by default to ~/mne_data, so maybe download it to ~/sg_data. This path then gets saved in a config file in the user's system to know where to look for the data. That way if they did in the first call sample.data_path('/home/larsoner/somewhere/else') and this downloaded the data to this non-standard location, then the next call to data_path = sample.data_path() will know to look in the non-standard location.

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

This path then gets saved in a config file in the user's system to know where to look for the data.

How does this happen?

That way if they did in the first call sample.data_path('/home/larsoner/somewhere/else')

Sorry confused with what sample.data_path is. A function I define for downloading data?

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

(I'm not very familiar with this area)

@larsoner
Copy link
Copy Markdown
Contributor

See docs: https://mne.tools/dev/generated/mne.datasets.sample.data_path.html

Eventually it gets set with set_config https://github.com/mne-tools/mne-python/blob/master/mne/datasets/utils.py#L212

This code is dense because we have a lot of datasets, you probably want to start with something simpler if possible

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

Not confident I've done it right but I added a module with functions to download data to a file and save this location in a config file. Copied sklearn.

@larsoner
Copy link
Copy Markdown
Contributor

Looks like it's working!

This seems like a reasonable start to me. Things for an eventual todo (subsequent PRs when people want):

  • Add CircleCI caching
  • Add CircleCI config to download files ahead of time rather than in-script (i.e., assume people have downloaded data)
  • Add tqdm progress bar to download

Someday packages that use dataset downloading (mne, nilearn, sklearn) should think about outsourcing data file downloading duties to a new third-party module. I think all packages have benefitted greatly from sphinx-gallery being a package, having some dataset_downloader (datasetter?) package that we all could vendor or import would be nice. Less stuff to maintain and we all benefit from our fixes...

In any case, @lucyleeow I would say let's merge this and iterate in subsequent PRs

@GaelVaroquaux
Copy link
Copy Markdown

GaelVaroquaux commented Jan 15, 2020 via email

@larsoner
Copy link
Copy Markdown
Contributor

Optimistically I think that the sklearn/nilearn/MNE maintainers could fight amongst themselves long enough to make it happen :)

I guess a natural place for this to live would be sphinx-gallery/<name>.git. I might work on starting something like this with direct pushes, open a WIP PR to integrate it into in MNE, and then nilearn, then sklearn. Try to start easy and ascend the difficulty gradient :)

@lucyleeow lucyleeow mentioned this pull request Jan 15, 2020
3 tasks
@lucyleeow
Copy link
Copy Markdown
Collaborator Author

@larsoner added your suggestions in an issue. I'll merge then!

@lucyleeow lucyleeow merged commit 5043ae3 into sphinx-gallery:master Jan 15, 2020
@lucyleeow
Copy link
Copy Markdown
Collaborator Author

@larsoner to clarify:

Add CircleCI config to download files ahead of time rather than in-script (i.e., assume people have downloaded data)

How does CircleCI relate to whether people (user?) have downloaded data locally?

@larsoner
Copy link
Copy Markdown
Contributor

I'm thinking about the script output, which could give download status updates and also gives a time estimate for running the script. If the data file is downloaded during the script run, then the time estimate of the example is affected as is the output (it would eventually show "downloading..." or whatever if we made the downloader actually give updates). Downloading all example data before building docs will more accurately reflect what a user will experience if they have already downloaded the data.

Also, downloading data first has been very helpful for MNE because then our 1.5 hr doc build will fail 10 minutes in if it can't download all of the necessary data, rather than it taking the full 1.5 hr for SG to fail and report the error.

@lucyleeow
Copy link
Copy Markdown
Collaborator Author

Yes that sounds useful. I was confused about how changing CircleCI affects doc building locally but I realise you mean building in CircleCI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants