Skip to content

Use separate files for each DataFrame function and method#7890

Merged
jsignell merged 5 commits intodask:mainfrom
jsignell:dataframe-api
Jul 19, 2021
Merged

Use separate files for each DataFrame function and method#7890
jsignell merged 5 commits intodask:mainfrom
jsignell:dataframe-api

Conversation

@jsignell
Copy link
Copy Markdown
Member

@jsignell jsignell commented Jul 13, 2021

This copies the pattern that numpy and xarray use. You should be able to see the difference in load time https://dask--7890.org.readthedocs.build/en/7890/dataframe-api.html

Copy link
Copy Markdown
Member

@jrbourbeau jrbourbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jsignell, this looks great! A couple of questions that come to mind:

  • Should we also update the clean command in docs/Makefile to also the generated directory? I'm not sure what sort of caching sphinx does here
  • Does this change impact other projects which are using intersphinx to link to Dask API docs? I'm wondering if we need to add redirects for this, or if things will continue to work as is

@jsignell
Copy link
Copy Markdown
Member Author

jsignell commented Jul 13, 2021

Yes!

  • Does this change impact other projects which are using intersphinx to link to Dask API docs? I'm wondering if we need to add redirects for this, or if things will continue to work as is.

I'm not sure about that one, but I am tempted to take a merge and see approach.

@jsignell
Copy link
Copy Markdown
Member Author

I'll look into make.bat tomorrow :(

@jsignell
Copy link
Copy Markdown
Member Author

I tried to copy from xarray or numpy and neither of them have a make.bat. Now I'm wondering if we need it

@mrocklin
Copy link
Copy Markdown
Member

This is great. Two comments, neither of which are likely relevant:

  1. I tried building this locally with make html and found that links from the api index page didn't exist. Probably an issue on my end, but I thought I'd bring it up
  2. Probably we'll have a lot of broken links out there. I'm totally fine just accepting this and moving on, unless there is a very easy solution
  3. After this works, any interest in doing this for arrays and bags and futures as well?

@jrbourbeau
Copy link
Copy Markdown
Member

I'm not sure about that one, but I am tempted to take a merge and see approach

+1

I tried building this locally with make html and found that links from the api index page didn't exist

Hmm, I'm not able to reproduce locally. You might try make clean html

@jsignell
Copy link
Copy Markdown
Member Author

  1. I tried building this locally with make html and found that links from the api index page didn't exist. Probably an issue on my end, but I thought I'd bring it up

I'm wondering if you have an old sphinx version. According to the docs, this setting changed to true by default in 4.0 https://www.sphinx-doc.org/en/master/usage/extensions/autosummary.html#confval-autosummary_generate

3. After this works, any interest in doing this for arrays and bags and futures as well?

Yep that's the plan!

@jrbourbeau
Copy link
Copy Markdown
Member

Merged main to resolve the unrelated CI issues

@jsignell
Copy link
Copy Markdown
Member Author

Merging!

@jsignell jsignell merged commit bfecde1 into dask:main Jul 19, 2021
@jsignell jsignell deleted the dataframe-api branch July 19, 2021 13:33
@gerrymanoim
Copy link
Copy Markdown
Contributor

gerrymanoim commented Jul 19, 2021

This is amazing - thank you very much!

@jsignell
Copy link
Copy Markdown
Member Author

I took a look at cudf docs to see if external links are working( for instance dask.dataframe.reduction on cudf docs). It looks like no - but I'm not sure if they will just start working once the docs page over there is rebuilt. @quasiben is it easy to trigger a build?

@jsignell
Copy link
Copy Markdown
Member Author

nvmd Ben, I see that it's hardcoded over there. I'll open a PR.

rapids-bot bot pushed a commit to rapidsai/cudf that referenced this pull request Jul 19, 2021
@jrbourbeau
Copy link
Copy Markdown
Member

This is great, thanks @jsignell!

Probably we'll have a lot of broken links out there. I'm totally fine just accepting this and moving on, unless there is a very easy solution

Looking at pandas' docs, it seems they were able to handle API redirect in a somewhat straightforward way using sphinx html_additional_pages

https://github.com/pandas-dev/pandas/blob/059c8bac51e47d6eaaa3e36d6a293a22312925e6/doc/source/conf.py#L332-L335

https://github.com/pandas-dev/pandas/blob/059c8bac51e47d6eaaa3e36d6a293a22312925e6/doc/_templates/api_redirect.html

@jsignell
Copy link
Copy Markdown
Member Author

Yeah our case is slightly different because in our case we would want to redirect a link where all the information is stored in the anchor. So it'd be going from https://docs.dask.org/en/latest/array-api.html#dask.array.add to https://docs.dask.org/en/latest/generated/dask.array.add.html for instance. I am leaning towards not worrying too much about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Request: Split up the Dataframe API documentation

4 participants