Move documentation source to master #239

Closed
billsacks opened this issue Jan 30, 2018 · 25 comments · Fixed by #954

@billsacks
Member

Currently, both the source and build of the documentation reside on the gh-pages branch. We should move the source to master, adopting a workflow like https://github.com/esmci/cime/wiki/Working-with-Sphinx-and-reStructuredText

@billsacks billsacks added the type: documentation additions or edits to user-facing documentation label Jan 30, 2018
@billsacks
Member Author

Proposed workflow, copied from a post that I made here two days ago: https://github.com/orgs/ESCOMP/teams/ctsm-write

@bertinia and I just had a long discussion to try to come up with a plan for hosting the CTSM documentation in a way that would allow us to host separate frozen versions of the documentation (e.g., corresponding to the CLM5.0 release) alongside the latest dev version. Please let us know if you have any thoughts on this.

Summary of the plan: Documentation source (rst files) will live in the main ctsm repo. Built versions of the documentation will live in a separate repository, named ctsm-docs, and so will appear at escomp.github.io/ctsm-docs. The top-level landing page will give you a list of available versions ('Development', 'CLM5.0 release', etc.) with links. Then the built documentation for individual releases will be stored in subdirectories of the ctsm-docs repo. So, for example, we'll have escomp.github.io/ctsm-docs/development/ and escomp.github.io/ctsm-docs/release-clm5.0/.

More details, discussion and rationale

@dlawrenncar and @olyson expressed a preference for having the documentation hosted in the same place as the code, which argues for having it hosted on a github.io site rather than (e.g.) readthedocs. The problem this raised was how to keep multiple versions of the documentation.

@bertinia and I considered https://robpol86.github.io/sphinxcontrib-versioning/v2.2.1/index.html . This uses the same general idea of having a subdirectory for each version of the documentation, but adds some automation to this and gives a nice list of versions in the sidebar of the html page. However, we ran into some problems with this tool, and it seems like this project has stopped being maintained, so we gave up on this, at least for now. (We could always adopt this later if it starts being maintained again.)

The rationale for having a separate repository for the documentation builds is: If this were in the main repo, people would get these builds whenever they clone the repo. Even a single build doubles or triples the size of the repository. As we start supporting more side-by-side builds, the issue of repository bloat will grow. Since most people won't want or need these doc builds, I thought it would be better to isolate these builds to their own repository rather than having them on an orphan branch in the main repository.

The workflow for building the documentation and putting it in the right place would look something like this. Note that having two separate repos adds little, if any, complexity to this workflow, and neither does maintaining multiple versions of the documentation: mainly, you need to be thoughtful about which subdirectory you copy the built files to.

This assumes you have two clones in your home directory: real ctsm in ~/ctsm, and ctsm docs in ~/ctsm-docs.

cd ~/ctsm
git checkout release-clm5.0
cd doc
# change conf.py to have the correct "version"/"release"
make html
# The following gives an example for generating docs for release-clm5.0
# Often we'll be generating docs for the latest dev version, in which case you would use (e.g.) "development" in place of "release-clm5.0"
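# Make sure the target directory exists (needed the first time a version is built)
mkdir -p ~/ctsm-docs/release-clm5.0
rm -rf ~/ctsm-docs/release-clm5.0/*
cp -R build/html/* ~/ctsm-docs/release-clm5.0/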
cd ~/ctsm-docs
git add release-clm5.0
# Edit index.html to add a link to release-clm5.0 if there isn't one already, and git add it
git commit -m "Build docs for release-clm5.0"
git push

We don't plan to implement all of this immediately, but will probably do so in the lead-up to the CESM2 release.

@billsacks
Member Author

billsacks commented Feb 1, 2018

I realized that it could work better to change the documentation makefile to support an out-of-source build. This way, the build would happen directly in the desired location, rather than building in the documentation source area and then requiring you to copy / move the files later. Among other things, this would:

  • better support incremental rebuilds

  • prevent filling up your source sandbox with a bunch of files that need to be removed

In order to do that, the makefile would rely on two environment variables:

  • Path to the root of the ctsm-docs directory

  • Name of the subdirectory in which to build ("development", "release-clm5.0", etc.). We could make this optional, with a rule like the following. If it isn't provided, get the current git branch name (but change "master" to "development", or whatever we call the directory for the dev version of the docs). If there is a subdirectory with a name matching that branch name, use it as the subdirectory for the build. Otherwise, abort and force the user either to set this variable or to create the subdirectory if it doesn't exist (e.g., if this is the first time building the release-clm5.0 documentation).
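
The fallback rule above can be sketched as a small shell function. This is only an illustration of the proposed logic, not the eventual makefile implementation; the function name `resolve_docs_subdir` and its argument order are hypothetical.

```sh
# Sketch of the proposed rule for choosing the docs build subdirectory.
# resolve_docs_subdir and its arguments are hypothetical names.
#   $1: path to the root of the ctsm-docs clone
#   $2: current git branch name (empty if it cannot be determined)
#   $3: explicitly requested subdirectory (optional)
resolve_docs_subdir() {
    docs_root=$1
    branch=$2
    explicit=${3:-}

    # An explicitly provided subdirectory always wins.
    if [ -n "$explicit" ]; then
        printf '%s\n' "$explicit"
        return 0
    fi

    if [ -z "$branch" ]; then
        echo "Cannot determine the git branch; set the subdirectory explicitly" >&2
        return 1
    fi

    # The development branch builds into the dev-docs directory.
    if [ "$branch" = "master" ]; then
        subdir=development
    else
        subdir=$branch
    fi

    # Only build automatically into a subdirectory that already exists.
    if [ ! -d "$docs_root/$subdir" ]; then
        echo "No directory $docs_root/$subdir; create it or set the subdirectory explicitly" >&2
        return 1
    fi
    printf '%s\n' "$subdir"
}
```

A caller could pass in the branch name reported by git and then point the Sphinx build's output directory at the returned subdirectory, so that the html is written directly into the docs repository.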

@billsacks
Member Author

Here's information on setting up a Travis-CI-based automatic deployment of the built documentation:

https://docs.travis-ci.com/user/deployment/pages/

@billsacks
Member Author

billsacks commented Apr 27, 2018

I went through the documentation repo looking for large images that would add bloat to the main repository. The one that stands out is tech_note/Plant_Hydraulics/flow.png, which is about 2 MB and accounts for about half the size of the whole documentation source. Ideally, I'd like for this figure to be made significantly smaller before we import the documentation into the main repo. Then I'd like to rewrite history to remove the original when bringing the documentation history into the main repo, so that this large image never exists in the history of the repo (see https://help.github.com/articles/removing-sensitive-data-from-a-repository/).

Given the nature of this figure, it seems like it could be better to store it in vector format: I'm guessing that would be significantly smaller, and I think it would allow someone to edit the image later. From a test, it seems like the sphinx-generated html can read svg files, but sphinx-generated pdfs (via latexpdf) cannot. It looks like https://sites.google.com/site/nickfolse/home/sphinx-latexpdf-output-with-svg-images might provide a solution. If we can't find a solution, maybe there's a way to make this file take up less space with some other supported image format?

@billsacks
Member Author

@djk2120 and @olyson - see my comment above. Nothing needed at the moment, but I may ask for your help with this when it comes time to move the documentation source into this main repository.

@djk2120
Contributor

djk2120 commented Apr 27, 2018 via email

@jhamman
Contributor

jhamman commented Apr 27, 2018

@billsacks et al. Have you looked at readthedocs? This is more or less the de facto way people host versioned sphinx-like documentation that gets built automatically from a GitHub repo. My personal experience is that it is much easier to use than GitHub Pages. Happy to elaborate if you all are interested.

@billsacks
Member Author

Thanks, @djk2120

@jhamman - yes, we evaluated readthedocs. The problems we saw with that were (1) some people preferred that the documentation be hosted in the same place as the repo to keep everything in one place; (2) as I recall, readthedocs puts a limit on the length of time allotted to build the documentation; our documentation pushes up against that limit. (There may have been other points as well.)

@billsacks
Member Author

I have implemented a solution to determine the build directory automatically based on the branch name (as suggested in #239 (comment)). This is for the cism-wrapper repo (ESCOMP/CISM-wrapper@eb9e1d0), but similar logic could be used for ctsm:

# Possible ways to set the build directory:
# (1) Set BUILDDIR from the command line
# (2) Let BUILDDIR be determined as $(BUILDREPO)/cism-in-cesm/$(VERSION)
#     (2a) Set both BUILDREPO and VERSION directly on the command line
#     (2b) Set BUILDREPO on the command line, let VERSION be auto-determined,
#          based on the current git branch name.

ifndef BUILDDIR
# If version isn't set on the command-line, then we set it here, by
# getting the current git branch name.
VERSION ?= $(shell git symbolic-ref --short -q HEAD)
ifeq ($(VERSION),)
$(error Cannot determine version based on git branch; set VERSION on the command line)
endif
BUILDDIR = $(BUILDREPO)/cism-in-cesm/$(VERSION)
$(info Using build directory: $(BUILDDIR))
endif

@billsacks
Member Author

I started going down the path of adding some more logic and error checking to this makefile-based solution (see ESCOMP/CISM-wrapper@b290d0f), but it started to feel like the Makefile-based logic was getting out of control. I think a better solution to support a versioned doc build is to use this Python-based wrapper to the build that I just put together (and, admittedly, spent too much time on...): https://github.com/ESMCI/versioned-doc-builder

@billsacks
Member Author

@djk2120 and @olyson - I just wanted to check back in about the comments in this thread from late April, about looking into changing the format of the large image (flow.png) to decrease its size in the repository. I'd like to move the documentation source into this repository fairly soon. It's not critical that we decrease this file size, but if it isn't too hard to do so, it would be great. This would be particularly helpful if there were side-benefits, like storing the image in a form that's easier for others to work with (which would presumably be the case for a vector rather than png format for this figure).

@billsacks
Member Author

(@djk2120 and @olyson to be clear: it's not worth your spending a lot of time on this. This is only worth addressing if it's fairly easy and has the side-benefit of making it easier for others to work with this file.)

@billsacks
Member Author

If storing as svg or some other vector format would be problematic, the easiest solution could just be to reduce the size / resolution of this image: From a very quick test, I can get decent crispness with about 1/10 the file size (by reducing the resolution to 200 pixels/inch rather than 1000; I haven't tried building the tech note with this change to confirm how it shows up in the generated documentation).

@djk2120
Contributor

djk2120 commented Jul 5, 2018 via email

@billsacks
Member Author

@djk2120 I don't see the attachment. I don't think it worked to attach it in your reply to the github comment. You can either attach it via the github issue interface, or email it to me directly.

@billsacks
Member Author

@djk2120 sent me a version as pdf (though I'm working with him to get a version that matches what's currently in the documentation). I got this to work in the documentation build as follows. I'm not committing these changes, but this is the procedure we can follow when we're ready to bring the documentation source to the master branch of the main ctsm repo. (Then we can filter history to remove the large png file.)

  1. I don't think we can use pdf directly in html. I converted this to svg with pdf2svg (installed with: brew install pdf2svg). (I tried converting with Inkscape, following the directions here: https://en.wikipedia.org/wiki/Wikipedia:Graphics_Lab/Resources/PDF_conversion_to_SVG . This mostly worked, but at least one Greek character was dropped.)

  2. I then put both the pdf and svg versions in doc/source/tech_note/Plant_Hydraulics/ and made this change:

diff --git a/doc/source/tech_note/Plant_Hydraulics/CLM50_Tech_Note_Plant_Hydraulics.rst b/doc/source/tech_note/Plant_Hydraulics/CLM50_Tech_Note_Plant_Hydraulics.rst
index b2afa066..636c18ad 100644
--- a/doc/source/tech_note/Plant_Hydraulics/CLM50_Tech_Note_Plant_Hydraulics.rst
+++ b/doc/source/tech_note/Plant_Hydraulics/CLM50_Tech_Note_Plant_Hydraulics.rst
@@ -714,6 +714,6 @@ The outermost level of iteration works towards convergence of leaf temperature,
 .. _Figure PHS Flow Diagram:
-.. figure:: flow.png
+.. figure:: schem3.*
  Flow diagram of leaf flux calculations

(It seems like we need svg for building html, and pdf for building a pdf. We could just use a single version and convert on the fly, as in https://sites.google.com/site/nickfolse/home/sphinx-latexpdf-output-with-svg-images, but that would add more requirements to the build.)

I tested the html build and it looks fine. I wasn't able to get the pdf build to work, for seemingly unrelated reasons.
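
As a rough illustration of step 1, the conversion can be scripted so that the svg always sits next to the pdf under the same base name. That pairing is what lets the `schem3.*` wildcard in the figure directive work: Sphinx collects the files matching the pattern and each builder chooses the format it can use. `pdf2svg` takes the input pdf followed by the output svg; the helper names `svg_path_for` and `convert_figure` are hypothetical.

```sh
# Derive the svg file name that pairs with a given pdf (same base name).
svg_path_for() {
    printf '%s\n' "${1%.pdf}.svg"
}

# Convert one pdf figure to an svg alongside it.
# Requires pdf2svg on PATH (e.g. installed with: brew install pdf2svg).
convert_figure() {
    pdf=$1
    svg=$(svg_path_for "$pdf")
    pdf2svg "$pdf" "$svg" && printf 'wrote %s\n' "$svg"
}
```

For example, `convert_figure doc/source/tech_note/Plant_Hydraulics/schem3.pdf` would write `schem3.svg` into the same directory.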

@billsacks
Member Author

@djk2120 sent me a new version as svg, so we can use that directly (phs_iteration_schematic.svg.zip). If we want to store a pdf version as well (for generating pdf documentation), I was able to get a good conversion to pdf using Inkscape.

@olyson
Contributor

olyson commented Jul 9, 2018

Bill, do you have what you need for this image now?

@billsacks
Member Author

Yes, I do.

@billsacks
Member Author

Because I'm worried about long-term, hard-to-reverse repository bloat from images, I'm planning to use the suggestion of @barlage (which others at yesterday's CTSM software meeting, as well as @olyson , seem happy with): We'll move the tech note text into the main repository, but have a separate repository that holds images. I plan to have this images repository have a directory structure that parallels the structure of the text. This images repository will be pulled in as an optional external (via manage_externals) (setting required = False means that it won't be pulled down for typical checkouts). I'll then need to change paths to images (i.e. in the .rst files) to point to the new location.
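
For illustration, an optional external of this kind might be declared roughly as follows. The section name, local path, repository URL, and branch are placeholders rather than decisions made in this thread, and the helper name `add_images_external` is hypothetical. The keys follow the manage_externals configuration format, where `required = False` marks an external as optional.

```sh
# Append a hypothetical optional external for the images repository
# to an externals description file.
# Section name, local_path, repo_url, and branch are all placeholders.
#   $1: path to the externals description file
add_images_external() {
    cfg=$1
    cat >> "$cfg" <<'EOF'

[doc-images]
local_path = doc/images
protocol = git
repo_url = https://github.com/EXAMPLE-ORG/example-doc-images
branch = master
required = False
EOF
}
```

With an entry like this, a typical checkout would skip the images, and someone building the documentation would request optional externals explicitly (see the `checkout_externals` help output for the relevant option).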

@ekluzek
Contributor

ekluzek commented Jul 27, 2018

@billsacks I think this makes sense. My one question: will this external be just for tech note images, or should we put UG images there as well? User's Guide images are much less of a problem, so I can see this going either way. Currently, we haven't installed any of the UG images, so that is something that will need to be done.

@billsacks
Member Author

@ekluzek good question. Do you have feelings on this? If there are just a few small images (~ 100 kb or less) then keeping them in the repo is no problem. On the other hand, it might help to be consistent in our treatment of images, not distinguishing between tech note and user's guide in this respect. I'd probably cast a weak vote for keeping this consistent and putting all images in a separate repository, but I don't feel strongly about it.

@billsacks billsacks self-assigned this Mar 14, 2020
@billsacks
Member Author

@ekluzek as we discussed recently, I'm starting to work on this. I have a couple of comments / questions for you:

  • I have introduced a new repository to hold all images; to be consistent, I am putting UG as well as tech note images there: https://github.com/ESCOMP/CTSM-doc-images

  • I am going to bring the documentation source as well as the Makefiles into the main ctsm repo (along with all history of the source, excluding history of the build and images)

  • Question: where should this go? My initial thought is to put this directly in the doc directory. So I would add doc/Makefile, doc/Makefile.tech_note, doc/Makefile.users_guide and the whole doc/source/ directory. This would maintain the same paths as were in the original ctsm-docs repository. Another option would be to introduce a subdirectory of the doc directory that holds all of this; the main point of this would be to avoid having these three Makefiles in the top-level doc directory; so we'd have doc/something/Makefile, doc/something/source/, etc. I'm personally inclined just to have these appear in the top-level doc/ directory, but I don't feel strongly; do you have any feelings on this? And if you prefer having a separate subdirectory, do you have any ideas for its name?

  • Question: I noticed there is still a UsersGuide directory under doc. Does this still serve a purpose, or should it be removed?

@billsacks billsacks added this to In progress - master in Upcoming tags Mar 22, 2020
billsacks added a commit to billsacks/CTSM-doc-images that referenced this issue Mar 26, 2020
This was provided by Daniel Kennedy; see
ESCOMP/CTSM#239 for details.

This is in place of the flow.png file with this history:

- Plant_Hydraulics/flow.png
  - djk2120 <djk2120@columbia.edu>, Jun 5 2017
@billsacks
Member Author

For the record: I just checked with @ekluzek about my above questions. For where this should go: he's fine with this all being directly under doc/ (no need for a separate subdirectory). For the old UsersGuide: this can be removed.

@billsacks
Member Author

billsacks commented Mar 27, 2020

Copying some notes from an email to here for posterity:

Subject: I think git LFS will work well for the documentation images

Thank you to Negin for suggesting that we try git LFS (Large File Storage) for images in the tech note and user's guide. I have been playing with it this afternoon, and I think it will work well. For anyone who wants to work with these image files (to build the documentation, modify images in the documentation, etc.), it will require a one-time installation per machine, plus a simple once-per-user-per-machine setup (which I'll document on the wiki). But once that is done, the workflow seems quite simple: working with these image files is not really any different from working with any other files in git; I really like that. I could imagine there may be some gotchas, but for the simple additions and changes I was testing out, everything seemed to work smoothly.

By default, git LFS will download the latest versions of the images whenever you clone or update. But I found a way to configure the repository so that it doesn't do this automatically, and instead the images are only pulled down if you explicitly ask for them. I think I can make this part of the documentation build process, and will document the command for pulling down the images on the wiki in case someone wants to obtain them manually. I feel like this setup is best (i.e., not obtaining the images with every clone), but let me know if you think otherwise.
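
The mechanism used is not stated here. One way to get this behaviour, shown purely as an assumption, is Git LFS's `lfs.fetchexclude` setting in a tracked `.lfsconfig` file, combined with an explicit `git lfs pull` that overrides the exclusion when the images are wanted. The function names below are hypothetical.

```sh
# Write a .lfsconfig that excludes every LFS file from automatic download.
# (Assumed mechanism; not confirmed by this thread.)
#   $1: path to the repository root
write_lfsconfig() {
    cat > "$1/.lfsconfig" <<'EOF'
[lfs]
    fetchexclude = *
EOF
}

# Explicitly download the LFS files, overriding the exclusion for this call only.
# Requires git-lfs to be installed and set up (git lfs install).
#   $1: path to the repository root
fetch_doc_images() {
    (cd "$1" && git lfs pull --exclude="")
}
```

Because `.lfsconfig` is committed with the repository, every clone would inherit the opt-in behaviour without per-user configuration, and the documentation build could call something like `fetch_doc_images` before running Sphinx.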

The actual image files are stored somewhere on GitHub's servers. There are limits on how much you can store for free (1 GB, counting all versions of all of your LFS files across all repositories in the ESCOMP organization) and how much you can download for free (1 GB / month... we would probably hit this very quickly if not for the change I mentioned above to avoid automatically downloading the files). However, I think we'll be okay: while I'm not 100% sure, it looks like our educational discount also lets us upgrade these storage numbers for free, and even if we have to pay, upgrading is cheap ($5/month to change both limits from 1 GB to 50 GB).

Note that this means that there is no need for https://github.com/ESCOMP/CTSM-doc-images. I will delete that repository.

Upcoming tags automation moved this from In progress - master to Master Tags/Issues Done Apr 7, 2020
ekluzek added a commit to ekluzek/CTSM that referenced this issue Apr 7, 2020
Bring documentation source to master

1. Bring documentation source to master: Pulls in the source from
   https://github.com/escomp/ctsm-docs. This is important so that
   documentation can remain in sync with changes in the model
   code. Images are stored here using git-lfs (Git Large File
   Storage). I also made some minor fixes to get the pdf build of the
   tech note working.

2. Use a different documentation theme that supports a version dropdown
   menu, and add the code needed to support this versioning on the
   documentation web pages. At a high level, the way the versioned
   documentation works is to have separate subdirectories in the
   gh-pages branch of the ctsm-docs repository for each version of the
   documentation we want to support. There is then a bit of JavaScript
   code which uses a json file in the gh-pages branch to determine which
   versions exist and how these should be named in the dropdown
   menu. Most of these changes were borrowed from ESMCI/cime#3439, which
   in turn borrowed from ESCOMP/CISM-wrapper#23, which in turn was a
   slight modification of an implementation provided by @mnlevy1981 for
   the MARBL documentation, which in turn borrowed from an
   implementation put together by Unidata (credit where credit is due).

   I am not aware of built-in support for a version pull-down in the
   standard sphinx themes (though the last time I looked was in
   Fall 2018, so there may be something available now). However,
   support for a version dropdown exists in an open PR in the sphinx
   readthedocs theme repository: readthedocs/sphinx_rtd_theme#438. I
   have pushed this branch to a new repository in ESMCI
   (https://github.com/ESMCI/sphinx_rtd_theme) to facilitate long-term
   maintenance of this branch in case it disappears from the official
   sphinx_rtd_theme repository. I have also cherry-picked a commit onto
   that branch, which is needed to fix search functionality in sphinx1.8
   (from readthedocs/sphinx_rtd_theme#672) (which is another reason for
   maintaining our own copy of this branch). The branch in this
   repository is now named version-dropdown-with-fixes (branching off of
   the version-dropdown branch in the sphinx_rtd_theme repository). In
   the long-term, I am a little concerned about using this theme that
   isn't showing any signs of being merged to the main branch of the
   readthedocs theme, but this has been working for us in other projects
   for the last 2 years, so I feel this is a reasonable approach in the
   short-medium term.

The new process for building the documentation is given here:
https://github.com/ESCOMP/CTSM/wiki/Directions-for-editing-CLM-documentation-on-github-and-sphinx

Resolves ESCOMP#239
ekluzek added a commit to ekluzek/CTSM that referenced this issue Apr 20, 2020
Bring documentation source to master
jtruesdal added a commit to jtruesdal/ctsm that referenced this issue May 1, 2020
Bring documentation source to master
samsrabin pushed a commit to samsrabin/CTSM that referenced this issue May 3, 2024
update esmf bld to use official esmf action