Use dvc to manage docs images #6267

Merged
merged 18 commits into from
Jan 27, 2022

Conversation

maxrjones
Member

Description of proposed changes

Originally we discussed using dvc to track only the test images, in part because there are fewer docs images and in theory they change less often. However, several docs images are already outdated, and more will likely become outdated as remote datasets are updated. This PR configures the docs and tests workflows to use dvc rather than git to manage the baseline images in doc/examples and doc/scripts, so that they can be updated without bloating the git history.
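
As a rough illustration of what updating a dvc-managed image directory could look like, here is a minimal sketch. It assumes each image directory is tracked by dvc as a whole (so a `<dir>.dvc` file sits next to it) and that a dvc remote is already configured. The helper names `run` and `update_docs_images` and the `DRY_RUN` variable are hypothetical and not part of this PR.

```shell
#!/bin/sh
# Sketch: refresh a dvc-tracked image directory after regenerating its images.
# Assumption: the directory is tracked as a whole, e.g. doc/examples/images.dvc.

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

update_docs_images() {
    # $1: image directory tracked by dvc, given without a trailing slash
    run dvc add "$1" &&          # re-hash the directory and update its .dvc file
    run git add "$1.dvc" &&      # git tracks only the small pointer file
    run dvc push                 # upload the new image data to the dvc remote
}
```

Calling `DRY_RUN=1 update_docs_images doc/examples/images` prints the three commands without running them, which is a cheap way to review the sequence before touching the remote.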

One current limitation is that the CMake settings do not include the new doc/examples/images and doc/scripts/images directories in the release tarballs, so people cannot build the HTML docs from the tarballs. This relates to #2681, because the release tarballs also include docs_release, which contains optimized versions of those images. So do we need both? Or could we set a CMake target to generate the .PS files from the scripts, as a compromise that would allow some users to build the HTML docs from source while shrinking the release tarball size for everyone?

Part of #5724

@maxrjones maxrjones changed the title Use dvc to manage docs images WIP: Use dvc to manage docs images Jan 25, 2022
@maxrjones
Member Author

@seisman, the Vercel deployment fails on the dvc pull step due to an apparent conflict between Python versions. Using a virtual environment to manage the Python dependencies in ci/vercel-docs.sh might work:

python3 -m pip install --user --upgrade pip
python3 -m venv env
source env/bin/activate
python3 -m pip install docutils==0.17 sphinx dvc importlib-resources

Since you have more experience with the Vercel setup, do you know of any reason to avoid venv and the bootstrapped user-level pip install in this case?
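
For reference, the proposal above could be wrapped in a small function that also confirms the environment was activated before anything is installed. This is only a sketch: the helper names `in_venv` and `setup_docs_env` are hypothetical, the layout of ci/vercel-docs.sh is assumed, and this thread does not say whether the merged script ended up using a venv.

```shell
#!/bin/sh
# Sketch of the venv idea for a script like ci/vercel-docs.sh (layout assumed).

# in_venv: succeed when an activated virtual environment is detected.
# The venv activate script exports VIRTUAL_ENV, so checking it is enough here.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

setup_docs_env() {
    # $1: directory for the virtual environment (defaults to "env")
    venv_dir="${1:-env}"
    python3 -m venv "$venv_dir" || return 1
    . "$venv_dir/bin/activate"
    # Stop early if activation did not take effect
    in_venv || return 1
    # Same packages as listed above, now installed inside the venv
    python3 -m pip install docutils==0.17 sphinx dvc importlib-resources
}
```

Installing inside the venv keeps the packages tied to one interpreter, which is the point of the suggestion when two Python versions appear to conflict.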

@seisman
Member

seisman commented Jan 27, 2022

The raw error message is:

+ dvc pull
--
07:28:19.059 | ERROR: unexpected error - No module named '_sqlite3'
07:28:19.172 | Traceback (most recent call last):
07:28:19.172 | File "/usr/local/lib/python3.9/site-packages/dvc/main.py", line 54, in main
07:28:19.172 | cmd = args.func(args)
07:28:19.172 | File "/usr/local/lib/python3.9/site-packages/dvc/command/base.py", line 39, in __init__
07:28:19.172 | self.repo = Repo(uninitialized=self.UNINITIALIZED)
07:28:19.173 | File "/usr/local/lib/python3.9/site-packages/dvc/repo/__init__.py", line 216, in __init__
07:28:19.173 | self.state = State(self.root_dir, state_db_dir, self.dvcignore)
07:28:19.173 | File "/usr/local/lib/python3.9/site-packages/dvc/state.py", line 49, in __init__
07:28:19.173 | from diskcache import Cache
07:28:19.173 | File "/usr/local/lib/python3.9/site-packages/diskcache/__init__.py", line 8, in <module>
07:28:19.173 | from .core import (
07:28:19.173 | File "/usr/local/lib/python3.9/site-packages/diskcache/core.py", line 14, in <module>
07:28:19.173 | import sqlite3
07:28:19.173 | File "/usr/local/lib/python3.9/sqlite3/__init__.py", line 57, in <module>
07:28:19.173 | from sqlite3.dbapi2 import *
07:28:19.173 | File "/usr/local/lib/python3.9/sqlite3/dbapi2.py", line 27, in <module>
07:28:19.173 | from _sqlite3 import *
07:28:19.173 | ModuleNotFoundError: No module named '_sqlite3'

Perhaps this solution (sloria/TextBlob#173 (comment)) works?
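
The traceback above ends with the standard-library sqlite3 package failing to import its C extension, _sqlite3, so the interpreter itself lacks sqlite support. A minimal preflight sketch that surfaces this before dvc pull runs might look like the following; `check_sqlite3` is a hypothetical helper name, and python3 is assumed to be the interpreter that runs dvc.

```shell
#!/bin/sh
# Sketch: check that an interpreter can import sqlite3 before running dvc.

check_sqlite3() {
    # $1: interpreter to test (defaults to python3)
    "${1:-python3}" -c 'import sqlite3' 2>/dev/null
}

if check_sqlite3 python3; then
    echo "sqlite3 is importable"
else
    echo "sqlite3 is not importable; dvc pull would fail as in the log above" >&2
fi
```

The check only reports the problem. Fixing it still requires a Python build that includes the sqlite3 extension.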

@maxrjones
Member Author

Thanks @seisman, the Vercel docs are working now with dvc.

@maxrjones maxrjones changed the title WIP: Use dvc to manage docs images Use dvc to manage docs images Jan 27, 2022
@maxrjones
Member Author

I added dvc pull to the instructions for building the docs for now. I think we can tackle the packaging issues separately from this PR.
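
A minimal sketch of that extra step, for anyone building the docs locally: it assumes dvc is installed and is run from inside the repository checkout. The helper names `have_cmd` and `pull_docs_images` are hypothetical, and the docs build command itself is left out because it is not stated in this thread.

```shell
#!/bin/sh
# Sketch: fetch the dvc-managed images before building the docs.

# have_cmd: succeed if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

pull_docs_images() {
    if ! have_cmd dvc; then
        echo "dvc not found; install it before building the docs" >&2
        return 1
    fi
    # Download the files referenced by the repository's .dvc files
    dvc pull
}
```

Run `pull_docs_images` from the repository root, then continue with the usual docs build.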

@maxrjones maxrjones requested a review from a team January 27, 2022 21:34
@maxrjones maxrjones merged commit 9064276 into master Jan 27, 2022
@maxrjones maxrjones deleted the docs-dvc branch January 27, 2022 22:09
@maxrjones maxrjones added the maintenance Boring but important stuff for the core devs label Jan 28, 2022