Conversation
After some discussion at the issue review, the only docs that will be checked in are those which are not autogenerated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting to see actual docstring changes, as if they are rendered with markdown instead of rst there will need to be a bunch of changes. You mentioned the Parameters
one, but there are a bunch more.
docs/README.md
Outdated
@@ -0,0 +1,220 @@ | |||
<p align="center"><img width="40%" src="doc/static/allennlp-logo-dark.png" /></p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is duplicating the file at the top level? Was that removed? Oh, was this an autogenerated file that you didn't intend to be included at this point? Same with the other top-level markdown documents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I can remove them. They are copied in.
Yes, there are a lot more changes - however, i'm only going to do the |
|
||
# doc theme | ||
sphinx_rtd_theme | ||
pydoc-markdown==2.0.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these need to be pinned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These aren't user requirements, so it's less important that they are unpinned. We should try using a bot to upgrade some of these dependencies.
@@ -20,11 +20,6 @@ This tutorial will show both you how to create `DatasetReader`s | |||
that allow this lazy behavior, and how to handle this laziness | |||
when training a model. | |||
|
|||
If you're not interested in the details, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why you're changing this in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I've moved 90% of the tutorials to live in the docs
, and this one was unnecessary. It made sense to delete it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -20,11 +20,6 @@ This tutorial will show both you how to create `DatasetReader`s | |||
that allow this lazy behavior, and how to handle this laziness |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this file, but it's the closest file I can comment on: why have docs/api/.gitkeep
? You want the directory to be around in some script that generates the docs? Why not just create that directory in the script?
mkdocs.yml
Outdated
repo_url: https://github.com/allenai/allennlp | ||
edit_uri: edit/master/docs/ | ||
#google_analytics: | ||
# - 'UA-133183413-1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a safe id to have in source control?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's served publicly from the pages that register GA. For example, it is on view-source:https://demo.allennlp.org/reading-comprehension (and we just copy and paste code from Google).
from ruamel.yaml import YAML | ||
|
||
|
||
exclude_files = [".DS_Store", "__init__.py", "__init__.pyc", "README.md", "version.py", "run.py"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there documentation in any __init__.py
files that we want to keep? If so, where should it go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why exclude run.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't have docs in __init__
in my opinion, to make discoverability easier. run.py
is excluded because no user will actually use this, as it is installed as an entrypoint in setup.py
, meaning the entrypoint is allennlp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few cases where we have non-trivial logic in an __init__
file. It'd be nice if those showed up in the docs somehow. Not really sure how to accomplish this. I'm fine with moving whatever docs are in an existing (trivial) __init__
file to a better location.
On run.py
, it might be nice to include it with some better documentation around "if you want to write your own entrypoint to allennlp, instead of using ours, you might take this script as a starting place."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there other cases apart from the PytorchXtoXWrappers
in modules/etc
? Those probably shouldn't be in the docs.
We don't want people to be writing "entrypoints" for allennlp (we want them to just write scripts, using allennlp as a library) and anyway, run.py
is not the place to start for someone writing a script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked the others I was thinking of (initializers, learning rate schedulers, regularizers), and the logic I was thinking of is not actually inside an __init__
file, so that was a false alarm.
By "entrypoint" I just meant "script", but I just checked what's in run.py
, and you're right, that's not a good example, and it doesn't need to be in the docs.
scripts/build_docs.py
Outdated
def render_docs(relative_src_path: str, src_file: str, to_file: str, modifier="++") -> None: | ||
""" | ||
Shells out to pydocmd, which creates a .md file from the docstrings of python functions and classes in | ||
the file we specify. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to include more description of the modifier parameter here. Say what the default does, and point to documentation for what the other options are and what they do (e.g., "See usage information here: https://pypi.org/project/pydoc-markdown/").
scripts/build_docs.py
Outdated
exclude_files = [".DS_Store", "__init__.py", "__init__.pyc", "README.md", "version.py", "run.py"] | ||
|
||
|
||
def render_docs(relative_src_path: str, src_file: str, to_file: str, modifier="++") -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method name makes me think this should operate on more than one file. You have build_docs_for_file
below, but just render_docs
here. Maybe render_file
instead?
scripts/build_docs.py
Outdated
""" | ||
Build docs for an individual python file. If `check` is passed, we don't generate the docs themselves, | ||
but instead we just generate what the corresponding mkdocs YAML would be, so we can compare it to the | ||
current version and see if it has changed (which would mean the docs need re-building). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me what is in the YAML - if I update a docstring without changing the module structure at all, would this trigger a rebuild?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it was a halfway house solution to retain some checks - but I'll remove this as we don't need it.
scripts/build_docs.py
Outdated
if args.check: | ||
if nav_obj[docs_key] != nav_entries: | ||
raise ValueError( | ||
"Changes are required to the API documentation. Run `./scripts/build_docs.py` and commit the changes." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These instructions aren't true anymore, right?
scripts/build_docs.py
Outdated
"Changes are required to the API documentation. Run `./scripts/build_docs.py` and commit the changes." | ||
) | ||
else: | ||
print("No changes required!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, looking at this whole check
logic, I don't think it does what we want. We'd want to update minor docstring fixes, and those wouldn't be caught. We should probably just remove the whole check
stuff, and always rebuild on master commits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also potentially run render_file
on all files that were touched in a specific PR, to make sure there aren't rendering errors introduced by that PR. But that doesn't need to be done now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Love it.
We should note that we're changing from updating the docs per release to updating the docs on an on-going basis. I think this is important, and if we want to additionally add per-release docs (e.g. http://docs.allennlp.org/v0.8.0) we can do that additionally. Presently I don't know which version our docs correspond to exactly...
@schmmd Just FYI, this change doesn't really necessitate changing how often we release docs. Broadly, i'm in favour of Spacy's way of maintaining a single source for docs, and just noting new features which will only work with certain versions. I think we can solidify that once 1) We release 1.0 and 2) we deploy docs somewhere other than Github Pages. |
This PR brings a new way of building docs for allennlp:
Pros
Cons
The new docs use:
Build Process
The docs are built using a new
scripts/build_docs.py
script, which crawls through theallennlp/allennlp
directory, generating documentation in thedocs/api/relative/path/to/file.md
directory. Additionally, this script builds a YAML representation of the new docs, which is inserted into themkdocs.yml
file at the root of the repository, which describes how to build the mkdocs site. The site can then be built usingmkdocs build
.Instead of doing this on every commit, I have added a
--check
flag to this script, which allows us to check if any new files which need documenting have been added. If this check failed in CI, someone would just have to runpython scripts/build_docs.py
and commit the changes. This is not strictly necessary (as we can just run it when we want to deploy the docs), but helps keep the repository up to date with the docs that we have deployed. I could remove this check entirely if people think it's not necessary. It's also feasible that we could not check in thedocs/api
section, as it is a lot of files/LOC. I'm easy either way, I included it mostly for clarity of what's happening and what the auto-generated files look like.Changes
docs/tutorials
directory, so they can be included in the docs. Resolves part of Update tutorials for the new split repos and v1.0 #3438After reviewing this PR, I need to make a few more changes. In particular, to make the new docs render parameters for classes and functions nicely, I need to change all occurrences of
to
I am waiting until this is reviewed to do this, because it will touch almost every file in the repository, so I want to make that change and merge immediately, so I don't have to resolve 1 trillion merge conflicts.