Docs build failures are difficult to identify #50330

Closed
mruberry opened this issue Jan 10, 2021 · 3 comments

@mruberry
Collaborator

mruberry commented Jan 10, 2021

Warnings in the docs build are now a build failure in CI. These failures, however, are difficult for developers to relate to their PRs. See, for example, #50046 (comment).

The warning:

Jan 05 15:57:22 docstring of torch.linalg.qr:29: WARNING: Explicit markup ends without a blank line; unexpected unindent.

appears in a lengthy printout that can include other warnings like these:

Jan 05 15:57:23 /opt/conda/lib/python3.6/site-packages/torch/jit/_trace.py:804: TracerWarning: Trace had nondeterministic nodes. Did you forget call .eval() on your model? Nodes:
Jan 05 15:57:23 	%13 : Float(4, strides=[1], requires_grad=0, device=cpu) = aten::rand(%8, %9, %10, %11, %12) # <doctest default[0]>:2:0
Jan 05 15:57:23 This may cause errors in trace checking. To disable trace checking, pass check_trace=False to torch.jit.trace()
Jan 05 15:57:23   _module_class,
Jan 05 15:57:23 /opt/conda/lib/python3.6/site-packages/torch/jit/_trace.py:804: TracerWarning: Output nr 1. of the traced function does not match the corresponding output of the Python function. Detailed error:
Jan 05 15:57:23 With rtol=1e-05 and atol=1e-05, found 4 element(s) (out of 12) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.9027273058891296 (0.904914915561676 vs. 0.0021876096725463867), which occurred at index (0, 3).
Jan 05 15:57:23   _module_class,
Jan 05 15:57:23 /opt/conda/lib/python3.6/site-packages/torch/jit/_trace.py:804: TracerWarning: Trace had nondeterministic nodes. Did you forget call .eval() on your model? Nodes:
Jan 05 15:57:23 	%14 : Float(1, 4, strides=[4, 1], requires_grad=0, device=cpu) = aten::rand(%9, %10, %11, %12, %13) # <doctest default[0]>:2:0
Jan 05 15:57:23 This may cause errors in trace checking. To disable trace checking, pass check_trace=False to torch.jit.trace()
Jan 05 15:57:23   _module_class,
Jan 05 15:57:23 /opt/conda/lib/python3.6/site-packages/torch/jit/_trace.py:804: TracerWarning: Output nr 1. of the traced function does not match the corresponding output of the Python function. Detailed error:
Jan 05 15:57:23 With rtol=1e-05 and atol=1e-05, found 4 element(s) (out of 8) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.7007782459259033 (0.7614061832427979 vs. 0.06062793731689453), which occurred at index (0, 3).
Jan 05 15:57:23   _module_class,
Jan 05 15:57:23 /opt/conda/lib/python3.6/site-packages/torch/jit/_recursive.py:197: UserWarning: 'mods' was found in ScriptModule constants,  but it is a non-constant submodule. Consider removing it.
Jan 05 15:57:23   " but it is a non-constant {}. Consider removing it.".format(name, hint))
Jan 05 15:57:24 Makefile:38: recipe for target 'doctest' failed

It would be nice to more clearly describe to developers why the doc build(s) failed on their PR.

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @jlin27 @mruberry @rgommers @mattip @ranman

@mruberry mruberry added high priority module: docs Related to our documentation, both in docs/ and docblocks module: testing Issues related to the torch.testing module (not tests) labels Jan 10, 2021
@mattip
Collaborator

mattip commented Jan 10, 2021

The easiest thing to do would be to improve the error message at the end of the build step, either in the Makefile or in the CircleCI script that calls make html, so that it states "if this step fails, look back in the log for WARNING: (in capital letters)". A more advanced solution would be to do something like:

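set -o pipefail
# Sphinx prints warnings to stderr, so merge stderr into the saved log.
make html 2>&1 | tee /tmp/doc_build.log
status=$?
if [ "$status" -ne 0 ]; then
    echo "doc build error detected:"
    grep WARNING /tmp/doc_build.log
    exit "$status"  # keep the step failing after printing the summary
fi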

I wonder if there is a way to write a test to make sure it works as advertised.
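
One possible shape for such a test, as a rough sketch only: run the summary step over a small fake log and check that the Sphinx warning is reported while the Python warnings are not. The helper name summarize_doc_warnings and the temporary log file are hypothetical and not part of the PyTorch build scripts; the log lines are copied from the output quoted above.

```shell
# Hypothetical helper: print only the Sphinx-style "WARNING:" lines of a log.
# The match is case-sensitive, so "UserWarning" and "TracerWarning" are skipped.
summarize_doc_warnings() {
    grep -n 'WARNING:' "$1"
}

# Fake log built from lines quoted earlier in this issue.
log="$(mktemp)"
cat > "$log" <<'EOF'
Jan 05 15:57:22 docstring of torch.linalg.qr:29: WARNING: Explicit markup ends without a blank line; unexpected unindent.
Jan 05 15:57:23 /opt/conda/lib/python3.6/site-packages/torch/jit/_recursive.py:197: UserWarning: 'mods' was found in ScriptModule constants,  but it is a non-constant submodule. Consider removing it.
Jan 05 15:57:24 Makefile:38: recipe for target 'doctest' failed
EOF

summary="$(summarize_doc_warnings "$log")"
rm -f "$log"
printf '%s\n' "$summary"
```

Because the grep pattern is uppercase and includes the colon, only the docstring warning should survive the filter, which is the property a test would want to pin down.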

@mattip mattip self-assigned this Jan 10, 2021
@mruberry
Collaborator Author

Telling people to search for "WARNING" would be a major improvement. Replicating the WARNINGs with some context at the bottom of the build output would be even better. We could also consider creating an artifact and telling people to search for WARNING and/or look at the artifact to understand the failures.
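
A minimal sketch of what replicating the warnings with context could look like, assuming the build output has already been saved to a log file. The function name print_warning_context and its default of two context lines are assumptions for illustration, not existing CI code; the sample log is assembled from lines quoted earlier in this issue.

```shell
# Hypothetical helper (not part of the PyTorch CI scripts): reprint every
# "WARNING:" line from a saved build log with a few lines of context.
print_warning_context() {
    log_file="$1"
    context="${2:-2}"   # lines of context before and after each match
    echo "==== docs build warnings ===="
    # -n prefixes log line numbers so readers can jump to the full log.
    grep -n -B "$context" -A "$context" 'WARNING:' "$log_file" \
        || echo "no WARNING: lines found in $log_file"
}

# Sample log assembled from lines quoted earlier in this issue.
log="$(mktemp)"
cat > "$log" <<'EOF'
Jan 05 15:57:22 docstring of torch.linalg.qr:29: WARNING: Explicit markup ends without a blank line; unexpected unindent.
Jan 05 15:57:23   _module_class,
Jan 05 15:57:24 Makefile:38: recipe for target 'doctest' failed
EOF

out="$(print_warning_context "$log" 1)"
rm -f "$log"
printf '%s\n' "$out"
```

Calling this at the very end of a failed build step would put the relevant lines at the bottom of the output, where people look first.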

@rgommers
Collaborator

There is an artifact already. It may make sense to add a note to the message that the pytorch_python_doc_build CI job takes you straight to the rendered HTML, which you can inspect for issues.

@gchanan gchanan added triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module and removed triage review labels Jan 11, 2021