New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add notebook formatting and CI check #124
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof that diff.
I'll need to download this and look at it in pycharm's differ (which handles the change in indent much more elegantly). Is it just changing the indent or are there additional changes?
How hard would it be to maintain a list of grandfathered-in 2-indent notebooks so we can kick this can down the road some more?
dev_tools/nbformat
Outdated
#!/usr/bin/env bash | ||
|
||
############################################################################### | ||
# Formats ipython notebooks with tensorflow-docs nbformat tool. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you write this or take it from somewhere?
If the former: ick bash. I guess it's ok here since you aren't doing looping (which is my rule of thumb for when a script definitely needs to be in a real programming langauge)
If the latter: note where you got it from so we can keep them in sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is taken directly from Cirq: https://github.com/quantumlib/Cirq/blob/master/check/nbformat
In addition to indentation, metadata is also removed and probably a few other things - @lamberta may know more here.
Not sure why the diff is so large - may be because GH has trouble analyzing notebooks but I'll check. At least by quick glance the notebooks are very similar before/after formatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try passing --indent=2
to nbfmt. The default is 4 but the quantum folks prefer 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dev_tools/nbformat
Outdated
cd "$(git rev-parse --show-toplevel)" | ||
|
||
# Install a pinned version of tf-docs | ||
TF_DOCS_COMMIT=b3dc8a922d8bdc6e998a8ad4f4953359dc6e576a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pinning a commit makes sense, but we don't really do releases in tensorflow/docs so this commit is arbitrary.
It might be better to download the cirq dev_tools/nbformat
script into your GitHub Actions VM so you're at least using the same thing and don't get too out of sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 downloading cirq's script
Diff is about 1/2 as large with indent=2. (But note Cirq sets indent=1.)
I think it's fine to punt on this. This PR is the simplest fix for #56 using the tool already in Cirq, but it sounds like there's some things to decide on.
Great point. So would we want to just grab this file from Cirq in CI to make sure they're the same? I guess this works as long as the file location doesn't change. |
Ah, you're right (and I misspoke): nbfmt indent defaults to 2 and the quantum folks mentioned they prefer 1.
Curl'ing the file is probably easiest for a one-off. Though you can checkout the entire Cirq repo (shallow clone) next to this one using the GitHub Actions checkout module—if there's additional tooling in that repo you needed to access. |
If we use Cirq's script (which I think we should) then we can't do this (without a lot of headache) so I'm fine "biting the bullet" now. Like I said, pycharm seems to handle the change-in-indent diffs much better so I'll review the PR that way once it's finalized with indent=1 and using cirq's script |
Another fun trick is that you can append |
@rmlarose what do you think about |
I like it! I should be able to get to this by the end of the week, just haven't had time yet. Only potential downside I see is a new contributor to ReCirq wondering how to format a notebook to pass CI. It can be figured out though, and I agree its a priority to be in sync with Cirq. |
Good point. Maybe we can have a simple shell script in this repo to optionally download and then execute the script if [[ check/nbfmt doesn't exist ]]
wget it from cirq
fi
check/nbfmt which the CI can just call (as well as contributors). [and add the downloaded file to |
True. Depending how fancy you want to get, you can add a GitHub Action to automatically post a comment with instructions to the pull request, for example: tensorflow/docs-l10n#474 (comment) We actually have a bot set up to automatically format new pull requests. It's nice but requires auth setup for GitHub with a Cloud Function. Probably not worth it for now |
Went with this method.
Works well I think, but I guess we're kind of crossing our fingers that no one moves or renames |
This would break the CI but not any user-facing functionality, i.e. the least-bad breakage. I'll keep an eye on Cirq PRs with this in mind anyways though :) |
@@ -39,3 +39,6 @@ docs/generated | |||
|
|||
# Default pycharm virtual env | |||
.venv/ | |||
|
|||
# Notebook formatting script. | |||
nbformat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but it downloads into dev_tools/
, right? I guess this still works but may be overzealous. Probably doesn't matter too much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed downloads into dev_tools
} | ||
}, | ||
"source": [ | ||
"king_moves = cirq.Circuit(\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this deleting code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's moving it down lower
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked the diff offline. Looks good; quantum chess notebook is a little wonky. Why are we saving outputs there? And if we are, we should run it through "fresh" so the indices are contiguous from zero
Not sure. Maybe there was a good reason for saving output? The first cell starting at 97 is certainly not necessary - at least this doesn't show on the website or on colab though. See #135 which follows up on this. I can address any remaining comments from this PR there if desired. |
Fixes #56.