Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complete Jupyter notebook integration #5188

Closed
21 of 26 tasks
dhruvmanila opened this issue Jun 19, 2023 · 8 comments
Closed
21 of 26 tasks

Complete Jupyter notebook integration #5188

dhruvmanila opened this issue Jun 19, 2023 · 8 comments
Assignees
Labels
core Related to core functionality

Comments

@dhruvmanila
Copy link
Member

dhruvmanila commented Jun 19, 2023

Meta issue to keep track of the remaining tasks before shipping Jupyter notebook native integration:

Formatter Support

Editor Integrations

IPython Syntax Handling (%, !, etc.)

Ruff

Proposed new rules

  • Empty magic command (%) (Autofix: remove the statement)
  • Magic command contains whitespace between the prefix and command value (% matplotlib inline) (Autofix: remove all the whitespaces) (Only valid for % magic prefix)

Existing rule changes

I'll be looking at other related rules and update this issue accordingly

Parser

Autofix for Jupyter Notebook

Implementation

@charliermarsh
Copy link
Member

My main question would be: at what point do you think we can remove the feature flag, and open this up to users? (Maybe we're already past that point!)

@charliermarsh charliermarsh added the core Related to core functionality label Jun 20, 2023
@dhruvmanila
Copy link
Member Author

I do think we're ready except that without the (1) task above Ruff might give false positives. Without that we would ignore any cell which contains magic commands. Now, we would also ignore if it contains valid Python code. It could be that a variable is defined in that cell and is being referenced in another cell. Here, we might trigger undefined variable. Similar example can be given for imports.

I see this a lot for %matplotlib inline command: https://sourcegraph.com/search?q=context:global+lang:%22Jupyter+Notebook%22+%25matplotlib+inline&patternType=standard&sm=1&groupBy=repo

@dhruvmanila
Copy link
Member Author

What do you think about releasing this under experimental? I'm not sure what would that look like but maybe we don't add the *.ipynb to include and ask users to explicitly opt in to that.

@charliermarsh
Copy link
Member

Yeah, I think it would be reasonable to release it now (even without magic handling) and omit *.ipynb in include, so users need to opt-in explicitly. We can market it as experimental to get feedback. We can include it in the next release IMO.

We should also add the magic command support, but doesn't have to block if it's opt-in :)

@charliermarsh
Copy link
Member

(To be explicit: we could just remove the flag now.)

dhruvmanila added a commit that referenced this issue Jun 26, 2023
## Summary

Experimental release for Jupyter Notebook integration.

Currently, this requires a user to explicitly opt-in using the
[include](https://beta.ruff.rs/docs/settings/#include) configuration:

```toml
[tool.ruff]
include = ["*.py", "*.pyi", "**/pyproject.toml", "*.ipynb"]
```

Or, a user can pass in the file directly:

```sh
ruff check path/to/notebook.ipynb
```

For known limitations, please refer #5188 

## Test Plan

Following command should work without the `--all-features` flag:

```sh
cargo dev round-trip /path/to/notebook.ipynb
```

Following command should work with the above config file along with
`select = ["ALL"]`:

```sh
cargo run --bin ruff -- check --no-cache --config=../test-repos/openai-cookbook/pyproject.toml --fix ../test-repos/openai-cookbook/
```

Passing the Jupyter notebook directly:

```sh
cargo run --bin ruff -- check --no-cache --isolated --select=ALL --fix ../test-repos/openai-cookbook/examples/Classification_using_embeddings.ipynb
```
@jpmckinney
Copy link

I'm not sure if this deserves its own issue, but: Would it be possible to use extend-include = ["*.ipynb"] and ruff format to only re-format the Python code in cells, and not the ipynb file itself?

My users typically edit notebooks in Google Colab. Google Colab uses two-space indentation in the ipynb file, instead of the typical one-space indentation. ruff format currently re-indents with one space. As such, edit history becomes difficult/impossible to track, because effectively every line gets rewritten between the user saving from Google Colab (changing to 2 spaces) and CI running ruff format (changing to 1 space).

Google Colab also puts keys like metadata, nbformat, nbformat_minor at the top of the file, but ruff format moves these to the bottom. ruff format also changes the order of execution_count, outputs, etc.

I would love to be able to opt out of ipynb formatting, so that only the Python code inside cells is reformatted.

@dhruvmanila
Copy link
Member Author

(I think it deserves its own issue, we can discuss it there :))

@dhruvmanila
Copy link
Member Author

Closing this as completed. The only thing remaining is to support --add-noqa.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Related to core functionality
Projects
None yet
Development

No branches or pull requests

3 participants