Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use libCST to parse imports from python 3 code #10907

Conversation

cosmicexplorer
Copy link
Contributor

@cosmicexplorer cosmicexplorer commented Oct 5, 2020

Problem

Fixes #10894.

Solution

  • Add libcst>=0.3.12,<0.4 to requirements.txt.
  • Remove the usage of the normal python 3 ast library from import_parser.py, and implement a _CSTVisitor to visit import nodes.
  • When using python 3, instead use the _CSTVisitor.
    • We use python 3.8 to parse all python 3 code, which is 100% correct for the purpose of parsing imports.

Result

We should no longer be dependent on the version of the current python executable to parse python code for dependencies!

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@cosmicexplorer cosmicexplorer force-pushed the libcst-for-range-of-python-versions branch from 284dace to fd082cb Compare October 5, 2020 07:05
Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

3rdparty/python/constraints.txt Outdated Show resolved Hide resolved
Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this test skip! That's really exciting. This is a great improvement.

@pytest.mark.skipif(
sys.version_info[:2] < (3, 8),
reason="Cannot parse Python 3.8 unless Pants is run with Python 3.8.",
)
def test_works_with_python38() -> None:

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@cosmicexplorer cosmicexplorer force-pushed the libcst-for-range-of-python-versions branch from 2e4e2fe to e08e438 Compare October 6, 2020 01:22
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@cosmicexplorer cosmicexplorer force-pushed the libcst-for-range-of-python-versions branch from 80ab6d9 to 46459fd Compare October 6, 2020 01:55
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@cosmicexplorer cosmicexplorer force-pushed the libcst-for-range-of-python-versions branch from 46459fd to e58a4b1 Compare October 6, 2020 01:56
@coveralls
Copy link

coveralls commented Oct 6, 2020

Coverage Status

Coverage remained the same at 0.0% when pulling d7de279 on cosmicexplorer:libcst-for-range-of-python-versions into ee50791 on pantsbuild:master.

3rdparty/python/requirements.txt Outdated Show resolved Hide resolved
Comment on lines +140 to +141
if isinstance(s, bytes):
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because SimpleString includes bytes, I presume?

Copy link
Contributor Author

@cosmicexplorer cosmicexplorer Oct 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, although it's not explicitly mentioned in the documentation: https://libcst.readthedocs.io/en/latest/nodes.html?highlight=import#libcst.SimpleString.

With pex:

>>> cst.parse_expression('b"asdf"').evaluated_value
b'asdf'
>>> type(cst.parse_expression('b"asdf"'))
<class 'libcst._nodes.expression.SimpleString'>
>>> cst.parse_expression('"asdf"').evaluated_value
'asdf'
>>> type(cst.parse_expression('"asdf"'))
<class 'libcst._nodes.expression.SimpleString'>

@Eric-Arellano
Copy link
Contributor

Eric-Arellano commented Oct 6, 2020

Also, what do you think about logging at the debug level if the AST failed to parse? I'm thinking about new users trying out Python 3.9 being confused why dep inference isn't working if they are using Py39-only syntax. It might be helpful for us if we could ask them to run ./pants -ldebug so that we can identify it's an AST parsing issue vs. something else, like >1 owner for the file.

If you like that, it would be excellent to use the caplog Pytest fixture to ensure we are logging:

def test_warn_if_python_version_configured(caplog) -> None:
def assert_is_configured(*, has_config: bool, args: List[str], warning: str) -> None:
config = FileContent("mypy.ini", b"[mypy]\npython_version = 3.6") if has_config else None
is_configured = check_and_warn_if_python_version_configured(config=config, args=tuple(args))
assert is_configured
assert len(caplog.records) == 1
assert warning in caplog.text
caplog.clear()
assert_is_configured(has_config=True, args=[], warning="You set `python_version` in mypy.ini")

(I think DEBUG rather than WARN because we don't want to be too noisy. Their code will already eagerly fail if they have a legit syntax error when running run, test, etc. A general principle of dep inference is that it should never cause your build to break; it's only an enhancement.)

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@cosmicexplorer
Copy link
Contributor Author

Also, what do you think about logging at the debug level if the AST failed to parse?

Done! Thanks so much for suggesting this!

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@cosmicexplorer cosmicexplorer force-pushed the libcst-for-range-of-python-versions branch from d218329 to d7de279 Compare October 7, 2020 20:13
Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent. Thank you so much for iterating on this! I love the debug messages and appreciate the tests for it.

@cosmicexplorer cosmicexplorer merged commit ea542ac into pantsbuild:master Oct 8, 2020
Eric-Arellano added a commit to Eric-Arellano/pants that referenced this pull request Oct 20, 2020
…10907)

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
Eric-Arellano added a commit that referenced this pull request Oct 21, 2020
…1001)

Unfortunately, libCST resulted in a major slowdown.

Before, ff02ef0:

```
▶ /usr/bin/time ./pants --no-pantsd typecheck src/python/pants::
...
       10.63 real        11.47 user         6.99 sys
```

After, ea542ac:

```
▶ /usr/bin/time ./pants --no-pantsd typecheck src/python/pants::
...
       35.40 real        36.13 user         7.08 sys
```

This makes sense: libCST stores a Concrete Syntax Tree, rather than an AST. It's doing more work, like preserving whitespace.

Unfortunately, this means that you must again run Pants with Python 3.8 in order to parse Python 3.8-only syntax. Because the original fix results in a 240% slowdown, this is less offensive than the slowdown, even though it's not ideal. 

[ci skip-rust]
[ci skip-build-wheels]
Eric-Arellano added a commit to Eric-Arellano/pants that referenced this pull request Oct 21, 2020
…10907) (pantsbuild#11001)

Unfortunately, libCST resulted in a major slowdown.

Before, ff02ef0:

```
▶ /usr/bin/time ./pants --no-pantsd typecheck src/python/pants::
...
       10.63 real        11.47 user         6.99 sys
```

After, ea542ac:

```
▶ /usr/bin/time ./pants --no-pantsd typecheck src/python/pants::
...
       35.40 real        36.13 user         7.08 sys
```

This makes sense: libCST stores a Concrete Syntax Tree, rather than an AST. It's doing more work, like preserving whitespace.

Unfortunately, this means that you must again run Pants with Python 3.8 in order to parse Python 3.8-only syntax. Because the original fix results in a 240% slowdown, this is less offensive than the slowdown, even though it's not ideal. 

[ci skip-rust]
[ci skip-build-wheels]
@gshuflin gshuflin mentioned this pull request Oct 30, 2020
gshuflin added a commit that referenced this pull request Oct 31, 2020
Internal-only changes: 

* upgrade to cpython crate v0.5.1 (#11052)
  `PR #11052 <https://github.com/pantsbuild/pants/pull/11052>`_

* Prepare 2.0.0 (#11053)
  `PR #11053 <https://github.com/pantsbuild/pants/pull/11053>`_

* Revert "Add new EngineAware method metadata() (#11030)" (#11047)
  `PR #11030 <https://github.com/pantsbuild/pants/pull/11030>`_
  `PR #11047 <https://github.com/pantsbuild/pants/pull/11047>`_

* Remove deprecated `python_binary` target in favor of `pex_binary` (#11046)
  `PR #11046 <https://github.com/pantsbuild/pants/pull/11046>`_

* Prepare 2.0.0rc3 (#11044)
  `PR #11044 <https://github.com/pantsbuild/pants/pull/11044>`_

* Eagerly validate entry points for `setup_py().with_binaries()` (#11034)
  `PR #11034 <https://github.com/pantsbuild/pants/pull/11034>`_
  `PR #11021 <https://github.com/pantsbuild/pants/pull/11021>`_

* Use Ubuntu Bionic for CI (#11027)
  `PR #11027 <https://github.com/pantsbuild/pants/pull/11027>`_

* Prepare 2.0.0rc2 (#11017)
  `PR #11017 <https://github.com/pantsbuild/pants/pull/11017>`_

* Remove RunTrackerLogger (#11018)
  `PR #11018 <https://github.com/pantsbuild/pants/pull/11018>`_

* Upgrade Pex to 2.1.20 (#11014)
  `PR #11014 <https://github.com/pantsbuild/pants/pull/11014>`_

* Remove more unused code from RunTracker (#11012)
  `PR #11012 <https://github.com/pantsbuild/pants/pull/11012>`_

* Add type annotations to AggregatedTimings (#11009)
  `PR #11009 <https://github.com/pantsbuild/pants/pull/11009>`_

* Increase default `[python-setup].resolver_jobs` to `cpu_count / 2` (#11006)
  `PR #11006 <https://github.com/pantsbuild/pants/pull/11006>`_

* Include `<PYENV>` in `[python-setup].interpreter_search_paths` default (#10998)
  `PR #10998 <https://github.com/pantsbuild/pants/pull/10998>`_

* Remove PantsDaemonStats class wrapper (#11003)
  `PR #11003 <https://github.com/pantsbuild/pants/pull/11003>`_
  `PR #files#r508861045 <https://github.com/pantsbuild/pants/pull/11000/files#r508861045>`_

* Revert using libCST for dep inference due to performance (#10907) (#11001)
  `PR #10907 <https://github.com/pantsbuild/pants/pull/10907>`_
  `PR #11001 <https://github.com/pantsbuild/pants/pull/11001>`_


* Run tracker refactor (#11000)
  `PR #11000 <https://github.com/pantsbuild/pants/pull/11000>`_

* refactor Core::new including command runner setup (#10993)
  `PR #10993 <https://github.com/pantsbuild/pants/pull/10993>`_
  `PR #10960 <https://github.com/pantsbuild/pants/pull/10960>`_

* Allow changing the versioning scheme for `python_distribution` first-party dependencies (#10977)
  `PR #10977 <https://github.com/pantsbuild/pants/pull/10977>`_

* Remove tokio Handle type from Executor::new (#10980)
  `PR #10980 <https://github.com/pantsbuild/pants/pull/10980>`_

* Remove deprecated `Address.parse()` and `Address.reference()` (#10981)
  `PR #10981 <https://github.com/pantsbuild/pants/pull/10981>`_

* Remove project_ methods from the externs module (#10955)
  `PR #10955 <https://github.com/pantsbuild/pants/pull/10955>`_

* Prepare 2.0.0rc1 (#10972)
  `PR #10972 <https://github.com/pantsbuild/pants/pull/10972>`_

* Fix interpreter selection when building a PEX to use `[python-setup].interpreter_search_paths` (#10965)
  `PR #10965 <https://github.com/pantsbuild/pants/pull/10965>`_

* Delete `JsonReporter` (#10964)
  `PR #10964 <https://github.com/pantsbuild/pants/pull/10964>`_

* Fix log (#10959)
  `PR #10959 <https://github.com/pantsbuild/pants/pull/10959>`_

* Expose getattr method for Python-related APIs (#10953)
  `PR #10953 <https://github.com/pantsbuild/pants/pull/10953>`_

* Upgrade tokio package to 0.2.22 (#10949)
  `PR #10949 <https://github.com/pantsbuild/pants/pull/10949>`_

* Use `package` to build Pants's wheels, rather than `setup-py` (#10947)
  `PR #10947 <https://github.com/pantsbuild/pants/pull/10947>`_

* Simplify val_to_str and have it and val_to_log_level use PyObject (#10946)
  `PR #10946 <https://github.com/pantsbuild/pants/pull/10946>`_
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

use libCST to stop having "the versions of python you can parse" depend on "pants's python version" for good
3 participants