Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better python package identification #215

Open
wants to merge 7 commits into
base: master
from

Conversation

@rotu
Copy link
Contributor

commented Aug 27, 2019

@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch from 09eb1d8 to e17f0ff Aug 27, 2019
@codecov-io

This comment has been minimized.

Copy link

commented Aug 28, 2019

Codecov Report

Merging #215 into master will decrease coverage by <.01%.
The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #215      +/-   ##
==========================================
- Coverage   80.24%   80.23%   -0.01%     
==========================================
  Files          54       55       +1     
  Lines        3098     3097       -1     
  Branches      512      511       -1     
==========================================
- Hits         2486     2485       -1     
  Misses        576      576              
  Partials       36       36
Impacted Files Coverage Δ
colcon_core/task/python/build.py 33.56% <0%> (ø) ⬆️
colcon_core/task/python/test/__init__.py 26.31% <0%> (ø) ⬆️
colcon_core/run_setup_py.py 37.5% <37.5%> (ø)
colcon_core/package_identification/python.py 93.05% <88.46%> (+5.4%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a75968f...3f670a8. Read the comment docs.

@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Aug 28, 2019

@dirk-thomas, Finally got it to pass CI! This was a bear - for some reason colcon doesn't seem to work properly when installed in editable (developer) mode. Shall I squash before or after you take a first review pass?

@rotu rotu changed the title WIP: Better package identification Better python package identification Aug 28, 2019
Copy link
Member

left a comment

for some reason colcon doesn't seem to work properly when installed in editable (developer) mode.

I would be very surprised if that is the case. A symlink install / develop mode is my primary work environment and it works just fine for me. Please feel free to create a separate ticket with specifics if there is something which needs fixing.

Shall I squash before or after you take a first review pass?

No need to. I will take a look at it right away...

colcon_core/package_identification/python.py Outdated Show resolved Hide resolved
colcon_core/package_identification/python.py Outdated Show resolved Hide resolved
colcon_core/package_identification/python.py Outdated Show resolved Hide resolved
colcon_core/package_identification/python.py Outdated Show resolved Hide resolved
colcon_core/package_identification/python.py Outdated Show resolved Hide resolved
colcon_core/package_identification/python.py Outdated Show resolved Hide resolved
colcon_core/package_identification/python.py Outdated Show resolved Hide resolved
test/test_package_identification_python.py Outdated Show resolved Hide resolved
test/test_package_identification_python.py Outdated Show resolved Hide resolved
colcon_core/package_identification/python.py Outdated Show resolved Hide resolved
@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Aug 29, 2019

Please don't change the line limit and revert 0c94427.

@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Aug 29, 2019

Please don't change the line limit and revert 0c94427.

so not this? https://index.ros.org//doc/ros2/Contributing/Developer-Guide/#id4
What should the line limit be?
Edit: seems this uses the flake8 default of 79 instead of ament_flake8's 99. I'll switch back to 79 but I'm not gonna like it!

@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Aug 29, 2019

seems this uses the flake8 default of 79

Yeah, colcon uses plain pep8 / flake8 without any ROS specific customizations.

@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch 2 times, most recently from 71433fd to d3ca897 Aug 29, 2019
@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Aug 29, 2019

It successfully builds the entire ROS2 source tree now!

@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

I think you introduced some syntax errors here.

Indeed - fixed in ba17901 and 8dad4a0.

@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

I dislike the use of format_locals in this way since (1) it prevents static analysis tools from being able to catch mistakes in variable/property names (2) it prevents tools from being able to detect unused local variables.

Both of these arguments are limitations in tooling. Those should go away when we move f-strings (see next comment).

On the plus side it resembles f-string format, but I’d rather use .format()

That is exactly the intention. Once we can drop Python 3.5 support all .format_map(locals()) cases will be converted to f-strings.

@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

Sorry for the numerous small edits. I thought I could address a few nitpicks through the GitHub UI but then it went down a rabit hole. I should have edited the files locally and pushed the changes to this branch instead...

For the moment I have stopped iterating on the patch since I am unable to bootstrap the colcon packages using this. The package identification of colcon-argcomplete fails due to a PicklingError:

Can't pickle <class 'CustomSdistDebCommand'>: attribute lookup CustomSdistDebCommand on builtins failed

The problem seems to be due to the switch to the multiprocess module and trying to pass Python data back and forth. To resolve this the logic will likely need to be change to be similar to what colcon-python-setup-py does: the subprocess outputs the data in Python syntax which then is being interpreted (see https://github.com/colcon/colcon-python-setup-py/blob/4d9c1c4bb2e8652ec4099be0cd9b1cf31cf73ac2/colcon_python_setup_py/package_identification/python_setup_py.py#L121).

That being said maybe it would be better to:

  • keep the Python package identification extension in this repo as is (only taking setup.cfg files into account, enough to perform the bootstrapping) and
  • update the Python package identification extension in the colcon-python-setup-py package to use the approach from this PR

What do you think?

@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Sep 5, 2019

Sorry for the numerous small edits. I thought I could address a few nitpicks through the GitHub UI but then it went down a rabit hole.

It's all good. I basically live down a fractal rabbit hole.

For the moment I have stopped iterating on the patch since I am unable to bootstrap the colcon packages using this. The package identification of colcon-argcomplete fails due to a PicklingError

I'll take a look. That particular issue looks as simple as adding 'cmdclass' to the blacklist here https://github.com/RoverRobotics-forks/colcon-core/blob/fbaa13384f97740b968bdeaf975f751dab298a29/colcon_core/package_identification/python.py#L154-L162

It's possible to just not use multiprocessing and run setup.py in the colcon process, but I figured isolating package setup.py's from each other is a desirable feature.

I do like this approach (I'm sure in part because I've sunk so much time and effort into it) and I think it's going to do the right thing more often than either the setup.py or setup.cfg approach. I'd like to fix it and ship it 🚀

@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

That particular issue looks as simple as adding 'cmdclass' to the blacklist

That sounds like a viable option to me.

It's possible to just not use multiprocessing and run setup.py in the colcon process

I wasn't suggesting that. Even the colcon-python-setup-py does it in a separate process - it just uses subprocess instead of multiprocessing. But lets go with the blacklisting for now.

@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch from d9f7c39 to 53915a2 Sep 5, 2019
@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 5, 2019

Please rebase your branch on top of the latest state of master. Otherwise it is difficult to try the bootstrapping since other packages already depend on changes available on master.

@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch from 53915a2 to 63ba56b Sep 5, 2019
@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Sep 5, 2019

Rebased.

@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 6, 2019

While the CI tests pass one of the tests fails when running colcon test --packages-select colcon-core. In the failing test the function distutils.core.run_setup even returns the information about the colcon-core package even when pointing to a setup.py file which is empty (https://github.com/colcon/colcon-core/pull/215/files#diff-381a95d67d06d3fbe2e095a1b1d3644bR38).

@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch 3 times, most recently from 6a43078 to d300e75 Sep 7, 2019
@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 7, 2019

Why is the change in 1cf7b4f necessary? Could this be deferred to a separate PR instead to keep the diff minimal?

@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Sep 13, 2019

you got it.

@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch from 78865c8 to 8260648 Sep 13, 2019
@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch from 8260648 to 94a420b Sep 13, 2019
@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch from 94a420b to 23f0605 Sep 13, 2019
@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 19, 2019

Please move the improved test into a separate PR with the goal to merge that one first.

This is waiting for #229, right?

@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Sep 19, 2019

This is waiting for #229, right?

Yes.

@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch from 59bf510 to 92af5f1 Sep 19, 2019
@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 19, 2019

While #229 should land first it would be good to see this PR pass CI and then show the coverage impact of both PRs (since this one includes the other one). Let me know once that is the case so I can re-review. Thanks.

rotu and others added 4 commits Sep 19, 2019
Previously, we either monkey-patched setuptools and harvested the arguments passed to setuptools.setup or we parsed setup.cfg
Now we run the setup.py script with distutils.core.run_setup to return properties from the resulting Distribution object.
@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch from 92af5f1 to 1864386 Sep 19, 2019
@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Sep 19, 2019

While #229 should land first it would be good to see this PR pass CI and then show the coverage impact of both PRs (since this one includes the other one). Let me know once that is the case so I can re-review. Thanks.

Done. In retrospect, splitting this up into two PRs has seemed to be a mistake :-(

@dirk-thomas

This comment has been minimized.

Copy link
Member

commented Sep 26, 2019

Overall what this patch does looks good to me.

The only thing making me a bit worry is the introduced complexity into colcon-core with the usage of multiprocessing. I just recall pass problems which have recently come up again (e.g. ros-infrastructure/catkin_pkg#266). And we don't strictly need this functionality to bootstrap colcon-core since all colcon packages use setup.cfg files.

I brought up the alternative above and would still be interested what you think about moving the functionality in colcon-python-setup-py where it would actually replace the current approach to "interrogate" the setup function:

That being said maybe it would be better to:

  • keep the Python package identification extension in this repo as is (only taking setup.cfg files into account, enough to perform the bootstrapping) and
  • update the Python package identification extension in the colcon-python-setup-py package to use the approach from this PR

What do you think?

@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

I don't like the idea of having two implementations of the same logical operation, but you gave me an idea...

@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch 3 times, most recently from 65bfdeb to b9fd456 Sep 26, 2019
@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

Alright. I made it not use subprocesses when bootstrapping. I think it's silly and unnecessary - building C++ and Python projects necessarily involves spawning subprocesses; no need to fear them for package identification.

@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch 3 times, most recently from 971d916 to 192fcda Sep 26, 2019
It turns out most of what I was doing with multiprocessing.Process was reinventing Pool.apply.
This also naturally throttles the number of simultaneous processes, which was a possible concern.
Move setup.py-specific stuff to run_setup_py. I expect to reuse this for python build, plus isolating it makes the subprocess workload minimal.
@rotu rotu force-pushed the RoverRobotics-forks:better_package_identification branch from 192fcda to 3f670a8 Sep 28, 2019
@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Sep 28, 2019

@dirk-thomas, I think you're gonna like this. I switched to a process pool to reduce the number of simultaneous processes (by default = number of CPU cores) and the code is even cleaner. Still passes all the tests :-).

@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Sep 30, 2019

Oh and setuptools read_configuration is not even threadsafe on simple cfg files. It uses os.chdir.

@rotu

This comment has been minimized.

Copy link
Contributor Author

commented Oct 1, 2019

@dirk-thomas So now that it uses a process pool, this will not create arbitrarily many parallel processes which averts the Windows issue you mentioned.

As for the complexity, I’d rather have one, more complex and correct code path than two partial ones. Bugs hide in poorly-exercised code paths. If you'd rather move python discovery and building out of this package entirely, that sounds like a great task for a follow-on PR.

Is there something I’m missing or can this move forward as-is?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
3 participants
You can’t perform that action at this time.