Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introducing pybuilder #130

Closed
wants to merge 22 commits into from
Closed

Introducing pybuilder #130

wants to merge 22 commits into from

Conversation

svaningelgem
Copy link
Contributor

I implemented pybuilder now here. The changes do NOT include the optimizations of the headers (yet) ;-). That's for another PR.
pybuilder works, I don't know about travis yet though.

@svaningelgem
Copy link
Contributor Author

svaningelgem commented Oct 14, 2020

Sorry, it took a few hits & misses before I got it building right on travis :).

One thing to mention: if I'm using pybuilder 0.12 (latest) it will only work on py 3.4+. If you still want to support that version (end security fixes = 2019-03 for 3.4, 2020-09 for 3.5), we can lower down the pybuilder version back to 0.11. As far as I'm concerned there is not really a reason to use the latest (except better support for colors in the console :p).

I would suggest to drop 3.5 support as well, normally bigger companies (the ones that use Spark) will seldom use a (python) version that is not support anymore by fixes.
Advantage is that a bunch of the code will be nicer ;-).
I made these changes as-is, but can revert/adjust if you would decide differently.

@svaningelgem
Copy link
Contributor Author

So, now it should be fine. There were a bunch of changes done by the "intelligent" refactoring of IntelliJ... Damn annoying :)

@svenkreiss
Copy link
Owner

Hello @svaningelgem , can you say a few more words about the benefit of pybuilder for pysparkling?

@svaningelgem
Copy link
Contributor Author

svaningelgem commented Oct 15, 2020

The reason why I like pybuilder is the modular setup and the ease with which you can write/include additional plugins into the build process.It has a lot of decorators to handle specific tasks in the build process. Or attach your own steps to existing steps in the build process.All things that are more difficult (of course not impossible) to achieve with setuptools and/or additional scripts.

Pybuilder concepts are based on the Maven/Gradle way of working and I like these concepts a lot as they enforce (a bit) to write cleaner code.The way it builds a project is much more simple (and if you need more complex workflows, a lot of plugins are available which have already implemented just that.)

Anyway, that's my take on it: simplicity and extensibility in 1 clean package.

@svenkreiss
Copy link
Owner

What plugins? What is the problem you are solving in the setuptools build process?

@svaningelgem
Copy link
Contributor Author

What I still had written down to enable in the build process:

  • flake8
  • coverage via coveralls
  • pylint moving from travis to the build chain
  • optimize imports

Copy link
Collaborator

@tools4origins tools4origins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments, but here is my main one:

It seems that support of pybuilder would allow/simplify setting up some pipelines, but we only write setup.py when running it. Would it be possible to keep setup.py and, if needed, have pybuilder retrieving some information using import?

@@ -1,16 +1,14 @@
dist: xenial
language: python
python:
- "3.4"
- "3.5"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to stop supporting these versions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some issues with the code + Python 3.5 stopped with full support in mid 2017 iirc, and ended security fixes 2 years later... The earliest Python version still supported is 3.6 (till end of next year).
If you want I will add them again?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.5 or 3.4? 3.5 end of life was 2 weeks ago (it has now entered in security mode only)... But yep, still, it is in the past.

I spend a bit of time on my last PR because of some python<3.6 related issues (unordered dict).

There are some stats on the marketshare of python versions here: https://github.com/hugovk/pypi-tools/tree/master/ and... I think I am in favor of dropping support for 3.5 in the new versions of pysparkling. Its marketshare is shrinking (more than I thought), there is a cost to maintain them, and the previous version of pysparkling still supports it

build.py Show resolved Hide resolved
build.py Outdated Show resolved Hide resolved
@@ -1,68 +0,0 @@
from setuptools import setup, find_packages
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to only add support of pybuilder, without removing support of setuptools & co?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is possible, but why support 2 build chains? We can do this for the moment of course.

setup.cfg Outdated Show resolved Hide resolved
build.py Outdated Show resolved Hide resolved
# https://github.com/AlexeySanko/pybuilder_pytest
# use_plugin('pypi:pybuilder_pytest')
# https://github.com/AlexeySanko/pybuilder_pytest_coverage
# use_plugin('pypi:pybuilder_pytest_coverage')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove this commented code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to keep them there as it was a reminder for the next steps: coverage & pytest within pybuilder.

@svenkreiss
Copy link
Owner

I don't see a reason for pybuilder for pysparkling. Every person and every CI can run pylint, flake8, coveralls etc. The proposed changes to the repo are too extensive.

@svaningelgem
Copy link
Contributor Author

svaningelgem commented Oct 15, 2020

I don't see a reason for pybuilder for pysparkling. Every person and every CI can run pylint, flake8, coveralls etc. The proposed changes to the repo are too extensive.

There are nearly no changes except for the moving into the src directory. It looks big, but it's just moving stuff.
Read the discussion here: tools4origins#4 (comment)

Yes, you're right, setuptools can do that, and any person can write the commands. But would you? Should every person do that? Streamline the experience!
So if you want to smoothen it, you write a script. And what's so different about getting a build.py in the root... Not much in my opinion.

@svenkreiss
Copy link
Owner

It's late here, let's discuss tomorrow.

@tools4origins
Copy link
Collaborator

tools4origins commented Oct 16, 2020

I think there are 3 modifications in this PR:

  1. Moving pysparkling code to from ./pysparkling to ./src/pysparkling
  2. Adding PyBuilder
  3. Removing setup.py

I was not convinced at the beginning but 1. Moving pysparkling code to from ./pysparkling to ./src/pysparkling appears to be a good thing:

  • It makes it possible to only add pysparkling module to PYTHONPATH by adding project_dir/src. Currently, we cannot add pysparkling code without also adding others modules in the project root.
  • It appears to be useful in other cases, e.g. testing, TL;DR: If you want to test an pysparkling installation but in the project directory, your tests will implicitly run against the project module, not the one installed, because of precedence.
  • It clarifies what the application code is, and what is related to building the package

So I think moving code to src is a good thing in itself and should be done independently of pybuilder.

Pybuilder looks good too, as it provides a unique entry point to build the project "properly" (with more checks, I think we will soon want to add coverage monitoring as there is more code than tests in my latest PRs). Also it may help on parsing the SQL grammar, which is something I am currently working on but requires some tooling to convert from the grammar definition to a string parser that converts "SELECT * FROM table" to a python object. So pybuilder should be useful.

Removing setup.py: This one I am unsure of why we do not keep it, in order not to remove support of the current flow

@svenkreiss
Copy link
Owner

I was going write a longer reply, but there is one thing we need to clarify first: you should never(!) modify PYTHONPATH. Not currently and not in the future. This is not a repository of scripts, this is a Python package. If you work with the cloned repo, you install it in editable mode: pip install -e .

Now to my longer reply from before. I wasn’t aware of the previous discussion. I have read it now.
To make progress on the discussion here, I suggest we split layout of the project from build tools.

Layout:

  1. While pypa recommends tests outside of the import path, it was a deliberate choice for pysparkling not to follow that. Many tests are derived from doctests embedded with the source and therefore inseparable. There is one major advantage for including tests in the import path (which doesn’t mean they are imported, just importable): on any system where you have pysparkling, you can run python -m pytest pysparkling.tests even when you just installed it with pip without having cloned the repository. Given that pysparkling can operate on various cluster architectures, I felt it was necessary to always be able to run a suite of tests in that particular environment.
  2. Scripts. We currently don’t need that. Let’s only add it if really necessary. This is not recommended by pypa.
  3. Src. While recommended by pypa, it’s an indirection that has not been necessary for pysparkling so far. Again, this entire repo is a Python package. Don't treat it as a collection of scripts.

Build tools:

I am not sure what your argument is. Python doesn’t need to be build when there are no extensions. Pysparkling makes it quite clear to aim for “pure python”. That makes many things very simple.

When I want to do multiple things, I am happy with running multiple commands. With any build system, you need different commands if you want to run anything individually. Like lint only or test only or test a particular file only. I don’t want these operations to be hidden behind a layer of indirection.

Converter from a grammar to a string parser would be a reason to add a scripts folder.

Coveralls is a service you can just enable in GitHub. Their command line interface to run locally is also trivial. I don't see the need for the indirection through pybuilder.

@tools4origins
Copy link
Collaborator

./tests or ./src/tests: I don't know.

./scripts: I think the script folder already exists and is not touched by this PR

./src: It seems we have reached some cases where it would be useful, so since there is no hard reason not to do it, let's do it?

Build tools: I would appreciate having a single entrypoint to check everything before a commit and potentially do some extra tasks. Especially it would enable adding more checks/automated things without making this step more complex.

@svenkreiss
Copy link
Owner

Regarding tests: in both of your proposed cases, tests would be a standalone directory with a set of Python scripts. It wouldn't be a module and it wouldn't be accessible when installed without a clone of this repository.

./scripts: indeed, there are quite a few. None for a build process though.

./src: unless you share these cases here where this is better, we cannot have a discussion about it. I am not aware of any cases.

single entry point: I'd be willing to add a ./scripts/tests.sh.

A new question that I had when looking at the code again: When you do pip install <some-github-branch>, does pip know to pickup build.py somehow and install the branch?

Maybe I am wrong and maybe pybuilder is the way forward, but I see no reasons at this moment. I only see a layer of indirection that I would like to avoid.

@svenkreiss svenkreiss closed this Nov 1, 2020
@svenkreiss
Copy link
Owner

Thank you @svaningelgem for your contribution, but for the main branch we are moving forward with pure setuptools for now. Closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants