Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major refactoring #91

Merged
merged 49 commits into from
Mar 25, 2018
Merged

Major refactoring #91

merged 49 commits into from
Mar 25, 2018

Conversation

sushain97
Copy link
Member

@sushain97 sushain97 commented Mar 24, 2018

Some of these are wishful thinking but most of them are possible in one fell swoop. The ones that are not will be converted to issues upon merge of this PR. This will, of course, require a minor version bump.

to do (feel free to add):

  • improve gitignore with standard python ignores
  • use travis_retry in travis.yml for pip installation
  • add requirements.txt (split into dev and non-dev)
  • cache python dependencies in travis.yml
  • replace streamparser submodule with pypi dependency
  • make streamparser optional dependency
  • comply with pep8 naming (and add pep8-naming flake8 plugin)
  • add trailing comma flake plugin
  • move all test related files in tests/
  • convert bash tests to unittest
  • install python dependencies in travis.yml via requirements.txt
  • rename ChangeLog to CHANGELOG
  • move all core python files into apertium_apy
  • rename language-names to language_names
  • update readme with minimal instructions
  • add coverage / build / pypi badges to readme
  • add coverage calculation
  • rename servlet.py to apy.py
  • link servlet.py to apertium_apy/apy.py
  • update all references to moved files
  • move language names README into language_names
  • add more info to language_names README
  • clone data from git instead of svn
  • add full complement of module level dunders
  • actually support py3.4 (typing module doesn't exist in it...)
  • delete toro (requires higher Tornado version, yes)
  • add setup.py
  • publish to pypi
  • use quote characters consistently and add linter
  • drop the max line length to something less ridiculous (120? 140?)
  • import requests at the top and properly disable suggestion handler if it's missing
  • cache apertium-nno in travis.yml (30% build time reduction)
  • improve/update testing README
  • expose apertium-apy script in package

create issues for (if not already exists), ticked when issue created:

@sushain97 sushain97 requested a review from xavivars March 24, 2018 03:42
@TinoDidriksen
Copy link
Member

For packaging purposes, PyPi must be optional. Everything has to work with whatever dependencies can be had from apt-get on the target distros (currently all supported and upcoming editions of Debian and Ubuntu).

Copy link
Member

@xavivars xavivars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few high level questions:

  • If streamparser is installable via pip, what would we want to make it optional? Wouldn't it be easier to just include it with the other dependencies and have it install at the same time than, for example, tornado? If someone doesn't have internet access and can't do a pip install streamparser, they won't be able to do pip install requirements.txt either...
  • Regarding requirements.txt... If the goal is to have APy itself installable via pip, wouldn't it be better to use something like setuptools to make the different dependencies explicit (runtime, development/test) in the setup.py so the package that gets published automatically retrieves all dependencies?

In any case, I personally like a lot the refactor (I'll look into it with more deeply to try to provide more useful feedback), and I'd even go further (moving all Handlers to an apy.handlers module, etc).

In any case, great improvement!

@xavivars
Copy link
Member

@TinoDidriksen: what's the benefit of packaging it as a deb/rpm if it can be just installed as a python package than can be used in any python virtual environment in an easy way? (I'm asking because I'm sure there are benefits that I'm not aware of)

@TinoDidriksen
Copy link
Member

Ping @kartikm - does Wikimedia critically want the Debian package for APy, or can you live with PyPi?

@sushain97
Copy link
Member Author

Everything has to work with whatever dependencies can be had from apt-get on the target distros (currently all supported and upcoming editions of Debian and Ubuntu).

Presumably you can apt-get specific versions of deps, yes? For example, apt-get Tornado 4.5.x instead of 5.x.

@sushain97
Copy link
Member Author

If streamparser is installable via pip, what would we want to make it optional? Wouldn't it be easier to just include it with the other dependencies and have it install at the same time than, for example, tornado? If someone doesn't have internet access and can't do a pip install streamparser, they won't be able to do pip install requirements.txt either...

Well, I think the philosophy is to make the only hard requirement Tornado. That's why chardet and cld2 are also soft dependencies that will simply disable functionality if not installed.

@sushain97
Copy link
Member Author

Regarding requirements.txt... If the goal is to have APy itself installable via pip, wouldn't it be better to use something like setuptools to make the different dependencies explicit (runtime, development/test) in the setup.py so the package that gets published automatically retrieves all dependencies?

Indeed. That shall be done in the setup.py :) Right now, requirements.txt is the union of all meaningful deps.

@sushain97
Copy link
Member Author

In any case, I personally like a lot the refactor (I'll look into it with more deeply to try to provide more useful feedback), and I'd even go further (moving all Handlers to an apy.handlers module, etc). In any case, great improvement!

Thanks! :) I agree, some even more in depth code reorganization would be nice. That will come after the big changes are taken care of (mostly done).

@ftyers
Copy link
Member

ftyers commented Mar 24, 2018

One of the best things about apy is that it doesn't require all the python setup.py stuff. The original idea was to have a simple standalone server (servlet.py), I appreciate some kind of feature creep and I think that apy is a nice piece of software, but would like to avoid the typical python installation bloat.

Incidentally would it be possible to do whatever you want regarding build/configuration with autotools ?

@sushain97
Copy link
Member Author

sushain97 commented Mar 24, 2018

There will be no functional change for backwards compatibility. If you don't want to use standard installation steps, you can clone the repo and run servlet.py. The current state of this code is really awful and it makes me not want to work on fixing bugs or improving it, hence this massive refactor. There are also lots of bugs and undocumented behavior that a refactor like this uncovers...

Incidentally would it be possible to do whatever you want regarding build/configuration with autotools ?

There isn't any new build or configuration.

@sushain97
Copy link
Member Author

The original idea was to have a simple standalone server (servlet.py),

Simple is small and fine if we don't care about things like performance or stability.

$ cat apertium_apy/*.py | wc -l
    3134

a ton of that isn't just feature creep but also required for holding pipelines open, locking them properly, dropping them when required, restarting as required, etc. And it's insanely annoying right now to find where you want to be in the code... :(

@sushain97
Copy link
Member Author

sushain97 commented Mar 24, 2018

Ok, almost all the todo are complete. This diff is already really massive so before more fine grained refactoring in the code, I'd like to merge this into master.

Thoughts? (would appreciate a 👍 if you have no objections)

@sushain97
Copy link
Member Author

$ pip install apertium-apy
$ apertium-apy

will now Just Work to run APy (if you already have Apertium, of course).

@sushain97
Copy link
Member Author

sushain97 commented Mar 25, 2018

Caching /tmp/languages has improved build time by ~30% \o/

@sushain97
Copy link
Member Author

OK, I think everything I originally planned for this particular PR is now complete. Will merge tomorrow afternoon/evening after any concerns raised are resolved.

@unhammer
Copy link
Member

Seems to work just fine for me without having to use virtualenv/pip etc. for anything; this looks great =D

@sushain97 sushain97 merged commit ed088c1 into master Mar 25, 2018
@sushain97
Copy link
Member Author

sushain97 commented Mar 25, 2018

Awesome! This has been merged into master, tagged with v0.11.0.rc1 and released to PyPi.

Issues have been created for the remaining 4 items and I will release v0.11.0 soon (day-scale) if there are no problems reported.

@kartikm
Copy link
Member

kartikm commented Mar 27, 2018

@TinoDidriksen Pypi isn't good for Debian packages or WMF repository. Same as, #91 (comment)

@LuccoJ
Copy link

LuccoJ commented Mar 27, 2018

I find it is in general a bad idea to have hardcoded "ceiling" version requirements. Pip, Virtualenv, Docker all seem to encourage this by saying "hey, who cares if there's several separate versions of this library installed? They can all coexist, and even if some are old enough to have serious security issues, they're running isolated!".

But Virtualenv isn't meant to provide security, and Docker provides it to some extent thanks to the kernel infrastructure it employs, but it's not its primary goal, nor is it sensible in the long run to make software insecure in the knowledge it will (perhaps?) be run under an isolated sandbox.

So in my opinion, maintained software meant for servers should generally keep doing what it has always done: interface correctly with all the versions of dependencies it requires from the latest to the oldest one that is still maintained, distributed and not shown to be critically insecure.
This allows packaging it for major distributions of Unix, as well as using the PyPi infrastructure... just, not as a shortcut to depend on specific (and quite possibly non-ideal) versions of things.

@sushain97
Copy link
Member Author

sushain97 commented Mar 27, 2018

I find it is in general a bad idea to have hardcoded "ceiling" version requirements.

To my knowledge, APy behaves very badly on Tornado 5.0 so the setup.py doesn't ask for it here. When that issue is fixed, the ceiling will be lifted. @TinoDidriksen correct me if I'm wrong regarding its behavior.

They can all coexist, and even if some are old enough to have serious security issues, they're running isolated!".

Well, as far as APy is concerned, Tornado 4.X is still well supported and maintained so there shouldn't be any security issues that could be fixed by lifting the current ceiling. As far as I can tell, there's no dependence on non-ideal versions; the ceiling is the latest version of 4.5 (released less than 2 months ago).

So in my opinion, maintained software meant for servers should generally keep doing what it has always done: interface correctly with all the versions of dependencies it requires from the latest to the oldest one that is still maintained, distributed and not shown to be critically insecure.

I don't oppose a PR that makes APy compatible with Tornado version X as long as it doesn't clutter the code significantly and someone actually uses APy with Tornado version X. The last release to Tornado v3 was almost 4 years ago so I think if someone wants to use APy with it, fine. They can use APy v0.10.0 :)

@TinoDidriksen
Copy link
Member

APy doesn't run at all with Tornado 5.0, which I guess counts as behaving badly.

Continue this discussion in ticket #99

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants