Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH-FIX: change distutils to setuptools and add flexible cythonization #949

Merged
merged 121 commits into from Jul 12, 2016
Merged

ENH-FIX: change distutils to setuptools and add flexible cythonization #949

merged 121 commits into from Jul 12, 2016

Conversation

sem-geologist
Copy link
Contributor

@sem-geologist sem-geologist commented Apr 9, 2016

this code fixes two things:

  1. changes distutils to setuptools
  2. adds the flexible cythonization.

first is straight forward, but second have to be discussed.
Firstly lets consider these points:

  • User can obtain hyperspy in few different ways: git, binary distribution, source distribution.
  • It is highly recommended by cython documentation to distribute C files generated from cython as this makes cython not required during installation and prevents undefined behavior resulting from different versions of cython.
  • cython generated C files are huge, complicated, contains lots of "boiler plates" is quite ugly and hardly readable. It is not strange that it is not very welcome in the git source tree.
  • At this moment there is one library pending (bcf io_plugin) which have some functions implemented in cython, but alternative python code is also present, so even in worst scenario lack of cython should not be an obstacle for installment.

Considering above the setup should meet such criteria:

  • do not have hard dependency on cython
  • automatically check if *.pyx files have c files with same name, if not:
    • try to import cythonize from Cython and use on the pyx files
    • if there is no cython present on the system, user should be informed and present of alternatives, but the installation should be not terminated as without cython code hyperspy is still fully functional, just in some minute parts where cython code would be missing it will be a bit slow.
  • if there is cythonized c files present, do not cythonize the code, unless the new "recythonize" command is used: python setup.py recythonize. special user option are added to python setup.py install --force-cythonization (or to develop which will be used for testing env)
  • for development purpose if source contains '.git' - generate a post-checkout hook which removes only cythonized '.c/.cpp' and compiled '.so' or '.pyd' files.

TODO:

  • add trial to cythonize extensions if c/c++ files are missing
  • add new command "recythonize" to be used with setup.py to recythonize all extensions
  • update CI and anaconda configs
  • add git hook to automaticaly recythonize and rebuild --inplace after git checkout
  • add travis ci testing for osx
  • add bdist_wheels deployment to release from travis ci (for osx) and sdist (for linux and other)
  • test travis release works
  • add bdist_wheels deployment to release from appveyor
  • test bdist_wheels deployment to release works from appveyor (it does not deploy but it builds and preserves the artefacts at appveyor)
  • test python setup.py install works without compiler present (ubuntu live usb: passed; appveyor: too much interfered setup; windows 7 with fresh winpython + traits(from gohlke's place): passed)


import distutils.dir_util
import setuptools
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, this could have been untouched as only thing it is used is to clean build folder.
Alternatively it could be from setuptools import distutis.dir_util.

just first part...
dir_utils is in distutils which looks like is in setuptools,
but setuptools have just a link to distutils, so 
```from setuptools import distutils.dir_utils``` works
add temporary file to test the cython work-chain in building testing and CI environmnets
@francisco-dlp
Copy link
Member

This does more than it advertises. In addition to re-switching to setup tools (this was done in #849, but possible it was reverted in a merge) this adds functions to cythonize cython extensions and a test to check that the cython extensions are correctly installed. That's fine, but could you edit the description of this PR to better reflex what it does?

The cythonize functionality seems to WIP. Let us know when this is ready for review.

subclased and add the check or force mechanism to (re)cythonize
*.pyx files while python setup.py install;
added optional --force-cythonization user flag
@sem-geologist
Copy link
Contributor Author

just upgraded the cythonization during installation:

  1. it checks if there is c file for every pyx file, if not then calls cythonize, if there is no cython library it prints warning about it but does not stop.
  2. install have additional user option --force-cythonization if i.e. user have newer/older cython/c and wants to recythonize existing c code.

I added dummy pyx in misc dir to check how cythonization works. It is either to be removed in the end of this feature development, or I could add some testing in tests, to see it works properly.

@sem-geologist sem-geologist changed the title ENH-FIX: change distutils to setuptools ENH-FIX: change distutils to setuptools and add flexible cythonization Apr 11, 2016
try:
from Cython.Build import cythonize
except ImportError:
print("""WARNING: cython required to generate fast c code is not found on this system.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use a proper warning instead i.e. warnings.warn

@francisco-dlp
Copy link
Member

Installing without Cython currently fails like this:

jd29@SATELLITE-PRO-C850-10N:~/git/hyperspy$ ana3-python3.5 setup.py build_ext --inplace --force-cythonization
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help

error: option --force-cythonization not recognized
fjd29@SATELLITE-PRO-C850-10N:~/git/hyperspy$ ana3-python3.5 setup.py install --force-cythonization
running install
cythonizing the *.pyx source
WARNING: cython required to generate fast c code is not found on this system.
Only slow pure python alternative functions will be available.
To use fast implementation of some functions writen in cython either:
a) install cython and re-run the installation,
b) try alternative source distribution containing cythonized C versions of fast code,
c) use binary distribution (i.e. wheels, egg).
Traceback (most recent call last):
  File "setup.py", line 254, in <module>
    "Topic :: Scientific/Engineering :: Physics",
  File "/usr/lib/python3.5/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib/python3.5/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.5/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "setup.py", line 106, in run
    orig_install.run()
TypeError: run() missing 1 required positional argument: 'self'

@francisco-dlp
Copy link
Member

And with Cython installed it fails exactly in the same way.

@sem-geologist
Copy link
Contributor Author

please, patience, it is still work in progress
:)
The sad thing is that both - distutils and setuptools are not well documented (particularly sub-classing, customization and such stuff, so it is a bit like blind guy trying to find out the exit in the maze)

@sem-geologist
Copy link
Contributor Author

@francisco-dlp , so at this moment it should work. If you check the travis logs (under the 340b769
commit), you will see it prints the warning (I will change it to use warning.warn), because travis have no cython installed in its environment at this moment. So that is how it will look, if somebody obtains a source code straight from github, but will have no cython installed. Now I am going to modify the travis to install cython. And pyx files should be compiled then.

I can remove the test file in the end before the merge, or move it to tests. Now I want to be sure pyx files in hyperspy normal directories can be compiled.

@sem-geologist
Copy link
Contributor Author

oh, and by the way, build_ext will not work at this moment as it is still not subclassed, just install is subclassed at this moment. Also build_ext --inplace using the setuptools can be changed rather to develop as it does the same thing

@francisco-dlp
Copy link
Member

I think that travis should indeed always cythonize. In this way, we'll know if a new version of Cython breaks something.

Regarding the test, I think that it'll be best to move it to hyperspy/tests as, in case that some other Cython extension gets broken, if this one passes we'll know that it is most likely not the fault of setup.py.

Probably it'll be also good to add an extra travis and appveyor build to install without the cython extensions to verify that everything goes well too in this case.

Finally, I think that this requires an entry in the developers guide explaining best practices for including Cython code in HyperSpy and how to install in devel mode. Now that we have the Developers' Guide maybe we should move this to the devel guide and updated with instruction on how to properly do this now that we include Cython code too?

@sem-geologist
Copy link
Contributor Author

To add thing to development guide is a very splendid idea.
However at the moment I feel a bit like a blind person in the maze. I need to sort out what is the best approach to this case because I start to think I get myself into the trap. Probably I need to move the stuff away from install to build_ext, as that is called by all commands (build, develop, install). Setuptools are very highly using distutils (and actually if build_ext is called it is mainly from distutils), and I need to get more familiar how distutils works (trying to figure out by reading source code). I wonder how is it possible that documentation on such important python toolchain is so lacking.

@francisco-dlp
Copy link
Member

I agree that the documentation on this issue is not great.

I like sklearn's approach: cythonize always except when PKG-INFO exists.

@sem-geologist
Copy link
Contributor Author

ok, I am going to modify, the appveyor: instead of smoking up that file, I will rename it, and after rename it back (hopefully). I am testing stuff now on separate branch (as I am not good in powershell - lots of trial and error going there :) ). I will merge it back to this branch, so you could start reviewing changes and test it out how it works when compiler is not present (setup.py modified, and new minimal dummy c file created in /hyperspy/tests/misc/test_compilers.c), and in mean time I will try to do renaming of that file forward and back tricks (The winpython/ in appveyor stuff).

@francisco-dlp
Copy link
Member

francisco-dlp commented Jul 10, 2016

Sounds good, just let me know when this is ready for review.

add test to check for compiler presence
@sem-geologist
Copy link
Contributor Author

sem-geologist commented Jul 10, 2016

I merged the changes to setup.py and minimal c file, however appveyor winpython have problems if I leave the pysitutil.cfg file file. It behaves differently than linux wihout gcc, where on linux if extensions is empty list it will not try to use compiler, while on winpython it tries to use mingw compiler anyway, even if extensions is set to just empty list. Maybe it is winpythons modifications. I will try to change empty list to None, will see where it will lead (tried on linux it works same as with empty list, maybe winpython will take it)

Installation will continue in 10 sec...""")
extensions = []
from time import sleep
sleep(10) #wait 10 secs for user to notice the message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why waiting 10 s? As the user has to take no action I would simply print the warning and continue...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if terminal used have some limit (often 1000 lines, by default) it "flies" away. this will wait for 10 sec only if compiler is missing. if it is here, no waiting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I have results from winpython... it is bad, no matter to what extensions are set (empty list or None) if keyword ext_modules is provided to setup() then winpython anyway tries to fire up compiler :/. So I don't know what to do now as it is "impossible" to trick winpython from touching compiler.

@sem-geologist
Copy link
Contributor Author

And I have results from winpython... it is bad, no matter to what extensions are set (empty list or None) if keyword ext_modules is provided to setup() then winpython anyway tries to fire up compiler :/. So I don't know what to do now as it is "impossible" to trick winpython from touching compiler.

@francisco-dlp
Copy link
Member

I think that the easiest solution is to put all the keyword arguments in a dictionary and simply don't add the ext_modules key if there is not compiler e.g.

keyword_args = {}
setup(**keyword_args)

May that work?

@sem-geologist
Copy link
Contributor Author

sem-geologist commented Jul 10, 2016

:) exactly what I am trying to do now :D

@sem-geologist
Copy link
Contributor Author

the revolution in setup.py didn't helped... I think problem is lying somethere else. I made abranch from master, adjusted the appveyor to fit for faster testing of winpython. master have no problems to get installed on winpython. I will add stuff step by step to find out where the problem is introduced. i.e. I skiped all conda stuff and tried to go straight to winpython steps, and it fails (many times trying to use compiler...), however if conda stuff is not comented it works (that why I suspect compiled stuff from conda is coppied to winpython? so basically, if somebody will want to install hyperspy on bare minimum winpython, it will fail with dependencies :/

add test_compilers.c
instead of delting rename before
install the pydistutil.cfg
and back after install
in appveyor script
@sem-geologist
Copy link
Contributor Author

sem-geologist commented Jul 11, 2016

I didnt moved to keywords as it does not fix issue, however I think I fail to simulate compiler'less environment on appveyor winpython, due to it reusing condas stuff... unfortunately I have no winpython environment (windows) to test that now. If I comment conda stuff at appveyor script (at fresh master branch), then winpython behaves similary as python in ubuntu: it tries to compile traits, and fails. I guess with conda ennabled winpython reuse condas distutils which internally have msvc compiler presence coded, and so my test thinks compiler is present (I guess, it is hard to tell what is really going on) but then winpython forces to use mingw and so it gets messed. That of course do not influence building the nsis installer as I made the renaming of config before winpython installs hyperspy, and rename back just before nsis start building.

So basically I think there is little I can do to this any more... I think it is time to merge

@sem-geologist
Copy link
Contributor Author

sem-geologist commented Jul 11, 2016

@francisco-dlp , I just tested installation of master on fresh windows system (7) with freshly installed winpython. It fails to install because of traits needs to be compiled. So ubuntu for now is the only system where limited hyperspy version is actual (as there is python-traits binary package there available). Unfortunately installation of hyperspy as library on winpython requires compiler, and I just want to make 200% sure that winpython does not include any compiler! Also I think the points I stated yesterday is valid, and this can be merged.
update: oh and it is very likely that wheels will not help us here, as traits is also dependent there.
update2: hmm unleass to use famous goklke's unofficial binaries... which do not install through winpython's control panel but only by cmdline pip, however it does install :)...

update3: I am not very happy about reduction dependancies in setup.py, just to get to know about them after installation trying out hyperspy: natsort

update4: tried installing bundle, so far so good. wait traitsui not implemented error if hs.load();
but hs.load('pathtofilename') works...

@sem-geologist
Copy link
Contributor Author

looks that installing from wheel have all dependencies needed. only traits have to be installed from gohlke's site. I think I will open new issue/pull request to improve instructions from this point

so winpython could install minimal
"pure" python version of hyperspy
@sem-geologist
Copy link
Contributor Author

@francisco-dlp , @to266 , finally I tested installing this branch in windows environment without any compiler present on winpython and it was a success (after some minor modifications which I just pushed, catching msvc missing). the weird thing is that on my machine ,differently than on appveoyr, distutil complained about msvc missing and not the mingw. Anyway this is now completely tested and safe to use.

@francisco-dlp francisco-dlp merged commit 735042c into hyperspy:master Jul 12, 2016
@francisco-dlp
Copy link
Member

Thanks for all the effort and attention to detail that you've put into this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants