-
Notifications
You must be signed in to change notification settings - Fork 273
Python3 support #98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python3 support #98
Conversation
Add compatability function for some tests Add fallback if no c extenstions installed Fix comment parsing in c extension
scrapely/compat.py
Outdated
| try: | ||
| utext = unicode | ||
| except NameError: | ||
| class utext(str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it for doctests? scrapely uses https://pypi.python.org/pypi/doctest-ignore-unicode nose plugin to handle u prefixes in doctests; is this custom function needed?
tox.ini
Outdated
|
|
||
| [tox] | ||
| envlist = py27,py33,py34 | ||
| envlist = py27,py34 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to add tox environments both for compiled and non-compiled versions of the code; one of them is untested otherwise.
|
@kmike I've addressed your comments. |
3ae5c93 to
4771ff4
Compare
|
@ruairif do you know why is CI failing? |
7f63c88 to
a7f15c2
Compare
| ext_modules=cythonize(extensions), | ||
| install_requires=['numpy', 'w3lib', 'six'], | ||
| extras_require={ | ||
| 'speedup': ['cython'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is it for? If an user installs scrapely[speedup] there won't be any speedup for the user, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If cython isn't installed it will build the c extension from the included _htmlpage.pyx file. If you're downloading from pypi and the _htmlpage.c file is included that will be used to create the extension instead.
| from Cython.Build import cythonize | ||
| extensions = cythonize(extensions) | ||
| if IS_PYPY: | ||
| extensions = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If both IS_PYPY and USE_CYTHON are True then extensions will be cythonized, but not used. I think it makes sense to either respect USE_CYTHON in PyPy (their cpyext layer is improving, so maybe it compiles and speed is not worse), or avoid compiling the extension if IS_PYPY is True.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it shouldn't build with Cython when using pypy. The performance when using the compiled extension is 10 times slower than without so it is better for pypy to not use the extension at this time.
PR has been updated to reflect this.
Test python parsing implementation Fallback to pure python parser if no cython available
a7f15c2 to
cf0ee1b
Compare
Add compatability function for some tests
Add fallback if no c extenstions installed
Fix comment parsing in c extension
Handle comparison to None in python 3 in TextRegionExtractor