Collaborator

ruairif commented Nov 8, 2016

Add compatibility function for some tests
Add fallback if no C extensions installed
Fix comment parsing in C extension
Handle comparison to None in Python 3 in TextRegionExtractor
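
The fallback item above can be illustrated with a minimal sketch, assuming a loader that tries candidate modules in order. The helper `load_first_available` and the module name `_hypothetical_compiled_parser` are illustrative assumptions, not names from this PR; the stdlib module `json` only stands in for a pure Python implementation.

```python
import importlib


def load_first_available(module_names):
    """Return the first module in module_names that imports cleanly."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Compiled module missing (for example, never built): try the next.
            continue
    raise ImportError("none of %r could be imported" % (module_names,))


# Hypothetical names: the first stands for a compiled extension that is not
# installed, the second (a stdlib module) stands in for the pure Python code.
parser = load_first_available(["_hypothetical_compiled_parser", "json"])
print(parser.__name__)
```

Trying the compiled module first keeps the fast path as the default, while an install without a compiler still works.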

try:
    utext = unicode
except NameError:
    class utext(str):
Member

Is it for doctests? scrapely uses the doctest-ignore-unicode nose plugin (https://pypi.python.org/pypi/doctest-ignore-unicode) to handle u prefixes in doctests; is this custom function needed?
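
For context on the question above, here is one way such a shim could behave. The class body is an assumption for illustration only, since the diff excerpt does not show it: it makes a Python 3 string print with a `u` prefix so that doctest output written for Python 2 still matches.

```python
try:
    utext = unicode  # Python 2: the built-in type already reprs as u'...'
except NameError:
    class utext(str):
        """Hypothetical Python 3 stand-in whose repr mimics Python 2 unicode."""

        def __repr__(self):
            # Prepend the u prefix to the normal str repr.
            return "u" + str.__repr__(self)


print(repr(utext("abc")))
```

A doctest plugin that ignores the prefix achieves the same goal without a custom type, which is the trade-off the reviewer is asking about.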

tox.ini Outdated

[tox]
-envlist = py27,py33,py34
+envlist = py27,py34
Member

It'd be nice to add tox environments both for compiled and non-compiled versions of the code; one of them is untested otherwise.

Collaborator Author

ruairif commented Nov 10, 2016

@kmike I've addressed your comments.

ruairif force-pushed the python3-support branch 3 times, most recently from 3ae5c93 to 4771ff4, on November 14, 2016 09:34
Member

kmike commented Nov 14, 2016

@ruairif do you know why CI is failing?

ruairif force-pushed the python3-support branch 2 times, most recently from 7f63c88 to a7f15c2, on November 14, 2016 13:19
    ext_modules=cythonize(extensions),
    install_requires=['numpy', 'w3lib', 'six'],
    extras_require={
        'speedup': ['cython']
Member

What is it for? If a user installs scrapely[speedup], there won't be any speedup for the user, right?

Collaborator Author

If Cython isn't installed, it will build the C extension from the included _htmlpage.pyx file. If you're downloading from PyPI and the _htmlpage.c file is included, that will be used to create the extension instead.
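
A common packaging pattern for this kind of choice is to translate the .pyx file when Cython is importable and otherwise compile a pre-generated .c file if one ships with the package. The sketch below shows that generic pattern only; it is not taken from this PR's setup.py, and the helper `pick_source` and its `exists` parameter are hypothetical names introduced for illustration.

```python
import os

try:
    import Cython  # noqa: F401  (only probing for availability)
    HAVE_CYTHON = True
except ImportError:
    HAVE_CYTHON = False


def pick_source(base_path, have_cython, exists=os.path.exists):
    """Return the source file a build could use, or None for pure Python."""
    pyx, c_file = base_path + ".pyx", base_path + ".c"
    if have_cython and exists(pyx):
        return pyx      # Cython can translate the .pyx file itself
    if exists(c_file):
        return c_file   # pre-generated C needs only a C compiler
    return None         # nothing to compile: rely on the Python fallback


print(pick_source("_htmlpage", HAVE_CYTHON))
```

Passing `exists` as a parameter keeps the decision testable without touching the file system.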

    from Cython.Build import cythonize
    extensions = cythonize(extensions)
if IS_PYPY:
    extensions = []
Member

If both IS_PYPY and USE_CYTHON are True then extensions will be cythonized, but not used. I think it makes sense to either respect USE_CYTHON in PyPy (their cpyext layer is improving, so maybe it compiles and speed is not worse), or avoid compiling the extension if IS_PYPY is True.

Collaborator Author

You're right, it shouldn't build with Cython when using PyPy. With the compiled extension the code runs 10 times slower than without it, so it is better for PyPy not to use the extension at this time.
The PR has been updated to reflect this.
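
The ordering discussed in this thread can be sketched as follows, assuming the PyPy check runs before any cythonizing so nothing is compiled and then discarded. The function `select_extensions` and its parameters are hypothetical; the `cythonize` argument stands in for `Cython.Build.cythonize` so the sketch runs without Cython installed.

```python
import platform

# True when running under PyPy; name mirrors the flag used in the discussion.
IS_PYPY = platform.python_implementation() == "PyPy"


def select_extensions(extensions, is_pypy, use_cython, cythonize=None):
    """Decide which extension modules a build should receive."""
    if is_pypy:
        # Skip the extension entirely: no cythonize call, no compilation.
        return []
    if use_cython:
        return cythonize(extensions)
    return list(extensions)


calls = []


def fake_cythonize(exts):
    # Stand-in that records its input instead of generating C code.
    calls.append(list(exts))
    return list(exts)


print(select_extensions(["_htmlpage"], True, True, fake_cythonize))
print(calls)
```

Checking the interpreter first means both flags being true no longer triggers unused work.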

Test Python parsing implementation
Fall back to pure Python parser if Cython is not available
kmike merged commit a1eb99e into master on Dec 20, 2016
ruairif deleted the python3-support branch on December 21, 2016 09:34