Python 3 support #263

Closed
extesy opened this issue Mar 6, 2013 · 65 comments

@extesy

extesy commented Mar 6, 2013

Python 3 is several years old and most packages now support it (even Django!). It would be really nice to support it in Scrapy as well.

@artemdevel

artemdevel commented Mar 7, 2013

Scrapy uses Twisted at its core, so Python 3 support depends, at the very least, on Twisted supporting Python 3. The Twisted development team has a project to port Twisted to Python 3 and it is in progress, so I think that as soon as Twisted is ported, Scrapy will have a good chance of being ported as well.

@todoit

todoit commented Apr 25, 2013

mark

@nramirezuy
Contributor

nramirezuy commented Apr 25, 2013

we are waiting for http://www.python.org/dev/peps/pep-3156/

@estin

estin commented Apr 29, 2013

For Python 3, I am developing
https://bitbucket.org/estin/pomp
It is like Scrapy, but very small, unstable, and without a hard Twisted dependency.

@coodoing

coodoing commented May 7, 2013

mark'
The latest development branch, 0.17, does not support py3.

@ariddell

ariddell commented May 16, 2013

@nramirezuy there's a reference implementation for pep 3156 here: https://code.google.com/p/tulip/

@muelli

muelli commented Jun 19, 2014

Is there a list of what parts of Twisted are used?
Twisted has a Python 3 migration plan here: http://twistedmatrix.com/trac/wiki/Plan/Python3
It might be worthwhile to investigate whether the used parts of Twisted are already ported.

@txtsd

txtsd commented Oct 23, 2014

Can scrapy not be made to work with python 3, now that asyncio is available?

@curita curita added this to the Scrapy 1.0 milestone Feb 11, 2015
@pablohoffman pablohoffman modified the milestones: Scrapy 1.1, Scrapy 1.0 Mar 5, 2015
@ianozsvald

ianozsvald commented Mar 18, 2015

+1 for Python 3.4 support. After a year using Python 3 (mainly sklearn, numpy, Anaconda, matplotlib, networkx etc) this is the first blocker I've had forcing me to downgrade.

The only other Python2.7-only project that I'm lightly using is Apache Spark and 3.4+ support is scheduled for their next release. In their issue tracker I posted some stats for Python 3 adoption - roughly speaking it is ">40%" (accepting the self-selected group of survey participants):
https://issues.apache.org/jira/browse/SPARK-4897?focusedCommentId=14303154&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14303154

@kmike
Member

kmike commented Mar 18, 2015

@ianozsvald we are working on it, it is a priority :)

Scrapy is the worst kind of project to port to Python 3: it depends on Twisted (which is not ported to Python 3 yet, though some subset of Twisted works), and it works at the boundary between the outside world and the Python world, so there are many questions about unicode. The "outer world" Scrapy works with is wild: there is no well-defined encoding we can decode/encode data from/to. Encoding rules are sometimes crazy, e.g. browsers (which Scrapy aims to emulate) can use different charsets for different parts of a single URL, e.g. cp1251 for /path and utf-8 for GET parameter values. I've ported a lot of code to Python 3 (including most of NLTK and tens of other Python packages), but I am still getting porting details wrong for Scrapy (e.g. #837 is wrong).

Some parts of Scrapy are already ported to Python 3. We're running tests for Python 3.3 on Travis to prevent regressions; ~240 tests pass in 3.3, out of ~1000. There is a GSoC project to port Scrapy to Python 3.x; I think we should make good progress this summer.
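
To make the mixed-charset URL case above concrete, here is a minimal Python 3 sketch using only the standard library. It is not Scrapy code: `build_mixed_url`, the example host, and the sample path and parameter are hypothetical names chosen for illustration. It percent-encodes the path as cp1251 and the query values as utf-8, then shows that each part only round-trips when decoded with the charset it was encoded in.

```python
from urllib.parse import parse_qs, quote, unquote, urlencode, urlsplit


def build_mixed_url(base, path, params,
                    path_encoding="cp1251", query_encoding="utf-8"):
    """Percent-encode the path and the query string with different charsets.

    Hypothetical helper for illustration only; not part of Scrapy.
    """
    encoded_path = quote(path, safe="/", encoding=path_encoding)
    encoded_query = urlencode(params, encoding=query_encoding)
    return f"{base}{encoded_path}?{encoded_query}"


# The same Cyrillic word is used in the path and as a GET parameter value.
url = build_mixed_url("http://example.com", "/путь", {"q": "путь"})
print(url)  # pure ASCII: one byte per letter in the path, two in the query

parts = urlsplit(url)

# Decoding each part with the charset it was encoded in recovers the text.
decoded_path = unquote(parts.path, encoding="cp1251")
decoded_query = parse_qs(parts.query, encoding="utf-8")

# Decoding the path with the default (utf-8, errors="replace") does not:
# the cp1251 bytes are not valid utf-8, so replacement characters appear.
wrongly_decoded_path = unquote(parts.path)
```

Nothing in the URL itself says which charset applies to which part, which is why a single "decode everything as utf-8" rule cannot be correct for every site.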

@kmike
Member

kmike commented Mar 18, 2015

There is also a Scrapy dependency, https://github.com/mitmproxy/mitmproxy, which doesn't have Python 3 support yet, but it is used only in tests.

@ianozsvald

ianozsvald commented Mar 19, 2015

@kmike Hey Mikhail! You are a man of many projects :-) Glad to hear it is being worked on, I didn't get that impression from the early parts of this thread and couldn't see any other porting docs. I quite agree that this project (just like Flask et al.) is going to be hard, dealing with the interface to the outside world is horrid. I certainly didn't know that URLs themselves could have mixed encodings :-(
Given the continual migration to Python 3 for personal projects (50/50 according to the survey I linked vs Python 2.7) and >40% for work, the need for scrapy's Py3 support is only going to get stronger. Bonne chance!

@pbronez

pbronez commented Mar 23, 2015

+1 for Python 3 support! Thanks for the hard work you guys are putting into it, hope GSoC goes well.

@nuschk

nuschk commented Apr 20, 2015

👍 as well, would really love to be able to use python 3 with scrapy! And many thanks for your effort!

@vmarkovtsev

vmarkovtsev commented May 19, 2015

You can use my patches with ported twisted.web.client.Agent and friends from my fork.

@nyov
Contributor

nyov commented Jul 14, 2015

Are there still outside blockers for porting to python3? (twisted libs, etc.?)
Would love to see a list of those, if one has been made.

Also, in the name of eventual portability (e.g. asyncio?) how do people feel about dropping dependencies on twisted for the web/downloader part? I recall there was a gsoc idea for this?
Would be interesting to see if a downloader using pycurl bindings might work with twisted here.
(Though pycurl has no cffi bindings at this time, so no pypy support.)

@curita
Member

curita commented Jul 14, 2015

There's a comprehensive status of the twisted dependencies in Berker's proposal. @berkerpeksag, would you mind if we put it up on our wiki for reference?

@berkerpeksag
Contributor

berkerpeksag commented Jul 14, 2015

Sure, but that list is a bit outdated. For example, twisted.web.static has already been ported to Python 3. You may want to check twisted/python/dist3.py first.
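
One rough way to answer the "which parts are already ported?" question on a given interpreter is simply to try importing the modules a project relies on. The sketch below is an assumption-laden illustration, not an official check: `check_importable` is a hypothetical helper, and the module list is only an example (it is not an inventory of what Scrapy actually uses). A successful import also does not prove that a module behaves correctly on Python 3; it only rules out the most basic failures.

```python
import importlib


def check_importable(module_names):
    """Map each dotted module name to None (import worked) or an error string.

    Hypothetical helper for illustration; run it under the target interpreter.
    """
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError, SyntaxError in py2-only code, ...
            results[name] = f"{type(exc).__name__}: {exc}"
        else:
            results[name] = None
    return results


# Example module names only (assumption): adjust to what your project imports.
modules_in_use = [
    "twisted.internet.defer",
    "twisted.web.client",
    "twisted.web.static",
]

for name, error in check_importable(modules_in_use).items():
    status = "ok" if error is None else f"FAILED ({error})"
    print(f"{name}: {status}")
```

If Twisted is not installed at all, every entry is reported as failed, so the output is only meaningful in an environment where the dependency is present.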

@curita
Member

curita commented Jul 14, 2015

Will do, thanks!

@curita
Member

curita commented Jul 14, 2015

@nyov
Contributor

nyov commented Jul 14, 2015

Thank you both!

@tonal

tonal commented Jul 15, 2015

@curita
Member

curita commented Jul 15, 2015

That's great news, thanks for reporting! I just updated the wiki.

@ianozsvald

ianozsvald commented Jul 23, 2015

Hello all. I've got a Lightning Talk on Python3.5 at my next PyDataLondon meet (200+ data scientists in the room). Someone is bound to ask about scrapy/twisted on Python 3.4+, could someone comment on the current state? It isn't clear to me from the links above if enough of twisted has been ported for scrapy to run on Python 3 (or will soon)?

@redapple redapple added this to the v1.1 milestone Jan 27, 2016
@redapple
Contributor

redapple commented Jan 27, 2016

Basic support is planned for v1.1
And we plan to make it more robust for v1.2

@vmarkovtsev

vmarkovtsev commented Jan 27, 2016

👍

@stonebig

stonebig commented Jan 27, 2016

Great! Is there a document that estimates the rough timeline of these two milestones? Spring 2016 and summer 2016?

@redapple
Contributor

redapple commented Jan 27, 2016

@stonebig , we plan on releasing Scrapy 1.1 officially by the end of February 2016 (with a candidate release at least in the next few days)
Scrapy 1.2 would be a couple of months after that (we hope)

@stonebig

stonebig commented Jan 27, 2016

thanks a lot for this information, @redapple !

@KeremTubluk

KeremTubluk commented Feb 2, 2016

Six days have come and gone, @redapple!

:)

@redapple
Contributor

redapple commented Feb 2, 2016

@KeremTubluk , we're not quite there yet: https://github.com/scrapy/scrapy/milestones/v1.1

@manugarri

manugarri commented Feb 4, 2016

aand it's official now. http://doc.scrapy.org/en/stable/news.html#id1

@ghost

ghost commented Feb 29, 2016

it seems that the twisted already supports py3.3+

@d0ugal

d0ugal commented Feb 29, 2016

it seems that the twisted already supports py3.3+

@ABSmiLT Yeah, AFAICT Twisted only recently supported 3 well enough for Scrapy. Hence all the discussion above and in the docs.

@kmike
Member

kmike commented Feb 29, 2016

@ABSmiLT we've released scrapy 1.1rc1 with alpha-level Python 3 support about a month ago. 1.1rc2 will be released soon; it fixes several Python 3 compatibility issues we've found while testing 1.1rc1.

@ghost

ghost commented Mar 18, 2016

thanks for informing, @kmike @d0ugal
looking forward to the new stable version compatible with py3

@redapple
Contributor

redapple commented May 11, 2016

Scrapy 1.1.0 is on PyPI, with Python 3 support (finally).

What took you so long?

See release notes.

Have fun!

@nyov
Contributor

nyov commented May 11, 2016

Are we there yet? are we there yet are wethereyetarewethere — 💥 ...eh, WHAT?
😭 finally!

Congrats. Great job there keeping up the backports, @redapple. (Probably made a nice bang there, punching that "close"- button just then, right?)
And Thanks, everyone.

@fish-ball

fish-ball commented May 12, 2016

Congratulations!!!!

@ghost

ghost commented May 16, 2016

@redapple FINALLY!
thanks @ all

btw, the "PY3 95%" label is now out of date?

@kmike
Member

kmike commented May 16, 2016

@ABSmiLT no, it means 95% of tests are passing in Python 3. There are a couple of features not ported yet (e.g. telnet console), and some tests are skipped. If we had a PY2 badge it also wouldn't be 100% according to this metric, because some tests are also skipped in Python 2 :)

@lothbrek

lothbrek commented May 17, 2016

So glad to see Python 3 support. Currently writing an automation/web crawling system in Ruby but may switch to Python with the help of Scrapy!

@dangra
Member

dangra commented May 18, 2016

It has been a long road, great work!

@ghost

ghost commented May 20, 2016

@kmike thanks a lot~ 💯

@AverHLV

AverHLV commented May 21, 2016

But what about Twisted? I can't run my Scrapy project on Python 3.5 because of Twisted errors.

@rmax
Contributor

rmax commented May 21, 2016

@AverHLV Twisted can run on Python 3, but not on Windows, because _win32stdio has not been ported yet. See:

@kmike
Member

kmike commented May 21, 2016

Yeah, there is an issue in Scrapy bug tracker about that: #1998

@ghost

ghost commented May 26, 2016

oops...am using python 3.5...so i have to continue waiting...loooooool

@ghost

ghost commented May 26, 2016

i wish i can contribute to this project ASAP...still a newbie now...

@dangra
Member

dangra commented May 26, 2016

hey @ABSmiLT, try the hack suggestion in #1998 (comment) and share your experience.

@ghost

ghost commented May 28, 2016

@dangra okay, thanks a lot~

lucywang000 pushed a commit to lucywang000/scrapy that referenced this issue Feb 24, 2019