Python 3 support #263

Closed
extesy opened this issue Mar 6, 2013 · 65 comments

@extesy extesy commented Mar 6, 2013

Python 3 is several years old and most packages now support it (even Django!). It would be really nice to support it in Scrapy as well.

Contributor

@artemdevel artemdevel commented Mar 7, 2013

Scrapy uses Twisted at its core, so Python 3 support depends at least on Twisted supporting Python 3. The Twisted development team has a project in progress to port Twisted to Python 3, so I think that once Twisted is ported, Scrapy will have a good chance of being ported as well.


@todoit todoit commented Apr 25, 2013

mark

Contributor

@nramirezuy nramirezuy commented Apr 25, 2013


@estin estin commented Apr 29, 2013

For Python 3 I am developing
https://bitbucket.org/estin/pomp
It is like Scrapy, but very small, unstable, and without a hard Twisted dependency.


@coodoing coodoing commented May 7, 2013

mark
The latest development branch, 0.17, does not support py3.


@ariddell ariddell commented May 16, 2013

@nramirezuy there's a reference implementation for PEP 3156 here: https://code.google.com/p/tulip/


@muelli muelli commented Jun 19, 2014

Is there a list of what parts of Twisted are used?
Twisted has a Python 3 migration plan here: http://twistedmatrix.com/trac/wiki/Plan/Python3
It might be worthwhile to investigate whether the used parts of Twisted are already ported.


@txtsd txtsd commented Oct 23, 2014

Can scrapy not be made to work with python 3, now that asyncio is available?

@curita curita removed the gsoc-candidate label Feb 11, 2015
@curita curita added this to the Scrapy 1.0 milestone Feb 11, 2015
@pablohoffman pablohoffman modified the milestones: Scrapy 1.1, Scrapy 1.0 Mar 5, 2015

@ianozsvald ianozsvald commented Mar 18, 2015

+1 for Python 3.4 support. After a year using Python 3 (mainly sklearn, numpy, Anaconda, matplotlib, networkx etc) this is the first blocker I've had forcing me to downgrade.

The only other Python2.7-only project that I'm lightly using is Apache Spark and 3.4+ support is scheduled for their next release. In their issue tracker I posted some stats for Python 3 adoption - roughly speaking it is ">40%" (accepting the self-selected group of survey participants):
https://issues.apache.org/jira/browse/SPARK-4897?focusedCommentId=14303154&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14303154

Member

@kmike kmike commented Mar 18, 2015

@ianozsvald we are working on it, it is a priority :)

Scrapy is the worst kind of project to port to Python 3: it depends on Twisted (which is not ported to Python 3 yet, though some subset of Twisted works), and it works at the boundary between the outside world and the Python world, so there are many questions about unicode. The "outer world" Scrapy works with is wild: there is no well-defined encoding we can decode/encode data from/to. Encoding rules are sometimes crazy; e.g. browsers (which Scrapy aims to emulate) can use different charsets for different parts of a single URL, such as cp1251 for /path and utf-8 for GET parameter values. I've ported a lot of code to Python 3 (including most of NLTK and tens of other Python packages), but I'm still getting porting details wrong for Scrapy (e.g. #837 is wrong).

Some parts of Scrapy are already ported to Python 3. We're running tests for Python 3.3 on Travis to prevent regressions; ~240 tests pass in 3.3, out of ~1000. There is a GSoC project to port Scrapy to Python 3.x; I think we should make good progress this summer.
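
To make the mixed-encoding point concrete, here is a minimal sketch using only the standard library. It is not Scrapy code: the host, the path, and the query value are made up for illustration, and the choice of cp1251 for the path and utf-8 for the query simply mirrors the example given in the comment.

```python
from urllib.parse import quote, unquote

# Hypothetical URL parts (illustration only, not taken from the thread).
path = "/поиск"
query_value = "кот"

# Percent-encode each part from bytes in a *different* charset,
# as a browser might do for a legacy cp1251 page.
encoded_path = quote(path.encode("cp1251"))          # path bytes in cp1251
encoded_query = quote(query_value.encode("utf-8"))   # query bytes in utf-8

url = "http://example.com" + encoded_path + "?q=" + encoded_query
print(url)

# Decoding only round-trips if the right charset is used for each part.
print(unquote(encoded_path, encoding="cp1251") == path)  # correct charset
print(unquote(encoded_path) == path)                      # default utf-8 guess
```

The percent-encoded URL itself carries no hint about which charset produced each part, which is why a crawler cannot pick a single encoding and decode the whole URL with it.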

Member

@kmike kmike commented Mar 18, 2015

There is also a Scrapy dependency, https://github.com/mitmproxy/mitmproxy, which doesn't have Python 3 support yet, but it is used only in tests.


@ianozsvald ianozsvald commented Mar 19, 2015

@kmike Hey Mikhail! You are a man of many projects :-) Glad to hear it is being worked on, I didn't get that impression from the early parts of this thread and couldn't see any other porting docs. I quite agree that this project (just like Flask et al.) is going to be hard, dealing with the interface to the outside world is horrid. I certainly didn't know that URLs themselves could have mixed encodings :-(
Given the continual migration to Python 3 for personal projects (50/50 versus Python 2.7, according to the survey I linked) and >40% for work, the need for Scrapy's Py3 support is only going to get stronger. Bonne chance!

Contributor

@pbronez pbronez commented Mar 23, 2015

+1 for Python 3 support! Thanks for the hard work you guys are putting into it, hope GSoC goes well.


@nuschk nuschk commented Apr 20, 2015

👍 as well, would really love to be able to use Python 3 with Scrapy! And many thanks for your effort!


@vmarkovtsev vmarkovtsev commented May 19, 2015

You can use my patches with ported twisted.web.client.Agent and friends from my fork.

Contributor

@nyov nyov commented Jul 14, 2015

Are there still outside blockers for porting to python3? (twisted libs, etc.?)
Would love to see a list of those, if one has been made.

Also, in the name of eventual portability (e.g. asyncio?) how do people feel about dropping dependencies on twisted for the web/downloader part? I recall there was a gsoc idea for this?
Would be interesting to see if a downloader using pycurl bindings might work with twisted here.
(Though pycurl has no cffi bindings at this time, so no pypy support.)

Member

@curita curita commented Jul 14, 2015

There's a comprehensive status of the twisted dependencies in Berker's proposal. @berkerpeksag, would you mind if we put it up on our wiki for reference?

Contributor

@berkerpeksag berkerpeksag commented Jul 14, 2015

Sure, but that list is a bit outdated. For example, twisted.web.static has already been ported to Python 3. You may want to check twisted/python/dist3.py first.

Member

@curita curita commented Jul 14, 2015

Will do, thanks!

Member

@curita curita commented Jul 14, 2015

Contributor

@nyov nyov commented Jul 14, 2015

Thank you both!


@tonal tonal commented Jul 15, 2015

Member

@curita curita commented Jul 15, 2015

That's great news, thanks for reporting! I just updated the wiki.


@ianozsvald ianozsvald commented Jul 23, 2015

Hello all. I've got a Lightning Talk on Python3.5 at my next PyDataLondon meet (200+ data scientists in the room). Someone is bound to ask about scrapy/twisted on Python 3.4+, could someone comment on the current state? It isn't clear to me from the links above if enough of twisted has been ported for scrapy to run on Python 3 (or will soon)?

@redapple redapple added this to the v1.1 milestone Jan 27, 2016
Contributor

@redapple redapple commented Jan 27, 2016

Basic support is planned for v1.1
And we plan to make it more robust for v1.2


@vmarkovtsev vmarkovtsev commented Jan 27, 2016

👍


@stonebig stonebig commented Jan 27, 2016

Great! Is there a document that estimates the rough timeline of these two milestones? Spring 2016 and summer 2016?

Contributor

@redapple redapple commented Jan 27, 2016

@stonebig, we plan on releasing Scrapy 1.1 officially by the end of February 2016 (with a release candidate in the next few days, at least).
Scrapy 1.2 would follow a couple of months after that (we hope).


@stonebig stonebig commented Jan 27, 2016

thanks a lot for this information, @redapple !


@KeremTubluk KeremTubluk commented Feb 2, 2016

Six days have come and gone, @redapple!

:)

Contributor

@redapple redapple commented Feb 2, 2016


@manugarri manugarri commented Feb 4, 2016


@ghost ghost commented Feb 29, 2016

it seems that the twisted already supports py3.3+


@d0ugal d0ugal commented Feb 29, 2016

> it seems that the twisted already supports py3.3+

@ABSmiLT Yeah, AFAICT Twisted only recently supported 3 well enough for Scrapy. Hence all the discussion above and in the docs.

Member

@kmike kmike commented Feb 29, 2016

@ABSmiLT we released Scrapy 1.1rc1 with alpha-level Python 3 support about a month ago. 1.1rc2 will be released soon; it fixes several Python 3 compatibility issues we found while testing 1.1rc1.


@ghost ghost commented Mar 18, 2016

thanks for informing, @kmike @d0ugal
looking forward to the new stable version compatible with py3

Contributor

@redapple redapple commented May 11, 2016

Scrapy 1.1.0 is on PyPI, with Python 3 support (finally).

What took you so long?

See release notes.

Have fun!

@redapple redapple closed this May 11, 2016
Contributor

@nyov nyov commented May 11, 2016

Are we there yet? are we there yet are wethereyetarewethere — 💥 ...eh, WHAT?
😭 finally!

Congrats. Great job there keeping up the backports, @redapple. (Probably made a nice bang there, punching that "close"- button just then, right?)
And Thanks, everyone.


@fish-ball fish-ball commented May 12, 2016

Congratulations!!!!


@ghost ghost commented May 16, 2016

@redapple FINALLY!
thanks @ all

btw, the "PY3 95%" label is now out of date?

Member

@kmike kmike commented May 16, 2016

@ABSmiLT no, it means 95% of tests are passing in Python 3. There are a couple of features not ported yet (e.g. the telnet console), and some tests are skipped. If we had a PY2 badge, it also wouldn't be 100% by this metric, because some tests are also skipped in Python 2 :)
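
As a rough illustration of why skipped tests keep such a badge below 100%, here is a small sketch with the standard `unittest` module. It assumes the metric is simply "passed tests divided by all collected tests"; the thread does not say how the badge is actually computed, and the test class and `pass_ratio` helper are hypothetical names, not part of Scrapy.

```python
import unittest


class ExampleTests(unittest.TestCase):
    # Hypothetical tests, for illustration only.
    def test_ported_feature(self):
        self.assertEqual("scrapy".upper(), "SCRAPY")

    @unittest.skip("hypothetical feature not ported yet")
    def test_unported_feature(self):
        self.fail("never runs while the skip is in place")


def pass_ratio(test_case):
    """Return passed / collected, counting skipped tests as not passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    total = suite.countTestCases()
    result = unittest.TestResult()
    suite.run(result)
    not_passed = len(result.failures) + len(result.errors) + len(result.skipped)
    return (total - not_passed) / total


print(pass_ratio(ExampleTests))
```

Under this assumed metric, one skipped test out of two gives a ratio of one half even though nothing fails, which matches the point that a Python 2 badge would not read 100% either.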


@lothbrek lothbrek commented May 17, 2016

So glad to see Python 3 support. Currently writing an automation/web crawling system in Ruby but may switch to Python with the help of Scrapy!

Member

@dangra dangra commented May 18, 2016

It has been a long road, great work!


@ghost ghost commented May 20, 2016

@kmike thanks a lot~ 💯


@AverHLV AverHLV commented May 21, 2016

But what about Twisted? I can't run my Scrapy project on Python 3.5 because of Twisted errors.

Contributor

@rmax rmax commented May 21, 2016

@AverHLV Twisted can run on Python 3, but not on Windows, because _win32stdio has not been ported yet. See:

Member

@kmike kmike commented May 21, 2016

Yeah, there is an issue in Scrapy bug tracker about that: #1998


@ghost ghost commented May 26, 2016

oops...am using python 3.5...so i have to continue waiting...loooooool


@ghost ghost commented May 26, 2016

i wish i can contribute to this project ASAP...still a newbie now...

Member

@dangra dangra commented May 26, 2016

hey @ABSmiLT, try the hack suggestion in #1998 (comment) and share your experience.


@ghost ghost commented May 28, 2016

@dangra okay, thanks a lot~

lucywang000 pushed a commit to lucywang000/scrapy that referenced this issue Feb 24, 2019