Goose is non-functional in Python 3 #148

Open
fake-name opened this Issue Sep 16, 2014 · 13 comments

7 participants

fake-name commented Sep 16, 2014

The title is largely self-explanatory.

The primary limitation seems to be the reliance on BeautifulSoup 3, which has been EOL for quite a while now and really should be migrated away from.

fake-name commented Sep 16, 2014

Actually, where is BeautifulSoup used at all? I can't find any reference to it in the codebase. It's being used through lxml somewhere, somehow, despite no explicit mention of it anywhere.

Also, unittest sucks, and doesn't report anything informative when you hit an ImportError. You can apparently use nosetests to run the same tests with sane output.

jieba can be replaced with jieba3k.
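
To see which of these dependencies are actually missing before running the test suite, something like the sketch below may help. It is not part of goose. It assumes BeautifulSoup 3 installs a top-level module named `BeautifulSoup`, BeautifulSoup 4 installs `bs4`, and jieba3k installs itself as `jieba`. `check_importable` is a hypothetical helper name.

```python
import importlib.util


def check_importable(module_names):
    """Map each module name to True if Python can locate it, else False."""
    results = {}
    for name in module_names:
        try:
            # find_spec locates a module without executing it. For dotted
            # names it imports the parent packages first, which can raise.
            results[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            results[name] = False
    return results


if __name__ == "__main__":
    # Module names below are assumptions about how each package installs.
    candidates = ["BeautifulSoup", "bs4", "lxml.html.soupparser", "jieba"]
    for name, found in check_importable(candidates).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

A module reported as found can still fail when it is actually imported, for example if it contains Python 2 only syntax, so this only narrows down the missing packages.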

fake-name commented Sep 16, 2014

Going through everything, it appears that the heavy dependency on lxml's soupparser is the problem. Patching in bs4 in place of bs3 at runtime is not workable, since lxml calls __init__ with arguments that bs4 treats as invalid.

fake-name commented Sep 16, 2014

I have unit tests working.

Ran 126 tests in 10.607s

FAILED (errors=54, failures=49)

Welp! Time to look at other text extractors.

Is there any timeline for Python 3 compatibility?

hnykda commented Nov 25, 2014

+1 for Python 3 support... Is there any schedule? Or do you not care at all?

fake-name commented Nov 25, 2014

@kotrfa - It's not a direct equivalent, but I wound up using python-readability for text extraction. It works well enough.

vetal4444 commented Apr 9, 2015

Prepared a PR to add py3 support: #220

xanderdunn commented Jul 13, 2015

+1 for this. Why use Python 2!?

hipoglucido commented Jun 7, 2016

Still waiting for Python 3 support :)

hnykda commented Jun 7, 2016

I believe this project is dead. Use https://github.com/codelucas/newspaper instead, which is inspired by goose and supports Python 3 flawlessly.

hipoglucido commented Jun 7, 2016

Yep, I already knew about it, but I just wanted to compare the available tools. I will indeed use it. Thanks!

LukeB42 commented Sep 5, 2016

Any plans to introduce Python 3 support to this project?

LukeB42 commented Sep 5, 2016

Any plans to introduce Python 3 support to this project?

lababidi commented Apr 20, 2017

Hi everyone, this may come off as self-promotion, but I went ahead and forked goose to work with Python 3: http://github.com/goose3/goose3 Enjoy
