BeautifulSoup can't find html5lib #10

Closed
chrisberkhout opened this issue Apr 4, 2022 · 3 comments

Comments

@chrisberkhout

Hi Carey!

Since updating to Python 3.10, I can't get stravabackup/stravaweblib to run without this error:

Traceback (most recent call last):
  File "/home/chris/.virtualenvs/Strava-m1m-PDVS/bin/strava-backup", line 8, in <module>
    sys.exit(main())
  File "/home/chris/.virtualenvs/Strava-m1m-PDVS/lib/python3.10/site-packages/stravabackup/__main__.py", line 100, in main
    sb = StravaBackup(access_token, email, password, output_dir)
  File "/home/chris/.virtualenvs/Strava-m1m-PDVS/lib/python3.10/site-packages/stravabackup/stravabackup.py", line 99, in __init__
    self.client = WebClient(access_token=access_token, email=email,
  File "/home/chris/.virtualenvs/Strava-m1m-PDVS/lib/python3.10/site-packages/stravaweblib/webclient.py", line 73, in __init__
    self._login_with_password(email, password)
  File "/home/chris/.virtualenvs/Strava-m1m-PDVS/lib/python3.10/site-packages/stravaweblib/webclient.py", line 151, in _login_with_password
    **self.csrf
  File "/home/chris/.virtualenvs/Strava-m1m-PDVS/lib/python3.10/site-packages/stravaweblib/webclient.py", line 98, in csrf
    self._csrf = self._get_csrf_token()
  File "/home/chris/.virtualenvs/Strava-m1m-PDVS/lib/python3.10/site-packages/stravaweblib/webclient.py", line 108, in _get_csrf_token
    soup = BeautifulSoup(login_html, 'html5lib')
  File "/home/chris/.virtualenvs/Strava-m1m-PDVS/lib/python3.10/site-packages/bs4/__init__.py", line 243, in __init__
    raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html5lib. Do you need to install a parser library?

Tried:

  • Extra install of updated html5lib, lxml, bs4.
  • Downgrading beautifulsoup4 to 4.9.3 as before and ensuring no change in html5lib version.
  • Avoiding Pipenv use and just installing directly with pip.

The other suggestion I see around for this problem is to change BeautifulSoup(login_html, 'html5lib') to BeautifulSoup(login_html, 'html.parser'). Not sure why BeautifulSoup isn't finding html5lib anymore, but maybe switching parsers is the easiest fix?
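
One way to see why BeautifulSoup reports the parser as missing is to import the parser library directly. As far as I can tell, bs4 only registers a tree builder when its backing library imports cleanly and skips it otherwise, so the underlying import error never shows up in the traceback. A minimal sketch, where diagnose_import is a hypothetical helper name and not part of bs4 or stravaweblib:

```python
import importlib


def diagnose_import(module_name):
    """Return None if module_name imports cleanly, else a short error description."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # broad on purpose: any failure is worth reporting
        return f"{type(exc).__name__}: {exc}"
    return None


if __name__ == "__main__":
    # html5lib and lxml are the parser backends mentioned in this thread.
    for name in ("html5lib", "lxml"):
        problem = diagnose_import(name)
        print(f"{name}: {'OK' if problem is None else problem}")
```

If the parser library is installed but broken on the current interpreter, this prints the actual exception instead of the generic FeatureNotFound message.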

@chrisberkhout chrisberkhout changed the title from "BeautifulSoup" to "BeautifulSoup can't find html5lib" Apr 4, 2022
@chrisberkhout
Author

Switching to html.parser works in my fork.

@chrisberkhout
Author

May have something to do with the collections abstract base classes: they moved to collections.abc in Python 3.3, the old aliases in collections were removed in Python 3.10, and html5lib version 0.999999999 depends on the deprecated names.

https://docs.python.org/3.9/library/collections.html#module-collections

Deprecated since version 3.3, will be removed in version 3.10: Moved Collections
Abstract Base Classes to the collections.abc module. For backwards
compatibility, they continue to be visible in this module through Python 3.9.

So an alternate fix may be to change this project to allow the latest html5lib.
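
To illustrate the removal quoted above, here is a minimal sketch of the two import spellings. It assumes, as suggested in this comment, that the pinned html5lib uses the old one. old_alias_available is a hypothetical helper written for this example only:

```python
import sys

# Supported location of the abstract base classes since Python 3.3.
from collections.abc import Mapping


def old_alias_available():
    """Report whether the pre-3.3 spelling still works on this interpreter."""
    try:
        # Raises ImportError on Python 3.10 and later.
        from collections import Mapping as _OldMapping  # noqa: F401
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    version = ".".join(map(str, sys.version_info[:2]))
    print(f"Python {version}: old alias available = {old_alias_available()}")
```

A library that only uses the old spelling fails at import time on Python 3.10, which would explain why the parser stopped being found after the upgrade.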

@pR0Ps pR0Ps closed this as completed in bc7b6ce Apr 5, 2022
@pR0Ps
Owner

pR0Ps commented Apr 5, 2022

Thanks for the report, should be fixed in v0.0.6! (uses the built-in html.parser now)
