bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. #68

Closed
maretodoric opened this issue May 21, 2020 · 4 comments · Fixed by #78
Labels
enhancement (New feature or request), question (Further information is requested)

Comments

@maretodoric

When I ran the script, it failed with an error while trying to log in to Facebook. DEBUG log below:

[2020-05-21 16:41:41,728] fb2cal INFO (<module>) Starting fb2cal v1.0.4 (Production) [https://git.io/fjMwr]
[2020-05-21 16:41:41,730] fb2cal INFO (<module>) This project is released under the GPLv3 license.
[2020-05-21 16:41:41,730] fb2cal INFO (main) Attemping to parse config file config.ini...
[2020-05-21 16:41:41,733] fb2cal INFO (main) Config successfully loaded.
[2020-05-21 16:41:41,734] fb2cal INFO (main) Logging level set to: DEBUG
[2020-05-21 16:41:41,736] fb2cal INFO (main) Attemping to authenticate with Facebook...
[2020-05-21 16:41:41,744] urllib3.connectionpool DEBUG (_new_conn) Starting new HTTP connection (1): www.facebook.com:80
[2020-05-21 16:41:41,838] urllib3.connectionpool DEBUG (_make_request) http://www.facebook.com:80 "GET /login.php HTTP/1.1" 302 0
[2020-05-21 16:41:41,844] urllib3.connectionpool DEBUG (_new_conn) Starting new HTTPS connection (1): www.facebook.com:443
[2020-05-21 16:41:42,059] urllib3.connectionpool DEBUG (_make_request) https://www.facebook.com:443 "GET /login.php HTTP/1.1" 200 None
Traceback (most recent call last):
  File "src/fb2cal.py", line 743, in <module>
    main()
  File "src/fb2cal.py", line 113, in main
    facebook_authenticate(browser, config['AUTH']['FB_EMAIL'], config['AUTH']['FB_PASS'])
  File "src/fb2cal.py", line 214, in facebook_authenticate
    login_page = browser.get(FACEBOOK_LOGIN_URL)
  File "/usr/local/lib/python3.7/dist-packages/mechanicalsoup/browser.py", line 127, in get
    Browser.add_soup(response, self.soup_config)
  File "/usr/local/lib/python3.7/dist-packages/mechanicalsoup/browser.py", line 70, in add_soup
    response.soup = bs4.BeautifulSoup(response.content, **soup_config)
  File "/usr/local/lib/python3.7/dist-packages/bs4/__init__.py", line 245, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

I was running this on a Raspberry Pi 3 (Raspbian 10) with Python 3.7.3.
I installed the prerequisites as described before running the app, and lxml is installed; see the output below:

root@raspberrypi:~/fb2cal# python3 --version
Python 3.7.3
root@raspberrypi:~/fb2cal# python3 -m pip install lxml
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Requirement already satisfied: lxml in /usr/local/lib/python3.7/dist-packages (4.5.1)

But the error persisted. I resolved it by changing lxml to html.parser in src/fb2cal.py, line #624, from:
day_name = BeautifulSoup(birthday_date_str, 'lxml').get_text().lower()
to
day_name = BeautifulSoup(birthday_date_str, 'html.parser').get_text().lower()

and in the file /usr/local/lib/python3.7/dist-packages/mechanicalsoup/browser.py, line #35, from
def __init__(self, session=None, soup_config={'features': 'lxml'},
to
def __init__(self, session=None, soup_config={'features': 'html.parser'},

After that, it worked like a charm and the calendar imported into Google successfully.
It might be worth checking why lxml was not picked up even though pip reports it as installed, while html.parser worked fine.
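
One way to narrow this down, as a rough sketch rather than a confirmed diagnosis: Beautiful Soup only offers the lxml tree builder when lxml imports cleanly in the interpreter that runs the script, so a package that pip lists as installed can still be unusable if its compiled extension fails to load. The helper below (`check_import` is a hypothetical name, not part of fb2cal) reports whether each module imports and, if not, why.

```python
import importlib
import sys


def check_import(module_name):
    """Try to import module_name and return (ok, detail).

    detail is the module's file path on success, or the error text on failure.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__file__", None) or "<built-in>"


if __name__ == "__main__":
    # Confirm this is the same interpreter that runs fb2cal.
    print("interpreter:", sys.executable)
    for name in ("lxml", "lxml.etree", "bs4"):
        ok, detail = check_import(name)
        print(f"{name:12} {'OK' if ok else 'FAILED'}  {detail}")
```

Run it with the same command used for fb2cal (here, `python3`). If `lxml` is OK but `lxml.etree` fails, the error text should indicate what the compiled extension could not load.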

mobeigi added the question label on May 22, 2020
@mobeigi
Owner

mobeigi commented Jun 23, 2020

So I think the Pi just wasn't able to support lxml (maybe due to dependencies). A good solution here would be to prefer lxml and then fall back to html.parser if it's unavailable.
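
A minimal sketch of that fallback, assuming the choice is made once at startup. `pick_parser` and `PARSER_MODULES` are hypothetical names for illustration and are not taken from fb2cal. The function returns the first parser name whose backing module imports, and html.parser needs no check because it ships with the standard library.

```python
import importlib

# Third-party module that must import for each Beautiful Soup parser name.
# "html.parser" is absent because it is part of the standard library.
PARSER_MODULES = {"lxml": "lxml.etree", "html5lib": "html5lib"}


def pick_parser(candidates=("lxml", "html.parser")):
    """Return the first usable parser name from candidates.

    Falls back to "html.parser" if none of the candidates can be imported.
    """
    for name in candidates:
        if name == "html.parser":
            return name  # always available, no extra dependency
        module_name = PARSER_MODULES.get(name)
        if module_name is None:
            continue  # unknown parser name, skip it
        try:
            importlib.import_module(module_name)
        except ImportError:
            continue  # missing or broken install, try the next one
        return name
    return "html.parser"


# Intended use (needs bs4 and mechanicalsoup, so left as comments here):
#   BeautifulSoup(birthday_date_str, pick_parser())
#   mechanicalsoup.StatefulBrowser(soup_config={'features': pick_parser()})
```

Passing `soup_config` to the browser constructor should also avoid editing browser.py under dist-packages, since the traceback and signature above show it is a regular constructor argument.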

@joseparreiras

Seems like this error is back. I tried to run the latest version on macOS and I still get it.

@mobeigi
Owner

mobeigi commented Jan 4, 2022

@joseparreiras Hmm, honestly lxml has been a problem for a few people due to the external dependency.

Since speed isn't a real problem with this tool, I might just change the parser to html.parser to fix the issue on a fair few platforms.

@joseparreiras

Can I do it quickly myself?
