Remove Regex HTML parsing and use beautiful soup instead. #156

Open
wants to merge 5 commits into
base: develop


AidanCurrah
Contributor

No description provided.

@AidanCurrah
Contributor Author

Builds failed as BeautifulSoup4 4.6.3 seems to remove the closing slash from void elements. Downgrading to 4.6.0 for the time being.

@lewiscollard
Contributor

Builds failed as BeautifulSoup4 4.6.3 seems to remove the closing slash from void elements.

This is fine - it's consistent with everything else to avoid the XML style of void elements. Feel free to fix those tests and bump the Soup version to the latest.
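
As a rough illustration of the serialisation difference under discussion (a sketch, not code from this PR): in recent Beautiful Soup 4 releases the default `minimal` formatter writes void elements in the XML style, while the `html5` formatter omits the closing slash. The sample markup and the choice of the `html.parser` backend are assumptions made for the example, and the exact output may differ between Soup versions.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; not taken from cms/html.py.
soup = BeautifulSoup("<p>line one<br>line two</p>", "html.parser")

# Default ("minimal") formatter: void element written XML-style.
xml_style = soup.decode()

# "html5" formatter (added in 4.6.1): void element without the slash.
html5_style = soup.decode(formatter="html5")

print(xml_style)
print(html5_style)
```

Tests that compare serialised HTML would need to expect whichever of the two forms the chosen formatter and Soup version produce.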

cms/html.py Outdated
attrs["height"] = '"%s"' % thumbnail.height
else:
assert False
return str(soup.decode(formatter=None))
Contributor

Should six.text_type(soup) be fine here? (Or str(soup) whenever we remove Python 2 support.)

Contributor

From the docs:

If you pass in formatter=None, Beautiful Soup will not modify strings at all on output. This is the fastest option, but it may lead to Beautiful Soup generating invalid HTML/XML[.]

Contributor Author

It's formatter=None at the moment, as otherwise Soup changes how the copyright sign is written on output (&copy; versus ©).
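
For reference, a minimal sketch of how the output formatter changes what comes back out, assuming Python 3, the `html.parser` backend, and a made-up input string (none of which come from this PR). On Python 3, `str(soup)` goes through the default `minimal` formatter, so it is not equivalent to `soup.decode(formatter=None)`.

```python
from bs4 import BeautifulSoup

# Hypothetical input; the parser decodes both entities to characters.
soup = BeautifulSoup("<p>Fish &amp; chips &copy; 2018</p>", "html.parser")

# Python 3: str(soup) is the same as soup.decode(), i.e. formatter="minimal".
as_str = str(soup)
minimal = soup.decode()

# formatter=None writes strings back untouched, so "&" is not re-escaped.
unformatted = soup.decode(formatter=None)

print(minimal)
print(unformatted)
```

The bare `&` in the `formatter=None` output is the kind of invalid HTML the documentation warns about, which is the trade-off to weigh against preserving characters exactly as parsed.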
