Adds feature to strip HTML from captions #1045

Merged (4 commits) on Nov 8, 2021

Baa14453 (Contributor) commented Nov 6, 2021

Pixiv allows some formatting in image descriptions, so HTML tags remain in the saved caption and can break how some descriptions are displayed in image software, e.g.:

image

With the stripHTMLTagsFromCaption option enabled, the HTML tags are removed using BeautifulSoup:
image
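
For readers unfamiliar with BeautifulSoup, tag stripping of this kind can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `strip_html_tags` and the conversion of `<br>` to newlines are assumptions made for the example.

```python
from bs4 import BeautifulSoup


def strip_html_tags(caption: str) -> str:
    """Return the caption text with all HTML tags removed.

    Hypothetical helper for illustration; the PR's actual function may differ.
    """
    soup = BeautifulSoup(caption, "html.parser")
    # Assumption: line breaks should survive as newlines rather than vanish.
    for br in soup.find_all("br"):
        br.replace_with("\n")
    # get_text() concatenates every text node and drops the markup around it.
    return soup.get_text()


if __name__ == "__main__":
    raw = '<strong>New work</strong><br />Details <a href="https://example.com">here</a>'
    print(strip_html_tags(raw))
```

Note that the link target in the `<a>` tag is gone from the result; only the link text survives.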

This does mean that some data might be lost, e.g. the actual URL of a link, because it sits inside the tag. For URLs, though, I made sure they are collected by writeUrlInDescription before stripHTMLTagsFromCaption removes the tags, so they are still available in the rest of the metadata.

image
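
The ordering described above (collect link URLs first, strip tags second) can be sketched as below. To keep the example dependency-free it uses the standard library's `html.parser` as a stand-in for BeautifulSoup, so it is not the PR's implementation. All names here (`CaptionParser`, `collect_urls`, `strip_tags`, `process_caption`) are hypothetical.

```python
from html.parser import HTMLParser


class CaptionParser(HTMLParser):
    """Records link targets and text nodes found in a caption."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.urls = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.urls.append(href)
        elif tag == "br":
            # Assumption: keep line breaks as newlines in the plain text.
            self.text_parts.append("\n")

    def handle_data(self, data):
        self.text_parts.append(data)


def _parse(caption):
    parser = CaptionParser()
    parser.feed(caption)
    parser.close()
    return parser


def collect_urls(caption):
    """Return every href found in the caption's <a> tags."""
    return _parse(caption).urls


def strip_tags(caption):
    """Return the caption with markup removed."""
    return "".join(_parse(caption).text_parts)


def process_caption(caption):
    # Step 1: read URLs while the <a> tags still exist.
    urls = collect_urls(caption)
    # Step 2: only then drop the markup.
    text = strip_tags(caption)
    return text, urls


if __name__ == "__main__":
    raw = 'See <a href="https://example.com/page">my site</a><br />Thanks'
    print(process_caption(raw))
```

Reversing the two steps would lose the links: once the tags are stripped, `collect_urls` has nothing left to find.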

Also, I've removed Exiv2 from the list of dependent software, as it's not needed with the new Pyexiv2 library implemented earlier. However, the Visual Studio C++ Redistributable is required, and it is not installed on Windows by default.

Nandaka merged commit 482997c into Nandaka:master on Nov 8, 2021