remove html tags when exporting report in excel format #3174

Merged


OlaAkeela
Contributor

When a report contains HTML-formatted text, exporting it to Excel shows the HTML tags alongside the data in the cells, as shown in the attached image.
image

I've added code that removes any HTML tags when a report is exported to an Excel sheet.

@manassolanki
Contributor

Hi @OlaAkeela,
Please use "html2text" for this. It is already in frappe's requirements list and covers a lot of edge cases, so you should use it instead.

@OlaAkeela
Contributor Author

OlaAkeela commented Apr 27, 2017

Hi @manassolanki
I tried html2text, but it does not give me the required format. As you can see in the attached image, the value of the "href" attribute was not omitted.
image

@manassolanki
Contributor

To ignore links, you have to set ignore_links = True. I hope this helps.

import html2text
obj = html2text.HTML2Text()
# Do not convert links from the HTML; only the link text is kept
obj.ignore_links = True

@OlaAkeela
Contributor Author

OlaAkeela commented Apr 30, 2017

@manassolanki I have used html2text in my latest update.
I found that html2text appends newlines to the end of the text. I set body_width = 0, but it still produces newlines, and my Excel sheet came out as shown in the attached image.
So I had to strip the trailing line breaks manually.
What was the problem with using a regex, as I did at first, like this?

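import re

# Compile the pattern once instead of once per cell
cleaner = re.compile('<.*?>')
clean_data = []
for row in data:
	# Remove everything that looks like a tag from each cell in the row
	clean_data.append([cleaner.sub('', item) for item in row])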

image

@rmehta
Member

rmehta commented May 1, 2017

@OlaAkeela I think the regex was too aggressive; it might have stripped out email IDs like <me@example.com>. The current one looks fine!
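
To illustrate the concern, here is a small sketch using the same pattern from the earlier snippet. Both sample strings are made up; the point is only that anything between angle brackets is removed, whether or not it is a real tag.

```python
import re

cleaner = re.compile('<.*?>')

# Made-up sample cells: one with real markup, one with a bracketed address
html_cell = '<a href="https://example.com">ACME</a>'
plain_cell = 'Jane Doe <me@example.com>'

stripped_html = cleaner.sub('', html_cell)    # markup removed, text kept
stripped_plain = cleaner.sub('', plain_cell)  # the address is removed too

print(repr(stripped_html))   # 'ACME'
print(repr(stripped_plain))  # 'Jane Doe '
```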

@rmehta rmehta merged commit 512bcd3 into frappe:develop May 1, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 20, 2023