Replace dot with underscore in HTML name #229


@pkozul pkozul commented Jul 23, 2021

I used docx-templates to generate a Word document with embedded HTML (altchunks). I was unable to open the document using Docx4j. The following error was reported:

org.docx4j.openpackaging.exceptions.Docx4JException: For source /word/document.xml, cannot find part word/template_document.xml_html1.html from rel html1=template_document.xml_html1.html
        at org.docx4j.openpackaging.io3.Load3.getRawPart(Load3.java:626)
        at org.docx4j.openpackaging.io3.Load3.getPart(Load3.java:372)

After unzipping the Word document, I noticed there were a bunch of files in the word folder that were named like this:

template_document.xml_html1.html
template_document.xml_html2.html
template_document.xml_html3.html

Since each file name contained two dots in it, I renamed them by replacing the first dot with an underscore:

template_document_xml_html1.html
template_document_xml_html2.html
template_document_xml_html3.html

After renaming the files, I zipped up the folder, and was then able to open the Word document in Docx4j.

This pull request contains a single line of code that replaces the first dot in the HTML file name with an underscore. I'm not sure whether the file names can end up with more than two dots; if they can, a global regex replace would probably be a better solution.
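
To illustrate the difference between the two approaches, here is a minimal TypeScript sketch. The function names `replaceFirstDot` and `sanitizePartName` are hypothetical, and this is not the code from this pull request or from docx-templates. It assumes the last dot in the name marks the extension and should be kept.

```typescript
// Hypothetical helpers for illustration only; not the docx-templates source.

// Replaces only the first dot. A string pattern in String.prototype.replace
// matches a single occurrence, so any further dots are left in place.
function replaceFirstDot(fileName: string): string {
  return fileName.replace('.', '_');
}

// Replaces every dot in the stem with an underscore and keeps the extension.
// Assumption: the final dot separates the stem from the extension.
function sanitizePartName(fileName: string): string {
  const lastDot = fileName.lastIndexOf('.');
  if (lastDot <= 0) {
    // No extension, or a name that starts with its only dot: leave unchanged.
    return fileName;
  }
  const stem = fileName.slice(0, lastDot);
  const ext = fileName.slice(lastDot);
  return stem.replace(/\./g, '_') + ext;
}
```

For the names above, both helpers turn `template_document.xml_html1.html` into `template_document_xml_html1.html`. They only differ when the stem contains more than one dot, which is the case the global replace is meant to cover.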

jjhbw added a commit that referenced this pull request Jul 25, 2021
@jjhbw jjhbw (Collaborator) commented Jul 25, 2021

While this sounds to me like a bug in docx4j instead (these are valid filenames AFAIK), this is a simple enough change to implement on our side. Merged manually. Thanks.

@jjhbw jjhbw closed this Jul 25, 2021