How to get XML from docx file #67

Closed
theasteve opened this issue Jun 4, 2019 · 2 comments

Comments


theasteve commented Jun 4, 2019

I'm trying to convert a docx file into a PDF. My plan was to convert the docx file into HTML and then convert the HTML into a PDF. However, the result of this process wasn't what I expected.
testing.pdf

This is what the output looks like after the process described above. Here is a link to the original docx file:
https://www.dropbox.com/s/f1klwguv4r9iyje/testing.docx?dl=0

I think Word documents use XML, so converting the docx file to XML and then to PDF might improve how the document is rendered. (You might have better direction on this.)

So far I have:

doc = Docx::Document.open('testing.docx')

When I try to get the XML from the document, I get nil:

[61] pry(#<PDFProducer>)> doc.xml
=> nil

Can one get the XML from a Word document? Or am I wrong in assuming that Word documents use XML?

https://stackoverflow.com/questions/56450113/font-size-convert-docx-into-pdf-in-ruby-using-wickedpdf-and-docx

@unixmonkey

require 'docx'

# Open the document and write its HTML rendering to a file.
doc = Docx::Document.open('testing.docx')
File.open('testing.html', 'wb') do |f|
  f << doc.to_html
end

@theasteve theasteve changed the title How to save docx into HTML file How to get XML from docx file Jun 4, 2019
@theasteve
Author

@unixmonkey I just saw your answer; I had just updated my post. Should I close this issue and open a new one, and restore the old question so that it matches your answer? Yes, your answer is correct. I came across it earlier.
