Issue on 'for loop' with shapes inside? #61
Comments
The error occurs because the for loop generates figures with duplicated ids. I will see if I can come up with a fix.
@fernandobrito thanks for looking into this! You are definitely on the right track. The way sablon loops work is that they copy the contents for every iteration. If there are elements inside the loop that have to have unique properties, Word will most likely complain.
Any suggestions on how to fix it? I needed something quick for a client, so I'm just appending a unique number to each node inside the loop body that has an "id" attribute. It solved my issue with figures. However, I'm not sure whether it is always safe to update all nodes with an "id" attribute, or whether other types of nodes may have unique ids stored under a different attribute name.
@fernandobrito I would assume ids are unique across the entire document, but to really be sure you'd need to check the XML spec (5500 pages of light reading if you need something to do on a weekend). While gigantic, it is very well organized and bookmarked, so it's not as bad as it seems. Another good reference for WordML and DrawingML is http://officeopenxml.com; they reference the 3rd edition XML spec, but I think that is simply due to the website's age. Could you post a small excerpt of the corrupted version (i.e. with duplicated ids) and an excerpt of your fixed version? Seeing what is actually going on will help me out a bit. I think the best way to fix it is to add functionality that finds the max id value in use and increments it from that point forward. You'll need to check every element that has an
@stadelmanma thanks for the references. I've done a quick search of the XML spec, and it seems most figures/drawings have their unique ids described as an attribute on either the wp:docPr or the wp:cNvPr tag. I couldn't work out when each one is used. The description for the id attribute (which should be a unique integer) is:
Here you can find a small example of a rectangle inside a loop: https://github.com/fernandobrito/sablon/blob/eebd895954b157e659fbf180ea8312b027a05a69/test/fixtures/xml/figure_loop.xml#L30. Line 30 has a docPr element with an id attribute, which gets duplicated when sablon makes copies of the loop body, corrupting the output file. I've started playing around with your idea of finding the highest id and then assigning new ids for each loop iteration (on this fork), but I just realized I made a big mistake: I started my work on top of the images support PR :(. When I have more time I will cherry-pick my changes on top of senny/sablon master so you can take a look and maybe provide some feedback. By the way, I've been making heavy use of a tool from Microsoft: https://www.microsoft.com/en-us/download/details.aspx?id=30425. It lets me generate diffs between OOXML files and also validate them. Is there somewhere I should document it for future contributors? Perhaps on a wiki page in this repo?
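The "find the highest id, then hand out new ids" idea discussed above can be sketched roughly as follows. This is only an illustration and not sablon's actual implementation: the helper names (`each_element`, `renumber_ids!`), the use of REXML rather than sablon's own DOM code, and the choice to treat `docPr` and `cNvPr` as the only id-carrying elements are all assumptions made for the example.

```ruby
require "rexml/document"

# Assumption: only these element names carry ids that must be unique.
ID_ELEMENTS = %w[docPr cNvPr].freeze

# Hypothetical helper: depth-first walk over every element below `node`.
def each_element(node, &block)
  node.elements.each do |child|
    block.call(child)
    each_element(child, &block)
  end
end

# Hypothetical helper: keep the first occurrence of each id and give every
# later duplicate a fresh id, counting up from the highest id in use.
def renumber_ids!(doc)
  targets = []
  each_element(doc) do |el|
    targets << el if ID_ELEMENTS.include?(el.name) && el.attributes["id"]
  end

  next_id = targets.map { |el| el.attributes["id"].to_i }.max.to_i
  seen = {}
  targets.each do |el|
    key = [el.name, el.attributes["id"]]
    if seen[key]
      next_id += 1
      el.add_attribute("id", next_id.to_s) # replaces the duplicated value
    else
      seen[key] = true
    end
  end
  doc
end

# A stripped-down stand-in for an expanded loop body (not a valid docx part).
xml = <<~XML
  <w:body xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
          xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing">
    <w:p><wp:docPr id="1" name="Rectangle 1"/></w:p>
    <w:p><wp:docPr id="1" name="Rectangle 1"/></w:p>
    <w:p><wp:docPr id="4" name="Picture 4"/></w:p>
    <w:p><wp:docPr id="1" name="Rectangle 1"/></w:p>
  </w:body>
XML

doc = renumber_ids!(REXML::Document.new(xml))
ids = []
each_element(doc) { |el| ids << el.attributes["id"] if el.name == "docPr" }
puts ids.inspect
```

Starting from the current maximum avoids collisions with ids that appear outside the loop body, which a simple per-iteration suffix cannot guarantee.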
@fernandobrito I completely missed the last bit of your comment about the MS tool. Sadly, I don't have access to a Windows computer to try it out. Does it generate a diff like git, showing the line-by-line changes to each XML file in the document? The validation part sounds extremely useful, since MS Word tends to be pretty vague when a docx gets corrupted. What kind of information does it generate when validating a document?
@stadelmanma I had to install Windows on a virtual machine in order to use it. Yes, the tool provides diffs. Maybe the biggest win is the validation feature: it shows which lines are causing problems. That's how I found out about the duplicated id issue on figures.
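For readers without access to a Windows machine, a rough check for this one specific problem can be sketched in Ruby. It is not a substitute for the validation tool mentioned above and only looks for repeated `id` values on one element name; the function name `duplicate_ids` and the default of `docPr` are assumptions for illustration.

```ruby
require "rexml/document"

# Hypothetical helper: return the id values that occur more than once on
# elements with the given local name (default "docPr").
def duplicate_ids(xml, element_name = "docPr")
  counts = Hash.new(0)
  walk = lambda do |node|
    node.elements.each do |el|
      id = el.attributes["id"]
      counts[id] += 1 if el.name == element_name && id
      walk.call(el) # recurse into child elements
    end
  end
  walk.call(REXML::Document.new(xml))
  counts.select { |_id, n| n > 1 }.keys
end

# A stripped-down stand-in for a loop body that was copied twice.
sample = <<~XML
  <w:body xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
          xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing">
    <w:p><wp:docPr id="1" name="Rectangle 1"/></w:p>
    <w:p><wp:docPr id="1" name="Rectangle 1"/></w:p>
    <w:p><wp:docPr id="2" name="Picture 2"/></w:p>
  </w:body>
XML

puts duplicate_ids(sample).inspect
```

Run against the document XML extracted from a generated docx, a non-empty result would point at the kind of duplication described in this thread.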
I think this is more easily fixable with the new DOM logic, I have a branch on my fork to work on it. All Additional notes on any elements that use the
Side note: there is a <w:id> element, but we shouldn't need to worry about this one (17.5.2.18).
Implementation stages:
Hello,
When I have a "for loop" with a shape inside and the loop runs more than once, Word claims that the output is corrupted.
Sample input:
If the loop runs only once, the output is as expected (one blue rectangle). If the loop runs more than once, Word 2016 complains:
But if I proceed and let Word recover the document, the output is as expected (two rectangles). If I replace the figure with text, everything works fine.
I will try to investigate the issue soon. I'm posting it here to force myself to come back later with a follow-up, and to ask whether this is a known bug.
Thanks again for the great work on the gem.