unoconv includes deleted text when converting from doc to txt #40

Closed
vincentbernat opened this issue Feb 21, 2012 · 8 comments

@vincentbernat
Contributor

Hi!

When a .doc file with recorded changes is converted to .txt, deleted text appears in the converted version. The same happens when using LibreOffice directly, but a workaround is to disable the "Show changes" option in the LibreOffice menu. There is no such option in unoconv.

(This bug was originally reported to Debian BTS as bug #624295)

@dagwieers
Member

Hi Vincent,

If the export filter has filter options for this (through the GUI), then there must be a way to instruct unoconv to do the same using -e / --export. The question is: what is the filter option name to do this?

In the file docs/filters.txt we already describe all the export filter options of the PDF export filter (at least, all options that we found documented on the OpenOffice website at one point). We would need the same for the Text export filter. So in itself, unoconv can already do it, we simply don't know how ;-)

@dagwieers
Member

Ok, I found a way to do it, but I need your help to test this.

diff --git a/unoconv b/unoconv
index 972e962..dd43884 100755
--- a/unoconv
+++ b/unoconv
@@ -882,6 +882,11 @@ class Convertor:
             info(2, "Selected office filter: %s" % outputfmt.filter)
             info(2, "Used doctype: %s" % outputfmt.doctype)

+            ### Document properties phase
+            phase = "properties"
+            if hasattr(document, 'ShowChanges'):
+                document.ShowChanges = False
+
             ### Export phase
             phase = "export"

It is quite a small change, but it is not yet clear how we would make this optional (together with the large set of other options one may want to set).
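One way the "make this optional" question could be approached is to move the property change into a small helper that takes the desired value as a parameter. The following is only a sketch under assumptions: the helper name `set_show_changes` and its `show` parameter are hypothetical and not part of unoconv, and `SimpleNamespace` objects stand in for a loaded UNO document so the example runs without LibreOffice.

```python
from types import SimpleNamespace


def set_show_changes(document, show=False):
    """Set ShowChanges on a document, if it exposes that property.

    Hypothetical helper, not part of unoconv. Returns True when the
    property was set, False when the document has no ShowChanges
    attribute (mirroring the hasattr() guard in the patch above).
    """
    if hasattr(document, "ShowChanges"):
        document.ShowChanges = show
        return True
    return False


# Stand-in objects; a real run would pass the loaded UNO document instead.
doc_with_changes = SimpleNamespace(ShowChanges=True)
doc_without_property = SimpleNamespace()

print(set_show_changes(doc_with_changes), doc_with_changes.ShowChanges)
print(set_show_changes(doc_without_property))
```

A caller could then pass `show=True` to keep the old behaviour, for instance driven by a command-line switch; how such a switch would be named is left open here.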

@vincentbernat
Contributor Author

I am sorry for not having answered yet. For some unknown reason, I didn't receive a notification for your comment. I will add the patch to the next Debian package and ask for testing.

@bulletmark

I converted a heap of Word docs to PDF using unoconv and found that all output files were showing deleted text. I found this bug and applied the patch manually to fix the problem. It would be good to get this fix released.

@vincentbernat
Contributor Author

Hey!

The patch has been in Debian since July 2013 and nobody has complained. It is now in a stable release too. I think you can merge it.

@skywinder
Contributor

@vincentbernat Hi. I also tested it a bit, and it seems to work as expected.
I'm waiting for @dagwieers' approval to release a new version of unoconv,
and I can include this fix there as well.
@dagwieers, what do you think about it?

@dagwieers
Member

I guess we can add it as the default. People who want the old behaviour can write a plugin to do the opposite (once we have the plugin/extension system integrated in v0.8).

@Yichen-fqyd

Hi, when I used unoconv to convert from Word to PDF, the changes (deleted text) were still included. Is there any way to solve this problem?
