Search pops up numerous SaveAs file dialogs for .doc files #629

Closed
Keyhabit opened this issue Jan 27, 2022 · 11 comments

Comments

@Keyhabit

Keyhabit commented Jan 27, 2022

When searching for .doc files, dngrep will pop up Save As file dialogs for certain .doc files. When repeating the process, the dialog pops up again for the exact same select files (only a fraction of the .doc files in the searched folders) so the phenomenon seems to have something to do with the properties or contents of those files.

Any help in finding the cause would be appreciated. As it is, I can't use dnGrep with .doc files for fear of crashing Windows with the large number of dialogs popping up during extensive searches.

This is on Windows 10 with German regional settings and dnGrep v3, but it already occurred on v2.93.

Thanks!

@doug24
Contributor

doug24 commented Jan 27, 2022

If you are seeing Save As file dialogs, they are from Word itself. The only save file dialog in dnGrep is for the search results report outputs.

The plug-in to search the old-style Word documents opens the .doc file using the Word application. dnGrep sets the visibility flag for Word to false so you don't see the application open. It opens the .doc files as read-only and disables all macros. It calls some Word automation commands to extract all the text, closes the file, and closes Word. It's not clear why Word would think the file needs to be saved when it is opened in read-only mode, but it sounds like there is something in these documents that Word automatically updates when the document is opened.

Checking the Word documentation again, there is an optional SaveChanges argument to the document Close method. I can try setting that to the value wdDoNotSaveChanges, and see if that helps.
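
For readers following along, the sequence described above can be sketched roughly as follows. This is only an illustration in Python using the pywin32 package, not dnGrep's actual plug-in code, which is not shown in this thread. The function names `extract_doc_text` and `search_doc` are hypothetical, and the numeric constants are the documented values of Word's `wdDoNotSaveChanges` and Office's `msoAutomationSecurityForceDisable`.

```python
# Hypothetical sketch of the open / extract / close sequence; not dnGrep's code.
WD_DO_NOT_SAVE_CHANGES = 0                  # WdSaveOptions.wdDoNotSaveChanges
MSO_AUTOMATION_SECURITY_FORCE_DISABLE = 3   # MsoAutomationSecurity: disable all macros


def extract_doc_text(word_app, path):
    """Return the text of the .doc at `path` using a Word automation object.

    `word_app` is a Word.Application COM object (or a stand-in with the same
    members). `path` should be an absolute path.
    """
    word_app.Visible = False  # keep the Word window hidden
    word_app.AutomationSecurity = MSO_AUTOMATION_SECURITY_FORCE_DISABLE
    doc = word_app.Documents.Open(path, ReadOnly=True)
    try:
        return doc.Content.Text
    finally:
        # Explicitly tell Word to discard any changes it made on open,
        # so it has no reason to prompt with a Save As dialog.
        doc.Close(SaveChanges=WD_DO_NOT_SAVE_CHANGES)


def search_doc(path, needle):
    """Start Word, check whether `needle` occurs in the document, then quit Word."""
    import win32com.client  # pywin32; requires Windows with Word installed

    word_app = win32com.client.DispatchEx("Word.Application")
    try:
        return needle in extract_doc_text(word_app, path)
    finally:
        word_app.Quit(SaveChanges=WD_DO_NOT_SAVE_CHANGES)
```

The `try`/`finally` blocks are there so that the document and the hidden Word process are closed even if text extraction fails; whether this avoids the dialogs for a particular file is an assumption until it is tried on the affected documents.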

Using Word to do this is very inefficient and prone to errors. It would be great to be able to get the text from an old Word document without using Word itself, but I have been unable to find any free open source library that can read the old file format.

@Keyhabit
Author

Thank you very much for your quick and clear reply!

I'd be very happy to try your suggested fix. As for non-native ways to get at the text inside .doc files, I found this list of converters (some of them free / open source) but maybe you've come across those already and haven't found them usable:

  • XDoc2txt for many file types
  • Kryloff GetText (HTM HTML RTF PDF WPD HLP DOC XLS PPT XML)
  • GetTextIFilter (RTF DOC DOT PPT XLS XLT CS CPP H TXT HTM HTML)
  • SilverCoders DocToText (DOC)
  • AntiWord (DOC)

@doug24
Contributor

doug24 commented Jan 28, 2022

I updated the Word doc plugin and merged it into master. If you are able to build the project, you can try it now. Otherwise, I'll have another release in a few weeks.

@Keyhabit
Author

Thanks for doing this! I'll wait for the next release.

@celeron533
Contributor

One of my friends, who uses another file content search tool (not dnGrep), has had the same issue recently. I guess it is a change on the MS Office side when winword.exe is launched in the background.
taskkill /F /IM winword.exe will kill all Word processes. But be careful to save your own work first before executing this command.

@Keyhabit
Author

Thanks for chiming in and recommending taskkill to get back in control. I will wait to see if the next release solves the issue. If not, I will resort to your fix. Cheers.

@doug24
Contributor

doug24 commented Feb 26, 2022

Fixed (hopefully) in v3.0.29.0

@Keyhabit
Author

Can confirm no unwanted Word dialogs popping up anymore with this latest update. Thank you very much!

@Keyhabit
Author

Noticed a slight issue: some of the searched .doc files now show up in Windows' recent files list, making the list more or less unusable after a search. I suspect it may be just the ones that used to pop up the Save As dialog, but I am not entirely sure. It certainly looks like it isn't all of the .doc files that were searched.

@doug24
Contributor

doug24 commented Mar 27, 2022

I have looked for a solution, but I'm not able to do anything about the recent files list. The problem is with the old-style .doc files: Word is actually opening the files and does not distinguish between a hidden automation session and a user opening the file to view or edit the document.

@Keyhabit
Author

I see. Thanks for looking into it! I should probably just take the time to convert all my .doc's to .docx's but it's a daunting task...

@doug24 doug24 closed this as completed Mar 28, 2022