Files are being skipped from search when the filename has a Spanish accent #503
Comments
I assume you are searching by asterisk pattern, because these aren't good regular expressions. But you aren't including any wildcards in the pattern either. * matches any number of characters, and ? matches a single character. So "*.pdf" would be used to find any file with a pdf extension, and "*.*" would be used to find any file name with any extension. Can you try searching by one of these?
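To illustrate the wildcard rules described above, here is a minimal sketch using Python's standard `fnmatch` module. It is not dnGrep's own matcher, and the file names and the `matching` helper are made up for the example; it only shows how `*` and `?` behave, and why a pattern with no wildcard matches nothing here.

```python
from fnmatch import fnmatch

# Hypothetical file names, chosen only for this illustration.
names = ["Eliseo Verón.pdf", "notes.txt", "report.PDF", "a.pdf"]


def matching(pattern, candidates):
    """Return the candidates matched by a shell-style wildcard pattern."""
    # Lower-case both sides to mimic case-insensitive matching on Windows.
    return [n for n in candidates if fnmatch(n.lower(), pattern.lower())]


print(matching("*.pdf", names))  # every name ending in .pdf, any case
print(matching("*.*", names))    # every name that contains a dot
print(matching("?.pdf", names))  # exactly one character before .pdf
print(matching(".pdf", names))   # no wildcard: only a file literally named ".pdf"
```

With these sample names, the pattern ".pdf" returns an empty list, while "*.pdf" returns the three PDF files, including the one with the accented name.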
Hi, https://i.imgur.com/jz1k140.png Regards,
I did reproduce this bug. It is specific to pdf files, which I had not tested before. As I commented on #504, dnGrep searches plain text, so it uses plug-ins to convert binary formatted files like Word, Excel and PDF to text before searching. The bug isn't actually in dnGrep, but in the pdftotext.exe application that dnGrep calls to extract text from pdf files. When calling pdftotext.exe dnGrep makes the call like this:
But instead of creating the file "Eliseo Verón.txt", pdftotext.exe creates a file named "Eliseo Verón.txt", and dnGrep can't find the correct file to search. This bug appears to be in pdftotext version 4, but not in version 3. The dnGrep installer installs version 4 with the application, but you can overwrite it with the older version. I attached pdftotext.exe version 3 to this note (see below). To use it, open this directory in Windows Explorer:
Fixed in Release 2.9.345
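For readers who call pdftotext from their own scripts and run into the same output-name problem, one possible workaround is to hand the tool only plain ASCII paths. The sketch below is an assumption about how that could be done, not the fix that went into dnGrep: `extract_text` is a hypothetical helper, it expects `pdftotext` to be on the PATH, and it assumes the system temporary directory itself has an ASCII-only path.

```python
import os
import shutil
import subprocess
import tempfile


def extract_text(pdf_path, run=subprocess.run):
    """Extract text from a PDF with pdftotext, using ASCII-only temp paths.

    Hypothetical helper for illustration. `run` is injectable so the
    command can be checked without pdftotext installed.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Copy the PDF to a name with no accented characters, so the tool
        # never has to encode a non-ASCII file name.
        safe_pdf = os.path.join(tmp, "input.pdf")
        safe_txt = os.path.join(tmp, "output.txt")
        shutil.copyfile(pdf_path, safe_pdf)
        # pdftotext usage: pdftotext [options] <PDF-file> [<text-file>]
        run(["pdftotext", "-enc", "UTF-8", safe_pdf, safe_txt], check=True)
        with open(safe_txt, encoding="utf-8") as fh:
            return fh.read()
```

The original file name is only used on the Python side, so any name mangling inside the external tool no longer affects where the extracted text is found.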
Hi,
When I search for some content in a folder, with "paths to match" = "", "." or ".pdf", the results list misses files with a positive match if the filename has a character with a Spanish accent (e.g. "ó").
I've tested this with two copies of the same file, removing the accented character from the name of one of them, and dnGrep finds only one of them.
The file is a PDF, just in case.
Regards,