Fix expansion of bracketed expressions in RegExpBasedFileFinder #7338
This avoids replicating the exact same behaviour in RegExpBasedFileFinder
For this particular use we must assume that the rest of the text is a regexp, and only put the content of the expanded bracket between quotes. It must return a String as this is only a part of the final regexp that will be compiled later.
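A minimal sketch of that idea, assuming a hypothetical `expand` helper and a plain map of field values (neither is JabRef's actual API): only the expanded bracket content is passed through `Pattern.quote`, the surrounding text stays a regexp, and the result is a `String` that is compiled later.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedBracketExpansion {
    // Hypothetical matcher for bracketed field names such as [TITLE]; not JabRef's parser.
    private static final Pattern BRACKET = Pattern.compile("\\[([A-Za-z]+)\\]");

    // Returns a regexp fragment as a String: bracket content is quoted, the rest is left untouched.
    static String expand(String regexWithBrackets, Map<String, String> fields) {
        Matcher matcher = BRACKET.matcher(regexWithBrackets);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            String value = fields.getOrDefault(matcher.group(1), "");
            // quoteReplacement keeps the backslashes of \Q...\E literal in the replacement text
            matcher.appendReplacement(result, Matcher.quoteReplacement(Pattern.quote(value)));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String part = expand(".*[TITLE].*\\.pdf", Map.of("TITLE", "(abc)"));
        System.out.println(Pattern.compile(part).matcher("x (abc) y.pdf").matches());
    }
}
```

Returning a `String` rather than a compiled `Pattern` lets the caller concatenate further regexp parts before compiling once.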
} catch (UncheckedIOException | IOException ioe) {
return Collections.emptyList();
resultFiles.addAll(pathStream.collect(Collectors.toList()));
} catch (UncheckedIOException uncheckedIOException) {
Why is this even an UncheckedIOException?
If needed, the UncheckedIOException is just a wrapper for the IOException, so you could call throw uncheckedIOException.getCause().
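The unwrapping suggested here could look roughly like the sketch below. The class and method names are hypothetical and the directory walk is simplified to depth 1; it only illustrates that `UncheckedIOException.getCause()` is declared to return an `IOException`, so it can be rethrown directly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class WalkUnwrapSketch {
    // Hypothetical helper: lists regular files directly inside dir.
    static List<Path> listFiles(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir, 1)) {
            // The stream is lazy, so I/O errors during traversal surface here as UncheckedIOException
            return paths.filter(Files::isRegularFile).collect(Collectors.toList());
        } catch (UncheckedIOException uncheckedIOException) {
            // Rethrow the wrapped checked exception instead of swallowing it
            throw uncheckedIOException.getCause();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(listFiles(Path.of(".")));
    }
}
```

Whether rethrowing or returning an empty list is the better policy depends on the caller; the sketch only shows the mechanics.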
Frankly, I am not sure why an UncheckedIOException is caught in this part of the code. I don't have much experience with nio.*, but my interpretation of the API is that the UncheckedIOException must be caught in the parts of the code that make use of the Path reference.
I don't know if there are any other potential issues with a lazily loaded file system walk. Based on DirectoryStream and Files.walk I'd guess it could be thrown if depth > 1 and there is a cycle, hence, not in this part of the code unless it is changed.
This needs deeper thinking at review. At first gut feeling,
This is possible in this PR. I'll review my PR description and see if I can clarify.
@koppor @Siedlerchr sorry about that. Sometimes it is a bit too easy to tunnel-vision and assume something is "obvious", while forgetting all the assumptions needed to make it obvious. The PR description is updated and, hopefully, more readable.
src/test/java/org/jabref/logic/util/io/RegExpBasedFileFinderTests.java
LGTM! I've only one suggestion for the tests, otherwise +1 for merge.
Fixes #4342
#4342 contains two issues.
1. [title] does not refer to the StandardField title, but to a "special" pattern intended to give nicer output (this is why dashes don't work, see Finding files with non-alphabetical characters in title #4342 (comment)). This is intended and can be avoided by using [TITLE].
2. (abc) will match abc, but not (abc). If the expanded bracketed pattern contains characters that have a special meaning in regexp (such as parentheses and brackets), it becomes a problem.
An entry with the title "Regexp from [A-Z]" and the bracketed expression [TITLE] would match a file titled "Regexp from B", but not a file named "Regexp from [A-Z]".
The goal of this PR
Match files whose expanded bracketed pattern includes symbols that can be misinterpreted.
The expanded bracket content is quoted with Pattern.quote. I don't see a reasonable use-case for allowing a bracket to expand into an actual regex.
Add a test case for a filename that JabRef corrects (ACM/IEEE-CS and ACM_IEEE-CS).
Note: Regex can still be used outside of the bracketed patterns (except for character classes, since they will be interpreted as a bracketed pattern).
Issues neither addressed in this PR nor the current master
In **/.*[title][a-z]+.*\\.[extension], [a-z] is treated as if it is a bracketed pattern.
Checklist
- Check if parts can be replaced by a glob (getPathMatcher)
- Add type-safety to the expanded bracket (return a compiled Pattern when it is supposed to be a RegularExpression)
- Add test case for an expression that would result in a JabRef corrected filename ("ACM/IEEE-CS Information Technology Curriculum" -> "ACM_IEEE_CS Information Technology Curriculum")
- Make sure a warning appears in the debug log if a filename/dirname is expected to be too long (I don't think this can be done reliably within the scope of this PR)
- Refactor FileUtil.createFileNameFromPattern so the same code is used (I don't think this can be done reliably within the scope of this PR)
- Screenshots added in PR description (for UI changes)
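The character-class limitation listed under the unaddressed issues can be illustrated with a deliberately naive scanner. This is an assumption for illustration only, not JabRef's bracket parser: any scanner that treats every [ ... ] pair as a bracketed pattern cannot tell [a-z] apart from [title].

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BracketScanSketch {
    // Hypothetical rule: anything between '[' and ']' counts as a bracketed pattern
    private static final Pattern ANY_BRACKET = Pattern.compile("\\[([^\\]]*)\\]");

    // Collects the content of every bracket pair found in the file pattern
    static List<String> bracketedParts(String filePattern) {
        List<String> parts = new ArrayList<>();
        Matcher matcher = ANY_BRACKET.matcher(filePattern);
        while (matcher.find()) {
            parts.add(matcher.group(1));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(bracketedParts("**/.*[title][a-z]+.*\\.[extension]"));
    }
}
```

The scanner reports a-z alongside title and extension, which is why a character class in the surrounding regexp ends up being expanded as if it were a field.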