OAK-10590 - If includedPaths and excludedPaths are specified as a String instead of array of String, interpret them as a one-element array of Strings by nfsantos · Pull Request #1254 · apache/jackrabbit-oak

nfsantos · 2023-12-19T16:34:43Z

The includedPaths property of an index definition should be an array of strings. But a common mistake made by users is to define it as a String when it has a single element. That is, instead of:

        "includedPaths": [ "/a/b"] ,

it is defined as:

        "includedPaths": "/a/b",

If includedPaths is defined as a String, the indexing job ignores its value and defaults to using the root as the includedPaths. This results in downloading the full node store and creating an FFS (flat file store) containing everything except hidden paths. The indexing job slows down significantly, because this negates any benefit from regex filtering. Even if regex filtering is not enabled or cannot be used, using / as includedPaths results in the FFS containing more nodes than it should, which again slows down the indexing job.

The same is true for excludedPaths, except that its default value is an empty list. A String value is therefore ignored and nothing is excluded, so parts of the node store may be indexed that should not be.

The handling of includedPaths and excludedPaths is also inconsistent with the handling of queryPaths, which is already interpreted as a one-element array if it is defined as a String.

This PR makes the logic that reads the includedPaths and excludedPaths properties more lenient, by treating Strings as one-element arrays and issuing a warning with a suggested fix.
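A minimal sketch of the lenient behavior described above, written in plain Java so it runs without Oak on the classpath. The class `LenientPathReader` and the method `readPaths` are hypothetical names used only for illustration; this is not the code in `PathFilter`, and the property value is modeled as a plain `Object` rather than an Oak `PropertyState`.

```java
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical helper for illustration only; not the Oak implementation. */
final class LenientPathReader {
    private static final Logger LOG = Logger.getLogger(LenientPathReader.class.getName());

    /**
     * Reads a path property that should be an array of strings.
     * A single String is accepted and treated as a one-element list.
     * Anything else (including null) falls back to the given default.
     */
    static List<String> readPaths(String name, Object value, List<String> defaultPaths) {
        if (value instanceof String[]) {
            // Expected case: the property is an array of strings.
            return List.of((String[]) value);
        }
        if (value instanceof String) {
            // Lenient case: warn and suggest the fix, then wrap the single value.
            LOG.warning("Property \"" + name + "\"=\"" + value
                    + "\" is a String but should be an array of String; "
                    + "treating it as [\"" + value + "\"]");
            return List.of((String) value);
        }
        // Property missing or of an unsupported type: use the default.
        return defaultPaths;
    }

    public static void main(String[] args) {
        // includedPaths defaults to the root, excludedPaths to an empty list.
        System.out.println(readPaths("includedPaths", new String[] {"/a/b"}, List.of("/")));
        System.out.println(readPaths("includedPaths", "/a/b", List.of("/")));
        System.out.println(readPaths("excludedPaths", null, List.of()));
    }
}
```

With this approach a String value yields the same list as the equivalent one-element array, so the root default applies only when the property is actually absent.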

Additionally, the PR makes some minor cleanups in the files that had to be modified (remove usages of Guava and fix some compilation warnings).

thomasmueller · 2023-12-20T09:08:32Z

oak-store-spi/src/main/java/org/apache/jackrabbit/oak/spi/filter/PathFilter.java

            return property.getValue(Type.STRINGS);
+        } else if (property != null && property.getType() == Type.STRING) {
+            String value = property.getValue(Type.STRING);
+            LOG.warn("Property \"{}\"=\"{}\" has type String but it should be array of String. Proceeding by treating it as a " +


I would probably not log a warning. Instead, I would document that this is also supported.

Reason: we already have many many entries that are defined as string (mostly "/dummy") instead of array.

I would make it public and add a unit test case for it.

…TRING instead of STRINGS. Simplified the logic of extracting a list of strings from a property that can be STRING or STRINGS by taking advantage of the default value of PropertyState.getValue(STRINGS), which handles the case of the property being a STRING.

nfsantos added 3 commits December 19, 2023 09:51

WIP

b4ab7a4

If an index definition has includedPaths or excludedPaths defined as …

48dd5f4

…a String instead of array of String, log a warning and treat them as a one-element array instead of assuming that they are not defined.

Fix wrong commit.

42bd018

thomasmueller reviewed Dec 20, 2023

nfsantos added 3 commits December 20, 2023 10:21

Remove setting excludePaths which is not relevant for this test and w…

da0f81c

…as being done incorrectly, by setting the property as a string instead of array of strings.

Update all parts of the code that read included/excluded paths to acc…

235991a

…ept a string value as being a one-element array (NodeCounterMBeanEstimator). Other refactoring to clean up the code.

Add tests and reduce code duplication.

4c5234d

nfsantos requested a review from thomasmueller December 20, 2023 12:01

fabriziofortino approved these changes Dec 21, 2023

nfsantos merged commit e9bb593 into apache:trunk Dec 21, 2023

nfsantos deleted the OAK-10590 branch December 21, 2023 17:32

rishabhdaim pushed a commit that referenced this pull request Jan 25, 2024

OAK-10590 - If includedPaths and excludedPaths are specified as a Str…

dd1bd62

…ing instead of array of String, interpret them as a one-element array of Strings (#1254)

nfsantos merged 7 commits into apache:trunk from nfsantos:OAK-10590
