review commit:
* added a cache to SearchIndexingSignalEnrichmentFacade in order to only evaluate "patterns" once for a given namespace
* removed copy&pasted unit tests in SearchIndexingSignalEnrichmentFacadeTest by adding another abstract test class AbstractCachingSignalEnrichmentFacadeTest
* added a unit test testing selection of JsonFieldSelectors based on different namespaces
* minor cleanup and formatting
* enhanced documentation about how to configure the indexed namespaces via system properties

Signed-off-by: Thomas Jäckle <thomas.jaeckle@beyonnex.io>
thjaeckle committed Jan 24, 2024
1 parent 80086bc commit c297c65
Showing 10 changed files with 650 additions and 993 deletions.
@@ -330,24 +330,31 @@ The default behavior of Ditto is to index the complete JSON of a thing, which in
* Increased load on the search database, leading to performance degradation and increased database cost.
* Only a few fields are ever used for searching.

In Ditto *3.5.0*, there is now configuration to specify, by a namespace pattern, which fields will be included in the search database.
Since Ditto *3.5.0*, there is a configuration to specify, by a namespace pattern, which fields will be included in the search database.

To enable this functionality, there are two new options in the `thing-search.conf` configuration:

```
```hocon
ditto {
...
//...
caching-signal-enrichment-facade-provider = org.eclipse.ditto.thingsearch.service.persistence.write.streaming.SearchIndexingSignalEnrichmentFacadeProvider
...
//...
search {
namespace-search-include-fields = [
{
namespace-pattern = "org.eclipse",
search-include-fields = [ "attributes", "features/info" ]
namespace-pattern = "org.eclipse.test"
search-include-fields = [
"attributes",
"features/info/properties",
"features/info/other"
]
},
{
namespace-pattern = "org.eclipse.test",
search-include-fields = [ "attributes", "features/info/properties/", "features/info/other" ]
namespace-pattern = "org.eclipse*"
search-include-fields = [
"attributes",
"features/info"
]
}
]
}
@@ -356,17 +363,30 @@ ditto {
There is a new implementation of the caching signal enrichment facade provider that must be configured to enable this
functionality.

For each namespace pattern, only the selected fields are included in the search database. In the example above, for
things in the "org.eclipse" namespace, only the "attributes" and "features/info" paths will be the only fields indexed
in the search database. For things in the "org.eclipse.test" namespace, the fields indexed in the search database will
only be "attributes", "features/info/properties", and "features/info/other".
For each namespace pattern, only the selected fields are included in the search database. In the example above, for
things in the "org.eclipse.test" namespace, the fields indexed in the search database will
only be "attributes", "features/info/properties", and "features/info/other".
For things matching the "org.eclipse*" namespace pattern, only the "attributes" and "features/info" paths will be
indexed in the search database.

Important notes:
* Ditto matches the namespace of the thing against the configured namespace-patterns in order and uses the FIRST one
that matches. So make sure more specific namespace-patterns are listed before broader ones.
* Ditto will automatically add the system-level fields it needs to operate, so no manual configuration of these is
necessary.

Example of applying the same configuration via system properties for the `things-search` service:

```shell
-Dditto.search.namespace-search-include-fields.0.namespace-pattern=org.eclipse.test
-Dditto.search.namespace-search-include-fields.0.search-include-fields.0=attributes
-Dditto.search.namespace-search-include-fields.0.search-include-fields.1=features/info/properties
-Dditto.search.namespace-search-include-fields.0.search-include-fields.2=features/info/other
-Dditto.search.namespace-search-include-fields.1.namespace-pattern=org.eclipse*
-Dditto.search.namespace-search-include-fields.1.search-include-fields.0=attributes
-Dditto.search.namespace-search-include-fields.1.search-include-fields.1=features/info
```
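
The first-match rule from the notes above can be illustrated with a small, self-contained Java sketch. It is not Ditto's implementation: the class `NamespacePatternOrderSketch`, its methods, and the way `*` is translated into a regular expression are assumptions made purely for illustration. It only shows why the more specific `org.eclipse.test` entry has to be listed before `org.eclipse*`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

public final class NamespacePatternOrderSketch {

    // Hypothetical helper: '*' means "any characters", everything else is taken literally.
    // How Ditto itself translates namespace patterns is not shown in this document.
    public static Pattern toRegex(final String namespacePattern) {
        return Pattern.compile(Pattern.quote(namespacePattern).replace("*", "\\E.*\\Q"));
    }

    // Returns the field list of the FIRST entry (in insertion order) whose pattern matches.
    public static Optional<List<String>> selectFields(final Map<String, List<String>> config,
            final String namespace) {
        return config.entrySet().stream()
                .filter(entry -> toRegex(entry.getKey()).matcher(namespace).matches())
                .map(Map.Entry::getValue)
                .findFirst();
    }

    // Mirrors the example configuration above; LinkedHashMap keeps the configured order.
    public static Map<String, List<String>> exampleConfig() {
        final Map<String, List<String>> config = new LinkedHashMap<>();
        config.put("org.eclipse.test",
                List.of("attributes", "features/info/properties", "features/info/other"));
        config.put("org.eclipse*", List.of("attributes", "features/info"));
        return config;
    }

    public static void main(final String[] args) {
        final Map<String, List<String>> config = exampleConfig();
        // Matches the specific entry because it is listed first.
        System.out.println(selectFields(config, "org.eclipse.test"));
        // Falls through to the broader "org.eclipse*" entry.
        System.out.println(selectFields(config, "org.eclipse.ditto"));
        // No entry matches, so no field restriction is selected.
        System.out.println(selectFields(config, "com.example"));
    }
}
```

If the two entries were swapped in this sketch, `org.eclipse*` would also match `org.eclipse.test`, and the more specific entry would never be reached.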

## Logging

Gathering logs for a running Ditto installation can be achieved by:
@@ -28,11 +28,11 @@
import org.eclipse.ditto.base.model.headers.DittoHeaders;
import org.eclipse.ditto.base.model.signals.Signal;
import org.eclipse.ditto.base.model.signals.WithResource;
import org.eclipse.ditto.internal.utils.pekko.logging.DittoLoggerFactory;
import org.eclipse.ditto.internal.utils.pekko.logging.ThreadSafeDittoLogger;
import org.eclipse.ditto.internal.utils.cache.Cache;
import org.eclipse.ditto.internal.utils.cache.CacheFactory;
import org.eclipse.ditto.internal.utils.cache.config.CacheConfig;
import org.eclipse.ditto.internal.utils.pekko.logging.DittoLoggerFactory;
import org.eclipse.ditto.internal.utils.pekko.logging.ThreadSafeDittoLogger;
import org.eclipse.ditto.json.JsonFactory;
import org.eclipse.ditto.json.JsonFieldSelector;
import org.eclipse.ditto.json.JsonObject;
@@ -97,7 +97,7 @@ public CompletionStage<JsonObject> retrieveThing(final ThingId thingId, final Li

final DittoHeaders dittoHeaders = DittoHeaders.empty();

JsonFieldSelector fieldSelector = determineSelector(thingId.getNamespace());
final JsonFieldSelector fieldSelector = determineSelector(thingId.getNamespace());

if (minAcceptableSeqNr < 0) {
final var cacheKey =
@@ -450,7 +450,7 @@ private JsonObject enhanceJsonObject(final JsonObject jsonObject, final List<Thi
}

@Nullable
protected JsonFieldSelector determineSelector(String namespace) {
protected JsonFieldSelector determineSelector(final String namespace) {
// By default, we do not return a field selector.
return null;
}
@@ -12,25 +12,17 @@
*/
package org.eclipse.ditto.internal.models.signalenrichment;

import org.apache.pekko.japi.Pair;
import org.eclipse.ditto.base.model.headers.DittoHeaders;
import org.eclipse.ditto.internal.utils.cache.config.CacheConfig;
import org.eclipse.ditto.internal.utils.pekko.logging.DittoLoggerFactory;
import org.eclipse.ditto.internal.utils.pekko.logging.ThreadSafeDittoLogger;
import org.eclipse.ditto.json.JsonFieldSelector;
import org.eclipse.ditto.json.JsonObject;
import org.eclipse.ditto.things.model.Thing;
import org.eclipse.ditto.things.model.ThingId;
import org.eclipse.ditto.things.model.signals.events.ThingEvent;
import static org.eclipse.ditto.base.model.common.ConditionChecker.checkNotNull;

import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Executor;
import java.util.regex.Pattern;

import static org.eclipse.ditto.base.model.common.ConditionChecker.checkNotNull;
import org.apache.pekko.japi.Pair;
import org.eclipse.ditto.internal.utils.cache.config.CacheConfig;
import org.eclipse.ditto.json.JsonFieldSelector;

/**
* Extension of {@code DittoCachingSignalEnrichmentFacade} that allows a selected map of selected indexes grouped by
@@ -39,6 +31,7 @@
public final class SearchIndexingSignalEnrichmentFacade extends DittoCachingSignalEnrichmentFacade {

private final List<Pair<Pattern, JsonFieldSelector>> selectedIndexes;
private final Map<String, JsonFieldSelector> selectedIndexesCache;

private SearchIndexingSignalEnrichmentFacade(
final List<Pair<Pattern, JsonFieldSelector>> selectedIndexes,
@@ -49,7 +42,8 @@ private SearchIndexingSignalEnrichmentFacade(

super(cacheLoaderFacade, cacheConfig, cacheLoaderExecutor, cacheNamePrefix);

this.selectedIndexes = Collections.unmodifiableList(selectedIndexes);
this.selectedIndexes = List.copyOf(selectedIndexes);
selectedIndexesCache = new HashMap<>();
}

/**
@@ -78,16 +72,15 @@ public static SearchIndexingSignalEnrichmentFacade newInstance(
}

@Override
protected JsonFieldSelector determineSelector(String namespace) {
protected JsonFieldSelector determineSelector(final String namespace) {

// We iterate through the list and return the first JsonFieldSelector that matches the namespace pattern.
for (final Pair<Pattern, JsonFieldSelector> pair : selectedIndexes) {

if (pair.first().matcher(namespace).matches()) {
return pair.second();
}
if (!selectedIndexesCache.containsKey(namespace)) {
// We iterate through the list and return the first JsonFieldSelector that matches the namespace pattern.
selectedIndexes.stream()
.filter(pair -> pair.first().matcher(namespace).matches())
.findFirst()
.ifPresent(pair -> selectedIndexesCache.put(namespace, pair.second()));
}

return super.determineSelector(namespace);
return selectedIndexesCache.get(namespace);
}
}
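
In the new `determineSelector` above, the cache is a plain `HashMap` and only namespaces with a matching pattern are stored, so a namespace without any match has all patterns re-evaluated on every call. The following is a minimal sketch of one possible variant, not the committed code. It assumes that `determineSelector` may be called from several threads (the diff does not confirm this) and uses a `String` as a stand-in for `JsonFieldSelector`. The class `SelectorCacheSketch` and its `evaluations` counter are hypothetical and exist only to make the behavior observable.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

public final class SelectorCacheSketch {

    // Stand-in for List<Pair<Pattern, JsonFieldSelector>>; the selector is a plain String here.
    private final List<Map.Entry<Pattern, String>> selectedIndexes;

    // Optional values allow "no pattern matched" to be cached as well.
    private final Map<String, Optional<String>> cache = new ConcurrentHashMap<>();

    // Counts how often the pattern list was actually evaluated (for illustration only).
    public final AtomicInteger evaluations = new AtomicInteger();

    public SelectorCacheSketch(final List<Map.Entry<Pattern, String>> selectedIndexes) {
        this.selectedIndexes = List.copyOf(selectedIndexes);
    }

    // Returns the selector of the first matching pattern, or null if none matches.
    public String determineSelector(final String namespace) {
        return cache.computeIfAbsent(namespace, this::evaluatePatterns).orElse(null);
    }

    private Optional<String> evaluatePatterns(final String namespace) {
        evaluations.incrementAndGet();
        return selectedIndexes.stream()
                .filter(entry -> entry.getKey().matcher(namespace).matches())
                .map(Map.Entry::getValue)
                .findFirst();
    }

    public static void main(final String[] args) {
        final SelectorCacheSketch sketch = new SelectorCacheSketch(
                List.of(Map.entry(Pattern.compile("org\\.eclipse.*"), "attributes")));
        sketch.determineSelector("org.eclipse.test");
        sketch.determineSelector("org.eclipse.test");
        sketch.determineSelector("com.example");
        sketch.determineSelector("com.example");
        // Each distinct namespace is evaluated once, whether or not a pattern matched.
        System.out.println(sketch.evaluations.get());
    }
}
```

Caching an empty `Optional` trades a small amount of memory per unmatched namespace for skipping repeated pattern evaluation. Whether that trade-off is worthwhile depends on how many distinct namespaces an installation sees.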
