Support for standalone index fields and query conditions on this fields (#43)

* Add support for standalone fields and where conditions

* Remove obsolete limitation from README
Namoshek committed May 15, 2023
1 parent 637430c commit 2c24bf7
Showing 10 changed files with 939 additions and 4 deletions.
57 changes: 56 additions & 1 deletion README.md
@@ -155,14 +155,69 @@ based on the inverse document frequency (i.e. the ratio between indexed document
 the term frequency (i.e. the number of occurrences of a search term within a document) and the term deviation (which is only relevant for the
 wildcard search). Returned are documents ordered by their score in descending order, until the desired limit is reached.

+### Extending the Search Index
+
+It is possible to extend the search index table (`scout_index`) with custom columns.
+During indexing, these columns may be filled with custom content and during searching the searches can be scoped to these columns (exact match).
+This feature is particularly useful when working with a multi-tenancy application where the search index is used by multiple tenants.
+
+#### Example Migration
+
+In our example, we add a mandatory `tenant_id` column to the search index.
+
+```php
+return new class extends Migration {
+    public function up(): void
+    {
+        Schema::table('scout_index', function (Blueprint $table) {
+            $table->uuid('tenant_id');
+        });
+    }
+
+    public function down(): void
+    {
+        Schema::table('scout_index', function (Blueprint $table) {
+            $table->dropColumn(['tenant_id']);
+        });
+    }
+};
+```
+
+#### Indexing Example
+
+The `tenant_id` is added during indexing for each model:
+
+```php
+class User extends Model
+{
+    public function toSearchableArray(): array
+    {
+        return [
+            'id' => $this->id,
+            'name' => $this->name,
+            'tenant_id' => new StandaloneField($this->tenant_id),
+        ];
+    }
+}
+```
+
+#### Search Example
+
+The `tenant_id` is filtered during search based on the `$tenantId`, which may for example be taken from the HTTP request:
+
+```php
+User::search('Max Mustermann')
+    ->where('tenant_id', $tenantId)
+    ->get();
+```
+
 ## Limitations

 Obviously, this package does not provide a search engine which (even remotely) brings the performance and quality a professional search engine
 like Elasticsearch offers. This solution is meant for smaller to medium-sized projects which are in need of a rather simple-to-setup solution.

 Also worth noting, the following Scout features are currently not implemented:
 - Soft Deletes
-- Search with custom conditions using `User::search('Max')->where('city', 'Bregenz')`
 - Search custom index using `User::search('Mustermann')->within('users_without_admins')`
 - Search with custom order using `User::search('Musterfrau')->orderBy('age', 'desc')`
   - Implementing this feature would be difficult in combination with the scoring algorithm. Only the result of the database query could be ordered, while this could then lead to issues with pagination.
32 changes: 29 additions & 3 deletions src/DatabaseIndexer.php
@@ -46,9 +46,14 @@ public function index(Collection|array $models): void
         // Normalize the searchable data of the model. First, all inputs are converted to their
         // lower case counterpart. Then the input for each attribute is tokenized and the resulting
         // tokens are stemmed. The result is an array of models with a list of stemmed words.
-        $rowsToInsert = [];
+        $rowsToInsert = [];
+        $standaloneFieldsToInsert = [];
         foreach ($models as $model) {
-            $stems = Arr::flatten($this->normalizeSearchableData($model->toSearchableArray()));
+            $searchableArray = $model->toSearchableArray();
+            $searchableData = array_filter($searchableArray, fn ($value) => ! $value instanceof StandaloneField);
+            $standaloneFields = array_filter($searchableArray, fn ($value) => $value instanceof StandaloneField);
+
+            $stems = Arr::flatten($this->normalizeSearchableData($searchableData));

             $terms = [];
             foreach ($stems as $stem) {
@@ -60,13 +65,34 @@ public function index(Collection|array $models): void
             }

             foreach ($terms as $term => $hits) {
-                $rowsToInsert[] = [
+                $row = [
                     'document_type' => $model->searchableAs(),
                     'document_id' => $model->getKey(),
                     'term' => (string) $term,
                     'length' => mb_strlen((string) $term),
                     'num_hits' => $hits,
                 ];
+
+                foreach ($standaloneFields as $key => /** @var StandaloneField $value */ $value) {
+                    $row[$key] = $value->value;
+
+                    if (! in_array($key, $standaloneFieldsToInsert)) {
+                        $standaloneFieldsToInsert[] = $key;
+                    }
+                }
+
+                $rowsToInsert[] = $row;
             }
         }
+
+        // Ensure that all rows have the same standalone fields or a null replacement.
+        if (! empty($standaloneFieldsToInsert)) {
+            foreach ($rowsToInsert as $key => $row) {
+                foreach ($standaloneFieldsToInsert as $standaloneFieldToInsert) {
+                    if (! array_key_exists($standaloneFieldToInsert, $row)) {
+                        $rowsToInsert[$key][$standaloneFieldToInsert] = null;
+                    }
+                }
+            }
+        }
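
Note on the indexer change: it does two things. It separates `StandaloneField` values from the data that gets tokenized, and it pads every row with `null` for standalone columns it lacks, presumably so that a multi-row insert sees the same columns in every row. The following is a minimal, framework-free sketch of that logic, not the package's code. `StandaloneFieldSketch`, `splitSearchableArray()` and `padRows()` are hypothetical names introduced only for illustration.

```php
<?php

// Hypothetical stand-in for the StandaloneField wrapper added in this commit.
final class StandaloneFieldSketch
{
    public function __construct(public mixed $value)
    {
    }
}

/**
 * Splits a searchable array into data to tokenize and unwrapped standalone column values.
 *
 * @return array{0: array<string, mixed>, 1: array<string, mixed>}
 */
function splitSearchableArray(array $searchableArray): array
{
    $searchable = [];
    $standalone = [];

    foreach ($searchableArray as $key => $value) {
        if ($value instanceof StandaloneFieldSketch) {
            $standalone[$key] = $value->value;
        } else {
            $searchable[$key] = $value;
        }
    }

    return [$searchable, $standalone];
}

/**
 * Adds a null entry for every listed column a row does not define,
 * so that all rows end up with the same set of keys.
 */
function padRows(array $rows, array $columns): array
{
    foreach ($rows as $index => $row) {
        foreach ($columns as $column) {
            if (! array_key_exists($column, $row)) {
                $rows[$index][$column] = null;
            }
        }
    }

    return $rows;
}

// Usage: one model defines tenant_id, another (hypothetical) model does not.
[$searchable, $standalone] = splitSearchableArray([
    'name' => 'Max Mustermann',
    'tenant_id' => new StandaloneFieldSketch('t-1'),
]);

$rows = padRows([
    ['term' => 'max'] + $standalone,
    ['term' => 'post'],
], ['tenant_id']);
```

The sketch uses `array_key_exists()` rather than `isset()`, as the diff does, so a standalone field that is explicitly `null` still counts as present.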

21 changes: 21 additions & 0 deletions src/DatabaseSeeker.php
@@ -116,6 +116,11 @@ private function createSearchQuery(Builder $builder, array $keywords): QueryBuil
             ->table('matches_with_score')
             ->withExpression('documents_in_index', function (QueryBuilder $query) use ($builder) {
                 $query->from($this->databaseHelper->indexTable())
+                    ->when(! empty(self::getWhereConditions($builder)), function (QueryBuilder $query) use ($builder) {
+                        foreach (self::getWhereConditions($builder) as $key => $value) {
+                            $query->where($key, $value);
+                        }
+                    })
                     ->whereRaw("document_type = '{$builder->model->searchableAs()}'")
                     ->select([
                         'document_type',
@@ -125,6 +130,11 @@ private function createSearchQuery(Builder $builder, array $keywords): QueryBuil
             })
             ->withExpression('document_index', function (QueryBuilder $query) use ($builder) {
                 $query->from($this->databaseHelper->indexTable())
+                    ->when(! empty(self::getWhereConditions($builder)), function (QueryBuilder $query) use ($builder) {
+                        foreach (self::getWhereConditions($builder) as $key => $value) {
+                            $query->where($key, $value);
+                        }
+                    })
                     ->whereRaw("document_type = '{$builder->model->searchableAs()}'")
                     ->select([
                         'id',
@@ -212,4 +222,15 @@ private function getTokenizedStemsFromSearchString(string $searchString): array

         return array_map(fn ($word) => $this->stemmer->stem($word), $words);
     }
+
+    /**
+     * Returns the filtered where conditions of the given builder which are supported by this search engine.
+     *
+     * @param Builder $builder
+     * @return array
+     */
+    private static function getWhereConditions(Builder $builder): array
+    {
+        return array_filter($builder->wheres, fn ($key) => $key !== '__soft_deleted', ARRAY_FILTER_USE_KEY);
+    }
 }
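
Note on `getWhereConditions()`: the helper filters by array key, dropping the `__soft_deleted` entry and passing everything else through as an exact-match column filter. A standalone sketch of that filter follows. It assumes, as the loop in the diff implies, that `$builder->wheres` is a plain map of column name to value. `supportedWhereConditions()` is a hypothetical name, not part of the package.

```php
<?php

/**
 * Keeps only the conditions that can be applied as exact-match column filters.
 * The '__soft_deleted' key is the one excluded in the diff above.
 *
 * @param  array<string, mixed> $wheres map of column name to expected value (assumed shape)
 * @return array<string, mixed>
 */
function supportedWhereConditions(array $wheres): array
{
    return array_filter(
        $wheres,
        fn ($key): bool => $key !== '__soft_deleted',
        ARRAY_FILTER_USE_KEY
    );
}

// Usage: the soft-delete marker is dropped, the tenant filter is kept.
$conditions = supportedWhereConditions([
    'tenant_id' => 't-1',
    '__soft_deleted' => 0,
]);
```

Because `ARRAY_FILTER_USE_KEY` passes only the key to the callback, a condition whose value is `0`, `null` or an empty string is still kept. With the default `array_filter()` mode such values would be silently dropped.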
20 changes: 20 additions & 0 deletions src/StandaloneField.php
@@ -0,0 +1,20 @@
+<?php
+
+declare(strict_types=1);
+
+namespace Namoshek\Scout\Database;
+
+use Laravel\Scout\Builder;
+
+/**
+ * A wrapper for indexable data which can be used to mark a field as standalone.
+ * Such fields will be indexed in separate database columns and are filterable with {@see Builder::where()} for exact matches.
+ *
+ * @package Namoshek\Scout\Database
+ */
+class StandaloneField
+{
+    public function __construct(public mixed $value)
+    {
+    }
+}
1 change: 1 addition & 0 deletions tests/DatabaseIndexerTest.php
@@ -49,6 +49,7 @@ public function test_removing_all_entities_of_model_from_search_does_not_affect_
         User::removeAllFromSearch();

         $this->assertDatabaseCount('scout_index', 3);
+        $this->assertDatabaseMissing('scout_index', ['document_type' => 'user']);
         $this->assertDatabaseHas('scout_index', [
             'document_type' => 'post',
             'document_id' => 1,
