Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Laravel / Eloquent #5

Closed
chrisvidal opened this issue Dec 9, 2021 · 14 comments
Closed

Laravel / Eloquent #5

chrisvidal opened this issue Dec 9, 2021 · 14 comments
Assignees
Labels
help wanted Extra attention is needed

Comments

@chrisvidal
Copy link

Hi,
would that be possible to use a DB as an index instead of a json file? Or I am misunderstanding something.
My tech stack is MySQL 5.7.x, OctoberCMS/Laravel with Eloquent

@k-samuel k-samuel self-assigned this Dec 9, 2021
@k-samuel k-samuel added the help wanted Extra attention is needed label Dec 9, 2021
@k-samuel
Copy link
Owner

k-samuel commented Dec 9, 2021

Hi, of course. Json file used only for example. Moreover, it is even better to cache in Redis/etc.
Index file data is an associative array. You may choose the storage option at your discretion.

The only thing to understand. Index data must be fully loaded into script memory before starting the search.
$searchIndex->setData($indexData);

@chrisvidal
Copy link
Author

chrisvidal commented Dec 10, 2021

Thanks for your reply.

Let me see if I understand properly:

  • I need to maintain the DB index table: do I need to use the $searchIndex->addRecord() method or?
    • if I need to handle my own Index table in the DB, what rules and structure does it need to follow?
  • when a search occurs on the frontend, not sure what is supposed to be done here on backend:
    • do myself a select * from index then loop through all the results to build an array and do the ->setData($indexData)?
  • I can see the use of the filters in the code example; I am not clear if we can search for a string as well in some fields like name and description for instance?

well, obviously, I have no clue how to use this lib, and would love some more doc for a real case like mine.

thanks

@k-samuel
Copy link
Owner

k-samuel commented Dec 10, 2021

Working with faceted search consists of 2 stages.

Stage 1. Building the index.
We need to collect the faceted index in the background (similar to how Sphinx does it).

Fetch all rows from the database that need to be indexed and pass them to a faceted index.

 $index->add(...)

Then save the resulting index to the database.

$data = $index->getData();
$yourDbStorage->saveIndexData($data);

You need to re-index periodically, depending on how often your data changes.
Sometimes it is enough to do this once a day, there are times that you need to re-index in the background every 10 minutes.

Stage 2. Search
When a user comes to the site and makes a request, we fetch the index data from the DB/cache and do

$index->setData($yourDbStorage->loadIndexData());

Then we search

$search->find(...);

Faceted search itself is not full-text and cannot search for incomplete matches.
It only searches for a complete match of the attribute value.

It is designed to filter by attributes values.

How to be with fultext-search: #3

@chrisvidal
Copy link
Author

Thanks a lot @k-samuel for this.

I totally get the concept of it all. thanks.

It is the concrete implementation of the index that I have difficulty to envion.

How does this look like in reality $yourDbStorage->saveIndexData($data); ?
$data is an array. How can I save this array?

I was thinking to sync the index whenever a record is updated but it does not seem to be the principle at use here?

@k-samuel
Copy link
Owner

k-samuel commented Dec 10, 2021

The easiest option in pseudocode

$sql = ' INSERT INTO my_facet_table (`category`,`index_data`) VALUES ("my_index_name", "'.json_encode($index->getData()).'")';

$query->execute($sql);

where index_data is LONGTEXT or JSON column

At this moment realtime index option is not implemented. Only full indexing.

@k-samuel
Copy link
Owner

k-samuel commented Dec 10, 2021

Workaround. You can create update queue for records and launch indexer after update event. For example after uploading pricelist or batch update of product records.

@k-samuel
Copy link
Owner

k-samuel commented Dec 10, 2021

Thanks for your reply.

Let me see if I understand properly:

* I need to maintain the DB index table: do I need to use the `$searchIndex->addRecord()` method or?

* * if I need to handle my own Index table in the DB, what rules and structure does it need to follow?

* when a search occurs on the frontend, not sure what is supposed to be done here on backend:
  
  * do myself a `select * from index` then loop through all the results to build an array and do the `->setData($indexData)`?

* I can see the use of the filters in the code example; I am not clear if we can search for a string as well in some fields like `name` and `description` for instance?

well, obviously, I have no clue how to use this lib, and would love some more doc for a real case like mine.

thanks

The index data is a black box by design. Think of it as the binary data of the index itself. After collecting them, you need to save them somewhere in order to later reuse.

Try running the example https://github.com/k-samuel/faceted-search/tree/master/examples and see how the code works.

Attribute values ​​are passed as filters to the search, which can be found from $search->findAcceptableFiltersCount / findAcceptableFilters

These are exact matches of values. If you pass the "green" color to the filter, then the attribute should contain exactly that value.

Full-text search by part of the title or phrase is not provided, this is a slightly different direction.

I hope this helps.

@chrisvidal
Copy link
Author

yes it helps Thanks @k-samuel

I can spot 3 use cases in my app:

  1. search for a term
  2. search for a term with additional filters
  3. only filters

For 1./ I am thinking to create the index in real time after executing the SQL query to retrieve the records that match the searched term, then create the index because I'd like to use the findAcceptableFiltersCount for frontend

For 2./ similar to 1./ I guess

For 3./ I will use an existing index maintained and refresh every XX minutes.

@k-samuel
Copy link
Owner

k-samuel commented Dec 11, 2021

Sounds good and Interesting. Before that, I did not think that the index can be built on the basis of the query results.

For the first option, there is an alternative way - to get a list of identifiers that you found when searching for a term and pass them to findAcceptableFiltersCount as the second parameter. In this case, you do not have to rebuild the index every time.

$filters = $search->findAcceptableFiltersCount([],$foundIds);

@chrisvidal
Copy link
Author

thanks, it sounds great so far.

I am trying your idea

$foundIds is a flat array of Ids.
$filters is an empty array.

$data = $search->findAcceptableFiltersCount($filters, $foundIds); returns an empty array;

any idea why? probably the way I build the index?

@chrisvidal
Copy link
Author

...

    $companies = $query->get();

    foreach ($companies as $item) {
           $recordId = $item->id;
            $foundIds[] = $recordId;
            $itemData = [
                'capabilities' => $item['capabilities']->map(function ($value) use ($item) {
                    return $value->name;
                })->toArray(),
                'certificates' => $item['certificates']->map(function ($value) use ($item) {
                    return $value->name;
                })->toArray(),
                'tacs' => $item['tacs']->map(function ($value) use ($item) {
                    return $value->name;
                })->toArray(),
                'industries' => $item['industries']->map(function ($value) use ($item) {
                    return $value->name;
                })->toArray(),
                'enterprise_type' => $item->enterprise_type ? $item->enterprise_type->name : null,
                'country' => $item->country ? $item->country->name : null,
            ];
            $searchIndex->addRecord($recordId, $itemData);
        }

    $search = new Search($searchIndex);

   $result = [
       'filters' => $this->findFilters($search, $filters, $foundIds),
       'results' => ['data' => $results, 'count' => count($results), 'limit' => $pageLimit],
   ];

@k-samuel
Copy link
Owner

k-samuel commented Dec 13, 2021

It looks like a bug.
Here is a fix for this case (rare use case). f52b904

I am planing to release of the new version of the library with fix today at 20:00 Moscow time.

Thank you for finding the problem and helping to improve the project.

  • Fixed in v1.3.4 and v2.0.0

@chrisvidal
Copy link
Author

awesome! i updated to 1.3.4 and the acceptable filters are now available. Thanks @k-samuel

Is the v2.0 ready for use? What does it bring?

@k-samuel
Copy link
Owner

k-samuel commented Dec 14, 2021

It changes index structure (incompatible change) https://github.com/k-samuel/faceted-search/blob/master/changelog.md
Yes, ready to go. We can start a new thread for discussion. If the question according to the database is closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants