Use the pgvector extension with Laravel Scout for vector similarity search.
For a full example of how to use this package, check out benbjurstrom/pgvector-scout-demo.
composer require benbjurstrom/pgvector-scout
php artisan vendor:publish --tag="scout-config"
php artisan vendor:publish --tag="pgvector-scout-config"
This is the content of the published pgvector-scout.php config file. By default it contains four indexes: one each for OpenAI, Google Gemini, and Ollama, plus a fake index used for testing. The rest of this guide uses the OpenAI index as an example.
return [
    /*
    |--------------------------------------------------------------------------
    | Embedding Index Configurations
    |--------------------------------------------------------------------------
    |
    | Here you can define the configuration for different embedding indexes.
    | Each index can have its own specific configuration options.
    |
    */
    'indexes' => [
        'openai' => [
            'handler' => Handlers\OpenAiHandler::class,
            'model' => 'text-embedding-3-small',
            'dimensions' => 256, // See Reducing embedding dimensions https://platform.openai.com/docs/guides/embeddings#use-cases
            'url' => 'https://api.openai.com/v1',
            'api_key' => env('OPENAI_API_KEY'),
            'table' => 'openai_embeddings',
        ],
        'gemini' => [
            'handler' => Handlers\GeminiHandler::class,
            'model' => 'text-embedding-004',
            'dimensions' => 256,
            'url' => 'https://generativelanguage.googleapis.com/v1beta',
            'api_key' => env('GEMINI_API_KEY'),
            'table' => 'gemini_embeddings',
            'task' => 'SEMANTIC_SIMILARITY', // https://ai.google.dev/api/embeddings#tasktype
        ],
        'ollama' => [
            'handler' => Handlers\OllamaHandler::class,
            'model' => 'nomic-embed-text',
            'dimensions' => 768,
            'url' => 'http://localhost:11434/api/embeddings',
            'api_key' => 'none',
            'table' => 'ollama_embeddings',
        ],
        'fake' => [ // Used for testing
            'handler' => Handlers\FakeHandler::class,
            'model' => 'fake',
            'dimensions' => 3,
            'url' => 'https://example.com',
            'api_key' => '123',
            'table' => 'fake_embeddings',
        ],
    ],
];
SCOUT_DRIVER=pgvector
OPENAI_API_KEY=your-api-key
php artisan scout:index openai
php artisan migrate
Add the HasEmbeddings and Searchable traits to your model. Additionally, add a searchableAs() method that returns the name of your index. Finally, implement toSearchableArray() with the content from the model that you want converted into an embedding.
use BenBjurstrom\PgvectorScout\Models\Concerns\HasEmbeddings;
use Illuminate\Database\Eloquent\Model;
use Laravel\Scout\Searchable;

class YourModel extends Model
{
    use HasEmbeddings, Searchable;

    /**
     * Get the name of the index associated with the model.
     */
    public function searchableAs(): string
    {
        return 'openai';
    }

    /**
     * Get the indexable content for the model.
     */
    public function toSearchableArray(): array
    {
        return [
            'title' => $this->title,
            'content' => $this->content,
        ];
    }
}
Laravel Scout uses Eloquent model observers to keep your search index in sync whenever your Searchable models change.
This package uses that functionality to automatically generate embeddings for your models when they are saved or updated, and to remove them when your models are deleted.
If you want to manually generate embeddings for existing models, you can use the artisan command below. See the Scout documentation for more information.
php artisan scout:import "App\Models\YourModel"
You can use the typical Scout syntax to search your models. For example:
$results = YourModel::search('your search query')->get();
Note that the text of your query is converted into a vector embedding using the handler configured for the model's index. It's important that the same embedding model is used for both indexing and searching.
You can also pass an existing embedding vector as a search parameter. This can be useful to find related models. For example:
$vector = $someModel->embedding->vector;
$results = YourModel::search($vector)->get();
All search results are ordered by similarity to the given input and include the embedding relationship. The distance computed by the nearest neighbor search can be accessed as follows:
$results = YourModel::search('your search query')->get();
$results->first()->embedding->neighbor_distance; // 0.26834 (example value)
The larger the distance, the less similar the result is to the input.
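To make the meaning of the distance concrete, here is a minimal sketch of cosine distance in plain PHP. This README does not state which metric the package uses, so cosine distance (one of the metrics pgvector supports) is an assumption here, and the cosineDistance() function is a hypothetical helper written for illustration, not part of the package.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative only: cosine distance between two equal-length vectors,
 * defined as 1 - cos(angle between them). Assumed metric, see lead-in.
 */
function cosineDistance(array $a, array $b): float
{
    // Normalise keys so both arrays are indexed 0..n-1.
    $a = array_values($a);
    $b = array_values($b);

    if (count($a) === 0 || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA === 0.0 || $normB === 0.0) {
        throw new InvalidArgumentException('Cosine distance is undefined for zero vectors.');
    }

    return 1.0 - $dot / (sqrt($normA) * sqrt($normB));
}

$same = cosineDistance([1.0, 0.0], [1.0, 0.0]);      // 0.0: same direction
$unrelated = cosineDistance([1.0, 0.0], [0.0, 1.0]); // 1.0: orthogonal
$opposite = cosineDistance([1.0, 0.0], [-1.0, 0.0]); // 2.0: opposite direction
```

Under this assumption, values near 0 indicate closely related content and values approach 2 as vectors point in opposite directions.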
By default this package uses OpenAI to generate embeddings, pairing the OpenAiHandler class with the openai index defined in the package's config file.
You can generate embeddings from other providers by adding a custom handler. A handler is a simple class that implements HandlerContract: it takes a string and a config object and returns a Pgvector\Laravel\Vector object.
Whatever API calls or logic are needed to turn a string into a vector should be defined in the handle method of your custom handler.
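As a rough illustration of the string-to-vector step, the sketch below builds a toy, provider-free embedding by hashing words into a fixed number of buckets. The toyEmbedding() function is hypothetical and is not part of this package; a real handler would place logic like this (or an API call) inside its handle method and wrap the resulting array in a Pgvector\Laravel\Vector object. It is only meant to show the shape of the data, not to produce useful embeddings.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: turn a string into a fixed-length, L2-normalised
 * list of floats by counting words per hash bucket.
 */
function toyEmbedding(string $input, int $dimensions): array
{
    if ($dimensions < 1) {
        throw new InvalidArgumentException('Dimensions must be at least 1.');
    }

    $vector = array_fill(0, $dimensions, 0.0);

    // Split on non-word characters and drop empty tokens.
    $tokens = preg_split('/\W+/', strtolower($input), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    foreach ($tokens as $token) {
        // Map each word to one bucket and count it.
        $vector[crc32($token) % $dimensions] += 1.0;
    }

    $norm = sqrt(array_sum(array_map(fn (float $x): float => $x * $x, $vector)));

    // Leave an all-zero vector untouched to avoid dividing by zero.
    if ($norm === 0.0) {
        return $vector;
    }

    return array_map(fn (float $x): float => $x / $norm, $vector);
}

// The length matches the 'dimensions' value you would configure for the index.
$embedding = toyEmbedding('Laravel Scout with pgvector', 8);
```

The important property is that the returned list always has exactly as many entries as the index's configured dimensions, since the embeddings table column is sized to match.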
If you need to pass API keys, embedding dimensions, or any other configuration to your handler, you can define them in the config/pgvector-scout.php file.
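For instance, an index entry for a custom handler might look like the sketch below. It reuses the keys from the default indexes shown earlier; the index name, handler class, model name, URL, and environment variable are all hypothetical placeholders.

```php
<?php

// Sketch of config/pgvector-scout.php with one extra, hypothetical index.
return [
    'indexes' => [
        // ... keep the existing indexes here ...

        'custom' => [
            'handler' => App\Handlers\CustomHandler::class, // hypothetical handler class
            'model' => 'my-embedding-model',                // hypothetical model name
            'dimensions' => 512,                            // must match what the handler returns
            'url' => 'https://embeddings.example.com/v1',   // placeholder URL
            'api_key' => env('CUSTOM_EMBEDDINGS_API_KEY'),  // hypothetical env variable
            'table' => 'custom_embeddings',
        ],
    ],
];
```

A model would then return 'custom' from searchableAs(), and the index would be created in the same way as the OpenAI one, with php artisan scout:index custom followed by php artisan migrate.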
If you're using DBngin for local development, you can install the pgvector extension by doing the following:
- Add PostgreSQL to your path:
export PATH=/Users/Shared/DBngin/postgresql/14.3/bin:$PATH
- Then install pgvector:
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make && make install
The MIT License (MIT). Please see License File for more information.