4 changes: 2 additions & 2 deletions demo/AGENTS.md
@@ -62,7 +62,7 @@ symfony console mcp:server
- **Agents**: blog, stream, youtube, wikipedia, audio
- **Platform**: OpenAI integration
- **Store**: ChromaDB vector store
- **Indexer**: Text embedding model
- **Ingester**: Text embedding model

### Chat Pattern
- `Chat` class: Message flow and session management
@@ -76,4 +76,4 @@ symfony console mcp:server
- OpenAI GPT-4o-mini default model
- ChromaDB on port 8080
- LiveComponents for real-time UI
- Symfony DI and best practices
- Symfony DI and best practices
8 changes: 4 additions & 4 deletions demo/CLAUDE.md
@@ -10,7 +10,7 @@ This is a Symfony 7.3 demo application showcasing AI integration capabilities us

### Core Components
- **Chat Systems**: Multiple specialized chat implementations in `src/` (Blog, YouTube, Wikipedia, Audio, Stream)
- **Twig LiveComponents**: Interactive UI components using Symfony UX for real-time chat interfaces
- **Twig LiveComponents**: Interactive UI components using Symfony UX for real-time chat interfaces
- **AI Agents**: Configured agents with different models, tools, and system prompts
- **Vector Store**: ChromaDB integration for embedding storage and similarity search
- **MCP Tools**: Model Context Protocol tools for extending agent capabilities
@@ -36,7 +36,7 @@ composer install
echo "OPENAI_API_KEY='sk-...'" > .env.local

# Initialize vector store
symfony console ai:store:index blog -vv
symfony console ai:store:ingest blog -vv

# Test vector store
symfony console ai:store:retrieve blog "Week of Symfony"
@@ -81,7 +81,7 @@ symfony console mcp:server
- **Agents**: Multiple pre-configured agents (blog, stream, youtube, wikipedia, audio)
- **Platform**: OpenAI integration with API key from environment
- **Store**: ChromaDB vector store for similarity search
- **Indexer**: Text embedding model configuration
- **Ingester**: Text embedding model configuration

### Chat Implementations
Each chat type follows the pattern:
@@ -100,4 +100,4 @@ Chat history stored in Symfony sessions with component-specific keys (e.g., 'blo
- ChromaDB runs on port 8080 (mapped from container port 8000)
- Application follows Symfony best practices with dependency injection
- LiveComponents provide real-time UI updates without custom JavaScript
- MCP server enables tool integration for AI agents
- MCP server enables tool integration for AI agents
2 changes: 1 addition & 1 deletion demo/config/packages/ai.yaml
@@ -89,7 +89,7 @@ ai:
openai:
platform: 'ai.platform.openai'
model: 'text-embedding-ada-002'
indexer:
ingester:
blog:
loader: 'Symfony\AI\Store\Document\Loader\RssFeedLoader'
source: 'https://feeds.feedburner.com/symfony/blog'
38 changes: 19 additions & 19 deletions docs/bundles/ai-bundle.rst
@@ -124,7 +124,7 @@ Advanced Example with Multiple Agents
mistral_embeddings:
platform: 'ai.platform.mistral'
model: 'mistral-embed'
indexer:
ingester:
default:
loader: 'Symfony\AI\Store\Document\Loader\InMemoryLoader'
vectorizer: 'ai.vectorizer.openai_embeddings'
@@ -721,26 +721,26 @@ The ``ai:store:drop`` command drops the infrastructure for a store (e.g., remove
This command only works with stores that implement ``ManagedStoreInterface``.
Not all store types support drop operations.

``ai:store:index``
~~~~~~~~~~~~~~~~~~
``ai:store:ingest``
~~~~~~~~~~~~~~~~~~~

The ``ai:store:index`` command indexes documents into a store using a configured indexer.
The ``ai:store:ingest`` command ingests documents into a store using a configured ingester.

.. code-block:: terminal

$ php bin/console ai:store:index <indexer>
$ php bin/console ai:store:ingest <ingester>

# Index using the default indexer
$ php bin/console ai:store:index default
# Ingest using the default ingester
$ php bin/console ai:store:ingest default

# Override the configured source with a single file
$ php bin/console ai:store:index blog --source=/path/to/file.txt
$ php bin/console ai:store:ingest blog --source=/path/to/file.txt

# Override with multiple sources
$ php bin/console ai:store:index blog --source=/path/to/file1.txt --source=/path/to/file2.txt
$ php bin/console ai:store:ingest blog --source=/path/to/file1.txt --source=/path/to/file2.txt

The ``--source`` (or ``-s``) option allows you to override the source(s) configured in your indexer.
This is useful for ad-hoc indexing operations or testing different data sources.
The ``--source`` (or ``-s``) option allows you to override the source(s) configured in your ingester.
This is useful for ad-hoc ingestion or for testing different data sources.
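
The override behaviour described above can be pictured with a small standalone sketch. The ``resolveSources()`` helper is hypothetical and is not the bundle's actual implementation; it only assumes that any ``--source`` value replaces the configured source(s) and that the configuration is used otherwise.

```php
<?php

/**
 * Hypothetical helper (not part of symfony/ai): decides which sources to ingest.
 *
 * @param string|array|null $configured source(s) from the ingester configuration
 * @param array             $cliSources values collected from repeated --source options
 */
function resolveSources(string|array|null $configured, array $cliSources): array
{
    // Any --source value given on the command line replaces the configured source(s).
    if ([] !== $cliSources) {
        return array_values($cliSources);
    }

    // Otherwise fall back to the configuration, normalised to a list.
    return null === $configured ? [] : (array) $configured;
}

// No override: the configured feed is used.
$fromConfig = resolveSources('https://feeds.feedburner.com/symfony/blog', []);

// Two --source options: both files are used and the configured feed is ignored.
$fromCli = resolveSources(
    'https://feeds.feedburner.com/symfony/blog',
    ['/path/to/file1.txt', '/path/to/file2.txt'],
);
```

Normalising to a list keeps the single-source and multi-source cases on the same code path.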

Usage
-----
@@ -935,7 +935,7 @@ Vectorizers
-----------

Vectorizers are components that convert text documents into vector embeddings for storage and retrieval.
They can be configured once and reused across multiple indexers, providing better maintainability and consistency.
They can be configured once and reused across multiple ingesters, providing better maintainability and consistency.

Configuring Vectorizers
~~~~~~~~~~~~~~~~~~~~~~~
@@ -961,15 +961,15 @@ Vectorizers are defined in the ``vectorizer`` section of your configuration:
platform: 'ai.platform.mistral'
model: 'mistral-embed'

Using Vectorizers in Indexers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Using Vectorizers in Ingesters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once configured, vectorizers can be referenced by name in indexer configurations:
Once configured, vectorizers can be referenced by name in ingester configurations:

.. code-block:: yaml

ai:
indexer:
ingester:
documents:
loader: 'Symfony\AI\Store\Document\Loader\TextFileLoader'
vectorizer: 'ai.vectorizer.openai_small'
@@ -988,14 +988,14 @@ Once configured, vectorizers can be referenced by name in indexer configurations
Benefits of Configured Vectorizers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* **Reusability**: Define once, use in multiple indexers
* **Consistency**: Ensure all indexers using the same vectorizer have identical embedding configuration
* **Reusability**: Define once, use in multiple ingesters
* **Consistency**: Ensure all ingesters using the same vectorizer have identical embedding configuration
* **Maintainability**: Change vectorizer settings in one place

Retrievers
----------

Retrievers are the opposite of indexers. While indexers populate a vector store with documents,
Retrievers are the opposite of ingesters. While ingesters populate a vector store with documents,
retrievers allow you to search for documents in a store based on a query string.
They vectorize the query and retrieve similar documents from the store.
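
To illustrate the idea of retrieving similar documents, here is a minimal standalone sketch. It assumes cosine similarity as the distance measure and uses hand-written three-dimensional vectors in place of real embeddings; ``cosineSimilarity()`` and ``retrieve()`` are hypothetical names, not the component's API, and a real store may rank results differently.

```php
<?php

/** Cosine similarity between two equal-length, non-zero vectors. */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Hypothetical retriever: returns the ids of the $limit stored vectors
 * that are most similar to the query vector.
 *
 * @param array<string, list<float>> $stored
 */
function retrieve(array $queryVector, array $stored, int $limit = 2): array
{
    $scores = [];
    foreach ($stored as $id => $vector) {
        $scores[$id] = cosineSimilarity($queryVector, $vector);
    }
    arsort($scores); // highest similarity first, keys preserved

    return array_slice(array_keys($scores), 0, $limit);
}

// Toy "embeddings" standing in for the output of a real vectorizer.
$stored = [
    'gladiator' => [0.9, 0.1, 0.0],
    'inception' => [0.1, 0.9, 0.2],
    'jurassic-park' => [0.0, 0.2, 0.9],
];

$ids = retrieve([1.0, 0.0, 0.1], $stored);
```

With these toy vectors the query points mostly along the first axis, so ``gladiator`` ranks first and ``inception`` second.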

14 changes: 8 additions & 6 deletions docs/components/store.rst
@@ -19,19 +19,21 @@ implemented by different concrete and vendor-specific implementations, so called
On top of those bridges, the Store component provides higher level features to populate and query those stores with and
for documents.

Indexing
--------
Ingesting
---------

One higher level feature is the :class:`Symfony\\AI\\Store\\Indexer`. The purpose of this service is to populate a store with documents.
One higher level feature is the :class:`Symfony\\AI\\Store\\Ingester`. The purpose of this service is to populate a store with documents.
Therefore it accepts one or multiple :class:`Symfony\\AI\\Store\\Document\\TextDocument` objects, converts them into embeddings and stores them in the
used vector store::

use Symfony\AI\Store\Document\TextDocument;
use Symfony\AI\Store\Indexer;
use Symfony\AI\Store\Ingester;

$indexer = new Indexer($platform, $model, $store);
$document = new TextDocument('This is a sample document.');
$indexer->index($document);
$documents = [new TextDocument('This is a sample document.')];
$loader = new InMemoryLoader($documents);
$ingester = new Ingester($loader, new Indexer($vectorizer, $store));
$ingester->ingest();

You can find more advanced usage in combination with an Agent using the store for RAG in the examples folder.

15 changes: 9 additions & 6 deletions docs/cookbook/rag-implementation.rst
@@ -89,17 +89,20 @@ Use a vectorizer to convert documents into embeddings and store them::
use Symfony\AI\Store\Document\Loader\InMemoryLoader;
use Symfony\AI\Store\Document\Vectorizer;
use Symfony\AI\Store\Indexer;
use Symfony\AI\Store\Ingester;

$platform = PlatformFactory::create(env('OPENAI_API_KEY'));
$vectorizer = new Vectorizer($platform, 'text-embedding-3-small');
$indexer = new Indexer(
$ingester = new Ingester(
new InMemoryLoader($documents),
$vectorizer,
$store
new Indexer(
$vectorizer,
$store
),
);
$indexer->index($documents);
$ingester->ingest();

The indexer handles:
The ingester handles:

* Loading documents from the source
* Generating vector embeddings
@@ -324,7 +327,7 @@ Index documents in batches for better performance::

$batchSize = 100;
foreach (array_chunk($documents, $batchSize) as $batch) {
$indexer->index($batch);
$ingester->ingest(options: $batch);
}
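
As a quick illustration of what the batching loop iterates over, the following standalone sketch uses placeholder strings instead of document objects and only records batch sizes. It makes no assumption about the ingestion API itself.

```php
<?php

// Placeholder strings stand in for document objects.
$documents = array_map(static fn (int $i): string => "document-$i", range(1, 250));

$batchSizes = [];
foreach (array_chunk($documents, 100) as $batch) {
    // A real loop would hand $batch to the ingestion step here.
    $batchSizes[] = count($batch);
}
```

``array_chunk()`` yields full batches followed by one shorter remainder, so 250 documents with a batch size of 100 produce batches of 100, 100 and 50.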

Caching Embeddings
@@ -15,29 +15,31 @@
use Symfony\AI\Store\Document\Transformer\TextSplitTransformer;
use Symfony\AI\Store\Document\Vectorizer;
use Symfony\AI\Store\Indexer;
use Symfony\AI\Store\Ingester;
use Symfony\AI\Store\InMemory\Store as InMemoryStore;

require_once dirname(__DIR__).'/bootstrap.php';

$platform = PlatformFactory::create(env('OPENAI_API_KEY'), http_client());
$store = new InMemoryStore();
$vectorizer = new Vectorizer($platform, 'text-embedding-3-small');
$indexer = new Indexer(
$ingester = new Ingester(
loader: new TextFileLoader(),
vectorizer: $vectorizer,
store: $store,
source: [
dirname(__DIR__, 2).'/fixtures/movies/gladiator.md',
dirname(__DIR__, 2).'/fixtures/movies/inception.md',
dirname(__DIR__, 2).'/fixtures/movies/jurassic-park.md',
],
transformers: [
new TextReplaceTransformer(search: '## Plot', replace: '## Synopsis'),
new TextSplitTransformer(chunkSize: 500, overlap: 100),
],
indexer: new Indexer(
vectorizer: $vectorizer,
store: $store,
transformers: [
new TextReplaceTransformer(search: '## Plot', replace: '## Synopsis'),
new TextSplitTransformer(chunkSize: 500, overlap: 100),
],
),
);

$indexer->index();
$ingester->ingest([
dirname(__DIR__, 2).'/fixtures/movies/gladiator.md',
dirname(__DIR__, 2).'/fixtures/movies/inception.md',
dirname(__DIR__, 2).'/fixtures/movies/jurassic-park.md',
]);

$vector = $vectorizer->vectorize('Roman gladiator revenge');
$results = $store->query($vector);
@@ -16,6 +16,7 @@
use Symfony\AI\Store\Document\Transformer\TextSplitTransformer;
use Symfony\AI\Store\Document\Vectorizer;
use Symfony\AI\Store\Indexer;
use Symfony\AI\Store\Ingester;
use Symfony\AI\Store\InMemory\Store as InMemoryStore;
use Symfony\Component\Uid\Uuid;

@@ -38,17 +39,18 @@
),
];

$indexer = new Indexer(
$ingester = new Ingester(
loader: new InMemoryLoader($documents),
vectorizer: $vectorizer,
store: $store,
source: null,
transformers: [
new TextSplitTransformer(chunkSize: 100, overlap: 20),
],
indexer: new Indexer(
vectorizer: $vectorizer,
store: $store,
transformers: [
new TextSplitTransformer(chunkSize: 100, overlap: 20),
],
),
);

$indexer->index();
$ingester->ingest();

$vector = $vectorizer->vectorize('machine learning artificial intelligence');
$results = $store->query($vector);
@@ -14,6 +14,7 @@
use Symfony\AI\Store\Document\Transformer\TextSplitTransformer;
use Symfony\AI\Store\Document\Vectorizer;
use Symfony\AI\Store\Indexer;
use Symfony\AI\Store\Ingester;
use Symfony\AI\Store\InMemory\Store as InMemoryStore;
use Symfony\Component\HttpClient\HttpClient;

@@ -22,20 +23,21 @@
$platform = PlatformFactory::create(env('OPENAI_API_KEY'), http_client());
$store = new InMemoryStore();
$vectorizer = new Vectorizer($platform, 'text-embedding-3-small');
$indexer = new Indexer(
$ingester = new Ingester(
loader: new RssFeedLoader(HttpClient::create()),
vectorizer: $vectorizer,
store: $store,
source: [
'https://feeds.feedburner.com/symfony/blog',
'https://www.tagesschau.de/index~rss2.xml',
],
transformers: [
new TextSplitTransformer(chunkSize: 500, overlap: 100),
],
indexer: new Indexer(
vectorizer: $vectorizer,
store: $store,
transformers: [
new TextSplitTransformer(chunkSize: 500, overlap: 100),
],
)
);

$indexer->index();
$ingester->ingest([
'https://feeds.feedburner.com/symfony/blog',
'https://www.tagesschau.de/index~rss2.xml',
]);

$vector = $vectorizer->vectorize('Week of Symfony');
$results = $store->query($vector);
@@ -17,6 +17,7 @@
use Symfony\AI\Store\Document\Transformer\TextTrimTransformer;
use Symfony\AI\Store\Document\Vectorizer;
use Symfony\AI\Store\Indexer;
use Symfony\AI\Store\Ingester;
use Symfony\AI\Store\InMemory\Store as InMemoryStore;
use Symfony\Component\Uid\Uuid;

@@ -56,18 +57,19 @@
new TextContainsFilter('SPAM:', caseSensitive: true),
];

$indexer = new Indexer(
$ingester = new Ingester(
loader: new InMemoryLoader($documents),
vectorizer: $vectorizer,
store: $store,
source: null,
filters: $filters,
transformers: [
new TextTrimTransformer(),
],
indexer: new Indexer(
vectorizer: $vectorizer,
store: $store,
filters: $filters,
transformers: [
new TextTrimTransformer(),
],
),
);

$indexer->index();
$ingester->ingest();

$vector = $vectorizer->vectorize('technology artificial intelligence');
$results = $store->query($vector);
5 changes: 3 additions & 2 deletions examples/memory/mariadb.php
@@ -23,6 +23,7 @@
use Symfony\AI\Store\Document\TextDocument;
use Symfony\AI\Store\Document\Vectorizer;
use Symfony\AI\Store\Indexer;
use Symfony\AI\Store\Ingester;
use Symfony\Component\Uid\Uuid;

require_once dirname(__DIR__).'/bootstrap.php';
@@ -57,8 +58,8 @@
// create embeddings for documents as preparation of the chain memory
$platform = PlatformFactory::create(env('OPENAI_API_KEY'), http_client());
$vectorizer = new Vectorizer($platform, $embeddings = 'text-embedding-3-small');
$indexer = new Indexer(new InMemoryLoader($documents), $vectorizer, $store, logger: logger());
$indexer->index($documents);
$ingester = new Ingester(new InMemoryLoader($documents), new Indexer($vectorizer, $store, logger: logger()), logger: logger());
$ingester->ingest();

// Execute a chat call that is utilizing the memory
$embeddingsModel = $platform->getModelCatalog()->getModel($embeddings);