[Examples][Store] Implement indexing pipeline #465
Merged: chr-hertel merged 1 commit into symfony:main from OskarStark:feature/document-indexing-pipeline on Sep 8, 2025
@@ -0,0 +1,47 @@
<?php

/*
 * This file is part of the Symfony package.
 *
 * (c) Fabien Potencier <fabien@symfony.com>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

use Symfony\AI\Platform\Bridge\OpenAi\Embeddings;
use Symfony\AI\Platform\Bridge\OpenAi\PlatformFactory;
use Symfony\AI\Store\Bridge\Local\InMemoryStore;
use Symfony\AI\Store\Document\Loader\TextFileLoader;
use Symfony\AI\Store\Document\Transformer\TextReplaceTransformer;
use Symfony\AI\Store\Document\Transformer\TextSplitTransformer;
use Symfony\AI\Store\Document\Vectorizer;
use Symfony\AI\Store\Indexer;

require_once dirname(__DIR__).'/bootstrap.php';

$platform = PlatformFactory::create(env('OPENAI_API_KEY'), http_client());
$store = new InMemoryStore();
$vectorizer = new Vectorizer($platform, new Embeddings('text-embedding-3-small'));
$indexer = new Indexer(
    loader: new TextFileLoader(),
    vectorizer: $vectorizer,
    store: $store,
    source: [
        dirname(__DIR__, 2).'/fixtures/movies/gladiator.md',
        dirname(__DIR__, 2).'/fixtures/movies/inception.md',
        dirname(__DIR__, 2).'/fixtures/movies/jurassic-park.md',
    ],
    transformers: [
        new TextReplaceTransformer(search: '## Plot', replace: '## Synopsis'),
        new TextSplitTransformer(chunkSize: 500, overlap: 100),
    ],
);

$indexer->index();

$vector = $vectorizer->vectorize('Roman gladiator revenge');
$results = $store->query($vector);
foreach ($results as $i => $document) {
    echo sprintf("%d. %s\n", $i + 1, substr($document->id, 0, 40).'...');
}
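For readers unfamiliar with the `chunkSize` and `overlap` arguments passed to `TextSplitTransformer` above, the following is a rough sketch of sliding-window chunking. It is not the library's implementation: the function name `split_with_overlap` is hypothetical, and it counts bytes with `strlen` and `substr` for simplicity, which may differ from how the transformer measures text.

```php
<?php

/**
 * Illustrative sketch only (hypothetical helper, not part of symfony/ai).
 * Cuts $text into windows of at most $chunkSize bytes; each window starts
 * ($chunkSize - $overlap) bytes after the previous one, so neighbouring
 * chunks share $overlap bytes.
 *
 * @return list<string>
 */
function split_with_overlap(string $text, int $chunkSize, int $overlap): array
{
    if ($chunkSize <= 0 || $overlap < 0 || $overlap >= $chunkSize) {
        throw new InvalidArgumentException('Require 0 <= overlap < chunkSize.');
    }

    $chunks = [];
    $length = strlen($text);
    $step = $chunkSize - $overlap;

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = substr($text, $start, $chunkSize);

        // Stop once a window has reached the end of the text, so no
        // trailing chunk made up purely of overlap is emitted.
        if ($start + $chunkSize >= $length) {
            break;
        }
    }

    return $chunks;
}

// Usage: a 10-byte string, windows of 4 bytes that share 1 byte.
print_r(split_with_overlap('abcdefghij', 4, 1));
```

Under this scheme a larger `overlap` repeats more context across neighbouring chunks, at the cost of producing more chunks to embed.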
@@ -0,0 +1,58 @@
<?php

/*
 * This file is part of the Symfony package.
 *
 * (c) Fabien Potencier <fabien@symfony.com>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

use Symfony\AI\Platform\Bridge\OpenAi\Embeddings;
use Symfony\AI\Platform\Bridge\OpenAi\PlatformFactory;
use Symfony\AI\Store\Bridge\Local\InMemoryStore;
use Symfony\AI\Store\Document\Loader\InMemoryLoader;
use Symfony\AI\Store\Document\Metadata;
use Symfony\AI\Store\Document\TextDocument;
use Symfony\AI\Store\Document\Transformer\TextSplitTransformer;
use Symfony\AI\Store\Document\Vectorizer;
use Symfony\AI\Store\Indexer;
use Symfony\Component\Uid\Uuid;

require_once dirname(__DIR__).'/bootstrap.php';

$platform = PlatformFactory::create(env('OPENAI_API_KEY'), http_client());
$store = new InMemoryStore();
$vectorizer = new Vectorizer($platform, new Embeddings('text-embedding-3-small'));

$documents = [
    new TextDocument(
        Uuid::v4(),
        'Artificial Intelligence is transforming the way we work and live. Machine learning algorithms can now process vast amounts of data and make predictions with remarkable accuracy.',
        new Metadata(['title' => 'AI Revolution'])
    ),
    new TextDocument(
        Uuid::v4(),
        'Climate change is one of the most pressing challenges of our time. Renewable energy sources like solar and wind power are becoming increasingly important for a sustainable future.',
        new Metadata(['title' => 'Climate Action'])
    ),
];

$indexer = new Indexer(
    loader: new InMemoryLoader($documents),
    vectorizer: $vectorizer,
    store: $store,
    source: null,
    transformers: [
        new TextSplitTransformer(chunkSize: 100, overlap: 20),
    ],
);

$indexer->index();

$vector = $vectorizer->vectorize('machine learning artificial intelligence');
$results = $store->query($vector);
foreach ($results as $i => $document) {
    echo sprintf("%d. %s\n", $i + 1, substr($document->id, 0, 40).'...');
}
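The query step above embeds a search phrase and asks the store for matching documents. The diff does not show how `InMemoryStore::query()` ranks them, so the following is only a sketch of one common approach, cosine similarity over embedding vectors. The function names `cosine_similarity` and `rank_by_similarity` and the toy two-dimensional vectors are assumptions made for illustration.

```php
<?php

/**
 * Illustrative sketch only (hypothetical helper, not part of symfony/ai).
 * Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function cosine_similarity(array $a, array $b): float
{
    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if (0.0 === $normA || 0.0 === $normB) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Returns the document ids ordered from most to least similar to $query.
 *
 * @param list<float>                $query
 * @param array<string, list<float>> $vectors document id => embedding
 *
 * @return list<string>
 */
function rank_by_similarity(array $query, array $vectors): array
{
    $scores = [];
    foreach ($vectors as $id => $vector) {
        $scores[$id] = cosine_similarity($query, $vector);
    }

    arsort($scores); // highest score first, keys preserved

    return array_keys($scores);
}

// Usage with toy vectors standing in for real embeddings.
print_r(rank_by_similarity([1.0, 0.0], [
    'a' => [0.0, 1.0],
    'b' => [1.0, 0.0],
    'c' => [1.0, 1.0],
]));
```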
@@ -0,0 +1,32 @@
# Gladiator (2000)

**IMDB**: https://www.imdb.com/title/tt0172495/

**Director:** Ridley Scott

## Cast

- **Russell Crowe** as Maximus Decimus Meridius
- **Joaquin Phoenix** as Emperor Commodus
- **Connie Nielsen** as Lucilla
- **Oliver Reed** as Proximo
- **Derek Jacobi** as Senator Gracchus
- **Djimon Hounsou** as Juba
- **Richard Harris** as Marcus Aurelius
- **Ralf Möller** as Hagen
- **Tommy Flanagan** as Cicero
- **David Schofield** as Falco

## Plot

A former Roman General sets out to exact vengeance against the corrupt emperor who murdered his family and sent him into slavery.

**Maximus Decimus Meridius** is a powerful Roman general beloved by the people and the aging Emperor **Marcus Aurelius**. As Marcus Aurelius lies dying, he makes known his wish that Maximus should succeed him and return Rome to the former glory of the Republic rather than the corrupt Empire it has become.

However, Marcus Aurelius's son **Commodus** learns of his father's plan and murders him before he can publicly name Maximus as his successor. Commodus then orders the execution of Maximus and his family. Maximus escapes the execution but arrives at his farm too late to save his wife and son.

Wounded and devastated, Maximus is captured by slave traders and forced to become a gladiator. Under the training of **Proximo**, a former gladiator, Maximus becomes a skilled fighter and eventually makes his way to the **Colosseum** in Rome, where he gains fame and the crowd's favor.

Using his newfound popularity with the people, Maximus seeks to avenge the murder of his family and fulfill his promise to Marcus Aurelius to restore Rome to a republic. The film culminates in a final confrontation between Maximus and Commodus in the arena.

The film explores themes of *honor*, *revenge*, *political corruption*, and the struggle between personal desires and duty to the greater good.
@@ -0,0 +1,28 @@
# Inception (2010)

**IMDB**: https://www.imdb.com/title/tt1375666/

**Director:** Christopher Nolan

## Cast

- **Leonardo DiCaprio** as Dom Cobb
- **Marion Cotillard** as Mal Cobb
- **Tom Hardy** as Eames
- **Elliot Page** as Ariadne
- **Ken Watanabe** as Saito
- **Dileep Rao** as Yusuf
- **Cillian Murphy** as Robert Fischer Jr.
- **Tom Berenger** as Peter Browning
- **Michael Caine** as Professor Stephen Miles
- **Lukas Haas** as Nash

## Plot

A skilled thief is given a chance at redemption if he can successfully perform inception, the act of planting an idea in someone's subconscious.

**Dom Cobb** is a skilled thief who specializes in *extraction* - stealing secrets from people's subconscious minds while they dream. This unique skill has made him a valuable player in the world of corporate espionage, but it has also cost him everything he loves. Cobb's rare ability has made him a coveted player in this treacherous new world of corporate espionage, but it has also made him an international fugitive and cost him everything he has ever loved.

Now Cobb is being offered a chance at redemption. One last job could give him his life back but only if he can accomplish the impossible - **inception**. Instead of the perfect heist, Cobb and his team of specialists have to pull off the reverse: their task is not to steal an idea but to plant one. If they succeed, it could be the perfect crime.

The film explores themes of *reality*, *dreams*, *memory*, and the nature of consciousness through multiple layers of dream states, creating a complex narrative structure that challenges both characters and audience to question what is real.
@@ -0,0 +1,30 @@
# Jurassic Park (1993)

**IMDB**: https://www.imdb.com/title/tt0107290/

**Director:** Steven Spielberg

## Cast

- **Sam Neill** as Dr. Alan Grant
- **Laura Dern** as Dr. Ellie Sattler
- **Jeff Goldblum** as Dr. Ian Malcolm
- **Richard Attenborough** as John Hammond
- **Bob Peck** as Robert Muldoon
- **Martin Ferrero** as Donald Gennaro
- **BD Wong** as Dr. Henry Wu
- **Joseph Mazzello** as Tim Murphy
- **Ariana Richards** as Lex Murphy
- **Wayne Knight** as Dennis Nedry

## Plot

During a preview tour, a theme park suffers a major power breakdown that allows its cloned dinosaur exhibits to run amok.

Billionaire **John Hammond** has created a theme park on a remote island where he has successfully cloned dinosaurs from ancient DNA found in prehistoric mosquitoes preserved in amber. Before opening to the public, Hammond invites a select group of people to tour the park, including paleontologist **Dr. Alan Grant**, paleobotanist **Dr. Ellie Sattler**, and mathematician **Dr. Ian Malcolm**.

The tour begins smoothly, but things quickly go wrong when the park's computer systems are sabotaged by the disgruntled programmer **Dennis Nedry**, who is attempting to steal dinosaur embryos. The security systems fail, and the dinosaurs break free from their enclosures.

As the island descends into chaos, the visitors must survive encounters with various dangerous dinosaurs, including the intelligent and deadly **Velociraptors** and the massive **Tyrannosaurus Rex**. Dr. Grant finds himself responsible for Hammond's grandchildren, Tim and Lex, as they attempt to reach safety.

The film explores themes of *scientific ethics*, the *hubris of trying to control nature*, and the *unintended consequences of genetic engineering*. It questions whether humans have the right to resurrect extinct species and whether scientific advancement should be pursued without considering the potential risks and moral implications.