Commit
Structures mostly done, cleanup in docs/hub (subfolders)
1 parent cd4d0e7, commit 5c2c30e
Showing 116 changed files with 1,288 additions and 608 deletions.
This file was deleted.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -0,0 +1,77 @@
# Structures

Structures allow you to dynamically define the shape of the data to be extracted
by the LLM.

Use `Structure::define()` to define the structure and pass it to Instructor
as the response model.

If a `Structure` instance is provided as the response model, Instructor
returns an array in the shape you defined.

`Structure::define()` accepts an array of `Field` objects.

The following field types are currently supported:
- `Field::bool()` - bool value
- `Field::int()` - int value
- `Field::string()` - string value
- `Field::float()` - float value
- `Field::enum()` - enum value
- `Field::structure()` - for nesting structures

Fields can be marked as optional with `$field->optional(bool $isOptional = true)`.
By default, all defined fields are required.

You can also provide extra instructions for the LLM with `$field->description(string
$description)`.

Instructor also supports validation for structures.

You can define a field validator with:
- `$field->validator(callable $validator)` - `$validator` has to return an instance of `ValidationResult`
- `$field->validIf(callable $condition, string $message)` - `$condition` has to return false if validation fails; `$message` will be provided to the LLM as an explanation for self-correction on the next extraction attempt

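The `validIf()` contract can be illustrated without the library. The following is a minimal plain-PHP sketch, not Instructor's implementation: `checkField()` is a hypothetical helper that runs a condition and hands back the message when the condition returns false, which is the rule stated for `$condition` above.

```php
<?php
// Hypothetical helper (not part of Instructor): runs a validIf-style
// condition and returns null on success or the message on failure.
function checkField(mixed $value, callable $condition, string $message): ?string
{
    return $condition($value) ? null : $message;
}

$isPositive = fn($value) => $value > 0;
$message = 'Age has to be a positive number';

// Condition returns true, so there is nothing to report.
$ok = checkField(25, $isPositive, $message);

// Condition returns false, so the message comes back.
$failed = checkField(-3, $isPositive, $message);

var_dump($ok, $failed);
```

In Instructor the message is what gets passed back to the LLM, so it is worth wording it as an instruction the model can act on.
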
```php
<?php
global $events;
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);

$loader = require 'vendor/autoload.php';
$loader->add('Cognesy\\Instructor\\', __DIR__ . '/../../src/');

use Cognesy\Instructor\Extras\Structure\Field;
use Cognesy\Instructor\Extras\Structure\Structure;
use Cognesy\Instructor\Instructor;

enum Role : string {
    case Manager = 'manager';
    case Line = 'line';
}

// Define the shape of the data to extract, including a nested structure
$structure = Structure::define([
    'name' => Field::string('Name of the person'),
    'age' => Field::int('Age of the person')->validIf(fn($value) => $value > 0, "Age has to be a positive number"),
    'address' => Field::structure(Structure::define([
        'street' => Field::string('Street name')->optional(),
        'city' => Field::string('City name'),
        'zip' => Field::string('Zip code')->optional(),
    ]), 'Address of the person'),
    'role' => Field::enum(Role::class, 'Role of the person'),
    // 'properties' => Field::array(Structure::define([
    //     'name' => Field::string('Name of the property'),
    //     'value' => Field::string('Value of the property'),
    // ]), 'Additional properties of the person'),
], 'Person', 'A person object');

$text = "Jane Doe lives in Springfield. She is 25 years old and works as a line worker. McDonald's in New York is located at 456 Elm St, NYC, 12345.";

// Pass the structure as the response model
$person = (new Instructor)->respond(
    messages: $text,
    responseModel: $structure,
);

dump($person);
?>
```
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion
docs/hub/l_l_m_support_azure_o_a_i.md → .../hub/api_support/llm_support_azure_oai.md
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -0,0 +1,56 @@
# Basic use

Instructor allows you to use large language models to extract information
from text (or the content of chat messages), while following the structure
you define.

The LLM does not 'parse' the text to find and retrieve the information.
Extraction relies on the LLM's ability to comprehend the provided text and infer
the meaning of the information it contains, filling the fields of the
response object with values that match the types and semantics of the
class fields.

The simplest way to use Instructor is to call the `respond` method
on an Instructor instance. This method takes a string (or an array of
chat messages in the OpenAI format) as input and returns
data extracted from the provided text (or chat) using LLM inference.

The returned object contains the values of the fields extracted from the text.

The format of the extracted data is defined by the response model, which
in this case is a simple PHP class with some public properties.

```php
<?php
$loader = require 'vendor/autoload.php';
$loader->add('Cognesy\\Instructor\\', __DIR__ . '/../../src/');

use Cognesy\Instructor\Instructor;

// Step 1: Define a class that represents the structure and semantics
// of the data you want to extract
class User {
    public int $age;
    public string $name;
}

// Step 2: Get the text (or chat messages) you want to extract data from
$text = "Jason is 25 years old and works as an engineer.";
print("Input text:\n");
print($text . "\n\n");

// Step 3: Extract structured data using default language model API (OpenAI)
print("Extracting structured data using LLM...\n\n");
$user = (new Instructor)->respond(
    messages: $text,
    responseModel: User::class,
);

// Step 4: Now you can use the extracted data in your application
print("Extracted data:\n");
dump($user);

assert(isset($user->name));
assert(isset($user->age));
?>
```
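
The two input forms accepted by `respond` (a plain string or an array of chat messages) can be pictured with a small normalization step. This is a sketch in plain PHP under stated assumptions: `toChatMessages()` is a hypothetical helper, not Instructor's internal code, and treating a bare string as a single `user` message is an assumption made for illustration.

```php
<?php
// Hypothetical helper (not Instructor's internal code): accepts either a plain
// string or an array of chat messages and always returns chat messages.
function toChatMessages(string|array $messages): array
{
    if (is_string($messages)) {
        // Assumption: a bare string is treated as a single user message.
        return [['role' => 'user', 'content' => $messages]];
    }
    return $messages;
}

$fromString = toChatMessages('Jason is 25 years old and works as an engineer.');
$fromArray = toChatMessages([
    ['role' => 'user', 'content' => 'Jason is 25 years old.'],
]);

var_dump($fromString, $fromArray);
```

Each chat message is an associative array with `role` and `content` keys, the same shape used in the validation examples.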
@@ -0,0 +1,42 @@
# Basic use with HandlesExtraction trait

Instructor provides the `HandlesExtraction` trait, which you can use to enable
extraction capabilities directly on a class via a static `extract()` method.

The `extract()` method returns an instance of the class populated with the data
extracted by Instructor.

The `extract()` method has the following signature (you can also find it in the
`CanHandleExtraction` interface):

```php
static public function extract(
    string|array $messages,       // (required) The message(s) to extract data from
    string $model = '',           // (optional) The model to use for extraction (otherwise the default is used)
    int $maxRetries = 2,          // (optional) The number of retries in case of validation failure
    array $options = [],          // (optional) Additional data to pass to the Instructor or LLM API
    Instructor $instructor = null // (optional) The Instructor instance to use for extraction
) : static;
```

```php
<?php
$loader = require 'vendor/autoload.php';
$loader->add('Cognesy\\Instructor\\', __DIR__ . '/../../src/');

use Cognesy\Instructor\Extras\Mixin\HandlesExtraction;

class User {
    use HandlesExtraction;

    public int $age;
    public string $name;
}

// The trait adds a static extract() that returns a populated User
$user = User::extract("Jason is 25 years old and works as an engineer.");
dump($user);

assert(isset($user->name));
assert(isset($user->age));
?>
```
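
The `static` return type is what lets one trait serve any class that uses it. The sketch below shows that mechanism in plain PHP. It is not the real `HandlesExtraction` trait: `ExtractsWithCallable`, `SketchUser`, and the regex-based stub that stands in for LLM inference are all hypothetical names introduced for illustration.

```php
<?php
// Hypothetical stand-in for the real trait: instead of calling an LLM, it
// takes a callable that maps text to an array of property values.
trait ExtractsWithCallable
{
    public static function extract(string $text, callable $extractor): static
    {
        // Late static binding: instantiates the class that uses the trait.
        $instance = new static();
        foreach ($extractor($text) as $property => $value) {
            if (property_exists($instance, $property)) {
                $instance->$property = $value;
            }
        }
        return $instance;
    }
}

class SketchUser
{
    use ExtractsWithCallable;

    public int $age;
    public string $name;
}

// Stub extractor: a regular expression in place of LLM inference.
$stub = function (string $text): array {
    preg_match('/^(\w+) is (\d+) years old/', $text, $m);
    return ['name' => $m[1], 'age' => (int) $m[2]];
};

$user = SketchUser::extract('Jason is 25 years old and works as an engineer.', $stub);
var_dump($user);
```

Because `extract()` returns `static`, the caller gets a `SketchUser` with typed properties rather than a generic object or array.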
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -0,0 +1,44 @@
# Validation

Instructor uses validation to verify that the response generated by the LLM
meets the requirements of your response model. If the response does not
meet the requirements, Instructor will throw an exception.

Instructor uses Symfony's Validator component to validate the response.
See its documentation for more information on usage:
https://symfony.com/doc/current/components/validator.html

The following example demonstrates how to use Symfony Validator's constraints
to validate the email field of the response.

```php
<?php
$loader = require 'vendor/autoload.php';
$loader->add('Cognesy\\Instructor\\', __DIR__ . '/../../src/');

use Cognesy\Instructor\Instructor;
use Symfony\Component\Validator\Constraints as Assert;

class UserDetails
{
    public string $name;
    #[Assert\Email]
    #[Assert\NotBlank]
    /** Find user's email provided in the text */
    public string $email;
}

// The message contains no email address, so validation is expected to fail
$caughtException = false;
$user = (new Instructor)->request(
    messages: [['role' => 'user', 'content' => "you can reply to me via mail -- Jason"]],
    responseModel: UserDetails::class,
)->onError(function($e) use (&$caughtException) {
    $caughtException = true;
})->get();

dump($user);

assert($user === null);
assert($caughtException === true);
?>
```
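
To see why a response without a usable email address gets rejected, it can help to approximate the two constraints in plain PHP. This is only a rough stand-in that uses `filter_var` rather than Symfony's validator, so edge cases will differ. `emailViolations()` and its messages are hypothetical and are not produced by Instructor or Symfony.

```php
<?php
// Plain-PHP approximation of NotBlank + Email for a string value.
// Assumption: Symfony's own checks differ in edge cases; this only
// illustrates the kind of values that fail.
function emailViolations(string $email): array
{
    if ($email === '') {
        return ['email must not be blank'];
    }
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return ['email is not a valid address'];
    }
    return [];
}

$missing = emailViolations('');
$malformed = emailViolations('jason at example');
$valid = emailViolations('jason@example.com');

var_dump($missing, $malformed, $valid);
```

An empty list means the value passes. Anything else is the kind of failure that, in the example above, ends up in the `onError` handler.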
@@ -0,0 +1,53 @@
# Validation across multiple fields

Sometimes property-level validation is not enough: you may want to check the values of multiple
properties and accept or reject the response based on their combination.
Or the assertions provided by Symfony may not be enough for your use case.

In such cases you can add custom validation code to your response model by:
- using `ValidationMixin`
- and defining the validation logic in a `validate()` method.

In this example the LLM should be able to correct the typo in the message (the graduation year we provided
is `1010` instead of `2010`) and respond with the correct graduation year.

```php
<?php
$loader = require 'vendor/autoload.php';
$loader->add('Cognesy\\Instructor\\', __DIR__ . '/../../src/');

use Cognesy\Instructor\Instructor;
use Cognesy\Instructor\Validation\Traits\ValidationMixin;
use Cognesy\Instructor\Validation\ValidationResult;

class UserDetails
{
    use ValidationMixin;

    public string $name;
    public int $birthYear;
    public int $graduationYear;

    // Cross-field rule: graduation must come after birth
    public function validate() : ValidationResult {
        if ($this->graduationYear > $this->birthYear) {
            return ValidationResult::valid();
        }
        return ValidationResult::fieldError(
            field: 'graduationYear',
            value: $this->graduationYear,
            message: "Graduation year has to be after birth year.",
        );
    }
}

$user = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => 'Jason was born in 1990 and graduated in 1010.']],
    responseModel: UserDetails::class,
    maxRetries: 2
);

dump($user);

assert($user->graduationYear === 2010);
?>
```
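
The self-correction described above amounts to a validate-and-retry loop. The sketch below shows that loop in plain PHP with a stub in place of the LLM. It is not Instructor's implementation: `extractWithRetries()` is a hypothetical function, and counting one initial attempt plus up to `$maxRetries` retries is an assumption.

```php
<?php
// Hypothetical retry loop (not Instructor's implementation). $attempt stands
// in for the LLM call: it receives the previous error message (or null) and
// returns candidate data. $validate returns null when the data is valid.
function extractWithRetries(callable $attempt, callable $validate, int $maxRetries): ?array
{
    $feedback = null;
    // Assumption: one initial attempt plus up to $maxRetries retries.
    for ($try = 0; $try <= $maxRetries; $try++) {
        $data = $attempt($feedback);
        $feedback = $validate($data);
        if ($feedback === null) {
            return $data;
        }
    }
    return null; // every attempt failed validation
}

$validate = fn(array $d) => $d['graduationYear'] > $d['birthYear']
    ? null
    : 'Graduation year has to be after birth year.';

// Stub "model": repeats the typo first, then corrects it once it sees feedback.
$calls = 0;
$attempt = function (?string $feedback) use (&$calls): array {
    $calls++;
    return [
        'name' => 'Jason',
        'birthYear' => 1990,
        'graduationYear' => $feedback === null ? 1010 : 2010,
    ];
};

$result = extractWithRetries($attempt, $validate, maxRetries: 2);
var_dump($result, $calls);
```

The validation message is the only signal the stub receives between attempts, which is why a specific message such as the one in `validate()` matters.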