diff --git a/docs/mindsdb_sql/sql/api/finetune.mdx b/docs/mindsdb_sql/sql/api/finetune.mdx
index 5cb92df5db2..44bbe9dd4d0 100644
--- a/docs/mindsdb_sql/sql/api/finetune.mdx
+++ b/docs/mindsdb_sql/sql/api/finetune.mdx
@@ -19,8 +19,10 @@ Here is the syntax:

 ```sql
 FINETUNE [MODEL] project_name.model_name
-FROM integration_name
-    (SELECT column_name, ... FROM table_name)
+FROM [integration_name | project_name]
+    (SELECT column_name, ...
+     FROM [integration_name. | project_name.]table_name
+     [WHERE incremental_column > LAST])
 [USING
     key = value,
     ...];
@@ -34,6 +36,7 @@ Where:
 | `model_name` | Name of the model to be retrained. |
 | `integration_name` | Name of the integration created using the [`CREATE DATABASE`](/sql/create/databases) statement or file upload. |
 | `(SELECT column_name, ... FROM table_name)` | Selecting additional data to be used for retraining. |
+| `WHERE incremental_column > LAST` | Selecting only newly added data to be used to finetune the model. Learn more about the [`LAST` keyword here](/mindsdb_sql/sql/create/jobs#last). |
 | `USING key = value` | Optional. The `USING` clause lets you pass multiple parameters to the `FINETUNE` statement. |
diff --git a/docs/mindsdb_sql/sql/api/join.mdx b/docs/mindsdb_sql/sql/api/join.mdx
index 38890ca1ce3..0a642fd193c 100644
--- a/docs/mindsdb_sql/sql/api/join.mdx
+++ b/docs/mindsdb_sql/sql/api/join.mdx
@@ -22,7 +22,8 @@ FROM integration_name.table_name_1 [AS] d1
 [JOIN ...]
 JOIN project_name.model_name_1 [AS] m1
 [JOIN project_name.model_name_2 [AS] m2]
-[JOIN ...];
+[JOIN ...]
+[ON d1.input_data = m1.expected_argument];
 ```

 Where:
@@ -34,6 +35,40 @@ Where:
 | `project_name.model_name_1` | Name of the model table used to make predictions. |
 | `project_name.model_name_2` | Optionally, you can join arbitrary number of models. |

+### Mapping input data to model arguments
+
+If the input data contains a column named `question` and the model requires an argument named `input`, you can map these columns, as explained below.
+
+We have a model that expects to receive `input`:
+
+```sql
+CREATE MODEL model_name
+PREDICT answer
+USING
+    engine = 'openai',
+    prompt_template = 'provide answers to an input from a user: {{input}}';
+```
+
+We have an input data table that has the following columns:
+
+```sql
++----+-------------------------------------------+
+| id | question                                  |
++----+-------------------------------------------+
+| 1  | How many planets are in the solar system? |
+| 2  | How many stars are in the solar system?   |
++----+-------------------------------------------+
+```
+
+Now if you want to get answers to these questions using the model, you need to join the input data table with the model and map the `question` column onto the `input` argument.
+
+```sql
+SELECT *
+FROM input_table AS d
+JOIN model_name AS m
+ON d.question = m.input;
+```
+
 ## Example 1

 Let's join the `home_rentals` table with the `home_rentals_model` model using this statement:
diff --git a/docs/mindsdb_sql/sql/api/retrain.mdx b/docs/mindsdb_sql/sql/api/retrain.mdx
index f308c5045bc..4e7c217a44e 100644
--- a/docs/mindsdb_sql/sql/api/retrain.mdx
+++ b/docs/mindsdb_sql/sql/api/retrain.mdx
@@ -15,8 +15,9 @@ Here is the syntax:

 ```sql
 RETRAIN [MODEL] project_name.predictor_name
-[FROM integration_name
-    (SELECT column_name, ... FROM table_name)
+[FROM [integration_name | project_name]
+    (SELECT column_name, ...
+     FROM [integration_name. | project_name.]table_name)
 PREDICT target_name
 USING engine = 'engine_name',
       tag = 'tag_name',
diff --git a/docs/mindsdb_sql/sql/create/jobs.mdx b/docs/mindsdb_sql/sql/create/jobs.mdx
index 1ac3d4e5ae9..d84b84f6ce7 100644
--- a/docs/mindsdb_sql/sql/create/jobs.mdx
+++ b/docs/mindsdb_sql/sql/create/jobs.mdx
@@ -136,6 +136,30 @@ From this point on, whenever you add new records into the `fruit_data` table, it

 If you want to clear context for the `LAST` keyword in the editor, then run `set context = 0` or `set context = null`.

+### Conditional Jobs
+
+Here is how you can create a conditional job that will execute periodically only if there is new data available:
+
+```sql
+CREATE JOB conditional_job (
+
+    FINETUNE MODEL model_name
+    FROM (
+        SELECT *
+        FROM datasource.table_name
+        WHERE incremental_column > LAST
+    )
+)
+EVERY 1 min
+IF (
+    SELECT *
+    FROM datasource.table_name
+    WHERE incremental_column > LAST
+);
+```
+
+The above job will be triggered every minute, but it will execute its task (i.e. finetuning the model) only if there is new data available.
+
 ## Examples

 ### Example 1: Retrain a Model
diff --git a/docs/mindsdb_sql/sql/create/model.mdx b/docs/mindsdb_sql/sql/create/model.mdx
index 0dccb0f596a..498482e9777 100644
--- a/docs/mindsdb_sql/sql/create/model.mdx
+++ b/docs/mindsdb_sql/sql/create/model.mdx
@@ -21,9 +21,10 @@ Here is the full syntax:

 ```sql
 CREATE [OR REPLACE] MODEL [IF NOT EXISTS] project_name.predictor_name
-[FROM integration_name
+[FROM [integration_name | project_name]
     (SELECT [sequential_column,] [partition_column,] column_name, ...
-     FROM table_name)]
+     FROM [integration_name. | project_name.]table_name
+     [JOIN model_name])]
 PREDICT target_column
diff --git a/docs/sdks/python/agents.mdx b/docs/sdks/python/agents.mdx
index d8a4269175d..bfca2e70f81 100644
--- a/docs/sdks/python/agents.mdx
+++ b/docs/sdks/python/agents.mdx
@@ -45,7 +45,7 @@ agent = server.agents.create(
 )
 ```

-Or use an exasting model:
+Or use an existing model:

 ```python
 model = server.models.get('existing_model')