Merge pull request #901 from mikkelhegn/ai-api
Document inferencing option defaults
mikkelhegn committed Sep 22, 2023
2 parents 1495e63 + c77f016 commit d965286
Showing 1 changed file with 1 addition and 1 deletion.
content/spin/serverless-ai-api-guide.md — 2 changes: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ The set of operations is common across all supporting language SDKs:
| Operation | Parameters | Returns | Behavior |
|:-----|:----------------|:-------|:----------------|
| `infer` | model`string`<br /> prompt`string`| `string` | The `infer` operation is performed on a specific model.<br /> <br />The name of the model is the first parameter provided (e.g. `llama2-chat`, `codellama-instruct`, or other; passed in as a `string`).<br /> <br />The second parameter is a prompt, passed in as a `string`.<br />|
- | `infer_with_options` | model`string`<br /> prompt`string`<br /> params`list` | `string` | The `infer_with_options` operation is performed on a specific model.<br /> <br />The name of the model is the first parameter provided (e.g. `llama2-chat`, `codellama-instruct`, or other; passed in as a `string`).<br /><br /> The second parameter is a prompt, passed in as a `string`.<br /><br /> The third parameter is a mix of floats and unsigned integers relating to inferencing parameters, in this order: <br />- `max-tokens` (unsigned 32-bit integer) Note: the backing implementation may return fewer tokens. <br /> - `repeat-penalty` (32-bit float) The amount the model should avoid repeating tokens. <br /> - `repeat-penalty-last-n-token-count` (unsigned 32-bit integer) The number of tokens the model should apply the repeat penalty to. <br /> - `temperature` (32-bit float) The randomness with which the next token is selected. <br /> - `top-k` (unsigned 32-bit integer) The number of possible next tokens the model will choose from. <br /> - `top-p` (32-bit float) The probability total of next tokens the model will choose from. <br /><br /> The result from `infer_with_options` is a `string` |
+ | `infer_with_options` | model`string`<br /> prompt`string`<br /> params`list` | `string` | The `infer_with_options` operation is performed on a specific model.<br /> <br />The name of the model is the first parameter provided (e.g. `llama2-chat`, `codellama-instruct`, or other; passed in as a `string`).<br /><br /> The second parameter is a prompt, passed in as a `string`.<br /><br /> The third parameter is a mix of floats and unsigned integers relating to inferencing parameters, in this order: <br /><br />- `max-tokens` (unsigned 32-bit integer) Note: the backing implementation may return fewer tokens. <br /> Default is 100.<br /><br /> - `repeat-penalty` (32-bit float) The amount the model should avoid repeating tokens. <br /> Default is 1.1.<br /><br /> - `repeat-penalty-last-n-token-count` (unsigned 32-bit integer) The number of tokens the model should apply the repeat penalty to. <br /> Default is 64.<br /><br /> - `temperature` (32-bit float) The randomness with which the next token is selected. <br /> Default is 0.8.<br /><br /> - `top-k` (unsigned 32-bit integer) The number of possible next tokens the model will choose from. <br /> Default is 40.<br /><br /> - `top-p` (32-bit float) The probability total of next tokens the model will choose from. <br /> Default is 0.9.<br /><br /> The result from `infer_with_options` is a `string` |
| `generate-embeddings` | model`string`<br /> prompt`list<string>`| `list<list<float32>>` | The `generate-embeddings` operation is performed on a specific model.<br /> <br />The name of the model is the first parameter provided (e.g. `all-minilm-l6-v2`; passed in as a `string`).<br /> <br />The second parameter is a prompt, passed in as a `list` of `string`s.<br /><br /> The result from `generate-embeddings` is a two-dimensional array containing only `float32` values |
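
For illustration, here is a minimal sketch of passing these options explicitly from a Rust component. It assumes the Spin Rust SDK's `llm` module; the exact type and field names (`InferencingParams`, `InferencingModel`, the `text` field on the result) are assumptions and may differ between SDK versions. Setting every field to its documented default makes the call behave like a plain `infer`:

```rust
// Sketch only: assumes the Spin Rust SDK (`spin_sdk::llm`); names may vary by version.
use spin_sdk::llm::{infer_with_options, InferencingModel, InferencingParams};

fn run_inference(prompt: &str) -> anyhow::Result<String> {
    // Each value below mirrors the documented default, so this call behaves
    // like a plain `infer`; change any field to override that default.
    let options = InferencingParams {
        max_tokens: 100,                       // backing implementation may return fewer
        repeat_penalty: 1.1,                   // how strongly to avoid repeating tokens
        repeat_penalty_last_n_token_count: 64, // window the penalty applies to
        temperature: 0.8,                      // randomness of next-token selection
        top_k: 40,                             // number of candidate next tokens
        top_p: 0.9,                            // cumulative probability cutoff
    };
    let result = infer_with_options(InferencingModel::Llama2Chat, prompt, options)?;
    Ok(result.text)
}
```

A common pattern is to override only `temperature` or `top-p` and leave the remaining fields at their defaults, trading determinism for variety without touching the repeat-penalty settings.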

The exact detail of calling these operations from your application depends on your language:
